Apr 12, 2012

Designing relational databases

While I was introduced to database theory in University, we never really did much in the way of practically implementing that theory. As such, when I am tasked to design a database for a web application I had no idea where to start. Well, not really; I had a vague idea of where to begin.

Thus I scrambled through my notes and various on-line articles to relearn the topic. Here are my notes; use them wisely...

  • The Relational Database Model describes a database that has a series of unordered relations (or a related set of information) that can be manipulated using operations that return tables.
  • Some interchangeable terms:
    Relations = tables (set of related information)
    Attributes = columns/fields (Descriptor of the information)
    Tuples = rows/record (Actual data)
  • Each table should only represent one (and only one) thing or event.
  • A primary key is a column/attribute of a relation/table that guarantees the uniqueness of a record/row/tuple. You can only have one primary key per relation.
  • A foreign key is an attribute used to reference a primary key in another relation/table.
  •  Normalisation is the process of simplifying the design of a database so that it achieves optimum structure.

    1. First normal form: All attributes must be atomic. Attributes should not be able to be broken down or repeated in a relation.

      In other words, an attribute cannot be an aggregate or list of data.
    2. Second normal form: Every non-key attribute must be dependant on the primary key.

      In other words, the postcode attribute will be dependant on the address and not the person. It stands that the address should be moved to it's own table.
    3. Third normal form: All non-key attributes are mutually independent. And attribute should not be reproducible or rely on information already existing in the tuple.

      For instance, the total cost for a sale is the sum of all costs, and does not deserve it's own table or attribute.
  • A join is used to collate relations into one relation with relevant data. This can be used to perform other operations without going back and forth between the tables.
  • A Natural join relies on an intersection of attributes (i.e. both tables contain attributes with the same name). This can be either a primary key or a foreign key.
  • An Inner Join specifies which matching attributes to use (this is done through the USING keyword). In other words, you get to to specify which attributes are going to be used to match records together.
  • An Outer Join is used where you want to display all data, whether they have a matching record or not. This is because the inner join would simply discard any record that did not have a matching attribute in the other table.
  • A view (also called a named query or stored query) is simply a SELECT statement that is given a name and is stored in the database.
  • A good practice in database design is to avoid having NULL values stored in your relation. If the attribute is not required to make the relation complete, do we need the attribute? Or can we move it to it's own relation?
  • Some common mistakes in database design are:
    1. Poor design/Planning
    2. Ignoring normalisation
    3. Poor naming standards
    4. Lack of documentation
    5. One table to hold them all (a sort of repeat of the 2nd normalisation rule)
    6. Using identity/guid columns as your only key
    7. Not using SQL facilities to protect data integrity
    8. Not using stored procedures
    9. Trying to build generic objects
    10. Lack of testing

References

No comments:

Post a Comment

Thanks for contributing!! Try to keep on topic and please avoid flame wars!!