Wednesday, November 20, 2013

Database design - Best practices

Over the years I have designed a number of databases, and from that experience the following best practices have evolved.

All tables should have a primary key
Some tables might not seem to need a primary key, but even so I recommend including one, both for consistency and because some ORMs get really confused when tables lack primary keys.

Choose surrogate primary keys over natural primary keys
A surrogate key is a key that has no business meaning, while a natural key does; social security numbers are a typical example of a natural key. Surrogate keys never change, whereas natural keys sometimes do and are therefore not very good candidates for a primary key. You can find endless discussions on the net about the pros and cons of surrogate versus natural keys. In my experience surrogate keys are the better choice in almost all situations.

Use an auto incremented identity column for all primary keys
A rather common, but in my opinion bad, pattern is to use a composite primary key to ensure that some combination of column values is unique; sometimes it even includes one or more foreign keys. It might also be tempting to use a composite primary key simply because the current domain logic dictates that this particular combination of columns uniquely identifies a row in the table. Even so, do not do this. If you need to make sure that some combination of column values is unique, use a unique index instead. Having a single integer as the primary key in all your tables makes a lot of things easier, especially if you are using some kind of ORM in your applications. Most database servers default to clustering tables on the primary key. This is, of course, rather useless when identity columns are used as primary keys, so if you need to speed up selects, create a clustered index that matches your most frequent selects instead.
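As a sketch of what this looks like in practice (T-SQL; the Position table and its columns here are hypothetical examples, not from any particular schema):

```sql
-- Surrogate identity primary key. The business rule that
-- (AccountId, InstrumentId) must be unique is enforced by a
-- unique index, not by a composite primary key.
CREATE TABLE Position (
    Id INT IDENTITY(1,1) NOT NULL,
    AccountId INT NOT NULL,
    InstrumentId INT NOT NULL,
    Quantity DECIMAL(18, 4) NOT NULL,
    CONSTRAINT PK_Position PRIMARY KEY NONCLUSTERED (Id)
);

CREATE UNIQUE INDEX UX_Position_AccountId_InstrumentId
    ON Position (AccountId, InstrumentId);

-- If the most frequent selects filter on AccountId, cluster on
-- that column instead of on the identity primary key.
CREATE CLUSTERED INDEX CX_Position_AccountId
    ON Position (AccountId);
```

Note that the primary key is declared NONCLUSTERED here; otherwise SQL Server would claim the clustered index for the identity column by default.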

Name your tables in singular. 
A table containing trades should be named Trade and not Trades.

Tables and columns should have meaningful names
Be careful with abbreviations. Try to name your tables and columns so that the names actually mean something. A name should reflect purpose and not simply describe what something happens to be. For example, a column containing the name of the user who owns an order should be named owner and not user. The need for this becomes obvious as soon as you have more than one column representing, for example, users in different roles.

Foreign key names should follow a pattern
The name of a foreign key should start with the role of the key and end with Id. So, if the Position table has a foreign key relationship with the Account table, there should be a foreign key in Position named AccountId pointing to the Id column in Account. Most often these foreign keys will be named <table they are pointing to>Id, but not always. Note that the name should reflect the relationship's meaning, and in the case of positions and accounts AccountId captures that very well. However, consider a table named Project that has several foreign keys to the table User. They cannot all be named UserId, so instead they should be named according to what the relationship represents, for example OwnerId, CreatedById, EditedById etc.
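The Project/User case above could be sketched like this in T-SQL (the Name column and constraint names are illustrative assumptions):

```sql
-- Each foreign key is named after the role it plays, not merely
-- after the table it references. All three point to User(Id).
CREATE TABLE Project (
    Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Name NVARCHAR(100) NOT NULL,
    OwnerId INT NOT NULL,       -- the user who owns the project
    CreatedById INT NOT NULL,   -- the user who created it
    EditedById INT NULL,        -- the user who last edited it
    CONSTRAINT FK_Project_Owner
        FOREIGN KEY (OwnerId) REFERENCES [User] (Id),
    CONSTRAINT FK_Project_CreatedBy
        FOREIGN KEY (CreatedById) REFERENCES [User] (Id),
    CONSTRAINT FK_Project_EditedBy
        FOREIGN KEY (EditedById) REFERENCES [User] (Id)
);
```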

Use CamelCase notation for tables and columns. 
A table holding email addresses should be named EmailAddress and not EMAIL_ADDRESS. Even if this is mostly a matter of taste, the former integrates much better with ORMs like Entity Framework. A foreign key relation between the tables User and EmailAddress would result in a navigation property in the User entity called EmailAddresses instead of EMAIL_ADDRESSes. Not that it matters all that much, but I find it better to have property names that resemble the written language as much as possible.

Sunday, August 25, 2013

Running IIS Express in 64 bit mode

The following registry keys will make IIS Express run in 64-bit mode, for Visual Studio 2012 (11.0) and Visual Studio 2013 (12.0) respectively.

reg add HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\11.0\WebProjects /v Use64BitIISExpress /t REG_DWORD /d 1

reg add HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\12.0\WebProjects /v Use64BitIISExpress /t REG_DWORD /d 1

Note, though, that according to Microsoft this is not officially supported.

Speeding up inserts using the Entity Framework - Part 2

Inserting large amounts of data using Entity Framework can be very slow. The latest post in this series can be found at
Speeding up inserts using Linq-To-Entities - Part 4

This post contains the code necessary when using EF 5 and POCO objects. It also differs from the previous examples in that it handles EF models where the POCO property names do not map exactly to the table column names. All you have to do is extend your generated context class with this partial class. Naturally you have to change the class name and namespace, but other than that you should be fine.

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Data;
    using System.Data.Entity.Infrastructure;
    using System.Data.Metadata.Edm;
    using System.Data.SqlClient;
    using System.Linq;
    using System.Reflection;
    using System.Text.RegularExpressions;

    public partial class MyEntities
    {
        public override int SaveChanges()
        {
            var objectContext = ((IObjectContextAdapter)this).ObjectContext;
            objectContext.CommandTimeout = 5 * 60;

            return base.SaveChanges();
        }

        public void BulkInsertAll<T>(T[] entities) where T : class
        {
            var conn = (SqlConnection)Database.Connection;

            Type t = typeof(T);
            var objectContext = ((IObjectContextAdapter)this).ObjectContext;
            var workspace = objectContext.MetadataWorkspace;
            var mappings = GetMappings(workspace, objectContext.DefaultContainerName, typeof(T).Name);

            var tableName = GetTableName<T>();
            var bulkCopy = new SqlBulkCopy(conn) { DestinationTableName = tableName };

            // Foreign key relations show up as virtual declared
            // properties and we want to ignore these.
            var properties = t.GetProperties().Where(p => !p.GetGetMethod().IsVirtual).ToArray();
            var table = new DataTable();
            foreach (var property in properties)
            {
                Type propertyType = property.PropertyType;

                // Nullable properties need special treatment.
                if (propertyType.IsGenericType &&
                    propertyType.GetGenericTypeDefinition() == typeof(Nullable<>))
                {
                    propertyType = Nullable.GetUnderlyingType(propertyType);
                }

                // Since we cannot trust the CLR type properties to be in the same order as
                // the table columns we use the SqlBulkCopy column mappings.
                table.Columns.Add(new DataColumn(property.Name, propertyType));
                var clrPropertyName = property.Name;
                var tableColumnName = mappings[property.Name];
                bulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping(clrPropertyName, tableColumnName));
            }

            // Add all our entities to our data table.
            foreach (var entity in entities)
            {
                var e = entity;
                table.Rows.Add(properties.Select(property => GetPropertyValue(property.GetValue(e, null))).ToArray());
            }

            // Send it to the server for bulk execution.
            // SqlBulkCopy requires an open connection.
            if (conn.State != ConnectionState.Open)
            {
                conn.Open();
            }
            bulkCopy.WriteToServer(table);
        }

        private string GetTableName<T>() where T : class
        {
            // The ToString() of a DbSet's query exposes the target table
            // in its FROM clause.
            var dbSet = Set<T>();
            var sql = dbSet.ToString();
            var regex = new Regex(@"FROM (?<table>.*) AS");
            var match = regex.Match(sql);
            return match.Groups["table"].Value;
        }

        private object GetPropertyValue(object o)
        {
            if (o == null)
                return DBNull.Value;
            return o;
        }

        private Dictionary<string, string> GetMappings(MetadataWorkspace workspace, string containerName, string entityName)
        {
            var mappings = new Dictionary<string, string>();
            var storageMapping = workspace.GetItem<GlobalItem>(containerName, DataSpace.CSSpace);
            // EntitySetMaps is an internal property of the EF 5 storage
            // mapping item, hence the reflection.
            dynamic entitySetMaps = storageMapping.GetType().InvokeMember(
                "EntitySetMaps",
                BindingFlags.GetProperty | BindingFlags.NonPublic | BindingFlags.Instance,
                null, storageMapping, null);

            foreach (var entitySetMap in entitySetMaps)
            {
                var typeMappings = GetArrayList("TypeMappings", entitySetMap);
                dynamic typeMapping = typeMappings[0];
                dynamic types = GetArrayList("Types", typeMapping);

                if (types[0].Name == entityName)
                {
                    var fragments = GetArrayList("MappingFragments", typeMapping);
                    var fragment = fragments[0];
                    var properties = GetArrayList("AllProperties", fragment);
                    foreach (var property in properties)
                    {
                        var edmProperty = GetProperty("EdmProperty", property);
                        var columnProperty = GetProperty("ColumnProperty", property);
                        mappings.Add(edmProperty.Name, columnProperty.Name);
                    }
                }
            }

            return mappings;
        }

        private ArrayList GetArrayList(string property, object instance)
        {
            var type = instance.GetType();
            var objects = (IEnumerable)type.InvokeMember(property, BindingFlags.GetProperty | BindingFlags.NonPublic | BindingFlags.Instance, null, instance, null);
            var list = new ArrayList();
            foreach (var o in objects)
            {
                list.Add(o);
            }
            return list;
        }

        private dynamic GetProperty(string property, object instance)
        {
            var type = instance.GetType();
            return type.InvokeMember(property, BindingFlags.GetProperty | BindingFlags.NonPublic | BindingFlags.Instance, null, instance, null);
        }
    }
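Usage is then a matter of calling BulkInsertAll on the context instead of adding the entities and calling SaveChanges. A minimal sketch, assuming MyEntities is your generated context and Trade is one of your POCO entity classes (both are placeholders here):

```csharp
using System.Linq;

using (var context = new MyEntities())
{
    var trades = Enumerable.Range(0, 100000)
        .Select(i => new Trade { Price = 100 + i % 10, Quantity = i })
        .ToArray();

    // One bulk copy round trip instead of one INSERT per entity.
    context.BulkInsertAll(trades);
}
```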