Archive

Archive for April 19th, 2005

General considerations for selecting Database Indexes

April 19th, 2005


Clustered Index
Most tables should have a clustered index* since there is little point storing the data in a random order. Some DBMSs have problems with hotspots if you use a clustered index on a column containing a monotonically increasing id or a date column in OLTP databases. SQL Server 7+ handle this quite nicely though. You probably want to pick the column the majority of large queries will be performed on (often a date column) This article on Clustered Indexes in SQL Server goes into more detail.

Index Selectivity
There is little point putting an index on a column that is not very selective (since the optimiser will probably ignore it). The canonical example of a non-selective index is a column indicating sex (since it only has two possible values).** Broadly speaking if a query returns more than 5-10% of the table the index is unlikely to be used since a table scan will be more efficient.

Foreign Key Indexes
Most of the time an index should be created on any foreign keys within a table (indexes on Primary Keys are created for you)

Query Patterns
Obviously there is little point having an index if it is never going to form part of a query. It’s ok not to have every where clause covered by a query but your core queries should be covered

Key Length on Indexes
Indexes take up space! It’s not uncommon for the total size of all the indexes in a database to be greater than the total size of all the data. Keeping this in mind you should favour indexes on columns containing small datatypes (numbers and dates not large text columns)

Composite Indexes
Related to keeping index sizes down be careful when using composite indexes (as an aside you should avoid composite primary keys if you can) be careful how you order the columns that make up the index.
E.g.
If you have an index on columns A, B and C (in that order) the index will only be used if you’re querying on A, B, C or A, B or A. If you query on A, C the index won’t be used.

Covering Indexes
A covering index is one where all the columns required from the table in question are in the index. This saves an I/O operation to fetch the row from the table. It can be advantageous to include an extra column (or two) that aren’t strictly needed if you expect this index to be used frequently.

Remember that the optimiser can only use the data it has available to it so you must ensure all statistics are kept updated for optimal performance!

* A clustered index means that the data in the table is stored on disk in the same order as the index. For obvious reasons this mens you can only have one per table
** You can calculate selectivity for a coulumn using the following queries
– This will get the total number of rows in the table
SELECT count(*) from table
– This will get the counts for each value
SELECT colname, count(*) from table
GROUP BY colname
You can then divide the count for each value by the total number of rows. Then multiply by 100 to get the percentage. Low numbers = high selectivity!

/Technology ,

WebMethods with Interface/Base Class parameters (.NET)

April 19th, 2005

Not immediately obvious how you can create a web service method that can take/return either an interface pointer or a base class reference.
You need to use the SoapInclude attribute to specify the concrete type(s).
E.g. (assume Cat inherits from Animal)
[WebMethod]
[SoapRPCMethod]
[SoapInclude(typeof(Cat))]
public void processAnimal(Animal animal){}
This MSDN Article on SOAP Attributes has more info if you’re interested

/Technology

Firebird - Perhaps in the next incarnation

April 19th, 2005

I’ve been tinkering with Firebird in conjunction with .NET. I’ve been using IB Expert which seems to be a reasonably competent UI.

My main problems so far are:

  • Lack of documentation for use with .NET
  • Could not get my simple Insert SP to work when called from .NET
  • None of the VS.NET tools work with it

On a commercial project I’d persevere but for what I’m trying to do in my own time it’s too much hassle. Back to access for me I’m afraid

/Technology ,