Speaking on SQL Server Algebrizer at SQL Saturday #82 Indianapolis

I was selected to present a session Back to basics – How does the Algebrizer work  has been selected for  SQL Saturday #82 in Indianapolis, IN on Saturday,   June 25, 2011!

Location is:
University of Indianapolis-1400 E Hanna Ave
Indianapolis, IN 46227

View Larger Map

SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
If you are attending please stop by and introduce yourself. Also if you are on Twitter don’t forget to use hashtag #sqlsat82 and see what others are doing.

Hope to see you there!

What’s the big difference between Oracle and SQL Server indexes (Part I)

This entry is part 1 of 1 in the series SQL Server vs. Oracle

Last week Erin Stellato (blog| twitter) has asked a very interesting question on Twitter that triggered this post:

So how is it you can have good perf in Oracle, w/o IOTs, but in SQL Server, everyone says need CIs for good perf?

There are two aspects to that question

  1. What is the difference between index implementations in Oracle vs. SQL Server?
  2. How do you define good performance?Peeling an Onion

Until now I have not realized that this subject is not very well covered and since it involves diving into the internals  I offered to make an attempt to fill this gap over the Memorial Day long weekend and soon realized that there is more to this than I originally thought it is.  This will be the first post in a set that I am planning to write in the next weeks.

First, let’s begin by comparing the different types of indexes supported by Oracle and SQL Server:

Type of index SQL  Server 2008 R2 Oracle 11g
B-tree indexes Yes (Non-clustered) Yes (Default)
B+ tree indexes Yes (Clustered Indexes) Yes (IOT)
Function Based Indexes Yes (Column has to exist in table) Yes
Filtered Indexes Yes Yes
Hash cluster indexes No Yes
B-Tree cluster indexes No Yes
Bitmap Indexes No Yes

So both Oracle and SQL Server support B-tree indexes which are ordered lists of key values, associated with the storage location of the table record that contains the respective value. So far it doesn’t seem to be a big difference between the two RDBMS. Both also support B+ tree indexes although there are differences between the two implementations. B+ trees are similar to B-trees but the table records are stored in the leaf nodes of the primary key index. This allows  fast access for singleton searches (exact match) or range searches on the primary key of the table.  Microsoft calls them clustered indexes (CI) while Oracle uses the term Index Organized Tables (IOT).

Let’s look at non-clustered indexes which are the default indexes in Oracle. Here the two implementations are quite similar: Indexes are using a B-tree structure that contains  the key values and a unique reference to the storage location of the record. This is where the implementations start to separate:

In Oracle every record has a unique (inside the table) 10 byte pseudocolumn ROWID representing the physical location on disk that the record is stored. When printed, each ROWID is displayed in BASE64 format (A-Za-z0-9+/): AAAAaoAATAAABrXAAA
ROWID has a four-piece format, OOOOOOFFFBBBBBBRRR:

  • OOOOOO: The data object number that identifies the database segment (AAAAao in the example). Schema objects in the same segment, such as a cluster of tables, have the same data object number.
  • FFF: The tablespace-relative datafile number of the datafile that contains the row (file AAT in the example).
  • BBBBBB: The data block that contains the row (block AAABrX in the example). Block numbers are relative to their datafile, not tablespace. Therefore, two rows with identical block numbers could reside in two different datafiles of the same tablespace.
  • RRR: The row in the block.

Users can use ROWIDs directly for several functions:

  • Rowids are the fastest way of accessing  rows.
  • Rowids can be used to see how a table is organized.
  • Rowids uniquely identify rows in a given table.

There are two completely different perspectives when migrating from one RDBMS to another:
Most Oracle users have no concept of clustered or non clustered indexes because primary key indexes created in Oracle are balanced non-clustered indexes that include ROWID for fast access.

A common mistake that Oracle users make is assume that  the indexes are implemented the same way  in SQL server and underestimate the performance hit that comes with using a large composite primary key in SQL server on a Clustered table. It takes a while to understand that the non clustered index size  index size in SQL server can be exponentially bigger in SQL server and is not necesarly proportional to the size of the columns specified in the non clustered index. Based on this, some Oracle experts say that the clustered indexes in SQL Sever have a 110% overhead because index key is stored “twice”, once with the index entry in the intermediate index nodes – and again with the row data at the leaf level.

Another performance hit can come from the  default value of the free space (PCTFREE) that is left on a page  as Oracle leave by default 10% space on each block (at 91% block is split). After a data block is filled to the limit determined by PCTFREE, Oracle considers the block unavailable for the insertion of new rows until the percentage of that block falls beneath the parameter PCTUSED.

The alternative of using a non clustered table (heap)  is not recommended in most cases  and Paul Randal ( blog | Twitter )  has some excellent posts on this subject  as well as Erin’s recent blog post on Internals and Deletes for SQL University which I highly recommend reading.

On the opposite camp, most SQL Server users are looking for an alternative to clustered indexes and end up choosing to use Index Optimized Tables (IOT) instead of using simple tables just because they search for something that has the same structure as clustered tables in SQL Server in order to have comparable performance.

In the next part of my series I will continue with access methods in Oracle, and other index types.