Speaking at SQLSaturday #110 in Tampa,FL

I am happy to announce that my session on “Optimizing SQL Server I/O with Solid State Drives ” was selected for SQL Saturday #110 in Tampa, FL  on Saturday March 10, 2012!

Location is:
1001 East Palm Avenue,
Tampa, FL 33605

SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
If you are attending please stop by and introduce yourself. Also if you are on Twitter don’t forget to use hashtag #sqlsat110 and see what others are doing.


Due to a family emergency, I had to cancel this speaking engagement and luckily my slot was filled by another speaker.I hope to make it next year!

Weekly links for 11-21-2011

SQL Server 2012 Release Candidate is now Available from SQL Server Team blog

The Data Scientist from BuckWoody

7 Scaling Strategies Facebook Used to Grow to 500 Million Users  HighScalabilty

10 Sloppy Social Media Mistakes to Fix NOW from Hubspot Blog

What is Hadoop? And Why is it Good for Big Data?  from The Data Roundtable

Hacker Says Texas Town Used Three Character Password To Secure Internet Facing SCADA System from ThreatPost

The AdventureWorks2008R2 Books Online Random Workload Generator from Jonathan Kehayias



SQLskills Community Mentoring and me

As some of you might be aware Paul Randal (blog | twitter) has just announced the launch of a new SQL Skills Community Mentoring program for a few individuals that took part in the SQL Skills Immersion Events. I’m deeply honored to be one of the select few that get access to invaluable advice from what I consider the best team in the SQL Server world.LukeAndPaul

Allow me to go back to a year ago (and boy what a great year 2011 was), when I could only dream of getting trained by Paul Randal and Kimberly Tripp, but my perseverance paid off and after multiple requests,  me and my buddy Vince managed to convince the company to send us to the first week of Immersion training in May this year especially since it was in the western suburbs of Chicago. Very soon I realized that this is by far the best training I have ever received, period. The amount of content was overwhelming (as Brad put it “it’s like drinking from a proverbial fire hose”) but the presentation style was extremely engaging and well prepared. Paul and Kimberly don’t just tell you “this is what you should/shouldn’t do” they go beyond that and explain why and how  and then they demo the whole concept. Just like a good teacher they can take a difficult topic  and present it in a way that makes it easy for you to follow and absorb  a lot of  information.
On the fourth day of the IE 1 class I volunteered to present a slide deck. I was extremely embarrassed as I felt like I was not in the same league as the rest of the folks present, but everything went so well that it made me reconsider my thoughts. As a matter of fact it made me submit  a couple sessions for SQL Saturday #82 in Indianapolis and this was the beginning of my public speaking adventure. A few days ago I actually presented the same slide deck from IE1 at SQL Saturday #96 in Washington, DC and the feedback was excellent.

Last month I continued my learning odyssey with the second week of training (IE2). Right away  the fact that we attended IE1 made a huge difference and it helped us understand a lot better the subjects presented to us. This time it wasn’t just Paul and Kimberly but Jonathan and Joe  have joined them adding their technical skills and expertise to the table.   I learned so many new things that I have already used in my environment.

To conclude, these classes have helped me to understand significantly better how SQL Server works. Things like PLE, forwarding records, heap fragmentation, index tuning, page structure, SAN performance, waits and stats, extended events, NUMA architectures, MCM program are only a few of the subjects that were explained in great detail.

Now on the last binary day of  the century all this becomes even better by having the unique chance to have someone like  Joe Sack as Mentor and working with him to advance my career. It’s been an incredible year for me and I look forward to work with my Mentor Joe Sack and  see what the next 6 months will bring !

Once more, thank you SQLskills. It is an honor to be in such good company.



We are family!

This is the second time I’m writing a post in the  “Meme Monday” series and this time it originates from Tom LaRock (blog | twitter).Tom makes coffee The topic is : What #SQLfamily means to me.

This weekend I participated in a #SQLfamily reunion in Washington, DC. Some might argue that this is called SQL Saturday #96 but I prefer to see it as a family reunion because this is how it felt. It started with a #SQLfamily dinner where we had a good time told stories (some would even make a good Agatha Christie book)… but I digress. We all sat around the table and felt like we are part of a family. I met Tom LaRock, Joey Dantoni, Karen Lopez, Andy Leonard, Brian Moran, Aaron Nelson, Matt Velic, Chris Skorlinski, Wayne Sheffield, Sebastian Meine and many others. I truly felt like a family reunion and it was.
On Saturday we continued, had fun presenting our sessions, learning new things, hanging out in the speaker room and talking about all kinds of things. Everybody contributed to making this  #SQLfamily reunion an event to remember. Tom even offered to make coffee and but had no idea what to do with the extension cord. (…after the last power outage they had in New England you would think he should know better 🙂 )

Looking back I realize that every #SQLfamily event I have participated to has made me feel like this whether it was a SQL Saturday in Indianapolis or in Chicago or a user group meeting I never felt left out. I had my brothers and sisters and uncles and aunts and they made me feel like I am part of the #SQLfamily.

And the best part is that our #SQLfamily is growing every day. Every reunion we discover more people we can relate to even though we are not related (but in a way we are).

God bless our #SQLfamily!

How to: Poor man’s cursor template for batch processing

John Sterrett blog|twitter asked a question about “How do I loop by rowcount in SSIS? I have a table with 5 million IDs and would like to loop to process 10K at a time. #sqlhelp”

This cannot be done natively in SSIS but you could create a stored procedure that uses the concept known as “poor man’s cursor” to split it in batches.
I included a template of code that I use  to create smaller transactions that do not block normal users accessing the same tables

Let me know what you think.

 * Name:            Poor mans cursor example for batch processing
 * Author:            Luke Jian - luke(a.t)sqlpositive.com
 * Creation Date:    11/02/11
 * Purpose:            Template for batch processing using Poor man's cursor
    DECLARE @RowStart [INT]
    DECLARE @RowEnd [INT]
    DECLARE @RowToDo [INT]
    DECLARE @RowDone [INT]
    DECLARE @BatchSize [INT]
    --    Load Users To Enable
    IF ( OBJECT_ID('[TempDB]..[#PoorManCursor]') IS NOT NULL )
    DROP TABLE [#PoorManCursor]
    [ID] = IDENTITY ( [INT], 1, 1)
    --.. SOME other rows
    INTO    [#PoorManCursor]
    FROM    dbo.BaseTables
    --ORDER BY [Whatever column for batch order]
    --    Add index to temp table
    --    because table may easily exceed 10,000,000 rows
    CREATE UNIQUE CLUSTERED INDEX [UCPoorManCursorID] ON    [#PoorManCursor] ( [ID] )
    --    Set up batching
    SELECT  @RowStart = 1,
    @RowEnd = @BatchSize, --can use 10K
    @RowToDo = MAX([ID]),
    @RowDone = 0
    FROM    [#PoorManCursor]
    PRINT CAST(@RowToDo AS [VARCHAR]) + ' records to be processed.'
    --    Process batch
    WHILE ( @RowStart <= @RowToDo )
    --        Load PoorManCursor_Tracking
    INSERT  INTO [dbo].[SomeTargetTable]
    SELECT  [col1],
    FROM    #PoorManCursor
    WHERE   [ID] >= @RowStart
    AND [ID] <= @RowEnd
    + ' records completed in this batch'
    SET @RowDone = @RowDone + @RowCount
    SET @RowStart = @RowStart + @BatchSize
    SET @RowEnd = @RowEnd + @BatchSize
    PRINT CAST(@RowDone AS [VARCHAR]) + ' total records processed.'
    IF ( OBJECT_ID('[TempDB]..[#PoorManCursor]') IS NOT NULL )
    DROP TABLE [#PoorManCursor]

Presenting 2 sessions at SQL Saturday #96 in Washington, DC

I am deeply honored that I was chosen to present 2 sessions at SQL Saturday #96 in Washington, DC this Saturday,   November 5, 2011!

Location is:
Microsoft MTC
5404 Wisconsin Ave, Suite 700
Chevy Chase, MD 20815

View Larger Map

Subway: Friendship Heights Metro Station

SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
If you are attending please stop by and introduce yourself. Also if you are on Twitter don’t forget to use hashtag #sqlsat96 and see what others are doing.

The 2 sessions I will be presenting are :

Optimizing SQL Server I/O with Solid State Drives and Join operations – Explaining Merge,Hash and Loop

Hope to see you there!


Update: 11-09-2011

I have uploaded the slide decks to the session pages:

Thanks to all that attended and don’t forget to leave feedback at:

How to improve your Comcast response time?

Recently I switched my ISP from AT&T DSL (I know … what was I thinking) back to Comcast. The process went thru very quick and in 30 minutes my old cable modem was hooked up and provisioned and was seeing 23 Mb/s speeds. with 11ms response time.
But… wait. As soon as I started using it I realized that a lot of web pages were taking forever to load and I quickly remembered the old  Comcast issue. Unreliable/slow DNS servers. And you need the DNS response before you can actually start HTTP transfer and see the page.

There are a lot of free name servers so I will list a few:

OpenDNS:,  (my choice)
AT&T and Level 3:,,
Google :,

I would  recommend to set the Primary and Secondary from 2 different groups and make these changes on your router so any connected device will inherit the changes.

Problem solved. Happy browsing!

Speaking on SQL Server Algebrizer at SQL Saturday #82 Indianapolis

I was selected to present a session Back to basics – How does the Algebrizer work  has been selected for  SQL Saturday #82 in Indianapolis, IN on Saturday,   June 25, 2011!

Location is:
University of Indianapolis-1400 E Hanna Ave
Indianapolis, IN 46227

View Larger Map

SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
If you are attending please stop by and introduce yourself. Also if you are on Twitter don’t forget to use hashtag #sqlsat82 and see what others are doing.

Hope to see you there!

What’s the big difference between Oracle and SQL Server indexes (Part I)

Last week Erin Stellato (blog| twitter) has asked a very interesting question on Twitter that triggered this post:

So how is it you can have good perf in Oracle, w/o IOTs, but in SQL Server, everyone says need CIs for good perf?

There are two aspects to that question

  1. What is the difference between index implementations in Oracle vs. SQL Server?
  2. How do you define good performance?Peeling an Onion

Until now I have not realized that this subject is not very well covered and since it involves diving into the internals  I offered to make an attempt to fill this gap over the Memorial Day long weekend and soon realized that there is more to this than I originally thought it is.  This will be the first post in a set that I am planning to write in the next weeks.

First, let’s begin by comparing the different types of indexes supported by Oracle and SQL Server:

Type of index SQL  Server 2008 R2 Oracle 11g
B-tree indexes Yes (Non-clustered) Yes (Default)
B+ tree indexes Yes (Clustered Indexes) Yes (IOT)
Function Based Indexes Yes (Column has to exist in table) Yes
Filtered Indexes Yes Yes
Hash cluster indexes No Yes
B-Tree cluster indexes No Yes
Bitmap Indexes No Yes

So both Oracle and SQL Server support B-tree indexes which are ordered lists of key values, associated with the storage location of the table record that contains the respective value. So far it doesn’t seem to be a big difference between the two RDBMS. Both also support B+ tree indexes although there are differences between the two implementations. B+ trees are similar to B-trees but the table records are stored in the leaf nodes of the primary key index. This allows  fast access for singleton searches (exact match) or range searches on the primary key of the table.  Microsoft calls them clustered indexes (CI) while Oracle uses the term Index Organized Tables (IOT).

Let’s look at non-clustered indexes which are the default indexes in Oracle. Here the two implementations are quite similar: Indexes are using a B-tree structure that contains  the key values and a unique reference to the storage location of the record. This is where the implementations start to separate:

In Oracle every record has a unique (inside the table) 10 byte pseudocolumn ROWID representing the physical location on disk that the record is stored. When printed, each ROWID is displayed in BASE64 format (A-Za-z0-9+/): AAAAaoAATAAABrXAAA
ROWID has a four-piece format, OOOOOOFFFBBBBBBRRR:

  • OOOOOO: The data object number that identifies the database segment (AAAAao in the example). Schema objects in the same segment, such as a cluster of tables, have the same data object number.
  • FFF: The tablespace-relative datafile number of the datafile that contains the row (file AAT in the example).
  • BBBBBB: The data block that contains the row (block AAABrX in the example). Block numbers are relative to their datafile, not tablespace. Therefore, two rows with identical block numbers could reside in two different datafiles of the same tablespace.
  • RRR: The row in the block.

Users can use ROWIDs directly for several functions:

  • Rowids are the fastest way of accessing  rows.
  • Rowids can be used to see how a table is organized.
  • Rowids uniquely identify rows in a given table.

There are two completely different perspectives when migrating from one RDBMS to another:
Most Oracle users have no concept of clustered or non clustered indexes because primary key indexes created in Oracle are balanced non-clustered indexes that include ROWID for fast access.

A common mistake that Oracle users make is assume that  the indexes are implemented the same way  in SQL server and underestimate the performance hit that comes with using a large composite primary key in SQL server on a Clustered table. It takes a while to understand that the non clustered index size  index size in SQL server can be exponentially bigger in SQL server and is not necesarly proportional to the size of the columns specified in the non clustered index. Based on this, some Oracle experts say that the clustered indexes in SQL Sever have a 110% overhead because index key is stored “twice”, once with the index entry in the intermediate index nodes – and again with the row data at the leaf level.

Another performance hit can come from the  default value of the free space (PCTFREE) that is left on a page  as Oracle leave by default 10% space on each block (at 91% block is split). After a data block is filled to the limit determined by PCTFREE, Oracle considers the block unavailable for the insertion of new rows until the percentage of that block falls beneath the parameter PCTUSED.

The alternative of using a non clustered table (heap)  is not recommended in most cases  and Paul Randal ( blog | Twitter )  has some excellent posts on this subject  as well as Erin’s recent blog post on Internals and Deletes for SQL University which I highly recommend reading.

On the opposite camp, most SQL Server users are looking for an alternative to clustered indexes and end up choosing to use Index Optimized Tables (IOT) instead of using simple tables just because they search for something that has the same structure as clustered tables in SQL Server in order to have comparable performance.

In the next part of my series I will continue with access methods in Oracle, and other index types.

Optimists, Pessimists and Technology People

I got this in the email a while ago and looking at it again, I had to post it because it is too funny.



Pessimist: “The glass is half empty.”

Optimist: “The glass is half full.”

Database Administrator: “The glass is sized correctly for peak loads.”

Virtualization Admin: “This person should be sharing a Dixie Cup with the guys in accounting.”

Business Intelligence Admin: “How can we upsell this customer to bottled water?”

Security Admin: “Whose glass is this? Half of the water has gone missing.”

Storage Admin: “This glass needs thin provisioning.”

ETL Admin: “We need a faster way to fill this glass.”

Active Directory Admin: “We need a secondary glass for redundancy.”

Backup Admin: “Nobody told me to take a snapshot of the full glass.”

Break Fix Admin: “Hang on, I’ll go get you another glass.”

Linux Admin: “I bet I could build a better glass myself.”

Project Manager: “I need a FTE for 15 minutes to accomplish the deliverable of a full glass.”

Azure Developer: “Why buy your own glass? The clouds are full of water.”

jQuery Developer: “When you start drinking, it automatically refills.”

Mac User: “Check out how much thinner and smaller my glass is, and it holds that much water too.”

Twitter: “The faucet is over capacity. Please wait a moment and try again.”

Using CONVERT with COLLATE to strip accents from Unicode strings

Today Mladen Prajdic  posted a on Twitter a SQL statement that captured my attention: (http://twitter.com/MladenPrajdic/status/9878219727)

 CONVERT(varchar(50), N'æøåáäĺćçčéđńőöřůýţžš')
 COLLATE Cyrillic_General_CI_AI
returns aoaaalcccednooruytzs

If you notice the conversion to Cyrillic_General_CI_AI collation his shows an interesting mechanism for code page translation that we could not explain:

Why does Cyrillic_General_CI_AI or   Greek_CI_AI collation successfully convert accents to their base ascii chars?

 I found an answer to this on this KB page: http://support.microsoft.com/kb/153449

What Happens when a Character on One Code Page is Unavailable on Another

 If a conversion occurs and a character on one code page is unavailable on another, the character gets converted to its closest equivalent character on the other code page. For example, ASCII 224 (“alpha”) on code page 437 is converted to 97(“a”) on the ANSI code page. When you convert it back to code page 437, you receive 97(“a”). This is a problem, because you have now lost the original 224(“alpha”). This can potentially cause incorrect data to be saved on your server, as happens in the following two examples:

Example 1:

You have an OEM server in which you are inserting data from an ANSI client, and your AutoAnsiToOem option is enabled. If the character that you are inserting is valid in both code pages, the correct data will be saved. However, if the character is valid in ANSI but not in OEM, the correct data will not be saved.

Example 2:

You insert data into an ANSI server from an OEM client, and the AutoAnsiToOem option is enabled. If the character that you are inserting is valid in both code pages, the correct data will be saved. However, if the character is valid in OEM and not in ANSI, then correct data will not be saved.

 Nevertheless I this is very useful to exotic strip accents from strings (i.e before an export).

Skip to toolbar