A SSD Technology a Day (6) – CacheCade

CacheCade is a technology developed by LSI for its MegaRAID storage controllers. The CacheCade software lets you combine inexpensive SATA or SAS hard disk drives with up to 512GB of solid state storage (distributed over a few SSDs) to provide a substantial performance boost, instead of adding more SATA HDDs or moving to an all-SSD RAID volume to meet performance requirements.

CacheCade - Image (c) Copyright LSI Corporation

This combination of HDDs with SSDs as a secondary cache is best suited to random-read-intensive applications, where hot data can be moved to SSD storage to take advantage of the low latency and high IOPS of SSDs at a reasonable price.
This technology is available on the LSI MegaRAID 9260 and 9280 controller series, as well as on re-badged RAID controllers such as the Dell PERC H700 and H800 with 1GB cache.

While CacheCade version 1.0 offers only a read cache (the only version supported by Dell), CacheCade 2.0 offers both read and write caching, with impressive results. The technology requires an inexpensive hardware license ($300). Read more about this here

As mentioned earlier, Dell also offers the CacheCade 1.0 technology for Dell PERC H700 and H800 controllers with 1GB NVRAM and firmware version 7.2 or later. This is an excellent solution that combines SSDs with regular HDD arrays in order to intelligently store the hot data on one or more SSDs (the maximum SSD pool is 512GB).
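The hot-data promotion idea behind this kind of SSD caching can be sketched in a few lines. This is purely an illustration of the general technique (promote frequently read blocks to an SSD tier, evict the coldest block when the tier is full); the class name, promotion threshold and eviction policy are my own assumptions, not LSI's actual algorithm.

```python
# Hypothetical sketch of hot-data promotion in an SSD caching tier.
# Thresholds and policy are illustrative, not CacheCade's real internals.

class HotDataCache:
    def __init__(self, ssd_capacity_blocks, promote_threshold=3):
        self.capacity = ssd_capacity_blocks
        self.threshold = promote_threshold
        self.access_count = {}   # block -> number of reads seen so far
        self.ssd = set()         # blocks currently cached on SSD

    def read(self, block):
        """Return 'ssd' on a cache hit, 'hdd' on a miss."""
        self.access_count[block] = self.access_count.get(block, 0) + 1
        if block in self.ssd:
            return "ssd"
        # Promote blocks that have become hot, evicting the coldest if full.
        if self.access_count[block] >= self.threshold:
            if len(self.ssd) >= self.capacity:
                coldest = min(self.ssd, key=lambda b: self.access_count[b])
                self.ssd.discard(coldest)
            self.ssd.add(block)
        return "hdd"

cache = HotDataCache(ssd_capacity_blocks=2)
for _ in range(3):
    cache.read(7)        # third read crosses the threshold, promoting block 7
print(cache.read(7))     # -> ssd
```

The key property, as with the real product, is that only repeatedly read ("hot") data earns a place on the SSD, while cold data stays on the cheap spinning disks.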
In their whitepaper they claim to double the number of transactions using up to 4x50GB SSDs.
Read more here:
http://www.dell.com/downloads/global/products/pedge/en/perc-h700-cachecade.pdf
and here for the technical details:
http://support.dell.com/support/edocs/software/svradmin/6.5/en/OMSS/HTML/ecache.htm

A SSD Technology a Day (5) – Wear Leveling

Wear leveling is a technique used in solid state drive controllers to prolong the service life of flash memory. As mentioned in the 2nd post of this blog series, What’s the difference between SLC and MLC?, flash memory has limited endurance, measured by the number of P/E cycles the memory can perform before becoming degraded. Wear leveling ensures that all cells receive the same number of P/E cycles (even wear), so that a small set of cells does not absorb the majority of the writes and wear out early. Uneven wear could cause the drive to fail while most of the memory on it is still usable, way ahead of the prescribed service life.

Memory wear-out concerns are unique to flash-based memory. Hard disks store data by magnetizing a thin film of ferromagnetic material on a disk. DRAM is volatile memory (it stores data only while powered on). Flash memory stores data inside the NAND cell via a process called tunneling, in which the floating gate is flooded with high voltage. This leaves a charge in the NAND cell, and that charge can be read over and over. Because of this invasive writing method and the corresponding erasing method, the flash cells degrade over time.

The wear-leveling algorithm essentially tracks the P/E count of each cell and writes the next block to the “least used available cell,” so that intensively used cells are pushed to the end of the queue until all cells show even wear.
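As a rough illustration of that idea (a sketch, not any vendor's actual firmware), a wear-leveling allocator just picks the block with the lowest P/E count for every new write:

```python
# Minimal sketch of a wear-leveling block allocator (illustrative only):
# each block tracks its P/E count, and writes always go to the least-worn
# block, so wear spreads evenly across the device.

class WearLeveler:
    def __init__(self, num_blocks):
        self.pe_count = [0] * num_blocks  # P/E cycles performed per block

    def write(self, data):
        # Choose the block with the fewest P/E cycles so far.
        block = min(range(len(self.pe_count)), key=lambda b: self.pe_count[b])
        self.pe_count[block] += 1         # the write costs one P/E cycle
        return block

wl = WearLeveler(num_blocks=4)
for _ in range(8):
    wl.write(b"x")
print(wl.pe_count)  # every block ends up with the same wear: [2, 2, 2, 2]
```

After any number of writes, no block's P/E count differs from any other's by more than one, which is exactly the "even wear" property described above.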

One caveat is that a new drive will perform much better than one that has been used intensively, because on a fresh drive all cells are good candidates for writing. Once the drive has seen use, performance degrades because the cell selected by wear leveling must first be erased. So good advice would be not to benchmark a brand new drive: first write 2-3 times the capacity of the drive (e.g. a 240GB SSD should have 500-750GB of lifetime writes) in order to simulate a real production scenario before starting tests. Another thing that affects this performance is Static Data Rotation, which is discussed in the first post of the series. Lifetime writes can be queried using CrystalDiskInfo and other free tools.

Disk parameters to check

A SSD Technology a Day (4) – Redundant Array of Independent Silicon Elements (RAISE)

RAISE is a technology developed by SandForce; the name stands for Redundant Array of Independent Silicon Elements. It is based on RAID (Redundant Array of Independent Disks) technology and is used to protect against write errors. SandForce controllers that implement this technology work in a way that is very similar to RAID 5. Every flash chip contains a number of dies (typically 8 or 16); each die is the equivalent of an HDD in a RAID 5 array, and data is spread across multiple dies. To recover from a failure in a sector, page or entire block, the missing data is recalculated from parity and the write is performed again in the same block. For more information, read the article on the SandForce website.
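The RAID-5-style recovery described above boils down to XOR parity. The sketch below (an illustration of the general parity technique, not SandForce's implementation) shows how the contents of one failed die can be rebuilt from the surviving dies plus the parity chunk:

```python
# Sketch of RAID-5-style parity across NAND dies, as RAISE is described to
# work: one chunk per stripe holds the XOR of the others, so any single
# missing chunk can be reconstructed. Purely illustrative.

def xor_parity(chunks):
    """Compute the parity chunk for a stripe of equal-length byte chunks."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)

def recover(surviving_chunks, parity):
    """Rebuild the one missing chunk from the survivors plus parity."""
    return xor_parity(surviving_chunks + [parity])

dies = [b"AAAA", b"BBBB", b"CCCC"]   # data spread over three dies
parity = xor_parity(dies)
# Die 1 fails; reconstruct its contents from the other dies and parity.
rebuilt = recover([dies[0], dies[2]], parity)
print(rebuilt)  # -> b'BBBB'
```

Because XOR is its own inverse, XOR-ing the survivors with the parity cancels out everything except the missing die's data.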

Crystal DiskInfo

You can use CrystalDiskInfo to check the number of errors that the RAISE technology recovered from.

A SSD Technology a Day (3) – Program and Erase Cycle (P/E)

One of the limitations of flash memory is that, while it can be read or programmed a byte or a word at a time in random-access fashion just like regular RAM, it can only be erased a “block” at a time. Erasing sets all bits in the block to 1, which is the default state for NAND memory.

Overwriting data in flash memory therefore involves 2 steps, Program and Erase (P/E): the data is programmed into a fresh (erased) block, and the old block must later be erased before it can be reused.

Programming can be done at the cell level (setting it to the “0” state) via a process called tunneling, in which the floating gate is flooded with high voltage from the on-chip charge pumps.

Erasing can be done only on an entire block (resetting it to the “1” state), by applying a high negative voltage that pulls the electrons off the floating gate via a process called quantum tunneling. Flash memory is divided into erase segments (often called blocks or sectors).
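The asymmetry described above (program clears individual bits, only a whole-block erase restores them) can be modeled in a few lines. This is a toy model for illustration, not real firmware; the block size and class names are made up:

```python
# Toy model of NAND program/erase semantics: programming can only clear
# bits (1 -> 0); returning any bit to 1 requires erasing the whole block,
# which costs one P/E cycle.

BLOCK_SIZE = 4  # bytes per erase block, tiny for the example

class NandBlock:
    def __init__(self):
        self.cells = bytearray([0xFF] * BLOCK_SIZE)  # erased state: all 1s
        self.pe_cycles = 0

    def program(self, offset, value):
        # AND with the existing contents: bits can only go 1 -> 0.
        self.cells[offset] &= value

    def erase(self):
        # Erase resets the entire block to 1s and costs one P/E cycle.
        self.cells = bytearray([0xFF] * BLOCK_SIZE)
        self.pe_cycles += 1

blk = NandBlock()
blk.program(0, 0x0F)        # clears the high four bits of byte 0
print(hex(blk.cells[0]))    # -> 0xf
blk.program(0, 0xF0)        # can only clear more bits, never set them
print(hex(blk.cells[0]))    # -> 0x0
blk.erase()                 # only an erase restores the 1s
print(hex(blk.cells[0]))    # -> 0xff
```

This is why in-place updates are impossible on flash: to change a 0 back to a 1, the controller must erase (and thus wear) the entire block.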


A SSD Technology a Day (2) – What’s the difference between SLC and MLC?

It’s time to dissect the two main types of flash chips in order to understand why not all SSDs are created equal. What, after all, is the physical difference between SLC and MLC?

SLC stands for Single Level Cell and, just like the name suggests, it stores one bit per NAND gate. An SLC cell therefore has two states, 0 or 1, based on the charge of the NAND gate.

SLC Levels

SLC Reference Levels


MLC, on the other hand, stands for Multi-Level Cell and uses multiple voltage threshold levels in order to store 2 or even 3 bits (the 3-bit variant is also called TLC, Triple-Level Cell) in the same NAND gate. This is done by coding 4 or even 8 states (in the case of 3-bit TLC) on the same gate, so a 2-bit MLC cell will hold one of the following states: 11, 10, 01 or 00. The benefit over SLC is the increased capacity per chip (2 or 3 times more), but at the same time the voltage reference levels are a lot tighter, which leads to more rapid degradation of the cell after many P/E (Program/Erase) cycles. Once the MLC NAND gate has degraded, reads are no longer predictable because the stored value overlaps the reference levels. In this case the memory will report an error or, if the controller supports it, the cell will be retired and replaced with one from the reserve capacity.

MLC Reference Levels

MLC Reference Levels (2bit cell)
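Reading a 2-bit MLC cell amounts to comparing the sensed voltage against three reference levels to pick one of the four states. The sketch below illustrates this; the threshold voltages are made-up numbers for the example, not real NAND specifications:

```python
# Illustrative decoder for a 2-bit MLC cell: the sensed voltage is compared
# against three reference levels to select one of four states. Threshold
# values are invented for the example.

THRESHOLDS = [1.0, 2.0, 3.0]          # reference voltages between states
STATES = ["11", "10", "01", "00"]     # erased (all 1s) up to fully programmed

def read_cell(voltage):
    for level, threshold in enumerate(THRESHOLDS):
        if voltage < threshold:
            return STATES[level]
    return STATES[-1]

print(read_cell(0.4))   # -> 11 (erased state)
print(read_cell(2.5))   # -> 01
```

This also shows why degradation hurts MLC more than SLC: as the stored charge drifts, a voltage that creeps across one of the tightly spaced thresholds decodes to the wrong state.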

The typical number of write cycles is fairly solid at around 100K for SLC and floats around 10K for MLC (different dies can have very different quality and will wear differently). This number is still high enough for a consumer lifecycle: if the entire memory is programmed 5 times daily, MLC lasts about 5 years, while SLC runs up to 50 years under the same usage.


Type of flash cell       SLC          MLC (2-bit)      TLC (3-bit)
Bits per cell            1            2                3
States stored            0, 1         00, 01, 10, 11   000, 001, 010, 011,
                                                       100, 101, 110, 111
Typical capacity/chip    32GB         64GB             96GB
Endurance (P/E cycles)   100K         10K-30K          <1K
Performance over time    Constant     Degrades         Quickly degrades
Application              Enterprise   Consumer         Thumb drives, camera cards

In the case of MLC, the program cycle takes 2 to 3 times longer than for SLC, since the programming signal has to be much more precise to code 4 states in the space of 2. This gives SLC memory higher speed and a higher number of IOPS (I/O Operations Per Second) compared to MLC.

A SSD Technology a Day (1) – Static Data Rotation

One of the main drawbacks of SSDs has been reliability. Every NAND cell has a certain prescribed number of Program/Erase (P/E) cycles, and once data is written to the drive, chances are it will remain unchanged for weeks or months. That means the cells storing that data keep the same wear level (used P/E cycles) for as long as the data stays unchanged. This becomes a problem because the remaining free cells are taxed even more and could reach their end of life, making the entire drive read-only or even failing it completely.
I discovered this technology while trying to explain the degraded performance of my new OCZ Vertex 3 SSD. I ran a bunch of tests using SQLIO, based on Jonathan Kehayias’ (Blog|Twitter) post about Parsing SQLIO Output to Excel Charts using Regex in PowerShell, with a 6GB file, and got some good results. I then started using the drive and installed a few VMs until 50% of the drive was full. At that point I kept running SQLIO and CrystalDiskMark tests, only to see the performance sink more and more.

Little did I know that the OCZ Vertex 3, which is based on the SandForce 2281 chipset, implements an intelligent Static Data Rotation algorithm as part of DuraClass (SandForce’s set of technologies for increasing drive reliability). This means the SSD controller actively rotates static data from intensively used cells to the least-used cells during idle periods, allowing the drive’s wear leveling to work at its best. But what happens when you stress test the disk and push about 3 times the drive’s capacity worth of data through it in a couple of hours while half the drive is full? The SandForce DuraClass algorithm will kick in and start moving data around even when the drive is not idle, and the user will see a decrease in performance until the wear level stabilizes.

Essentially, Static Data Rotation is there to make sure you can use the drive for the MTTF prescribed by the manufacturer, and it prevents premature wear on the cells that store hot data.
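The rotation idea can be sketched in a few lines. This is my own conceptual illustration of the technique described above, not SandForce's actual DuraClass algorithm: during idle time, static data parked on a lightly worn block is moved onto a heavily worn block, freeing the lightly worn block for future writes.

```python
# Conceptual sketch of static data rotation (assumptions mine): cold data
# sitting on a low-wear block is relocated to a high-wear block so the
# low-wear block can rejoin the write pool and even out total wear.

def rotate_static_data(blocks):
    """blocks: list of dicts with 'pe_count' and 'static' (holds cold data)."""
    static = [b for b in blocks if b["static"]]
    free = [b for b in blocks if not b["static"]]
    coldest = min(static, key=lambda b: b["pe_count"])   # least-worn static block
    target = max(free, key=lambda b: b["pe_count"])      # most-worn free block
    if coldest["pe_count"] < target["pe_count"]:
        # Relocating the data costs one erase on the source block.
        coldest["static"], target["static"] = False, True
        coldest["pe_count"] += 1
    return blocks

pool = [
    {"pe_count": 1, "static": True},    # cold data pinned on a fresh block
    {"pe_count": 9, "static": False},   # well-worn but idle block
]
rotate_static_data(pool)
print(pool[1]["static"])  # -> True: the worn block now holds the static data
```

This also explains the benchmark behavior described above: if rotation has to run while the drive is busy instead of idle, the relocation traffic competes with user I/O and performance drops.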

There is an interesting post on the OCZ Technology Forum about this.

UPDATE: Nitin Salgar (b|t) asked a very good question on Twitter after reading my post:

“Is Static Data Rotation in SSD a common phenomenon across all manufacturers?”

The answer is no; this is one of the strong selling points of the newer SandForce SSD controllers that implement DuraClass. Newer Intel controllers have this technology as well, but older ones do not. I would like to think that any enterprise-class controller has its own implementation of a Static Data Rotation algorithm.


New blog series: One SSD Technology a Day

It’s become a tradition on SQL blogs to start a month-long series on a certain topic and try to blog every day. Today I want to start my first series, on SSD technologies. It has been over a year since I started speaking on this topic, and since it is a hot one, there are new technologies that I feel are not well explained even on specialty blogs.

I will keep this as a master post and will add links to each of the posts  as I publish them.

  1. Static Data Rotation 
  2. What’s the difference between SLC and MLC?
  3. Program and Erase Cycle (P/E)
  4. Redundant Array of Independent Silicon Elements (RAISE)
  5. Wear Leveling
  6. CacheCade
  7. Bad Block Management


Presenting at SQL Saturday #119 Chicago

I will be presenting  my session on “Optimizing SQL Server I/O with Solid State Drives” at  SQL Saturday #119 in Chicago on Saturday May 19, 2012!

Since this is the 4th time I’m presenting this session and I keep accumulating ideas related to Solid State Drives, I decided to start a One SSD Technology a Day series. Stay tuned!

Location is:
DeVry University – Addison Campus,
1221 North Swift Road,  Addison, IL 60101-6106

View Larger Map

SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
If you are attending please stop by and introduce yourself. Also if you are on Twitter don’t forget to use hashtag #sqlsat119 and see what others are doing.

Hope to see you there!

Beer, brats, unicorns, paper airplanes and 36 awesome sessions

I’ve been meaning to write a blog post on last week’s SQL Saturday experience, and I finally got to work on it, so here it is:

Bored Member


After the previous weekend marked a record with 5 (five) SQL Saturday events all around the world on Apr 14 it was time for Wisconsin to show what they can do. And boy did they put on a great event. The organizing team did a fabulous job, helping 227 hungry minds learn about SQL Server from 33 speakers in 36 sessions organized on 6 tracks. They almost made it look easy.

I remember meeting the person responsible for this event 2 years ago at SQL Saturday #31 in Chicago. Back then, Jes Borland (b|t) had traveled 160 miles to volunteer at the event. She enjoyed it so much that she came back the next year as a speaker, and a year after that, with the help of her great friends from MADPASS, she put on an epic first-time event! This goes to show how important a single person can be in the SQL Family, and anyone who thinks they will never be able to rise to the level of some of the speakers should learn a lesson from Jes. Every marathon starts with a first step (and she will run one in Kenosha next week, too), and every little bit helps.

I left work Friday afternoon, after a pretty busy day, with Vince (b|t), and we headed north to the land of beer and cheese. We got to Madison, checked in at our hotel, and shortly after I was headed to the speaker dinner. The atmosphere was great, the beer was good, and the lasagna was too much for me to handle in one sitting. The highlight of the night was the paper airplane fight started by Aaron Lowe (b|t) and continued in part by me. We had a lot of fun, unaware that above our heads was a clothesline with underwear and shirts to complete the Italian experience.

The next day we woke up bright and early, had a frugal breakfast at the hotel, and headed to Madison Area Technical College for the event. We registered, grabbed some coffee and a bagel, and after finding the speaker room we chose Mike Donnelly’s (b|t) SSIS: Figuring Out Configuring session, his first SQL Saturday presentation ever. He did a great job with a demo-packed session that drew a very good audience.

Unicorns and gang signs w/ @StrateSQL


Next we decided to take Erin Stellato’s (b|t) DBCC Commands: The Quick and the Dangerous. She did a great job condensing all the DBCC commands into one session. I could not help dropping the “unicorns will die if you shrink the database” slogan, and the room liked it. The same room hosted an extraordinarily fun session on what happens when you think outside of the reporting box. Stacia Misner’s (b|t) (Way Too Much) Fun with Reporting Services was spiced up with a lot of humor with the help of her daughter Erika Bakse (b|t). They started playing Words With Friends inside an SSRS report and then went under the covers to explain how it is done, mainly how to use Report Actions in SSRS to create interactions inside a report.
Then it was lunchtime, and the menu included tasty brats, burgers and Cows of a Spot tables, the Wisconsin version of Birds of a Feather. I sat at the Data Profiling table and had a great conversation with Ira Whiteside (b|t) and his lovely wife Theresa Whiteside. They were very kind to share with me and Vince some of the experience accumulated over their vast careers working with Data Profiling and Data Quality, as well as some of the methods presented in Ira’s session, which started right after lunch. It was an extremely interesting session titled Creating a Metadata Mart w/ SSIS – Data Governance, and we learned a lot from it.

Then it was time for my session on Optimizing SQL Server I/O with Solid State Drives, a session that seems to be very popular (it was also selected for SQL Saturday #119 in Chicago, as well as Washington DC and Tampa before that). I got some great feedback that I will use to improve the format and add some fresh content (whiteboarding TRIM, Wear Leveling and Bad Block Management). I had the honor of having Norm Kelm (b|t), Craig Purnell (b|t) and Matt Cherwin (t) in the audience. We did an impromptu drawing for Ted Krueger’s book (signed by Ted himself), and we stayed in the same room for the last session of the day with Sanil Mhatre (b|t) on Asynchronous Programming with Service Broker. He did an excellent job, and it was his first SQL Saturday presentation ever.

It was a great day and an epic event by MADPASS which made me look forward to the next SQL Saturday in Madison.


Speaking at SQLSaturday #118 in Madison,WI

I am happy to announce that my session on “Optimizing SQL Server I/O with Solid State Drives” was selected for SQL Saturday #118 in Wisconsin’s capital, Madison on Saturday April 21, 2012!

Location is:
Madison Area Technical College – Truax Campus
3550 Anderson St, Madison, WI 53704

SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
If you are attending please stop by and introduce yourself. Also if you are on Twitter don’t forget to use hashtag #sqlsat118 and see what others are doing.

Hope to see you there!

Speaking at SQLSaturday #110 in Tampa,FL

I am happy to announce that my session on “Optimizing SQL Server I/O with Solid State Drives ” was selected for SQL Saturday #110 in Tampa, FL  on Saturday March 10, 2012!

Location is:
K-Force,
1001 East Palm Avenue,
Tampa, FL 33605

SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
If you are attending please stop by and introduce yourself. Also if you are on Twitter don’t forget to use hashtag #sqlsat110 and see what others are doing.

Updated:

Due to a family emergency, I had to cancel this speaking engagement; luckily, my slot was filled by another speaker. I hope to make it next year!

Weekly links for 11-21-2011

SQL Server 2012 Release Candidate is now Available from SQL Server Team blog

The Data Scientist from BuckWoody

7 Scaling Strategies Facebook Used to Grow to 500 Million Users from HighScalability

10 Sloppy Social Media Mistakes to Fix NOW from Hubspot Blog

What is Hadoop? And Why is it Good for Big Data?  from The Data Roundtable

Hacker Says Texas Town Used Three Character Password To Secure Internet Facing SCADA System from ThreatPost

The AdventureWorks2008R2 Books Online Random Workload Generator from Jonathan Kehayias