Are SSDs Ready For The Data Center?
Solid-state drives (SSDs) have recently become a popular storage medium in laptops, where they are praised for their speed and shock resistance. Whether SSDs are ready for the data center is still debated, however, so here we aim to list the advantages and disadvantages of using SSDs instead of conventional hard disks in servers, from a variety of viewpoints.
Before we consider how an SSD fits into a typical data center scenario, let’s first consider the advantages and disadvantages of the SSD hardware itself.
Advantages:
- Fast – no spin-up required
- Random access – no head movement
- Quiet operation
- High operating environment tolerance
- High mechanical reliability – no moving parts in the drive itself
- Fault tolerance – most failures occur on writing, not reading
Disadvantages:
- Cost – HAVE YOU SEEN HOW EXPENSIVE THEY ARE PER GB?
- Power consumption – goodbye green data center
- Slow small writes – flash is erased and rewritten in whole blocks, so small writes are costly
- Size – not as much storage space as an HDD; see cost
- Limited write cycles – it’ll crap out sooner than an HDD due to flash wear and tear
- Firmware-caused fragmentation – the “wear leveling” in newer SSD firmware causes hardware-level fragmentation that software can’t fix, regardless of filesystem
Now that we’ve established the advantages and disadvantages of SSD hardware, let’s look at it in a data center scenario. We’ll start with a basic web farm to build upon: a group of five or so Linux servers, one of which provides NFS-shared storage to the rest, using SSDs as the storage medium. For comparison, we’ll also throw in a database server that uses an SSD as its storage medium, and the operating system on every server will be stored on SSDs as well (TFTP-booting would make this too easy).
This setup is not very different from our own servers, which makes it easy for us to compare it to SSD storage (and, hopefully, to convey the comparison to our readers).
It is also important to establish the software used in this scenario: the ext3 journaling filesystem on all drives, a mix of lighttpd and apache2 as the web server software, and either PostgreSQL or MySQL on the database server. All of this factors into the servers’ fault tolerance.
The most stress-inducing factor for the SSDs is easily logging and journaling. Above, we said that SSDs suffer from slow small writes, hardware-based fragmentation, and limited write cycles. With these disadvantages in mind, you can flat-out forget about logging to the drives and expecting them to last: both web servers in this scenario append log data one line at a time, 30-60 characters (1 byte each) per line, to a file stored on the SSD. Combined with ext3’s journaling and the SSD firmware’s “wear leveling”, that will slap the SSDs around quite a bit.
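As a rough illustration of why one-line log writes are so costly, the sketch below computes a worst-case write amplification figure. It assumes a 4 KiB flash page (an illustrative value, not taken from any specific drive) and that every log line reaches the flash as its own write. In practice the OS page cache and the drive's own buffering usually merge some of these writes, and journaling adds overhead that is not modeled here. The function names are hypothetical.

```python
import math


def pages_touched(write_bytes: int, page_bytes: int = 4096) -> int:
    """Number of flash pages that must be programmed to store one write."""
    return max(1, math.ceil(write_bytes / page_bytes))


def write_amplification(write_bytes: int, page_bytes: int = 4096) -> float:
    """Bytes physically programmed divided by bytes logically written."""
    return pages_touched(write_bytes, page_bytes) * page_bytes / write_bytes


# Log lines of 30 and 60 bytes, as in the scenario above.
for line_len in (30, 60):
    print(line_len, round(write_amplification(line_len), 1))
# 30 136.5
# 60 68.3
```

Under these assumptions, a 30-byte log line costs over a hundred times its own size in flash writes, which is why the logging ends up dominating the wear.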
So, with this in mind, let’s change the scenario a little: we’ll use small hard drives (or TFTP-booting) for the operating system storage of the smaller, web-server-only systems to cut cost, leaving only the database server and the NFS server on SSDs. The NFS server will have the shared NFS directory (we’ll say /home) mounted from the SSD as a non-journaling ext2 partition, while the rest of the system (namely /var and the dpkg directories) will be mounted from a good old-fashioned HDD RAID array. This keeps the majority of the writing, such as logging and other frequently changing files, off the SSD.
For the NFS server, this is almost the best way to go: the SSD is now predominantly read from instead of written to, and since it is not journaled and data is written to it in larger portions, its estimated life expectancy is longer. The NFS server will also see all of the benefits of the SSD pretty clearly: with no spin-up time, the only NFS-related latency is the network (1 Gbps recommended), so the shared web data is distributed to the rest of the servers pretty quickly.
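To illustrate why writing in larger portions is gentler on flash, here is a minimal sketch that collects small lines in memory and hands them to the underlying file in batches. It is not part of the setup described above, which simply moves logging to an HDD. The class `BatchedLog` and its 4 KiB threshold are hypothetical choices for the example.

```python
import io


class BatchedLog:
    """Collects small log lines and writes them to `sink` in larger batches."""

    def __init__(self, sink, batch_bytes: int = 4096):
        self.sink = sink              # any binary file-like object
        self.batch_bytes = batch_bytes
        self._pending = []            # encoded lines not yet written
        self._size = 0                # total bytes currently pending
        self.flushes = 0              # number of writes issued to the sink

    def write_line(self, line: str) -> None:
        data = (line + "\n").encode("utf-8")
        self._pending.append(data)
        self._size += len(data)
        if self._size >= self.batch_bytes:
            self.flush()

    def flush(self) -> None:
        if not self._pending:
            return
        self.sink.write(b"".join(self._pending))
        self._pending.clear()
        self._size = 0
        self.flushes += 1


sink = io.BytesIO()                   # stands in for a file on the SSD
log = BatchedLog(sink, batch_bytes=4096)
for _ in range(1000):
    log.write_line("x" * 49)          # 50 bytes per line including newline
log.flush()                           # write out whatever is left
print(log.flushes)                    # 13 writes instead of 1000
```

The trade-off is that lines still held in memory are lost if the process crashes before the next flush, so the batch size is a balance between wear and durability.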
But what about the database server’s SSD? Depending on its configuration, modern database software writes data to a single database file or to a series of files, in small portions or chunks. Most of the resources a database server consumes are in-memory caches, and because of the writes we would generally recommend using an HDD with a database server. Tables and rows are often cached in memory anyway, so a fast RAID array of HDDs would perform just as well as an SSD without the drawbacks of SSD writes.
Now that we’ve considered a web farm data center scenario (and, by extension, NFS, databases, and general software/filesystem theory), let’s consider using an SSD in a file server, an FTP server, or a Git/CVS/SVN server, just for fun. We’ve already covered the advantages of a sufficiently partitioned ext2/ext3 SSD/HDD Linux system, so we’ll now move on to NTFS and ZFS as the filesystems in our scenarios.
First off, the FTP server scenario shows the performance benefits of an SSD better than almost any of the others, mainly when downloading files, thanks to the lack of seek latency and spin-up (although the power consumption might be excessive). However, logging still carries the same penalty under most FTP solutions, and both of our filesystems are subject to it. Both ZFS and NTFS can be configured to mount a directory from a separate disk, though, so logging to an HDD is still an option to save wear and tear on the SSD.
ZFS has also been designed with SSDs in mind: it caches both reads and writes to reduce stress on the hardware, whereas NTFS currently has no performance features aimed specifically at SSDs.
What affects FTP and file sharing the most in this scenario is writing data, i.e. uploading files: constant small writes to the disk will all but obliterate the life expectancy of an SSD, which is not negligible given the high cost and small capacity SSDs typically come with. The small capacity is also a penalty for non-pooled storage in a file/FTP server. And if SSDs keep dying because they exceed their write cycles, or become so hardware-fragmented that they are no better than HDDs, the total cost of ownership blows up beyond that of an HDD-based RAID system.
With that said, using an SSD for a versioning server would just be suicide: constant writes steadily use up the limited write cycles an SSD has, and if the SSD is paired with a non-caching journaling filesystem or also holds the logging that goes with versioning, you’re guaranteed no more than a year and a half per disk.
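The reasoning behind a lifetime figure like this can be sketched as simple arithmetic: total writable data is capacity times rated write cycles, reduced by write amplification, then divided by the daily write volume. The function and all numbers below are hypothetical illustrations, not measurements or vendor ratings, and the model assumes wear is spread perfectly evenly across the drive.

```python
def estimated_lifetime_years(
    capacity_gb: float,
    write_cycles: float,
    host_writes_gb_per_day: float,
    write_amplification: float = 1.0,
) -> float:
    """Rough SSD lifetime in years, assuming perfectly even wear."""
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write volume and amplification must be positive")
    # Total host data the drive can absorb before its cells wear out.
    total_writable_gb = capacity_gb * write_cycles / write_amplification
    return total_writable_gb / host_writes_gb_per_day / 365.0


# Hypothetical drive: 64 GB, 10,000 write cycles per cell,
# 100 GB written per day, write amplification of 10.
print(round(estimated_lifetime_years(64, 10_000, 100, 10), 2))  # 1.75
```

Halving the write amplification, for example by keeping logs and journals off the SSD, doubles the estimate, which is the same lever the web farm scenario pulls.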
On top of the software scenarios discussed above, the cost of the disks, combined with the operating cost (read: electric bill) of running SSDs plus an HDD to take the logging stress off them, makes this solution almost too expensive. Someone once suggested using SSDs in a RAID array, but the cost points we just made and the combined disadvantages of SSDs make a RAID array of SSDs almost infeasible. We also don’t know of many motherboard manufacturers implementing this specifically for SSDs at the hardware level, presumably for the same reason.
So, what is our verdict as far as using SSDs in the data center? In most scenarios, it’s a very risky (and costly) venture, especially if used without a backup disk or secondary HDD for logging purposes or just general fault-tolerance.
The web farm scenario above, with an ext2 SSD NFS-shared over the network to the other web servers and an ext3 disk for logging and operating system storage, was by far the best use of an SSD in a data center. Even then, it only holds in a controlled environment, e.g. without public hosting, since that causes excessive writes and leaves you no better off than with an SSD in a file server.
For reading, SSDs are great: random access, fast seek times, and no moving parts make them perfect for any application serving large amounts of read-only data, e.g. an image-only host or a static content provider in a file, FTP, or web farm. Writing, on the other hand, is where SSDs suffer in the data center, not to mention the operating costs of running them.