Approximately 11 months ago, Vadim reported some test results from the Virident FlashMax 1400M, an MLC PCIe SSD device. Since that time, Virident has released the FlashMAX II, which promises both increased capacity and increased performance over the previous model. In this post, we present some benchmark results comparing this new model to its predecessor, and we find that indeed, the FlashMax II is a significant upgrade.
For reference, all of the FlashMax II benchmarks were performed on a Dell R710 with 192GB of RAM. This is a dual-socket Xeon E5-2660 machine with 16 physical and 32 virtual cores. (I had originally planned to use the Cisco UCS C250 that is often used for our test runs, but that machine ran into some unrelated hardware difficulties and was ultimately unavailable.) The operating system in use was CentOS 6.3, and the filesystem used for the test was XFS, mounted with both the noatime,nodiratime options. The card was physically formatted back to factory default settings in between the synchronous and asynchronous test suites. Note that factory default settings for the FlashMax II will cause it to be formatted in “maxcapacity” mode rather than “maxperformance” mode (maxperformance reserves some additional space internally to provide better write performance). In “maxcapacity” mode, the device tested provides approximately 2200GB of space. In “maxperformance” mode, it’s a bit less than 1900GB.
Without further ado, then, here are the numbers.
First, asynchronous random writes:
There is a warmup period of around 18 minutes or so, and after about 45 minutes the performance stabilizes and remains effectively constant, as shown by the next graph.
Once the write performance reaches equilibrium, it does so at just under 780MiB/sec, which is approximately 40% higher than the 550MiB/sec exhibited by the FlashMax 1400M.
Asynchronous random read is up next:
The behavior of the FlashMax II is very similar to that of the FlashMax 1400M in terms of predictable performance; the standard deviation on the asynchronous random read throughput measurement is only 5.7MiB/sec. However, the overall read throughput is over 1000MiB/sec better with the FlashMax II: we see a read throughput of approximately 2580MiB/sec vs. 1450MiB/sec with the previous generation of hardware, an improvement of roughly 80%.
Finally, we take a look at synchronous random read.
At 256 threads, read throughput tops out at 2090MiB/sec, which is about 20% less than the asynchronous results; given the small bump in throughput going from 128 to 256 threads and the doubling of latency that was also introduced there, this is likely about as good as it is going to get.
For comparison, the FlashMax 1400M synchronous random read test stopped after 64 threads, reaching a synchronous random read throughput of 1345MiB/sec and a 95th-percentile latency of 1.49ms. With those same 64 threads, the FlashMax II reaches 1883MiB/sec with a 95th-percentile latency of 1.105ms. This represents approximately 40% more throughput, 25% faster.
In every area tested, the FlashMax II outperforms the original FlashMax 1400M by a significant margin, and can be considered a worthy successor.