This is a blog post that has been long overdue. I have blogged about Nimble Storage a couple of times while at VMworld, and Devin Hamilton (Director of Storage Architecture and Nimble’s first ever customer-facing engineer) was also a guest on one of the HandsonVirtualization podcasts we recorded in the past. Around 6 months ago I sat down with a good friend of mine, Nick Dyer. At the time Nick had been with Nimble for only a few months, having previously been at Xsigo and Dell EqualLogic. We discussed who Nimble are and what makes them different from everyone else in the marketplace, and Nick also gave me a tour of the product’s features and functionality.
Very recently Nimble announced Nimble OS 2.0, which this walkthrough is based on. Big thanks to Nick for helping me update it from 1.x to 2.x.
Home Screen
The home screen gives you a good overview of what is happening within your storage array. On the left we can see a breakdown of the storage usage, including snapshots; below this we can see our space saving, achieved using the in-line compression technology for both primary and snapshot data. In the middle we have a breakdown of throughput in MB/sec and IOPS, split by reads and writes. Finally, on the right we have a breakdown of events over the last 24 hours.
Array Management
Prior to Nimble OS 2.0, the architecture was a frame-based, scale-up design: you start with a head unit that contains 2 controllers, 12 high-capacity spinning disks and a number of high-performance SSDs. You can then increase capacity by attaching up to a further 3 shelves of high-capacity drives using the SAS connectors on the controllers, or scale performance by upgrading the controllers or swapping in larger SSDs. What is different about Nimble is that the architecture does not rely on drive spindles to deliver performance, as traditional storage arrays do; instead it uses multiple Intel Xeon processors to drive IOPS from the array. Nimble have now released version 2.0 of their software, meaning that scale-out is available as a third scaling method for any Nimble array. This now forms part of a Nimble Storage array “Group”. Today Nimble supports up to 4 arrays in a group, each array supporting 3 shelves of additional disk. The theoretical maximums are thus ~280,000 IOPS and 508TB of usable storage in a scale-out cluster!
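To put those group maximums in context, they are simply the per-array figures multiplied by the four arrays allowed in a group. A rough sketch of the arithmetic in Python (the per-array figures of ~70,000 IOPS and 127TB usable are my assumptions, inferred by dividing the group totals above by four):

```python
# Rough arithmetic behind the quoted scale-out maximums. The per-array
# figures are assumptions inferred from the group totals, not official specs.
PER_ARRAY_IOPS = 70000        # assumed top-end array
PER_ARRAY_USABLE_TB = 127     # assumed head unit plus 3 expansion shelves
MAX_ARRAYS_PER_GROUP = 4      # Nimble OS 2.0 scale-out limit

group_iops = PER_ARRAY_IOPS * MAX_ARRAYS_PER_GROUP
group_usable_tb = PER_ARRAY_USABLE_TB * MAX_ARRAYS_PER_GROUP

print("Theoretical group maximums: ~{:,} IOPS, {}TB usable".format(group_iops, group_usable_tb))
# -> Theoretical group maximums: ~280,000 IOPS, 508TB usable
```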
We can see in the screenshot below, taken on a different system, that there are a number of shelves configured and that we have active and hot-standby controllers configured.
Nimble use an architecture called CASL (Cache Accelerated Sequential Layout). This is made up of a number of components: SSDs are utilised as a random-read cache for hot blocks within the system, while random writes are coalesced in NVRAM, compressed and written sequentially to the RAID 6 near-line SAS spinning disks, resulting in write operations that Nimble claim can be up to 100x faster than traditional disk alone.
The compression within the Nimble storage array happens inline with no performance loss and can offer between 30 and 75 percent savings depending on the workload.
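To make the write path a little more concrete, here is a minimal Python sketch of the general idea: absorb incoming random writes in a buffer (standing in for NVRAM), then compress and flush them to disk as one large sequential stripe. This is purely illustrative of the concept, not Nimble’s actual implementation; the 4KB block size, stripe size, zlib compression and sample data are all my own assumptions.

```python
import os
import zlib

BLOCK_SIZE = 4096                        # assumed block size, for illustration only

class WriteCoalescer:
    """Toy illustration of the CASL write path: buffer random writes,
    compress them, and flush them as one large sequential stripe."""

    def __init__(self, stripe_blocks=256):
        self.buffer = []                 # stands in for mirrored NVRAM
        self.stripe_blocks = stripe_blocks
        self.raw_bytes = 0
        self.written_bytes = 0

    def write(self, block):
        self.buffer.append(block)        # random write acknowledged from the buffer
        if len(self.buffer) >= self.stripe_blocks:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        stripe = b"".join(self.buffer)   # many random writes become one sequential stripe
        compressed = zlib.compress(stripe)
        self.raw_bytes += len(stripe)
        self.written_bytes += len(compressed)
        # a real array would now lay this stripe out across the RAID 6 disk group
        self.buffer = []

    def savings_pct(self):
        return 100.0 * (1 - self.written_bytes / float(self.raw_bytes))

coalescer = WriteCoalescer()
for _ in range(1024):                    # simulate 1024 random 4KB writes
    block = os.urandom(BLOCK_SIZE // 2) + b"\x00" * (BLOCK_SIZE // 2)  # half-compressible sample data
    coalescer.write(block)
coalescer.flush()
print("Inline compression saved roughly {:.0f}% of the data written".format(coalescer.savings_pct()))
```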
Check out the following page for more information on CASL – http://www.nimblestorage.com/products/architecture.php
One of the nice features in the GUI is that when you hover over a port on the array screen it highlights the corresponding port on the array and displays the IP address and status on screen.
When configuring the array with your ESXi or Windows servers you will use the target IP address shown below to configure your storage connectivity.
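As a rough sketch of what that looks like on the host side for ESXi, the snippet below adds the array’s target IP as a dynamic (send targets) discovery address and rescans the adapter. It simply wraps two esxcli calls in Python for readability; the syntax assumes ESXi 5.x esxcli, and the adapter name vmhba33 and IP address are placeholders, so check both against your own environment (for Windows you would use the iSCSI Initiator instead).

```python
import subprocess

# Placeholders: substitute your software iSCSI adapter and the array's
# discovery/target IP from the Nimble GUI. Syntax assumes ESXi 5.x esxcli.
ISCSI_ADAPTER = "vmhba33"
NIMBLE_TARGET_IP = "192.168.10.50:3260"

def run(cmd):
    print("+ " + " ".join(cmd))
    subprocess.check_call(cmd)

# Add the Nimble target IP as a dynamic (send targets) discovery address
run(["esxcli", "iscsi", "adapter", "discovery", "sendtarget", "add",
     "--adapter", ISCSI_ADAPTER, "--address", NIMBLE_TARGET_IP])

# Rescan the adapter so newly presented volumes show up as devices
run(["esxcli", "storage", "core", "adapter", "rescan",
     "--adapter", ISCSI_ADAPTER])
```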
The network configuration on the array is easily managed. Nimble now has a dedicated “Networking” tab in the administration menu, where settings for the Group or for individual arrays can be changed. From here we can also configure a new technology Nimble call “Virtual Target IP addresses”, as well as create “Network Zones” to stop multipath traffic traversing and saturating inter-switch links. Both of these topics deserve a blog post of their own! It is now also possible to create multiple routes on the array, to allow for dedicated replication traffic, for example.
Any individual port can be configured to be on the management network, the data network, or both.
It’s now also possible to save your network changes as a “Draft”, and to revert your network settings back to the previously applied configuration – very handy in case something goes wrong!
Adding Arrays to a Group
To deploy a Nimble array into the group, it is now as simple as clicking a button in the Nimble GUI under the “Arrays” page. We did this on a pair of fully-functional Nimble VMs.
The Group will then detect the unconfigured Nimble array (which must be on the same layer 2 broadcast domain). It is also possible to merge two Groups together from this screen!
From here all that’s required are the physical network IP addresses for the new array’s data ports. It will inherit all other configuration from the Group (replication, email alerts, Autosupport, performance policies, initiator groups and more). This is a non-disruptive process, too!
Once IP addresses are configured, the new array is provisioned in the Group in the “default” storage pool.
Initiator Groups
Initiator groups are used to manage access to the volumes. You start off by creating an initiator group for the servers that will require access to the same volumes (in this example, ESXi hosts), and then map your volumes to the initiator group.
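Conceptually it is just a mapping from host initiators (IQNs) to the volumes they are allowed to see. A tiny Python sketch of that idea, with made-up IQNs and volume names:

```python
# Toy model of initiator-group access control. All IQNs and volume
# names are made up for illustration.
initiator_groups = {
    "esxi-cluster-01": {
        "initiators": [
            "iqn.1998-01.com.vmware:esxi01-12345678",
            "iqn.1998-01.com.vmware:esxi02-87654321",
        ],
        "volumes": ["vmfs-datastore-01", "vmfs-datastore-02"],
    },
}

def host_can_access(iqn, volume):
    """True if any initiator group grants this IQN access to the volume."""
    return any(iqn in g["initiators"] and volume in g["volumes"]
               for g in initiator_groups.values())

print(host_can_access("iqn.1998-01.com.vmware:esxi01-12345678", "vmfs-datastore-01"))  # True
print(host_can_access("iqn.1998-01.com.vmware:esxi01-12345678", "sql-logs"))           # False
```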
Performance Policies
Performance policies are used to control the caching, compression and block size for a volume, tuning these settings to suit the use case. Out of the box there are a number already configured for the most frequent use cases, however it is entirely possible to create your own with your own requirements (for example, a volume that will never be cache-worthy, which is very useful for backup-to-disk volumes). This is useful because traditional storage arrays that use flash as a tier or cache very rarely have the intelligence to keep cache-unworthy data away from these very expensive resources.
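Boiled down, a performance policy is just a per-volume bundle of block size, compression and cache settings. The sketch below is illustrative only; the policy names and values are my assumptions, not Nimble’s shipping defaults.

```python
import collections

# Illustrative only: policy names and settings are assumptions, not
# Nimble's shipping defaults.
PerformancePolicy = collections.namedtuple(
    "PerformancePolicy", ["block_size_kb", "compress", "cache_enabled"])

policies = {
    "ESX datastore":  PerformancePolicy(block_size_kb=4,  compress=True, cache_enabled=True),
    "SQL Server":     PerformancePolicy(block_size_kb=8,  compress=True, cache_enabled=True),
    "Backup to disk": PerformancePolicy(block_size_kb=32, compress=True, cache_enabled=False),
}

# A backup target never competes with production volumes for flash:
print(policies["Backup to disk"].cache_enabled)   # False
```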
Snapshots
Volume collections are utilised for replicating and snapshotting volumes. A volume collection may contain multiple volumes, allowing you to synchronise snapshots and/or replication across multiple linked volumes. This may be useful for VMFS volumes that contain multiple related VMs, for example, or for your SQL log and database volumes.
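The key point is that every volume in a collection is snapshotted (or replicated) together against the same schedule, so the copies line up in time. A small sketch of that coordination, with hypothetical volume names and schedule:

```python
import datetime

# Hypothetical collection: SQL database and log volumes protected together
# so that their snapshots always share the same point in time.
volume_collection = {
    "name": "sql-production",
    "volumes": ["sql-db", "sql-logs"],
    "schedule": "every 15 minutes, retain 96 snapshots",
}

def snapshot_collection(collection):
    """Take one coordinated snapshot across every volume in the collection."""
    stamp = datetime.datetime.utcnow().strftime("%Y-%m-%d-%H%M%S")
    return ["{0}-{1}-{2}".format(vol, collection["name"], stamp)
            for vol in collection["volumes"]]

print(snapshot_collection(volume_collection))
# e.g. ['sql-db-sql-production-2014-...', 'sql-logs-sql-production-2014-...']
```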
Snapshots and replicas can be made fully consistent with the use of VSS integration direct from the array, without the need to install additional software.
As Nimble uses a variable block size of 4/8/16/32KB, snapshots and replication are generally very space efficient when compared to other arrays using larger block sizes. Also, all snapshots use compressed blocks, so it is not uncommon to see snapshots taken and retained for longer than 30 days on the array.
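A quick back-of-the-envelope example of why the block size matters for snapshot space: when an application overwrites 4KB, an array tracking 4KB blocks only has to preserve 4KB of old data in the snapshot, whereas an array using a larger block preserves the whole block. The 16KB and 64KB comparison figures below are my own assumptions for illustration.

```python
# Space preserved in a snapshot after a burst of small overwrites.
# The 16KB and 64KB comparison figures are assumptions for illustration.
overwrites = 10000        # random 4KB overwrites since the snapshot was taken
app_io_kb = 4             # size of each application write

for array_block_kb in (4, 16, 64):
    preserved_mb = overwrites * max(app_io_kb, array_block_kb) / 1024.0
    print("{}KB block array preserves ~{:,.0f}MB (before compression)".format(
        array_block_kb, preserved_mb))

# 4KB  block array preserves ~39MB
# 16KB block array preserves ~156MB
# 64KB block array preserves ~625MB
```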
As snapshots are so granular and carry no performance overhead, the current limits are 10,000 snapshots per array group and 1,000 per volume.
The image below shows the average snapshot change rate as a daily percentage that Nimble customers see for key use cases.
Volumes
Within the Volumes view under the Manage menu, selecting the Performance tab lets you see at a glance the performance and compression of each volume over the last 5 minutes.
Individual Volume Breakdown
By selecting an individual volume in this view you get a more detailed breakdown of the configuration and performance utilisation of that volume. We are also able to edit the volume, take it offline or delete it from this same screen.
Individual Volume Snapshot Tab
By selecting the Snapshots or Replication tabs in the individual volume view you get a detailed breakdown of usage, including the date and name of each snapshot/replica, its origin and schedule, as well as how much new data is kept within the snapshot and what compression ratio was achieved.
Replication Partner
Replication Partners are easily configured via a simple wizard found under Manage > Protection > Replication Partners. Nimble give you the flexibility to decide where your replication traffic is presented: it can run over either the management or the data networks you have.
What I really liked about the replication configuration was the built-in quality of service that allows you to throttle replication; this could be extremely important for a small business using a single line for both replication and other business traffic.
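A rough worked example of why that throttle matters (all the figures here are my assumptions, not Nimble numbers): with 20GB of compressed daily change over a 20Mbit/s line, uncapped replication would saturate the link for a couple of hours, while capping it leaves headroom for everything else at the cost of a longer replication window.

```python
# Back-of-the-envelope replication timing. All figures are assumptions
# chosen purely to illustrate why a bandwidth cap on replication is useful.
daily_change_gb = 20          # compressed changed data to replicate per day
line_mbit = 20                # total WAN bandwidth available
replication_cap_mbit = 8      # QoS cap applied to replication traffic

def hours_to_replicate(gb, mbit_per_s):
    return gb * 8 * 1024 / mbit_per_s / 3600.0   # GB -> Mbit, then seconds -> hours

print("Uncapped: {:.1f} h, but the line is saturated while it runs".format(
    hours_to_replicate(daily_change_gb, line_mbit)))
print("Capped at {} Mbit/s: {:.1f} h, leaving {} Mbit/s for other traffic".format(
    replication_cap_mbit,
    hours_to_replicate(daily_change_gb, replication_cap_mbit),
    line_mbit - replication_cap_mbit))
```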
After configuring replication you get a very clear view of the policies configured and the volume collections being replicated; you can also see the lag between production and DR.
Administration
The Nimble arrays contain dial-home support functionality called InfoSight. Each array contains around 30 million sensors; when enabled, every 5 minutes the results from those sensors are rolled up into a log bundle and transmitted to Nimble support. Nimble’s systems are then able to detect issues and failures and automatically raise cases, in many instances before the customer even knows. According to Nick, over 90% of all Nimble support cases today are automatically generated and resolved.
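Conceptually the pipeline is straightforward: sample a large number of counters, roll them up into a compressed bundle every few minutes, and ship it home for analysis. A toy sketch of that idea follows; the sensor names, sampling rate and payload format are invented for illustration and are not InfoSight’s actual format.

```python
import gzip
import json
import random
import time

# Invented sensor names; this is not InfoSight's actual data model.
SENSORS = ["ctrl_a.cpu_pct", "ctrl_a.nvram_util_pct", "cache.hit_rate_pct", "disk0.read_iops"]

def collect_sample():
    return {name: round(random.uniform(0, 100), 2) for name in SENSORS}

def build_bundle(samples):
    payload = json.dumps({"collected_at": time.time(), "samples": samples}).encode("utf-8")
    return gzip.compress(payload)      # compressed bundle shipped home every 5 minutes

samples = [collect_sample() for _ in range(300)]   # e.g. one sample per second for 5 minutes
bundle = build_bundle(samples)
print("Bundle of {} samples compresses to {} bytes".format(len(samples), len(bundle)))
```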
Firmware updates are easily handled within the array itself, allowing you to check version information, download the latest firmware and upgrade the unit.
By default all volume and snapshot space on the Nimble array is thin provisioned; this can be customised for new volumes by configuring the volume reserve seen above.
Monitoring
There are a number of monitoring options within the Nimble array, all of which can be found under the Monitor tab on the top menu. The example shows the performance across the array; you can customise this view to show performance over a time period from the last 5 minutes to the last 90 days as standard, and you can also focus on an individual volume.
Conclusions
That’s it for my Nimble array walkthrough; I intend to delve a little deeper when possible in the future. I really like what Nimble are doing in this space, as they appear to be doing something different to most, and when you dig deeper all the technical design decisions certainly make a lot of sense. Based on the results I am hearing about, customers seem to be very happy. Of course there are a huge number of ways to deliver the IO for your infrastructure, but Nimble are certainly cementing their place as a validated disruptive technology in this arena.
Something that interests me greatly is the use case for VDI. Speaking to Devin, the arrays even love mixed VDI and server workloads: because writes are coalesced through the NVRAM, random workloads aren’t a problem.