Saturday, November 19, 2016

Freaking Out About Running Out Of Disk Space On My Nimble AFA

So I am moving workloads over to a Nimble All Flash Array and I notice I am out of free space. Now I start freaking out, afraid my critical VMs are about to start crashing. I check the Nimble GUI and I am only using 20% of disk space after compression and deduplication. I know I am not running out, but VMware doesn't. It comes down to Nimble free space vs. VMware free space: the array reports what is physically consumed after data reduction, while VMware only sees the provisioned size of the volume.

As I start to move workloads off the Nimble, I already know the problem, just not what to do about it yet. When the Nimble volume was created, we chose to allocate the entire capacity to a single volume that would be mounted on a cluster of VMware hosts. The total free space is 7TB.

As I run through the issue and ping my account SE and a friend who knows this stuff better than I, I consider how Nimble is supposed to represent actual free space. Or better yet, how is it going to dynamically show the volume size? Starting out, there are no compression and deduplication savings, so the volume size is the maximum free space. While I want it to change the volume size dynamically based on actual dedupe and compression ratios, Nimble doesn't.

I proceed to the Nimble GUI and navigate to my single large volume. As I go through the volume configuration, I decide to change the volume size, or at least see if I can. Right there above the volume size is a blurb of text advising that you can create a volume larger than the free space because of deduplication and compression. It would be nice if the GUI gave me some guidance on how big based on the current ratios, but it doesn't. With nearly 5X space reclamation on 7TB, I could probably choose 35TB. So I choose 15TB for now.
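Since the GUI doesn't do the math for me, here is a rough sketch of the arithmetic I was doing in my head. This is not a Nimble tool or API; the function names and the safety factor are my own made-up placeholders, and it assumes the current data reduction ratio holds as more data lands on the array, which it may not.

```python
# Back-of-the-envelope sizing for a thin-provisioned volume on an array
# that compresses and deduplicates. All names here are hypothetical.

def estimate_volume_size_tb(physical_tb, reduction_ratio, safety_factor=0.5):
    """Provisioned size to present to the hosts, in TB.

    physical_tb     -- usable physical capacity on the array
    reduction_ratio -- observed savings, e.g. 5 for 5X
    safety_factor   -- fraction of the theoretical ceiling to actually use,
                       since the ratio can drop as the data mix changes
    """
    if physical_tb <= 0 or reduction_ratio < 1:
        raise ValueError("capacity must be positive and ratio at least 1")
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    return physical_tb * reduction_ratio * safety_factor


def physical_needed_tb(logical_tb, reduction_ratio):
    """Physical space a given amount of logical data would consume."""
    if reduction_ratio < 1:
        raise ValueError("ratio must be at least 1")
    return logical_tb / reduction_ratio


# My numbers: 7TB physical, roughly 5X reduction.
ceiling = estimate_volume_size_tb(7, 5, safety_factor=1.0)   # theoretical max
cautious = estimate_volume_size_tb(7, 5)                     # half the ceiling
print(ceiling, cautious)

# If the 15TB volume I chose filled up at 5X, how much flash would it use?
print(physical_needed_tb(15, 5))
```

The 15TB I picked sits well under the 35TB ceiling, so even if the ratio slips quite a bit the array itself should not run dry before VMware thinks the datastore is full.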

Back in VMware vCenter, I rescan the volume on each host and try to resize the datastore from each host that has it mounted, to no avail. I make an educated guess and connect directly to a host. I pick the host I first used to mount and format the volume and attempt a resize from there. Sure enough, it allows me to resize. Back in vCenter, I go to each host again and rescan, which now shows the 15TB of total space. I cancel the storage vMotions that were abandoning the storage and go back to moving the final set of workloads onto the Nimble.

Crisis averted, and I never needed help from the SE or my expert friend.

The Nimble AFA has been performing incredibly well with sub-millisecond latency. My jobs are performing quite predictably, which is critical for the workload. Further, the Nimble AFA is saving me around 30-45 minutes over the fastest time from my next gen hybrid array. The Nimble AFA is a better match to the workload than the next gen hybrid array, which experiences unpredictable latency, causing my jobs to take anywhere between zero and 6 hours of additional time. Of course time will tell, but so far Nimble is stellar.



