vSAN – Check VM Storage Policy & Compliance

As I continue to work with vSAN I discover there’s way more to do than just move some VMs over and you’re on your way. With multiple vSAN clusters, each with different configurations, I needed a way to monitor the current setup and check for changes. While creating a simple script to check which VM Storage Policy is assigned to each VM isn’t very difficult, checking the storage policy of VMs across multiple vSAN datastores proved to be a little trickier.

We run multiple PowerCLI scripts to check health and configuration drift (thanks to a special tool created by Nick Farmer) in our environment. In the event that a new vCenter is added or a new vSAN datastore is deployed, we needed a simple script that can be run without any intervention or modification. Now we can be alerted when the proper VM storage policy isn’t assigned or the current policy is out of compliance.

To further complicate things in our setup, we create a new VM Storage Policy that contains the name of the cluster to which it’s assigned. Due to the potential differences in each vSAN cluster (stripes, failures to tolerate, replication factor, RAID, etc.), having a single storage policy does not work for us. In the event a VM is migrated from one vSAN cluster to another, we need to check that the VM storage policy matches the policy of its new cluster’s vSAN datastore.

What this script does is grab all the clusters in a vCenter that have vSAN enabled. For each vSAN-enabled cluster it finds, it filters for only the VMs that live on vSAN storage (on a datastore named “<cluster>-vsan”). Then we get the storage policy based management (SPBM) configuration of those VMs with Get-SpbmEntityConfiguration. The script then filters for a storage policy that doesn’t contain the cluster name OR a compliance status that isn’t “compliant”.

#Find all vSAN-enabled clusters in this vCenter
$vsanClusters = Get-Cluster | Where-Object {$_.VsanEnabled}
foreach ($cluster in $vsanClusters)
{
    #Only look at VMs that live on the cluster's vSAN datastore ("<cluster>-vsan")
    $cluster | Get-VM |
        Where-Object {($_.ExtensionData.Config.DatastoreUrl | ForEach-Object {$_.Name}) -like "*-vsan*"} |
        Get-SpbmEntityConfiguration |
        #Flag a policy that doesn't contain the cluster name, or any status that isn't "compliant"
        Where-Object {$_.StoragePolicy -notlike "*$cluster*" -or $_.ComplianceStatus -ne "compliant"} |
        Select-Object Entity, StoragePolicy, ComplianceStatus
}

Once this is run we can see the output below. I’ve obscured the names of the VMs, but we can see that there are still 12 VMs using the default vSAN Storage Policy instead of the cluster-specific storage policy they should be using. In addition, we see that the compliance status is currently out of date on most of these VMs. These VMs reside on 2 separate clusters, and 2 VMs were filtered out because they sit on local storage in those clusters instead of vSAN.

[Screenshot: script output (storagepolicy01-12202016)]
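Since the whole point is to be alerted without any manual checking, here’s a minimal sketch of how the same loop could feed an e-mail alert instead of just writing to the console. The SMTP server and addresses are placeholders for your environment:

#Collect the non-compliant results instead of writing them to the console
$report = foreach ($cluster in $vsanClusters) {
    $cluster | Get-VM |
        Where-Object {($_.ExtensionData.Config.DatastoreUrl | ForEach-Object {$_.Name}) -like "*-vsan*"} |
        Get-SpbmEntityConfiguration |
        Where-Object {$_.StoragePolicy -notlike "*$cluster*" -or $_.ComplianceStatus -ne "compliant"} |
        Select-Object Entity, StoragePolicy, ComplianceStatus
}
#Only send mail when something is out of policy (SMTP details are placeholders)
if ($report) {
    Send-MailMessage -SmtpServer "smtp.domain.com" -From "vsan-check@domain.com" `
        -To "vmware-team@domain.com" -Subject "vSAN storage policy drift detected" `
        -Body ($report | Format-Table -AutoSize | Out-String)
}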


Cohesity – DataPlatform in the Cloud

What separates vendors is focus and execution. In a crowded market, finding the right backup provider is no easy task. Each product has its pros and cons, and finding the differentiator between them can be daunting. While Cohesity is relatively new to this space (founded in 2013), they have the focus and execution necessary to be a leader in the backup market.

But Cohesity is more than just backups. The Cohesity storage appliance not only handles your backup storage needs, but can also run your dev and test workloads. Cohesity is focused on secondary storage: any workloads or data that aren’t production. By avoiding the draw of being another primary storage vendor, Cohesity is listening to customers, learning their needs and creating a solution that can fit any size business.

[Image: the storage iceberg (storageiceberg)]

The Cohesity solution was built for a virtualized (VMware-only) environment. It connects directly to your vCenter servers and pulls your inventory, allowing administrators to create backup jobs and policies. While their start was in virtualization, there are still many physical workloads in the datacenter. Agents for physical Windows, Linux, and SQL servers, all backing up to the same storage system with the same policies, prove that there’s hardly a workload Cohesity can’t protect.

But wait, there’s more!

While data protection is important, that’s only a small portion of the Cohesity offering. Running these backups directly from the Cohesity storage arrays allows you to free up primary storage resources and avoid (potential) bottlenecks when running multiple instances of the same VM on a single array. Leveraging the SSDs in each Cohesity node as a cache tier, you can test software patches and deployments against your backed-up production VMs without performance suffering. And with a built-in QoS engine, your dev/test workloads don’t have to affect the speed of your backups.

Cohesity provides a scale-out solution, meaning as storage demand increases, so can your secondary storage space. Operating under a single namespace, as new nodes are added your space increases without needing to reconfigure jobs to point to a new array or manually re-stripe data. Cohesity has customers that have scaled to as many as 60 nodes with over a petabyte of storage.

To the cloud!

Policy-based backups and replication ensure that your data will be available. Cohesity can distribute data across the nodes in a cluster, replicate to clusters in other locations, and also replicate your data to a cloud provider to satisfy offsite backup requirements. The latest addition to the Cohesity software portfolio is the DataPlatform Cloud Edition, which gives you the ability to run Cohesity in the cloud.

DataPlatform CE is more than just replicating data to the cloud. Your VMs can be backed up to your on-premises cluster, and that data can be replicated to your cloud-based array. From that cloud-based array, you can then clone virtual machines to a native cloud format. This means your servers can run in the cloud in their native format, available for testing or even for production in the event of migrations or datacenter outages.

Many backup and data protection software vendors, such as Veeam and Zerto, already offer replication to the cloud. While the feature isn’t new, its addition makes Cohesity a serious contender in this space. DataPlatform CE is currently available as a tech preview in the Microsoft Azure Marketplace, but Cohesity hopes to release it in the first half of 2017 with support for both Azure and AWS.

Wrapping Up

Data protection and availability is never going to be exciting. Swapping tapes and deploying agents is tedious work. But a fully integrated software solution that not only protects your data, but also helps solve the problem of data sprawl, gives developers a platform to test against production data in an isolated environment, and enables workload migration to the cloud? That’s about as exciting as it gets in data protection, and that is just the tip of the (storage) iceberg.

________________________________________

Take a look at posts by my fellow delegates from Tech Field Day 12 and watch the videos here.

First Look at Cohesity Cloud Edition
The Silent Threat of Dark Data
Cohesity Provides All of Your Secondary Storage Needs
Secondary Storage is Cohesity’s Primary Goal

________________________________________

Disclaimer: During Tech Field Day 12, my expenses (flight, hotel, transportation) were paid for by Gestalt IT. Cohesity provided each delegate with a gift bag, but I am under no obligation to write about any of the presented content nor am I compensated for such writing.


VSAN – Compliance Status is Out of Date

Occasionally the compliance status of the performance service will go to “out of date”. This is not an alert that is surfaced anywhere within vCenter; you have to check the status by logging into the vSphere web client, locating your vCenter, choosing the cluster, clicking on “Manage,” and then choosing “Health and Performance” under “Virtual SAN”.
[Screenshot: ComplianceStatus-a]

As I have recently fixed this issue, the above screenshot shows the “Compliant” status. Below are the steps to get to that point.

1. In the box for “Performance Service” click “Edit storage policy”
[Screenshot: ComplianceStatus-01]

2. If there is a storage policy available in the drop down, select it and click “OK”. This will apply that policy and perform the compliance check.
[Screenshot: ComplianceStatus-02]

For the lucky few where that works, that’s all you need to do. If the storage policy list is empty, you’ll need to restart the vsanmgmtd service on each of the hosts.

3. Enable SSH on each of the hosts in the vSAN cluster (if you have a lot of hosts, see the PowerCLI sketch at the end of this post). Then, using an SSH client (like PuTTY), connect to a host and run the following command to restart the vsanmgmtd service. This is a non-impactful operation and can be performed during production hours.
a. /etc/init.d/vsanmgmtd restart

4. Repeat that command on each of the hosts in the cluster until they have all restarted their services
[Screenshot: ComplianceStatus-04]

5. Wait 5 minutes and then check to see if you are able to select a storage policy for the performance service. If not, move on to step 6

6. Now we’ll need to restart the vSphere Profile-Driven Storage service on the vCenter server. This is also non-impactful and can be performed in the middle of the day. If you’re running vCenter on Windows, connect to the Windows server and restart the “VMware vSphere Profile-Driven Storage Service” (or do it from PowerShell; see the sketch after these steps). If you’re using the VCSA (like this example), you’ll need to SSH to the VCSA and run the command below.
a. service vmware-sps restart

7. After the vmware-sps service restarts, log out of the web client and wait 5 minutes while the storage profile service completes its restart.

8. Log back in to the web client, navigate to the vCenter server, click “Manage” then choose the “Storage Providers” tab
[Screenshot: ComplianceStatus-08]

9. Click the Synchronize Providers button to resync the state of the environment
[Screenshot: ComplianceStatus-09]

10. Wait another 5 minutes while the synchronization completes. After 5 minutes, navigate to the vSAN cluster in the web client. Click on “Manage,” then choose “Settings” and locate “Health and Performance” under the “Virtual SAN” section.
[Screenshot: ComplianceStatus-10]

11. In the Performance Service box, click the “Edit Storage Policy” button
[Screenshot: ComplianceStatus-11]

12. From the drop down list you should be able to select the appropriate VSAN storage policy and then click “OK”
[Screenshot: ComplianceStatus-12]

13. After this is selected the compliance status should change to “Compliant” and you should be all set.
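As an aside for step 6, if your vCenter runs on Windows you can bounce the service from PowerShell instead of the Services console. A small sketch (the wildcard match on the display name is my own shortcut):

#Restart the Profile-Driven Storage service on a Windows vCenter
Restart-Service -DisplayName "*Profile-Driven Storage*"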

So far these are the only steps that I have needed to follow in order to fix this issue. Let me know if there are any other fixes available.
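As promised in step 3, here’s a quick PowerCLI sketch for enabling SSH on every host in the cluster in one go. The cluster name is a placeholder, and you’ll probably want to stop the service again once you’re done:

#Start the SSH service (TSM-SSH) on every host in the vSAN cluster
Get-Cluster "vsan-cluster" | Get-VMHost | ForEach-Object {
    Get-VMHostService -VMHost $_ |
        Where-Object {$_.Key -eq "TSM-SSH"} |
        Start-VMHostService -Confirm:$false
}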


VSAN – Host Not Contributing Stats

After an upgrade or maintenance on one or more of the nodes in a vSAN cluster, one of the hosts can stop contributing performance stats. This is not a production-down issue, but it should be addressed so you can see the most up-to-date stats across all the nodes.

The fix for this is one of three things, but each of them involves turning off performance statistics on the cluster, which causes all historical performance stats to be removed. My hope is that VMware will fix this issue in an upcoming release, because a loss of historical data is not tolerable in all environments.

1. View the health of the vSAN cluster by logging into the vCenter web client. Navigate to the appropriate vCenter and cluster, then click the “Monitor” tab, followed by “Virtual SAN,” then click on “Health.” Expand “Performance service” and click the warning for “All hosts contributing stats”
[Screenshot: step01]

2. At the bottom you will now see the list of hosts that are not contributing stats
[Screenshot: step02]

3. Now that we’ve identified the problem host, we need to disable the vSAN performance service temporarily. Navigate to the “Manage” tab for this cluster, then click on “Health and Performance” under “Virtual SAN”
[Screenshot: step03]

4. Click “Turn off” in the “Performance Service” box

a. Click “OK” to confirm stopping the service which will erase all existing performance data
[Screenshot: step04a]

5. Confirm the Performance Service has been disabled by refreshing the page
[Screenshot: step05]

6. SSH to the affected host we identified in step 2 using PuTTY or a similar SSH client (you may have to enable SSH on the host before you can connect).
[Screenshot: step06]

7. Run the command below to restart the VSAN management agent. This should have no production impact so it is safe to perform outside of a maintenance window.
a. /etc/init.d/vsanmgmtd restart
[Screenshot: step07a]

8. Once the service has been restarted, go back to the vCenter web client and click the “Edit” button for the Performance Service box
[Screenshot: step08]

9. Select the appropriate storage policy from the drop down list, ensure the “Turn ON Virtual SAN performance service” box is checked and click “OK”
[Screenshot: step09]

10. Confirm that the performance service is turned on and reporting healthy
[Screenshot: step10]

11. Navigate back to the “Monitor” tab and then “Virtual SAN,” clicking on the “Health” section. Click “Retest” to verify that all hosts are contributing stats.
[Screenshot: step11]

If this does not fix the issue, you can restart the process, but this time, instead of restarting the vsanmgmtd service on just the one node, do it on all of the nodes in the cluster. Once the services have been restarted across all nodes, restart the performance service, and all nodes should be contributing stats.
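If you’d rather script that restart across all the nodes, here’s a rough sketch using the community Posh-SSH module (an assumption on my part, not something VMware provides; the cluster name and credentials are placeholders, and SSH must already be enabled on the hosts):

#Restart vsanmgmtd on every host in the cluster over SSH (requires the Posh-SSH module)
Import-Module Posh-SSH
$cred = Get-Credential                      #root credentials for the ESXi hosts
$esxHosts = Get-Cluster "vsan-cluster" | Get-VMHost
foreach ($esx in $esxHosts) {
    $session = New-SSHSession -ComputerName $esx.Name -Credential $cred -AcceptKey
    Invoke-SSHCommand -SessionId $session.SessionId -Command "/etc/init.d/vsanmgmtd restart"
    Remove-SSHSession -SessionId $session.SessionId | Out-Null
}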

I have also seen a case where restarting the service on all nodes didn’t fix the problem. In that scenario I was able to fix it by putting the problem node into maintenance mode with “full data migration” so all of its data would be evacuated. After that was complete, I completely rebuilt the host from scratch (including wiping the disks claimed by vSAN), then moved it back into the cluster. I haven’t heard from VMware of any other ways to fix this issue.


The Beginning of Cloud Natives

Over the last 8 years I have built my career around VMware. I remember the first time I installed VMware Server at one of my jobs, just to play around with it, and imported my first virtual machine. I had no idea what I was doing or how any of it worked, but I felt there was a future for me in this technology. As I moved on to other companies, the VMware implementations just got larger and larger; from 3 hosts all the way up to well over 1,000.

Having spent time in these environments and with other users at local VMUG events and VMworld, I’ve seen that the skills required to be a VMware administrator are becoming commoditized. More people know about it than ever before, more blogs exist than ever before, and the necessity of meetings that revolve around VMware specifically seems to have run its course. While VMware remains integral to the datacenter today, there are skills we need to be developing and technologies we need to be exploring to ensure we’re not the ones being replaced when the next generation joins the workforce.

Enter Cloud Natives.
[Image: Cloud Natives]

Cloud Natives was the idea of Dominic Rivera and myself as a means to bridge the gap between users and these new technologies. Cloud Natives looks to bring together the leaders in a technology space to present their solutions in one location. Rather than just letting vendors spew marketing material, we take a different approach: vendors are required to provide actual customers to present how their solutions have impacted their jobs and their business. No more outlandish claims, no more vanity numbers that don’t depict actual workloads, just real stories from real users.

We are kicking off 2016 with our first event on July 14th in Portland, OR. This event will be focused on one of the hottest technologies in the datacenter right now: flash storage. We’re bringing together the top players in the flash storage space, and you’ll hear their customers discuss the benefits and challenges they faced when moving away from legacy spinning-disk arrays and even newer hybrid arrays. Our goal is to educate our members one event at a time.

Cloud Natives looks to bring together all the datacenter technologies into one place. Whether it’s a focus on hypervisors, traditional or next-generation storage and infrastructure, cloud providers, DevOps and automation, or anything else that is hot in the datacenter, we will be that go-to resource in the Pacific Northwest. Each event is an opportunity to evaluate multiple vendors from the perspective of the customer. With no overlapping session schedules, you can walk away better informed and get any questions answered in one event.

I encourage everyone in the Portland area to register for this event at the Cloud Natives site. Our goal is to bring a sense of community back to Portland. We want to be a place to meet and network, to encourage, to mentor and to grow in our careers. No matter the stage in our career, we all have knowledge and experience that can help someone else and it’s time we all do our part to give back to the community.


Cohesity – Scale-Out Secondary Storage

Backups are boring. No matter if you’re talking about swapping tapes, configuring backup jobs in your legacy agent-based software, or spending another night restoring snapshots from your storage array, there’s just no way to make backups interesting. Cohesity aims to fix that. No, they won’t make backups sexy, but they are looking to add a bit more flash to the secondary storage market.

So what exactly is “secondary storage”? Secondary storage encompasses our backups, non-prod workloads, fileshares and the like. The secondary storage market has been gaining visibility recently. With the flood of primary storage vendors, Cohesity could have been another “me-too” primary storage vendor, but they saw the value in attacking an under-developed market.

The concept of Cohesity is simple. You can purchase the C2300 or C2500 models, which offer 48TB or 96TB of storage respectively in each 4-node appliance (with a minimum of 3 nodes to start). Additional capacity can be added a single node at a time afterwards, in 12TB or 24TB chunks depending on the model. Each node contains either 800GB or 1.6TB of flash for caching, along with compute and memory. Cohesity claims they are infinitely scalable due to their distributed OASIS (Open Architecture for Scalable Intelligent Storage) architecture, though they’ve only tested up to 32 nodes at the time of this writing. Once your nodes are set up, you just point Cohesity at your vCenter Server and you have visibility of your virtual machines.

Cohesity, leveraging VADP, is able to snapshot your configured VMs and begin ingesting all that data. Changes to these VMs are tracked (using CBT) so you’re not performing a new full backup each time. All that is pretty standard in the backup world, so what sets Cohesity apart? That data is not just backed up, but is available to actually use. Want to spin up one of these backed-up VMs for testing? Space-efficient clones are created directly on the Cohesity appliance and presented back to your ESXi hosts. Searching for a file to restore from one of these VMs? You can locate it right from the web interface and download the file without having to restore the entire VM.

The differentiator for Cohesity is not just how it scales or how simple it makes the backup process, but how it makes your backups useful: developers can access clones of your production systems to test deployments and hotfixes without impacting your production storage, and integrated QoS prevents your dev/test workloads from consuming all your resources and causing backup performance to suffer. Utilizing the onboard flash combined with global deduplication, these workloads can approach production performance without the cost of an all-flash array.

An all-inclusive secondary storage appliance that provides visibility into data sprawl adds to the value. Oftentimes, as production systems are backed up and cloned and cloned again, you lose sight of the origin of that data. When migrating data from one storage array to another, you lose that deduplication and end up increasing capacity across systems to accommodate your storage footprint. By providing an all-in-one solution for your backups and dev/test workloads, you’re able to maximize your investment without the need for multiple arrays and storage targets.

The backup market is a crowded one. There are more feature-rich backup software providers in the space, but many of them require the purchase of additional storage that lacks the capabilities of what Cohesity provides. Having just released version 1 in mid-October, Cohesity already has a lot of capability in their software, with what appears to be a great vision for the future. The product still needs refinement to simplify searches, reporting, and scheduling, but the foundation the Cohesity team has built has me excited to see where they’ll take the product.

__________

Watch all the videos from Cohesity at Storage Field Day 8 here.

Disclaimer: During Storage Field Day 8, my expenses (flight, hotel, etc) were paid for by Tech Field Day. I am under no obligation to write about any of the presented content nor am I compensated by any of the presenting companies for such writing.


Track Datastore Add & Removes With PowerCLI

While working with the data protection team at my job I was asked if there was any way to track new datastores being added to a vSphere cluster. When new LUNs are allocated to our vSphere clusters, the data protection team isn’t always made aware ahead of time. Normally this isn’t a big deal, but in our case we have a product that requires access to specified datastores for backups. In order to maintain access to these virtual machines for backup purposes, we need to be notified when new datastores are added.

As I sat and thought about how I could accomplish this task, I came up with a couple of ideas, but figured a scheduled task running a PowerCLI/PowerShell script would be the easiest to implement. In this script we connect to the vCenter server, get all the datastores in the cluster, write a date-stamped file daily, then compare the current and previous day’s datastore files and write the differences to a third file that displays only the datastores that have been added or removed.

I’ve broken down the script so I can explain each section, making it easy to understand. Before I had any knowledge of PowerShell/PowerCLI, modifying something to fit my environment when I didn’t understand what was happening at each step was time-consuming and frustrating.

1. This is where we define the name of the vCenter instance we’ll be connecting to and the name of the cluster we’re interested in.

$vCenter = "LabvCenter.domain.com"
$Cluster = "LabCluster"

2. This is where we define the output location for our datastore and difference files. I chose to drop it into a folder named for the cluster, but that can be removed. (Note the first caveat after the walkthrough: the folder needs to exist before the script writes to it.)

$filePath = "C:\test\" + $Cluster + "\"

3. This is where we connect to vCenter and then immediately wait 15 seconds, which can fix issues with commands running before security warnings are displayed.

Connect-VIserver $vCenter
Start-Sleep -s 15

4. This will gather all the datastores in the cluster and exclude any datastore whose name contains “-local”. The wildcard is important because the local datastores are named “servername + -local”; if the wildcard wasn’t there, all of the local datastores would still be included, because no datastore is named exactly “-local”.

$Datastores = Get-Cluster -Name $Cluster | Get-Datastore | Where {$_.Name -notlike "*-local"}

5. I prefer the format of 2-digit month, 2-digit day, 2-digit year. This will get the current date of the system running this script, then convert it to that format, e.g. 051415.

$today = (Get-Date).ToString("MMddyy")
$yesterday = (Get-Date).AddDays(-1).ToString("MMddyy")
$2DaysAgo = (Get-Date).AddDays(-2).ToString("MMddyy")

6. This will set the file name and location for the output from 2 days ago. If that file exists, it will be removed. Rather than saving an output file from every day until I manually remove them, this process seemed better. I chose to delete the file from 2 days ago, as opposed to deleting yesterday’s file right after the comparison, so that if we see a huge change in the difference file we can manually compare the two files to try to find the error.

$2DayOldFile = $filepath + $Cluster + $2DaysAgo + ".txt"
If (Test-Path $2DayOldFile){Remove-Item $2DayOldFile}

7. This will set the file path and name to the file path defined at the top, plus the cluster name, plus the date and add .txt to the end.

$CurrentFile = $filePath + $Cluster + $today + ".txt"
$YesterdaysFile = $filePath + $Cluster + $yesterday + ".txt"

8. Here we are exporting all the datastores from Step 4 by name and outputting to the file name/path defined in Step 7.

$Datastores | Select Name | Out-File $CurrentFile

9. This is where we set the name and path for the difference file that will track the datastore add/remove.

$DifferenceFile = $filePath + "Datastore-Changes" + ".txt"

10. This will read the content from today’s and yesterday’s files. (See the second caveat after the walkthrough for handling the very first run, when yesterday’s file doesn’t exist yet.)

$YesterdaysContent = Get-Content $YesterdaysFile
$CurrentContent = Get-Content $CurrentFile

11. Here we are comparing the content we just read in step 10.

$Compare = Compare-Object $YesterdaysContent $CurrentContent

12. The standard output of “Compare-Object” shows differences with a side indicator of <= or =>, depending on which input the difference exists in. Rather than remembering which file was read first to determine whether a datastore was added or removed, we change the values. If a datastore existed yesterday but is missing today, it is labeled “Removed”. If a datastore didn’t exist yesterday but does today, it is labeled “Added”.

$compare | foreach {
if ($_.sideindicator -eq '<=')
{$_.sideindicator = "Removed"}

if ($_.sideindicator -eq '=>')
{$_.sideindicator = "Added"}
}

13. This will take the results from step 11 with the formatting of step 12 and change the column names. The list of objects compared is normally named “InputObject”, and the “Added or Removed” column is normally “SideIndicator”. Maybe that’s fine, but I prefer something a little easier to read. I’ve renamed “InputObject” to “Datastore” plus the current date, and changed “SideIndicator” to “Added or Removed”. Once that is done, we output the result to the path and name defined in step 9. The reason we include the current date in the “Datastore” column is that we use “-Append” with “Out-File”: each run adds a dated entry of changes to the bottom of the existing (or new) output file. This means we aren’t overwriting the same file every day, we are just adding to it, so if you forget to check the file for a few days you won’t lose that data.

$Compare |
select @{l='Datastore' + ' - ' + (Get-Date);e={$_.InputObject}},@{l='Added or Removed';e={$_.SideIndicator}} |
Out-File -Append $DifferenceFile
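Two caveats worth noting before scheduling this. First, Out-File won’t create the folder from step 2, so on a brand new machine create it first; a one-liner like this (my addition, not part of the original script) handles it:

#Create the output folder if it doesn't exist yet
If (-not (Test-Path $filePath)) {New-Item -ItemType Directory -Path $filePath | Out-Null}

Second, on the very first run there is no file from yesterday, so Get-Content in step 10 will throw an error. A small guard can skip the comparison until there are two days of output:

#Skip the comparison on the first run, when yesterday's file doesn't exist yet
If (Test-Path $YesterdaysFile) {
    $YesterdaysContent = Get-Content $YesterdaysFile
    $CurrentContent = Get-Content $CurrentFile
    $Compare = Compare-Object $YesterdaysContent $CurrentContent
}
Else {Write-Warning "No output file found for yesterday; skipping comparison until the next run."}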

Now that we know what this thing does, let’s see it in action. I ran the script over 3 days and this is how the output files look. We can see that on 05-14-15 we added Lab-Datastore-10, which didn’t exist on 05-13-15. Then on 05-15-15 we removed Lab-Datastore-03 and added -11 and -12.
[Screenshot: the three daily output files]

When running the script I commented out the removal of the 2-day-old file so we could compare manually. Now we have an output file created (Datastore-Changes.txt) that shows the differences.
[Screenshot: Datastore-Changes.txt in the output folder]

Inside Datastore-Changes.txt we see that on 5/14 the datastore “Lab-Datastore-10” was added and on 5/15 we lost Lab-Datastore-03, but added 11 and 12.

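The contents of Datastore-Changes.txt end up looking roughly like this (a reconstruction based on the run above; Get-Date includes a timestamp, and exact spacing will vary):

Datastore - 05/14/2015 06:00:00     Added or Removed
-------------------------------     ----------------
Lab-Datastore-10                    Added

Datastore - 05/15/2015 06:00:00     Added or Removed
-------------------------------     ----------------
Lab-Datastore-03                    Removed
Lab-Datastore-11                    Added
Lab-Datastore-12                    Added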

We can delete this file at any time, and the next time the script runs we’ll create a brand new file. This means there is no dependency on this file already existing for the script to run, and it doesn’t require us to keep a long list of all the datastore adds/removes for all eternity. Now you’ll just need to save the script and schedule it to run using Windows Task Scheduler.
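If you’d rather script the scheduling as well, here’s a minimal sketch using the built-in ScheduledTasks cmdlets (available on Windows 8/Server 2012 and newer; the script path, task name, and run time are placeholders):

#Register a daily scheduled task that runs the script
$action = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Track-Datastores.ps1"
$trigger = New-ScheduledTaskTrigger -Daily -At 6am
Register-ScheduledTask -TaskName "Track Datastore Changes" -Action $action -Trigger $trigger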

Below is the full scripts with comments.

#Define the vCenter Server and Cluster
$vCenter = "LabvCenter.domain.com"
$Cluster = "LabCluster"

#Set the path location for the output files
$filePath = "C:\test\" + $Cluster + "\"

#Connect to the vCenter Server and sleep for 15 seconds (necessary for security warnings)
Connect-VIserver $vCenter
Start-Sleep -s 15

#Get a list of all the datastores
$Datastores = Get-Cluster -Name $Cluster | Get-Datastore | Where {$_.Name -notlike "*-local"}

#Get the current date in the correct format
$today = (Get-Date).ToString("MMddyy")
$yesterday = (Get-Date).AddDays(-1).ToString("MMddyy")
$2DaysAgo = (Get-Date).AddDays(-2).ToString("MMddyy")

#Delete the output from 2 days ago (Remove this section if you want to keep the history)
$2DayOldFile = $filepath + $Cluster + $2DaysAgo + ".txt"
If (Test-Path $2DayOldFile){Remove-Item $2DayOldFile}

#Set the filename to include today's date
$CurrentFile = $filePath + $Cluster + $today + ".txt"
$YesterdaysFile = $filePath + $Cluster + $yesterday + ".txt"

#Export those datastores to a TXT file
$Datastores | Select Name | Out-File $CurrentFile

#Set file name & path for difference file
$DifferenceFile = $filePath + "Datastore-Changes" + ".txt"

#Get the content for yesterday and today's files
$YesterdaysContent = Get-Content $YesterdaysFile
$CurrentContent = Get-Content $CurrentFile

#Compare yesterday's and today's files
$Compare = Compare-Object $YesterdaysContent $CurrentContent

#Change the side indicator column to "Removed" and "Added"
$compare | foreach {
      if ($_.sideindicator -eq '<=')
        {$_.sideindicator = "Removed"}

      if ($_.sideindicator -eq '=>')
        {$_.sideindicator = "Added"}
     }

#Change the column name output to "Datastore + Date" and "Added or Removed" then output to file
 $Compare | 
   select @{l='Datastore' + ' - ' + (Get-Date);e={$_.InputObject}},@{l='Added or Removed';e={$_.SideIndicator}} |
   Out-File -Append $DifferenceFile