How easy it is to replace a disk in Nutanix AOS

Q: How many disk types do you know?
A: Two. Those that have already failed and those that will fail soon 🙂

How does Nutanix AOS handle disk failures? Nutanix AOS has built-in, out-of-the-box self-healing capabilities (no, no, it does not heal the disk … yet 😉). Nutanix AOS monitors all hardware components, including disks (both SSD and HDD). For disk monitoring, Nutanix AOS uses S.M.A.R.T. for proactive disk failure detection. When Nutanix AOS detects symptoms on a disk that may lead to disk failure, it stops writing any new data to that disk. Immediately after the failure is detected, a Curator scan runs. It scans Cassandra (the Nutanix metadata store) and checks which data needs to be re-replicated. At the end, Hades marks the disk offline to prevent further reads from and writes to the disk.
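The sequence above can be sketched as a toy model. This is only an illustration of the order of steps described in this post, not Nutanix code: the `Disk` class, the `DiskState` names, the `handle_suspect_disk` function and the shape of the `metadata` dictionary are all made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class DiskState(Enum):
    ONLINE = "online"
    NO_NEW_WRITES = "no_new_writes"  # suspect disk, still readable
    OFFLINE = "offline"


@dataclass
class Disk:
    serial: str
    state: DiskState = DiskState.ONLINE


def handle_suspect_disk(disk, metadata):
    """Walk one disk through the steps described in the text.

    `metadata` maps an extent id to the set of disk serials holding a
    replica of it. It is a hypothetical stand-in for the metadata store,
    not its real schema. Returns the extent ids that need a new replica.
    """
    # Step 1: stop placing new data on the suspect disk.
    disk.state = DiskState.NO_NEW_WRITES

    # Step 2: scan the metadata for data that has a replica on this disk.
    to_rereplicate = sorted(
        extent for extent, holders in metadata.items() if disk.serial in holders
    )

    # Step 3: mark the disk offline so it receives no further reads or writes.
    disk.state = DiskState.OFFLINE
    return to_rereplicate
```

With a metadata map in which two of three extents have a replica on the suspect disk, the function returns those two extent ids and leaves the disk in the `OFFLINE` state.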

In addition, as every modern system should, Nutanix AOS informs admins about the disk failure by raising an alert in Prism. If you have SNMP or SMTP alerts configured, you will also be notified through your monitoring system or by email. A Pulse service runs on every Nutanix cluster. If it is enabled and configured to send cluster state to Nutanix support, a support request ticket will be opened on your behalf and Nutanix support engineers will be in touch with you shortly.

So, what do you have to do when you replace a disk in a Nutanix cluster? Nothing (apart from swapping the disks, of course 🙂). Nutanix AOS will take care of everything. If you want, you can monitor the process in Prism or by checking the Hades logs.

Hades is the process responsible for monitoring disk states and bringing disks online or taking them offline.
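If you prefer to watch the log from a script instead of reading it by hand, a minimal sketch could look like the one below. The log path is an assumption (a common default location on a Controller VM, so verify it on your cluster), and the keyword filter is a plain substring match, because the exact format of Hades log lines is not covered in this post. The helper names `matching_lines` and `follow` are made up for the example.

```python
import time
from pathlib import Path

# Assumed location of the Hades log on a Controller VM; adjust if yours differs.
HADES_LOG = Path("/home/nutanix/data/logs/hades.out")


def matching_lines(lines, keywords=("disk",)):
    """Return the lines that contain any keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line.lower() for keyword in lowered)
    ]


def follow(path, poll_seconds=1.0):
    """Yield lines appended to `path`, similar to `tail -f`. Runs until interrupted."""
    with path.open("r", errors="replace") as handle:
        handle.seek(0, 2)  # start at the current end of the file
        while True:
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)


# Example usage on a Controller VM (stop with Ctrl+C):
#
#     for line in follow(HADES_LOG):
#         for hit in matching_lines([line]):
#             print(hit)
```

The filter is kept separate from the file handling so it can also be run over a saved copy of the log.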

New disk information in Prism

Entries in the hades.out log show that the new disk was detected and successfully added to the pool of disks.

That is it. The new disk is now part of the disk pool and available for writing data.

Artur Krzywdzinski

Artur is a Consulting Architect at Nutanix. He has been using, designing and deploying VMware-based solutions since 2005 and Microsoft-based solutions since 2012. He specializes in designing and implementing private and hybrid cloud solutions based on VMware and Microsoft software stacks, datacenter migrations and transformation, and disaster avoidance. Artur holds the VMware Certified Design Expert certification (VCDX #077).

  • Ernest Rogo

    Hi Artur, what are the Do’s and Don’ts regarding the NutanixManagementShare container which is created by default upon creating a cluster?
    Is there a limit to its size?

    • Hi Ernest,
      You can keep it or delete it; it is up to you. The limit is the size of the disk pool. How you manage it is entirely your decision.
      Let me know if you have more questions.

      • Ernest Rogo

        Thank you so much Artur, and I have one more question: Does AHV support running nested VMs, so far?

        • I do not think so, but let me check with engineering.

          • Ernest Rogo

            I am facing a situation where I cannot delete/remove VM disks marked as INACTIVE in Prism. AOS 5.0.3 AHV 20160925.44

          • Are you sure it is a vdisk and not a Protection Domain? Can you share a screenshot?

      • David Avice

        What is the NutanixManagementShare used for? Any repercussions if it is deleted? I was told long ago that it was used when downloading updates etc.

    • David Avice

      • Hi, I apologize for the late response. If you do not use AFS, ACS, or the Nutanix Self-Service Portal, you can delete it.
        AOS, NCC, and hypervisor upgrades still use the local Controller VM storage at /home/nutanix/software_downloads and do not require the NutanixManagementShare storage container.
        Let me know if you have more questions.