E C asked:

Can I run 30-40 production VMs on a high-end NAS?

My organization currently has a physical server with 256 GB of RAM and 2 x 4 TB internal Intel P4500 SSD cards. It's a Hyper-V host and all of the VMs live on the internal SSDs. Performance is great, but I'm exploring options for expanding our storage.

Rather than having all of the storage within the server box, one obvious alternative is a SAN. But the SAN configurations I'm looking at are over $30,000. Boooo.

It seems like NAS devices are getting a lot more powerful, robust and (with Flash drives) faster.

So my question: Is it feasible to load up a Synology with solid-state drives (or a hybrid combination), connect it directly to the server and run all of the VMs from it? And would it be best to connect using dual 10GbE direct Ethernet connections and iSCSI?

I fully understand a NAS is not a SAN. I am trying to consider in-between solutions. I know how my VMs run with internal storage. I know how my VMs will run on a SAN. But what if I ran all of my VMs on a NAS? (The SAN solution I am looking at blows away my current setup as far as IOPS go, so I might even say a SAN is overkill. With my current setup we are not experiencing any latency or speed issues.)

Another question - if this is feasible, then would I be able to connect TWO or THREE servers to the NAS and have them all share the same NAS volume?

White papers or actual real-world examples would be great.
noci:

That depends on the IO load you expect on those servers.
If they are 30-40 database servers ....
Or 30-40 compute servers that once in a blue moon load a program to run for hundreds of hours.

Things to prevent: excessive paging and swapping.

And possibly you could spread the load across several NASes and several network connections.
(The network can easily be a bottleneck as well.)
I would use the NAS only to serve as DATA disks for the VMs. A NAS offering a normal share (SMB/CIFS) will never perform well enough to host the VMs' OS disks, especially IO-heavy ones like Windows Server.
Even the most expensive NAS (with SSDs, perhaps in a RAID 1+0 configuration) is, I'd say, good for at most five VMs with very light loads.
You could easily benchmark those five VMs though (run them all at exactly the same time) and see if performance is good enough to add more.
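A quick way to reason about whether more VMs will fit is to add up what each VM is expected to demand and compare it with what the NAS and its links can deliver. A minimal sketch in Python; every number in it (per-VM IOPS, block sizes, the NAS budget) is a made-up placeholder, not a measured or quoted figure:

```python
# Rough capacity sanity check: does the aggregate VM demand fit the NAS budget?
# All numbers below are hypothetical placeholders - replace with measured values.

vm_profiles = [
    # (name, steady-state IOPS, average block size in KiB)
    ("dc01",      300,  8),
    ("sql01",    4000, 64),
    ("file01",    800, 32),
    ("app01",    1200, 16),
    ("light-vm",  150,  8),
] * 6  # roughly 30 VMs with a similar mix

nas_budget_iops = 50_000                 # assumed array random-IO capability
nas_budget_mbs = 2 * 10_000 / 8 * 0.8    # dual 10GbE at ~80% efficiency, in MB/s

total_iops = sum(iops for _, iops, _ in vm_profiles)
total_mbs = sum(iops * blk_kib / 1024 for _, iops, blk_kib in vm_profiles)

print(f"Aggregate demand: {total_iops:,} IOPS, {total_mbs:,.0f} MB/s")
print(f"NAS budget:       {nas_budget_iops:,} IOPS, {nas_budget_mbs:,.0f} MB/s")
print("IOPS headroom:     ", "OK" if total_iops <= nas_budget_iops else "OVER BUDGET")
print("Bandwidth headroom:", "OK" if total_mbs <= nas_budget_mbs else "OVER BUDGET")
```

If the aggregate comes in close to either budget, that is the point where a concurrent benchmark of the kind described above is worth running before adding more VMs.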
If you are familiar with VMware vSAN you might also appreciate Windows Server 2016 Storage Spaces Direct, aka S2D, leveraging Scale-Out File Server (SOFS) over SMB3.

With S2D, locally attached drives are used to create software-defined storage in a converged or hyper-converged scenario. Network performance is addressed using RDMA.

https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/deploy-storage-spaces-direct
https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-direct-overview 

For example, for Hyper-V you could leverage a 3-node cluster with SSD or NVMe for the "caching" tier and less expensive storage for data that changes less often, such as the operating system itself.
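As a rough sizing note on that 3-node example: with three-way mirroring (the usual resiliency choice at 3+ nodes) usable space is about a third of the raw capacity tier, and the cache devices do not add usable capacity at all. A small sketch with hypothetical drive counts:

```python
# Rough usable-capacity sketch for a 3-node S2D cluster with a cache tier.
# Drive counts and sizes are hypothetical. Cache devices contribute no usable
# capacity, and a three-way mirror keeps three copies of the data.

nodes = 3
cache_drives_per_node = 2        # e.g. NVMe used purely as cache
capacity_drives_per_node = 4     # e.g. SATA SSD or HDD
capacity_drive_tb = 4.0

raw_tb = nodes * capacity_drives_per_node * capacity_drive_tb
usable_tb = raw_tb / 3           # three-way mirror on a 3+ node cluster

print(f"Cache devices: {nodes * cache_drives_per_node} (not counted toward capacity)")
print(f"Raw capacity tier: {raw_tb:.1f} TB")
print(f"Usable (3-way mirror, cache excluded): {usable_tb:.1f} TB")
```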

I can elaborate if you want to know more.

Also, check out
https://blogs.technet.microsoft.com/filecab/2018/10/30/windows-server-2019-and-intel-optane-dc-persistent-memory/
The new HCI industry record: 13.7 million IOPS with Windows Server 2019 and Intel® Optane™ DC persistent memory
E C (Asker):

Hi Brian. I moved from VMware to Hyper-V about 2 years ago. While I loved VMware, I am now a big fan of Hyper-V. I recently read about some performance issues on VMware vSAN after about the 12th VM. Not sure how accurate that is, but regardless, I'd say there's very little chance of me going back to VMware after having invested so much time and energy migrating to Hyper-V.

I looked into S2D and it's quite impressive. However, I am told that it works "best" with 4 servers in a cluster (vs 3, that is). And the servers have to be Microsoft certified, as do all the components. By the time we were done, the price was higher than buying 3 stand-alone servers, 2 SAN switches and a SAN!

I'm still interested in exploring S2D, but at the moment I've got my eye on the budget. And so I am trying to determine if a high-end NAS would be a viable replacement for my internal SSD storage. The problem with my internal storage is that I've reached the capacity of the internal connections (8), so there's no room for me to add more storage or networking.
ASKER CERTIFIED SOLUTION
Brian Murphy
You will always get more IOPS from local NVMe cards than you will get from a SAN or NAS. The advantage of a SAN is redundancy, not performance. A decent SAN (or high-end NAS) has two or more controllers, and you can share LUNs between servers for clustering to give server redundancy.

Rip apart the fastest SAN in the world and what do you find? NVMe cards just like you have in your server. Admittedly they have a lot more of them than you have PCIe slots for, but for your $30,000 you can buy a bigger server with more slots.

Skim through https://www.datacenterdynamics.com/news/dell-introduces-the-worlds-fastest-storage-array/
10 million IOPS for the SAN, but it takes two racks of what are basically servers to achieve that. It's unlikely they use TLC cards, as MLC has much better random write performance, but the principle is the same. Data from the SAN's NVMe cards goes into their CPU, out their interface card, over some wire, into your interface card, then into your CPU. All of that is latency you don't have with local storage.
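To put rough numbers behind that bandwidth and latency argument, here is a minimal sketch. The NVMe and overhead figures are assumptions chosen only to illustrate the orders of magnitude, not datasheet values for the P4500 or any particular array:

```python
# Back-of-the-envelope ceilings for the two storage paths being compared.
# The NVMe throughput and latency figures are assumptions for a typical
# datacenter PCIe SSD, not quoted specs - check your drive's datasheet.

MB_PER_GBIT = 125                      # 1 Gbit/s = 125 MB/s (decimal)

local_nvme_mbs = 2 * 3000              # two local NVMe drives, ~3 GB/s read each (assumed)
wire_mbs = 2 * 10 * MB_PER_GBIT * 0.8  # dual 10GbE iSCSI at ~80% of line rate

print(f"Local NVMe ceiling : ~{local_nvme_mbs:,} MB/s")
print(f"Dual 10GbE ceiling : ~{wire_mbs:,.0f} MB/s")

# Latency: every remote IO also pays for the target's stack plus the network hop.
local_read_us = 100                    # assumed local NVMe read latency
remote_overhead_us = 100               # assumed iSCSI/NIC/switch overhead per IO
print(f"Remote IO adds roughly {remote_overhead_us} us on top of "
      f"~{local_read_us} us locally (assumed figures)")
```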
I can say we have run our entire Citrix infrastructure, using XenServer, on Dell EMC Isilons for about 8 years. The XenServers run on Cisco UCS M4 blades, but all of the storage, including the boot drives, is on Isilon, which does not have any SSDs. Now, these are not running any high-end databases, but we do have about 30 VMs running on them, ranging from Windows Servers and workstations to appliance VMs such as virtual NetScalers.

These Isilons are not even dedicated to the Citrix farm. They also present the primary file shares for the entire company, serve as the target for our email archive systems, and host our OpenText SAP archive system. We are looking at also using the Isilon for data analytics in the future; at that point, we will probably replace them with nodes that have SSDs and are better suited for big data.

The great thing about Isilon is the ease of replacing nodes to upgrade or expand. All you do is plug in the new node and add it to the cluster to expand the array, and the data is restriped across all nodes. To remove an old node, simply select the option to remove it, the data is migrated to the other nodes, and you can pull it out of the rack.
SOLUTION
I agree with the majority of the concepts here and would simply add, or ask: why NAS? CIFS/SMB 1, 2, 3, 3.1.1... oh my.

Citrix and other technologies like it are TOS... Top of Stack.

The end user ultimately decides whether our design is good or bad, and that is not subjective... it's perception.

Ultimately, we can utilize whatever storage we want on NAS, and I have. I've created designs that hosted upwards of 50,000 users with multi-tenancy, but it was a POD-type design that required several racks of servers and an HA NAS attached to a SAN. At the time, the cost saving was relative to putting HBAs in every physical server and running Fibre Channel.

The cost saving still came at a cost to performance, because the operating system itself, if not on flash, can only deliver so much. It is easy to save money in IT. I'm old school in that I still believe it is our job to bridge the gap between "perception" and "reality".

If you are coming at this from the perspective of being a good "steward" of IT and trying to save money... you're not wrong. We should try to save money, just not at the expense of the end users. You have end users who are resigned to "this is just how it is", and you have end users who appreciate the value of IT. That perception is our reality.

Way back when, not so long ago, the bottleneck was RAM, or perhaps not RAM but the bus speed between the CPU and RAM, and then it was just... RAM. Before that, it was CPU. Then it was always IO... the disks. It seems it is only once we get past one pain point that we acknowledge the next.

The architecture can only compensate as far as the weakest link in the chain.

The weakest link today, it would seem, is the operating system.

Unless you're spinning up VMs on flash (preferred), NAS is the least of your worries.

Which is most important...saving money, or end-user experience?

For example, Citrix is designed to run local applications. What happens when you move the applications to the cloud?

Weakest link. It would seem we can host "Citrix" on a box of rocks and make it sound great, but the end user might just disagree. We can do it and save some money, but is it right?

When Citrix was founded by a few guys from IBM, perhaps they did not take this into consideration. They were focused on decentralized computing at the time, and who could blame them? At least they are compensating now.

SAN versus NAS, not so different.

Might as well move the data to the cloud.

Eventually it will catch up...just not now.

Sometimes we need to adjust the reality to change the perception. If your leadership is pushing you down the path of cost savings, ask them to talk to the end users. If that doesn't work... I'm available with 48 hours' notice.
E C (Asker):

Thanks everyone for your wisdom. Rest assured I am not going to base my decision on the cheapest configuration. My intention is to find out if there are any real-world solutions somewhere between (a) internal storage and (b) a SAN solution with SAN switches.

My immediate thought was 'high end NAS' since they seem to have come a long way and NAS seems to be the logical step below a SAN. In particular, I was looking at the Synology FS series ... https://www.synology.com/en-us/products/performance#fs_series

I am currently looking at 3 options with Dell:
1. S2D and 3 servers (love how S2D works - learning more about it)
2. A 3-2-1 configuration (3 servers, 2 switches, 1 SCv3020 SAN)
3. An FX2 converged chassis with 3 server blades, connecting to an SCv3020 SAN

Can a NAS replace the SAN, and how nicely will it play with my servers and switches? I'll have to dive deeper into that.
Can/do I share 1 LUN with 3 servers vs. create separate LUNs for each server? I'll need to look into this as well.

I'll spend the next few days gathering all of this info together and reading some more. The advantage of a one-vendor solution is that if something doesn't work, I just give them a call. I know if I put something else into the mix, I'm on my own.
How about 3 servers, 1 SCv3020 SAN and *no* switches? There is a "12G SAS, 4 port, PCI-E, Full height, Qty 2" option that actually reduces the price of the SAN and removes the need for switches, because you use SAS DAC cables to the hosts. SAS HBAs for the hosts are probably cheaper than 10Gb Ethernet or 16Gb Fibre Channel. Using SAS host connections does limit you to 4 servers. You do realise you can only use Compellent drives in it and not 3rd-party ones?

BTW, your PCIe SSDs will still be useful, you can use them as local read cache with VMware, presumably with Hyper-V too.

You might also consider local disks and a virtual SAN; you get the same clustering for redundancy. There are various vSAN vendors - Starwind is probably the cheapest.
E C (Asker):

Hi Andy,
The Dell rep told me that the only way to direct-connect to the SAN (that is, no switches) is if I went with the FX2 Chassis (Option #3 in my previous reply). I has asked him why I couldn't direct-connect the standalone servers (for example, 3 R640s or R740s) and he said it wasn't supported. I think I will ask him again. I'd love to avoid having to get SAN switches.

As for the Compellent drive requirement - Thanks, I was not aware of that, but I suppose this is the game we play whenever we purchase a 'solution' from a vendor. Same thing if I purchased a complete S2D solution from Dell - the only way they will sell it is with Microsoft-approved components which undoubtedly means higher prices.

Regarding local disks and a virtual SAN: Can you elaborate on this? Let's say I purchased 3 standalone servers, each with 256gb of RAM and 8tb of internal storage. If I were to just build a RAID inside of each server, then it's only available/usable to that server. Storage Spaces Direct takes all of the combined internal storage and allows me to use it between all 3 servers - RAIS. Does a vSAN solution like Starwind essentially do the same thing, only difference is it's not part of the Windows Server OS but rather, a third-party software SAN solution? Does it still give redundancy if one server goes off-line?
Pity it doesn't support DAC; the MD3xxx series from Dell does, although it's lower spec (and lower price) than Compellent.

A virtual SAN runs in a VM on the local host and creates an iSCSI target that other VMs can use. Starwind is fairly simple: it's limited to two storage nodes that replicate between themselves, so if either storage host fails the cluster stays up. You can still have 3 or more servers, but only two of them would make up the vSAN. Others, such as HPE StoreVirtual, can use more storage nodes and support network RAID 5 rather than just mirroring two nodes; more expensive of course, but as you add compute nodes to your cluster you can also add vSAN appliances at the same time. In all cases the storage nodes are compute nodes as well. You would generally use separate switches for iSCSI, but you can use VLANs instead.
Hi,

In addition to all the comments above, also keep in mind that running virtual machines on any SOHO NAS requires it to have enough memory to run the virtual machines in the first place. You probably won't be able to configure 30 to 40 virtual machines due to a lack of available resources.

After you know the reserved system memory, you can calculate how much memory is available for the virtual machines.

If your Synology NAS has 8 GB of memory, DSM allocates 1.5 GB for system use and keeps 6.5 GB for virtual machines. If your Synology NAS has 16 GB of memory or more, DSM reserves 10% of the total memory for system use and the rest is available for virtual machines. A NAS with 512 GB of memory will generally be enough to accommodate about 20 to 24 virtual machines.

Also keep in mind that each vCPU requires 80 MB.
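Working that rule through for a larger unit (the model size and per-VM sizing below are hypothetical, just to show the arithmetic):

```python
# Memory left for VMs under the DSM rule described above: an 8 GB unit reserves
# 1.5 GB, a unit with 16 GB or more reserves 10%, and each vCPU adds ~80 MB.
# (Sizes between 8 and 16 GB aren't covered by the rule quoted above.)

def dsm_vm_memory_gb(total_gb: float, total_vcpus: int) -> float:
    reserved_gb = total_gb * 0.10 if total_gb >= 16 else 1.5
    vcpu_overhead_gb = total_vcpus * 80 / 1024
    return total_gb - reserved_gb - vcpu_overhead_gb

# Hypothetical example: a 512 GB unit hosting 40 VMs with 2 vCPUs each
available = dsm_vm_memory_gb(512, 40 * 2)
print(f"Available for VM memory: ~{available:.0f} GB")
print(f"Average per VM across 40 VMs: ~{available / 40:.1f} GB")
```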

Cheers
I don't believe anyone was suggesting that a NAS would contain the compute resources for the VMs... only the storage.

Starwind supports much more than simple mirroring. It has support for some pretty good-sized clusters capable of running even after more than 2 node failures. It also runs natively on Hyper-V without requiring a VM to be the storage controller.

https://www.starwindsoftware.com/grid-architecture-page
AFAIK it is only 3-way mirroring, but there is no white paper, just a webinar. Although you can have 6 nodes, they would be split into 2 or 3 clusters.

They say "N+1 or N+2 systems, where each component has one or two backup partners". Well, that's not N+1 or N+2; it's 1+1 or 1+2. HPE does Network RAID 5 and Network RAID 6, which is really N+1 or N+2 where N can be greater than 1. But HPE costs more.
E C (Asker):

Hi everyone - a quick update. We're still exploring options, and everyone's comments here have been very helpful. The high-end NAS would be for the (shared) storage of the VM VHD files; the servers will handle the compute power. The more I think about it, the more I am leaning against a high-end NAS, mainly because the entire company will rely on these VMs. And even if I got everything to work great, chances are one day something will break and I'd get no support from the server vendor (in my case, Dell) since I implemented it on my own. As I've mentioned before, low cost is not the deciding factor, and besides, no amount of discount can make up for the stress and the downtime. I was really just curious (A) whether this is being done out there in the real world and (B) how reliable it is. I'll report back when we get closer to a decision.
Support is a valid concern for any deployment, especially if you involve 2 or more vendors. That's true for SAN, NAS, and even hyperconverged. Don't forget that all of those use networking that is often from yet another vendor. Synology says they're certified for Hyper-V (at least 2012 and 2012 R2), if that helps. This doesn't mean they're in the same tier as the big-name players (Dell EMC, HPE, NetApp, Pure Storage, HDS) in terms of support or engineering.

https://www.synology.com/en-us/dsm/feature/storage_virtualization