--
DavidBannon - 13 Sep 2005
The Decision
Discussion
Some comments from someone using VMware GSX Server in a Grid development environment:
- 4 Gigs of RAM: we run 8 or so virtual machines in this amount of memory. For the most part we don't run X11 on them (or on the host), and each guest machine is generally intended to perform one task well. As a result, the memory requirement for each guest is generally lower than a standalone server's would be. Note that each guest can be allocated a different amount of RAM, and this amount can be altered trivially with a reboot of the guest (not the host).
- With regard to the Performance mode comment: I would be very surprised if the gatekeepers see 100% CPU use, or even 10%. Certainly in our case we simply see better CPU utilisation when the servers are under load from multiple virtual machines, rather than a performance loss.
- LOCAL Disk - make sure you have plenty of space for the guest machines, the host OS and any spare images you want to keep around. Our servers have PATA RAID 1 on the host (note the virtual machines can have virtual SCSI!). The most disruptive issue we have had has been insufficient disk space for all the virtual machines and their growth. You don't want to run the VMs over network disk: guests can access data and files via the network, but the guest OS itself should not live on a network disk.
- Ethernet ports - Virtual machines can share Ethernet ports, can have multiple virtual Ethernet cards, and the virtual Ethernet cards can connect to different physical Ethernet cards. There is a lot of flexibility. I would be surprised if it was necessary to have the suggested number of physical GigE cards.
- OS 2.6 kernels - we've run all of the suggested operating systems on VMware GSX Server. All work. There is an occasional glitch with getting networking and clock time stable, and we've not been able to figure out why in all circumstances. Debian with a 2.4 kernel is rock solid. Suse with a 2.6 kernel works, though I have yet to get the VMware network driver working (my bad, I expect, since I'm not familiar with Suse). Note the VMware network driver is higher performance than the default driver but requires a kernel module build, which is easy enough for most OSes. Clock sync is a known problem; a combination of NTP and/or sync to host clock time (a utility in VMware) usually fixes it.
- Standard images - SRB should probably be a standard image, along with a vanilla Linux OS
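As a rough illustration of the memory point above (each guest can get a different RAM allocation, and 8 or so guests fit in 4 GB), here is a minimal Python sketch of a budget check. The guest names, the per-guest sizes and the 512 MB host reserve are made-up values for illustration only; they are not taken from any real configuration.

```python
def fits_in_host(host_ram_mb, host_reserve_mb, guest_ram_mb):
    """Check whether per-guest RAM allocations fit on one host.

    Returns (fits, spare_mb), where spare_mb is what is left after the
    host reserve and all guest allocations are subtracted.
    """
    total_guest_mb = sum(guest_ram_mb.values())
    spare_mb = host_ram_mb - host_reserve_mb - total_guest_mb
    return spare_mb >= 0, spare_mb


# Hypothetical allocations: eight single-purpose guests, no X11.
guests = {
    "gatekeeper": 512,
    "srb": 512,
    "gridftp": 384,
    "portal": 384,
    "build": 384,
    "test1": 256,
    "test2": 256,
    "spare": 256,
}

# 4 GB host, with an assumed 512 MB kept back for the host OS.
fits, spare = fits_in_host(4096, 512, guests)
print(fits, spare)
```

Changing one entry in the dictionary and re-running the check mirrors what happens when a guest's allocation is altered and the guest is rebooted.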
--
RobertWoodcock - 09 Jun 2005
A couple of comments from me as well:
- Given everything is going to be running on these boxes, and in some cases there won't be spare parts "handy", we need to cover simple redundancy issues, e.g. dual power supplies and mirrored disks.
- The disk mirroring could be either hardware (simpler for distribution) or software (cheaper). I'd favor hardware, just because we are shipping the boxes around and can't be sure of the support level at all sites.
- GSX is listed at 4 virtual machines per CPU. Since we are all agreed on dual CPU, that is 8 VMs, which should be enough for ongoing work.
- 4 GBytes is sufficient RAM.
- Given 4-8 VMs per machine, we would need around 200 GB of disk for a reasonable amount of disk space per VM. I'd go for 250 GB SATA disks. Speed isn't a big issue.
- OS for the host system: I'd suggest CentOS 4, as it is a rebuild of RHEL4 and has the 2.6 kernel that David wants (I agree with him). Going for Fedora Core anything gives us long term problems due to their short support cycle.
- VMware on 2.6, while not officially supported, runs fine. I've been using VMware Workstation on 2.6 for over a year, and the discussion lists are full of people using it in that config without a problem.
- Guest OSs depend on the application requirement. Obviously they are mostly Linux, but I'd suspect we will get a mixture of 2.4 and 2.6, with various distros, depending on what the APAC maintainer is familiar with and lists as stable.
- I agree with Rob about ethernet ports. I feel that two GigE ports would be enough (one external, one internal). If it is really pumping through multi-gig data transfers then we need to look at a different mechanism for the data transfer.
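The sizing arithmetic in the comments above can be sketched in a few lines of Python. The figures (4 VMs per CPU, dual CPU, around 200 GB for 4-8 VMs) come from the comments; the function names are hypothetical, and splitting the disk evenly between VMs is a simplifying assumption.

```python
VMS_PER_CPU = 4  # GSX figure quoted in the comments above


def max_vms(cpus, vms_per_cpu=VMS_PER_CPU):
    """Number of VMs a host can carry at the quoted VMs-per-CPU figure."""
    return cpus * vms_per_cpu


def disk_per_vm_gb(total_gb, vm_count):
    """Even share of the guest disk budget per VM (assumes equal split)."""
    return total_gb / vm_count


vm_limit = max_vms(cpus=2)
print(vm_limit)                       # VMs on a dual-CPU box
print(disk_per_vm_gb(200, vm_limit))  # GB per VM with all 8 VMs in use
print(disk_per_vm_gb(200, 4))         # GB per VM with only 4 VMs in use
```

With the suggested 250 GB disks, whatever is not handed to guests stays available for the host OS and spare images.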
Topic revision: r1 - 13 Sep 2005 - 11:15:49 -
DavidBannon