Basic Xen Install (Testing)

Responsible Person

GrahamJenkins, VPAC

Technology Summary and Contents

Basic Installation, Creation of Domain 0

  • If you have an IBM xSeries-346 server, you should boot from a recent (Jan 2006 or later) Update-Express CD (#1) and select all available updates. It will probably be necessary to reboot with the CD still loaded so that the update process completes satisfactorily.
  • If you have an IBM ServeRAID card and 2 or more disks, install the root disks in positions 0 and 1, then boot from the (green) ServeRAID support disk, and use the wizard to mirror those disks in their entirety.
  • From http://mirror.centos.org/centos/5/isos/i386/ download CentOS-5.0-i386-bin-1of6.iso. Don't use an x86_64 image, as some VDT applications aren't supported on that architecture.
  • Write to a CD, then boot from it.
  • Create a custom layout on the disk: sda1 / ext3 (4G), sda2 swap (512M), sda3 LVM (the balance, containing logical volume XenGuests with a 60G ext3 filesystem mounted as /XenGuests).
  • Assign a static IP address.
  • During software selection, tick 'Customise Now' and unselect every item in every group except the Base component in the Base System group.
  • After reboot, accept the default Firewall configuration and also elect to disable SELinux.
  • Then: yum update and yum install libvirt kernel-xen ntp
  • Edit: /etc/ntp.conf for your site, then do: chkconfig ntpd on
  • Edit: /boot/grub/grub.conf so that the Xen kernel boots by default, and add dom0_mem=256MB to the end of the '/boot/xen..' line.
  • Then: init 6
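
The dom0_mem part of the grub.conf edit above can be sketched as a small filter. This is a minimal sketch under stated assumptions, not the procedure used on any particular host: the function name add_dom0_mem is hypothetical, and it assumes the hypervisor is loaded by a "kernel" line whose path contains "xen.gz". The modified file is written to standard output so that it can be inspected before it replaces the original.

```shell
# add_dom0_mem FILE SIZE   (hypothetical helper)
# Print FILE with " dom0_mem=SIZE" appended to every hypervisor line
# (a "kernel" line mentioning xen.gz) that does not already set dom0_mem.
add_dom0_mem() {
    awk -v mem="$2" '
        /^[ \t]*kernel[ \t].*xen\.gz/ && !/dom0_mem=/ {
            sub(/[ \t]+$/, "")            # drop trailing whitespace first
            $0 = $0 " dom0_mem=" mem
        }
        { print }
    ' "$1"
}
```

For example, add_dom0_mem /boot/grub/grub.conf 256MB > /tmp/grub.conf.new, then compare the two files with diff before copying the new one into place. The "default=" entry still has to be set by hand.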

NIC Allocation

  • By default, Domain0 uses eth0. In the interests of uniformity on IBM xSeries-346 machines, it is desirable that eth0 corresponds to NIC 1 on the motherboard. To bring this about, you may need to do something like this (to switch eth0 and eth2):
  • ifdown eth0
  • cd /etc/sysconfig/network-scripts
  • cp ifcfg-eth[02] /tmp; cp ifcfg-eth2 ifcfg-eth0; cp /tmp/ifcfg-eth0 ifcfg-eth2
  • vi ifcfg-eth[02] [switch names in "DEVICE" line]
  • ifup eth0
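
The copy-and-edit steps above can be combined into one helper. This is only a sketch: swap_ifcfg is a hypothetical name, and it assumes each file carries a single "DEVICE=" line. It does not bring interfaces up or down, so the ifdown and ifup steps are still needed around it.

```shell
# swap_ifcfg DIR A B   (hypothetical helper)
# Exchange the contents of DIR/ifcfg-A and DIR/ifcfg-B, then rewrite the
# DEVICE= line in each so that it matches the file's new name.
swap_ifcfg() {
    dir=$1 a=$2 b=$3
    tmp=$(mktemp -d) || return 1
    # Keep untouched copies of both files while the originals are overwritten.
    cp "$dir/ifcfg-$a" "$dir/ifcfg-$b" "$tmp/" || { rm -rf "$tmp"; return 1; }
    sed "s/^DEVICE=.*/DEVICE=$a/" "$tmp/ifcfg-$b" > "$dir/ifcfg-$a"
    sed "s/^DEVICE=.*/DEVICE=$b/" "$tmp/ifcfg-$a" > "$dir/ifcfg-$b"
    rm -rf "$tmp"
}
```

For example: swap_ifcfg /etc/sysconfig/network-scripts eth0 eth2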

Domain 0 Monitoring

  • Copy the appropriate version of 'ipssend' from the (green) ServeRAID support disk to /usr/bin and ensure that it is executable by root.
  • Do: cd /etc/yum.repos.d && wget http://www.grid.apac.edu.au/repository/dist/APAC-Grid.repo
  • Then do: yum install Gpulse Gbuild
  • Configure sendmail or install and configure postfix as appropriate for your site. For instance: yum install postfix .. then in /etc/postfix/main.cf, add relayhost = [your.smtp.gateway] in the appropriate section.
  • Activate sendmail or postfix. For instance: service sendmail stop; chkconfig --del sendmail; chkconfig --add postfix; service postfix start
  • When executed in non-interactive mode, gridpulse.sh delays mailing the machine's status for a pseudo-random interval (up to 3 minutes) so as to reduce the impact of multiple simultaneous arrivals at the mail server.
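
The idea of a per-host pseudo-random delay can be illustrated as follows. This is not the code in gridpulse.sh, whose implementation is not shown here; pseudo_delay is a hypothetical function that derives a value from a checksum of the host name and process id, so that machines started at the same instant tend to pick different delays.

```shell
# pseudo_delay MAX   (hypothetical helper, not taken from gridpulse.sh)
# Print a number of seconds between 0 and MAX inclusive, derived from a
# checksum of the host name and the current shell's process id.
pseudo_delay() {
    sum=$(printf '%s-%s' "$(uname -n)" "$$" | cksum | cut -d ' ' -f 1)
    echo $(( sum % ($1 + 1) ))
}
```

A script could then run: sleep "$(pseudo_delay 180)" before sending its mail.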

Domain Shutdown Actions

  • By default, when Xen host shutdown occurs, the state of each guest domain will be saved as a large file in directory: /var/lib/xen/save
  • This is probably not a good idea; you can change this behaviour by changing the XENDOMAINS_SAVE value in file: /etc/sysconfig/xendomains
  • It is suggested that you change the value to "" so that guest domains are shut down rather than saved; this will ensure that guest domains come up with a correct time after Xen host startup.
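
The change to XENDOMAINS_SAVE can be made with a one-line filter. A minimal sketch, assuming the file contains an uncommented line beginning "XENDOMAINS_SAVE="; the function name clear_xendomains_save is hypothetical.

```shell
# clear_xendomains_save FILE   (hypothetical helper)
# Print FILE with any XENDOMAINS_SAVE assignment replaced by an empty value.
clear_xendomains_save() {
    sed 's|^XENDOMAINS_SAVE=.*|XENDOMAINS_SAVE=""|' "$1"
}
```

For example: clear_xendomains_save /etc/sysconfig/xendomains > /tmp/xendomains.new, then review the result and copy it into place.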

Creation of Other Domains (e.g. NG2)

  • Note: A variation of this install procedure is to use a local NFS mount, as described in the next section.
  • Pick a unique MAC address beginning in "00:16:3e" then use it in a command like:
    virt-install -n NG2 -r 512 -f /XenGuests/NG2 -s 8 -m 00:16:3e:00:00:a0  -p -l http://ftp.monash.edu.au/pub/linux/CentOS/5/os/i386/ --nographics -x \
    "ks=http://www.grid.apac.edu.au/repository/dist/production/files/C5Guest_ks.cfg ip=131.170.184.141 netmask=255.255.255.0 dns=131.170.184.1 gateway=131.170.184.254"
    
    The "-r" parameter can take a value of 256 for gateway machines other than NG2.
  • This will build a guest with a 7G primary root partition and a 1000M primary swap partition. If you need additional (e.g. "/data") partitions, adjust the "-s" value above accordingly so that there will be room to allocate space once the guest has been built.
  • When prompted, select "System clock uses UTC" and choose a time zone. Also, when prompted, supply a root password.
  • Then: yum install ntp postfix (or other mail program as used at your site).
  • Edit /etc/ntp.conf and /etc/postfix/main.cf appropriately for your site. Also change network settings as required; for a gateway machine like NG2 where users might be authenticated, you may need to edit /etc/nsswitch.conf, /etc/yp.conf and /etc/hosts.
  • chkconfig ntpd on; chkconfig --del sendmail; chkconfig --add postfix; service ntpd start; service sendmail stop; service postfix start (If your site uses sendmail, perform whatever equivalent operations are appropriate.)
  • cd /etc/yum.repos.d && wget http://www.grid.apac.edu.au/repository/dist/APAC-Grid.repo
  • Add the following two lines to: /etc/sysctl.conf:
    vm.min_free_kbytes = 32768
    xen.independent_wallclock = 1
  • Then do: init 6
  • Most gateway VMs will need a host certificate: HostCertRequestAPAC
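
One way to keep guest MAC addresses unique is to derive them from a per-guest number that the site records somewhere. The sketch below illustrates this; guest_mac and the numbering scheme are assumptions made for the example, not part of the procedure above.

```shell
# guest_mac N   (hypothetical helper)
# Print a MAC address with the 00:16:3e prefix whose last three octets
# encode the integer N (0 .. 16777215).
guest_mac() {
    n=$1
    printf '00:16:3e:%02x:%02x:%02x\n' \
        $(( (n >> 16) & 255 )) $(( (n >> 8) & 255 )) $(( n & 255 ))
}
```

With this scheme, guest_mac 160 yields the address used in the virt-install example above (00:16:3e:00:00:a0).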

Installing from a local NFS-mounted directory -- for Guest DomUs

  • % virt-install -n NG2 -r 512 -f /srv/XenGuests/NG2 -s 8 -m 00:16:3e:00:00:a0  -p -l nfs:tl4:/l0/centos-5.0 -x "ks=nfs:tl4:/l0/C5Guest_ks.cfg"
  • "tl4" is our test Dell 1950 box
  • To speed up the installation, rather than pointing at Monash's CentOS mirror we've copied the contents of the CentOS 5 ISOs onto local disk and exported that directory over NFS, since Anaconda does not understand "file:///" URLs, i.e. nfs:tl4.anu.edu.au:/l0/centos-5.0
  • The kickstart script is also snarfed from NFS: "ks=nfs:tl4.anu.edu.au:/l0/C5Guest_ks.cfg"
  • In the kickstart script we've replaced the Monash URL http://ftp.monash.edu.au/pub/linux/CentOS/5/os/i386/ with nfs i.e. the server is tl4 and the directory is /l0/centos-5.0
  • Use of "nonsparse" disk images -- http://www.mail-archive.com/fedora-xen@redhat.com/msg00431.html

Backing Up Guest Images

  • You can backup all guests on your Xen host thus:
  • lvcreate -L 2G -s -n Snapshot /dev/VolumeGroup00/XenGuests
  • mount /dev/VolumeGroup00/Snapshot /mnt
  • cd /mnt && nice tar cjf backup@backuphost:Guests.tbz  --rsh-command=/usr/bin/ssh . (or otherwise, as appropriate)
  • cd /tmp && umount /mnt && lvremove /dev/VolumeGroup00/Snapshot
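
The snapshot, archive and clean-up steps above can be wrapped so that the snapshot is removed even when the archive step fails. This is a sketch only: run and backup_guests are hypothetical names, the DRY_RUN variable is an addition for rehearsing the sequence without touching LVM, and tar's -C option is used in place of the cd in the steps above.

```shell
# run CMD...: execute CMD, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# backup_guests VG DEST   (hypothetical helper)
# Snapshot /dev/VG/XenGuests, archive its contents to DEST, then clean up.
backup_guests() {
    vg=$1 dest=$2 status=0
    run lvcreate -L 2G -s -n Snapshot "/dev/$vg/XenGuests" || return 1
    if run mount "/dev/$vg/Snapshot" /mnt; then
        run nice tar cjf "$dest" --rsh-command=/usr/bin/ssh -C /mnt . || status=1
        run umount /mnt
    else
        status=1
    fi
    # Always remove the snapshot so that it cannot fill up and go stale.
    run lvremove "/dev/$vg/Snapshot"
    return $status
}
```

To rehearse: DRY_RUN=1 backup_guests VolumeGroup00 backup@backuphost:Guests.tbz prints the five commands without running them.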

Recovering Data from Guest Images

  • lomount -t ext3 -diskimage /XenGuests/NG2 -partition 1 /mnt
  • Guest root filesystem will then be found under /mnt
  • Note: specifying -verbose to lomount shows the parameters that it passes on to mount

Running fsck on a Guest's disk image

  • Using virt-install with the kickstart script creates a file on disk which contains the boot partition and the DomU's partition
  • If you have damaged a DomU image by mounting it under the Dom0 and writing to it while the DomU was up, you need to fsck the disk image
  • Here are the steps to fsck the DomU's disk using Linux's loopback device infrastructure:
  • a) What is the current offset of the partition of interest? i.e. where does your DomU's data partition start on disk? (Assume /XenGuests/NG2)
  • fdisk -l -u /srv/XenGuests/NG2
                        Device Boot      Start         End      Blocks   Id  System
          /srv/XenGuests/NG2p1   *          63    14683409     7341673+  83  Linux
          /srv/XenGuests/NG2p2        14683410    16723664     1020127+  82  Linux swap / Solaris
         
  • From this we know that /XenGuests/NG2p1 starts at sector 63. Multiply that by 512 (bytes per sector) to obtain the byte offset: 63 x 512 = 32256
  • b) Loopback mount this offset onto a free /dev/loop entry. losetup -f gives this value. losetup -a gives a list of all currently instantiated loopback devices.
  • losetup `losetup -f` /XenGuests/NG2 -o 32256
  • losetup -a to confirm
        # losetup -a
        /dev/loop0: [fd06]:49153 (/srv/XenGuests/NG2), offset 32256
        
  • c) fsck the device --
        # fsck /dev/loop0
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        /1: clean, 32146/1836768 files, 275563/1835418 blocks
        
  • d) Delete the loopback device -- losetup -d /dev/loop0
  • References: Mounting Loopback devices, Working with filesystems
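
The offset arithmetic in step a) can be scripted. A minimal sketch, assuming 512-byte sectors and the fdisk output layout shown above; partition_offset is a hypothetical helper, not an existing tool.

```shell
# partition_offset N   (hypothetical helper)
# Read "fdisk -l -u IMAGE" output on standard input and print the byte
# offset of partition N, i.e. its start sector multiplied by 512.
partition_offset() {
    awk -v suffix="p$1" '
        substr($1, length($1) - length(suffix) + 1) == suffix {
            start = ($2 == "*") ? $3 : $2   # skip the boot flag column if set
            printf "%.0f\n", start * 512
        }
    '
}
```

For example: fdisk -l -u /srv/XenGuests/NG2 | partition_offset 1 prints the value to pass to losetup -o.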

Starting/Stopping Domains Automatically

  • To start/stop domains automatically during xen host boot/shutdown ..
  • chkconfig --add xendomains; service xendomains start
  • ln -s /etc/xen/NG2 /etc/xen/auto (for 'NG2')
  • When the xendomains service is activated in this way, kernel file images get written periodically into the directory: /var/lib/xen
  • You should therefore install the XenClean.cron script as follows and ensure that it is executable:
    cd /etc/cron.hourly && wget http://www.grid.apac.edu.au/repository/dist/production/files/XenClean.cron

Using Additional NICs

  • To make the additional interfaces accessible, change the 'vif' line in your Xen guest configuration script(s) so that it looks something like:
       vif = [ 'mac=00:16:3e:00:00:30, bridge=xenbr0',
               'mac=00:16:3e:00:00:31, bridge=xenbr1' ]
       
  • Also in directory /etc/xen/scripts do:
       mv network-bridge network-bridge.dist
       cat >network-bridge <<"EOF" 
       #!/bin/sh
       /etc/xen/scripts/network-bridge.dist $1 netdev=eth0 bridge=xenbr0 vifnum=0
       /etc/xen/scripts/network-bridge.dist $1 netdev=eth1 bridge=xenbr1 vifnum=1
       EOF
       chmod a+xr network-bridge
       
  • Then reboot.

  • By default, CentOS 5 Xen sets up only 4 network loopback devices. Edit /etc/modprobe.conf to enable more:
       options netloop nloopbacks=7
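
When this edit is scripted, it helps to add the line only if it is not already there, so that repeated runs do not duplicate it. A small sketch; ensure_line is a hypothetical helper.

```shell
# ensure_line FILE LINE   (hypothetical helper)
# Append LINE to FILE unless an identical line is already present.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

For example: ensure_line /etc/modprobe.conf 'options netloop nloopbacks=7'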
       

pciback -- A quick primer on getting pciback up on a given DomU

  • What is it? pciback is a means of giving a DomU direct access to a PCI device rather than going via Dom0.
  • Value proposition? Achieving close to wire speed from a DomU. Benchmarks coming soon!
  • Tell the Dom0 to keep off a given PCI id by adding this in /etc/modprobe.conf in Dom0
       # These two lines are to ensure that before the BNX2 network driver is
       # loaded, pciback gets a chance to hide the devices that should be
       # directly used in Xen domUs. -JAO, 2007-04-18
       options pciback hide=(0000:09:00.0)
       install bnx2 /sbin/modprobe pciback ; /sbin/modprobe --ignore-install bnx2
       

  • On our box the requisite PCI id is 09:00.0
  • The Dell 1950 uses a newer Broadcom bnx2 chipset
  • The tg3 driver used by the IBM gateway box has "pci quirkiness" which is handled by the /etc/xen/xend-pci-quirks.sxp file
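
To find the PCI id of a candidate NIC on another box, the output of lspci can be filtered. This is a sketch under assumptions: nic_slots is a hypothetical helper, and it assumes lspci prints slots without a domain prefix (its default when only PCI domain 0000 exists), so the prefix is added to match the form used in the pciback "hide=" option.

```shell
# nic_slots   (hypothetical helper)
# Read "lspci" output on standard input and print the slot of every
# Ethernet controller with the PCI domain 0000 prepended.
nic_slots() {
    awk '/Ethernet controller/ { print "0000:" $1 }'
}
```

For example: lspci | nic_slots lists the candidates; choose the one that Dom0 does not need for its own networking.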

  • Confirmation from dmesg on dom0
       ...
       pciback 0000:09:00.0: seizing device
       ACPI: PCI Interrupt 0000:09:00.0[A] -> GSI 16 (level, low) -> IRQ 16
       ACPI: PCI interrupt for device 0000:09:00.0 disabled
       input: PC Speaker as /class/input/input4
       Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.4.44-1 (August 10, 2006)
       ACPI: PCI Interrupt 0000:05:00.0[A] -> GSI 16 (level, low) -> IRQ 16
       eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B1) PCI-X 64-bit 133MHz found at
       mem f8000000, IRQ 16, node addr 001372fb7ab1
       

  • We need to tell the DomU about the PCI ID we've hijacked via pciback. This is done by editing the appropriate /etc/xen/vmname.init file i.e. add the following in the /etc/xen/NGData file --
        pci = [ '0000:09:00.0' ]
        

  • Edit the appropriate /etc/sysconfig/network-scripts/ifcfg-eth{0,1} file and add the MAC address of the network card whose PCI id was hijacked by pciback, i.e. /etc/sysconfig/network-scripts/ifcfg-eth1
       # Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet
       DEVICE=eth1
       ONBOOT=yes
       HWADDR=00:13:72:fb:7a:b3
       BOOTPROTO=static
       IPADDR=192.43.239.49
       NETMASK=255.255.255.0
       TYPE=Ethernet
       

  • dmesg on DomU
        ...
       netfront: Initialising virtual ethernet driver.
       netfront: device eth0 has flipping receive path.
       ADDRCONF(NETDEV_UP): eth1: link is not ready
       bnx2: eth1 NIC Link is Up, 1000 Mbps full duplex
       ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
       eth0: no IPv6 routers present
       eth1: no IPv6 routers present
       ...
       
  • pciback Queries: JosephAntony, JasonOzolins

Booting up CentOS 4.4 Guest

  • This assumes that at least one CentOS 5 guest has been built. Copy the CentOS 5 guest's initrd over:

    lomount -t ext3 -diskimage /XenGuests/NG2 -partition 1 /mnt
    cp /mnt/boot/initrd-2.6.18-8.1.8.el5xen.img /boot/initrd-2.6.18-8.1.8.el5xenU.img
    umount /mnt
    

  • Edit /etc/xen/ngold

    kernel = "/boot/vmlinuz-2.6.18-8.1.8.el5xen"
    ramdisk = "/boot/initrd-2.6.18-8.1.8.el5xenU.img"
    name = "ngold"
    memory = "256"
    disk = [ 'phy:/dev/VolumeGroup00/ngoldRoot,xvda1,w',
             'phy:/dev/VolumeGroup00/ngoldSwap,xvda2,w']
    vif = [ 'mac=00:16:3e:XX:XX:XX, bridge=xenbr0', ]
    root = "/dev/xvda1"
    extra = "ro selinux=0 3"
    #bootloader="/usr/bin/pygrub"
    vcpus=1
    on_reboot   = 'restart'
    on_crash    = 'restart'
    

  • Copy over kernel modules

    mount /dev/VolumeGroup00/ngoldRoot /srv/ngold

    cd /lib/modules/`uname -r` && mkdir -p /srv/ngold/lib/modules/`uname -r` && find . -print | cpio -pdm /srv/ngold/lib/modules/`uname -r`
    

  • Edit /srv/ngold/etc/fstab

    /dev/xvda1              /                       ext3    defaults        1 1
    none                    /dev/pts                devpts  gid=5,mode=620  0 0
    none                    /dev/shm                tmpfs   defaults        0 0
    none                    /proc                   proc    defaults        0 0
    none                    /sys                    sysfs   defaults        0 0
    /dev/xvda2              swap                    swap    defaults        0 0
    

  • Edit /srv/ngold/etc/inittab for the gettys section

    # Run gettys in standard runlevels
    co:2345:respawn:/sbin/agetty xvc0 9600 vt100-nav
    #1:2345:respawn:/sbin/mingetty tty1
    #2:2345:respawn:/sbin/mingetty tty2
    #3:2345:respawn:/sbin/mingetty tty3
    #4:2345:respawn:/sbin/mingetty tty4
    #5:2345:respawn:/sbin/mingetty tty5
    #6:2345:respawn:/sbin/mingetty tty6
    

  • Append this line to the end of /srv/ngold/etc/rc.sysinit file

    /usr/bin/killall nash-hotplug
    

  • Don't forget to unmount the partition before boot up.

    umount /srv/ngold
    xm create -c ngold
    

  • Minor quirks: udev will not be able to start when booting the CentOS 4.4 guest; everything else should be normal.

-- GrahamJenkins - 22 May 2007

Topic revision: r87 - 25 Sep 2007 - 14:24:03 - WillHsu
 