IBM HS20 and ESX 2.x - Build process, problems and solutions

Here is another long-overdue article, this time about IBM HS20 blades and ESX 2.x. I had a fair bit of experience with ESX on HP kit when the IBM gear turned up: new BladeCenters and a nice shiny new SAN. According to my documented build process, all I had to do was stick the CD in and follow the steps. It wasn't to be. I hit some large configuration issues with the IBM gear, and this article hopes to help anyone in the same situation. Some parts of this document apply to all configurations, so they will also be covered in separate articles as well as being included here.
If you are deploying a large number of servers, you should look at automating your host deployment. If you are only installing a small number, there is little point spending days setting up an automated process when it is quicker to build them manually. The only reason to automate the build for a small number of hosts is consistency, and if you are pedantic about your process that should not be an issue anyway.

I will outline a build process for an IBM HS20 blade that I know works. This build process is for ESX 2.x, and I believe a lot of the issues outlined here are resolved in ESX 3.x. Hopefully I can update the build process for ESX 3.x here soon.

Some of the information was gleaned from the VMware knowledge base, some from the VMTN community forums and some from friends running on the same hardware. Credit where credit is due: I will attempt to acknowledge the people who helped with each part along the way. Ninety-nine percent of the compilation is my own work, though, and is outlined below. Feel free to adapt this build process to your own hardware and your own company's infrastructure. Do not copy it word for word without saying where you got it from. It is all about giving back to the community that we are trying to build here.

Straight off the bat we find the first problem with the IBM HS20 blade. The boot order out of the box is normally wrong, and the blade will not boot from the hard disk or the CD-ROM drive. You will need to go into the BIOS and change the boot order, then configure the RAID array on the blade that ESX will be installed on. My normal configuration is an internal RAID 1 mirror, with the data drives on the SAN. Boot from SAN is out of scope and is large enough to be a new article anyway.

The HS20 cannot be built in GUI mode, due to USB issues with the BladeCenter itself. That's OK, because I personally think that you should be using text mode.
Get elbows deep in the software; you will learn more that way. This build process is broken up into the pages that you see during the install, and it assumes that you hit next after each page described.

• Boot from the ESX 2.x CD and select text mode at the menu prompt by typing in "text".
• Select OK on the welcome screen.
• Select Default.
• Accept the license agreement.
• Enter your serial numbers. I keep all my serial numbers up to date in a spreadsheet, copied from my vmware.com login. As each one is used, the spreadsheet is updated with the hostname.
• The defaults for sharing on the SCSI and QLogic drivers are OK on this menu. Just leave them be.
• On the network config page, make sure that one network card is set to the console and one is set to the VMs. We will change this later, as the HS20 only has 2 network cards, which must be forced to run at gigabit speed if you are using the copper pass-thru modules on the BladeCenter.
• On the device allocation screen, change the reserved memory to 800 MB. Always select 800 MB; do not worry about the recommendations. The reserved memory needs to be large so that the service console can do the things you want it to, like backups and scripts.
• Select yes to initialise the drive when prompted.
• On the partitioning screen select auto and remove all. Select yes to the warning that appears.
• On the partition screen we want to create new partitions for /tmp, /usr and /var. The /tmp partition is for temporary files, the /usr partition holds installed programs and libraries, and the /var partition is for files that vary in size, such as logs. Due to the abundance of internal disk, and the fact that I do not store any VM data on the internal disks, I like to go big with these partitions. Select new, type the name for the /tmp partition and change the size to 10240 MB. Make sure the file system is ext3 and leave the rest of the defaults. Rinse and repeat for /usr and /var.
• If you have the space, make your root partition nice and large as well. The default 2 GB is not enough if you intend to upgrade the server to ESX 3.0 down the track.
• Once completed, select OK on the partitioning screen to get out.
• On the network config page, fill in your network details.
• Select your time zone.
• Enter your root password.
• If you want to create new users, do that here on the new users screen.
• Select OK on install to begin.
• Wait for the installation to complete.
• After the installation completes, hit next.
• Wait for the reboot to initiate and remove the CD, as the eject will fail.

Post install, put the server name into DNS, then change the NIC bound to the console to shared. This part of the document was taken from the VMware community forums, and credit goes to the poster who originally wrote it, though I cannot find his name now. Just running vmkpcidivy -i is not enough; it will not work on its own. You need to follow the config outlined below to successfully share the service console NIC with the VMs.

CHANGE NIC BOUND TO CONSOLE TO SHARED
• Let the machine boot up. When it comes up, press ALT-F2 to bring up the console login and log on to the machine as root
• When logged on, you should get a prompt similar to [root@<machine name> root]#
• Type vmkpcidivy -i
• Press ENTER to accept all the default values, which are shown in [ ], until you get to the NICs (it's about 4 or 5 enters)
• When it gets to the Ethernet Controller section, look for the one with the default value of [c]. When you see it, change it to shared mode by typing s then ENTER
• Leave the other values as [v] by hitting ENTER
• At Commit changes (y/n) make sure it says [y] and hit ENTER

CREATING THE BOND: type the command shown after each prompt
• [root@<machine name> root]# cd .. (enter)
• [root@<machine name> /]# (note the prompt has changed, the root has gone)
• [root@<machine name> /]# cd etc (enter)
• [root@<machine name> etc]# cd vmware (enter)
• [root@<machine name> vmware]# vi hwconfig (enter)
• This will open up the editor vi and you will be looking at a file.
• Use the down arrow key to scroll down to the last line, then use the right arrow key to scroll right to the last character of the last line, then press i, then press right arrow once, then press ENTER. Your cursor should now be on the line below
• Add the following two lines (NOTE: the last character of bond1 is the number ONE, not the letter L). Press ENTER after each line, including the last
nicteam.vmnic0.team = "bond1"
nicteam.vmnic1.team = "bond1"
• Press the ESC key twice
• Type :wq! (including the : ) then ENTER
• You should be back at the prompt [root@<machine name> vmware]#
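For anyone who would rather not drive vi by hand, the hwconfig edit above can be sketched as a small shell script. This is only a sketch under assumptions: the HWCONFIG variable and the add_team_line helper are names made up for illustration, and by default the script works on a scratch file rather than the real /etc/vmware/hwconfig. Take a backup before pointing it at the real file.

```shell
#!/bin/sh
# Sketch: append the NIC team entries to a hwconfig-style file without vi.
# HWCONFIG is a hypothetical variable; it defaults to a scratch file so the
# script is safe to try anywhere. On a real host it would be set to
# /etc/vmware/hwconfig (after taking a backup).
HWCONFIG="${HWCONFIG:-$(mktemp)}"

# If the file is non-empty and its last line has no trailing newline, add one.
# This does the same job as pressing ENTER at the end of the last line in vi.
if [ -s "$HWCONFIG" ] && [ -n "$(tail -c 1 "$HWCONFIG")" ]; then
    echo >> "$HWCONFIG"
fi

# add_team_line is a hypothetical helper: $1 = vmnic name, $2 = bond name.
# It appends the entry only if that exact line is not already in the file.
add_team_line() {
    line="nicteam.$1.team = \"$2\""
    grep -qxF "$line" "$HWCONFIG" || echo "$line" >> "$HWCONFIG"
}

add_team_line vmnic0 bond1
add_team_line vmnic1 bond1

# Show the result so it can be checked by eye.
cat "$HWCONFIG"
```

Running it a second time against the same file should change nothing, because each entry is only appended when it is missing.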
GIVE NETWORK ACCESS TO SERVICE MODULE: type the command shown after each prompt
• [root@<machine name> vmware]# cd .. (enter)
• [root@<machine name> etc]# cd .. (enter)
• [root@<machine name> /]# cd etc (enter)
• [root@<machine name> etc]# vi rc.local (enter)
• This will open up the editor vi and you will be looking at a file.
• Use the down arrow key to scroll down to the last line, which should be # END_OF_VMWARE_RC_DOT_LOCAL
• Press i then ENTER TWICE to add two blank lines
• Press the UP ARROW key once to put the cursor on the blank line above
• Add the following lines (NOTE: the last character of bond1 is the number ONE, not the letter L). Press ENTER after each line, including the last
#vmxnet_console through bond1
/etc/rc.d/init.d/network stop
rmmod vmxnet_console
insmod vmxnet_console devName=bond1
/etc/init.d/network start
mount -a
• Press the ESC key twice
• Type :wq! (including the : ) then ENTER
• You should be back at the prompt [root@<machine name> etc]#
• Type reboot then press ENTER. The system should go through a shutdown process
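The rc.local edit can be sketched the same way. My reading of the vi steps above is that the new lines end up just above the # END_OF_VMWARE_RC_DOT_LOCAL marker, so that is what this sketch does; treat that placement as an assumption. RCLOCAL and MARKER are names made up for illustration, and the script works on a scratch file unless RCLOCAL is set to something else.

```shell
#!/bin/sh
# Sketch: insert the vmxnet_console block above the end marker of an
# rc.local-style file. RCLOCAL is a hypothetical variable that defaults to
# a scratch file; nothing under /etc is touched unless you point it there.
RCLOCAL="${RCLOCAL:-$(mktemp)}"
MARKER="# END_OF_VMWARE_RC_DOT_LOCAL"

# Demo only: seed an empty scratch file with the marker line.
[ -s "$RCLOCAL" ] || echo "$MARKER" > "$RCLOCAL"

# Refuse to guess where the block belongs if the marker is missing.
grep -qxF "$MARKER" "$RCLOCAL" || { echo "marker not found in $RCLOCAL" >&2; exit 1; }

if grep -q "insmod vmxnet_console devName=bond1" "$RCLOCAL"; then
    echo "block already present, nothing to do"
else
    # Print the block (plus one blank line) just before the marker line,
    # then copy the result back with cat so the original file keeps its
    # permissions (mv would replace it with a non-executable file).
    awk -v marker="$MARKER" '
        $0 == marker {
            print "#vmxnet_console through bond1"
            print "/etc/rc.d/init.d/network stop"
            print "rmmod vmxnet_console"
            print "insmod vmxnet_console devName=bond1"
            print "/etc/init.d/network start"
            print "mount -a"
            print ""
        }
        { print }
    ' "$RCLOCAL" > "$RCLOCAL.new" \
        && cat "$RCLOCAL.new" > "$RCLOCAL" \
        && rm -f "$RCLOCAL.new"
fi

cat "$RCLOCAL"
```

The check for an existing insmod line is there so that running the script twice does not stack two copies of the block in the file.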
Additional Config

Log into the MUI (for all the young players, the MUI is reached by pointing a web browser at the server name). Look at the warnings on the front page of the MUI. For "No swap space is configured", select reconfigure. In the child window, select create. Leave the defaults and select OK, then select OK again.

Using PuTTY, connect to the server name and log in as "root". Type "vmkusagectl install". This starts the vmkusage logging, which can be accessed by pointing a web browser at http://servername/vmkusage. Type "reboot". If you are on a Windows server and not at the console, run a "ping -t" against the server from the run command to find out when it is back up.

When the server is back up, configure the time service. This is the process I use to stop the time wandering in my VMs, so it should work for you as well. This information was taken from the VMware.com website a long time ago and is a cut and paste.

To configure NTP on the Service Console, follow these steps:
1. Log on to the console as the root user.
2. Edit the file /etc/ntp.conf. The comments in ntp.conf explain the purpose of each section.
3. Find the section titled # --- OUR TIMESERVERS -----
o Copy the existing restrict and server example lines:
# restrict mytrustedtimeserverip mask 255.255.255.255 nomodify notrap noquery
# server mytrustedtimeserverip
o Remove the # character from the two newly copied lines so they are no longer treated as comments.
o Update both of the new lines with the FQDN of the NTP server, for example yourservername.domain.com.
Repeat this section to add more time servers, if needed.
Note: There must be both a restrict and a server line in this section for each NTP server.
4. Save the file.
5. Edit the file /etc/ntp/step-tickers. In this file, list the host names of the NTP servers entered above.
6. Add each NTP server to /etc/hosts to minimize the impact of DNS lookup failures during NTP synchronization.
7. To see the offset (in seconds) between the local clock and the source clock, run: ntpdate -q time_server_name_or_ip_address
Note: If the correction resulting from synchronizing the local clock with the time server is large enough, it could affect the operating systems or applications running in virtual machines when they synchronize their clocks with the ESX Server they are running on.
8. To enable the ntp daemon to autostart when the server is rebooted, run: chkconfig --level 345 ntpd on
9. To (re)start it now without rebooting, run: service ntpd restart
10. To set the local hardware clock to the NTP synchronized local system time, run: hwclock --systohc
11. To watch the status of the ntpd process, run: watch ntpq -p
Press Ctrl-c to stop watching the process. Note the information in the following columns:
o The character in the first column indicates the quality of the source. * indicates the source is the current reference.
o remote lists the IP address or host name of the source.
o when indicates how many seconds have passed since the source was polled.
o poll indicates the polling interval. This value increases depending on the accuracy of the local clock.
o reach is an octal number that indicates reachability of the source. A value of 377 indicates the source has answered the last eight consecutive polls.
o offset is the time difference between the source and the local clock in milliseconds.

After the host time service has been configured, you will need to disable the Windows time service in the virtual machine and get the virtual machine to sync with the host via VMware Tools. This is done by clicking on the VMware Tools icon near the clock in the VM and checking the sync time with host box in the options.

If you need to zone up your SAN and configure arrays and LUNs, then this is the time to do it. Add your host to the VirtualCenter farm; this can be done by right-clicking on the farm and selecting add a new host.
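Going back to the time service for a moment: steps 3 and 5 of the NTP procedure are the same few lines repeated for each time server, so they are easy to generate. The sketch below writes them to a scratch directory for review instead of touching /etc. OUT, NTP_SERVERS and the example.com host names are all placeholders made up for illustration, not values from any real setup.

```shell
#!/bin/sh
# Sketch: generate the per-server NTP entries from steps 3 and 5.
# OUT is a hypothetical scratch directory; nothing under /etc is modified.
OUT="${OUT:-$(mktemp -d)}"

# Placeholder server names; replace with your own time servers.
NTP_SERVERS="ntp1.example.com ntp2.example.com"

# Start from empty output files.
: > "$OUT/ntp.conf.add"
: > "$OUT/step-tickers"

for srv in $NTP_SERVERS; do
    # Step 3: one restrict line and one server line per time server.
    echo "restrict $srv mask 255.255.255.255 nomodify notrap noquery" >> "$OUT/ntp.conf.add"
    echo "server $srv" >> "$OUT/ntp.conf.add"
    # Step 5: the same host name goes into step-tickers.
    echo "$srv" >> "$OUT/step-tickers"
done

echo "Lines to add under OUR TIMESERVERS in /etc/ntp.conf:"
cat "$OUT/ntp.conf.add"
echo "Lines for /etc/ntp/step-tickers:"
cat "$OUT/step-tickers"
```

The output still has to be pasted into /etc/ntp.conf and /etc/ntp/step-tickers by hand, which keeps a human in the loop for the part that can break time sync on the host.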
You also need to configure your VMotion settings at this point.

Configure your SNMP information by editing the file /etc/snmp/snmpd.conf so it contains something similar to the example below.

syscontact root@localhost (edit /etc/snmp/snmpd.conf)
syslocation room1 (edit /etc/snmp/snmpd.conf)
rocommunity yourcommname.domain.com
trapcommunity yourcommname.domain.com
trapsink yoursnmpserver.domain.com

• Log in to the MUI, go to options and then to SNMP configuration.
• Set the master SNMP agent to started by clicking start.
• Set the master SNMP agent startup type to automatic by clicking on automatic.
• Select close window.
• Log out of the MUI.

That's about it for the build process. There are problems with the HS20 where, by default, it will not reboot itself or shut itself down when told to. The fix for this is located here. There are also known issues with USB: if the console is switched from one host to another, sometimes when you switch back to an ESX host on an HS20 you will get no KVM access. This appears to be resolved in the latest ESX patch and the latest firmware from IBM for both the HS20 and the management module of the BladeCenter.

SNMP documentation also located here. Time information also located here. Console NIC sharing information also located here.

That's about it for this one. A long one, but I hope it helps.