Kickstart Installs From A USB Drive

04 13 2009

When installing a lot of identical or near-identical systems, having an installer answer file makes things go much quicker. By eliminating the need to manually select the same values over and over again, an answer file saves time and minimizes human error. Linux distributions with the Anaconda installer, like RedHat and CentOS, use a kickstart file for this purpose.

The challenge with kickstart files (and other installer answer files) has always been how to provide the file to the installer application. Since the OS media is generally read-only, the file has to be made available somewhere else.

In the olden days, kickstart configuration files were copied onto a floppy. That was simple and easy, but floppies were slow and had limited capacity. As networks got faster, network-based installs became the installation method of choice, and floppy disks slowly disappeared from systems.

While network-based installation is still the best choice in most situations, it’s not feasible when you have little control over the network infrastructure. Since floppy drives are extinct, the obvious next choice is to install from a USB drive (flash or hard disk). However, there’s one big gotcha: by default, Anaconda installs Linux onto all available drives, including the USB drive you’re trying to install from. Worse, most kickstart files use the “clearpart” option, which erases the partition table on all drives, again including the USB drive you’re trying to install from. Not good.

If all your hardware is the same, there’s an easy solution: hard-code the disk drive device name into your kickstart file. That way, Anaconda won’t touch the USB drive you’re installing from. For example, if your target disk drive is /dev/sda:

clearpart --all --drives=/dev/sda --initlabel
part /boot --fstype=ext3 --size=75 --ondisk=/dev/sda

But what if your hardware isn’t all the same and the target disk drive name varies? Managing multiple kickstart configuration files that are identical except for the disk drive device name is a big headache. There has to be an easier way.

%pre sections to the rescue! The %pre section of a kickstart configuration file contains a script that is run before Anaconda begins the OS installation. At first glance this may seem a bit useless — why run a script before the OS is installed? The answer is that the true power of %pre is to generate parts (or all) of your kickstart configuration dynamically, at install time. In this case, a script in the %pre section can figure out the correct disk drive device to install to, and then generate the appropriate kickstart configuration. For example:

%include /tmp/partitions

%pre
#!/bin/sh

# find the first drive that doesn't have removable media and isn't USB
DIR="/sys/block"
ROOTDRIVE=""
for DEV in sda sdb sdc sdd sde hda hdb hdc hdd hde; do
  if [ -d $DIR/$DEV ]; then
    ls -l $DIR/$DEV/device | grep -q /usb
    if [ $? -ne 0 ]; then
      REMOVABLE=`cat $DIR/$DEV/removable`
      if [ "$REMOVABLE" = "0" ]; then
        if [ -z "$ROOTDRIVE" ]; then
          ROOTDRIVE=$DEV
        fi
      fi
    fi
  fi
done

# Check for RAID controller disks
if [ -z "$ROOTDRIVE" ]; then
  for DEV in c0d0 c0d1 c0d2 c1d0 c1d1 c1d2; do
    if [ -d $DIR/cciss!$DEV ]; then
      if [ -z "$ROOTDRIVE" ]; then
        ROOTDRIVE=cciss/$DEV
      fi
    fi
  done
fi

cat << EOF > /tmp/partitions
bootloader --location=mbr --driveorder=$ROOTDRIVE
clearpart --all --drives=$ROOTDRIVE --initlabel
part /boot --fstype=ext3 --size=75 --ondisk=$ROOTDRIVE
part / --fstype=ext3 --size=4096 --ondisk=$ROOTDRIVE --asprimary --grow
part swap --size=2048 --ondisk=$ROOTDRIVE
EOF

This %pre section scans for the first non-USB, non-removable-media disk drive on the system, and then dynamically generates the kickstart partition directives hard-coded to that device. This ensures that Anaconda will not install to or erase the USB drive.

One side note: the “removable” flag in Linux is set for devices with removable media (e.g. DVD drives); it is not set for devices that are themselves removable/hot-swappable (e.g. USB drives). This is why the %pre script above checks explicitly for USB devices.
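
For example, the two checks the %pre script relies on can be run by hand. The device names below are hypothetical and will differ from system to system; they only illustrate what the script is looking at:

# A fixed internal disk reports 0 here; a DVD drive (e.g. sr0) reports 1
cat /sys/block/sda/removable

# A USB hard disk may also report 0, which is why the script instead checks
# whether the device symlink points through the USB bus
ls -l /sys/block/sdb/device | grep -q /usb && echo "sdb is a USB device"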



Faster Hardware or Faster Software?

12 25 2008

In his post Hardware is Cheap, Programmers are Expensive, Jeff Atwood asserts that companies should “always try to spend your way out of a performance problem first by throwing faster hardware at it.” While there are certainly cases where this is true, most of the time throwing hardware at a performance problem as the first step is the wrong thing to do.

Though it’s not clear in his post, I’m going to assume he’s talking about server software. When you’re writing software that runs on a client (desktop, laptop, smartphone, or other embedded device), upgrading the hardware is not an option. Trying to sell shrink-wrapped software that will only run on 5% of your target market’s current hardware is generally not a good idea (except perhaps if you’re selling a PC game).

On the server side, Jeff overlooks several key points. Most importantly, a poorly-written app will often not be able to take advantage of faster hardware. A single-threaded, extremely inefficient program will run only marginally faster on a box with multiple cores, 8GB of RAM, and a fast CPU. An app that uses a very large database with no indexes may also not benefit much from faster hardware. Throwing hardware at a bad architecture will not make it better.

In addition, new hardware imposes additional costs beyond the initial purchase price. You need to pay sysadmins to set up and maintain the hardware. You need to pay for power and cooling for the new hardware, which is not cheap if you’re at a good datacenter. Because your datacenter and network get more complex as you add lots of servers, management/overhead costs do not increase linearly either. Going from one server to ten is usually less than a 10x increase in operational costs, while going from ten servers to a hundred is usually more than a 10x increase in operational costs. Upgrades are also disruptive, and will impact your users/customers.

If you have a more complex application, it may not always be clear what needs to be upgraded. If your site is running slow, do you upgrade your load balancers, firewalls, webservers, appservers, database servers, or the SAN?

Jeff also overlooks the most important metric when looking at server apps: cost per user. Many dot-coms went under because their software required too much hardware per unit of revenue. In other words, if you need one $2000 server for every 200 users, and an average user generates $0.10 of revenue per month, it will take 100 months (over 8 years!) to pay back the cost of the hardware. (Remember, this is before overhead and other fixed costs.) At that rate, getting bigger doesn’t help; it just makes you lose money faster! If you could support 2000 users on the same server, you would pay back the cost of the hardware in less than a year. Now you would have a shot at getting to profitability.
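
To make the arithmetic concrete, here is the same payback calculation scripted out. The numbers are the hypothetical ones from the example above, not real data:

# months to pay back one server = server cost / (users * revenue per user per month)
SERVER_COST=2000
USERS_PER_SERVER=200
REVENUE_PER_USER=0.10
echo "scale=1; $SERVER_COST / ($USERS_PER_SERVER * $REVENUE_PER_USER)" | bc   # 100.0 months
echo "scale=1; $SERVER_COST / (2000 * $REVENUE_PER_USER)" | bc                # 10.0 months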

Put another way, Jeff is weighing the cost of a software engineer against the cost of one server, but this is not a fair comparison. Most server-side applications run on many systems, with some form of load balancer to spread out the workload. The alternative to improving the code usually isn’t buying one new server; it’s buying ten, twenty, or fifty new servers (assuming the application will scale to that many servers). If you have 20 servers, a 2x improvement in speed saves at least $40,000 right off the bat (not counting the overhead savings discussed earlier); now the ROI is starting to favor improving the code before buying new hardware. There may also be multiple deployments — one for each customer, or one per department, plus one for QA, one for engineering, etc. — which also increases the number of servers that would need to be upgraded.

So when is it appropriate to buy new hardware as the first step? If your app is running on one or two servers that are a few years old, buying new hardware as the first step makes sense. If your servers have exceeded their useful life (3-5 years is usually what I plan on), replace them. If the workload on your application has increased, it’s easy to justify replacing the hardware. If upgrading your existing hardware (adding RAM or replacing the disks) will improve your app’s performance, this is an easy and relatively inexpensive first step.

Finally, it’s important to keep in mind that code optimization yields diminishing returns. For a small app it may only be worth spending a day or two on optimization. For a large app it may be worth taking a week or a month. In either case, after a while, all the low-hanging fruit has been picked; then the cost of upgrading the hardware needs to be compared against the cost of continuing to optimize the software, and a decision made. Replacing hardware can be the right thing to do; usually, though, it’s not the right thing to do first.



Server Virtualization

08 12 2008

As mentioned in an earlier post, server virtualization was a hot topic at this year’s LinuxWorld. This post will discuss some of the advantages and disadvantages of virtualization, and the various types of virtualization solutions in use.

What is Virtualization?

On a non-virtualized system, only one operating system (OS) can be running at a time. Virtualization allows a system to run one or more “guest” OSs on top of the “host” OS. Virtualization software tricks the guest OSs into thinking they are running directly on hardware, when in fact they are running within a Virtual Machine (VM).

For example, a Windows XP system could have a copy of RedHat in one VM and Windows Server in another VM, assuming the hardware is powerful enough to support having three OSs running at the same time.

Initially, virtualization was used mostly by engineers for development and QA, because virtualization was a big time saver. For instance, since a guest OS’s entire disk image can be a regular file in the host OS, you can clone VMs easily. Thus, a QA engineer could be guaranteed an exactly identical system each time they ran a regression test. Also, testing a server with multiple OSs became much easier — instead of having 10 physical client systems (Win95, Win98, Win2k, WinXP, MacOS X, etc), a QA engineer could have 10 different VM images on one physical system.
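
As a rough sketch of why cloning is so easy: because a guest’s disk is just a file, cloning it is a copy. The file names below are made up, and the second variant assumes a QEMU/KVM-style image:

# A guest's entire disk is just a file on the host, so cloning it is a copy
cp winxp-base.img winxp-test1.img

# Many virtualization tools can also make copy-on-write clones; with
# QEMU/KVM-style images, for example:
qemu-img create -f qcow2 -b winxp-base.img winxp-test1.qcow2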

What is Server Virtualization?

Server Virtualization is when VMs are used to host production services, such as external web sites, email servers, file sharing, etc. Server Virtualization is different from development/QA system virtualization in several major ways:

  1. Performance and reliability are paramount.
  2. Server Management is more important, especially cross-server management. If you have 30 physical host systems, the management software must let you view all of their status info at the same time.
  3. Each physical host must be able to support a significant number of VMs at a time.
  4. The VM software must support high-availability features such as failover, moving VMs from system to system “live”, and load balancing of VMs.

Why Server Virtualization?

Application Isolation

Installing multiple server applications on a single server without virtualization leads to several issues. First, there is the possibility of application conflict. For instance, app A may require a particular patch that app B won’t work with. Second, system downtime has to be approved by all the app owners. If the owner of app A only wants downtime 8pm – midnight, and the owner of app B only wants downtime between 2am – 4am, getting system downtime approval becomes a nightmare. Also, any problems will usually be blamed on the other app. App A is slow? Must be App B’s fault!

Installing each app onto its own system solves this issue, but is wasteful. What are the odds that App A needs even 10% of a modern system’s CPU?

Virtualization solves this issue by giving each app its own OS instance. Each OS instance can have different patches installed, can be brought down independently of the others, and provides isolation from the other applications. However, all the OS instances can share the same hardware, leading to efficient hardware usage. If the underlying Host OS needs to be brought down, the VMs can be migrated “live” to another Host system for the duration of the outage, with no downtime required.

Scaling

Most applications can’t take advantage of more than one or two processor cores, or if they do, performance doesn’t scale very well. By running multiple one-CPU VMs on a multi-core server, with a copy of the application running in each VM, the application can take full advantage of a multi-core system. For instance, running Apache httpd within eight 1-CPU VMs on an 8-core host system will provide better performance than running Apache httpd directly on top of an 8-core server.

Hardware Independence & Fault Tolerance

Any VM can run on top of any hardware, as long as the hardware is running the same virtualization software. Most virtualization solutions allow VMs to be moved from host system to host system “live”, with no interruption to the guest OS, as long as the VM’s disk is on shared storage (such as a NAS or SAN). This has several advantages:

If the underlying hardware or host OS needs maintenance, VMs can be moved off of the system beforehand, eliminating any service interruption.

When an application outgrows its current hardware, it can be migrated to more powerful hardware without any downtime, much less any reinstallation and data migration pains.

If a host system fails unexpectedly, any other host system can run the VM, making failure recovery much quicker. In fact, some virtualization solutions allow two host systems to run the same VM in lockstep, so if one host system fails unexpectedly, the other can take over with no service interruption.

Security

By controlling a VM’s access to disk, network, and memory resources, virtualization software can help keep VMs secure. Any virus or root kit that modifies the guest OS to hide itself would still be fully visible to the virtualization software. Also, the guest OS could request that certain memory regions or disk resources be made irrevocably read-only on boot, preventing malware from writing to those regions.

Server Virtualization Challenges

Complexity

The number one downside of virtualization is complexity. Complexity always makes things harder to manage, harder to understand, and harder to troubleshoot. A well-designed virtualization infrastructure manages the complexity by imposing standards and procedures, and documenting everything. A poorly-designed virtualization infrastructure quickly becomes very fragile and impossible to manage.

If a VM is running slow, is it the application? The guest OS? The virtualization software?  The host OS? The host hardware? Shared disk storage? Did the VM move to a different host server? If your virtualization software can automatically move VMs among host servers, do you even know which host server was running the VM when the issue appeared?

More Things Can Go Wrong

The virtualization software is one more thing that needs to be learned, installed, patched, managed, upgraded, and troubleshot. While it would be nice if the virtualization software never had bugs or glitches, that’s certainly not the case.

More OS Instances to Manage

Each OS instance in a VM is one more OS instance that needs management, such as security patches, anti-virus software, etc. If your current patch strategy is to run Windows Update by hand on each system, virtualization will kill you.
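
As a minimal sketch of what “not by hand” looks like, assuming Linux guests reachable over ssh with key-based logins (the host names are hypothetical):

# Apply updates to every guest from one place instead of logging into each VM
for HOST in web01 web02 app01 db01; do
  ssh root@$HOST "yum -y update"
done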

Performance Overhead

Virtualization software imposes a performance penalty, especially for disk and network I/O. Also, because each VM is running a copy of the OS, each running VM imposes memory overhead. Full virtualization (described below) has the highest overhead.

There are several ways to mitigate these issues. Using container-based virtualization (described below) or paravirtualization (also described below) reduces overhead. Also, manufacturers are beginning to release virtualization-aware network and disk controllers that speed up I/O from within VMs. Finally, Intel and AMD have added virtualization-specific CPU instructions in their newer CPUs that reduce virtualization’s performance overhead even with full virtualization.

Cost

Purchasing commercial virtualization software is not cheap. If you go with a free solution, you may save on licensing costs, but will need to spend more time implementing the various management tools you would have gotten with the commercial software. Also, server virtualization requires better OS, application, and performance management tools, which you need to either purchase or implement.

Security

It’s possible that the virtualization software or your configuration has a bug that allows hostile software in a VM to “escape” into the host system. Now, it has full control of all the VMs on that host system. Virtualization software vendors take security seriously, so this is relatively unlikely, but…

Virtualization Types

Full virtualization. In full virtualization, the guest OS is completely unaware that it’s running within a VM. This is the most flexible type of virtualization, as it can run any OS unmodified, but it also has the greatest performance hit because the VM has to fully emulate hardware.

Paravirtualization. In paravirtualization, the guest OS is aware that it’s running within a VM. Instead of talking directly to hardware or protected memory, it will talk to the virtualization software. This eliminates the need for full hardware emulation in the virtualization software, greatly improving performance. The downside is that the selection of guest OS is limited to those that support paravirtualization with your virtualization software.

Containers. A container is closer to a chroot’d tree on steroids than a full VM. The software running within the container can only see the files, memory, and processes within the container; however, the kernel is shared among all the containers. Therefore,  all the containers are necessarily running the same OS. Since there is really only one OS running on the whole system, containers have the lowest overhead and best scalability, but they are much more limited in their flexibility.
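
To make the chroot analogy concrete, here is the non-container ancestor. The directory path is hypothetical, and real container solutions add process, network, and resource isolation on top of this:

# Processes started inside the chroot only see files under /srv/guest1,
# but they still share the host's kernel, just like a container
chroot /srv/guest1 /bin/sh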

Virtualization Software

VMware. VMware introduced the first real virtualization solution, and has maintained a significant lead over its competitors since. VMware has a great set of tools to manage VMs, including Lab Manager (managing groups of VMs together), VMotion (migrating VMs from host to host), and Infrastructure Client (a great view into all the VMs on a set of host servers).

VMware’s biggest downside is cost. Also, they have had several issues with their licensing tools, ranging from the inability to issue a license key for purchased software to updates that caused VMs to not start due to spurious license errors.

The general consensus I’ve heard is that if you can afford VMware, they are the best option for large-scale server virtualization.

Xen. Xen is an open source virtualization solution that most closely competes with VMware. On paper it looks very similar to VMware. In practice the toolset is much less mature, and the product has a lot of rough edges.

KVM. KVM is virtualization software implemented as a Linux kernel module. Because it is part of the Linux kernel, many Linux distributions have announced that KVM will be their preferred virtualization solution going forward. Today, it is still a work in progress, and not yet ready for datacenter deployment. KVM supports full virtualization, and relies on the hardware virtualization extensions (Intel VT or AMD-V) mentioned earlier.
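
Because KVM is a kernel module, loading it looks like loading any other module. This sketch assumes an Intel CPU with the VT extensions; AMD systems load kvm_amd instead:

# Load the KVM modules; /dev/kvm appears once they are in place
modprobe kvm
modprobe kvm_intel
ls -l /dev/kvm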

OpenVZ. OpenVZ is a container-only solution. If your virtualization needs can be satisfied by containers, OpenVZ is worth considering. For most virtualization needs, though, OpenVZ is not enough.

Hyper-V. Hyper-V is Microsoft’s server virtualization solution. I don’t know much about it, and it was (unsurprisingly) not talked about much at LinuxWorld.