In his post Hardware is Cheap, Programmers are Expensive, Jeff Atwood asserts that companies should “always try to spend your way out of a performance problem first by throwing faster hardware at it.” While there are certainly cases where this is true, most of the time throwing hardware at a performance problem as the first step is the wrong thing to do.
Though it’s not clear in his post, I’m going to assume he’s talking about server software. When you’re writing software that runs on a client (desktop, laptop, smartphone, or other embedded device), upgrading the hardware is not an option. Trying to sell shrink-wrapped software that will only run on 5% of your target market’s current hardware is generally not a good idea (except perhaps if you’re selling a PC game).
On the server side, Jeff overlooks several key points. Most importantly, a poorly written app often cannot take advantage of faster hardware. A single-threaded, extremely inefficient program will run only marginally faster on a multi-core box with 8GB of RAM and a fast CPU. An app that uses a very large database with no indexes also may not benefit much from faster hardware. Throwing hardware at a bad architecture will not make it better.
In addition, new hardware imposes costs beyond the initial purchase price. You need to pay sysadmins to set up and maintain the hardware. You need to pay for power and cooling for the new hardware, which is not cheap if you’re at a good datacenter. Because your datacenter and network get more complex as you add lots of servers, management and overhead costs do not increase linearly either. Going from one server to ten is usually less than a 10x increase in operational costs, while going from ten servers to a hundred is usually more than a 10x increase. Upgrades are also disruptive, and will impact your users and customers.
If you have a more complex application, it may not always be clear what needs to be upgraded. If your site is running slowly, do you upgrade your load balancers, firewalls, webservers, appservers, database servers, or the SAN?
Jeff also overlooks the most important metric when looking at server apps: cost per user. Many dot-coms went under because their software required too much hardware per unit of revenue. For example, if you need one $2000 server for every 200 users, and an average user generates $0.10 of revenue per month, it will take 100 months (over 8 years!) to pay back the cost of the hardware. (Remember, this is before overhead and other fixed costs.) At that rate, getting bigger doesn’t help; it just makes you lose money faster! If you could support 2000 users on the same server, you would pay back the cost of the hardware in less than a year. Now you would have a shot at getting to profitability.
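The payback arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `payback_months` and its parameters are made up for this example, and like the figures in the text it ignores overhead and other fixed costs.

```python
def payback_months(server_cost, users_per_server, revenue_per_user_per_month):
    """Months of revenue needed to cover the purchase price of one server.

    Hypothetical helper for illustration; it ignores power, cooling,
    sysadmin time, and every other ongoing cost.
    """
    monthly_revenue = users_per_server * revenue_per_user_per_month
    return server_cost / monthly_revenue


# The two scenarios from the text: same $2000 server, same $0.10 per user
# per month, but 200 users per server versus 2000 users per server.
slow_app = payback_months(2000, 200, 0.10)
fast_app = payback_months(2000, 2000, 0.10)

print(f"200 users per server:  {slow_app:.0f} months")
print(f"2000 users per server: {fast_app:.0f} months")
```

With the numbers from the text this works out to roughly 100 months versus roughly 10 months, which is the whole argument: the number of users each server can carry, not the price of the server, decides whether the hardware ever pays for itself.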
Put another way, Jeff is weighing the cost of a software engineer versus one server, but this is not a fair comparison. Most server-side applications run on many systems, with some form of load balancer to spread out the workload. The alternative to improving the code usually isn’t buying one new server; it’s buying ten, twenty, or fifty new servers (assuming the application will scale to that many servers). If you have 20 servers, a 2x improvement in speed saves at least $40,000 right off the bat (not counting the overhead savings discussed earlier); now the ROI is starting to favor improving the code before buying new hardware. There may also be multiple deployments (one for each customer, or one per department, plus one for QA, one for engineering, etc.), which also increases the number of servers that would need to be upgraded.
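Here is one way to reproduce the $40,000 figure, as a rough sketch rather than a definitive model. It assumes the $2000 server price from the earlier example, and it assumes that the alternative to a 2x speedup is doubling the fleet to get the same capacity. The function names and the $15,000 engineering cost are hypothetical placeholders, not numbers from Jeff's post or mine.

```python
def hardware_cost_avoided(current_servers, speedup, price_per_server):
    """Purchase price of the extra servers a code speedup makes unnecessary.

    Assumes capacity scales linearly with server count, so matching an
    N-times speedup by scaling out means growing the fleet to N times
    its current size.
    """
    extra_servers = current_servers * (speedup - 1)
    return extra_servers * price_per_server


def optimization_pays_off(current_servers, speedup, price_per_server,
                          engineering_cost):
    """True when the hardware avoided costs more than the engineering time."""
    avoided = hardware_cost_avoided(current_servers, speedup, price_per_server)
    return avoided > engineering_cost


# 20 servers, a 2x speedup, $2000 per server (assumed price).
avoided = hardware_cost_avoided(20, 2, 2000)

# Hypothetical engineering cost, purely for illustration.
worth_it = optimization_pays_off(20, 2, 2000, engineering_cost=15000)

print(f"Hardware cost avoided: ${avoided}")
print(f"Optimization pays off: {worth_it}")
```

The comparison only looks at purchase price. Adding the power, cooling, and management overhead discussed earlier would tilt it further toward fixing the code as the fleet grows.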
So when is it appropriate to buy new hardware as the first step? If your app is running on one or two servers that are a few years old, buying new hardware as the first step makes sense. If your servers have exceeded their useful life (3-5 years is usually what I plan on), replace them. If the workload on your application has increased, it’s easy to justify replacing the hardware. If upgrading your existing hardware (adding RAM or replacing the disks) will improve your app’s performance, this is an easy and relatively inexpensive first step.
Finally, it’s important to keep in mind that code optimization yields diminishing returns. For a small app it may only be worth spending a day or two on optimization. For a large app it may be worth taking a week or a month. In either case, after a while, all the low-hanging fruit has been picked; at that point you need to compare the cost of upgrading the hardware with the cost of continuing to optimize the software, and make a decision. Replacing hardware can be the right thing to do; usually, though, it’s not the right thing to do first.