Published by Chris Rutledge on 29 Feb 2008 at 12:23 pm
Blade Technology and Who Cares?
We have all seen the commercial for IBM where the technician comes into the data center to find that all of his servers have been stolen. Then a co-worker points out to him that the data center, which was once overflowing with hardware, has been condensed down into one rack with an IBM blade server in it. While this is a bit of an exaggeration by IBM, it does convey their point quite eloquently.
There are a few hidden elements to this technology that IBM could not touch on in a 30 second promo window, so they touched on the main issue. This would be a “Cost of Ownership” issue. What they want you to believe is that you can do more with less hardware.
Containing the Monster
Over time, data centers have become an ever sprawling monster that gobbles up a university’s precious space resource. We see that here at IWU as well. What seemed to be an adequately “spec’d for space” data center has been eaten up by today’s demand for more and more services. It seems that every new service we provide requires at least one of its own dedicated servers (as demanded by the vendor) AND in some cases, the service is spread out across as many as 4 to 5 servers.
In the past, the answer was, CONSOLIDATE!!!! Put as many services on one piece of hardware as you can. This has been found to be a bad idea. Imagine, if you will, one server that has campus authentication (LDAP), DNS and DHCP running on it. Can the server handle this? Well, maybe, if it is adequately resourced. However, what happens when that server goes down? All those core services that keep the campus network connectivity up and running go down with it. It doesn’t matter why these services went down; the fact remains that none of our users can access the internet.
The latest emergent technology that promises to help with this issue is Blade technology. Ooooh, Blade Servers, doesn’t that sound cool???? Well, don’t laugh, it is cool! Network vendors have been using this technology for years and now we see it popping up all over in the server area as well. What Blades actually do is allow your data center services to be consolidated onto one piece of hardware that acts and operates virtually like many different servers. So in essence, we are doing all our processing on less hardware.
Think about Blades this way: let’s say you have a Lamborghini Diablo and it has a whopping 530Hp, but you never take it up past 65MPH (because that is the law). How much of that car’s resources go unused? Probably enough power is left unused to push 4 Yugos at 65MPH. Where does all that unused resource go? It remains unused and is lost. The same goes for servers. You have this incredibly spec’d out server, but it is only being pushed at half its capacity. We have paid for resources that are not being used. Wouldn’t it be nice if, when a server’s resources were not being used, it could give those resources back so other services that need them could take advantage of them? That is exactly what Blades do! It is a big community of resources that gets shared between all the virtual servers that are running on it.
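To put rough numbers on the shared-pool idea, here is a minimal Python sketch. The service names and load figures are made up for illustration and are not measurements from the IWU data center; the 25% headroom is likewise just an assumption.

```python
import math

# Hypothetical peak demand per service, as a fraction of one
# dedicated server's capacity (illustrative numbers only).
peak_demand = {"ldap": 0.30, "dns": 0.10, "dhcp": 0.10, "mail": 0.50}


def dedicated_utilization(demand):
    """Average utilization when every service gets its own server."""
    return sum(demand.values()) / len(demand)


def pooled_servers_needed(demand, headroom=0.25):
    """Servers' worth of capacity needed when all services share one pool.

    headroom is the share of the pool kept free for spikes (an assumption).
    Summing the peaks is conservative: it assumes every service peaks
    at the same moment.
    """
    total = sum(demand.values())
    return math.ceil(total / (1.0 - headroom))


print(f"dedicated: {len(peak_demand)} servers, "
      f"average utilization {dedicated_utilization(peak_demand):.0%}")
print(f"pooled: {pooled_servers_needed(peak_demand)} servers' worth of capacity")
```

With these made-up figures, four dedicated servers each sit mostly idle, while a shared pool could cover the same peaks with about half the hardware and still keep some spare capacity.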
Taming The Monster
Not only do today’s hardware practices turn our data centers into monsters that consume our space resources, but someone has to power all these pieces of hardware too. We have experienced this problem first hand in the past few months here at IWU. We are continuously trying to find ways to power all the servers without tripping the breakers and bringing down the existing servers. At one time, we plugged in a monitor that tripped a breaker and brought down Mail. But, due to our system administrators’ skill and cat-like reflexes, it was brought back up quickly and without much notice.
Currently in our data center we have about 7 110V 20Amp circuits to power all our servers and network gear. 6 of these 7 circuits are at capacity. We have 1 circuit left to add new servers on. Granted, we are trying to remove some older servers too, which will give us a few amps back, but at best, this is a stopgap measure. We seriously need to think about how we can use the power we have, and use it in a more environmentally friendly way. This is also in keeping with IWU’s strategic plan. Well, here is a way we can make this happen.
The beast devours power. We need to tame its appetite a bit. We can do this by implementing Blade technologies in the data center. Blades, as I said above, make better use of resources by sharing all the resources between all the services. By doing this, power is not wasted on unused server resources as it has been in the past. “Unused resources” becomes a phrase that is no longer used because all our resources will be utilized efficiently. Not only does the university keep with its strategic plan by being “green”, it gets to save a bit of “green” as well.
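The circuit figures above translate into a simple power budget. The sketch below uses the voltage, amperage and circuit counts from this post; the 80% usable share (a common rule of thumb for continuous loads) and the 400 W draw per server are assumptions for illustration only.

```python
VOLTS = 110             # from the post
AMPS_PER_CIRCUIT = 20   # from the post
TOTAL_CIRCUITS = 7      # from the post
FULL_CIRCUITS = 6       # from the post


def circuit_watts(usable_percent=100):
    """Watts available on one circuit, optionally derated."""
    return VOLTS * AMPS_PER_CIRCUIT * usable_percent // 100


def servers_that_fit(free_circuits, watts_per_server, usable_percent=80):
    """How many servers of a given draw fit on the free circuits.

    usable_percent=80 is an assumed safety margin, not an IWU figure.
    """
    usable = free_circuits * circuit_watts(usable_percent)
    return usable // watts_per_server


free = TOTAL_CIRCUITS - FULL_CIRCUITS
print("nameplate capacity, all circuits:", TOTAL_CIRCUITS * circuit_watts(), "W")
print("nameplate capacity still free:", free * circuit_watts(), "W")
print("hypothetical 400 W servers that fit:", servers_that_fit(free, 400))
```

Under those assumptions, the one remaining circuit leaves room for only a handful of additional stand-alone servers.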
A Necessary Evil?
The beast does serve a purpose. Gone are the days of pens, paper and abaci. Computers are here to stay. They will require power. They will take up space. They are necessary. We just need a way to control it, and in our attempts to control it, we need to make sure it stays reliable.
Currently, IWU has 2 LDAP servers. These are 2 independent servers that act as one. If one goes down, the other takes the load. It is a reliability concern. We need campus authentication to work, and it must work always. So, let’s analyze these 2 servers as our example. Both are powered up at all times. Both are spec’d out identically. They each take up rack space. Both sit idle for over 70% of the time. But, both are necessary!!! Redundancy cannot be forsaken in a solution to make us more environmentally friendly or in our strides to do more with less. All aspects must coincide in our solution if our solution is to be fair and equitable.
Again, Blades address this issue. A Blade server using virtual servers allows us to create as many servers as we wish. If we run out of resources, we simply add another blade to the chassis to increase our resources. So, given this, we can create redundant servers to our heart’s content. And believe it or not, the Sys Admins here at IWU have a fondness for creating redundant servers. They are weird that way.
Check Please!
So, who is picking up the tab on this? Hahaha. Oh yes, the awkward moment at the end of the meal. Well, there is good news here too, believe it or not. Initially, for IWU to get rid of about 10 old servers, we would have to come up with about 15K. Hmmmm, what would 15 new individual servers cost? What would be the power consumption of 15 new individual servers? What would the data center footprint of 15 new servers be? This is how we should be thinking.
Think about the benefits here: less hardware, less power, less cooling, better redundancy, better resource utilization etc…, at about the same cost, if not cheaper. Like any new technology, there are pains with adapting to a new way of thinking, but the pains are just birthing pains to a better way of computing; a more environmentally responsible way of computing. A way that is already being adopted by our peers around the world.
Bottom Line
Illinois Wesleyan University is ready to embrace such a change. We see it in our strategic planning and in our people’s attitudes to become better and more responsible with technology. All we have left to do now is act upon what we know. We should be a leader and an example here to other universities and show them that IWU acts on what it believes in.
5 Responses to “Blade Technology and Who Cares?”
Pat Riehecky on 29 Feb 2008 at 1:17 pm #
For those that haven’t seen the IBM commercial
http://www.youtube.com/watch?v=DO9ZWDaLLxA
Rick on 03 Mar 2008 at 4:26 pm #
Are there industry standards for Blade hardware? It would be nice if it could continue to be upgraded for a number of years, but if each company has its own proprietary design they might force you to upgrade to the “new” Blade architecture every 4 years.
Also, how does the blade architecture compensate for redundancy? Doesn’t consolidating into virtual servers mean that there are fewer hardware points of failure, but the ones that are left are Really Big?
Chris Rutledge on 07 Mar 2008 at 6:51 am #
While I am sure that Blades use industry standard hardware, i.e. processors, disk drives etc…, there may be some vendor specific memory issues depending on what vendor you purchase your Blade from.
Currently, the Blade technology is the best solution out there that I have seen that will allow us to use hardware for an extended period of time. The Blade chassis is populated with “Blades”. These blades are like mini, powerful computers and, once placed in the chassis, become part of a resource pool. You need more resources for your virtual servers? Buy another Blade. Don’t get me wrong here, there is still hardware to purchase on an ongoing basis. Blades are not a “one time” purchase that takes care of all your computing needs forever. What they do well is share resources among all your virtual servers so that your resources are used more efficiently. You will have to purchase hardware less often.
I guess I did not address redundancy very well. So…. I will now.
We would not think about implementing this technology if there were not a bit of redundancy in the hardware. Here is what I mean:
POWER:
The Blade server that we have looked at has a variety of options as to how many power supplies it can have. We have spec’d one out that has fully redundant power. If a power supply should go out, there is another to take its place. This is about as good as it gets.
As far as building power, IWU has recently purchased a central UPS unit for the data center. If the building power should go out, this UPS will keep the data center powered until the CNS emergency generator takes over. There is just one caveat here though, that is, the blade server hardware runs on 220V 3 phase power and I would have to check to see what it would take to put that on the UPS. Currently, that is an unknown. Edit: this is already all set and on the UPS.
HARDWARE:
By its nature the Blade is hardware redundant. If a disk drive fails, there are others there to take up the slack. “What about the data on the failed disk?”, you ask. Almost all our servers in the data center are running what is called RAID. This allows the data from any one disk that fails to be rebuilt from the other disks. Once a new disk is put in, the other drives rebuild the data onto it. It is pretty cool, but that is as far as I will go into explaining that.
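For the curious, the rebuild trick can be shown in a few lines of Python. This is only a sketch: the post does not say which RAID level is in use, so it assumes a single-parity scheme (RAID 5 style) and simplifies it to one dedicated parity block. The disk contents are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding one 4-byte block.
data_disks = [b"LDAP", b"DNS!", b"DHCP"]

# The parity block is the XOR of all the data blocks.
parity = xor_blocks(data_disks)

# Simulate losing the second disk, then rebuild its block from
# the surviving data blocks plus the parity block.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)  # the contents of the lost disk
```

XOR-ing the survivors with the parity cancels out everything except the missing block, which is why any single failed disk can be reconstructed.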
Since resources are all pooled together, having memory or a CPU go bad is really kind of trivial. If this should happen, the other pooled resources take the load until we replace the failed card.
I think it is very important to add here that if you are going to serve any user disk space, or any application disk space that has large growth potential, then an attached SAN (an appliance that holds many disk drives) would be needed.
NETWORK CONNECTIVITY:
It is very seldom that a network card fails in a server, but it can happen. The Blade has redundant network connections too. The unit we have looked at has options for Gig and 10Gig network connections. The good thing about this is, when we go to the next generation of network speeds (10G), this unit is ready.
SERVER/APPLICATIONS:
So far we have just addressed hardware redundancy, but the beauty of this technology goes beyond the hardware. Blades will allow us to set up redundant services too. Here is an example:
We have a new service that is needed here at IWU. We try not to implement services that are going to be used heavily by the university without preparing a backup server in case that service should go down. So, this requires 2 separate pieces of hardware to run the service and the backup to that service on. With Blades, it is the same box!
Not only is that a huge cost savings, we can also build test environments on the Blade. Let’s say that there is a huge upgrade to the Banner service. In the past (and even now) we have had to handle this in one of two ways:
1.) Put the upgrade on the production server
2.) Build a brand new server for the upgrade (very expensive)
With Blades, we can copy the Banner virtual server over to another virtual server, let’s call it Banner Test Server. We could then install upgrades, patches and config changes to the Banner Test Server before we ever think about touching the production Banner server.
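The copy-then-upgrade workflow can be pictured with a small Python sketch. It models a virtual server as a plain dictionary, which is purely an illustration: real virtualization platforms have their own cloning tools, and the field names and version numbers below are hypothetical.

```python
import copy

# A made-up description of the production virtual server.
production = {
    "name": "banner",
    "role": "production",
    "packages": {"banner": "7.0"},  # hypothetical version number
}


def clone_for_testing(vm, new_name):
    """Return an independent copy of a virtual server, marked as a test box."""
    test_vm = copy.deepcopy(vm)  # deep copy so nested settings are not shared
    test_vm["name"] = new_name
    test_vm["role"] = "test"
    return test_vm


banner_test = clone_for_testing(production, "banner-test")

# Apply the upgrade to the copy only; production is untouched.
banner_test["packages"]["banner"] = "8.0"  # hypothetical version number

print(production["packages"], banner_test["packages"])
```

The point of the deep copy is isolation: whatever happens to the test server, the production server keeps running the version it had before.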
I think I am getting farther into this than I need to, so, I will reel myself in.
COSTS:
Blades do bring a lot to the table. On the flip side, there is an “upfront” cost that has to be dealt with. To start ourselves off with a Blade server, without a SAN, could cost between 20 - 30 thousand. But, that does get you 10 cup holders, 8 speaker premium surround sound and a sun roof.
If we needed to attach a SAN, you could double that. However, if we keep our eyes on the bigger picture, what we save is immense. We don’t have to keep purchasing individual servers, individual service agreements and backup servers. We will not be using desktop grade computers for university core applications. We keep our power and cooling costs down, well, as much as we can and still have a data center. We gain the ability to replace one Blade card at a time, making upgrading our hardware a bit easier on the pocketbook.
I want to make sure that all know that the amount of administration for these servers is not decreased via the use of Blades. The amount of maintenance is still the same, just the hardware requirements are minimized.
I hope this answers some of your concerns.
Pat Riehecky on 12 Mar 2008 at 1:27 pm #
Chris’s answer is complete, but here is a more focused response to the high points of Rick’s questions.
Is the technology proprietary? Yes, each blade system takes its own type of blades which are only made by the manufacturer. There really aren’t any 3rd party parts for a blade system. That being said, the average lifespan for a blade series is 12 years.
Do blades have a higher redundancy but bigger cost of failure? With blade technology everything is duplicated a few times. It would take a large number of non-trivial failures to bring a blade system down. These failures could happen, but to compare it to our current architecture, it is in every way possible for the motherboard in Sun to fail on the same day that the motherboard in my fails while setting fire to Triton. This could honestly happen. Then what? We need expensive parts for three different systems, each of which could be on back order and would require some time to replace. How do you decide which core system is most important? With a blade, the type of catastrophic failure that would bring it down is of the same likelihood, but, with things linked up this way, if we do suffer a serious enough failure, the parts all go in the same place. There is no triage conflict. And because all the parts are in the same place we could afford a much more extensive hardware contract - 4 hour support instead of next day. And with that we would probably still be saving money.
Chris Rutledge on 30 May 2008 at 11:17 am #
Progress Update:
We have made some movement on getting the Blade Server into the IWU data center. It has been approved this year and we are in the process of determining the specs of the gear before we put it on order.
The next step would be to get the gear in and get it set up for virtual servers. Once this has happened, we will need to build the servers that will be running on this machine. This may sound a bit simplistic, but it does require time and a bit of forethought. One of the issues at hand is what servers we are going to put on this Blade and what OSes we will run them on.
Currently, there are a lot of ideas as to what servers/services should go onto this new machine AND, as you know, things never go according to what you had envisioned. But, the likely candidates as of now would be LDAP, SMTP, Titan, Ariel, ElecRes and a few of the extremely old servers in the data center.
It is our hope to get this in before the Fall semester and start turning over services to this machine. It will be in our (IWU’s) interest to move slowly with this and make sure that we do not oversell this hardware and its capabilities. We will need to monitor this server after every service install and through peak usage times to get an idea of just how many services we can put on this hardware.
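As a rough idea of how that monitoring could feed a decision, here is a small Python sketch. Everything in it is an assumption for illustration: the capacity units, the sample peak readings, the average cost per service and the 20% reserve are all hypothetical.

```python
def remaining_services(capacity, peak_samples, avg_service_cost, reserve_percent=20):
    """Estimate how many more services fit, based on the worst observed peak.

    capacity         -- total capacity of the pool, in arbitrary units
    peak_samples     -- load readings taken during peak usage times
    avg_service_cost -- assumed load added by one more typical service
    reserve_percent  -- share of capacity never to be committed (assumption)
    """
    usable = capacity * (100 - reserve_percent) / 100
    free = usable - max(peak_samples)
    if free <= 0:
        return 0
    return int(free // avg_service_cost)


# Hypothetical readings taken after each service install.
print(remaining_services(100, [35, 48, 41], avg_service_cost=10))
```

Sizing against the worst peak rather than the average is deliberately cautious, which fits the plan to move slowly and not oversell the hardware.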