Archive for February, 2008

Published by Chris Rutledge on 29 Feb 2008

Blade Technology and Who Cares?

We have all seen the commercial for IBM where the technician comes into the data center to find that all of his servers have been stolen. Then it is pointed out to him by a co-worker that the data center, that was once overflowing with hardware, has been condensed down into one rack with an IBM blade server in it. While this is a bit of an exaggeration by IBM, it does convey their point quite eloquently.

There are a few hidden elements to this technology that IBM could not touch on in a 30 second promo window, so they touched on the main issue. This would be a “Cost of Ownership” issue. What they want you to believe is that you can do more with less hardware.

Containing the Monster

Over time, data centers have become an ever-sprawling monster that gobbles up a university’s precious space resource. We see that here at IWU as well. What seemed to be an adequately “spec’d for space” data center has been eaten up by today’s demand for more and more services. It seems that every new service we provide requires at least one dedicated server of its own (as demanded by the vendor) AND, in some cases, the service is spread out across as many as 4 or 5 servers.

In the past, the answer was, CONSOLIDATE!!!! Put as many services on one piece of hardware as you can. This has been found to be a bad idea. Imagine, if you will, one server that has campus authentication (LDAP), DNS and DHCP running on it. Can the server handle this? Well, maybe, if it is adequately resourced. However, what happens when that server goes down? All those core services that keep campus network connectivity up and running go down with it. It doesn’t matter why these services went down; the fact remains that none of our users can access the internet.

The latest emergent technology that promises to help with this issue is Blade technology. Ooooh, Blade Servers, doesn’t that sound cool???? Well, don’t laugh, it is cool! Network vendors have been using this technology for years and now we see it popping up all over in the server area as well. What Blades running virtual servers actually do is allow your data center services to be consolidated onto one piece of hardware that acts and operates virtually like many different servers. So in essence, we are doing all our processing on less hardware.

Think about Blades this way: let’s say you have a Lamborghini Diablo and it has a whopping 530 hp, but you never take it up past 65 MPH (because that is the law). How much of that car’s resources go unused? Probably enough power is left unused to push 4 Yugos at 65 MPH. Where does all that unused resource go? It remains unused and is lost. The same goes for servers. You have this incredibly spec’d out server, but it is only being pushed at half its capacity. We have paid for resources that are not being used. Wouldn’t it be nice if, when a server’s resources were not being used, it could give that resource back so other services that may need it could take advantage of it? That is exactly what Blades do! It is a big community of resources that gets shared between all the virtual servers that are running on it.

Taming The Monster

Not only do today’s hardware practices turn our data centers into monsters that consume our space resources, but someone has to power all these pieces of hardware too. We have experienced this problem firsthand in the past few months here at IWU. We are continuously trying to find ways to power all the servers without tripping the breakers and bringing down the existing servers. At one point, we plugged in a monitor that tripped a breaker and brought down Mail. But, due to our system administrators’ skill and cat-like reflexes, it was brought back up quickly and without much notice. :)

Currently in our data center we have about seven 110V, 20 Amp circuits to power all our servers and network gear. 6 of these 7 circuits are at capacity. We have 1 circuit left to add new servers on. Granted, we are trying to remove some older servers too, which will give us a few amps back, but at best this is a stopgap measure. We seriously need to think about how we can use the power we have, and use it in a more environmentally friendly way. This is also in keeping with IWU’s strategic plan. Well, here is a way we can make this happen.
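To put rough numbers on that budget, here is a back-of-the-envelope sketch in Python. The 7 circuits, 110V and 20 Amps come from the paragraph above; the 80% continuous-load derating is a common electrical rule of thumb and an assumption here, not something measured in our data center. The function names are made up for illustration.

```python
# Rough power budget for the circuits described above.
# ASSUMPTION: the 80% derating is a rule of thumb for continuous loads,
# not a figure taken from the IWU data center.

VOLTS = 110
AMPS_PER_CIRCUIT = 20
CIRCUITS = 7
DERATING_PERCENT = 80  # assumed safe share of a breaker's rating


def circuit_watts(volts: int, amps: int, derating_percent: int = 100) -> int:
    """Watts one circuit can deliver at the given derating."""
    return volts * amps * derating_percent // 100


def total_watts(circuits: int, volts: int, amps: int, derating_percent: int = 100) -> int:
    """Watts available across all circuits."""
    return circuits * circuit_watts(volts, amps, derating_percent)


if __name__ == "__main__":
    print("Nameplate total (W):", total_watts(CIRCUITS, VOLTS, AMPS_PER_CIRCUIT))
    print("Derated total (W):", total_watts(CIRCUITS, VOLTS, AMPS_PER_CIRCUIT, DERATING_PERCENT))
    print("One spare circuit, derated (W):", circuit_watts(VOLTS, AMPS_PER_CIRCUIT, DERATING_PERCENT))
```

With only one circuit free, the last line is roughly the whole headroom left for new servers.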

The beast devours power. We need to tame its appetite a bit. We can do this by implementing Blade technologies in the data center. Blades, as I said above, make better use of resources by sharing all the resources between all the services. By doing this, power is not wasted on unused server resources as it has been in the past. “Unused resources” becomes a phrase that is no longer used because all our resources will be utilized efficiently. Not only does the university keep with its strategic plan by being “green”, it gets to save a bit of “green” as well.

A Necessary Evil?

The beast does serve a purpose. Gone are the days of pens, paper and abaci. Computers are here to stay. They will require power. They will take up space. They are necessary. We just need a way to control it, and in our attempts to control it, we need to make sure it stays reliable.

Currently, IWU has 2 LDAP servers. These are 2 independent servers that act as one. If one goes down, the other takes the load. It is a reliability concern. We need campus authentication to work, and it must work always. So, let’s analyze these 2 servers as our example. Both are powered up at all times. Both are spec’d out identically. They each take up rack space. Both sit idle for over 70% of the time. But, both are necessary!!! Redundancy cannot be forsaken in a solution to make us more environmentally friendly or in our strides to do more with less. All aspects must coincide in our solution if our solution is to be fair and equitable.

Again, Blades address this issue. A Blade server using virtual servers allows us to create as many servers as we wish. If we run out of resources, we simply add another blade to the chassis to increase our resources. So, given this, we can create redundant servers to our hearts’ content. And believe it or not, the Sys Admins here at IWU have a fondness for creating redundant servers. They are weird that way.

Check Please!

So, who is picking up the tab on this? Hahaha. Oh yes, the awkward moment at the end of the meal. Well, there is good news here too, believe it or not. Initially, for IWU to get rid of about 10 old servers, we would have to come up with about 15K. Hmmmm, what would 15 new individual servers cost? What would be the power consumption of 15 new individual servers? What would the data center footprint of 15 new servers be? This is how we should be thinking.

Think about the benefits here: less hardware, less power, less cooling, better redundancy, better resource utilization etc…, at about the same cost, if not cheaper. Like any new technology, there are pains with adapting to a new way of thinking, but the pains are just birthing pains to a better way of computing; a more environmentally responsible way of computing. A way that is already being adopted by our peers around the world.

Bottom Line

Illinois Wesleyan University is ready to embrace such a change. We see it in our strategic planning and in our people’s attitudes to become better and more responsible with technology. All we have left to do now is act upon what we know. We should be a leader and an example here to other universities and show them that IWU acts on what it believes in.

Published by Pat Riehecky on 28 Feb 2008

Single Sign On and Singular Sign On in IWU’s Future

If you have been around IWU for a while you will have noticed how more and more systems are getting connected to the NetID system.  This is great for everyone.  From my end, there is only the one place where user accounts can be messed up (excluding the few systems which are in no way hooked up to LDAP, but that is hopefully coming…. hopefully).  From your end, there is just the one username and password to remember.  But is the project complete with just LDAP?

LDAP (Lightweight Directory Access Protocol) support is available on almost every piece of Enterprise grade software.  This is all well and good, but all that does is provide Singular Sign On.  One username and password that you must enter into everything.  This is close to a Single Sign On but also very far from the goal.  What then is Single Sign On?

Single Sign On means that if you visit Ames and log in to the computer there, you should automatically have access to everything that you should have access to.  You should just be able to launch a web browser, point it at my.iwu.edu and automatically get access to your email - without typing a password.  For a real Single Sign On, you should only have to log in once, not merely have one username and password.
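The difference between the two can be counted in password prompts.  What follows is a toy model in Python, not Kerberos or any real protocol; the function names and the list of services are made up for illustration.

```python
# Toy model only: counts how often a user types a password under each scheme.
# ASSUMPTION: the service list is hypothetical and no real authentication
# protocol is involved.

from typing import List


def singular_sign_on_prompts(services: List[str]) -> int:
    """One username/password, but typed at the workstation and at every service."""
    workstation_login = 1
    return workstation_login + len(services)


def single_sign_on_prompts(services: List[str]) -> int:
    """Typed once at the workstation; services accept proof of that login."""
    workstation_login = 1
    prompts_from_services = 0  # each service trusts the existing login
    return workstation_login + prompts_from_services


if __name__ == "__main__":
    services = ["my.iwu.edu", "email", "file share"]
    print("Singular Sign On prompts:", singular_sign_on_prompts(services))
    print("Single Sign On prompts:", single_sign_on_prompts(services))
```

The gap grows with every service added to the list, which is the whole argument for going past LDAP alone.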

What am I doing about it?  I have just built a test system that will be able to connect a few services that were not able to be connected to LDAP.  This helps push forward the Singular Sign On.  It also supports GSSAPI, the generic security interface through which applications most commonly use Kerberos v5.  Kerberos v5 is a secure way of doing Single Sign On.  In another post on this Blog I mention the Microsoft way of performing Single Sign On over SMB.  It is not secure.  Kerberos is very well secured.  In the end Kerberos is the only secure Single Sign On system we understand enough to implement.  As an added bonus this system will also speak SASL.  SASL is a framework for plugging authentication methods into network protocols, so that a password need not cross the wire in the clear.  One of the supported SASL methods is GSSAPI.  The primary motivator for this project was SMB support.  SMB is the official name for Microsoft’s file sharing protocol.  Support for that was built in, and designed in such a way that it too can do GSSAPI.  Lastly, it would be a shame to lose the LDAP support that we have all grown to love.  So LDAP is included, but this LDAP server supports GSSAPI as well.  All together this gives, under a single username and password, LDAP, SMB, Kerberos, and SASL based logins, each of which is GSSAPI enabled.

What does this really mean?  When this comes on line we should be able to start fostering a GSSAPI friendly environment.  This in turn can give us a secure Single Sign On and a Singular Sign On too.  There are still some loose ends, but the testing coming in a few weeks should tie most of those up.  When it is successful, there simply is not a major protocol we are likely to bring online here that cannot be supported from that single username and password.

Yeah, but what does this really mean?  It means that, if enough applications can be configured on campus to support GSSAPI, you will start to see a day where all the university servers just magically know who you are and grant you the access you deserve after a single login.  No more logging in to login so that you can login to a service.

Technologies used: OpenLDAP - 2.4.7,  Samba 3.0.24, Heimdal Kerberos 1.0, smbk5pwd 0.0.0

Is there a next step beyond this combination?

Yes, we can add to the mix a bit of software called Shibboleth.  I am still trying to get it figured out, but once that happens it should be transparently added one day, leaving no one the wiser.  Why Shibboleth?  It is an open standard, secure, and free.  Periodically I hear mention of using Google to host our email.  The price is vastly lower than what it costs us to do it ourselves, but if we go that way, we would need a secure way of transmitting who at IWU can have access to Google resources.  This is where Shibboleth comes in.  Google likes it because it is an open standard, is very secure, and makes it easy to integrate large numbers of remote sites together.  They support other access methods, but none really compare to what Shibboleth really is.  Another factor that is driving Shibboleth into the mainstream is Internet2.  Internet2 has standardized on Shibboleth as its authentication system of choice.  If IWU is going to get Internet2 (which is currently a no, but that cannot last forever), we will have to decide if we are going to allow Internet2 based logins.  It seems silly to say no, but if we say yes then Shibboleth is a must.

I haven’t explored all the capabilities of Shibboleth yet, but I honestly expect it to spawn the next big thing in authentication.  Shibboleth itself seems capable of so many things, but only covers so many possible access methods.  The successor to Shibboleth will drive Single Sign On and Singular Sign On into yet another uncharted realm.  That, however, is about 10 years out.  Why wait for the next big thing when such a great one is sitting here, waiting only for the time it will take to understand it?

Published by Pat Riehecky on 21 Feb 2008

RAID performance and I/O grouping

Realistically all important data should have backups.  This is all well and good, but backups get out of date.  This is where RAID comes in.  RAID combines two or more disks so that the data is actively mirrored or otherwise protected and, in the event of a disk failure, no data is actually lost.  You can read more information on RAID at Wikipedia.

There are really two types of RAID sets to consider in server environments: RAID 1+0 and RAID 5.  RAID 0+1 is never to be considered.  If a single disk is lost the entire RAID set is in peril - regardless of how many disks are available.  RAID 1+0 uses the same number of disks for essentially identical performance, but without this problem.

RAID 5 is very fast for block level reads and relatively quick for block level writes.  It suffers a mathematical bottleneck with the parity.  Every stripe of data blocks has a parity block stored alongside it.  This parity information allows a block to be rebuilt in the event that the original data goes missing or bad.  Every write must update the parity, and reads fall back on it when a disk has failed.  For non-real time operations RAID 5 offers the best performance, disk space, and protection combination.  With a three-disk RAID 5 set you get around 2/3 of the disk space inserted into the RAID set for data storage; the rest is consumed by the parity (in general, one disk’s worth of space, however many disks are in the set).  RAID 5, however, is especially bad for databases.  Databases are not only real time, but also generally ignore block boundaries in writing data.  This can lead to one database commit having 10 or 12 disk writes, each of which must find the parity, calculate it, update it, and verify the parity write was successful, in addition to merely writing the data to disk.  While this may sound far-fetched, I have personally seen one normal Oracle checkpoint create over a hundred scattered writes merely to rotate redo logs.
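For the curious, the parity math is a plain XOR across the blocks of a stripe.  Here is a minimal sketch in Python, assuming a two-data-block stripe; the function names and sample data are made up for illustration, and a real controller does this in hardware or in the kernel.

```python
# Minimal sketch of RAID 5 style parity using XOR.
# ASSUMPTION: function names and block contents are illustrative only.

from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after a small write: needs the old data and old parity read first."""
    return xor_blocks(old_parity, old_data, new_data)


if __name__ == "__main__":
    d0, d1 = b"IWU!", b"RAID"
    parity = xor_blocks(d0, d1)

    # Lose d1, then rebuild it from the surviving block and the parity.
    print("rebuilt:", xor_blocks(d0, parity))

    # Overwrite d0: two reads (old data, old parity) and two writes (data, parity).
    new_d0 = b"2008"
    parity = update_parity(parity, d0, new_d0)
    print("parity still consistent:", parity == xor_blocks(new_d0, d1))
```

The `update_parity` step is where the bottleneck shows up: one small write turns into two reads plus two writes, and a database that scatters small writes pays that cost every time.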

RAID 1+0 is much simpler.  Every disk has an identical copy at all times.  Each identical pair holds a percentage of every file so that the data is spread out more or less evenly.  If there were 4 identical pairs, each one would have about 25% of the data for a given file on it.  This makes reads and writes relatively fast.  The RAID set keeps track internally of what parts of what file are where.  This, however, comes at a rather high cost.  In RAID 1+0 you receive about 48% of the disk allocated to the set.  Half of the disk becomes the on-line mirror and about 1% of the overall space is allocated for keeping track of what files are where.  The 1% usage is duplicated onto the second disk, so if two 100 G drives were added (200 G) only about 98 G would be available for usage by the RAID set.  Why would anyone want to use this instead of RAID 5?  RAID 1+0 is a real-time RAID set.  It does not have the problem RAID 5 has in verifying the data on either reads or writes.  The data is assumed accurate; that is assured by the extra disk allocated for duplication.  This skips over entirely the mathematical problems RAID 5 encounters on non-block writes.  Every database out there should only be run on RAID 1+0 and every best practice guide should say this.
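The space trade-off between the two can be sketched in a few lines of Python.  The mirrored figure follows the worked example above (two 100 G drives giving about 98 G, which is closer to 49% than 48%), and the 1% bookkeeping overhead is the figure from this post rather than a vendor specification.  The function names are made up for illustration.

```python
# Usable space under the two RAID layouts discussed above.
# ASSUMPTION: the 1% bookkeeping overhead is this post's figure, applied to
# the raw space as in the 200 G -> 98 G example; real arrays will differ.


def raid5_usable_gb(disk_count: int, disk_gb: float) -> float:
    """RAID 5 gives up one disk's worth of space to parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disk_count - 1) * disk_gb


def mirrored_usable_gb(raw_gb: float, metadata_percent: float = 1.0) -> float:
    """Half the raw space goes to the mirror, minus the bookkeeping overhead."""
    return raw_gb * (50 - metadata_percent) / 100


if __name__ == "__main__":
    print("RAID 5, three 100 G disks:", raid5_usable_gb(3, 100), "G")
    print("Mirrored, 200 G raw:", mirrored_usable_gb(200), "G")
```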

There has been some mention that the new Banner server should be faster.  The primary reason for this is the RAID switch.  Old Banner was on RAID 5 as advised by an off-site specialist.  The new Banner is on RAID 1+0.  This one change should substantially improve the performance and, hopefully, lead to a more stable registration environment.

Published by Chris Rutledge on 21 Feb 2008

Network Replacement/Upgrade Project

As I am sure that most know by now, IWU Information Technology is contending with capital budget requests to fund the replacement/upgrade of IWU’s network infrastructure. We are currently meeting with vendors who intend to propose their solutions for this upgrade. The RFP not only deals with upgrading our LAN equipment to handle faster backbone and switching speeds out to the desktop, but also allows the vendors to bid on extending campus wireless, network access control solutions and network monitoring systems.

IWU Backbone and Switching Speeds

Currently IWU’s network operates on a 1Gig backbone with dedicated 100M out to the desktop. The backbone speed is the rate the data travels from the switch that is in your building’s data closet back to the core router in CNS. The dedicated 100M speed is the data transfer rate from the computer on your desk back to the switch that is in your building’s data closet. The proposed network gear will have the capacity to handle the next generation of network backbone speeds of 10Gig while giving the end user 1Gig data transfer speeds out to their desktop.
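To give a feel for what those numbers mean at the desk, here is a small Python sketch that estimates how long a file takes to cross a link. It assumes an ideal link running at full line rate with no protocol overhead or congestion, and the 700 MB file is a made-up example, so real transfers will be slower.

```python
# Ideal transfer time over a link, ignoring protocol overhead and congestion.
# ASSUMPTION: the 700 MB file size is a hypothetical example.


def transfer_seconds(size_megabytes: float, link_megabits_per_second: float) -> float:
    """Seconds to move a file of the given size at full line rate."""
    size_megabits = size_megabytes * 8  # 8 bits per byte
    return size_megabits / link_megabits_per_second


if __name__ == "__main__":
    for label, speed in [("100M desktop", 100), ("1Gig desktop", 1000)]:
        print(label, "->", transfer_seconds(700, speed), "seconds")
```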

Look Mom, No Wires…

This Request For Proposal (RFP) also has the vendors bidding on a new wireless plan for the university. This plan is to encompass all buildings on campus with the potential to start adding wireless in outdoor common areas in the future. What we are hearing from many vendors is that the new wireless gear can handle commonly used ratified wireless speeds of 54Meg with the ability to upgrade to the faster, soon to be ratified, wireless speeds of 250-300Meg. I will not bore you with all the “techie” statistics, but I will tell you that these wireless speeds will be noticeably faster. These new controller based wireless solutions will allow users that are using the wireless network to roam around in their building without ever losing their wireless connection. And, if in the future the wireless is extended across the entire campus, you will be able to move about from building to building without ever dropping your wireless connection.

Get The NAC

We have all had to deal with, from time to time, a nasty worm or virus that has affected our computer. We have all had to deal with worms and viruses on other people’s computers slowing the network down for us. In this day and age we have come to expect that in order for us to use the internet safely, some type of protection is needed. Then why is it that there are so many computers out there that are not sufficiently protected? I imagine that there are many reasons that will qualify as an answer to that question, but it really makes no difference in resolving the problem. One of the fastest growing technologies being adopted, not only by universities worldwide but by the business world as well, is Network Access Control (NAC).

NAC is an appliance that sits on the network that makes sure that all users who access the university’s network resources have shown themselves to be “qualified”. The implementations for this technology are varied. This appliance can tie into the university’s campus authentication so we can ensure that those users that are on our network are legitimate IWU community users. This includes students, faculty, staff, alumni, retirees, vendors, guests etc… and all can have different levels of usage on our network. Not only does this appliance authenticate the user, but it also ensures that the user’s computer is sufficiently protected in order to join the network by checking to make sure that the computer is running an accepted revision of anti-virus software.

Think about this for a second. Wouldn’t it be nice if we could start taking a proactive stance in keeping our network space free of those things that steal our resources (e.g. unauthorized users, viruses, worms) while giving that same resource back to those that need it?

What about those computers that do not pass the NAC inspection? Well, that is the beauty of this appliance. If your computer is found not to be running the correct revision of anti-virus (or whatever the case may be), the appliance can then direct you to a web page that allows you to download and install the correct revision, thereby allowing your computer to access the network.
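The decision the appliance makes can be sketched in a few lines of Python. This is only an illustration of the logic described above, not how any particular vendor's product works; the `Device` fields, the `Decision` names and the minimum revision number are all made up.

```python
# Sketch of a NAC admission decision: authenticate first, then check protection.
# ASSUMPTION: the types, names and the minimum revision are hypothetical; a real
# appliance takes these from its own policy configuration.

from dataclasses import dataclass
from enum import Enum
from typing import Optional

MIN_AV_REVISION = 5  # hypothetical accepted anti-virus revision


class Decision(Enum):
    DENY = "deny"            # not a known campus user
    REMEDIATE = "remediate"  # send to the download-and-install page
    ALLOW = "allow"


@dataclass
class Device:
    owner: str
    authenticated: bool
    av_revision: Optional[int]  # None means no anti-virus was found


def admit(device: Device, min_av_revision: int = MIN_AV_REVISION) -> Decision:
    """Decide what to do with a device asking to join the network."""
    if not device.authenticated:
        return Decision.DENY
    if device.av_revision is None or device.av_revision < min_av_revision:
        return Decision.REMEDIATE
    return Decision.ALLOW


if __name__ == "__main__":
    print(admit(Device("student", True, 7)))
    print(admit(Device("guest", True, None)))
    print(admit(Device("unknown", False, 7)))
```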

Network Monitoring

The RFP also allots for a much needed network monitoring system. The ability to see into the network traffic is key to keeping network uptime service agreements. It is very difficult to head off a problem that can be devastating to a network if you have no warning that the problem is coming. Currently IWU Network Services Group (NSG) is using a few rudimentary tools to try to stay on top of network and server issues. It is becoming increasingly clear that NSG needs a way of staying on top of network issues as LAN and WLAN usage grows.

Some of the proposals for this are giving us the ability to set alarms for network usage and other “questionable” network behaviors. Some of these systems can tie in information from all the network equipment including the firewall and NAC. We would be able to see when the internet connection is having issues and ways to remediate these issues. We would also have that same ability to correct these same resource problems on IWU’s internal networks as well.
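At its simplest, a usage alarm is a threshold check over a series of measurements. Here is a minimal Python sketch, assuming made-up traffic samples and an 80% threshold that is purely illustrative; the systems being proposed do far more than this.

```python
# Minimal usage alarm: flag samples above a share of the link's capacity.
# ASSUMPTION: the 80% threshold and the sample values are hypothetical.

from typing import List, Tuple


def find_alarms(
    samples_mbps: List[float],
    capacity_mbps: float,
    threshold_percent: float = 80,
) -> List[Tuple[int, float]]:
    """Return (sample index, value) for every sample over the threshold."""
    limit = capacity_mbps * threshold_percent / 100
    return [(i, s) for i, s in enumerate(samples_mbps) if s > limit]


if __name__ == "__main__":
    backbone_samples = [120, 850, 400, 990]  # made-up readings on a 1Gig link
    for index, value in find_alarms(backbone_samples, 1000):
        print(f"sample {index}: {value} Mbps is over the alarm threshold")
```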

Time Line

After the vendors have submitted their responses to the RFP, there will be a time of review to select those vendor teams that will best fit IWU’s plan for the network future.  If the network upgrade project is to be funded, then this project would be extended out over a 4-year, 4-phase implementation.  I do think that this is really a better way to go since there are some growing issues that IWU would have to deal with.  Implementing a NAC solution and growing our wireless network should be something that we take seriously enough to implement with a lot of forethought.  I say this because there are going to be administrative issues that come with using these technologies.  We want to not only be sure that all the pieces fit together correctly giving IWU the best experience it can have with these new services, but also that we are able to maintain and grow these services seamlessly into the future.  To do this we have to consider personnel requirements and our ability to keep those personnel trained on this new gear.  We cannot afford to keep growing IWU’s network infrastructure by adding service upon service without contemplating the amount of trained support personnel it takes to keep those technologies living up to a service level agreement.

Bottom Line

The network that has served IWU so well in the past is getting tired. It is now about 7 years old and some of the critical core equipment is being retired by the vendors that produced it. We, IWU, have outgrown the network and are demanding more from it than it can deliver in its current state. It is time for IWU to put in place a network that will protect the user community, protect the investment that we have made in our network infrastructure and do all of this to further enable the university’s ability to achieve its strategic goals through the use of technology.