Dec. 4th, 2010 05:53 pm
jimx0r
So, I'm building a cluster for another department on campus...

They're CentOS fans, so we're using CentOS (not my first choice, but...). Anyway, we're doing a "diskless" configuration for this machine: while there /is/ a local disk on each node, the theory is that the diskless configuration will be a little simpler for them to manage. It's only my job to set it up for them, then give them an (all too brief) tour of the system and how to manage it.

The diskless config is taken care of by the system-config-netboot tool under CentOS. It creates a somewhat interesting configuration. Basically, there is a shared NFSROOT mounted by all of the compute nodes; then, over the top of this, a set of files/directories that should be unique to each machine is mounted. The mount table is a complete mess(!), but, to be perfectly fair, it seems to work pretty well. This simplifies the task of administering the cluster to basically maintaining two copies of the OS: one for the "head node" and one for the "compute nodes".
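
To make the layering concrete, here is a rough Python sketch of the kind of mount plan a compute node ends up with: one shared read-only root, plus one extra mount per path that has to be unique. This is only an illustration, not what system-config-netboot actually generates. The server name `head`, the `/diskless/root` and `/diskless/snapshot` directories, the node name, and the list of unique paths are all assumptions I made up for the example.

```python
from posixpath import join


def overlay_mounts(server, shared_root, snapshot_dir, node, unique_paths):
    """Return (source, mountpoint, options) tuples for one compute node.

    Hypothetical layout: the shared root is mounted read-only at /, and each
    per-node path is mounted read-write on top of it from a per-node tree.
    """
    mounts = [(f"{server}:{shared_root}", "/", "ro")]
    # Sorting puts a parent directory ahead of anything nested below it.
    for path in sorted(unique_paths):
        source = f"{server}:{join(snapshot_dir, node, path.lstrip('/'))}"
        mounts.append((source, path, "rw"))
    return mounts


if __name__ == "__main__":
    plan = overlay_mounts(
        "head",                   # assumed NFS server name
        "/diskless/root",         # assumed shared root directory
        "/diskless/snapshot",     # assumed per-node directory tree
        "node01",                 # assumed compute node name
        ["/var/log", "/etc/sysconfig/network"],
    )
    for source, mountpoint, options in plan:
        print(f"{source:55} {mountpoint:25} {options}")
```

Every path added to the unique list is one more line in the mount table, which is why the table gets so long so quickly.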

This got me to thinking...

If we're willing to have the mount table be a bit, umm..., large, then why not have a single copy of the OS to maintain? Instead of a chroot environment that lives underneath the root filesystem somewhere, why not just NFS export / to all of the compute nodes, then do the same NFS mounting trick on the compute nodes in order to make them unique? Then we would have a single copy of the OS that is modified where, and only where, it needs to be unique for the compute nodes.
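
As a back-of-the-envelope sketch of that idea (untested against a real cluster), the head node would need an /etc/exports entry for / itself plus one for each node's private tree. The Python below just builds those lines as strings. The client network `10.0.0.0/24`, the `/srv/nodes` directory, the node names, and the export options are all my own assumptions, and a real setup would still have to decide how to handle other filesystems mounted under /.

```python
def exports_line(path, clients, options=("ro", "no_root_squash", "sync")):
    """Format one /etc/exports entry: '<path> <clients>(<opt>,<opt>,...)'."""
    return f"{path} {clients}({','.join(options)})"


def single_root_exports(nodes, clients="10.0.0.0/24", per_node_dir="/srv/nodes"):
    """Build export lines for the head node under the single-root idea.

    Assumptions: the head node's own / is shared read-only with every
    compute node, and each node gets a private read-write tree under the
    hypothetical per_node_dir for the files that must differ.
    """
    lines = [exports_line("/", clients)]
    for node in nodes:
        lines.append(
            exports_line(f"{per_node_dir}/{node}", node,
                         ("rw", "no_root_squash", "sync"))
        )
    return lines


if __name__ == "__main__":
    for line in single_root_exports(["node01", "node02"]):
        print(line)
```

The compute nodes would then mount / read-only and layer their private tree on top, the same way the netboot setup does with its separate root.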

I'm thinking this kind of approach will likely represent a larger investment up-front, but I'd be willing to bet that the investment will be repaid many times over compared to maintaining two distinct root filesystems (one for the head node, one for the compute nodes). I think it should especially help when it comes to preventing "version skew" between the software installed on the head node and on the compute nodes.

Anyway, that's my big idea for the day...


