ALICE and Clustered File Systems

I came across an interesting article the other day in CIO magazine about a systems engineering problem at CERN. One experiment at the institute’s Large Hadron Collider (LHC) is going to generate a constant 1GB/sec data stream, and they need an architecture to process and store that data.

Unfortunately, the article is thin on details. It does mention, however, that the first stop for the data will be a group of about 200 PCs. From there it gets processed and goes to another 50 PCs, which will record it to disk. It also mentions that they’re using a standard 4Gbps Fibre Channel SAN with a clustered filesystem. I’m going to assume that they have the 50 PCs set up as a cluster to receive the data and write over the fiber network directly to shared LUNs on the SAN.

I’ve never heard of the storage clustering software they are using (Quantum’s StorNext), but it sounds pretty interesting. Clustered storage is definitely becoming more popular, and it can be a huge win for both performance and management, as these folks at CERN have no doubt been thinking. We made use of clustered SAN storage for the netfiles system at UIUC, and I think it was a big win over any NFS solution. The StorNext software sounds particularly interesting because it’s supposedly vendor agnostic. The QFS software we used for netfiles was Solaris only, and it was quite stable and fast, though a bear to get set up initially.

It seems like they are using a relatively low-bandwidth SAN, unless they are talking about 4Gbps per link and not 4Gbps total (a 1GB/sec stream works out to roughly 8Gbps, so 4Gbps total would not even carry it). But then again, maybe the bottleneck is in all the processing done by the initial 200 PCs in the tree. And, of course, nothing is mentioned about backups. One would hope that they have a duplicate SAN somewhere else doing some sort of timed asynchronous backups in case something bad happens. Anyway, interesting stuff.
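
For what it's worth, here is a quick back-of-the-envelope sketch of the bandwidth question above. The 1GB/sec stream, the 50 writer PCs and the 4Gbps links come from the article. The 400MB/sec usable rate per link is my own assumption (the commonly quoted one-way payload figure for 4Gbps Fibre Channel), and so is the idea that the load is spread evenly across the writers. The function names are just made up for illustration.

```python
import math

# Figures taken from the article.
STREAM_MB_PER_SEC = 1000   # 1GB/sec, treating 1GB as 1000MB
WRITER_PCS = 50            # PCs that record the data to disk

# Assumption, not from the article: usable one-way payload of a
# single 4Gbps Fibre Channel link, in MB/sec.
LINK_PAYLOAD_MB_PER_SEC = 400


def links_needed(stream_mb, link_mb):
    """Smallest number of links whose combined payload covers the stream."""
    return math.ceil(stream_mb / link_mb)


def per_writer_load(stream_mb, writers):
    """MB/sec each writer handles if the stream is split evenly (assumed)."""
    return stream_mb / writers


min_links = links_needed(STREAM_MB_PER_SEC, LINK_PAYLOAD_MB_PER_SEC)
load = per_writer_load(STREAM_MB_PER_SEC, WRITER_PCS)
utilization = load / LINK_PAYLOAD_MB_PER_SEC

print(f"Links needed to carry the stream: {min_links}")
print(f"Load per writer PC: {load:.0f} MB/sec")
print(f"Share of one link used per writer: {utilization:.0%}")
```

Under those assumptions a single 4Gbps link can't carry the whole stream, but if each of the 50 writers has its own link to the SAN, each one only has to push a small fraction of what the link can handle. That would put the pressure on the disk arrays and the clustered filesystem rather than on the fiber itself.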

  1. #1 by jcbarret on July 23, 2007 - 5:52 pm

    You read CIO magazine?!?