Many of today’s research problems rely on huge volumes of data, the analysis of which can lead to computing bottlenecks. Systems biology approaches are even more complicated because researchers want to drill down to the finest level of detail while still maintaining the cross-references needed to investigate relationships between proteins. These computational problems become large enough that the traditional approach of assembling more CPU power in an ever-expanding cluster begins to yield diminishing returns. Numerous CPUs sending and receiving large amounts of data create an I/O bottleneck. Those processors also eat up power and space, and management of such clusters becomes unwieldy.
Computational limits can also prevent researchers from asking really big questions. In an ideal world, researchers would like to incorporate many lines of inquiry into a single series of experiments. For example, an experiment to investigate the genetic basis of mental retardation in fruit flies — pursued as a model of Fragile X syndrome and other forms of human mental retardation — might also include an analysis of the fly metabolites (its metabolome) as well as its proteins. Determining associations and performing cross-correlations, perhaps across multiple species of fly, would quickly overwhelm any traditional computing cluster.
Typically, researchers respond to such limitations by simplifying the experiment so that computational resources can handle the burden, but that limits the power of the experiment and analysis. The kind of in-depth analysis that can really push forward the boundaries of science requires a totally different approach, such as marrying traditional clusters with Field Programmable Gate Array (FPGA) devices that can eliminate computational bottlenecks.
The Proteomics Appliance
To respond to these needs, SGI has developed the Proteomics Appliance. It is based on the SGI® RASC™ (Reconfigurable Application Specific Computing) system using Intel® Itanium® and Quad-Core Intel® Xeon® processors. SGI’s technology partner, Singapore-based Progeniq, has already loaded the Proteomics Appliance with the proteomics applications Smith-Waterman and ClustalW, and it is expected that by the end of March it will also be loaded with HMMer. BLASTp will follow soon after.
Preliminary test runs with the SGI Proteomics Appliance show a 20- to 30-fold speedup over a 2.4GHz Intel Core 2 Quad Q6600 when using Smith-Waterman to search the SWISS-PROT database with query sequences ranging from 500 to 1,000 amino acids. ClustalW shows an approximately 7-fold speedup over the same 2.4GHz Intel Core 2 Quad Q6600 when aligning a set of 124 sequences. These already impressive boosts will be improved further as the researchers at SGI and Progeniq continue to optimize the system. Combined with the SGI® Altix® system, the Proteomics Appliance can run analyses as much as hundreds of times faster than traditional solutions, with only a minimal increase in power consumption. The Appliance’s reconfigurable hardware processor can ‘rewire’ itself on the fly according to whichever application needs to be accelerated.