<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://bioitalliance.org/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>CLC bio</title><link>http://bioitalliance.org/blogs/clc_bio/default.aspx</link><description /><dc:language>en-US</dc:language><generator>CommunityServer 2.0 (Debug Build: 60217.2664)</generator><item><title>Identifying genetic variance related to human diabetes mellitus</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/03/26/149.aspx</link><pubDate>Mon, 26 Mar 2007 20:54:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:149</guid><dc:creator>CLC bio</dc:creator><slash:comments>0</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/149.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=149</wfw:commentRss><description>Diabetes mellitus is causing major health problems in the world and
is affecting more and more people. Today nearly 200 million people
have been diagnosed, and the number of new cases is rapidly
increasing, especially in the western world. The reason for this
increase is not simple. Changes in lifestyle and environmental
factors are important contributors, as indicated by many clinical studies.

&lt;P&gt;
In general, diabetes mellitus is a severe, life-threatening disease.
The secondary complications are many and well described. Examples
are a highly increased risk of obesity, development of retinopathy,
development of kidney dysfunction, and accelerated development of
atherosclerosis and mucormycosis.

&lt;P&gt;
Diabetes mellitus types 1 and 2 are characterized by primary
hyperglycemia due to dysregulation of the plasma glucose titer.
Glucose uptake by somatic cells is regulated by insulin. Insulin is
produced by the &amp;#946;-cells in the pancreas, and dysfunction or apoptosis of
these cells affects insulin expression and subsequently the
plasma concentration of glucose.

&lt;P&gt;

&lt;H2&gt;Studying diabetes mellitus&lt;/H2&gt;
Understanding the genetic basis of diabetes mellitus and the
influence of genetic variation on phenotypic traits is of great
importance for medical research and drug development. Sequence
analysis tools like the CLC bio workbenches help researchers
gain a better understanding of the complexity
of diabetes pathogenesis.

&lt;P&gt;
In pre-clinical studies many different animal models have been used.
The pig is a valuable model organism and is relevant to human
biomedical research in areas such as obesity, cardiovascular
disease, and nutritional studies.

&lt;P&gt;
This case study describes how the CLC Combined Workbench is used to
identify sequence polymorphisms in the pig genome at positions that
may be related to diabetes and subsequently demonstrates how to
analyze them using bioinformatics tools.

&lt;P&gt;

&lt;H2&gt;Genetics of Diabetes Mellitus&lt;/H2&gt;
Diabetes mellitus is a complex, multifactorial and polygenic
disease, likely to be caused by one or more gene alterations acting
in combination with non-genetic factors [&lt;A HREF="http://www.clcbio.com/index.php?id=478#Morwessel1998"&gt;Morwessel, 1998&lt;/A&gt;]. Since
obese phenotypic traits are often seen in connection with a diabetes
diagnosis, genetic analysis (including sequence analysis) of genes
known to be related to obesity could help clarify
some of the phenotypic relations to the disease.

&lt;P&gt;

&lt;H3&gt;Identifying uncoupling proteins as diabetes type 2 candidate genes&lt;/H3&gt;
The candidate gene approach for type 2 diabetes mellitus tests for
association between particular gene variants and diabetes. The
candidate genes encode proteins involved in either insulin
synthesis, pathways of insulin secretion, or insulin action, where
defects cause abnormal patterns [&lt;A HREF="http://www.clcbio.com/index.php?id=478#So2000"&gt;So et&amp;nbsp;al., 2000&lt;/A&gt;] and polymorphisms in
these genes may be important risk factors for type 2 diabetes
mellitus patients.

&lt;P&gt;

&lt;H3&gt;Uncoupling proteins&lt;/H3&gt;
Uncoupling proteins (UCP) are located in the inner mitochondrial
membrane, and one of the suggested functions of UCPs is that of
uncoupling by acting as a channel for proton entry into the
mitochondrial matrix.

&lt;DIV ALIGN="CENTER"&gt;&lt;IMG WIDTH="500" HEIGHT="232" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/Diabetes/intramembrane_web.png" ALT="Image intramembrane_web"&gt;&lt;br /&gt;&lt;A NAME="fig:location"&gt;&lt;/A&gt;&lt;A NAME="745"&gt;&lt;/A&gt;
&lt;STRONG&gt;Figure 1: &lt;/STRONG&gt;&lt;I&gt;Location of the uncoupling protein in the intramembrane space [&lt;A HREF="http://www.clcbio.com/index.php?id=478#Gura1998"&gt;Gura, 1998&lt;/A&gt;].&lt;/I&gt;&lt;/DIV&gt;

&lt;P&gt;&lt;br /&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;IMG WIDTH="500" HEIGHT="330" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/Diabetes/ucp_web.png" ALT="Image ucp_web"&gt;&lt;br /&gt;&lt;A NAME="fig:workflow"&gt;&lt;/A&gt;&lt;A NAME="746"&gt;&lt;/A&gt;
&lt;STRONG&gt;Figure 2: &lt;/STRONG&gt;&lt;I&gt;The yellow arrow represents the uncoupling protein's function as a channel for proton entry into the matrix [&lt;A HREF="http://www.clcbio.com/index.php?id=478#Rousset2004"&gt;Rousset et&amp;nbsp;al., 2004&lt;/A&gt;].&lt;/I&gt;
&lt;/DIV&gt;

&lt;P&gt;
Respiration and ADP phosphorylation in mitochondria are coupled, and
the uncoupling proteins appear to be controlling the level of these
functions. Uncoupling is when a protein acts as a proton carrier,
and by the transportation of protons from the intermembrane space to
the matrix a shunt between ATP synthase and the respiratory chain is
created [&lt;A HREF="http://www.clcbio.com/index.php?id=478#Rousset2004"&gt;Rousset et&amp;nbsp;al., 2004&lt;/A&gt;]. By this mechanism UCPs might have some
basic roles to play in human physiology. In addition UCP2 and UCP3
decrease membrane potential and increase thermogenesis
[&lt;A HREF="http://www.clcbio.com/index.php?id=478#Dalgaard2001a"&gt;Dalgaard and Pedersen, 2001&lt;/A&gt;] and the genes encoding these proteins are thus
regarded as candidate genes for studies of the type 2 diabetes that
often accompanies obese phenotypic traits.

&lt;P&gt;
The uncoupling proteins are highly conserved among species, and
significant similarity is seen between e.g. the human and the pig
uncoupling proteins, which show more than 90 percent identity at the
level of amino acid sequence. These genes are therefore of interest
for animal model research.
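
&lt;P&gt;
To make the notion of percent identity concrete, here is a minimal Python sketch, not the method used by the CLC Combined Workbench. It assumes two sequences that are already aligned and counts identical residues over the columns where neither sequence has a gap. The function name and the two toy fragments are made up for illustration and are not real UCP sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
human = "MVGFKATDVP-TAT"
pig = "MVGLKATDVPPTAT"
print(round(percent_identity(human, pig), 1))
```

Other definitions of identity exist, for example dividing by the full alignment length including gap columns, so the denominator should always be stated when a figure is reported.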

&lt;P&gt;

&lt;H1&gt;Work flow&lt;/H1&gt;
CLC bio provides bioinformatics software for the
detection, identification, and characterization of polymorphisms in
type 2 diabetes candidate genes such as the uncoupling proteins. The
use of the CLC Combined Workbench, which integrates all analyses in one
program, is exemplified below.

&lt;P&gt;
Human and porcine DNA encoding UCP2 and UCP3 is retrieved from local
and on-line databases using BLAST. The sequences are aligned to
check for proper identity between species. Next, primers are
designed for porcine DNA. The primers are used for PCR, and the
products are subsequently sequenced. The sequencing data is
assembled to the reference sequences used for designing primers, and
putative polymorphisms are identified. Using the integrated SNP
annotation functionality of the Combined Workbench, the possible
polymorphisms are characterized and compared to known SNPs in the
SNP database. Next, the coding regions of the DNA sequences are
translated into protein and subjected to a number of predictions to
determine the impact of the polymorphisms on the UCP proteins.

&lt;P&gt;
In the next sections, three of the steps in the work flow are
described in further detail.

&lt;P&gt;

&lt;H2&gt;Zooming in on Annotations&lt;/H2&gt;
During the entire &lt;A&gt;work flow&lt;/A&gt;, the
same sequences are used for alignments, as the basis for primer
design, and as reference sequences in assembly and SNP
identification. This means that annotations of genes, coding regions
etc. are preserved during all the analyses. They can then be used to
guide the inspection of alignments and BLAST hits, the location of
primers, the interpretation of sequencing data etc. Small snapshots
of the role of annotations in the different parts of the work flow
are shown below:

&lt;P&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;IMG WIDTH="500" HEIGHT="91" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/Diabetes/blast_annotated_web.png" ALT="Image blast_annotated_web"&gt;&lt;br /&gt;
&lt;A NAME="fig:blast_annotated"&gt;&lt;/A&gt;&lt;A NAME="729"&gt;&lt;/A&gt;
&lt;STRONG&gt;Figure 3: &lt;/STRONG&gt;&lt;I&gt;Inspecting the result of a BLAST search. The yellow annotation represents the coding region of the UCP3 gene. The annotations make it easy to get an overview of where the hits align to the query sequence.&lt;/I&gt;
&lt;/DIV&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;br /&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;IMG WIDTH="500" HEIGHT="90" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/Diabetes/alignment_annotated_web.png" ALT="Image alignment_annotated_web"&gt;&lt;br /&gt;&lt;A NAME="fig:workflow"&gt;&lt;/A&gt;&lt;A NAME="732"&gt;&lt;/A&gt;
&lt;STRONG&gt;Figure 4: &lt;/STRONG&gt;&lt;I&gt;An alignment with a yellow annotation representing the coding region of the UCP3 gene for humans (top) and pigs (bottom). The translation is shown to visualize differences in the amino acid sequence between the two species.&lt;/I&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;br /&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;IMG WIDTH="500" HEIGHT="182" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/Diabetes/primer_web.png" ALT="Image primer_web"&gt;&lt;br /&gt;&lt;A NAME="fig:workflow"&gt;&lt;/A&gt;&lt;A NAME="735"&gt;&lt;/A&gt;
&lt;STRONG&gt;Figure 5: &lt;/STRONG&gt;&lt;I&gt;Designing primers to bind just before the UCP3 coding region. The annotations eliminate the need for remembering positions.&lt;/I&gt;&lt;/DIV&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;br /&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;IMG WIDTH="500" HEIGHT="241" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/Diabetes/assemble_web.png" ALT="Image assemble_web"&gt;&lt;br /&gt;&lt;A NAME="fig:workflow"&gt;&lt;/A&gt;&lt;A NAME="738"&gt;&lt;/A&gt;
&lt;STRONG&gt;Figure 6: &lt;/STRONG&gt;&lt;I&gt;Assembling to the reference sequence where both the primer binding site (red annotation) and the coding sequence (yellow annotation) are shown.&lt;/I&gt;
&lt;/DIV&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;br /&gt;

&lt;H2&gt;Zooming in on SNP annotation using BLAST&lt;/H2&gt;
Sequencing data of genes encoding the uncoupling proteins are
searched for polymorphisms. In positions where a polymorphism is
identified, a BLAST database search is performed for SNPs in human
genes similar to the genes of interest. This helps control and
verify the results.

&lt;P&gt;
Results of the SNP annotation using BLAST can be shown as graphics
(see figure &lt;A&gt;7&lt;/A&gt;), in a tabular view, and as annotations on
the input sequence(s).

&lt;P&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;IMG WIDTH="500" HEIGHT="321" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/Diabetes/snp_blast_web.png" ALT="Image snp_blast_web"&gt;&lt;br /&gt;
&lt;A NAME="fig:SNP"&gt;&lt;/A&gt;&lt;A NAME="741"&gt;&lt;/A&gt;
&lt;STRONG&gt;Figure 7: &lt;/STRONG&gt;&lt;I&gt;The graphical view of the SNP BLAST.
  At the top you see the sequence used as reference in the assembly
  with possible SNPs annotated with red arrows. Below are the hits from the SNP database. One SNP is highlighted (M).&lt;/I&gt;
&lt;/DIV&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;br /&gt;
When annotating SNPs using BLAST, you can select which database you
want to search against, e.g. human, mouse, or rat. You can also set
the BLAST parameters such as filtering and gap costs.

&lt;P&gt;
In the graphic view you can see sequence matches between your query
sequence and hit sequences in the database chosen. From here you can
easily zoom in on specific regions of interest, or you can open a
hit region in a new view.

&lt;P&gt;
Annotations on the query sequence will indicate any match from the
database where a polymorphism has previously been identified, and
the tabular view provides an overview of e.g. identity and positions
of matching regions between query and hit sequences. From the
tabular view you can easily open any of the hit sequences at the
NCBI web page.

&lt;P&gt;

&lt;H2&gt;Zooming in on transmembrane helix prediction and secondary structure prediction&lt;/H2&gt;
After the sequenced genes have been translated and annotations added
where polymorphisms were identified, transmembrane regions in the
porcine uncoupling proteins 2 and 3 are predicted by CLC Combined
Workbench. This shows whether each identified polymorphism lies in a
transmembrane region or in an extracellular or intermembrane region.

&lt;P&gt;
In a similar way, the secondary structure of the proteins was predicted
in CLC Combined Workbench. From the suggested locations of the
polymorphisms, specific domains where protein structure and function
may be affected are identified.

&lt;P&gt;
Combined with the putative topology of uncoupling protein 3 in the inner
mitochondrial membrane, the suggested locations of the identified
polymorphisms may help predict possible impacts of the
identified genetic variance.

&lt;P&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;IMG WIDTH="500" HEIGHT="171" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/Diabetes/secondary_web.png" ALT="Image secondary_web"&gt;&lt;br /&gt;
&lt;A NAME="fig:secondary"&gt;&lt;/A&gt;&lt;A NAME="744"&gt;&lt;/A&gt;
&lt;STRONG&gt;Figure 8: &lt;/STRONG&gt;&lt;I&gt;Prediction of secondary structure for the analyzed uncoupling protein 3.&lt;/I&gt;
&lt;/DIV&gt;&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=149" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1227.aspx">Case Studies</category></item><item><title>High-performance bioinformatics without any hassles</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/03/14/146.aspx</link><pubDate>Wed, 14 Mar 2007 19:27:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:146</guid><dc:creator>CLC bio</dc:creator><slash:comments>8</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/146.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=146</wfw:commentRss><description>&lt;p&gt;CLC bio announced the release of an innovative product to the scientific community - the CLC Bioinformatics Cell - which will enable users to transform their ordinary Intel- or AMD-based workstation computers to super-computing units. Tasks previously taking two hours can now be computed in one minute - without any hassles.&lt;/p&gt;
&lt;p&gt;J&amp;ouml;rgen Forsberg, Nordic Director at Intel:&lt;blockquote&gt;This product can potentially help pharma and biotech corporations move faster through the phases of drug discovery, and thereby help the pharma industry to develop products faster to meet the healthcare challenges of today&lt;/blockquote&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;Senior Scientific Officer at CLC bio, Dr. Jannick Bendtsen adds: &lt;blockquote&gt;A very important research area of molecular biology today concerns the use of small interfering RNA (siRNA). The design of siRNA relies on database matching to ensure that strong cross-hybridization to non-target genes does not occur. However, due to the small size of siRNA molecules, potential cross-matches can be missed by heuristic algorithms like BLAST. With the speed of the Bioinformatics Cell we are now making it possible to abandon heuristics, and to weed out cross-matching siRNAs, by using Smith-Waterman - the most accurate matching algorithm that exists.&lt;/blockquote&gt;&lt;br /&gt;&lt;/p&gt;
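&lt;p&gt;To illustrate what exhaustive Smith-Waterman matching involves, here is a minimal Python sketch of the scoring recurrence. It is not the accelerated implementation in the Bioinformatics Cell. It fills the full dynamic-programming table and returns the best local alignment score. The function name, the scoring values (match 2, mismatch -1, linear gap -2) and the example sequences are assumptions chosen for illustration.&lt;/p&gt;

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score using a full dynamic-programming table."""
    rows = len(query) + 1
    cols = len(target) + 1
    # score[i][j]: best score of a local alignment ending at query[i-1] and target[j-1]
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if query[i - 1] == target[j - 1] else mismatch
            score[i][j] = max(
                0,                            # start a new local alignment here
                score[i - 1][j - 1] + step,   # extend with a match or mismatch
                score[i - 1][j] + gap,        # query residue aligned to a gap
                score[i][j - 1] + gap,        # target residue aligned to a gap
            )
            best = max(best, score[i][j])
    return best


# Hypothetical short RNA fragments, for illustration only.
print(smith_waterman_score("ACGUACGU", "UUACGUACGAUU"))
```

&lt;p&gt;Because every cell of the table is evaluated, no candidate match is skipped, which is why the running time grows with the product of the two sequence lengths.&lt;/p&gt;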
&lt;p&gt;It is commonly accepted throughout the scientific community that the amount of data to be processed is growing faster than processor speed. This means scientists are looking for alternative ways to meet their computing needs. The CLC Bioinformatics Cell provides this by accelerating bioinformatics algorithms, such as Smith-Waterman and ClustalW, by up to 110 times compared with what a fast desktop computer can handle today.&lt;/p&gt;
&lt;p&gt;The Bioinformatics Cell has been developed by CLC bio's world class bioinformatics experts, and they have essentially brought high-performance computing to everybody with this product. This remarkable speed-up is achieved through innovative use of SIMD technology that unleashes already existing power in your computer processor. &lt;/p&gt;
&lt;p&gt;A scientist can buy a single unit for a laptop computer, a lab can share a couple among its desktop computers, or a unit can be plugged in to each computer in a cluster to push the speed-up beyond anything available in the market today.&lt;/p&gt;
&lt;p&gt;The CLC Bioinformatics Cell works with a simple, hassle-free Plug 'n Play concept through CLC bio's intuitive workbench platform, thus eliminating the need for highly specialized consultants. The device is simply plugged in to the USB port of any Intel- or AMD-based computer, be it Mac OS X, Windows or Linux. Command-line execution is also available, making it possible to integrate the Bioinformatics Cell with existing bioinformatics workflows of scripts and in-house programs.&lt;/p&gt;
Version 0.91 of the Bioinformatics Cell currently supports the Smith-Waterman and ClustalW algorithms, with more to come. Smith-Waterman is accelerated to the point where its speed is comparable to that of BLAST, while it yields exact results instead of the heuristic estimates of BLAST, where up to 50% of the results can be missed, as stated in the white paper "High performance computing, using CLC Bioinformatics Cell", available on &lt;a href="http://www.clccell.com" target="_blank"&gt;www.clccell.com&lt;/a&gt;. The CLC Bioinformatics Cell can help researchers and scientists worldwide get better scientific results in less time.
&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=146" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1228.aspx">News</category></item><item><title>Bioinformatics Consulting</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/03/14/145.aspx</link><pubDate>Wed, 14 Mar 2007 19:11:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:145</guid><dc:creator>CLC bio</dc:creator><slash:comments>15</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/145.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=145</wfw:commentRss><description>More and more public research organizations, biotech companies, and pharmaceutical corporations need assistance in achieving specialized, faster, and more effective research results. Our world class bioinformatics consultants can help you achieve these goals and get even better scientific results - in less time! We can help you succeed by:
&lt;ul&gt;&lt;li&gt;Developing specialized bioinformatics algorithms&lt;/li&gt;
&lt;li&gt;Creating customized modules for your bioinformatics software&lt;/li&gt;
&lt;li&gt;Performing bioinformatics analyses for organizations that lack the time or the qualified know-how.&lt;/li&gt;&lt;/ul&gt;
	
&lt;a href="http://www.clcbioconsulting.com/" target="_blank"&gt;CLC bio Consulting - World Class Bioinformatics Consulting&lt;/a&gt;&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=145" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1230.aspx">Reference Links</category></item><item><title>Screening the carrot genome for MADS-box genes</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/03/14/144.aspx</link><pubDate>Wed, 14 Mar 2007 18:57:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:144</guid><dc:creator>CLC bio</dc:creator><slash:comments>0</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/144.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=144</wfw:commentRss><description>This case study shows how bioinformatics in general, and CLC
Combined Workbench in particular, can help researchers gain deeper
knowledge of the complex processes involved in flower development in
&lt;i&gt;Daucus carota&lt;/i&gt;.

&lt;p&gt;
&lt;/p&gt;&lt;div align="center"&gt;&lt;a name="fig:Pathways"&gt;&lt;/a&gt;
&lt;img src="http://www.clcbio.com/scienceimages/QueenLace2_web.jpg" alt="" align="bottom" border="0" height="331" width="250"&gt;&lt;br&gt;
&lt;strong&gt;Figure 1: &lt;/strong&gt;&lt;i&gt;Flower of Daucus carota.&lt;/i&gt;
&lt;/div&gt;

&lt;p&gt;
The study focuses on how the alignment-based primer design functionalities of the CLC Combined Workbench make it possible to
design degenerate primers to amplify a set of conserved
genes in &lt;i&gt;Daucus carota&lt;/i&gt;. The research has been carried out at
the &lt;a href="http://www.clcbio.com/index.php?id=478#moeller-nielsen"&gt;Laboratory of Gene Expression at the University of Aarhus,
Denmark.&lt;/a&gt;
&lt;/p&gt;&lt;p&gt;

&lt;/p&gt;&lt;h2&gt;Early flowering in &lt;i&gt;Arabidopsis thaliana&lt;/i&gt;&lt;/h2&gt;
Flowering plants, or angiosperms, represent one of the most
successful and diverse groups of organisms on the planet, with more
than 250,000 species in the wild. Although angiosperms such as
orchids, roses and snapdragons have very distinctive flowers, most
flowers contain just four organ types and their development involves
highly conserved molecular mechanisms [&lt;a href="http://www.clcbio.com/index.php?id=478#Krizek2005"&gt;Krizek and Fletcher, 2005&lt;/a&gt;].

&lt;p&gt;
Flowers, like shoots and leaves, derive from the shoot apical
meristem (SAM). Whereas leaves arise early in the plant life cycle,
flowers are only produced after transition of the SAM from
vegetative to reproductive growth. The timing of the transition is
crucial to reproductive success and as such must take place at
exactly the right time.

&lt;/p&gt;&lt;p&gt;
In &lt;i&gt;Arabidopsis thaliana&lt;/i&gt; the switch from vegetative to
reproductive growth is governed by four main pathways (see figure
&lt;a&gt;2&lt;/a&gt;). Signals transmitted by genes belonging to each
pathway converge at the Floral Pathway Integrators. These activate
and/or up-regulate the expression of the Floral Meristem Identity
(FMI) genes responsible for the establishment of the floral meristem
[&lt;a href="http://www.clcbio.com/index.php?id=478#Simpson1999"&gt;Simpson et&amp;nbsp;al., 1999&lt;/a&gt;]. The FMI genes in turn activate the Floral Organ
Identity genes, required for specifying the identity of the floral
organs [&lt;a href="http://www.clcbio.com/index.php?id=478#Krizek2005"&gt;Krizek and Fletcher, 2005&lt;/a&gt;].&lt;/p&gt;&lt;p&gt;

&lt;/p&gt;&lt;div align="center"&gt;&lt;a name="fig:Pathways"&gt;&lt;/a&gt;
&lt;img src="http://www.clcbio.com/scienceimages/Pathways.gif" alt="Image Pathways" align="bottom" border="0"&gt;&lt;br&gt;
&lt;strong&gt;Figure 2&lt;/strong&gt;: &lt;i&gt;Flower Inducing Pathways. The Gibberellic, Autonomous, Vernalization and Photoperiodic
  pathways induce the transition of the SAM from vegetative to reproductive growth and hence early flower
  development. Genes written in boldface belong to the MADS-box gene family.&lt;/i&gt;
&lt;/div&gt;

&lt;p&gt;
Of the 15 genes central to early flower development, nine belong to
a group of homeotic genes known as MADS-box genes, sharing a
conserved region of approximately 180 bp in the 5'-end. Due to the
high degree of conservation, degenerate primers targeting this
region can be designed. This enables a broad search for MADS-box
genes in virtually any organism.
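&lt;/p&gt;&lt;p&gt;
As a rough illustration of what a degenerate primer is, the Python sketch below expands a primer written with IUPAC ambiguity codes into the concrete oligonucleotides it represents and counts them. This is a generic sketch, not the primer design algorithm of the CLC Combined Workbench. The function names and the example primer are hypothetical; the primer was made up to encode the amino acids M-G-R-G-K and is not taken from the study.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


primer = "ATGGGNMGNGGNAAR"  # hypothetical primer, for illustration only
print(degeneracy(primer))
print(expand("GGNAAR")[:4])
```

The degeneracy grows multiplicatively with each ambiguous position, which is why primer design tools usually let the user cap it.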

&lt;/p&gt;&lt;p&gt;
Expressed in vegetative as well as floral tissues, MADS-box genes
are likely to be involved in many aspects of plant development. They
have been successfully cloned from more than 40 different plant
genera [&lt;a href="http://www.clcbio.com/index.php?id=478#Purugganan1995"&gt;Purugganan et&amp;nbsp;al., 1995&lt;/a&gt;]. In the &lt;i&gt;Arabidopsis thaliana&lt;/i&gt; genome alone, more than 100 sequences have been recovered and
although fewer than 20% are characterized, many have been shown to
be involved in several aspects of bolting and flowering
[&lt;a href="http://www.clcbio.com/index.php?id=478#Parenicova2003"&gt;Parenicová et&amp;nbsp;al., 2003&lt;/a&gt;], [&lt;a href="http://www.clcbio.com/index.php?id=478#Martinez-Castilla2003"&gt;Martinez-Castilla and Alvarez-Buylla, 2003&lt;/a&gt;].

&lt;/p&gt;&lt;p&gt;

&lt;/p&gt;&lt;h2&gt;MADS-box genes in carrot (&lt;i&gt;Daucus carota&lt;/i&gt;)&lt;/h2&gt;
Cytoplasmic male sterility (CMS) results from incompatibility
between the nucleus and cytoplasm. CMS plants are not capable of
normal pollen development, in that maturation of viable pollen is
prevented. In addition, flower phenotype is altered. The phenotypes
resemble homeotic flower mutants of &lt;i&gt;Arabidopsis thaliana&lt;/i&gt;.

&lt;p&gt;
In carrot, five nuclear genes involved in the expression of two CMS
phenotypes have been identified. The five clones were aligned with
putative equivalents of other plant species, and due to their
homology to homeotic genes of &lt;i&gt;Antirrhinum majus&lt;/i&gt; and
&lt;i&gt;Arabidopsis thaliana&lt;/i&gt; the genes were named &lt;i&gt;DcMADS1-5&lt;/i&gt; [&lt;a href="http://www.clcbio.com/index.php?id=478#Linke2003"&gt;Linke et&amp;nbsp;al., 2003&lt;/a&gt;].

&lt;/p&gt;&lt;p&gt;
The research performed in the Laboratory of Gene Expression at the
University of Aarhus alternated between tasks performed in the lab
and bioinformatics tasks using the CLC Combined Workbench. Below,
the work performed is summarized.

&lt;/p&gt;&lt;p&gt;

&lt;/p&gt;&lt;h2&gt;Research&lt;/h2&gt;
The purpose of the research was to identify genes involved in the
floral initiation process in carrots (&lt;i&gt;Daucus carota&lt;/i&gt; cv.
&lt;i&gt;'Nantes'&lt;/i&gt;) and focus was set on the MADS-box genes. The work
flow is outlined in figure &lt;a&gt;3&lt;/a&gt;.

&lt;p&gt;

&lt;/p&gt;&lt;div align="center"&gt;&lt;a name="fig:workflow"&gt;&lt;/a&gt;
&lt;img src="http://www.clcbio.com/scienceimages/workflow_carrot.gif" alt="" align="bottom" border="0"&gt;&lt;br&gt;&lt;strong&gt;Figure 3&lt;/strong&gt;: &lt;i&gt;Illustration of work flow.&lt;/i&gt;
&lt;/div&gt;

&lt;p&gt;
The CLC Combined Workbench was used to search for, import and align
the cDNA sequences of &lt;i&gt;DcMADS1-5&lt;/i&gt;. Then, a pair of degenerate
primers was designed targeting the conserved MADS-box region. These
primers were used for PCR amplification of genomic DNA isolated from
leaves of 19-week-old carrots. The resulting fragments were cloned
into the TOPO vector and sequenced.

&lt;/p&gt;&lt;p&gt;
The trace files were imported into the CLC Combined Workbench, where
the forward and reverse reads were assembled. BLAST searches against
the NCBI database were performed using BLASTx, which translates the
nucleotide sequence into amino acid sequences before searching a protein
database. Based on the results of the BLAST searches, the clones
suggested to be fragments of MADS-box genes were selected. Based on
their nucleotide sequences, primers for inverse PCR (iPCR) were
designed.
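&lt;/p&gt;&lt;p&gt;
The translation step that precedes a BLASTx search can be sketched in Python as follows. This is only the conceptual first step, assuming the standard genetic code and translation of all six reading frames; it does not perform the protein database search and is not the NCBI or CLC implementation. The function names and the example sequence are made up for illustration.

```python
from itertools import product

# Standard genetic code, with codons ordered by first, second, third base in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become X, stop codons become *."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def six_frame_translations(dna):
    """Translate the three forward and three reverse-complement reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[strand + str(offset + 1)] = translate(seq[offset:])
    return frames


# Hypothetical read, for illustration only.
for name, protein in six_frame_translations("ATGGGTCGTGGTAAATAA").items():
    print(name, protein)
```

Each of the six translated sequences would then be compared against the protein database, which is what makes this kind of search useful for uncharacterized genomic fragments.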

&lt;/p&gt;&lt;p&gt;
Back in the lab, the positive clones were labeled with
&lt;sup&gt;32&lt;/sup&gt;P-dATP and used to screen a carrot genomic
library. Positive λ-clones were isolated, and DNA was extracted
and subjected to iPCR. The resulting
amplicons were cloned into the TOPO vector and processed as
described above.

&lt;/p&gt;&lt;p&gt;
The screening of the genomic library remains a work in progress and
at this point in time putative MADS-box genes have been selected and
annotations describing the regions identified in each clone were
manually added. The images generated within the CLC Combined
Workbench have been exported in high-quality graphics formats for
use in presentations etc.

&lt;/p&gt;&lt;p&gt;
In the following sections, two of the steps in the work flow are
described in further detail: the alignment-based primer design and
the export of images.

&lt;/p&gt;&lt;p&gt;

&lt;/p&gt;&lt;h2&gt;Zooming in on Alignment-based primer design&lt;/h2&gt;
The CLC Combined Workbench makes it possible to align several
sequences and design degenerate primer sets in regions of homology.

&lt;p&gt;

&lt;/p&gt;&lt;div align="center"&gt;&lt;a name="fig:alignment"&gt;&lt;/a&gt;
&lt;img src="http://www.clcbio.com/scienceimages/alignment_with_annotations.gif" alt="" align="bottom" border="0"&gt;&lt;br&gt;
&lt;strong&gt;Figure 4&lt;/strong&gt;: &lt;i&gt;Alignment of four sequences which are used to design primers. The red color indicates fully conserved residues and the blue
color shows positions with high degree of variation. At the
positions with no background color, one of the sequences is
different from the rest.&lt;/i&gt;
&lt;/div&gt;

&lt;p&gt;
Annotations from the original nucleotide sequences or alignment are
still visible within the primer design viewer, which is a big help
when the primers are required to lie within specific regions. In
figure &lt;a&gt;4&lt;/a&gt;, you can see the yellow annotations
indicating the coding regions of the MADS-box genes.

&lt;/p&gt;&lt;p&gt;
The conservation is shown as a blue-to-red gradient background color
making it easy to visually find the highly conserved regions of the
alignment. These regions are defined as locations for the forward
and the reverse primers as shown with the gray annotation in figure
&lt;a&gt;4&lt;/a&gt;.

&lt;/p&gt;&lt;p&gt;
The parameters regarding the individual properties of the primers
are adjusted in the Side Panel. In addition to normal primer
settings, such as primer length, melting temperature, buffer
properties and degree of self annealing, the alignment-based primer
design makes it possible to select degree of degeneracy, or the
number of mismatches acceptable.

&lt;/p&gt;&lt;p&gt;
When all the parameters are set, the CLC Combined Workbench
calculates the best combinations of primer pairs for the selected
regions. The properties for the primer pairs, such as melting
temperature and self annealing, are inspected before deciding on a
primer pair.

&lt;/p&gt;&lt;p&gt;

&lt;/p&gt;&lt;h2&gt;Zooming in on Export of images&lt;/h2&gt;
In figure &lt;a&gt;5&lt;/a&gt; you can see the graphical result of the
first phase of the research described in this case study.

&lt;p&gt;

&lt;/p&gt;&lt;div align="center"&gt;&lt;a name="fig:result"&gt;&lt;/a&gt;
&lt;img src="http://www.clcbio.com/scienceimages/result_web.gif" alt="Image result_web" align="bottom" border="0"&gt;
&lt;strong&gt;&lt;br&gt;Figure 5&lt;/strong&gt;: &lt;i&gt;Graphical output of the CLC Combined Workbench. The result of the research is shown as annotations.&lt;/i&gt;
&lt;/div&gt;

&lt;p&gt;
The graphics generated in the CLC Combined Workbench are easily
exported in a variety of graphics formats such as gif, jpeg, png and
pdf.

&lt;/p&gt;&lt;p&gt;
The CLC Combined Workbench graphics export is based on a WYSIWYG
principle: What You See Is What You Get. This means that you can
configure the layout of e.g. a sequence or alignment in the program
and when you export this to a graphics file, it will look exactly
the same as in the Workbench. This makes it easy to create
good-looking illustrations of research findings for publications and
presentations.

&lt;/p&gt;&lt;p&gt;
The annotations on the sequence shown in figure &lt;a&gt;5&lt;/a&gt;
illustrate putative MADS-box genes and the annotations furthermore
describe the regions identified in each clone. This illustration is
used for communicating the results of the research in reports and
presentations.&lt;/p&gt;&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=144" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1227.aspx">Case Studies</category></item><item><title>CLC bio joins Microsoft in the BioIT Alliance</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/01/25/123.aspx</link><pubDate>Thu, 25 Jan 2007 21:58:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:123</guid><dc:creator>CLC bio</dc:creator><slash:comments>0</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/123.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=123</wfw:commentRss><description>&lt;p&gt;Aarhus, Denmark -- January 25, 2007 --  CLC bio today announced that it has joined the BioIT Alliance, a cross-industry  group which unites the pharmaceutical, biotech, hardware and software industries,  to explore new ways to share complex biomedical data and collaborate among  multi-disciplinary teams to speed up the pace of discovery in the life  sciences. &lt;/p&gt;
&lt;p&gt;CLC bio CEO Thomas Knudsen is excited about the collaboration:&lt;/p&gt;&lt;blockquote&gt;It is an honour for our company to join a network of such highly esteemed companies as Microsoft, Hewlett-Packard, Sun Microsystems and others. We think this agreement will prove beneficial for all involved parties, and we’re looking forward to working together within this scientific community, developing innovative solutions and solving current issues in the post-genomic age.&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Microsoft Platform Strategy Advisor Don Rule adds: &lt;/p&gt;&lt;blockquote&gt;We are thrilled to welcome CLC bio to the BioIT Alliance. They bring deep experience in bioinformatics software, hardware and consulting solutions to our rapidly expanding community.&lt;/blockquote&gt; &lt;p&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;About the BioIT Alliance&lt;/strong&gt;&lt;br&gt;
  Formed in 2006, the BioIT Alliance is a cross-industry group working together to improve biomedical information technology on the Microsoft platform. Founding members include Affymetrix, Inc., Accelrys Software Inc., Amylin Pharmaceuticals, Inc., Applied Biosystems, The BioTeam Inc., Digipede Technologies LLC, Discovery Biosciences Corporation, Geospiza Inc., Hewlett-Packard Development Company, InterKnowlogy, Microsoft Corporation, Sun Microsystems Inc., VizX Labs LLC and other key companies in the pharmaceutical, biotech, hardware and software industries. Additional information about the BioIT Alliance can be found on the BioIT Alliance Web site at &lt;a href="http://www.bioitalliance.org"&gt;http://www.bioitalliance.org&lt;/a&gt; &lt;/p&gt;
&lt;strong&gt;&lt;br&gt;
&lt;/strong&gt;
&lt;p&gt;&lt;strong&gt;About CLC bio&lt;/strong&gt;&lt;br&gt;
  CLC  bio is the world's leading full-service bioinformatics solution provider,  solely focusing on the development of bioinformatics: software, hardware, data  analysis, and custom-designed bioinformatics algorithms.&lt;/p&gt;
&lt;p&gt;CLC  bio’s mission is to be among the most innovative bioinformatics companies in  the 21st century. This is realized through:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Development       of bioinformatics software and hardware based on the latest scientific       findings&lt;/li&gt;
  &lt;li&gt;User-friendly,       integrated and intuitive software solutions&lt;/li&gt;
  &lt;li&gt;Continuous       focus on customer needs and superior customer service&lt;/li&gt;
  &lt;li&gt;Frequent       product updates including the latest IT technologies and bioinformatics       algorithms&lt;/li&gt;
  &lt;li&gt;A       flexible IT architecture, enabling customers to buy or develop       individualized solutions at a reasonable price.&lt;/li&gt;
&lt;/ul&gt;
&lt;p align="center"&gt;# # #&lt;/p&gt;
&lt;p&gt;For  further information, please contact:&lt;br&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Thomas Knudsen, CEO&lt;br&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; CLC bio&lt;br&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Gustav Wieds Vej 10&lt;br&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8000 Aarhus  C&lt;br&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Denmark&lt;br&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Phone: +45 70 22 55 09&lt;br&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; E-mail: info@clcbio.com&lt;br&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Website: www.clcbio.com &lt;/p&gt;
&lt;p align="center"&gt;&amp;nbsp;&lt;/p&gt;&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=123" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1228.aspx">News</category></item><item><title>Micro sites</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/01/15/121.aspx</link><pubDate>Mon, 15 Jan 2007 19:06:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:121</guid><dc:creator>CLC bio</dc:creator><slash:comments>0</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/121.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=121</wfw:commentRss><description>&lt;p&gt;&lt;a href="http://www.clccube.com/" target="_blank"&gt;CLC Bioinformatics Cube - hardware-acceleration on your desktop&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.clcdeveloper.com/" target="_blank"&gt;CLC Developer Kit - software developer kit (SDK) for bioinformatics software plug-in developers&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.clcbioconsulting.com/" target="_blank"&gt;CLC Bio Consulting - World Class Bioinformatics Consulting&lt;/a&gt;&lt;/p&gt;&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=121" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1230.aspx">Reference Links</category></item><item><title>Online workbench presentations</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/01/15/120.aspx</link><pubDate>Mon, 15 Jan 2007 19:05:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:120</guid><dc:creator>CLC bio</dc:creator><slash:comments>0</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/120.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=120</wfw:commentRss><description>&lt;p&gt;Now, you can learn more about your workbench of choice while sitting comfortably at your computer in your office.&lt;/p&gt;
&lt;p&gt;The one hour presentation is performed by one of our product specialists and transmitted to you through the internet. During the whole session, you can ask questions through a free/low-cost teleconference connection, ensuring that you do not leave with questions unanswered.&lt;/p&gt;
&lt;p&gt;It’s very easy and without any obligations: You do not have to prepare, download or install anything on your computer – just sign up, and we will make sure that you will get a lot of valuable information.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.clcbio.com/index.php?id=722" target="_blank"&gt;Follow this link to read more&lt;/a&gt;&lt;/p&gt;&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=120" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1229.aspx">Presentations, Podcasts, Webcasts</category></item><item><title>Alignment speed and quality at CLC bio</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/01/15/119.aspx</link><pubDate>Mon, 15 Jan 2007 18:53:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:119</guid><dc:creator>CLC bio</dc:creator><slash:comments>0</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/119.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=119</wfw:commentRss><description>&lt;p&gt;The alignment algorithms used in the software from CLC bio A/S has some unique features including the option of adjusting the cost of gaps in the end of the alignment to suit the sequences being aligned.	&lt;/p&gt;
&lt;p&gt;We have two alignment algorithms: a standard algorithm that is 10 times faster than our previous
  alignment algorithm in most scenarios, and an additional fast algorithm that is even faster but less accurate than
  the standard algorithm. &lt;/p&gt;
&lt;p&gt;The White Paper below forms the basis for these 5 conclusions:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;On &lt;strong&gt;large data sets&lt;/strong&gt; of      sequences that are not too divergent, our alignment is significantly      faster than the standard CLUSTAL W alignment, and around the same speed as      the fast CLUSTAL W alignment. &lt;br&gt;
  &lt;/li&gt;
  &lt;li&gt;Performing an alignment of 28 HIV genomes, our fast alignment is      more than 10 times (55 minutes) faster than the standard CLUSTAL W      alignment.&lt;br&gt;
  &lt;/li&gt;
  &lt;li&gt;We have benchmarked our new algorithms on the BAliBASE 3.0      database of accurate &lt;strong&gt;protein alignments&lt;/strong&gt; (Thompson et al., 2005). This shows that our alignment      algorithm  is about 1% more accurate than the latest version of the      standard CLUSTAL W on protein alignments.&lt;br&gt;
  &lt;/li&gt;
  &lt;li&gt;We have benchmarked our new algorithms on the BRaliBase II database      of structurally aligned &lt;strong&gt;RNA&lt;/strong&gt;.      Here, our new algorithm is about 3.5% more accurate than the standard      CLUSTAL W. &lt;br&gt;
  &lt;/li&gt;
  &lt;li&gt;Our standard algorithm is still a little slower than the standard      CLUSTAL W on the fairly divergent alignments in BAliBASE and BRaliBase. Our      fast alignment is as precise and as fast as the standard CLUSTAL W on      these data sets. &lt;/li&gt;
&lt;/ol&gt;
&lt;a href="http://www.clcbio.com/files/whitepapers/wp_alignmentspeed_A4.pdf" target="_blank"&gt;Download White Paper&lt;/a&gt;&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=119" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1231.aspx">White Papers</category></item><item><title>NASA Chooses Sequencing Software from CLC bio</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/01/15/118.aspx</link><pubDate>Mon, 15 Jan 2007 18:46:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:118</guid><dc:creator>CLC bio</dc:creator><slash:comments>0</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/118.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=118</wfw:commentRss><description>&lt;b&gt;The Exobiology Branch of the NASA Ames Research Center in Moffett Field, California, has chosen bioinformatics software from CLC bio for analyzing sequencing data which revolves around hypersaline microbial mat systems that are found worldwide in exotic locations such as Baja, Mexico, and Solar Lake, Egypt, to name a few.&lt;/b&gt;&lt;p /&gt;

&lt;img src=http://clcbio.com/images/reference_nasa.png align=left&gt;&lt;i&gt;Exobiology - also known as astrobiology - is an interdisciplinary field, combining aspects of astronomy, biology and geology, which is focused primarily on the study of the origin, distribution and evolution of life.&lt;/i&gt;
Microbial mat communities represent, in gross morphology, some of the earliest known microbial communities on Earth. As a common form of microbial community existing on Earth during the Precambrian, and dominant in the Proterozoic era, these systems are hypothesized to have played a significant role in the development of modern oceanic and atmospheric conditions. &lt;p /&gt;
&lt;b&gt;Dr. Stefan J. Green, scientist from the Exobiology Branch of the NASA Ames Research Center, states&lt;/b&gt;:
&lt;blockquote&gt;We are currently employing various molecular tools to characterize shifts in community structure of the total microbial population as a result of alterations in sulfate and salinity levels.
Our initial molecular analyses are performed using a fairly standard DNA extraction-PCR amplification-denaturing gradient gel electrophoresis (DGGE) methodology. We are employing a wide variety of primers for PCR-DGGE analyses and while many of these primer sets amplify regions of the 16S or 18S rRNA gene, we are also exploring a variety of primer sets to target functional genes. Such functional gene analyses can provide information regarding microbial function and phylogeny. 
CLC Combined Workbench is very useful in our manipulations of sequence data recovered from our exobiological analyses. In particular, the software has been useful in assembling complete and long gene sequences from multiple sequence reactions - with the sequencing trace visible for all overlapping reactions. For example, the dsrAB genes are approximately 1900 bp in length, and require a minimum of 3 sequencing reactions. 
The gene annotation feature is also highly beneficial, by helping establish, for example, where the dsrA gene ends and the dsrB gene begins, based on other sequences in Genbank. The wide variety of other tools in CLC's software are extremely useful – BLAST searches from within the program, in silico restriction digests, alignments, rudimentary phylogenetic trees, among a lot of other features. 
One of the worst facets of sequence analysis is the constant reformatting that must be performed to analyze the data with different programs - and with CLC's software the need for many of these manipulations is eliminated. Thank you for creating such nice software!&lt;/blockquote&gt;

&lt;a href="http://exobiology.nasa.gov" target="_blank"&gt;Learn more about exobiology&lt;/a&gt;&lt;p /&gt;

&lt;a href="http://www.clcbio.com/index.php?id=720" target="_blank"&gt;Click here to read the full press release.&lt;/a&gt;&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=118" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1228.aspx">News</category></item><item><title>Detection of genomic changes in Mycobacterium tuberculosis</title><link>http://bioitalliance.org/blogs/clc_bio/archive/2007/01/15/117.aspx</link><pubDate>Mon, 15 Jan 2007 18:10:00 GMT</pubDate><guid isPermaLink="false">9206313b-73a6-4b55-a104-a0254c229ae1:117</guid><dc:creator>CLC bio</dc:creator><slash:comments>0</slash:comments><comments>http://bioitalliance.org/blogs/clc_bio/comments/117.aspx</comments><wfw:commentRss>http://bioitalliance.org/blogs/clc_bio/commentrss.aspx?PostID=117</wfw:commentRss><description>&lt;P&gt;
&lt;H1&gt;
Detection of genomic changes in &lt;i&gt;Mycobacterium tuberculosis&lt;/i&gt;
&lt;/H1&gt;

&lt;P&gt;

&lt;H2&gt;
Resistance development in &lt;i&gt;Mycobacterium tuberculosis&lt;/i&gt;
&lt;/H2&gt;
Tuberculosis (TB) remains the leading cause of death due to
bacterial infections worldwide. About 8 million new cases of active
TB arise each year resulting in 3 million annual deaths. Roughly 1
billion individuals are believed to harbour latent tuberculosis.

&lt;P&gt;
Primary infections with &lt;I&gt;M. tuberculosis&lt;/I&gt; are generally
asymptomatic but will in some cases remain as a latent infection.

&lt;P&gt;
Tuberculosis is a secondary disease caused by reactivation of the
&lt;I&gt;M. tuberculosis&lt;/I&gt; bacteria not fully eliminated after the
primary infection.

&lt;P&gt;
Sequencing and sequence analysis of bacterial and fungal genomes and
proteins/peptides have led to a better general understanding of the
pathogenesis of bacterial and fungal infections. Understanding of
the regulatory events at the molecular level is expected to grow
faster with a variety of new technologies and technology platforms,
such as microarrays, protein chips and sequence analysis tools (CLC
bio workbenches). The aim is to develop more specific and effective
drugs much faster to target, for example, the multidrug-resistant
&lt;I&gt;Mycobacterium tuberculosis&lt;/I&gt; strains that are spreading
in human populations all over the world.

&lt;P&gt;
Multidrug resistant bacterial strains arise by sequential
accumulation of resistance mutations for individual drugs. A diverse
array of strategies is available to assist in rapid detection of
drug resistance-associated gene mutations.

&lt;P&gt;
Bioinformatics (CLC bio workbenches) together with functional
genomics and functional proteomics have been used to identify
changes in expression patterns (signatures) in multidrug-resistant
&lt;I&gt;M. tuberculosis&lt;/I&gt; strains.

&lt;P&gt;

&lt;H2&gt;
&lt;i&gt;Mycobacterium tuberculosis&lt;/i&gt; genome
&lt;/H2&gt;
The &lt;i&gt;Mycobacterium tuberculosis&lt;/i&gt; genome was sequenced in 1998
[&lt;A HREF="http://www.clcbio.com/index.php?id=478#Cole1998"&gt;Cole et&amp;nbsp;al., 1998&lt;/A&gt;] (see figure &lt;A&gt;1&lt;/A&gt;). Today
many strains of &lt;I&gt;M. tuberculosis&lt;/I&gt; have been sequenced. In disease
control laboratories, sequencing is routinely performed to monitor
strain genomic changes.

&lt;P&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;A NAME="fig:tuberculosis_genome"&gt;&lt;/A&gt;
&lt;IMG BORDER="0" SRC="http://www.clcbio.com/scienceimages/tuberculosis_genome.png" ALT="Image tuberculosis_genome"&gt;&lt;br /&gt;&lt;br /&gt;
&lt;STRONG&gt;Figure 1:&lt;/STRONG&gt; &lt;I&gt;The Mycobacterium tuberculosis genome &lt;/I&gt;[&lt;A HREF="http://www.clcbio.com/index.php?id=478#Cole1998"&gt;Cole et&amp;nbsp;al., 1998&lt;/A&gt;]
&lt;P&gt;&lt;br /&gt;
&lt;/DIV&gt;

&lt;P&gt;
With the latest re-annotation of the &lt;I&gt;M.
tuberculosis&lt;/I&gt; genome, it is possible to assign a function to 2058
protein-coding sequences (52% of the 3995 proteins predicted). Only
376 putative proteins share no homology with known proteins and thus
could be unique to &lt;I&gt;M. tuberculosis&lt;/I&gt;.

&lt;P&gt;

&lt;H2&gt;
Drug resistance in &lt;i&gt;Mycobacterium tuberculosis&lt;/i&gt;
&lt;/H2&gt;
Today eight agents are used in the treatment of TB. Multidrug-resistant
&lt;I&gt;M. tuberculosis&lt;/I&gt; strains are resistant to both
isoniazid (INH) and rifampicin (RFP). These agents have the most
effective bactericidal activity towards &lt;I&gt;M. tuberculosis&lt;/I&gt;.
Nearly 95% of the RFP-resistant strains possess a mutation in the
rpoB gene encoding the beta subunit of the DNA-dependent RNA
polymerase. Approximately 90% of INH-resistant strains have a
mutation in the inhA, katG or ahpC genes encoding enzymes related to
mycolic acid synthesis in the cell wall. Pyrazinamide (PZA)
resistant strains have a mutation in the pncA gene encoding a
pyrazinamidase which converts PZA to the bactericidal substance
pyrazinoic acid. Streptomycin-resistant strains have a mutation in
the rrs or rpsL genes encoding the 16S rRNA and the ribosomal
protein S12, respectively. Kanamycin resistance is due to nucleotide
substitutions in the rrs gene encoding 16S rRNA. Ethambutol-resistant
strains have a mutation in the embB gene encoding an
arabinosyltransferase which catalyzes cell wall synthesis. In
approximately 60% of resistant organisms, ethambutol resistance is
due to amino acid replacements at position 306 of this
arabinosyltransferase.
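
&lt;P&gt;
Amino acid replacements such as those described above can be found by
comparing a sequenced coding region with a reference codon by codon.
The following Python sketch shows the idea on short toy sequences. It
is a minimal sketch, not the method used by the CLC bio workbenches: the
function name and the example sequences are hypothetical, the standard
genetic code is assumed, and insertions and deletions are not handled.

&lt;PRE&gt;
```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_changes(reference_cds, sample_cds):
    """List (codon_number, reference_aa, sample_aa) for each replacement.

    Both inputs must be in-frame coding sequences of equal length.
    Codon numbers are 1-based.
    """
    ref = reference_cds.upper()
    sample = sample_cds.upper()
    if len(ref) != len(sample) or len(ref) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    changes = []
    for start in range(0, len(ref), 3):
        ref_aa = CODON_TABLE[ref[start:start + 3]]
        sample_aa = CODON_TABLE[sample[start:start + 3]]
        if ref_aa != sample_aa:
            changes.append((start // 3 + 1, ref_aa, sample_aa))
    return changes


# Toy example (not real gene sequences): codon 3 changes from ATG to GTG.
print(amino_acid_changes("ATGGCTATGTAA", "ATGGCTGTGTAA"))
```
&lt;/PRE&gt;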

&lt;P&gt;
CLC bio provides important bioinformatics solutions for the
detection and identification of mutations in selected genes of new
strains of &lt;I&gt;M. tuberculosis&lt;/I&gt; and other important pathogenic
microbes. An example of this is described below.

&lt;P&gt;

&lt;H2&gt;
Detection of mutations in &lt;i&gt;Mycobacterium tuberculosis&lt;/i&gt; embB, rrs and pncA genes
&lt;/H2&gt;
The work flow for detecting and identifying mutations using the
&lt;I&gt;CLC Combined Workbench&lt;/I&gt; is summarized in figure
&lt;A&gt;2&lt;/A&gt;.

&lt;P&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;A NAME="fig:workflow"&gt;&lt;/A&gt;
&lt;IMG WIDTH="436" HEIGHT="286" ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/workflow.png" ALT="Image workflow"&gt;&lt;br /&gt;&lt;br /&gt;
&lt;STRONG&gt;Figure 2:&lt;/STRONG&gt; &lt;I&gt;Illustration of work flow.&lt;/I&gt;
&lt;p&gt;&lt;br /&gt;
&lt;/DIV&gt;

&lt;P&gt;
The multidrug-resistant &lt;I&gt;M. tuberculosis&lt;/I&gt; strains are identified
and the genomic DNA is extracted. The gene regions of embB, rrs
and pncA are flanked with primers designed using the graphical primer
design tool. The relevant regions are sequenced automatically and
assembled into contigs. BLAST searches in selected databases (both
local and NCBI databases) are performed. Subsequently, homologous
sequences are aligned and annotations are transferred to the contig
sequence.

&lt;P&gt;
In the next two sections, two of the steps in the work flow are
described in further detail.

&lt;P&gt;

&lt;H3&gt;
Zooming in on Primer design
&lt;/H3&gt;
Using the integrated GenBank search function, the annotated &lt;I&gt;M.
tuberculosis&lt;/I&gt; genome is downloaded. In a few clicks, the relevant genes
(embB, rrs and pncA) are extracted. Guided by the CDS annotations, a
set of primers is calculated for each gene (see
figure&amp;nbsp;&lt;A&gt;3&lt;/A&gt;). Using the interactive primer design
functionality of the &lt;I&gt;CLC Combined Workbench&lt;/I&gt;, the primers can
be designed to match very specific needs.&lt;p&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;A NAME="fig:primerdesign"&gt;&lt;/A&gt;
&lt;IMG ALIGN="BOTTOM" BORDER="0" SRC="http://www.clcbio.com/scienceimages/primerdesign.png" ALT="Image primerdesign"&gt;&lt;br /&gt;&lt;br /&gt;
&lt;STRONG&gt;Figure 3:&lt;/STRONG&gt; &lt;I&gt;Primer design guided by CDS annotations (the yellow boxes).&lt;/I&gt;
&lt;p&gt;&lt;br /&gt;
&lt;/DIV&gt;

&lt;P&gt;

&lt;H3&gt;
Zooming in on Assembly
&lt;/H3&gt;
The sequencing data is imported into the CLC Workbench, and during
the assembly, the genes are automatically divided into separate
contigs.

&lt;P&gt;
The inconsistencies that exist between different reads are inspected
using the variance table (see figure&amp;nbsp;&lt;A&gt;4&lt;/A&gt;).&lt;p&gt;

&lt;DIV ALIGN="CENTER"&gt;&lt;A NAME="fig:contigs"&gt;&lt;/A&gt;

&lt;IMG BORDER="0" SRC="http://www.clcbio.com/scienceimages/contigs.png" ALT="Image contigs"&gt;&lt;br /&gt;&lt;br&gt;&lt;STRONG&gt;Figure 4:&lt;/STRONG&gt; &lt;I&gt;Assembling of raw sequence data. The upper image is the contig variance table, showing where there are conflicts/variation between the reads. When selecting a row in the table, the bottom view automatically selects this position in the contig, and it can be inspected in more detail.&lt;/I&gt;
&lt;p&gt;&lt;br /&gt;

&lt;/DIV&gt;

The reads include both forward and reverse reads, and the orientation
is detected automatically during the assembly. The orientation is
reflected in the color of the reads (in figure &lt;A&gt;4&lt;/A&gt;, the read
at the top is reversed, which is indicated by the red color of the
residues). Quality scores are assigned in order to trim low-quality
trace data, so that reads align properly. The quality scores can also
be shown graphically below the sequence.
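
&lt;P&gt;
The idea behind quality trimming can be shown with a short Python
sketch that removes low-quality bases from both ends of a read. It is a
deliberately simple threshold approach under stated assumptions, not
the trimming algorithm of the CLC Workbench; the function name and the
default threshold of 20 are hypothetical.

&lt;PRE&gt;
```python
def trim_low_quality_ends(bases, qualities, min_quality=20):
    """Trim both ends of a read back to the first acceptable base.

    Returns the trimmed bases and their quality scores. Low-quality
    bases inside the kept region are left untouched.
    """
    if len(bases) != len(qualities):
        raise ValueError("bases and qualities must have the same length")
    # Positions whose quality score meets the threshold.
    good = [i for i, q in enumerate(qualities) if q >= min_quality]
    if not good:
        return "", []
    start, end = good[0], good[-1] + 1
    return bases[start:end], list(qualities[start:end])


# Toy read: the two bases at each end fall below the threshold.
print(trim_low_quality_ends("ACGTACGT", [5, 10, 30, 35, 12, 40, 8, 3]))
```
&lt;/PRE&gt;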

&lt;P&gt;
The result of the assembly is a contig sequence which is used in the
following steps for BLAST searches and alignments.

&lt;P&gt;
The description above is only a brief example of how bioinformatics
software, such as the CLC Workbenches, can be used in clinical
microbiology. Although remarkable advances have been made, much
remains to be learned about the molecular genetic basis of drug
resistance development in bacteria, including &lt;I&gt;M. tuberculosis&lt;/I&gt;. New therapeutics
will be developed based on improved bioinformatic data-mining of
results from bacterial drug resistance studies.&lt;img src="http://bioitalliance.org/aggbug.aspx?PostID=117" width="1" height="1"&gt;</description><category domain="http://bioitalliance.org/blogs/clc_bio/archive/category/1227.aspx">Case Studies</category></item></channel></rss>