This case study shows how bioinformatics in general, and CLC
Combined Workbench in particular, can help researchers gain deeper
knowledge of the complex processes involved in flower development in
Daucus carota.
Figure 1: Flower of Daucus carota.
The study focuses on how the alignment-based primer design tools of
the CLC Combined Workbench make it possible to design degenerate
primers that amplify a set of conserved genes in Daucus carota. The
research was carried out at the Laboratory of Gene Expression at the
University of Aarhus, Denmark.
Early flowering in Arabidopsis thaliana
Flowering plants, or angiosperms, represent one of the most
successful and diverse groups of organisms on the planet, with more
than 250,000 species in the wild. Although angiosperms such as
orchids, roses and snapdragons have very distinctive flowers, most
flowers contain just four organ types and their development involves
highly conserved molecular mechanisms [Krizek and Fletcher, 2005].
Flowers, like shoots and leaves, derive from the shoot apical
meristem (SAM). Whereas leaves arise early in the plant life cycle,
flowers are only produced after transition of the SAM from
vegetative to reproductive growth. The timing of this transition is
crucial to reproductive success, so it must take place at exactly
the right time.
In Arabidopsis thaliana the switch from vegetative to
reproductive growth is governed by four main pathways (see figure 2).
Signals transmitted by genes belonging to each pathway converge at
the Floral Pathway Integrators. These activate
and/or up-regulate the expression of the Floral Meristem Identity
(FMI) genes responsible for the establishment of the floral meristem
[Simpson et al., 1999]. The FMI genes in turn activate the Floral Organ
Identity genes, required for specifying the identity of the floral
organs [Krizek and Fletcher, 2005].
Figure 2: Flower-inducing pathways. The Gibberellic, Autonomous,
Vernalization and Photoperiodic pathways induce the transition of the
SAM from vegetative to reproductive growth and hence early flower
development. Genes written in boldface belong to the MADS-box gene family.
Of the 15 genes central to early flower development, nine belong to
a group of homeotic genes known as MADS-box genes, sharing a
conserved region of approximately 180 bp at the 5' end. Due to the
high degree of conservation, degenerate primers targeting this
region can be designed. This enables a broad search for MADS-box
genes in virtually any organism.
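
The idea of a degenerate primer can be illustrated with a minimal
Python sketch. It collapses each column of a short, gap-free
alignment into the matching IUPAC ambiguity code and reports the
degeneracy, i.e. the number of distinct oligonucleotides the primer
represents. The function name and the toy sequences are assumptions
made for illustration; they are not real DcMADS data and this is not
how the CLC Combined Workbench implements primer design.

```python
# IUPAC nucleotide ambiguity codes, keyed by the set of bases seen in a column.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_primer(aligned_seqs):
    """Return (primer, degeneracy) for equal-length, gap-free aligned DNA."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must all have the same aligned length")
    primer = []
    degeneracy = 1
    for column in zip(*aligned_seqs):
        bases = frozenset(b.upper() for b in column)
        if not bases <= frozenset("ACGT"):
            raise ValueError("only unambiguous, gap-free columns are supported")
        primer.append(IUPAC[bases])
        degeneracy *= len(bases)  # each variable column multiplies the pool size
    return "".join(primer), degeneracy


# Hypothetical toy alignment (not real carrot sequences).
toy_alignment = ["ATGGGAAGA", "ATGGGTAGA", "ATGGGAAGG"]
primer, degeneracy = degenerate_primer(toy_alignment)
print(primer, degeneracy)
```

A lower degeneracy means fewer distinct oligonucleotides in the primer
pool, which is why highly conserved regions are the preferred targets.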
Expressed in vegetative as well as floral tissues, MADS-box genes
are likely to be involved in many aspects of plant development. They
have been successfully cloned from more than 40 different plant
genera [Purugganan et al., 1995]. In the Arabidopsis thaliana genome
alone, more than 100 sequences have been recovered, and although
fewer than 20% have been characterized, many have been shown to be
involved in several aspects of bolting and flowering
[Parenicová et al., 2003], [Martinez-Castilla and Alvarez-Buylla, 2003].
MADS-box genes in carrot (Daucus carota)
Cytoplasmic male sterility (CMS) results from incompatibility
between the nucleus and cytoplasm. CMS plants are not capable of
normal pollen development, in that maturation of viable pollen is
prevented. In addition, flower phenotype is altered. The phenotypes
resemble those of homeotic flower mutants of Arabidopsis thaliana.
In carrot, five nuclear genes involved in the expression of two CMS
phenotypes have been identified. The five clones were aligned with
putative equivalents of other plant species, and due to their
homology to homeotic genes of Antirrhinum majus and
Arabidopsis thaliana the genes were named DcMADS1-5 [Linke et al., 2003].
The research performed in the Laboratory of Gene Expression at the
University of Aarhus alternated between tasks performed in the lab
and bioinformatics tasks using the CLC Combined Workbench. Below,
the work performed is summarized.
Research
The purpose of the research was to identify genes involved in the
floral initiation process in carrot (Daucus carota cv. 'Nantes'),
with a focus on the MADS-box genes. The work flow is outlined in
figure 3.
Figure 3: Illustration of work flow.
The CLC Combined Workbench was used to search for, import and align
the cDNA sequences of DcMADS1-5. Then, a pair of degenerate
primers was designed targeting the conserved MADS-box region. These
primers were used for PCR amplification of genomic DNA isolated from
leaves of 19-week-old carrots. The resulting fragments were cloned
into the TOPO vector and sequenced.
The trace files were imported into the CLC Combined Workbench, where
the forward and reverse reads were assembled. BLAST searches against
the NCBI database were performed using BLASTx, which translates the
nucleotide query in all six reading frames and compares the
translations against a protein database. Based on the BLAST results,
the clones that appeared to be fragments of MADS-box genes were
selected, and primers for inverse PCR (iPCR) were designed from
their nucleotide sequences.
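
The translation step that precedes a BLASTx protein search can be
sketched as follows. This is only a conceptual illustration of
translating a nucleotide sequence in six reading frames with the
standard genetic code; it is not BLAST's implementation, and the
function names and the short test sequence are assumptions made for
this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to their protein translations."""
    frames = {}
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Hypothetical 12-nt query, used only to show the six frames.
frames = six_frame_translations("ATGGGAAGAGGT")
for label, protein in frames.items():
    print(label, protein)
```

Because a cloned genomic fragment may be in either orientation and
its reading frame is unknown, searching all six translations lets a
protein-level match be found regardless of how the fragment was read.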
Back in the lab, the positive clones were labeled with 32P-dATP and
used to screen a carrot genomic library. Positive λ-clones were
isolated, and their DNA was extracted and subjected to iPCR. The
resulting amplicons were cloned into the TOPO vector and processed
as described above.
The screening of the genomic library is still in progress. So far,
putative MADS-box genes have been selected, and annotations
describing the regions identified in each clone have been added
manually. The images generated within the CLC Combined Workbench
have been exported in high-quality graphics formats for use in
presentations etc.
In the following sections, two of the steps in the work flow are
described in further detail: the alignment-based primer design and
the export of images.
Zooming in on Alignment-based primer design
The CLC Combined Workbench makes it possible to align several
sequences and design degenerate primer sets in regions of homology.
Figure 4: Alignment of four sequences used to design primers. Red
indicates fully conserved residues and blue indicates positions with
a high degree of variation. At positions with no background color,
one of the sequences differs from the rest.
Annotations from the original nucleotide sequences or alignment are
still visible within the primer design viewer, which is a big help
when the primers are required to lie within specific regions. In
figure 4, you can see the yellow annotations
indicating the coding regions of the MADS-box genes.
The conservation is shown as a blue-to-red gradient background color,
making it easy to find the highly conserved regions of the alignment
visually. These regions are then selected as the locations for the
forward and reverse primers, as shown by the gray annotations in
figure 4.
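
The visual search for conserved regions can be mimicked numerically.
The sketch below scores each alignment column by the fraction of
sequences that share its most common character and then lists
windows whose mean score reaches a threshold. The scoring scheme,
function names, threshold and toy alignment are all assumptions made
for illustration; the source does not describe how the Workbench
computes its conservation gradient.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Fraction of sequences sharing the most common character, per column."""
    scores = []
    for column in zip(*aligned_seqs):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def conserved_windows(scores, width, min_mean=0.9):
    """Return (start, mean_score) for each window meeting the threshold."""
    hits = []
    for start in range(len(scores) - width + 1):
        mean = sum(scores[start:start + width]) / width
        if mean >= min_mean:
            hits.append((start, mean))
    return hits


# Hypothetical toy alignment of four sequences (not real carrot data).
toy_alignment = ["ATGGGAAGA", "ATGGGTAGA", "ATGGGAAGG", "ATGGGAAGA"]
scores = column_conservation(toy_alignment)
windows = conserved_windows(scores, width=5, min_mean=1.0)
print(scores)
print(windows)
```

Here gap characters would simply be counted like any other character;
a real tool would likely treat gaps separately.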
The parameters for the individual properties of the primers are
adjusted in the Side Panel. In addition to normal primer settings,
such as primer length, melting temperature, buffer properties and
degree of self annealing, the alignment-based primer design makes it
possible to set the degree of degeneracy or the number of acceptable
mismatches.
When all the parameters are set, the CLC Combined Workbench
calculates the best combinations of primer pairs for the selected
regions. The properties for the primer pairs, such as melting
temperature and self annealing, are inspected before deciding on a
primer pair.
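
As a rough illustration of why melting temperature needs inspecting
for degenerate primers, the sketch below expands a degenerate primer
into its individual variants and estimates a melting temperature for
each with the simple Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees
Celsius, which is only a coarse approximation for short
oligonucleotides. The source does not state which melting temperature
model the Workbench uses, so the rule, the function names and the toy
primer are assumptions for this example.

```python
from itertools import product

# Bases represented by each IUPAC nucleotide code.
IUPAC_EXPAND = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand(primer):
    """List every unambiguous sequence represented by a degenerate primer."""
    choices = (IUPAC_EXPAND[code] for code in primer.upper())
    return ["".join(variant) for variant in product(*choices)]


def wallace_tm(seq):
    """Wallace rule estimate: 2 degrees per A/T and 4 degrees per G/C."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_range(primer):
    """Lowest and highest Wallace estimate over all primer variants."""
    estimates = [wallace_tm(variant) for variant in expand(primer)]
    return min(estimates), max(estimates)


# Hypothetical toy primer; real primers are typically longer.
variants = expand("ATGGGWAGR")
low, high = tm_range("ATGGGWAGR")
print(variants, low, high)
```

A wide gap between the lowest and highest estimate suggests that some
variants in the pool may anneal poorly at the chosen PCR temperature.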
Zooming in on Export of images
In figure 5 you can see the graphical result of the
first phase of the research described in this case study.
Figure 5: Graphical output of the CLC Combined Workbench. The result of the research is shown as annotations.
The graphics generated in the CLC Combined Workbench are easily
exported in a variety of graphics formats such as GIF, JPEG, PNG and
PDF.
The CLC Combined Workbench graphics export is based on a WYSIWYG
principle: What You See Is What You Get. This means that you can
configure the layout of e.g. a sequence or alignment in the program
and when you export this to a graphics file, it will look exactly
the same as in the Workbench. This makes it easy to create
good-looking illustrations of research findings for publications and
presentations.
The annotations on the sequence shown in figure 5
illustrate putative MADS-box genes and the annotations furthermore
describe the regions identified in each clone. This illustration is
used for communicating the results of the research in reports and
presentations.