CAGE |
Project details | ||||
Compendium of Arabidopsis Gene Expression |
||||
| Objectives: | ||||
| The CAGE project aims to build a gene expression reference database. A consortium of European Arabidopsis functional genomics centres has teamed up with bioinformatics partners that contribute expertise in microarray data processing, analysis, and storage/distribution. A total of 2000 Arabidopsis samples will be produced and analysed under largely standardised conditions. These samples will be profiled on CATMA microarrays, which contain gene-specific probes for most Arabidopsis genes, to build a Compendium of expression profiles. The data will be assessed for statistical significance and submitted to the ArrayExpress database at the European Bioinformatics Institute (EBI). EBI will deliver a CAGE-specific ontology and data submission pipelines. The Compendium data will be annotated and analysed for content and confirmation of gene function. The Compendium will subsequently be maintained by EBI. |
||||
| Aims: | ||||
|
||||
| Description of the work: | ||||
Arabidopsis functional
genomics today faces the immense challenge of mapping genomic sequence to
function, since most of the 29,000 or more genes identified in the Arabidopsis
genome have not been characterised experimentally. A particularly powerful
technology for associating genes with function is microarray-based
expression analysis. In the CAGE project we will build a publicly available
functional genomics knowledge base using the novel CATMA microarray. The
project will demonstrate both the power of this microarray (designed to
discriminate strongly between gene homologues) and the added value of analysing
microarray data in a Compendium format. |
||||
|
Arabidopsis thaliana (Boyes
et al., Plant Cell, 200,) |
|||
To accomplish
this we have brought together a consortium of European laboratories including
a series of plant research centres (URGV/France, VIB/Belgium, HRI/United
Kingdom, MPI/Germany, SLU/Sweden, RUU/Netherlands, CSIS/Spain and UNIL/Switzerland);
the VIB-Micro Array Facility (VIB-MAF); and partners that excel in developing statistics
and mining algorithms for the analysis of gene expression (ESAT/Belgium,
EBI/United Kingdom). All microarrays will be produced by VIB-MAF, thereby
controlling variance and reducing cost. A total of 4000 microarrays will
be provided to the project partners. Together they will analyse 2000 carefully
chosen Arabidopsis samples (two chips per sample, reference design)
to explore the “developmental and functional
space” of Arabidopsis. Samples will consist of biological replicates; some tissues
and organs will be sampled even more extensively. |
||||
The resulting data will
be statistically analysed for quality and significance by ESAT and subsequently
submitted to the central ArrayExpress database at EBI. EBI (in collaboration
with TAIR) will deliver a CAGE-specific ontology and data submission pipelines.
The Compendium data will be annotated with pre-computed results and thoroughly
analysed for content and proof of gene function, as will be demonstrated
in publications. |
||||
| Deliverables: | ||||
The duration of the project
is 3 years. A total of 4000 microarrays, each with at least 25,000 features, will
be produced over a period of 18 months. The partners will assemble 2000
biological samples, and processing on microarrays will generate close
to 100,000,000 data points. The reference sample used in all comparisons
will be made publicly available. The first data will be produced in the
summer of 2003, and data production will continue until the end of 2004.
All data will first be analysed by the consortium partners and submitted
for publication before the Compendium data are released to the scientific
community. However, we will keep the lag time for data submission as short
as possible (< 6 months). Data processing pipelines and pre-computed
results will be released for public use. Submission of the final data
will be completed by the end of 2005. The Arabidopsis ArrayExpress database
will be maintained by the EBI. |
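The figures quoted above fit together arithmetically. The short Python sketch below recomputes them. It assumes one data point per feature per microarray and uses the stated lower bound of 25,000 features; the function and variable names are illustrative only and are not part of any CAGE pipeline.

```python
# Back-of-the-envelope check of the figures quoted in the project description.
# Assumption: one data point is counted per feature per microarray.

SAMPLES = 2000                # biological samples (from the text)
ARRAYS_PER_SAMPLE = 2         # two chips per sample, reference design (from the text)
FEATURES_PER_ARRAY = 25_000   # "at least 25,000 features" (lower bound from the text)


def total_arrays(samples: int, arrays_per_sample: int) -> int:
    """Number of microarrays needed to profile all samples."""
    return samples * arrays_per_sample


def total_data_points(arrays: int, features_per_array: int) -> int:
    """Number of feature measurements across all microarrays."""
    return arrays * features_per_array


arrays = total_arrays(SAMPLES, ARRAYS_PER_SAMPLE)
points = total_data_points(arrays, FEATURES_PER_ARRAY)
print(f"{arrays} arrays, {points:,} data points")
```

With these inputs the sketch gives 4000 microarrays and 100,000,000 data points, consistent with the totals stated in the text.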
||||