
Genoscope belongs to an international consortium for the sequencing of the rice genome, the International Rice Genome Sequencing Project (IRGSP). It is the only participating center in Europe.
Initially, Genoscope worked with the CNRS / University of Perpignan (M. Delseny), the rice genomics group of the IRD at Montpellier (A. Gesquière) and the CIRAD (J.C. Glaszmann) within the framework of a pilot project focused on a small number of BACs.
The International Rice Research Institute (IRRI), created in 1960 to help small rice producers in developing countries, is one of the centers of the Consultative Group on International Agricultural Research (CGIAR). It manages a large collection of rice varieties.
The International Rice Information System (IRIS) is a database for the management and integration of information on the genetic resources for rice; it is managed by IRRI.
Riceweb offers a large number of facts and figures on rice; it is managed by IRRI.
Riceworld: pages of the rice museum, based at IRRI in the Philippines.
GRAMENE: this comparative genomics platform for cereals, managed by the US Department of Agriculture (USDA), provides a very complete collection of rice genomics resources.
The Rice Genome Research Program (RGP) in Japan offers numerous resources, including the INE database which integrates physical and genetic maps and sequence information.
Rice sequences submitted to GenBank, on an FTP server of the National Institute for Agrobiological Sciences (NIAS) in Japan.
Results of the automatic annotation by the Rice Genome Automated Annotation System (RiceGAAS), developed at NIAS.
The rice page at The Institute for Genomic Research (TIGR) in the US, including an index of rice genes (OsGI).
Genoscope is participating in the international effort to sequence the rice genome (Oryza sativa) by determining the sequence of chromosome 12 of this cereal. The rice cultivar chosen for sequencing by the international IRGSP Consortium is the Nipponbare variety, or GA3 of the japonica sub-species. Chromosome 12 of the Nipponbare variety is 30 million nucleotides (Mb) long (the totality of the 12 chromosomes of the Nipponbare variety represents 430 Mb). Genoscope is also sequencing a telomeric region of chromosome 11, which corresponds to a duplication between chromosome 11 and chromosome 12.
Genoscope, like the other members of the consortium, is using the hierarchical shotgun approach, also called “clone by clone”, which relies on the prior construction of a physical map of the genome. The physical mapping resources available at the beginning of the project came from two centers:
When Genoscope became involved in the chromosome 12 sequencing project, these clones and the physical mapping data were available for determining a minimal “tiling path” covering the whole chromosome. Work began in 1999 within the framework of a pilot project with CIRAD, IRD, CNRS and the University of Perpignan: 11 BACs were first selected as entry points along the chromosome. Sequencing was intended to progress in both directions from these “seed points”, and from others to be added later, chosen along the chromosome for an optimal spacing of markers. In this strategy, known as STC (Sequence Tagged Connectors), each time a clone is sequenced, its sequence is compared to the end sequences of all the clones in the library to select clones with minimal overlap; using the fingerprint data, the clone that extends furthest is then chosen to be sequenced next. The minimal tiling path is thus constructed as the sequencing progresses. The process stops when the active edges of two BAC contigs progressing in opposite directions meet.
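The selection step of the STC strategy can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the software used at Genoscope: the `Clone` record, the `next_clone` and `walk` functions, and the overlap thresholds are hypothetical, and known map coordinates stand in for the real end-sequence comparison and fingerprint analysis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clone:
    name: str
    start: int  # hypothetical map coordinate of the left end (bp)
    end: int    # hypothetical map coordinate of the right end (bp)


def next_clone(seed: Clone, library: list,
               min_overlap: int = 2_000,
               max_overlap: int = 30_000) -> Optional[Clone]:
    """Choose the clone to sequence next when walking rightwards from `seed`.

    A candidate must start inside the seed (standing in for an end-sequence
    hit), overlap it only modestly, and extend beyond it. Among the
    candidates, the one that extends furthest is returned.
    """
    candidates = []
    for clone in library:
        overlap = seed.end - clone.start
        if (clone.start > seed.start and clone.end > seed.end
                and min_overlap <= overlap <= max_overlap):
            candidates.append(clone)
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.end)


def walk(seed: Clone, library: list) -> list:
    """Build a tiling path by repeated selection until no clone extends it."""
    path = [seed]
    current = seed
    while (chosen := next_clone(current, library)) is not None:
        path.append(chosen)
        current = chosen
    return path


# Toy library with made-up coordinates.
library = [
    Clone("A", 0, 150_000),
    Clone("B", 100_000, 240_000),  # overlaps A by 50 kb: rejected as too redundant
    Clone("C", 130_000, 270_000),
    Clone("D", 140_000, 260_000),
    Clone("E", 260_000, 400_000),
]
print([c.name for c in walk(library[0], library)])  # prints ['A', 'C', 'E']
```

In the real procedure the walk proceeds in both directions from each seed point, and the overlap is estimated from sequence matches and fingerprints rather than read from coordinates.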
The initial scenario for the project was modified in 2000 after Monsanto’s announcement that it had produced a rough draft sequence of the Nipponbare genome. The American company had entrusted this project to Leroy Hood’s group at the University of Washington in Seattle. From a library of more than 75,000 BAC clones for which the end sequences and fingerprints were available, the group selected about 3500 clones to form a tiling path, and sequenced them at 5X coverage. Monsanto eventually made some of these data available to the IRGSP, under certain conditions. Thus, in March 2001, Genoscope obtained sequence data from the 148 Monsanto clones assigned to chromosome 12, as well as the BAC clones used, end sequence and fingerprint data, and access to a BLAST server which included all the sequences obtained by Monsanto. We then endeavored to merge all of the private and public data in order to produce the chromosome 12 sequence at minimum cost.
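To give a rough sense of what a coverage depth such as 5X implies, the idealized Lander-Waterman model estimates the fraction of a clone left unsequenced at coverage c as e^(-c). The sketch below applies that formula; the 150 kb clone length and the function name are assumptions chosen for illustration, and real shotgun data deviate from the model because reads are not uniformly distributed.

```python
import math


def expected_uncovered_bases(clone_length: int, coverage: float) -> float:
    """Expected number of bases hit by no read, under the idealized
    Lander-Waterman model (reads placed uniformly at random)."""
    return clone_length * math.exp(-coverage)


# Hypothetical 150 kb BAC insert, at two coverage depths.
for depth in (5, 10):
    missing = expected_uncovered_bases(150_000, depth)
    print(f"{depth}X: about {missing:.0f} bases expected to be uncovered")
```

Under these assumptions, a draft at 5X still leaves on the order of a thousand bases per clone uncovered, spread over several gaps, whereas at 10X the expected figure drops to a handful.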
First, we verified the anchoring of the Monsanto BAC clones on chromosome 12 by a series of hybridizations with probes derived from markers. The sequences of the 148 Monsanto clones assigned to chromosome 12 were then compared with the end sequences of the CUGI BACs, which allowed us to position 127 Monsanto BACs on 20 CUGI fingerprint contigs. In order to confirm the integration of the Monsanto BACs, we submitted them to a new fingerprint analysis together with the CUGI clones. Of these 127 Monsanto clones, 82 were selected to start construction of the minimum tiling path, together with the 11 clones from the pilot project. However, these 82 Monsanto clones represented only 12 Mb of the 30 Mb of chromosome 12. In the regions that were not covered by the Monsanto clones, we determined a new set of entry points by a second set of hybridizations with genetic markers localized in these regions. Lastly, we derived probes from end sequences of the contigs of CUGI clones that were already anchored on chromosome 12 in order to identify new contigs.
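The positioning of sequenced clones on fingerprint contigs through BAC end sequences can be illustrated as follows. This is a deliberately simplified sketch: exact substring matching replaces the sequence-similarity search that would be used in practice, and the `assign_contig` function, the contig names and the toy sequences are all hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def assign_contig(clone_seq: str, bac_ends: dict):
    """Assign a sequenced clone to the fingerprint contig whose BAC end
    sequences occur most often in it (on either strand).

    `bac_ends` maps each end sequence to the contig of the BAC it came from.
    Returns None when no end sequence is found in the clone.
    """
    hits = Counter()
    for end_seq, contig in bac_ends.items():
        if end_seq in clone_seq or reverse_complement(end_seq) in clone_seq:
            hits[contig] += 1
    return hits.most_common(1)[0][0] if hits else None


# Toy data: two end sequences from contig "ctg7" match, one from "ctg3" does not.
clone_seq = "ACGTTGCAAGGCTTACCGGATTAC"
bac_ends = {"TTGCAAGG": "ctg7", "GTAATCCG": "ctg7", "GGGGCCCC": "ctg3"}
print(assign_contig(clone_seq, bac_ends))  # prints ctg7
```

As described above, an assignment obtained this way was treated as provisional and confirmed by a new fingerprint analysis.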
Sequencing of the first clones began at the end of November 2001 (in the case of the Monsanto clones, which had been sequenced at 5X coverage, complementary sequencing was necessary to extend the coverage to 10X). New BACs from Monsanto or CUGI were then selected using the STC strategy. In all cases, their integrity was verified by analyzing their co-linearity with neighboring BACs. The assembly of each BAC sequence is validated by comparing theoretical restriction profiles, deduced from the sequence, with those obtained experimentally. In October 2002, contigs from the genomic draft produced by Syngenta became available, which facilitated the finishing work. Data from Syngenta are integrated automatically into the BACs in the finishing stage.
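The comparison between a theoretical restriction profile and an experimental one can be sketched as an in silico digest followed by a tolerance check. The choice of enzyme (HindIII, recognition site AAGCTT, cutting after the first A), the 5% size tolerance, the treatment of the sequence as linear, and the function names are assumptions made for this example; the enzymes and criteria actually used are not specified here.

```python
import re


def digest(sequence: str, site: str = "AAGCTT", cut_offset: int = 1) -> list:
    """Return the sorted fragment lengths of a complete digest of a
    linear sequence, cutting `cut_offset` bases into each site."""
    seq = sequence.upper()
    # A lookahead pattern reports every occurrence of the site.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))


def profiles_match(predicted: list, observed: list,
                   tolerance: float = 0.05) -> bool:
    """True if both profiles have the same number of fragments and each
    observed size is within `tolerance` of the predicted size."""
    if len(predicted) != len(observed):
        return False
    return all(abs(p - o) <= tolerance * p
               for p, o in zip(sorted(predicted), sorted(observed)))


# Toy assembled sequence containing two HindIII sites.
assembled = "G" * 100 + "AAGCTT" + "C" * 200 + "AAGCTT" + "T" * 50
predicted = digest(assembled)
print(predicted)                                   # prints [55, 101, 206]
print(profiles_match(predicted, [54, 103, 204]))   # prints True
print(profiles_match(predicted, [55, 101, 150]))   # prints False
```

A mismatch in the number or size of fragments points to a misassembly, such as a collapsed repeat or a misjoined contig, in the region concerned.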
As of summer 2003, the minimum tiling path defined for chromosome 12 consists of 265 BAC clones, representing 27 Mb of unique sequence. Only 3 mapping gaps remain, corresponding to the centromere and the two telomeres. The duplicated region at the end of chromosome 11 is covered by 6 clones. All the sequences of the clones sequenced at Genoscope are submitted to the EMBL database: to the HTG section for clones in phase 1 or 2, and to the PLN section for finished clones.