NRSP-8 Aquaculture Research Progress Report for 2008
USDA NRSP-8 Aquaculture Coordinator: John Liu
USDA NRSP-8 Aquaculture Coordinator: Caird Rexroad
Progress toward Objective 1: Enhance and integrate genetic and physical maps of agriculturally important animals for cross species comparisons and sequence annotation:
Catfish: Building on previously published framework genetic linkage maps, we have mapped over 1,400 markers. These include approximately 300 EST-derived microsatellites, 72 EST-derived SNP markers, and about 1,100 BAC end-anchored microsatellites derived from ESTs. A bacterial artificial chromosome (BAC) contig-based physical map of the catfish genome was generated using four-color fluorescence-based fingerprinting. A total of 40,416 BAC clones (6.5X genome coverage) were processed, generating 34,580 fingerprints (5.6X genome coverage) for the FPC assembly of the BAC contigs. A total of 3,307 contigs were assembled. On average, each contig contains 9.25 clones and spans 291 kb. The combined size of all contigs was 0.965 Gb, approximately the genome size of channel catfish. The reliability of the contig assembly was assessed both by hybridization of gene probes to BAC clones contained in the fingerprinted assembly and by validation of randomly selected contigs using overgo probes designed from BAC end sequences. This physical map should greatly enhance genome research in catfish, particularly by aiding the identification of genomic regions containing genes underlying important performance traits. To date, a total of 63,387 BAC end sequences have been generated. From these BAC end sequences, a total of 17,640 microsatellites have been identified, including 10,860 dinucleotide repeats, 4,007 trinucleotide repeats, 2,631 tetranucleotide repeats, and 135 pentanucleotide repeats. Of the 17,640 BAC end-anchored microsatellites, 1,671 were located at the beginning of BAC end sequences and 4,392 were located at the end of BAC end sequences, making them not directly useful as markers, while 11,577 from 8,673 unique BACs have sufficient flanking sequence for microsatellite primer design. Of the 3,307 contigs, 2,208 contain at least one microsatellite.
Mapping of these microsatellite markers will facilitate the integration of the physical map with the genetic linkage map. We have characterized a fraction of the large number of BAC-anchored microsatellites. This work has been published in Aquaculture (Somridhivej et al., 2008). Over 43,000 high quality SNPs were identified from 128,000 putative SNPs derived from ESTs. In addition, deep sequencing of a reduced representation library has allowed identification of 14,424 high quality SNPs from 8,572 contigs. These SNPs are being used to develop a SNP genotyping platform in catfish.
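The microsatellite survey of BAC end sequences described above sorts repeats by motif length and by whether enough flanking sequence remains for primer design. The following Python fragment is a minimal sketch of that kind of screen, not the pipeline used in the project. The minimum repeat counts, the 50 bp flank requirement, and the function name `find_microsatellites` are all assumptions chosen for illustration.

```python
import re

# Hypothetical thresholds (not taken from the report): minimum number of
# tandem copies for each motif length, and minimum flank needed for primers.
MIN_REPEATS = {2: 8, 3: 6, 4: 5, 5: 4}
MIN_FLANK = 50


def find_microsatellites(seq, min_repeats=MIN_REPEATS, min_flank=MIN_FLANK):
    """Return di- to pentanucleotide repeats found in seq with a flank status."""
    seq = seq.upper()
    hits = []
    for motif_len, n in sorted(min_repeats.items()):
        # Group 2 is the motif; \2{n-1,} requires at least n tandem copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are really a shorter repeat, e.g. "AA" or "CACA".
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            left_flank = m.start()
            right_flank = len(seq) - m.end()
            if left_flank < min_flank:
                status = "too_close_to_start"
            elif right_flank < min_flank:
                status = "too_close_to_end"
            else:
                status = "primer_design_ok"
            hits.append(
                {
                    "motif": motif,
                    "repeats": len(m.group(1)) // motif_len,
                    "start": m.start(),
                    "end": m.end(),
                    "status": status,
                }
            )
    return hits
```

Counting the `status` values over a set of BAC end sequences would give a breakdown analogous to the three classes reported above (repeats at the beginning of a read, repeats at the end, and repeats with sufficient flanking sequence).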
Salmonids: A second generation rainbow trout genetic map ordering 1,124 microsatellite loci, spanning a sex-averaged distance of 2927.10 cM (Kosambi) with 2.6 cM resolution, was constructed by genotyping 10 parents and 150 offspring from the National Center for Cool and Cold Water Aquaculture (NCCCWA) reference family mapping panel. Microsatellite markers representing pairs of loci that resulted from an evolutionarily recent whole genome duplication event identified 180 duplicated regions within the rainbow trout genome. Microsatellites associated with genes through expressed sequence tags or bacterial artificial chromosomes produced comparative assignments with tetraodon, zebrafish, fugu, and medaka, resulting in assignments of homology for 199 loci (Rexroad et al., 2008).
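For readers unfamiliar with the units, map distances quoted in cM (Kosambi) are recombination fractions converted with the Kosambi mapping function, which allows for crossover interference. The Python sketch below shows the standard function and its inverse. It also shows that dividing the total map length by the number of loci gives a mean spacing consistent with the 2.6 cM figure. That division is an assumption about how the resolution was computed, not something the report states, and the function names are illustrative.

```python
import math


def kosambi_cm(r):
    """Convert a recombination fraction r (0 <= r < 0.5) to Kosambi centimorgans."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def recombination_fraction(d_cm):
    """Inverse Kosambi function: map distance in cM back to a recombination fraction."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


def mean_spacing_cm(total_cm, n_loci):
    """Mean marker spacing as total map length divided by the number of loci."""
    return total_cm / n_loci


if __name__ == "__main__":
    # Figures quoted in the paragraph above.
    print(round(mean_spacing_cm(2927.10, 1124), 1))
```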
A first generation BAC physical map of the rainbow trout genome was constructed using the high-information content fingerprinting (HICF) method of Luo et al. (2003; Genomics, 82, 378-389). All the clones from the Swanson YY doubled haploid male BAC library (10X coverage; 184,704 clones) were fingerprinted and edited using FPMiner software. Approximately 16% of the clones' fingerprints did not pass our editing criteria and were removed from the project. The remaining clones were assembled into physical contigs using the FingerPrinted Contigs (FPC) program with a tolerance of 5, an initial cutoff of 1E-70, and extensive manual editing. The current version of the map is composed of 154,439 clones, of which 145,060 are assembled into 4,173 contigs and 9,379 are singletons. The total number of unique fingerprinting fragments (consensus bands) in contigs is 1,185,157, which corresponds to an estimated physical length of 2.0 Gb (75% - 80% of the rainbow trout genome). The assembly was validated by 1) comparing it to the agarose gel fingerprinting contigs of Palti et al. (2004; Animal Genetics, 35:130-133), and 2) anchoring the largest contigs to the microsatellite genetic linkage map. BAC end sequences (BES) are available for approximately 100,000 clones from the same library. We are currently integrating the physical and genetic maps using microsatellites from BES of clones from the 200 largest contigs. We are also integrating the maps by screening PCR super-pools of the BAC library with over 300 markers that represent all the genetic linkage groups.
Rainbow trout and Atlantic salmon are closely related species for which major genome projects are in progress. Sequencing of the Atlantic salmon genome is scheduled to begin in early 2009. Once a draft sequence is obtained for Atlantic salmon, it may be possible to sequence much of the rainbow trout genome with 454 sequencing technology. QTL important in disease resistance appear to be conserved between the two species, so it is important to correlate the two genetic maps so that information from one species can be used in the other. This year we completed the linkage of the genetic and cytogenetic maps for Atlantic salmon using fluorescence in situ hybridization with BAC clones containing markers on the genetic map (Phillips et al., manuscript in preparation). These BAC clones have been assembled into contigs, which are available online at the cGRASP site. End sequences from these clones have been compared by BLAST against the zebrafish, medaka and stickleback genome sequences. Comparisons of the consensus genetic maps and the cytogenetic maps for rainbow trout and Atlantic salmon in our laboratory showed that large blocks of genetic markers involving whole chromosome arms are conserved in the same order in the two species. We have identified the homologous chromosome arms for all of the chromosomes in both species. Many large blocks are also conserved between the salmonids and other teleosts, although the markers are not in the same order as in salmonids.
To add single nucleotide polymorphisms (SNPs) to the growing number of genome tools in rainbow trout, we have evaluated multiple bioinformatic pipelines for their ability to detect SNPs from expressed sequence tag data. Candidate SNPs were evaluated, and 9 SNPs representing 7 genes were successfully genotyped on the broodstock panel from the NCCCWA selective breeding program. The Nichols lab at Purdue University is currently sequencing 40 BACs from physical contigs underlying QTL in rainbow trout using 454 technology. This effort aims to identify the genes underlying important developmental, growth, and life history traits in this species, and will be an important tool for comparative genomics in aquatic species.
Tilapia: Genoscope has sequenced the ends of ~20,000 BAC clones, with another ~15,000 expected in early 2009. The NIH-NHGRI funded tilapia genome sequencing project at the Broad Institute is making slow progress due to the phasing out of Sanger sequencing and the development of methods using next generation (Solexa) sequencing. As of January 2009, methods have been developed for 75 bp sequence reads and paired end reads from small fragments. Critical to the success of the project will be the development of methods for 100 bp reads and for paired end reads at distances of 4 and 40 kb. With luck, production sequencing is expected to be completed in June 2009. There are tentative plans for a meeting to discuss an initial assembly in fall 2009. Chris Amemiya has constructed another 10X BAC library from the individual tilapia being sequenced at the Broad, and the Broad recently completed end-sequencing of this library. Francis Galibert's lab has constructed an RH panel from the same homozygous clonal line being sequenced at the Broad Institute. He has just received genotypes for ~1,500 loci on 190 clones from the panel. Preliminary analyses demonstrate a good map with large syntenic blocks relative to the medaka and stickleback assemblies. Richard Crooijmans' lab has undertaken a SNP discovery project using Solexa technology. From 33 million reads they assembled 258,000 contigs and identified more than 4,000 SNPs. The intention is to genotype a large number of these SNPs on the RH panel.
Oysters: The Institute of Oceanology of the Chinese Academy of Sciences (IOCAS) and the Beijing Genomics Institute (BGI), in collaboration with the international Oyster Genome Consortium (OGC), have initiated a genome sequencing project for the Pacific oyster Crassostrea gigas. The goal of the project is to produce a draft genome sequence of the Pacific oyster that covers 95% of its genome and 98% of its genes. To date, 51.4X genome coverage has been sequenced using Solexa technology, along with 10.9X clone coverage from end sequencing of a fosmid library using traditional Sanger sequencing. A preliminary assembly yielded an N50 contig size of 1.1 kb and an N50 scaffold size of 8.9 kb. This effort allowed identification of 25,452 unique genes from oysters. The sequencing team is determining what to do next to extend the highly fragmented contigs.
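The N50 values quoted for the preliminary oyster assembly summarize how fragmented it is: N50 is the length of the contig (or scaffold) at which the cumulative length, counted from the longest sequence downward, first reaches half of the total assembly length. A minimal Python sketch of the standard calculation follows. The function name and the example lengths are illustrative and are not data from the project.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("no sequence lengths supplied")
    half_total = sum(lengths) / 2.0
    running = 0
    for length in lengths:
        running += length
        # The first length at which the running sum covers half the assembly.
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Illustrative lengths only.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A low N50 relative to typical gene length, as reported above, means that most genes will be split across several contigs, which is why the integrated maps are needed to bridge gaps.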
A JGI/Stanford project to sequence 58 BAC clones selected to contain eight genes of interest (5-9 clones for each target gene) is nearing completion, with 47 BACs fully sequenced and 11 in the finishing stages. Last year, a BAC-based physical map of the Pacific oyster genome was constructed with USDA support. The library and map will be used in the new USDA NRI Project GigaSNP (PIs: Dennis Hedgecock, Pat Gaffney, and Ximing Guo). The primary goal of this project is to provide critical resources needed to assemble the draft sequence of this highly polymorphic genome. From EST sequences produced by the Joint Genome Institute (DOE), the project will develop high quality coding SNP sequences, validate their amplification from genomic DNA, and genotype >3,000 candidate SNP markers by multiplex assays in four mapping families and a panel of oysters from diverse stocks and closely related species. A secondary goal is to identify 1,500 SNP markers that are evenly spaced across the genetic map and polymorphic enough to be broadly useful to our international community. The project will also assign 500 mapped SNPs to BAC clones by PCR, linking the genetic and physical maps, and develop a cytogenetic map by fluorescence in situ hybridization of mapped BACs and selected repetitive elements. Integrated genetic, physical, and cytogenetic maps are critical resources for bridging gaps in the draft genome sequence.
In addition to progress in Pacific oysters, significant progress has been made with eastern oysters. The laboratory of Ximing Guo (Rutgers University) has been developing a relatively high density genetic linkage map that now includes 914 markers, including 728 AFLP markers, 156 microsatellites, and SNP markers, spanning 1051 cM.
Shrimps: A BAC library with insert size of 120 Kb has been constructed in Dr. Xiang's lab of the Oceanic Institute of the Chinese Academy of Sciences, and fingerprinting is underway to construct the BAC contig-based physical map. Two fosmid libraries were constructed at Clemson University. Max Rothschild's lab at Iowa State University has identified 3,894 EST-derived SNPs and assessed their polymorphic rate. Mapping of these SNPs is underway.
Striped Bass: A medium-density linkage map for striped bass (Morone saxatilis) based on 498 microsatellite DNA markers is being constructed. These markers were developed by researchers in the Departments of Zoology (C.V. Sullivan) and Genetics (C.R. Couch) at North Carolina State University (NCSU), Kent SeaTech Corporation (M. Westerman and J. Stannard), and the USDA/ARS National Center for Cool and Cold Water Aquaculture (NCCCWA) in Kearneysville, WV (C. Rexroad III). The mapping effort is funded by the NOAA Marine Aquaculture Initiative grant program. Genotyping of 143 fish (3 parents plus 70 progeny from each of two performance tested half-sib families of striped bass) is underway at the Virginia Institute of Marine Science (VIMS). The VIMS researchers (K. Reece and J. Cordes) are engaged in initial screening of available loci using a panel of fish consisting of the three parents (2 dams and 1 sire) and 6 progeny from each of the two families (15 individuals total). One hundred loci have been multiplexed into 25 groups of 4 loci and screened using the panel. Of these, 83 have been amplified and scored for all 15 individuals in the panel, 14 have been partially scored and will be finished shortly, and the remaining three will be rerun. Loci that do not amplify the second time will be set aside and re-optimized should they be needed to reach the target of at least 325 scoreable loci. It is anticipated that genotyping on the project will be completed in 2009.
The NCSU researchers have obtained funding from N.C. Sea Grant to use the panel of available microsatellite DNA markers for a detailed genetic characterization of their special line of white bass, M. chrysops (NCSU-WB1), which has been domesticated over 8 generations in captivity, has been distributed to the ARS Stuttgart National Aquaculture Research Center (SNARC), and will be distributed to members of the Striped Bass Growers Association (see below). One objective is to identify a minimum suite of markers that can be used to discriminate between NCSU-WB1 and other captive or wild strains of white bass so that unauthorized distribution or adulteration of the line can be detected (and prevented). This genotyping effort, which may also be transferred to VIMS, should aid in applying the striped bass linkage map to selective breeding of white bass, the female parent of the hybrid striped bass (HSB) produced in commercial aquaculture. In addition, the SNARC researchers (J.A. Fuller and D. Freeman) have optimized and characterized 26 of the microsatellite markers developed for striped bass in 3 year classes of the NCSU-WB1 white bass being held at the SNARC for use in the national breeding program.
Progress toward Objective 2: Facilitate integration of genomic, transcriptional, proteomic and metabolomic approaches toward better understanding of biological mechanisms underlying economically important traits:
Catfish: One of the milestones of the past year was the completion of the EST sequencing project by JGI. The total number of catfish ESTs has now reached 493,852, including 139,475 ESTs from blue catfish and 354,377 ESTs from channel catfish. Sequence analysis at Auburn University has generated 37,912 contigs and 54,082 singletons. Major efforts to characterize immune-related genes are continuing at Auburn University, the University of Mississippi Medical Center, and USDA ARS. Six BACs covering the catfish immunoglobulin (Ig) heavy chain locus have been sequenced, and various immune-related genes have been characterized, including IgD, Ig light chain (L) sigma, IgL lambda, FcRs, the B cell accessory molecules CD79a and CD79b, the T cell coreceptor molecule CD8, and the TCR CD3 accessory molecules. In addition, as part of the US Veterinary Immune Reagent Network Grant (USDA CSREES NRI-CPG project #0206006), anti-TCRγ and anti-TCRα monoclonal antibody reagents (mAbs) are being tested.
Salmonids: We have constructed a rainbow trout high density oligo-array representing 37,394 unique TCs using Agilent's SurePrint technology. Performance of the array has been evaluated by analyzing expression of genes associated with vitellogenesis-induced muscle atrophy in rainbow trout (Salem et al., 2008). To identify miRNAs that might be important for early embryogenesis in rainbow trout, we constructed a miRNA library from a pool of unfertilized eggs and early stage embryos. Sequence analysis of random clones identified 14 miRNAs. Distinct expression patterns were observed during early embryogenesis and some miRNAs showed up-regulated expression during embryonic genome activation (Ramachandra et al., 2008).
A suggestive association was detected between MH-Ib marker haplotypes and bacterial cold water disease (BCWD) resistance in the broodstock of the National Center for Cool and Cold Water Aquaculture, Leetown, WV (Johnson et al., 2008), and this association is currently being evaluated through QTL mapping in selected families. The Nichols lab has completed a microarray study aimed at understanding genes that are differentially expressed during embryonic development. We have identified genes that are differentially expressed as a function of embryonic development rate QTL genotype, and by sex, at three time points prior to hatch in rainbow trout. Analysis of these data will provide important insights into the biological mechanisms promoting fast vs. slow development rate, and potentially into the developmental mechanisms involved in sex determination and sex-specific gene expression.
Tilapia: Kocher’s lab is completing a project to sequence 100,000 ESTs from a variety of tilapia cDNA libraries. Approximately 20 normalized libraries have been constructed, and the sequencing has been contracted to Agencourt. As of January 2009 approximately 75,000 sequences have been obtained, and these sequences confirm the low redundancy of the libraries. It is expected that all 100,000 ESTs will be deposited in GenBank by mid-2009. These sequences will complement existing EST resources for related cichlid fishes, allowing the production of 2nd generation microarrays for tilapia.
Oysters: The JGI large-scale EST sequencing project has so far produced 27,562 cDNA clone sequences from larval and adult libraries. Another production run of the adult libraries, which were made from the differentially tagged RNA of two inbred lines, is expected to yield another 48,000 clone sequences. Several hundred thousand EST sequences are being generated by 454 sequencing of two larval cDNA libraries. The European sequencing effort has generated a large number of ESTs with 26,724 unique sequences, consisting of 8,885 contigs and 17,839 singletons. A cDNA microarray containing 9,058 unigenes was designed in Europe to identify genes differentially expressed between lines selected to be resistant or sensitive to summer mortality.
Shrimps: A total of 176,198 ESTs are now available from shrimps. Approximately 130,000 EST sequences for the white shrimp, L. vannamei, all sequenced from both ends, have been delivered from JGI to the Hollings Lab in South Carolina.
Striped Bass: The NCSU researchers have established genetically and phenotypically diverse founder stocks of striped bass for selective breeding. Nearly 400 crosses were performed between 13 male and 9 female lineages of striped bass with genetic contribution from 6 geographic stocks to generate 82 full-sib families and 336 half-sib families distributed across 4 year classes (2004-2007). The 4 year classes are to be maintained as separate lines for breeding purposes. In 2008, the line of 2004 year class striped bass (line NCSU SB1) was reproduced for the first time with excellent results, and several thousand progeny are currently being subjected to performance testing of growth related traits at NCSU and the SNARC for selection of future broodstock. In 2008, the NCSU researchers also worked with 3 N.C. commercial HSB hatcheries to amplify the NCSU-WB1 line of white bass that had been domesticated at the PAFL for 7 generations, generating thousands of F8-generation advanced fingerlings, which are being performance tested at NCSU and SNARC for selection as future broodstock and for eventual distribution to industry. Also in 2008, the SNARC researchers completed detailed phenotypic characterizations of three year classes of NCSU-WB1 with respect to growth (length, weight), plasma cortisol response to low-water stress, circulating IGF-I levels, and mean egg size per female for fish raised on different broodstock diets. These striped bass and white bass will comprise the founder stocks of the National Program of Genetic Improvement and Selective Breeding for the Hybrid Striped Bass Industry (National Breeding Program), which held its 6th Annual Workshop at the World Aquaculture Society Aquaculture America 2008 Meeting in Lake Buena Vista, Florida, with partial support from NRSP-8.
Progress Toward Objective 3: Facilitate and implement bioinformatic tools to extract, analyze, store and disseminate information. (See Attachment 1 for more details on objectives.):
Catfish: John Liu's lab continues its efforts to develop comparative genome tools such as chromosome-anchored ESTs of catfish and comparative map viewers. Large scale informatic mining of microsatellites and SNPs has been completed, and databases have been developed. The CGRU will continue development and enhancement of microarray tools, and further use of arrays to identify candidate genes for disease resistance. Dr. Zhiliang Hu of Iowa State University trained our students on database construction.
Salmonids: The rainbow trout BAC physical map can be viewed and searched online through a Clemson University URL (http://www.genome.clemson.edu/activities/projects/rainbowTrout/index.shtml). The microarray data from the Nichols lab have been uploaded to GEO and will be available upon publication of the manuscript on transcriptome analysis of embryonic development rate. Moreover, once analyzed, sequence data for BACs will be deposited in GenBank for public use.
Tilapia: Dr. Kocher's group has reestablished the Gbrowse software after his move to the University of Maryland. This tool facilitates viewing of tilapia sequences on the Tetraodon and other fish genome assemblies (www.cichlidgenome.org).
Oysters: Within the EU project Aquafirst, QTL data will be hosted by the Roslin Institute on ArkDB. EST data will be held on the INRA platform "Sigenae" in Toulouse.
Shrimps: The marine genomics group at the Hollings Marine Laboratory and MUSC continues to maintain www.marinegenomics.org for archiving EST and microarray data and as a resource for online tools for the analysis of genomic and transcriptomic data. These tools are being used to archive and analyze shrimp metagenomic and microarray data. In addition, contracting with the Clemson University Genomics Institute is underway to enhance EST analysis capabilities.
Striped bass publications: