Efficient Assembly, Annotation of Catfish Transcriptome by RNA-Seq analysis of Doubled Haploid Homozygote


4 February 2013, at 12:00am

Recent advances in next-generation sequencing have enabled an array of whole genome sequencing and re-sequencing projects in both model and non-model species. Such efforts have produced a wealth of genome resources. However, thorough genome analysis is then needed to associate genome sequences with biological meaning. An important step in that analysis is to decipher the complete protein coding sequence (CDS) region of each gene.

by Lucy Towers

Authors from Auburn University, USA, look at the efficient assembly and annotation of the catfish transcriptome by RNA-Seq analysis of a doubled haploid homozygote. The aims were to develop a comprehensive set of reference transcript sequences for genome-scale gene discovery and expression studies in catfish, and to obtain a large number of full-length transcripts for whole genome annotation, identification of duplicated genes, and detection of false SNPs derived from paralogous sequence variants (PSVs) and multisite sequence variants (MSVs).

Introduction

Upon the completion of whole genome sequencing, thorough genome annotation that associates genome sequences with biological meanings is essential. Genome annotation depends on the availability of transcript information as well as orthology information.

In teleost fish, genome annotation is seriously hindered by genome duplication. Because of gene duplications, orthologies cannot be established simply by homology comparisons. Rather, intensive phylogenetic analysis or structural analysis of orthologies is required to identify genes, and both depend on full-length transcripts. Generating large numbers of full-length transcripts using traditional transcript sequencing is very difficult and extremely costly.

Results

In this work, we took advantage of a doubled haploid catfish, which has two identical sets of chromosomes and therefore, in theory, no allelic variation. As such, transcript sequences generated by next-generation sequencing can be assembled more readily into full-length transcripts. Deep sequencing of the doubled haploid channel catfish transcriptome was performed using the Illumina HiSeq 2000 platform, yielding over 300 million high-quality trimmed reads totaling 27 Gbp.
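The reasoning about allelic variation can be made concrete. In a doubled haploid, reads aligned to a correctly assembled single-copy transcript should agree at every position, so a position with substantial support for two different bases is unlikely to be true heterozygosity. The Python sketch below is only an illustration of that idea and is not the authors' pipeline: the simplified pileup input, the function name `flag_pseudo_snps` and both thresholds are assumptions chosen for the example.

```python
from collections import Counter

def flag_pseudo_snps(columns, min_depth=20, min_minor_fraction=0.2):
    """Return 0-based positions where a second base has substantial read support.

    columns: one string per contig position, holding the bases observed in
    the reads aligned at that position (a simplified, hypothetical pileup).
    The two thresholds are illustrative values, not taken from the study.
    """
    flagged = []
    for pos, bases in enumerate(columns):
        counts = Counter(b for b in bases.upper() if b in "ACGT")
        depth = sum(counts.values())
        # Skip poorly covered positions and positions with a single base.
        if depth < min_depth or len(counts) < 2:
            continue
        # Read support for the second most common base.
        minor_count = counts.most_common(2)[1][1]
        if minor_count / depth >= min_minor_fraction:
            flagged.append(pos)
    return flagged

# Toy pileup: only the second position shows two well-supported bases.
pileup = ["A" * 40, "C" * 22 + "T" * 18, "G" * 39 + "A", "T" * 5 + "C" * 5]
print(flag_pseudo_snps(pileup))  # [1]
```

In the toy input, the third position is ignored because a single discordant read looks like a sequencing error, and the fourth because its depth is too low to judge.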

Assembly of these reads generated 370,798 non-redundant transcript-derived contigs. Functional annotation of the assembly allowed identification of 25,144 unique protein-encoding genes.

A total of 2,659 unique genes were identified as putative duplicated genes in the catfish genome because the assemblies of the corresponding transcripts harbored PSVs or MSVs, which appear as pseudo-SNPs in the assembly. Of the 25,144 contigs with unique protein hits, around 20,000 matched 50% of the length of their reference proteins, and over 14,000 transcripts were identified as full-length with complete open reading frames. Characterization of the consensus sequences surrounding the start and stop codons confirmed the correct assembly of the full-length transcripts.
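To illustrate what a "complete open reading frame" means in practice, the following Python sketch scans a transcript for the longest stretch that begins with an ATG start codon and ends at an in-frame stop codon. It is a minimal example under stated assumptions, not the method used in the study: it checks the forward strand only, the function name `longest_complete_orf` and the `min_codons` cutoff are hypothetical, and it does not include the comparison against reference proteins described above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_complete_orf(seq, min_codons=30):
    """Return (start, end) of the longest forward-strand ORF, or None.

    A complete ORF here starts at ATG and ends with an in-frame stop codon;
    `end` is exclusive and includes the stop codon. `min_codons` is an
    illustrative cutoff (counting the start and stop codons).
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon since the last stop
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                long_enough = (end - start) // 3 >= min_codons
                if long_enough and (best is None or end - start > best[1] - best[0]):
                    best = (start, end)
                start = None  # look for the next ORF in this frame
    return best

# Toy transcript: 2 bp of 5' sequence, ATG, four GCT codons, TAA, 2 bp of 3' sequence.
toy = "GG" + "ATG" + "GCT" * 4 + "TAA" + "CC"
print(longest_complete_orf(toy, min_codons=3))  # (2, 20)
```

A transcript that lacks either the start or the stop codon returns `None`, which corresponds to a partial rather than full-length sequence in this simplified view.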

Conclusions

The large set of transcripts assembled in this study is the most comprehensive set of genome resources ever developed from catfish. It will provide much-needed resources for functional genome research in catfish, serving as a reference transcriptome for genome annotation and for the analysis of gene duplication, gene family structures, and digital gene expression.

The putative set of duplicated genes provides a starting point for genome-scale analysis of gene duplication in the catfish genome, and should be a valuable resource for comparative genome analysis, genome evolution, and genome function studies.

