With the advent of massively parallel next generation sequencing technologies (NGS), these types of studies have become a reality and while many of the challenges and preferred strategies are still being addressed, empirical studies are now starting to be reported. Studies of transcriptome level responses to environmental change offer an opportunity to understand the underlying genetic basis for adaptation. Such studies represent a powerful approach to assessing the genes involved in adaptation to a changing climate, particularly increasing temperatures. By profiling transcriptional changes induced by temperature stress, it is possible to identify the gene regions or pathways that are likely to be the targets of selection. This information is crucial to enable researchers to assess levels of variation across these gene regions, at a landscape scale, to predict the capacity of organisms to adapt to a warming climate.
Genes involved in physiological adaptation to temperature stress have been uncovered in many species. Heat shock proteins , alcohol dehydrogenase and lactate dehydrogenase genes have all been shown to be related to heat tolerance. In fish, the list of candidates also includes many from other gene regions related to respiration and protein binding. Apart from differences in coding regions, transcriptional regulation is also a source of variation that can potentially contribute to adaptive evolutionary change, particularly in the early stages of divergence. Studies in natural populations of gobies (Gillichthys mirabilis) have shown that short term exposure,
Buckley and Hofmann examined the extensive plasticity in Hsp induction in gobies acclimatised to different thermal backgrounds (13°C, 21°C, and 28°C).
They found that the activation temperature of the transcriptional regulator HSF1 was positively associated with the acclimatisation temperature indicating that plasticity in heat shock response is linked to plasticity in the regulatory framework governing Hsps. While adaptive plasticity is often seen as a mechanism that can slow or dampen divergent selection, it has been argued that it can also lead to rapid speciation if there are strong correlations between phenotype and environment combined with significant population structure. By examining the transcriptomic response to temperature stress we can develop a better understanding of the genes and biochemical pathways that are fundamental to physiological acclimatisation to a warming environment and gain insights into the regulatory changes that accompany adaptation over evolutionary timescales.
Australian rainbowfish are an ideal species group to test hypotheses about the genetic responses to increasing temperatures. In particular, the crimson-spotted rainbowfish (Melanotaenia duboulayi) is a subtropical freshwater species found along a north–south temperature gradient in eastern Australia. Their distribution ranges over several ecoregions which, coupled with a strong population structure and local abundance, makes them a well suited model for studying local adaptation. The ease of maintaining captive populations of rainbowfish also make them amenable to a range of laboratory-based experimental studies.
In this study we maintained groups of M. duboulayi at ambient and elevated temperature levels and then used an RNA-seq approach to assess transcriptome level changes related to temperature stress. Our aim is to provide an initial investigation of the transcriptomic response to thermal stress in rainbowfish. As such, this will allow for the screening of many more individuals via genotyping of candidate SNPs. In addition we present the first annotated transcriptome and gene catalogue for the order Atheriniformes. Our goal is to identify key candidate genes and make a first step towards understanding the important biochemical pathways on which selection is likely to act in a warming climate.
Results and Discussion
Raw Sequencing Data and Quality Statistics
The single lane of Illumina HiSeq2000 produced close to 128 million paired-end reads (2 × 101bp). After trimming and quality filtering, 12.3% of reads were discarded leaving over 224 million reads for downstream analysis (2 × 94bp). The final number of reads per individual ranged from 11.7 million to 29 million (mean = 18.7 million ± 1.4 million). The number of reads in each treatment group was well balanced with 112.3 million in the 21°C group and 112.0 million in the 33°C group (Additional file 1: Table S1). We selected the best k-mer merge range for assembly based on the distribution of assembly statistics for the individual kmer assemblies from k = 19 to k = 49 (see Table 1).
The merged assembly from a k-mer range of 21 to 39 scored best on the balance of these parameters with a N50 value of 1,856 and a total number of contigs of 107,749. While this range may exclude some rare, lowabundant transcripts, it presents a more conservative and reliable approach to differential expression testing by emphasising the accuracy of the assembly rather than the identification of low-abundant transcripts from both treatments. Annotation of the transfrags with the Blast2Go software suite resulted in 65,105 (60.4%) blast hits and 53,278 (49.4%) successfully annotated sequences.
Differential Expression Analyses
The four different combinations of mapping and DE testing produced vastly different numbers of DE transfrags (see Table 1, Figure 1). The combination of BWA alignment followed by EdgeR DE analysis identified the most with 14,076 DE transfrags, whereas Bowtie followed by DESeq identified the least with 5,577 (Figure 1). The difference between the approaches likely arises from the different characteristics of the two aligners combined with the sensitivities of the DE tests. Bowtie does not allow gapped alignments and makes use of the base quality scores , making it more conservative than BWA in the number of mapped reads.
On the other hand DESeq has also been shown to be more conservative than EdgeR when identifying DE genes from low count data which likely explains the lower number of hits in multiplex sequencing strategies such as ours. The total number of DE transfrags identified by all four approaches was 4,251 (Figure 2). We adopted a conservative approach and selected only these transfrags to blast against the reference database.
Future RNA-seq studies should assess their priorities for DE gene discovery and select the detection strategy based upon the need for identifying lowly expressed genes versus the accuracy expected given the number of replicates used . Robles et al.  showed that EdgeR could be used to detect higher numbers of DE transfrags from low count data without compromising accuracy when the number of biological replicates was at least six in each treatment group.
The Blast2GO program was able to find sequence similarities for 2,740 of the DE transfrags but could not find mapping or annotation information for a further 634 of them, leaving 2,106 DE transfrags which were successfully annotated. The top 15 matching species from the BLAST query were all fish species with the most BLAST hits being for the Nile tilapia Oreochromis niloticus with 583 matches. Duplicate gene isoforms were detected by matching identical annotated gene names from the Blast2GO output. These isoforms were then combined and reported as single “genes”.
Once isoforms were combined, there were 614 genes that were up-regulated in the high temperature treatment with 349 genes being downregulated (see Additinal file 1: Table S2a and b). For significantly down-regulated transfrags, the mean fold-change between ambient and high-temperature conditions was 4.0-fold, with a range from 55.6-fold for g2/m phase specific e3 ubiquitin-protein ligase to 2.2-fold for the Phytanoyl-peroxisomal-like protein. The mean fold-change for significantly up-regulated transfrags was 11.13, ranging from 1.98 (for the cyclin-dependent kinase 2 interacting protein) to 259-fold (for the Heat shock protein Hsp-90-like).
Ontology of Differentially Expressed Genes
Many functional classes of genes were affected by temperature stress. As expected, heat shock protein genes including HSPA4 (12.3 x), Hsp60 (6.6 x), Hsp70 (9.9 x) and Hsp90? (141.3 x) were significantly up-regulated in heat stressed fish. These transcripts are well characterised as stress inducible and have been shown, in many species, to be involved in protection against apoptosis or as a molecular chaperone under extended exposure to heat stress [15,19,20,52-56].
Further to these well-characterised stress related genes, the gene ontology analysis also identified transcripts involved in catabolism (11% of annotated sequences) and lipid metabolism (12% of annotated sequences) as being the important biological processes in the response to temperature stress (Figure 2a). As with other studies in fish, regulation of metabolic processes are clearly important parts of the heat stress response. A large proportion of the individual over-expressed genes in rainbowfish were related to oxidoreductase activity, mitochondrial components and organelle membranes. These gene categories are typically associated with increased metabolism, particularly to cope with increased temperature and the related hypoxic conditions.
Additionally we found a role for genes of the ubiquitin family and the gene 78 kDa glucose-regulated protein precursor which, similar to Quinn et al. , were upregulated in response to heat stress. Gene ontology analysis also identified biomolecular binding and catalytic activity as the major molecular functions affected by exposure to different temperature regimes (see Figure 3b). Within these broad categories, protein binding and ATP binding were the major biomolecular binding functions affected by differentially expressed transfrags with node scores of 244 and 226 respectively. For catalytic activity, transferase activity (nodescore = 53) and oxidoreductase activity were prominent (node score = 54). These functional categories, combined with electron carrier activity (node score = 63), is congruent with the expected role of aerobic respiration in response to the increased temperature.
While the Hsp genes are commonly identified as overexpressed in short-term temperature manipulation experiments, they are less likely to be targets for selection during gradual temperature shifts associated with climate change. Hsp genes represent a physiological response to sudden stressors and therefore plasticity in these traits is unlikely to be adaptive over longer timescales.
The more likely candidates for an adaptive genetic response are those genes involved in what has been termed the “cellular homeostatis response” to long-term temperature stress. Unlike stress response genes that provide an immediate early response to macromolecular damage and sudden changes in cellular redox potential, the cellular homeostatasis response involves effector proteins mediating parameter specific adaptation to environmental change.
Responses Associated With Prolonged Exposure to Heat Stress
Prolonged exposure to increased temperatures has previously been associated with gene ontologies related to protein folding, oxidative stress and immune function. Similarly, we detected significant upregulation of genes with these ontologies in the high temperature treatment such as Calnexin (2.8 x), NADH dehydrogenase (2.5 x), and glutathione S- transferase (5.1 x) suggesting long-term reallocation of energy resources.
Plasticity in the expression of these genes is more likely to be adaptive and allow localised populations to survive in a changing environment, eventually leading to divergent selection. Kassahn et al. grouped stress-response transcripts into four different clusters according to the pattern of regulation detected under short versus long-term exposure to heat stress. They suggested that long-term exposure to heat stress in a coral reef fish (310C for five days) induces expression of genes involved in development and immune function whereas genes related to metabolic function are suppressed.
Our data, from long-term exposure to heat stress in rainbowfish (330C for 14 days), support those findings. Developmental processes and metabolic processes accounted for 48% of dysregulated transfrags (Figure 3a). Immune function seems less important in our dataset and is covered by the “response to stimuli” category representing 9% of DE transfrags including the natural killer cell enhancement factor (upregulated 2.8 x). It is possible that the longer exposure to heat stress in our study allowed recovery from the immediate activation of the immune function genes.
Under simulated models of divergence with plasticity, selection is possible when plasticity is moderate, dispersal ability is low and there are no fitness costs to plasticity. It may therefore be worthwhile to focus attention on those gene regions that showed mid-range shifts in expression level in the treatment group when looking for evidence of adaptive evolution. In particular, the mid-range transfrags related to metabolic and developmental processes as well as immune function are likely to be good candidates as genes of adaptive significance for increasing temperatures (Table 2).
Rainbowfish represent ideal candidates for studies of local adaptation due to their reduced dispersal and distribution over multiple ecoregions. The suite of genes that we present here showing a response to increased temperature are a good starting point for further manipulative experiments or landscape wide surveys of genetic variation. Creating a catalogue of polymorphisms at these genes throughout the range of M. duboulayi will provide insights into the adaptive potential of this species in the face of a warming environment.
RNA-Seq Recommendations for Non-Model Taxa
The results of this study highlight the appropriateness of an RNA-seq approach for studies of adaptation (including adaptive plasticity) in non-model organisms. With the paucity of genomic resources available for most wildlife species, NGS technologies offer the best hope for unravelling the processes of evolutionary adaptation in a natural setting. Rainbowfish are evolutionarily very different from their nearest genome-enabled species, Oryzias latipes, yet in this study we were able to generate a substantial list of candidate genes involved in a response to increasing temperatures.
Over the past few years, the proliferation of software resources and validated pipelines for RNA-seq means that virtually any organism can now be the focus of ecological genomic research and this is reflected in the rapid increase in publications reporting RNA-seq analyses in non-human taxa. The limiting factors that remain now are bioinformatic expertise and incomplete reference data.
Over half of the dysregulated transfrags identified in our study were unable to be identified or were of unknown function. This continues to be a major challenge for studies of ecological and evolutionary genomics. Interpretation of genomic data lags well behind the current ability to generate that data. The limitation stems from the fact that annotation of genes of ecological interest still relies upon inferring homologies with genomic features established and developed in a few model species for non-ecological purposes. Better data integration is needed to facilitate the association of gene transcripts with specific natural conditions or phenotypic responses. Further work to characterise the function of these unknown genes via experimental studies of non-model organisms will enhance our understanding of the important biological pathways involved in responses to temperature stress and other environmental changes.
We have shown that differing mapping and DE analysis approaches lead to very different outcomes in terms of the DE genes identified. While a combination of all available approaches is preferable to identify overlap in the candidate genes detected, we found that combining output from just Bowtie mapping and DESeq significance testing with BWA mapping and DESeq significance testing delivered just 21 more DE genes than combining all four approaches tested in our study (see Figure 1). This conservative approach is an efficient way to avoid large numbers of false positives being detected in RNA-seq studies.
Temperature increases predicted over the coming decades suggests species with limited dispersal abilities will need substantial adaptive potential to avoid extinction. That adaptive potential will likely come from a number of sources including adaptive phenotypic plasticity, standing genetic variation, and newly-derived mutations. Regardless of the source, adaptation will be most important in those processes related to heat tolerance. We have presented a first insight into which processes are likely to be important in the rainbowfish, M. duboulayi. This provides a foundation for future research into temperature-driven adaptive responses in freshwater species but also invites more detailed study of the phenome-genome interaction under conditions of temperature stress.
We identified a predictable suite of heat shock genes that responded sharply to increased temperatures in the treatment group. However, we also identified transfrags related to regulation of metabolic functions and developmental processes that showed mid-range levels of dysregulation and may be stronger candidates as genes for long-term adaptation to a warming environment. We present these candidate genes as targets for ongoing research into populations representing different thermal environments throughout the species range. We also expect that these candidates will be useful targets for studies of other freshwater species experiencing long-term thermal challenges.
The expression level changes we have presented may be an example of a plastic response. To check for an adaptive component it is necessary to repeat the temperature trial on other geographically distant populations and/or sister taxa. Parallel expression level changes in these populations would indicate plasticity whereas altered responses would be suggestive of adaptation at the genome level. Such “common garden” experiments allow the disentangling of pure plastic vs. genetic responses and are ideal approaches for future research. Other avenues to explore evolutionary adaptation to increased temperatures include investigating if DNA polymorphisms are present within and between populations at the gene regions we have identified in this study. Extensions of this research to include adaptive traits from other important environmental impacts will enable a much broader understanding of how freshwater species are likely to cope with human-induced habitat and climatic change.
Further ReadingYou can view the full report and list of authors by clicking here.