Comparative analysis of differentially expressed genes between the ovaries from pregnant and nonpregnant goats using RNA-Seq

Background A multitude of genes tightly regulate ovarian follicular development and hormone secretion. These complex and coordinated biological processes are altered during pregnancy. In order to further understand the regulatory role of these genes during pregnancy, it is important to screen the differentially expressed genes (DEGs) in the ovaries of pregnant and nonpregnant mammals. To detect the genes associated with the development of pregnancy in goats, DEGs from the ovaries from pregnant and nonpregnant Anhui white goats (pAWGs and nAWGs, respectively) were analyzed using RNA sequencing technology (RNA-Seq). Results In this study, 13,676,394 and 13,549,560 clean reads were generated from pAWGs and nAWGs, respectively, and 1724 DEGs were identified between the two libraries. Compared with nAWGs, 1033 genes were upregulated and 691 genes were downregulated in pAWGs, including PGR, PRLR, STAR and CYP19A1, which play important roles in goat reproduction. Gene Ontology analysis showed that the DEGs were enriched for 49 functional GO terms. Kyoto Encyclopedia of Genes and Genomes analysis revealed that 397 DEGs were significantly enriched in 13 pathways, including “cell cycle”, “cytokine–cytokine receptor interaction” and “steroid biosynthesis”, suggesting that the genes may be associated with cell cycle regulation, follicular development and hormone secretion to regulate the reproduction process. Additionally, quantitative real-time PCR was used to verify the reliability of the RNA-Seq data. Conclusions The data obtained in this work enrich the genetic resources of goat and provide a further understanding of the complex molecular regulatory mechanisms occurring during the development of pregnancy and reproduction in goats. The DEGs screened in this study may play an important role in follicular development and hormone secretion and they would provide scientific basis for related research in the future. Electronic supplementary material The online version of this article (10.1186/s40709-019-0095-9) contains supplementary material, which is available to authorized users.


Background
The goat (Capra hircus) can provide high-quality wool, meat, and other products and it is thus an important domestic and commercial animal. However, compared with other commercial animals, goat fecundity is relatively low, which has hindered the development of the goat industry. Anhui white goats (AWGs) have been used as a model to study goat fecundity in recent years [1] because of its precocious puberty, higher fertility, and higher leather quality than other types of goats [2].
Pregnancy is one of the most complex reproductive activities of mammals and is tightly regulated by the external environment, various endocrine factors, and the expression of a large number of genes [3,4]. The ovary plays a vital role during pregnancy. There are significant differences in the activity and endocrine characteristics of the ovary during pregnancy and nonpregnancy [5]. In the nonpregnant phase, ovulation is normal and estrogen secretion dominates to maintain normal estrous function; in contrast, with the formation of the corpus luteum, ovulation is suspended during pregnancy and progesterone secretion is gradually increased to maintain pregnancy [6]. These differences between pregnancy and nonpregnancy may be associated with the distinct regulation of gene expression, which is implicated in various reproductive activities, including follicular development [7,8], hormone secretion [9], ovulation [10], luteinization [11], and pregnancy maintenance [12].
Many studies have indicated that the differential expression of ovarian mRNA hormone receptor genes (follicle-stimulating hormone receptor [FSHR] and luteinizing hormone/choriogonadotropin receptor [LHR]) and growth factor genes (insulin-like growth factor 1 [IGF-I] and bone morphogenetic protein 6 [BMP6]) during follicular development significantly impact the development of preantral follicles [13][14][15]. Pramod et al. showed that the differential expression of bone morphogenetic protein 15 (BMP-15) and growth/differentiation factor (GDF-9) might be involved in higher prolificacy in goats [16]. Suppression subtractive hybridization has been used to screen differentially expressed genes (DEGs) in ovarian tissues between polytocous and monotocous goats, obtaining 29 differentially expressed sequence tags [17]. Microarray analysis has been used to analyze gene expression during early folliculogenesis in goats, and 2466 genes and three cell death-related pathways were explored [18]. In 2014, the different expression of genes in the ovaries from uniparous and multiparous AWGs were analyzed by high-throughput sequencing. In total, 2201 DEGs were identified, and 12 genes that may be associated with AWG high prolificacy were explored [19]. However, the information on the functions of specific genes in regulating goat pregnancy and reproduction is insufficient.
RNA sequencing technology (RNA-Seq) is mainly used for quantitative gene expression analysis of biological processes in a particular tissue or cells in specific species. With the development of high-throughput technologies, RNA-Seq has been extensively used to study specific genes in numerous species, including mice, pig, cattle, human, sheep, and goat [19][20][21][22][23][24]. It was reported that 338 genes were upregulated in the Jining grey goats and 404 were upregulated in Laiwu blank goats, and these genes might be involved in the regulation of goat fecundity and prolificacy [25]. Ling et al. used RNA-Seq to study the DEGs between the ovaries of uniparous and multiparous goats, and identified candidate genes for goat prolificacy [19].
In the present study, we analyzed the gene expression levels in ovaries from pregnant and nonpregnant AWGs (pAWGs and nAWGs, respectively) using an Ion Proton ™ RNA-Seq platform. The results shed further light on the role of specific genes in reproductive biological processes, including hormone secretion, ovulation, luteinization formation, and pregnancy maintenance, which will help to identify genes that could potentially be used to regulate hircine reproduction and breeding practices in the future.

Animal materials
In the present study, AWGs raised under identical feeding conditions at the College of Animal Science and Technology (Anhui Agricultural University, Hefei, China) were used as animal samples. Six ewes, which were 2 years old and had a similar appearance, were selected from two groups for this study. Three of them had been pregnant for 3 months, and the rest were not pregnant and were undergoing anestrus, which was determined by hormone levels. After slaughter, both whole ovaries from each ewe were immediately collected and frozen in liquid nitrogen, followed by storage at − 80 °C. Half of the ovaries were used for RNA-Seq, and the other half were used for real-time PCR. All experimental procedures involving AWGs had been approved by the ethics committee of Anhui Agricultural University, Anhui, China (permit no. AHAU20101025).

Total RNA extraction and analysis for RNA-Seq
To minimize differences among individuals, one ovary from each pair of ovaries from the three nAWGs and three pAWGs were grouped for RNA extraction using RNAiso Plus (Takara, Dalian, China) following the manufacturer's protocol. Briefly, the correct amount of frozen tissue sample was dissolved in 1 ml RNAiso Plus. After incubation at room temperature for 5 min, the homogenate was centrifuged at 12,000g for 5 min at 4 °C. The supernatant was then transferred to a new centrifuge tube. Chloroform (1/5 volume of RNAiso Plus) was then added to the supernatant. The solution was fully emulsified and incubated for 5 min at room temperature before centrifugation at 12,000g for 15 min at 4 °C. The supernatant was then carefully transferred to another centrifuge tube. An equal amount of isopropanol was added to the supernatant. The solution was fully mixed and incubated for 10 min at room temperature before an additional centrifugation at 12,000g for 10 min at 4 °C. Then, 1 ml of 75% ethanol was slowly added to the white precipitate, which was centrifuged at 12,000g for 5 min at 4 °C. Finally, the supernatant was discarded, and the precipitate was dried and dissolved in RNase-free ddH 2 O.
The quantity and quality of the total RNA were measured using the Agilent 2100 Bioanalyzer system (Santa Clara, CA), and the samples were stored at − 80 °C until analysis.

Library preparation and sequencing
Two groups of total RNA were used for library preparation and sequencing by pooling equal quantities (10 μg) of total RNA isolated from 3 individual pregnant or 3 individual nonpregnant goat ovaries. Briefly, the total RNA samples were treated with RNase-free DNase I (Ambion Inc., Austin, TX) to degrade any possible genomic DNA contamination. The digested products were then purified by magnetic beads. Next, the mRNA was enriched using oligo (dT) magnetic beads (for eukaryotes) and fragmented into short fragments (approximately 200 bp) using fragmentation buffer. First-strand cDNA was then synthesized with random hexamer primers using the mRNA fragments as templates. Buffer, dNTPs, RNase H, and DNA polymerase I were added to synthesize the second strand cDNA. The doublestranded cDNA was purified with the QIAQuick PCR extraction kit (Qiagen, Germany) and eluted with elution buffer for end repair and poly (A) addition. Sequencing adapters were then ligated to the 5′ and 3′ ends of the fragments. Next, the ligation products were size selected and purified by agarose gel electrophoresis. Finally, the fragments were enriched by PCR amplification (95 °C for 10 min, followed by 40 cycles at 95 °C for 15 s and 60 °C for 45 s), purified by magnetic beads, and dissolved in the appropriate amount of elution buffer. During the quality control steps, an Agilent 2100 Bioanalyzer (Agilent, USA) was used for qualification and quantification of the sample library. Finally, the cDNA libraries were sequenced using an Ion Proton platform (Life Technologies, USA) at the Beijing Genomics Institute (BGI) in Shenzhen, China.

Sequencing data analysis
According to the requirements of this experiment and the principles of standard bioinformatics analysis, some contaminant reads, such as low-quality reads, adaptor sequences and reads with a length less than the threshold (30 bp), should be removed from the raw reads [26]. After filtering, the valid sequences, also known as the clean reads, were retained for further analysis. To identify the gene expression patterns in each genotype from the two libraries, the clean reads were mapped to goat reference gene sequences (http://goat.kiz.ac.cn/GGD/downl oad.htm) and goat reference genome sequences (http:// goat.kiz.ac.cn/GGD/downl oad.htm) using SOAPaligner/ SOAP2 [27]. No more than two mismatches were allowed in the alignment.

Identification of DEGs
To identify the DEGs between pAWGs and nAWGs, the gene expression levels were calculated using the reads per kilobase per million reads (RPKM) as follows [20]: where RPKM (A) is the expression level of gene A, C is the number of reads uniquely aligning to gene A, N is the total number of reads uniquely aligning to all genes, and L is the number of bases in gene A. DEGs were identified using a strict algorithm developed by BGI based on the method described by Audic and Claverie [28]. To compare the differential expression of the two samples, fold changes and p values were used, which calculated from the normalized expression using the following formulas: Formula for the p value: N 1 and x represent the total clean reads count and the normalized expression level of a given gene in the nonpregnant ovary library, respectively. N 2 and y represent the total clean reads count and the normalized expression level of a given gene in the pregnant ovary library, respectively. The false discovery rate (FDR) was used to determine the threshold p values in the multiple tests and perform analyses through the manipulation of the FDR values. In the present study, FDR ≤ 0.001 and |log 2 Ratio| ≥ 1 were used as the thresholds to judge the significance of the gene expression differences [29].

Real-time quantitative PCR
Eight DEGs were randomly selected to validate the RNA-Seq (quantification) data through real-time quantitative PCR (qPCR). The primers were designed using the Primer Express 5.0 software (ABI, China) ( Table 1). GAPDH was used as a reference gene for data standardization. The left three ovaries from each nAWG or pAWG were pooled for total RNA extraction as described above. First-strand cDNA was synthesized using 1 μg of total RNA and the First-Strand cDNA Synthesis Kit (Toyobo, Japan where N is the number of genes with GO annotations, n is the number of DEGs in N, M is the number of genes that are annotated to specific GO terms, and m is the number of DEGs in M. The calculated p value underwent Bonferroni correction using a corrected p value ≤ 0.05 as a threshold. GO terms fulfilling this condition were defined as significantly enriched GO terms for the DEGs. This analysis is able to recognize the main biological functions performed by the DEGs. Using nr annotation, the gene annotation of the DEGs was conducted using Blast2GO [31]. After obtaining the GO annotation for the DEGs, Web Gene Ontology Annotation Plot (WEGO) software was used to perform GO functional classification of the DEGs and understand the distribution of the gene functions from the species at the macro level [32].
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is the major public pathway-related database [33]. Pathway enrichment analysis identified significantly enriched metabolic pathways or signal transduction pathways for the DEGs compared with the whole genome background. The formula for calculating the pathways is the same as for the GO analysis, with N as the number of all genes with KEGG annotations, n as the number of DEGs in N, M as the number of genes annotated to specific pathways, and m as the number of

Overview of sequencing data
In this study, we generated cDNA libraries of ovaries from pregnant and nonpregnant goat (Capra hircus) (Fig. 1). Approximately 14.39 Gb reads, including 7.12 Gb from pAWGs and 7.27 Gb from nAWGs, were obtained through the sequencing of two ovary cDNA libraries from pAWGs and nAWGs using the Ion Proton sequencing platform at BGI in Shenzhen, China. After discarding sequences shorter than 30 bp, filtering the adaptor sequences, and eliminating low-quality sequences from the raw data, 13,676,394 and 13,549,560 clean reads were ultimately obtained from the ovary libraries for pAWGs and nAWGs, respectively, and were retained for further analysis. The clean reads from each library were sufficient for the quantitative analysis of gene expression. According to the mapping analysis, approximately 98% of the reads (98.01% for nAWGs and 97.95% for pAWGs) could be mapped to the goat genome sequence. When the unique matches were regarded, slightly more than 92% (92.50% for nAWGs and 92.72% for pAWGs) of the reads corresponded to the genome, and approximately 45% of the reads (45.03% for nAWGs and 43.47% for pAWGs) could be perfectly matched to the genome.
When the reference genes were used to analyze the data, only 75.91% of the reads in nAWGs and 75.64% of the reads in pAWGs could be mapped to reference genes. Regarding the unique matches, 54.14% of the reads in nAWGs and the 54.54% in pAWGs corresponded to the reference genes. Additionally, 37.83% of the reads in nAWGs and 36.43% of the reads in pAWGs could be perfectly matched to the reference genes. The major characteristics of the two libraries are described in Additional file 1.

Identification and analysis of the DEGs in the ovaries from pregnant and nonpregnant goats
In total, 20,651 genes were detected in the study, among which 17,181 genes were co-expressed in the two libraries and 2002 and 1468 genes were specifically expressed in the ovary libraries from pAWGs and nAWGs, respectively (Additional file 2).
Analysis of the gene coverage distribution indicated that 16% of genes had 70-90% coverage, while 26% of genes in the nAWG and 27% of genes in the pAWG libraries had 90-100% coverage (Additional file 3: Fig.  S1), which means the read distributions were similar between the two libraries.
To detect pregnancy-related genes, the expression levels of all the detected genes from pAWGs and nAWGs were calculated using the RPKM method. An FDR ≤ 0.001 and the absolute value of the log 2 ratio ≥ 1 were used as criteria to judge the significant differences in the expressed genes. A total of 1724 DEGs were detected between the nAWG and pAWG libraries, with 1033 upregulated genes and 691 downregulated genes in pAWGs compared to nAWGs (Additional file 4: Fig. S2 and Additional file 5). Some of the DEGs were related to reproduction and reproductive processes, such as progesterone receptor (PGR), prolactin receptor precursor (PRLR), steroidogenic acute regulatory protein (STAR ), cytochrome P450 19A1 (CYP19A1) and so on ( Table 2). Among the DEGs, 15 genes and 71 genes were found to be specifically expressed in the nAWG and pAWG libraries, respectively. Among the specifically expressed DEGs, we screened four DEGs related to reproduction, including two genes [beta-defensin 112 (DEFB112) and interferon epsilon (IFNE)] that were specifically expressed in nAWGs and two genes [somatostatin precursor (SST) and somatostatin receptor type 1 (SSTR1)] that were specifically expressed in pAWGs.

QPCR validation
The expression profiles of eight randomly selected DEGs were validated using qPCR analysis. As shown in Fig. 2, the expression of SST, SLC1A2, TSPAN1, and OSTF1 was upregulated, whereas the expression of DEFB112, UGDH, APOA2, and FADS1 was downregulated in pAWGs compared with nAWGs. These expression patterns were largely consistent with the RNA-Seq (Quantification) results, which means that the RNA-Seq results were reliable and that RNA-Seq could be used to reliably and accurately perform mRNA differential expression analysis.

GO functional enrichment and KEGG pathway analysis of the DEGs
GO is an international standardized classification system for gene function that supplies a set of controlled terms to comprehensively describe the properties of genes and gene products. In our present study, GO function enrichment was used to screen the DEGs that were significantly associated with different biological functions. The well-annotated gene sequences were assigned to 23, 15, and 11 functional groups in the "biological process", "cellular component", and "molecular function" categories, respectively (Additional file 6). Many of the enriched GO terms are reproduction-related, such as "signaling", "reproduction", "reproduction process", "developmental process", "growth", "organelle", "receptor activity", "biological regulation", "regulation of biological process", "molecular transducer activity" and "enzyme regulator activity" (Fig. 3). The results are summarized for the following three main categories: "biological process", "cellular component", and "molecular function". Among these groups, the GO terms "cellular process", "binding", "cell", and "cell part" predominated in each of the three main categories. The reproduction-related GO terms were underlined These GO terms were mostly associated with cell proliferation, apoptosis, follicular development, hormone secretion, and pregnancy maintenance. Among the reproduction-related GO terms, some important genes related to reproduction were detected, including PGR, CYP19A1, CCNB1, STAR , SST, SSTR1, estrogen receptor beta 2 (ESR2), BMP receptor 1B (BMPR1B) and erythropoietin receptor (EPOR). Analysis of the KEGG pathway enrichment revealed that the 1413 DEGs participated in 242 pathways (Additional file 7), among which the largest category was "metabolic pathways" (ko01100, 11.18%), which had 158 annotated genes. Furthermore, 397 DEGs were significantly enriched (p ≤ 0.05) in 13 pathways, several of which were related to reproduction, such as "cytokine-cytokine receptor interaction", "cell cycle", and "steroid biosynthesis" (Figs. 4 and 5, Additional file 8: Figure S3, Additional file 9: Figure S4, Additional file 10: Figure S5 and Table 3). And these DEGs up-and down-regulated also shown in Fig. 5. The significantly enriched DEGs, including PRLR, BMPR1B, EPOR, kit Hardy (KIT), interleukin 1 beta (IL1B), cyclin A2 (CCNA2), cyclin B1 (CCNB1), and proliferating cell nuclear antigen (PCNA), have been reported to be involved in follicular development, hormone secretion, luteinization, and pregnancy maintenance.

Discussion
We successfully identified the gene expression profiles of ovarian tissues from pregnant and nonpregnant AWGs using RNA-Seq (Quantification) and analyzed the gene expression differences between the two libraries. Furthermore, the GO enrichment and KEGG pathways for the DEGs were analyzed.
In our present study, 1724 genes were found to be significantly differentially expressed between the two libraries. Among them, a large number of DEGs were associated with prolificacy processes, follicular development, ovulation control, hormone secretion, and luteinization or pregnancy maintenance. For example, genes previously associated with reproduction, including PRLR, LHR, STAR , SPP1, BMP6, 3β-hydroxysteroid dehydrogenase estradiol (3BHSD), 17-beta-dehydrogenase 12 (HSD17B12), and BMPR1B, were found to be upregulated in pAWGs compared to nAWGs. The upregulated expression of PRLR mRNA during pregnancy may be associated with corpus luteum function for maintaining pregnancy [34]. LHR is a specific membrane receptor on the granulosa and theca cells that binds to luteinizing hormone (LH), resulting in androgen and progesterone production. LHR may impact follicular development from the primary follicle stage onwards [35]. 3BHSD plays a crucial role in the biosynthesis of all classes of hormonal steroids  [36]. HSD17B12 plays an important role in female fertility [37]. STAR , which is the key rate-limiting enzyme in the production of steroid hormones, is involved in the regulation of steroid hormone biosynthesis and follicular development in the mammalian ovary [9]. SPP1, which is also known as OPN (osteopontin), is localized   26:3 to luteal cells during the luteal phase of the estrous cycle and is involved in the development and regression of the corpus luteum [11]. Previous research has indicated that BMP6 may regulate ovary granulosa cell steroidogenesis, ovarian hormone secretion, follicular growth, and luteinization [38,39]. A BMPRIB mutation may decrease the ability of BMP to inhibit differentiation, increase INHBA (inhibin beta A) mRNA expression in granular cells, and enhance the responsiveness of granule cells to gonadotropin [40]. PGR, FSHR, CYP19A1, ESR2, INHA, NHBA, estradiol 17-beta-dehydrogenase 1 (HSD17B1), and inhibin beta B (INHBB) were found to be downregulated in pAWG. Recent studies showed that PGR, ESR2, FSHR, CYP19A1, HSD17B1, INHA, INHBA and INHBB played a very important role in hormone secretion, follicle growth, cell proliferation and apoptosis [41][42][43].
Our data showed some uniquely transcribed genes in nAWGs and pAWGs. Among these specifically expressed genes, two genes (DEFB112 and IFNE) from nAWGs and two genes (SST and SSTR1) from pAWGs may be involved in related events during pregnancy. DEFB112 is a type of B-defensin that is preferentially expressed in the male reproductive system and actively involved in sperm maturation and capacitation but is poorly documented. Further study is required to determine whether DEFB112 is involved in the reproductive process in female animals [44]. IFNE is a novel type I IFN that has been detected in the lung, reproductive tissue, intestine and so on. IFNE may protect the female reproductive tract from viral and bacterial infection. Its expression was found to be at the lowest levels during diestrus and the highest levels at estrus [45]. IFNE can activate the JAK-STAT signaling pathway, which plays a very important role in reproduction [46]. SST is a 14-amino acid polypeptide with a short half-life that may inhibit the secretion of various hormones including GH and LH. SSTR1 is somatostatin receptor subtype 1. SST and its receptors are present in the ovary in various species. It was reported that SSTR2 and 5 activation modulate ovarian steroidogenesis by upregulating endogenous BMP activity in growing follicles [47]. Previous studies have shown that SSTR1 inhibits cAMP accumulation by coupling to pertussis toxin-sensitive G proteins [48]. However, the specific functions of SSTR1 are unclear and require further study.
Regarding GO enrichment analysis, we found that some of the correlated genes were enriched into reproduction-related GO terms. For example, CYP19A1, ESR2, PGR INHA, INHBA and STAR were enriched into GO terms for reproduction, reproduction process, molecular transducer activity, and transporter activity. The results indicated that these genes may play a specific role in goat reproduction. For the uniquely transcribed genes, the two genes (ENC1 and RGS17) from nAWGs were enriched for signaling, enzyme regulator activity, and molecular transducer activity GO terms, while the six genes (DCDC2, TBC1D15, PTHRP, SLC1A2, RDH16 and SST) from pAWGs were enriched for biological regulation, enzyme regulator activity, growth, metabolic process, and response to stimulus GO terms. Our data indicated that these genes may be related to cell proliferation and hormone secretion.
Three pathways, including "cytokine-cytokine receptor interaction", "steroid biosynthesis" and "cell cycle", were selected from the 13 significantly enriched pathways and were found to be intimately associated with mammalian reproduction. Most of the DEGs participating in cytokine-cytokine receptor interaction and steroid biosynthesis pathways were upregulated, including EPOR, KIT, IL1B, PRLR and BMPR1B, indicating that these pathways are activated in the ovaries of pregnant goats. It was shown that EPOR and IL1B might play a role in establishing pregnancy [49,50]. KIT plays a very important role in the maintenance of the primordial follicle reserve and in primary to secondary follicle transition [51]. Most DEGs were downregulated in the cell cycle pathway, such as CCNA2, CCNB1 and so on, and only one gene (cyclin D1, CCND1) was unregulated. CCNA2 regulates cell cycle progression by interacting with CDK kinases and is involved in the G2/M transition. CCNB1 is a regulatory protein involved in mitosis. Research indicates that the downregulated expression of CCNA2 may promote primordial follicle activation, while CCNB1 plays a significant role in promoting the maturation of large-follicle oocytes [52,53]. CCND1 is a protein required for progression through G1 phase of the cell cycle and its main function is to promote cell proliferation [54]. It can be inferred that these genes may be involved in cell cycle regulation, granulosa cell proliferation and follicular development.
Furthermore, some other signaling pathways associated with reproduction were also identified in our current analysis. These pathways included the ECM-receptor interaction, focal adhesion, and Jak-STAT, Wnt, insulin, and p53 signaling pathways, suggesting that some molecules in these pathways might also be involved in goat reproduction or maintaining the physiological activity of the ovary. However, the exact relationship between these signaling networks is not fully understood and further research is needed. Generally, the current KEGG pathway analyses have provided a solid basis for future work.

Conclusion
In conclusion, this study presents the transcriptomic profiles of ovarian tissue from pregnant and nonpregnant AWGs using RNA-Seq technology. 1724 genes were detected to be significantly differentially expressed