Within the CPID group, there were seven short inhibitor genes that lack the enzymatic portion of the protein. A similar trend of “stand-alone inhibitors” has been observed in other insects, such as B. mori. These CPID genes may be involved in the regulation of cysteine peptidases. We note that we found multiple fragments of cysteine peptidase genes, suggesting that the current list of L. decemlineata genes may be incomplete. Comparison of these findings with previous data on L. decemlineata cysteine peptidases demonstrates that intestains correspond to several peptidase genes from the uL1 and uL2 groups. These data, as well as literature for Tenebrionidae beetles, suggest that intensive gene expansion is typical for peptidases that are involved in digestion. We also found a high number of digestion-related serine peptidase genes in the L. decemlineata genome, but they contribute only a small proportion of the beetle’s total gut proteolytic activity. Of the 31 identified serine peptidase genes and fragments, we annotated 16 as trypsin-like peptidases and 15 as chymotrypsin-like peptidases. For four chymotrypsin-like and one trypsin-like peptidase, we identified only short fragments. All complete sequences have distinctive S1A peptidase subfamily motifs, a conserved catalytic triad, conserved sequence residues such as the “CWC” sequence, and cysteines that form disulfide bonds in the chymotrypsin protease fold. The number of serine peptidases was higher than expected based upon the number of previously identified EST clones, but lower than the number of chymotrypsin and trypsin genes in the T. castaneum genome.
The enzymes that assemble and degrade oligo- and polysaccharides, collectively termed carbohydrate-active enzymes (CAZy), are categorized into five major classes: glycoside hydrolases (GHs), polysaccharide lyases, carbohydrate esterases (CEs), glycosyltransferases (GTs) and various auxiliary oxidative enzymes. Due to the many different roles of carbohydrates, the CAZy family profile of an organism can provide insight into its “glycobiological potential” and, in particular, its mechanisms of carbon acquisition. We identified 182 GHs assigned to 25 families, 181 GTs assigned to 41 families, and two CEs assigned to two families in L. decemlineata; additionally, 99 carbohydrate-binding modules were present and assigned to 9 families. We found that L. decemlineata has three families of genes associated with plant cell wall carbohydrate digestion that commonly contain enzymes that target pectin and cellulose, the major structural components of leaves. We found evidence of massive gene duplications in the GH28 family and GH45 family, whereas GH48 is represented by only three genes in the genome. Overall, the genome of L. decemlineata shows a CAZy profile adapted to metabolize pectin and cellulose contained in leaf cell walls. The absence of specific members of the families GH43 and GH78 suggests that L. decemlineata can break down homogalacturonan, but not substituted galacturonans such as rhamnogalacturonan I or II. The acquisition of these plant cell wall degrading enzymes has been linked to horizontal transfer in the leaf beetles and other phytophagous beetles, with strong phylogenetic evidence supporting the transfer of GH28 genes from a fungal donor in L. decemlineata, as well as in the beetles D. ponderosae and Hypothenemus hampei, but a novel fungal donor in the more closely related cerambycid beetle A. glabripennis and a bacterial donor in the weevil Callosobruchus maculatus.
To understand the functional genomic properties of insecticide resistance, we examined genes important to neuromuscular target site sensitivity, tissue penetration, and prominent gene families involved in Phase I, II, and III metabolic detoxification of xenobiotics.
These include the cation-gated nicotinic acetylcholine receptors (nAChRs), the γ-aminobutyric acid (GABA)-gated anion channels and the histamine-gated chloride channels (HisCls), cuticular proteins, cytochrome P450 monooxygenases (CYPs), and the glutathione S-transferases (GSTs). Many of the major classes of insecticides disrupt the nervous system, causing paralysis and death. Resistance to insecticides can come from point mutations that reduce the affinity of insecticidal toxins to ligand-gated ion channel superfamily genes. The cys-loop ligand-gated ion channel gene superfamily comprises receptors involved in mediating synaptic ion flow during neurotransmission. A total of 22 cys-loop ligand-gated ion channels were identified in the L. decemlineata genome, in numbers similar to those observed in other insects, including 12 nAChRs, three GABA receptors, and two HisCls. The GABA-gated chloride channel homolog of the Resistance to dieldrin (Rdl) gene of D. melanogaster was examined due to its role in resistance to dieldrin and other cyclodienes in Diptera. The coding sequence is organized into 10 exons on a single scaffold, with duplications of the third and sixth exons. Alternative splicing of these two exons encodes four different polypeptides in D. melanogaster, and as the splice junctions are present in L. decemlineata, we expect the same diversity of Rdl. The point mutations in the transmembrane regions TM2 and TM3 of Rdl are known to cause insecticide resistance in Diptera, but were not observed in L. decemlineata. Cuticle genes have been implicated in imidacloprid-resistant L. decemlineata, and at least one has been shown to have phenotypic effects on resistance traits following RNAi knockdown. A total of 163 putative cuticle protein genes were identified and assigned to one of seven families. Similar to other insects, the CPR family, with the RR-1, RR-2, and unclassifiable types, constituted the largest group of cuticle protein genes in the L. decemlineata genome.
While the number of genes in L. decemlineata is slightly higher than in T. castaneum, it is similar to D. melanogaster. Numbers in the CPAP1, CPAP3, CPF, and TWDL families were similar to other insects, and notably no genes with the conserved sequences for CPLCA were detected in L. decemlineata, although they are found in other Coleoptera. A total of 89 CYP genes were identified in the L. decemlineata genome, an overall decrease relative to T. castaneum. Due to their role in insecticide resistance in L. decemlineata and other insects, we examined the CYP6 and CYP12 families in particular. Relative to T. castaneum, we observed reductions in the CYP6BQ, CYP4BN, and CYP4Q subfamilies. However, five new subfamilies were identified in L. decemlineata that were absent in T. castaneum, and the CYP12 family contains three genes as opposed to one gene in T. castaneum. We found several additional CYP genes not present in T. castaneum, including CYP413A1, CYP421A1, CYP4V2, CYP12J and CYP12J4. Genes in CYP4, CYP6, and CYP9 are known to be involved in detoxification of plant allelochemicals as well as resistance to pesticides through their constitutive overexpression and/or inducible expression in imidacloprid-resistant L. decemlineata. GSTs have been implicated in resistance to organophosphate, organochlorine and pyrethroid insecticides and are responsive to insecticide treatments in L. decemlineata. A total of 27 GSTs were present in the L. decemlineata genome, and while they represent an expansion relative to A. glabripennis, all have corresponding homologs in T. castaneum. The cytosolic GSTs include the epsilon, delta, omega, theta and sigma families, while two GSTs are microsomal. Several GST-like genes present in the L. decemlineata genome represent the Z class previously identified using transcriptome data.
RNA interference (RNAi) is the process by which small non-coding RNAs trigger sequence-specific gene silencing, and is important in protecting against viruses and mobile genetic elements, as well as regulating gene expression during cellular development. The application of exogenous double-stranded RNA (dsRNA) has been exploited as a tool to suppress gene expression for functional genetic studies and for pest control. We annotated a total of 49 genes associated with RNA interference, most of which were found on a single scaffold. All genes from the core RNAi machinery were present in L. decemlineata, including fifteen genes encoding components of the RNA Induced Silencing Complex (RISC) and genes known to be involved in double-stranded RNA uptake, transport, and degradation. A complete gene model was annotated for R2D2, an essential component of the siRNA pathway that interacts with dicer-2 to load siRNAs into the RISC, and which was not previously detected in the transcriptome of the L. decemlineata mid-gut. The core components of the small interfering RNA pathway were duplicated, including dicer-2, an RNase III enzyme that cleaves dsRNAs and pre-miRNAs into siRNAs and miRNAs, respectively. The dicer-2a and dicer-2b CDS have 60% nucleotide identity to each other, and 56% and 54% identity to the T. castaneum dicer-2 homolog, respectively. The argonaute-2 gene, which plays a key role in RISC by binding small non-coding RNAs, was also duplicated. A detailed analysis of these genes will be necessary to determine if the duplications provide functional redundancy. The duplication of genes in the siRNA pathway may play a role in the high sensitivity of L. decemlineata to RNAi knockdown and could benefit future efforts to develop RNAi as a pest management technology.
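Pairwise nucleotide identities such as those reported for the dicer-2 coding sequences are typically computed from an alignment. The following is a minimal sketch, assuming the two CDS have already been aligned and padded with "-" gap characters. The function name, the choice to exclude gap columns from the denominator, and the toy sequences are illustrative assumptions; the text does not state how identity was calculated.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    Assumption: columns where either sequence has a gap ('-') are
    excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy (hypothetical) aligned fragments, not real dicer-2 sequence.
toy_a = "ATGGCC-TAA"
toy_b = "ATGACCGTAA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions (for example, counting gap columns as mismatches) give lower values for the same alignment, so the convention should be reported alongside any identity figure.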
The whole-genome sequence of L. decemlineata provides novel insights into one of the most diverse animal taxa, Chrysomelidae. It is amongst the largest beetle genomes sequenced to date, with a minimum assembly size of 640 Mb and 24,740 genes. The genome size is driven in part by a large number of transposable element families, which comprise at least 17% of the genome and appear to be rapidly expanding relative to other beetles. Population genetic analyses suggest high levels of nucleotide diversity, local geographic structure, and evidence of recent population growth, which helps to explain how L. decemlineata rapidly evolves to exploit novel host plants and climate space, and to overcome a range of pest management practices. Digestive enzymes, in particular the cysteine peptidases and carbohydrate-active enzymes, show evidence of gene expansion and elevated expression in gut tissues, suggesting the diversity of the genes is a key trait in the beetle’s phytophagous lifestyle. Additionally, expansions of the gustatory receptor subfamily for bitter tasting might be a key adaptation to exploiting hosts in the nightshade family, Solanaceae, while expansions of novel subfamilies of CYP and GST proteins are consistent with rapid, lineage-specific turnover of genes implicated in L. decemlineata’s capacity for insecticide resistance. Finally, L. decemlineata has interesting duplications in RNAi genes that might increase its sensitivity to RNAi and provide a promising new avenue for pesticide development. The L. decemlineata genome promises new opportunities to investigate the ecology, evolution, and management of this species, and to leverage genomic technologies in developing sustainable methods of pest control.
Previous cytological work determined that L. decemlineata is diploid and consists of 34 autosomes plus an XO system in males, or an XX system in females.
Twelve chromosomes are submetacentric, while three are acrocentric and two are metacentric, although one chromosome is heteromorphic in pest populations. The genome size has been estimated with Feulgen densitometry at 0.46 pg, or approximately 460 Mb. To generate a reference genome sequence, DNA was obtained from a single adult female, sampled from an imidacloprid-resistant strain developed from insects collected from a potato field in Long Island, NY. Additionally, whole-body RNA was extracted for one male and one female from the same imidacloprid-resistant strain. Raw RNAseq reads for 8 different populations were obtained from previous experiments: two Wisconsin populations, a Michigan population, a lab strain originating from a New Jersey field population, and three samples from European populations. All RNAseq data came from pooled populations or were combined into a population sample from individual reads. In addition, RNA samples of an adult male and female from the same New Jersey population were sequenced separately using Illumina HiSeq 2000 as 100 bp paired-end reads, and three samples from the mid-gut of 4th-instar larvae were sequenced using SOLiD 5500 Genetic Analyzer as 50 bp single-end reads.
Four Illumina sequencing libraries were prepared, with insert sizes of 180 bp, 500 bp, 3 kb, and 8 kb, and sequenced with 100 bp paired-end reads on the Illumina HiSeq 2000 platform at an estimated 40x coverage, except for the 8 kb library, which was sequenced at an estimated 20x coverage. ALLPATHS-LG v35218 was used to assemble reads. Two approaches were used to scaffold contigs and close gaps in the genome assembly. The reference genome used in downstream analyses was generated with ATLAS-LINK v1.0 and ATLAS GAP-FILL v2.2. In the second approach, REDUNDANS was used, as it is optimized to deal with heterozygous samples. The raw sequence data and L. decemlineata genome have been deposited in the GenBank/EMBL/DDBJ database.
We estimated the size of our genome from a 19 bp k-mer distribution plot generated in JELLYFISH from the 100 bp paired-end reads of the 180 bp insert library, and corrected for ploidy. Automated gene prediction and annotation were performed using MAKER v2.0, using RNAseq evidence and arthropod protein databases.
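The k-mer approach to genome size estimation can be illustrated with a short sketch. It assumes a two-column histogram (k-mer multiplicity, number of distinct k-mers) of the kind a k-mer counter produces, and approximates genome size as the total number of counted k-mers divided by the multiplicity of the main coverage peak. The function names, the low-multiplicity error cutoff, and the toy numbers are hypothetical. The sketch does not implement the ploidy correction mentioned above and is not the authors' pipeline.

```python
from typing import Dict


def parse_histogram(text: str) -> Dict[int, int]:
    """Parse 'multiplicity count' lines into {multiplicity: count}."""
    hist: Dict[int, int] = {}
    for line in text.strip().splitlines():
        depth, count = line.split()
        hist[int(depth)] = int(count)
    return hist


def find_peak_depth(hist: Dict[int, int], min_depth: int = 5) -> int:
    """Multiplicity with the most distinct k-mers at or above min_depth.

    Bins below min_depth are ignored because they are usually
    dominated by sequencing errors (the cutoff is an assumption).
    """
    candidates = {d: c for d, c in hist.items() if d >= min_depth}
    if not candidates:
        raise ValueError("no histogram bins at or above min_depth")
    return max(candidates, key=candidates.get)


def estimate_genome_size(hist: Dict[int, int], min_depth: int = 5) -> float:
    """Total k-mers (multiplicity * count) divided by the peak depth."""
    peak = find_peak_depth(hist, min_depth)
    total_kmers = sum(d * c for d, c in hist.items() if d >= min_depth)
    return total_kmers / peak


# Toy (hypothetical) histogram: an error bin at low depth, a peak at 10.
TOY_HISTO = """
1 1000
2 200
9 50
10 100
11 50
"""

hist = parse_histogram(TOY_HISTO)
print("peak depth:", find_peak_depth(hist))
print("estimated size (in k-mers):", estimate_genome_size(hist))
```

In a heterozygous sample the histogram can show two peaks, one at roughly half the depth of the other, and choosing between them changes the estimate by about a factor of two. That is one reason a ploidy correction is needed in practice.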