Comparative Genome Sequencing

Comparative Genome Sequencing (CGS) rapidly surveys entire microbial genomes, identifying the locations of SNPs, insertions, or deletions and then fully characterizing identified SNPs by array sequencing with unrivaled speed and accuracy. NimbleGen’s CGS technology accelerates microbe characterization and provides an efficient, high-throughput, cost-effective method for rapid, genome-wide analysis.

Advantages and Applications

Survey Entire Microbial Genomes with Unmatched Speed

With the current capacity of 385,000 custom probes, the initial survey of a microbial genome can span up to 1,300,000bp of genomic sequence in a single array. For larger genomes, the design can be split over as many arrays as required for complete genomic coverage. In the first phase of CGS, genomic alterations are identified by labeling test and reference genomic DNA samples and co-hybridizing to survey arrays derived from both strands of the reference genome. The locations of genomic alterations are identified in a single round of hybridization (see Figure 1).
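As a rough planning aid, the number of survey arrays for a given genome follows from dividing genome length by the sequence covered per array. The sketch below is illustrative only: the function name is hypothetical, and the per-array coverage is left as a parameter (defaulting to the figure quoted above) because it depends on the array design.

```python
def survey_arrays_needed(genome_bp: int, bp_per_array: int = 1_300_000) -> int:
    """Return how many survey arrays are needed to cover genome_bp bases.

    bp_per_array is an assumed per-array coverage; adjust it to the design in use.
    """
    if genome_bp <= 0 or bp_per_array <= 0:
        raise ValueError("genome_bp and bp_per_array must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-genome_bp // bp_per_array)


if __name__ == "__main__":
    for size in (1_000_000, 2_600_000, 5_000_000):
        print(f"{size:>9,} bp -> {survey_arrays_needed(size)} survey array(s)")
```

This counts survey arrays only; the targeted sequencing arrays used in the second phase are additional.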

Rapid, Targeted Resequencing of Regions of Interest

The genomic regions of interest identified in the first phase provide the content for targeted sequencing arrays in the second phase of CGS. Targeted arrays, sequencing only the regions of the genome where alterations exist, maximize the yield of useful data from every synthesized array. Because NimbleGen's highly flexible Maskless Array Synthesis (MAS) technology enables rapid synthesis of new array designs, the results of phase one are quickly incorporated into sequencing array designs in phase two for the characterization of microbial mutations.

Accurate, Scalable SNP Detection

NimbleGen’s CGS technology is the most efficient method for the rapid localization and characterization of genomic alterations on a genome-wide scale. CGS is sensitive enough to characterize even a single base change in an entire genome. In one study, CGS used only five arrays to identify the sole single-base change in a 5,000,000bp genome, with zero false positives. CGS typically identifies ~95% of SNPs present in unique regions of genomes, with fewer than one false positive per 100,000 to 1,000,000 bases analyzed. CGS also readily identifies regions requiring manual sequencing, such as insertions and deletions, thus greatly simplifying the process of fully characterizing most genomic alterations.
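To put the quoted rates in concrete terms, the arithmetic can be worked through for a hypothetical strain. This is a back-of-the-envelope sketch, not a performance guarantee: the function name, the example SNP count, and the assumption that "bases analyzed" equals the full genome length are all illustrative.

```python
def expected_cgs_outcomes(true_snps, bases_analyzed,
                          detection_rate=0.95,
                          bases_per_false_positive=100_000):
    """Return (expected SNPs detected, upper bound on false positives).

    detection_rate and bases_per_false_positive default to the typical
    figures quoted in the text; both are assumptions for any real sample.
    """
    expected_detected = true_snps * detection_rate
    false_positive_bound = bases_analyzed / bases_per_false_positive
    return expected_detected, false_positive_bound


if __name__ == "__main__":
    # Hypothetical strain: 200 true SNPs in unique regions, 5,000,000 bases analyzed.
    for spacing in (100_000, 1_000_000):
        detected, fp_bound = expected_cgs_outcomes(
            200, 5_000_000, bases_per_false_positive=spacing
        )
        print(f"1 false positive per {spacing:,} bases: "
              f"about {detected:.0f} SNPs found, "
              f"fewer than {fp_bound:.0f} false positives")
```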

Applications

CGS has a wide range of applications in microbial comparative genomics, including:

  • Characterizing industrially important microbes for food, pharmaceutical, and chemical industries.
  • Characterizing genomic alterations associated with virulence and attenuation.
  • Genotyping infectious pathogens for strain identification.
  • Tracking genomic alterations associated with environmental changes in natural environments and chemostat/fed-batch settings.
  • Rapidly characterizing phenotypically interesting microbes.
  • Identifying genes responsible for anti-microbial agent resistance.

The CGS Protocol

The CGS protocol is divided into two phases. In PHASE 1, regions of genomic difference are identified by a comparative hybridization of test DNA vs. reference DNA on a whole-genome tiling array. In PHASE 2, only the identified regions of genomic difference are sequenced to produce a set of fully characterized SNPs.
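The Phase 1 comparison can be pictured as a scan for probes whose test-to-reference intensity ratio departs from 1. The following is a minimal sketch and not NimbleGen's analysis pipeline: the function name, the log2 threshold, and the rule for merging nearby probes into one region are all assumptions chosen for illustration.

```python
import math


def find_difference_regions(positions, test_signal, ref_signal,
                            min_abs_log2=1.0, max_gap=50):
    """Group probes with a strong test/reference ratio shift into regions.

    positions: probe coordinates on the reference genome, sorted ascending.
    Returns a list of (start, end) tuples in the same coordinates.
    """
    regions = []
    for pos, test, ref in zip(positions, test_signal, ref_signal):
        if test <= 0 or ref <= 0:
            continue  # skip probes without usable signal
        if abs(math.log2(test / ref)) < min_abs_log2:
            continue  # ratio is close to 1: no evidence of a difference
        if regions and pos - regions[-1][1] <= max_gap:
            regions[-1] = (regions[-1][0], pos)  # extend the current region
        else:
            regions.append((pos, pos))  # open a new region
    return regions


if __name__ == "__main__":
    # Made-up intensities for eight probes; not real array data.
    positions = [0, 7, 14, 21, 28, 35, 500, 507]
    test = [1000, 980, 300, 250, 990, 1010, 400, 1000]
    ref = [1000] * 8
    print(find_difference_regions(positions, test, ref))
```

Each returned region marks a stretch of the reference genome that would be passed to Phase 2 for sequencing.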

Figure 1. CGS Protocol Illustration

  1. DNA samples are obtained from the strain of interest and the sequenced reference strain.
  2. Test and reference genomic DNA samples are independently labeled with fluorescent dye by either a one-color or a two-color protocol.
  3. For one-color CGS, the two pools are hybridized to two separate CGS whole-genome tiling arrays. For two-color CGS, samples are combined and hybridized to a single array.
  4. The array images are extracted and the ratios of test DNA to reference DNA are plotted versus their genomic position.
  5. The genomic sequence surrounding each identified ratio peak is converted into a sequencing array querying all base positions around each peak on both DNA strands.
  6. The labeled test DNA sample is hybridized to the sequencing array.
  7. The sequencing image is extracted and the sequence is algorithmically "read."
  8. The resulting sequences are compared to the reference sequence and SNPs are called and categorized as synonymous, non-synonymous, or non-coding (see "SNPs Defined").*
  9. Regions of ambiguous calls are flagged by genomic position for further characterization.

* Requires GenBank file annotated with hyperlinks.
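The categorization in step 8 depends on whether a substitution falls inside an annotated coding region and, if so, whether it changes the encoded amino acid. The sketch below shows one way this could be done; it is not NimbleGen's software. The function name and the `cds_ranges` input are hypothetical, the standard genetic code is assumed, and only forward-strand coding regions whose length is a multiple of three are handled.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(reference, position, new_base, cds_ranges):
    """Classify a single-base substitution against the reference.

    reference: uppercase reference sequence (0-based positions).
    cds_ranges: (start, end) half-open forward-strand coding regions,
                each beginning at the first base of a codon.
    """
    for start, end in cds_ranges:
        if start <= position < end:
            codon_start = start + ((position - start) // 3) * 3
            ref_codon = reference[codon_start:codon_start + 3]
            offset = position - codon_start
            alt_codon = ref_codon[:offset] + new_base + ref_codon[offset + 1:]
            if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
                return "synonymous"
            return "non-synonymous"
    return "non-coding"


if __name__ == "__main__":
    # Toy reference: ATG GCT TAA occupies positions 2 through 10.
    reference = "GGATGGCTTAAGG"
    cds = [(2, 11)]
    print(classify_snp(reference, 7, "C", cds))  # GCT -> GCC
    print(classify_snp(reference, 5, "A", cds))  # GCT -> ACT
    print(classify_snp(reference, 0, "T", cds))  # outside any coding region
```

In practice the coding regions would come from the annotated GenBank record, and reverse-strand genes would need the codon to be reverse-complemented before lookup.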

CGS Microarray Specifications

Probe length: Isothermal (Tm balanced, 29mer - 39mer)
Probe design format: Tiled throughout genome on forward and reverse strands
Total features: 385,000
Feature size: 16µm x 16µm
Array size: 17.4mm x 13mm
Slide size: 1" x 3" (25mm x 75mm) glass
Sample required: 10µg each sample and reference genomic DNA
Sample labeling: Random prime labeling
Sequence source: GenBank or user-supplied sequence
Deliverables: Mutation mapping data, annotated SNPs, sequences of mutated genes, and SignalMap™ data browser
Genome size / array: Up to 1,500,000bp per survey array
Sequencing: 48,000 bases per array
SNP discovery rate: Typically ~95% within unique regions of genomes
Genome divergence: < 5% for mutation mapping; < 0.5% for sequencing

Data Delivery

NimbleGen's Comparative Genome Sequencing methodology provides a global perspective on differences between genomes, but the enormity of these datasets can be daunting without the proper tools. Comparative Genome Sequencing datasets are delivered as loadable tracks in NimbleGen's SignalMap data browser. The tracks range from raw array intensity values to fully processed SNP calls. Datasets can be easily browsed, rescaled, and exported for publication.

 
