Sequence Capture Human Exome 2.1M Array
Source: UCSC
Build: HG18 or HG19
Recommended Storage: Store arrays desiccated at room temperature.
| Description | # of Probes | Capture Target Size | Catalog Number | Pack Size | Workflow | Ordering* |
| NimbleGen Sequence Capture Human Exome 2.1M Array | 2.1M | Up to 50Mb | 05451957001 | 1 Slide | | |
| | | | 05547792001 | 5 Slides | | |
| | | | N/A | Dataset | | |
* Availability of products varies from country to country.
Advantages
- High Performance: Capture up to 50Mb total regions on a single 2.1M array and up to 5Mb on a single 385K array with high coverage and specificity.
- Design Expertise: Ensure the highest level of specificity and sensitivity with an empirically tested and validated capture design algorithm.
- Embedded Quality Controls: NimbleGen Sequence Capture arrays incorporate built-in control probes to ensure system performance.
- Maximum Flexibility: Tailor the array design to capture your genomic regions or thousands of exons in parallel.
- Substantial Savings: Save time and cost compared to PCR-based methods.
Applications
NimbleGen Sequence Capture arrays are suitable for targeted sequencing across a wide range of target sizes, from small regions of about 250 kb (Figure 1) to large regions of up to 30 Mb (Table 1). All human designs use the empirically optimized Sequence Capture design algorithm to ensure highly uniform capture. For example, a 250 kb contiguous region, representing a typical GWAS locus, is captured with high specificity and uniformity (Figure 1). Note that small repetitive regions where no probes were selected can still be covered by sequencing, thanks to efficient capture from neighboring probes and the long reads of the Genome Sequencer FLX Titanium Series (red boxes in Figure 1).
Table 1
| Capture of Large Contiguous Regions using 2.1M Arrays | ||
| Experiment | A | B |
| Total Reads (millions) | 1.2 | 1.3 |
| Total Bases | 347 Mb | 380 Mb |
| % Reads Mapped Uniquely | 87.6 | 86.7 |
| % Bases Mapped Uniquely | 93.1 | 92.6 |
| % Mapped Reads on Target | 79.1 | 70.8 |
| Average/Median Coverage | 10.3/9 | 10.1/8 |
Table 1. The ENCODE pilot regions (~30 Mb) were captured using 2.1M arrays and sequenced. The target regions consist of ~50 individual contigs of ~500 kb each.
Figure 1
Figure 1. High-Performance Targeted Resequencing in a 250kb Target Region.
Roche offers a seamless workflow combining NimbleGen Sequence Capture Arrays and the high-throughput sequencing of the Genome Sequencer FLX System from 454 Life Sciences. This complete solution of kits, arrays and instruments is specifically designed to optimize the workflow, reduce processing time, minimize costs, and enhance data quality. Furthermore, the GS Reference Mapper software from 454 Life Sciences enables researchers to easily identify variants such as SNPs and indels from the final data output without complicated bioinformatics infrastructure (Table 2).
Table 2
| 454 Optimized Sequence Capture: Resequencing of HapMap Research Sample | ||
| Experiment | 250 kb - 1 | 1 Mb - 1 |
| Total Reads | 70,190 | 140,374 |
| Total Bases | 27,646,394 | 55,453,593 |
| On-Target Reads | 75.2% | 87.3% |
| Median Coverage | 85 | 49 |
| Target Bases with 1+ Coverage | 98.6% | 96.9% |
| Target Bases with 10+ Coverage | 97.3% | 92.8% |
| Known SNP Detection Rate | 97.4% | 96.5% |
Table 2. Sequence Capture Performance on a 250 kb contiguous region and a 1 Mb contiguous region in the human genome. Data shown are from 1 of the 4 independent experiments for each region. A HapMap sample is used in the study and SNP calls were generated by the GS Reference Mapper software.
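The figures in Table 2 support a couple of quick sanity checks. The Python sketch below derives the mean read length (total bases divided by total reads) and a rough fold-enrichment estimate (on-target read fraction divided by the fraction of the genome that was targeted). The genome size of 3.1 Gb is an assumed round number, not a value from this page, and the function names are illustrative only.

```python
# Values copied from Table 2. GENOME_SIZE_BP is an ASSUMED round figure
# for the human genome; it does not come from the source page.
GENOME_SIZE_BP = 3_100_000_000

experiments = {
    "250 kb - 1": {"target_bp": 250_000, "reads": 70_190,
                   "bases": 27_646_394, "on_target": 0.752},
    "1 Mb - 1": {"target_bp": 1_000_000, "reads": 140_374,
                 "bases": 55_453_593, "on_target": 0.873},
}


def mean_read_length(exp):
    """Average read length in bp: total bases / total reads."""
    return exp["bases"] / exp["reads"]


def fold_enrichment(exp, genome_bp=GENOME_SIZE_BP):
    """On-target read fraction divided by the fraction of the genome targeted.

    Uses the on-target *read* percentage as a proxy, so treat the result
    as an order-of-magnitude estimate.
    """
    return exp["on_target"] / (exp["target_bp"] / genome_bp)


if __name__ == "__main__":
    for name, exp in experiments.items():
        print(f"{name}: mean read length {mean_read_length(exp):.1f} bp, "
              f"~{fold_enrichment(exp):.0f}-fold enrichment")
```

Under these assumptions both experiments show mean read lengths just under 400 bp and enrichment in the thousands-fold range; the smaller target gives the larger fold value because it represents a smaller fraction of the genome.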
As an example of discovering causative mutations, Figure 2 shows results for the mouse Kit locus (~200 kb) from 5 non-complementing Kit mutants. These alleles include one known allele, W-41J, and four unknown alleles: W-20J, W-39J, W-40J and W-73J. The known mutation in W-41J was confirmed in this experiment, and the data analysis identified a non-synonymous coding mutation for each of the 4 unknown alleles (D’Ascenzo et al., Mamm. Genome, 2009, 20:424–436).
Figure 2
Figure 2. Mutation discovery in the mouse Kit locus using Sequence Capture and 454 Sequencing.
Protocol
Roche NimbleGen offers two types of capture methods: SeqCap EZ Library, a solution-based method, and Sequence Capture Arrays, an array-based method.
Sequence Capture Protocols
1. Genomic DNA: A SeqCap EZ Oligo pool or an array is made against target regions in the genome.
2. Library Preparation: A standard shotgun sequencing library is made from genomic DNA.
3. Hybridization: The sequencing library is hybridized to the SeqCap EZ Oligo pool or to the Sequence Capture array.
Steps 4 and 5 are different for each protocol:
SeqCap EZ Library, biotinylated DNA oligos in solution:
4. Bead Capture: Streptavidin beads are used to pull down the complex of capture oligos and genomic DNA fragments.
5. Washing: Unbound fragments are removed by washing.
Sequence Capture, capture probes synthesized on array:
4. Washing: Unbound fragments are removed by washing.
5. Target Fragment Elution: The enriched fragment pool is eluted and recovered from the array.
The remaining steps are common to both protocols:
6. Amplification: The enriched fragment pool is amplified by PCR.
7. Enrichment QC: The success of enrichment is measured by qPCR at control loci.
8. Sequencing-Ready DNA: The end product is a sequencing library enriched for target regions, ready for high-throughput sequencing.
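One common way to turn the qPCR measurement from the Enrichment QC step into a number is to compare the threshold cycle (Ct) of a control locus before and after capture. The Python sketch below assumes equal DNA input mass in both reactions and a default amplification efficiency of 2.0 (perfect doubling per cycle). The locus names and Ct values are hypothetical, and this is not necessarily the exact calculation given in the NimbleGen user's guide.

```python
def qpcr_fold_enrichment(ct_pre, ct_post, efficiency=2.0):
    """Estimate fold enrichment at one control locus.

    ct_pre  -- Ct of the library before capture
    ct_post -- Ct of the library after capture (same input mass assumed)
    A lower Ct after capture means more template, hence enrichment.
    """
    return efficiency ** (ct_pre - ct_post)


# HYPOTHETICAL Ct values for two control loci: (pre-capture, post-capture).
control_loci = {
    "control_locus_1": (28.0, 19.5),
    "control_locus_2": (27.4, 19.0),
}

if __name__ == "__main__":
    for locus, (ct_pre, ct_post) in control_loci.items():
        fold = qpcr_fold_enrichment(ct_pre, ct_post)
        print(f"{locus}: ~{fold:.0f}-fold enrichment")
```

If a measured efficiency is available for an assay, passing it as `efficiency` is more faithful than relying on the default of 2.0.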
For more information on how to get trained and set up with Sequence Capture Arrays, visit our Quick Guide page.
Annotation Files
The annotation package for the NimbleGen Sequence Capture 2.1M Human Exome Array includes 6 files that provide visualization and annotation of the array design.
Download the complete 6-file set!
Note: The GFF files can be opened using SignalMap software from NimbleGen. The BED file can be displayed within the UCSC genome browser as a custom annotation track. XLS files can be opened by Microsoft Excel software.
- 2.1M_Human_Exome.gff = There are two tracks in this .gff file. The primary_target_region track displays the exon targets, and the capture_target track displays exon targets that are actually covered by the probes. You will notice that sometimes capture target and primary target regions do not perfectly align, meaning that 1) the exon is shorter than 200bp and the target region was extended out to at least 200bp, or 2) no probes were designed to that particular region of the exon target due to repetitive sequence. Note that this is the same design file as 080904_ccds_exome_rebalfocus_HX1.gff that is part of the standard deliverable for a 2.1M Human Exome array.
- 2.1M_Human_Exome.bed = This file contains the same information as the 2.1M_Human_Exome.gff file above, but in BED format, and is to be displayed within the UCSC Genome Browser. Note that this is the same design file as 080904_ccds_exome_rebalfocus_HX1.gff that is part of the standard deliverable for a 2.1M Human Exome array.
- 2.1M_coding_exons_annotation.gff = There is a single track in this .gff file. Each vertical bar represents 1 human protein coding exon target. A gray bar means that there are no covered bases for that exon target. A black bar means that there is at least 1 base of coverage. When using SignalMap software, move the cursor over each exon to display the CCDS ID.
- 2.1M_miRNA_annotation.gff = There is a single track in this .gff file. Each vertical bar represents 1 human miRNA exon target. A gray bar means that there are no covered bases for that exon target. A black bar means that there is at least 1 base of coverage. When using SignalMap software, move the cursor over each miRNA to display the miRNA registry identifier.
- 2.1M_coding_exons_annotation.xls and 2.1M_miRNA_annotation.xls = These are Excel spreadsheets that list the genes and miRNAs that were targeted by the array design. The columns are:
- CCDS ID = The identifier number for a particular human protein coding gene.
- miRNA REGISTRY = The miRBase identifier for a particular human miRNA gene.
- GENE SYMBOL = The alphanumeric identifier for a particular human protein coding gene.
- DESCRIPTION = The full name for a particular human protein coding gene.
- REFSEQ = The RefSeq identifier for a particular human protein coding gene.
- UCSC GENE ID = The UCSC identifier for a particular human protein coding gene.
- ENSEMBL = The ENSEMBL identifier for a particular human protein coding gene.
- CHROMOSOME = Identifies on what chromosome a particular human protein coding gene or human miRNA gene resides.
- STRAND = The strand (+ or -) from which a particular human protein coding gene or human miRNA gene is expressed.
- START = Coordinates where a particular human protein coding gene or human miRNA gene starts.
- END = Coordinates where a particular human protein coding gene or human miRNA gene ends.
- EXON COUNT = A raw count of how many exons comprise a particular human protein coding gene.
- ARRAY COVERAGE = Percentage of target bases from a particular human protein coding gene or human miRNA gene covered by probes designed on the NimbleGen Sequence Capture array.
- ARRAY COVERAGE W 100BP EXTENSION = Percentage of target bases from a particular human protein coding gene or human miRNA gene covered by probes designed on the NimbleGen Sequence Capture array, PLUS 100bp of padding on both sides of each probe. This is a better estimate of the final sequencing coverage, because ~ 100bp flanking sequences at both ends of each probe are typically captured and sequenced.
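The difference between the ARRAY COVERAGE and ARRAY COVERAGE W 100BP EXTENSION columns can be illustrated with a small interval calculation. The Python sketch below pads each probe by a given number of bases on both sides, merges overlapping intervals, and reports the percentage of target bases covered. The coordinates are made up for illustration, all intervals are assumed to lie on a single chromosome and to be half-open (start, end), and this is not the vendor's actual implementation.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(targets, probes):
    """Count target bases that fall inside at least one probe interval."""
    probes = merge(probes)
    total = 0
    for t_start, t_end in merge(targets):
        for p_start, p_end in probes:
            overlap = min(t_end, p_end) - max(t_start, p_start)
            if overlap > 0:
                total += overlap
    return total


def coverage_percent(targets, probes, pad=0):
    """Percent of target bases covered by probes padded by `pad` bp per side."""
    padded = [(max(0, start - pad), end + pad) for start, end in probes]
    target_bp = sum(end - start for start, end in merge(targets))
    return 100.0 * covered_bases(targets, padded) / target_bp


# HYPOTHETICAL example: one 300 bp exon target and one 75 bp probe.
targets = [(1000, 1300)]
probes = [(1100, 1175)]

if __name__ == "__main__":
    print(f"coverage: {coverage_percent(targets, probes):.1f}%")
    print(f"coverage with 100 bp extension: "
          f"{coverage_percent(targets, probes, pad=100):.1f}%")
```

In this toy case the probe alone covers a quarter of the exon, while the padded probe covers most of it, which is why the extended figure is described as the better estimate of final sequencing coverage.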
Reagents
NimbleGen Sequence Capture Array Hybridization and Wash Kits contain the components to perform hybridization and wash steps in sequence capture protocols using NimbleGen Sequence Capture Arrays.
| Description | Catalog Number | Pack Size | Kit Capacity | Compatible Applications | Ordering* |
| Sequence Capture Array Hybridization and Wash Kit | 05853257001 | 1 Kit | 8 Arrays | Sequence Capture Arrays | |
* Availability of products varies from country to country.
Literature
Brochures and Sales Flyers
- NEW! Roche NimbleGen Research Solutions: Microarrays and target enrichment for genetic disease discovery
Brochure (PDF Format 7.1MB)
- NimbleGen Sequence Capture
Brochure (PDF Format 3.9MB)
- NimbleGen 454 Optimized Sequence Capture 385K Arrays
Sales Flyer (PDF Format 473KB)
- NimbleGen Sequence Capture 385K Version 2.0 Arrays: Custom Human Arrays for Delivery
Sales Flyer (PDF Format 309KB)
- NimbleGen Sequence Capture 2.1M and 385K Arrays: Equipment and Reagent Requirements
Sales Flyer (PDF Format 247KB)
User Guides
- NimbleGen Arrays User’s Guide: Sequence Capture Array Delivery (Version 3.2)
User’s Guide (PDF Format 3.4MB)
- Sequence Capture Custom Designs: Guide to Submitting Your Target Sequence
User’s Guide (PDF Format 906KB)
- NimbleGen Arrays User’s Guide: 454 Optimized Sequence Capture Array Delivery (Version 1.1)
User’s Guide (PDF Format 1.3MB)
Downloads
- 2.1M Human Exome Annotation Files
Annotation Files (ZIP Format 8.7MB)
What files are included in this download?
Technical Notes & Reprints
- NimbleGen Array Capture Outperforms Two Target-Enrichment Methods in ABRF Research Group Comparison
Reprint - GenomeWeb InSequence (PDF Format 107KB)
- Comparison of Enrichment Technologies for Targeted Resequencing of Custom Regions
Technical Note (PDF Format 552KB)
