Materials and methods
Using our primer design software, Hiplex-primer (publicly available for download as a set of Python scripts from https://github.com/bjpop/hiplex-primer), we input the exon coordinates for PALB2 and XRCC2, set the length of the primer intervening sequence to 100 bases, and set the target melting temperature for gene-specific primer regions to 64°C, with a maximum length of 30 bases and an allowable gene-specific region length variability of 10 bases. Each of these parameters can be modified by the user. To the gene-specific primer region outputs, we added 5′ heel sequences corresponding to adapters that are compatible with Ion Torrent (Life Technologies, Foster City, CA) sequencing chemistry (Figure 1 and Supplementary Table S1). For our experiments, no primers were altered from the initial automated design and all gene-specific primers were present in equal concentration. All oligonucleotide sequences are listed in Supplementary Table S1 and were synthesized by Integrated DNA Technologies (Coralville, IA). Adapter primers were HPLC purified, but all other primers were supplied as standard desalting grade.
Input material consisted of genomic DNA derived from an Epstein-Barr virus-transformed lymphoblastoid cell line (LCL) generated in house, and from FFPE breast cancer tumor tissue collected in 1997 as part of the Australian Breast Cancer Family Study (ABCFS) (16). Extraction was performed using the QIAamp DNA Blood Kit (Qiagen, Dusseldorf, Germany) and the QIAamp DNA FFPE Tissue Kit, respectively. DNA was quantitated using the Qubit dsDNA Assay system (Life Technologies).
A 50 μL PCR reaction was made up of 1× Phusion HF PCR buffer (ThermoScientific, Waltham, MA), 2 U of Phusion Hot Start II High-Fidelity DNA Polymerase, 400 μM dNTPs (Bioline, London, UK), 0.5 μM gene-specific primer pool (0.004 μM each individual gene-specific primer), 2.5 mM MgCl2, and either 100 or 25 ng input genomic DNA. The following steps were used for the PCR: 98°C for 1 min, 6 cycles of [98°C for 30 s, 50°C for 1 min, 55°C for 1 min, 60°C for 1 min, 65°C for 1 min, 70°C for 1 min], addition of 2 μL of a mix of 50 μM IT_P1_noT and IT-A-key primers (to achieve a final reaction concentration of 2 μM for each), then a further 19 cycles of [98°C for 30 s, 50°C for 1 min, 55°C for 1 min, 60°C for 1 min, 65°C for 1 min, 70°C for 1 min], followed by incubation at 60°C for 20 min. Eight μL of product was subjected to 1.5% agarose/TBE (w/v) gel electrophoresis. The approximately 220 bp band containing our target library was excised and the DNA was extracted and purified using the Qiagen QIAEX II kit (Qiagen). Figure 2 shows an agarose gel profile for the Hi-Plex product.
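The reaction arithmetic above can be checked with a short calculation. The following is a minimal sketch, not part of the published protocol: it assumes two gene-specific primers per amplicon (60 amplicons, as reported in the results), assumes each adapter primer is at 50 μM in the added mix, and ignores ramp times and the pause for adapter addition. All variable and function names are illustrative.

```python
# Reaction parameters taken from the protocol text.
REACTION_UL = 50.0
POOL_UM = 0.5             # total gene-specific primer pool
N_AMPLICONS = 60          # amplicon count reported in the results
PRIMERS_PER_AMPLICON = 2  # assumption: one forward and one reverse primer each

per_primer_um = POOL_UM / (N_AMPLICONS * PRIMERS_PER_AMPLICON)


def diluted_concentration(stock_um, added_ul, reaction_ul):
    """Concentration after adding `added_ul` of stock to `reaction_ul` (C1*V1 = C2*V2)."""
    return stock_um * added_ul / (reaction_ul + added_ul)


# Assumption: each adapter primer is present at 50 uM in the 2 uL addition.
adapter_um = diluted_concentration(50.0, 2.0, REACTION_UL)

# One cycle: denaturation plus the five-step annealing/extension gradient,
# given as (temperature in deg C, hold time in seconds).
CYCLE_STEPS_S = [(98, 30), (50, 60), (55, 60), (60, 60), (65, 60), (70, 60)]
cycle_s = sum(seconds for _temp_c, seconds in CYCLE_STEPS_S)

# Initial denaturation + (6 + 19) cycles + final 60 deg C incubation.
total_hold_s = 60 + (6 + 19) * cycle_s + 20 * 60

print(f"per-primer concentration: {per_primer_um:.4f} uM")
print(f"adapter primer concentration: {adapter_um:.2f} uM each")
print(f"programmed hold time: {total_hold_s / 60:.1f} min")
```

Under these assumptions the per-primer concentration comes to about 0.0042 μM, consistent with the stated 0.004 μM, and each adapter primer reaches about 1.92 μM, close to the nominal 2 μM.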
The Hi-Plex library from 100 ng LCL-derived DNA was sequenced using the Ion 314 chip and Ion PGM 200 Sequencing Kit, and reads were mapped using Torrent Suite v3.4.1 (Life Technologies). Hi-Plex libraries from 100 ng or 25 ng FFPE-derived DNA were sequenced using the Ion 316 chip and Ion PGM 200 Sequencing Kit, with Torrent Suite v3.4.2. In order to assess individual amplicon PCR efficiencies, read depths were determined at the midpoints of the sequences between the gene-specific primer regions.
Results and discussion
We have developed our Hiplex-primer software to implement our automated approach to primer design. This software tool accepts target coordinates and the primer intervening sequence size as user inputs. The user also specifies an intended length for gene-specific primer regions and a maximum gene-specific primer region length. The software assesses primer melting temperatures based on the simplified assumption that each G or C contributes 4°C to the total melting temperature and each A or T contributes 2°C (the ‘4 and 2’ rule). The software searches in and around these coordinate blocks to minimize the differences between predicted melting temperatures for gene-specific primer sequences and a user-defined target melting temperature.
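The '4 and 2' rule is simple enough to illustrate directly. The sketch below is not taken from the Hiplex-primer source: the function names, the `min_len` parameter, and the strategy of fixing the primer start and varying only its length are assumptions made for illustration. The default target of 64°C and maximum length of 30 bases match the settings described in the methods.

```python
def four_and_two_tm(primer: str) -> int:
    """Estimate melting temperature (deg C): 4 per G or C, 2 per A or T."""
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


def closest_to_target(template: str, start: int, target_tm: int = 64,
                      min_len: int = 10, max_len: int = 30) -> str:
    """Return the candidate starting at `start` whose estimated Tm is nearest the target.

    Hypothetical search: only the primer length is varied here, whereas the
    real software also searches around the target coordinates.
    """
    candidates = [
        template[start:start + n]
        for n in range(min_len, max_len + 1)
        if start + n <= len(template)
    ]
    if not candidates:
        raise ValueError("template too short for the requested primer lengths")
    # min() keeps the shortest candidate when two are equally close.
    return min(candidates, key=lambda p: abs(four_and_two_tm(p) - target_tm))


if __name__ == "__main__":
    demo_template = "ATGC" * 9  # made-up sequence, for illustration only
    primer = closest_to_target(demo_template, start=0)
    print(primer, len(primer), four_and_two_tm(primer))
```

Because the rule depends only on base counts, an A/T-rich region needs a longer primer than a G/C-rich region to reach the same predicted melting temperature, which is why the design allows the gene-specific region length to vary.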
Relative amplification bias is restricted in Hi-Plex by a combination of mechanisms. Use of 5′ heel clamps, which have adapter sequences that are used subsequently by the sequencing chemistry, can reduce amplification bias (17). Adapter primers are added at the early stages of thermocycling. For the majority of amplification cycles, successful priming is not dependent on gene-specific primers; rather, priming of all targets in the pool can be driven by the same two adapter primers. Generally, smaller targets are more efficiently amplified than larger ones (18, 19). Hi-Plex defines amplicon sizes within a narrow size range, thus eliminating size-related bias. Our method also uses a highly processive DNA polymerase, preferably one with high fidelity (e.g., Phusion Hot Start II High-Fidelity DNA Polymerase), and permissive thermocycling conditions. The system can afford to use relatively low temperature annealing conditions because the size selection step that follows PCR eliminates the great majority of off-target reaction by-products. As such, Hi-Plex tolerates a broad range of primer types with different G/C-contents and actual primer annealing temperatures. In addition to permissive annealing, Hi-Plex uses permissive extension conditions. Previous studies have demonstrated that low G/C-content amplicons can benefit from relatively low extension temperatures (20). On the other hand, sequences with relatively high G/C-content can benefit from relatively high temperature thermocycling conditions (21). Our preferred approach to allow for successful amplification of all amplicons representing a broad spectrum of primer and intervening sequence contexts is to apply a gradient of annealing/extension temperatures during thermocycling. Hi-Plex uses relatively long cycle steps to allow greater opportunity for priming and complete extension. In combination, these design elements free the system from many sequence contextual design constraints.
Application of Hi-Plex to 100 ng LCL-derived DNA showed that 93.33% (56/60), 98.33% (59/60), and 100% of targeted amplicons were represented within 5-fold, 10-fold, and 12.5-fold of the mean on-target coverage, respectively. When mapped to the whole human genome, 86.94% of Hi-Plex reads were aligned within PALB2 and XRCC2, with a total number of on-target reads and a mean number of on-target reads per amplicon of 147,838 and 2463.96, respectively (314 chip). The number of on-target reads ranged from 199 (12.38-fold less than the mean) to 10,746 (4.36-fold higher than the mean). Figure 3 illustrates the relative representation of the 60 amplicons in relation to mean primer pair G/C-content and intervening sequence G/C-content. The G/C-content of the gene-specific regions of individual primers ranged from 10.35% to 66.67%. The G/C-content of the intervening sequences ranged from 25% to 74% and amplicons included G/C-rich 5′ untranslated regions. It is worth noting that the single amplicon that yielded a number of on-target reads more than 10-fold from the mean (12.38-fold lower) included a primer in which the 14 3′-most nucleotides had a G/C-content of only 14.29% and included a predicted perfect 4 bp hairpin structure. The LCL was derived from a known heterozygous carrier of the pathogenic PALB2 c.3113G>A mutation. Of the 1095 reads at this position, 552 (50.41%) and 542 (49.50%) represented the major and minor alleles, respectively. This supports the value of Hi-Plex for gene screening projects, as the approach can accurately detect mutations.
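The fold-difference and allele-fraction figures above follow from simple ratios. The sketch below reproduces them from the totals reported in the text. The function names are illustrative, and `fraction_within` is a hypothetical helper showing how a "within N-fold of the mean" summary could be computed from per-amplicon depths, which are not listed here.

```python
def fold_from_mean(depth, mean_depth):
    """Return (fold, direction); fold is always >= 1."""
    if depth >= mean_depth:
        return depth / mean_depth, "above"
    return mean_depth / depth, "below"


def fraction_within(depths, fold):
    """Fraction of amplicons whose depth lies within `fold` of the mean depth."""
    mean_depth = sum(depths) / len(depths)
    inside = [d for d in depths if mean_depth / fold <= d <= mean_depth * fold]
    return len(inside) / len(depths)


# Totals reported in the text for the 314 chip run.
total_on_target = 147_838
n_amplicons = 60
mean_depth = total_on_target / n_amplicons

low_fold, low_dir = fold_from_mean(199, mean_depth)
high_fold, high_dir = fold_from_mean(10_746, mean_depth)

# Allele read counts at the PALB2 variant position (1095 reads in total).
major_pct = 100 * 552 / 1095
minor_pct = 100 * 542 / 1095

print(f"mean on-target reads per amplicon: {mean_depth:.2f}")
print(f"lowest amplicon: {low_fold:.2f}-fold {low_dir} the mean")
print(f"highest amplicon: {high_fold:.2f}-fold {high_dir} the mean")
print(f"allele fractions: {major_pct:.2f}% / {minor_pct:.2f}%")
```

Given the full list of 60 per-amplicon depths, `fraction_within(depths, 5)` would return the proportion of amplicons within 5-fold of the mean.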