Validation of concurrent preimplantation genetic testing for polygenic and monogenic disorders, structural rearrangements, and whole and segmental chromosome aneuploidy with a single universal platform

European Journal of Medical Genetics 62 (2019) 103647

Nathan R. Treff a,∗, Raymond Zimmerman a, Elan Bechor a, Jeff Hsu a, Bhavini Rana a, Jens Jensen a, Jeremy Li a, Artem Samoilenko a, William Mowrey a, James Van Alstine a, Mark Leondires b, Kathy Miller b, Erica Paganetti b, Louis Lello c, Steven Avery c, Stephen Hsu c, Laurent C.A. Melchior Tellier a

a Genomic Prediction Inc., 675 US Highway One, North Brunswick, NJ, 08902, USA
b Reproductive Medicine Associates of Connecticut, 761 Main Ave #200, Norwalk, CT, 06851, USA
c Michigan State University, Hannah Administration Building, 426 Auditorium Rd., East Lansing, MI, 48824-1046, USA

ARTICLE INFO

Keywords: Preimplantation genetic testing; Aneuploidy; Monogenic disorder; Structural rearrangement; Polygenic disorder

ABSTRACT

Preimplantation genetic testing (PGT) has been successfully applied to reduce the risk of miscarriage, improve IVF success rates, and prevent inheritance of monogenic disease and unbalanced translocations. The present study provides the first method capable of simultaneous testing of aneuploidy (PGT-A), structural rearrangements (PGT-SR), and monogenic (PGT-M) disorders using a single platform. Using positive controls to establish performance characteristics, accuracies of 97 to >99% for each type of testing were observed. In addition, this study expands PGT to include predicting the risk of polygenic disorders (PGT-P) for the first time. Performance was established for two common diseases, hypothyroidism and type 1 diabetes, based upon availability of positive control samples from commercially available repositories. Data from the UK Biobank, eMERGE, and T1DBASE were used to establish and validate SNP-based predictors of each disease (7,311 SNPs for hypothyroidism and 82 for type 1 diabetes). The areas under the curve (AUC) for disease status prediction from genotypes alone were 0.71 for hypothyroidism and 0.68 for type 1 diabetes. The availability of expanded PGT to evaluate the risk of polygenic disorders in the preimplantation embryo has the potential to lower the prevalence of common genetic disease in humans.

1. Introduction

Preimplantation genetic testing (PGT) has been successfully used to reduce miscarriage and increase success rates following in vitro fertilization (IVF). These improvements in the treatment of infertility have been made possible through a process that involves characterization of chromosomal aneuploidy (PGT-A), which has been validated by several randomized controlled trials (Forman et al., 2013; Scott et al., 2013; Yang et al., 2012). PGT has also been used to successfully prevent inheritance of monogenic disorders (PGT-M) for more than 3 decades (Handyside et al., 1992), and in patients who carry a balanced translocation (PGT-SR) (Iews et al., 2018). Methods for expanding PGT beyond aneuploidy and single locus screening have yet to be established. While the World Health Organization estimates that approximately 1% of newborns possess a monogenic disorder (Barness et al., 1996), 15–25% of humans will die prematurely from non-communicable diseases (Global-Burden-of-Disease-Study, 2016), which predominantly originate from polygenic disorders. However, the ability to screen embryos for polygenic disorder risk, hereafter referred to as PGT-P, has not previously been developed. Several studies have now demonstrated the ability to predict the risk of polygenic disorders in adults, including several types of cardiovascular disease, cancers, respiratory diseases, and diabetes, with a risk variance explained by polygenic risk scores equivalent to, or exceeding, that of risk variance contributed by monogenic disorders (Khera et al., 2018). This capability is largely a consequence of advances in the development of large DNA biobank datasets, such as the UK BioBank (Sudlow et al., 2015), and application thereto of contemporary machine learning algorithms (Lello et al., 2018). Many similar biobank datasets

Corresponding author. E-mail address: [email protected] (N.R. Treff).

https://doi.org/10.1016/j.ejmg.2019.04.004 Received 5 February 2019; Received in revised form 22 March 2019; Accepted 2 April 2019 Available online 23 April 2019 1769-7212/ © 2019 The Authors. Published by Elsevier Masson SAS. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/BY-NC-ND/4.0/).

are under development, which will lead to more accurate predictors for extant polygenic disorders, and expanded lists of polygenic disorders tractable to these methods. The idea of performing PGT-P is not new. In fact, Schulman and Edwards predicted this capability in 1996 (Schulman and Edwards, 1996). Furthermore, given the recent opinion by an American Society for Reproductive Medicine Practice Committee that PGT for adult onset conditions of lesser severity and penetrance is ethical for reasons of reproductive liberty (ASRM-Practice-Committee, 2018), and the significantly higher prevalence and impact of polygenic disorders compared to monogenic disorders, the development and practice of PGT-P is clearly warranted. This study combines polygenic risk score algorithms with novel molecular biology methodologies to allow simultaneous prediction of aneuploidy, structural rearrangements, monogenic disorders, and polygenic disorders for the first time, with a single and universal platform, herein referred to as expanded (e)PGT.

2. Materials and methods

2.1. Strategy

The overarching theme of the present study was to establish performance on positive controls. In order to establish validity for each major category of PGT, monogenic disorders (PGT-M), aneuploidy (PGT-A), structural rearrangements (PGT-SR), and polygenic disorders (PGT-P), samples with known genetic status were tested for concordance. In each case, limited quantities of material were used to model the number of cells available from a trophectoderm biopsy of a blastocyst stage embryo (Neal et al., 2017), and where possible, to evaluate performance on rebiopsies from previously tested blastocysts.

2.2. Samples

Forty cell lines were obtained from Coriell Cell Repository (Camden, NJ, USA) (Table 1). Cell lines were chosen based on genetic data reported by Coriell and with preference for lines stored at 10 passages or

Table 1. Positive control samples used to evaluate performance of ePGT for whole chromosome and segmental aneuploidy, monogenic, and polygenic disorder risk.

| Coriell Cell Line ID | Karyotype (status) | # tested | # with result | Whole chromosome (# concordant) | Segmental (# concordant) |
|---|---|---|---|---|---|
| whole chromosome | | | | | |
| GM00326 | 49,XXXXY | 6 | 6 | 6 | 5 |
| GM01250 | 47,XYY | 4 | 4 | 4 | 4 |
| GM01359 | 47,XY,+18 | 21 | 21 | 20 | 21 |
| GM02948 | 47,XY,+13 | 4 | 4 | 4 | 4 |
| GM03184 | 47,XY,+15 | 21 | 21 | 21 | 21 |
| GM04411 | 46,XX | 16 | 16 | 16 | 16 |
| GM04425 | 46,XX | 16 | 12 | 12 | 12 |
| GM04426 | 46,XY | 16 | 14 | 14 | 14 |
| GM04427 | 46,XY | 16 | 13 | 13 | 13 |
| GM04428 | 46,XY | 16 | 16 | 16 | 15 |
| GM04435 | 48,XY,+16,+21[45]/47,XY,+21[5] | 16 | 12 | 12 | 12 |
| GM07408 | 46,XX | 8 | 8 | 8 | 8 |
| GM07824 | 46,XY | 8 | 8 | 8 | 8 |
| GM07904 | 46,XX | 8 | 8 | 8 | 8 |
| GM09286 | 47,XY,+9 | 6 | 6 | 6 | 6 |
| GM10315 | 47,XX,+22 | 4 | 4 | 4 | 4 |
| GM10959 | 46,XX | 2 | 2 | 2 | 2 |
| segmental | | | | | |
| GM16260 | 46,X,(complex) | 8 | 8 | 8 | 6 |
| GM10995 | 46,XY,der(11)t(1;11) | 8 | 8 | 8 | 8 |
| GM10960 | 46,XY,t(1;11) (balanced reference) | 2 | 2 | 2 | 2 |
| GM04626 | 47,XXX,del(15q) | 12 | 10 | 9 | 9 |
| GM03808 | 46,XY,del(11) | 8 | 8 | 8 | 8 |
| GM13466 | 46,XY,del(7) | 8 | 8 | 8 | 8 |
| GM10958 | 46,XX,t(1;11) (balanced reference) | 8 | 8 | 8 | 8 |
| GM01555 | 47,XY,+der(13) | 8 | 8 | 8 | 7 |
| GM50192 | 46,XX,del(5)(:p13>qter) | 8 | 8 | 8 | 8 |
| GM15917 | 44,XY,(complex) | 8 | 7 | 7 | 5 |
| monogenic disorder | | | | | |
| GM07224 | 46,XX (ΔF508 normal) | 6 | 6 | 6 | 6 |
| GM07225 | 46,XY (ΔF508 carrier) | 6 | 6 | 6 | 6 |
| GM07226 | 46,XX (ΔF508 carrier) | 16 | 16 | 16 | 16 |
| GM07227 | 46,XY (ΔF508 carrier) | 16 | 16 | 15 | 15 |
| GM07228 | 46,XY (ΔF508 carrier) | 16 | 16 | 16 | 16 |
| GM07826 | 46,XY (ΔF508 affected) | 8 | 8 | 8 | 8 |
| polygenic disorder | | | | | |
| GM03359 | 46,XY (T1D normal) | 24 | 22 | 22 | 21 |
| GM03358 | 46,XY (T1D affected) | 24 | 24 | 24 | 24 |
| GM03360 | 46,XY (T1D affected) | 22 | 22 | 22 | 22 |
| GM03237 | 46,XY (T1D normal) | 16 | 16 | 16 | 16 |
| GM02637 | 46,XY (T1D affected) | 8 | 8 | 8 | 8 |
| GM02638 | 46,XX (T1D affected) | 8 | 8 | 8 | 8 |
| GM02639 | 46,XY (T1D normal) | 10 | 9 | 9 | 9 |
| Totals | | 446 | 427 | 424 | 417 |
| Rate | | | 95.7% | 99.3% | 97.7% |

Gene Titan, and UKBB Axiom arrays as recommended by the supplier (Affymetrix Inc, Santa Clara, CA).

Table 2. Embryo rebiopsies used to evaluate ePGT performance for whole chromosome aneuploidy.

| Embryo ID | Prior Karyotype | HBB E6V status | # tested | # with result | # concordant |
|---|---|---|---|---|---|
| 1-1 | 46,XX | normal | 5 | 5 | 5 |
| 1-2 | 46,XY,-14,+21 | normal | 4 | 3 | 3 |
| 2-1 | 47,XX,+4 | normal | 5 | 5 | 5 |
| 3-1 | 46,XX,+21,-22 | affected | 3 | 3 | 3 |
| 3-2 | 46,XY | affected | 3 | 3 | 3 |
| 4-1 | 46,XY | normal | 4 | 3 | 3 |
| 4-2 | 45,XY,-16 | normal | 4 | 4 | 4 |
| 4-3 | 46,XX | normal | 3 | 3 | 3 |
| 4-4 | 46,XX | normal | 3 | 3 | 3 |
| 4-5 | 46,XX | normal | 3 | 3 | 3 |
| 4-6 | 46,XX | normal | 2 | 2 | 2 |
| 4-7 | 46,XX | normal | 4 | 4 | 4 |
| 4-8 | 46,XX | normal | 2 | 2 | 2 |
| 5-1 | 46,XY | normal | 3 | 3 | 3 |
| Totals | | | 48 | 46 | 46 |
| Rate | | | | 95.8% | 100.0% |

2.4. PGT-A

Raw intensity CEL files were analyzed using Affymetrix Power Tools, which generate data used for quality control measures, probe-level copy number, and B-allele frequency estimates. After removing samples that failed best-practice "DishQC" thresholds, probe-level data were analyzed with gSUITE software (Genomic Prediction Inc). The resulting data include copy number estimates and B-allele frequencies for each chromosome, which were used to predict copy number alterations and to perform several genotyping-based analyses as described below. The ability to diagnose samples with mosaicism (a mixture of aneuploid and euploid cells) was also evaluated. DNA mixtures from 3 pairs of embryo biopsies (embryos 1-1:1-2, 3-1:3-2, and 4-1:4-2, Table 2) were made at a variety of ratios (1:1, 1:2, 1:3, 1:4, 2:1, 3:1, and 4:1), giving a specific percentage of aneuploidy within each sample (20, 25, 33, 50, 67, 75, and 80%). Each mixture was run in duplicate for each of the 3 embryo biopsy pairs and for each ratio. The percentage of samples diagnosed as aneuploid was calculated.

2.5. PGT-SR

less to avoid fluctuation in genetic composition associated with extended culture. Each cell line was cultured as recommended by the supplier and 5–8 cell aliquots were prepared using a microscope and 200 μm stripper tips (Origio, Denmark). Cells were loaded in 1 μl of loading buffer (Genomic Prediction Inc, North Brunswick, NJ) after washing in 1× PBS (Invitrogen, Carlsbad, CA). These cell lines were used to model cases of PGT-A, -SR, -M, and -P. In order to obtain an additional model for a polygenic disorder, 13 plasma samples with known hypothyroidism status were obtained from Discovery Life Sciences (Los Osos, CA) (Table 3). Forty-eight rebiopsies of discarded embryos were obtained under IRB approval and patient consent from Reproductive Medicine Associates of Connecticut and were evaluated for concordance with previous test results obtained from existing PGT reference laboratories (Table 2).

Segmental aneuploidy associated with unbalanced translocation inheritance was evaluated using several cell lines with known segmental imbalances. Coriell Cell Repository conventional G-banding, FISH, and/or array-based testing results were used as the reference methodology. In addition, sibling cell lines, where one sibling carried an unbalanced translocation and another sibling carried a balanced translocation, were used to validate the ability to distinguish whether a sample is a carrier of a balanced translocation or is truly normal, using linked informative SNPs based upon parental genotypes. Unbalanced embryos, which are known to carry one normal and one derivative chromosome, can be used as a reference to define informative SNPs. A sample that appears balanced or normal by copy number analysis is then classified by comparing its genotypes to the chromosomes in the unbalanced sample: it is normal if it matches the normal chromosome and mismatches the derivative chromosome, and it is a carrier if it matches the derivative chromosome and mismatches the normal chromosome. This strategy has been described in depth elsewhere (Treff et al., 2016).
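The matching logic described above can be illustrated with a minimal sketch. It is not the gSUITE implementation; it assumes, for illustration only, that the alleles on the normal and derivative chromosomes of the unbalanced reference have already been resolved per SNP, and that the allele the test sample inherited from the carrier parent is known. The function name, the SNP identifiers, and the simple majority rule are hypothetical.

```python
from typing import Dict, Tuple


def classify_against_unbalanced_reference(
    sample_alleles: Dict[str, str],
    normal_chromosome: Dict[str, str],
    derivative_chromosome: Dict[str, str],
) -> Tuple[str, int, int]:
    """Count informative SNPs that match each reference chromosome.

    All inputs map a (hypothetical) SNP name to a single allele.
    """
    normal_matches = 0
    derivative_matches = 0
    for snp, allele in sample_alleles.items():
        n = normal_chromosome.get(snp)
        d = derivative_chromosome.get(snp)
        # A SNP is informative only if the two reference chromosomes differ.
        if n is None or d is None or n == d:
            continue
        if allele == n:
            normal_matches += 1
        elif allele == d:
            derivative_matches += 1
    # Simple majority rule (an assumption of this sketch).
    if derivative_matches > normal_matches:
        call = "balanced carrier"
    elif normal_matches > derivative_matches:
        call = "normal"
    else:
        call = "inconclusive"
    return call, normal_matches, derivative_matches


normal_ref = {"snp1": "A", "snp2": "C", "snp3": "G", "snp4": "T"}
derivative_ref = {"snp1": "G", "snp2": "T", "snp3": "G", "snp4": "C"}
sample = {"snp1": "G", "snp2": "T", "snp3": "G", "snp4": "C"}
result = classify_against_unbalanced_reference(sample, normal_ref, derivative_ref)
```

In this toy input, snp3 is skipped because both reference chromosomes carry the same allele, and the remaining three SNPs all match the derivative chromosome.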

2.3. Genetic analyses

Samples were amplified in 0.2 ml PCR tubes (USA Scientific Inc, Ocala, FL) using the ePGT Kit, as recommended by the supplier (Genomic Prediction Inc.) and using a 2720 thermal cycler (ThermoFisher Inc., Foster City, CA). Amplified DNA was quantified using a NanoDrop 8000 (ThermoFisher). Samples were normalized to 50 ng/μl in a 10 μl volume and processed on the Nimbus 2000, Axiom

2.6. PGT-M

Intensity files from parents, reference sibling or grandparents, and cells from offspring were placed through an initial genotyping step with software from Affymetrix. A custom linkage analysis protocol (gSUITE, Genomic Prediction Inc.) was used to determine the most reliable sites within a 1 million base pair window on each side of the locus of interest. Linkage was then used to determine whether each embryo inherited the mutation. For example, if the reference sibling was a carrier, each sample's diagnosis was given as 'carrier', 'affected' or 'unaffected' based upon genotype similarities. The linkage analysis also allows for haplotypes to be computed in the surrounding window, and each sample's parental haplotypes were used for Mendelian inheritance analysis.
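As a rough illustration of the window-based site selection described above, the sketch below keeps SNPs that fall within 1 Mb on either side of a locus and that are heterozygous in the carrier parent and homozygous in the other parent. That definition of "informative" is a common convention and an assumption here; the source does not spell out the criteria used by gSUITE. The class, function names, and coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snp:
    name: str            # hypothetical identifier
    position: int        # base-pair coordinate on the chromosome of interest
    carrier_parent: str  # two-letter genotype, e.g. "AG"
    other_parent: str    # two-letter genotype, e.g. "AA"


def is_heterozygous(genotype: str) -> bool:
    return len(genotype) == 2 and genotype[0] != genotype[1]


def is_homozygous(genotype: str) -> bool:
    return len(genotype) == 2 and genotype[0] == genotype[1]


def informative_snps(
    snps: List[Snp], locus_start: int, locus_end: int, flank: int = 1_000_000
) -> List[Snp]:
    """Return SNPs inside the flanking window that can track the carrier parent's haplotypes."""
    window_start = locus_start - flank
    window_end = locus_end + flank
    return [
        s
        for s in snps
        if window_start <= s.position <= window_end
        and is_heterozygous(s.carrier_parent)
        and is_homozygous(s.other_parent)
    ]


candidates = [
    Snp("snp1", 4_100_000, "AG", "AA"),  # in window, informative
    Snp("snp2", 3_900_000, "AG", "AA"),  # outside the 1 Mb flank
    Snp("snp3", 5_900_000, "CC", "CT"),  # carrier parent is homozygous
    Snp("snp4", 6_150_000, "CT", "TT"),  # in window, informative
]
selected = [s.name for s in informative_snps(candidates, 5_000_000, 5_200_000)]
```

With a hypothetical locus spanning 5.0 to 5.2 Mb, only snp1 and snp4 survive the filter.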

Table 3. Samples used to evaluate ePGT performance for polygenic risk prediction.

| Plasma Sample | Age | Sex | Hypothyroidism |
|---|---|---|---|
| KH16-02209 | 74 | F | Y |
| KH16-02212 | 89 | F | Y |
| KH16-02215 | 81 | F | Y |
| KH16-02220 | 54 | F | Y |
| KH16-02222 | 84 | F | Y |
| KH16-02227 | 69 | F | Y |
| KH16-02469 | 84 | F | Y |
| KH16-02470 | 74 | F | Y |
| KH16-02472 | 86 | F | Y |
| KH16-02474 | 71 | F | Y |
| KH16-03031 | 87 | F | Y |
| KH16-03042 | 74 | F | Y |
| KH16-03044 | 66 | F | Y |

2.7. TaqMan

In some cases, a trio (i.e. mother, father, and an existing child) is not available to determine informative linked markers. In these cases, it is possible to perform linkage "on-the-fly" by directly testing the mutation

in embryos produced during IVF and using the data generated as a reference for making predictions. In order to do this, real-time quantitative (q)PCR was performed on the excess DNA produced by the initial amplification step. This situation was modeled with the same sample set described above for PGT-M and with multiple embryo rebiopsies. A TaqMan qPCR assay was designed to determine genotypes at the ΔF508 locus (rs113993960) for the cell lines, and at the E6V HBB mutation (rs334) for the embryo rebiopsies. Aliquots of the amplified DNA were subjected to standard TaqMan allelic discrimination. Genotypes were then produced and used to determine informative linked SNPs for final diagnosis of the cell lines/embryos.

case of hypothyroidism, 40 controls and 13 cases were used. In the case of type 1 diabetes, 32 controls and 4 cases were used. Random sampling (Elfil and Negida, 2017) of replicates was used to evaluate accuracy of the predictions.

3. Results

3.1. PGT-A

Whole chromosome aneuploidy was predicted in 427 of 446 (95.7%) cell line samples (5–8 cells) tested, with 424 of 427 (99.3%) samples giving full concordance with prior karyotyping results (Table 1). Examples are shown in Fig. 1A. In one case, a false negative was observed (GM01359, called euploid but expected 47,XY,+18), and in two cases there was a false positive aneuploid chromosome (GM07227 and GM04626, incorrectly called trisomy 6 and 3, respectively). A result was obtained in 46 of 48 (95.8%) embryo rebiopsies tested, with 46 of 46 (100%) giving full concordance with prior trophectoderm PGT-A results (Table 2). Examples are shown in Fig. 1B. Evaluation of segmental aneuploidy across all cell line samples, to model the detection of de novo inheritance, gave an accuracy of 97.7% (417/427) with a resolution of detection of 10 Mb (Table 1). Cell lines without segmental aneuploidy (Table 1) were used as negative controls. Overall copy number analyses for segmental imbalances are also shown in Fig. 2 and illustrate the ability to distinguish partial chromosome imbalance at or above 10 Mb. To evaluate the ability of ePGT to detect aneuploidy in a mosaic sample, several mixtures were tested. Samples with 33–50% aneuploidy were diagnosed as aneuploid 20% of the time, while samples with 67% aneuploidy or more were diagnosed as aneuploid 100% of the time (Fig. 3).
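For reference, the aneuploidy percentages of the mosaic mixtures follow directly from the mixing ratios given in Section 2.4. The short sketch below reproduces that arithmetic; the function name is illustrative, and each ratio is written here as aneuploid parts to euploid parts.

```python
from typing import List, Tuple


def percent_aneuploid(aneuploid_parts: int, euploid_parts: int) -> int:
    """Percentage of aneuploid DNA in a mixture, rounded to the nearest whole percent."""
    return round(100 * aneuploid_parts / (aneuploid_parts + euploid_parts))


# Mixing ratios from Section 2.4, written as (aneuploid, euploid) parts.
ratios: List[Tuple[int, int]] = [(1, 4), (1, 3), (1, 2), (1, 1), (2, 1), (3, 1), (4, 1)]
levels = [percent_aneuploid(a, e) for a, e in ratios]
print(levels)  # [20, 25, 33, 50, 67, 75, 80]
```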

2.8. Genotyping-based quality control

In cases where sample identity may be of interest in laboratory quality control, it may be useful to determine the similarity of samples within a cohort of embryos, or with respect to an ongoing pregnancy. The ability to distinguish the relationship between samples (i.e. self, sibling, or unrelated) was investigated in several embryo rebiopsies with known status. Genotype similarities were obtained and used to define each category of relatedness.

2.9. PGT-P

Polygenic risk scores were constructed using the 2018 release of the UK BioBank (UKBB) (Sudlow et al., 2015). Training was performed on genetically British individuals based upon UKBB principal component analysis, with filtering for allele frequency (0.1% minor allele frequency threshold) and a 3% threshold for missingness in individuals and SNPs. Variables such as age and sex were excluded from the training analyses. Predictors were constructed using L1-penalized regression (LASSO) (Chow et al., 2014). Performance was evaluated using the area under the receiver operating characteristic curve (AUC) (Hajian-Tilaki, 2013), which quantifies how well a model discriminates cases from controls; the curve plots the true positive rate against the false positive rate. In addition, Cohen's d effect size was used to indicate the standardized difference between two means (Lakens, 2013). It can be used, for example, to accompany reporting of t-test and ANOVA results, and it is also widely used in meta-analysis. A training set of 397,992 individuals, including 21,017 with hypothyroidism, was used to develop a SNP-based polygenic risk predictor of hypothyroidism as previously described (Lello et al., 2018). Both an out-of-sample UKBB dataset, including 40,030 individuals (2,152 with hypothyroidism), and an out-of-sample eMERGE cohort dataset (Malinowski et al., 2014), including 4,225 individuals (1,084 with hypothyroidism), were then used to evaluate the validity of this hypothyroidism predictor.
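A predictor produced by L1-penalized regression is a sparse linear model, so scoring a genotype amounts to a weighted sum over the SNPs that received non-zero weights. The sketch below illustrates only that scoring step. It assumes genotypes are coded as effect-allele dosages (0, 1, or 2) and, as a simplification of this sketch, treats a missing genotype as dosage 0. The SNP names and weights are invented for illustration and are not taken from the predictors in this study.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages over the SNPs in the predictor.

    Missing genotypes contribute 0 (an assumption of this sketch).
    """
    return sum(weight * dosages.get(snp, 0) for snp, weight in weights.items())


# Hypothetical three-SNP predictor and one genotyped sample.
example_weights = {"snp1": 0.5, "snp2": -0.25, "snp3": 0.125}
example_dosages = {"snp1": 2, "snp2": 1}  # snp3 was not genotyped
score = polygenic_risk_score(example_dosages, example_weights)
print(score)  # 0.75
```

Because the score is a continuous quantity, it can be reported directly or compared across samples, which is how the ranking analyses later in the paper use it.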
In addition, 401,845 individuals from the UKBB, including 2,734 with Type 1 Diabetes (T1D), were used to develop a SNP-based polygenic risk predictor of T1D. Both an out-of-training UKBB dataset, including 57,294 individuals (388 with T1D), and an out-of-sample dataset from T1DBASE (Burren et al., 2011), including 6,787 individuals, were then used to evaluate the validity of this T1D predictor. In all cases, it is possible that individuals with rare Mendelian forms of the diseases (e.g. Alström Syndrome) were included in the analyses. Polygenic predictor performance would have improved with their removal from either the training or validation sets. In order to evaluate the ability to rank siblings where one was affected and one was healthy, a second UKBB cohort was held out from training on each phenotype, consisting of 260 sibling pairs for T1D and 1,127 sibling pairs for hypothyroidism. In addition, a second T1D sibling cohort (1,080 pairs) was obtained from T1DBASE (Burren et al., 2011). In order to evaluate performance of each predictor on data generated by the current ePGT method, several samples were obtained. Samples with known hypothyroidism (Table 3) and T1D (Table 1) were compared to Coriell cell lines without any known phenotype. In the

3.2. PGT-M

Before evaluating the ability to use linked informative markers to predict monogenic mutation inheritance, the overall accuracy of genotype data from small quantities of DNA was investigated. Results indicate equivalent accuracy (p = 0.2) from small quantities of DNA (i.e. 5–8 cells) compared to large quantities when incorporating imputation from parental genotypes (Fig. 4). Linkage-based analysis of samples from a family with cystic fibrosis ΔF508 gave 100% concordance with expected genotypes. The number of informative SNPs within a 2 Mb window around genes associated with commonly tested monogenic disorders (CFTR, TSC, FMR1, HBB, HBA1/HBA2, BRCA1 and BRCA2, SMN1, and NF1) was 20–537, indicating universal capability for predicting monogenic disease (i.e. without having to develop custom probes for each case). As a quality control methodology, the ability to distinguish self, sibling, and unrelated relationships across multiple embryo rebiopsies was investigated. Genotype similarities in each of these 3 categories were clearly distinguishable (Fig. 5), demonstrating the utility of simultaneous and accurate analysis of both DNA copy number and SNP genotypes when identity is of interest. Direct mutation testing in parallel with linkage-based analysis was evaluated as a strategy to perform testing without an existing trio (i.e. linkage on-the-fly). Results of both array-based (Fig. 6A) and qPCR-based analysis (Fig. 6B) were sufficient to make predictions with 100% concordance to expected results when 3 or more siblings (i.e. 3 embryos) are available for testing. In addition, qPCR-based testing of the HBB E6V mutation in embryo rebiopsies gave 100% concordance with expected results, with a 97.9% (47 of 48) reliability of obtaining a result (Fig. 6C).

3.3. PGT-SR

The results of evaluating de novo segmental imbalance also apply to

Fig. 1. PGT-A Performance. Example copy number plots from ePGT analysis of 5–8 cell aliquots of several cell lines with known karyotypes (A) and rebiopsies of embryos with prior PGT-A test results (B). Red bars indicate chromosomes identified as aneuploid. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

cases involving structural rearrangements (Table 1, Fig. 2). In addition, the ability to distinguish balanced carriers from normal embryos using genotyping data was investigated. Eight replicates of each of two samples, one unbalanced [46,XY,der(11)t(1;11)(q31;q25)] and one balanced [46,XX,t(1;11)(1pter>1q31.2::11q25>11qter;11pter>11q25::1q31.2>1qter)], were evaluated. Genotypes near the breakpoint in chromosome 1 and chromosome 11 were compared between the two samples. In all cases (64/64), the balanced translocation sample

was a mismatch for chromosome 1 (normal in the unbalanced sample) and a match for chromosome 11 (derivative chromosome in the unbalanced sample) (Fig. 7), as expected.

3.4. PGT-P

Using the UKBB training sets, 7,311 SNPs were identified as a predictor for hypothyroidism, and 82 SNPs were identified as a predictor

Fig. 2. Segmental Aneuploidy Performance. Copy number analysis of 5–8 cell samples with known segmental imbalances. Results are plotted based on the known size of each imbalance, both inside and outside the imbalanced region on the same chromosome. A copy number of 2 was expected outside the imbalance, and either 1 copy for losses (A) or 3 copies for gains (B) was expected inside it.

for T1D. Overall performance is described in Table 4 and illustrated in Fig. 8. Out-of-sample UKBB cohort predictions gave an AUC of 0.71 and 0.68, and a Cohen's d of 0.78 (p < 1e-200) and 0.86 (p < 1e-200) for hypothyroidism and type 1 diabetes, respectively. The eMERGE dataset

gave an AUC of 0.64 and a Cohen's d of 0.47 (p < 1.5e-187) for hypothyroidism. The T1DBASE dataset gave an AUC of 0.78 and a Cohen's d of 0.75 (p = 9.37e-99) for T1D. The UKBB sibling cohort Cohen's d was 0.43 for hypothyroidism (p = 2.15e-138) and 0.86 for T1D

Fig. 3. Mosaic Aneuploidy Performance. Examples of copy number plots of samples (DNA mixtures from embryos 1-1 and 1-2) with known levels of aneuploidy (percentage in graphs) (A). Summary of percentage of mixture samples called as aneuploid using ePGT (B).

Fig. 4. Genotyping Performance. Genotype concordance of either large quantities of DNA without amplification, or 5–8 cell aliquots after amplification with or without imputation, compared to the same cell line's reference genotypes.

Fig. 5. Relationship Prediction Performance. Genotype concordance of multiple rebiopsies from the same (self), sibling, or unrelated embryos (A). A heat map illustrating the genotype concordance of each individual biopsy and the ability to distinguish between self, sibling, and unrelated relationships (B).

Fig. 6. Direct Mutation Testing Performance. Genotyping results from 5–8 cell aliquots of cell lines with known ΔF508 status using either Affymetrix Axiom array analysis (A) or TaqMan allelic discrimination (B), and from multiple embryo rebiopsies with known HBB E6V status using TaqMan allelic discrimination (C).
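The genotype concordance used for relationship prediction can be thought of as the fraction of SNPs, called in both samples, at which the two genotype calls agree. The sketch below shows one simple way to compute such a value; it is an assumption about the general form of the metric rather than the exact measure used by gSUITE. Genotypes are coded as 0, 1, or 2, with None marking a no-call, and no classification thresholds are implied.

```python
from typing import List, Optional


def genotype_concordance(a: List[Optional[int]], b: List[Optional[int]]) -> float:
    """Fraction of SNPs with identical calls, ignoring SNPs missing in either sample."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no SNPs were called in both samples")
    matches = sum(1 for x, y in shared if x == y)
    return matches / len(shared)


biopsy_1 = [0, 1, 2, 1, None]
biopsy_2 = [0, 1, 1, 1, 2]
concordance = genotype_concordance(biopsy_1, biopsy_2)
print(concordance)  # 0.75
```

Rebiopsies of the same embryo would be expected to give values near 1, with siblings lower and unrelated embryos lower still, as summarized in Fig. 5.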

(p = 1.92e-9) and for sibling pairs where 1 sibling was a case and another was a control, the case sibling had a higher PRS 75.05% of the time. Samples processed by ePGT for hypothyroidism and T1D gave an AUC of 0.67 and 0.73, and a Cohen's d of 0.6 (p = 5.93e-6) and 1.27 (p = 1.2e-4), for these smaller cohorts of n = 53 (13 cases) and n = 36 (4 cases) for hypothyroidism and T1D, respectively, using the procedure and set of samples described at the end of section 2.9. It is possible to predict case or control status of this cohort with high accuracy, in part because there are more controls than cases in the samples. However, in our application of ePGT, a numerical risk score is reported for each genotype, as opposed to an actual case or control prediction.

4. Discussion

Expanded PGT represents the first method capable of simultaneous and accurate prediction of aneuploidy, structural rearrangements, and monogenic and polygenic disorders. Given the excellent performance of polygenic risk scores when applied to adults with known disease status, and the ability to obtain equivalent genotyping accuracy in an embryo, one can equate performance of disease risk prediction in the embryo to prediction of disease status in adults with proven disease. In addition, with the ability to obtain genotypes in the embryo equivalent in accuracy to genotyping in adults, it is possible to further expand testing to additional polygenic disorders as new predictors become available. In addition to Type 1 Diabetes and Hypothyroidism, recent studies indicate current applicability to several common genetic diseases including Breast Cancer, Prostate Cancer, Testicular Cancer, Basal Cell Carcinoma, Malignant Melanoma, Inflammatory Bowel Disease, and Heart Attack (Lello et al., 2019). Similar validation strategies can be employed, involving the evaluation of UKBB out-of-sample cohort and sibling test performance prior to ePGT utilization.
Several new datasets are under development, such as the Million Veteran Program in the United States (https://www.research.va.gov/mvp/), FinnGen in Finland (https://www.finngen.fi), and the 100,000 Genomes Project in the United Kingdom (https://www.genomicsengland.co.uk), and may provide additional predictors in the near future. This technology expands the application of PGT to include polygenic disorders for the first time and will require expansion of genetic counseling efforts to support its clinical utilization. The American

Fig. 7. Identification of Balanced Translocation Carrier Status. Genotype concordance between a balanced translocation carrier and a sibling with an unbalanced karyotype using ePGT from 5–8 cell aliquots from corresponding cell lines. As expected, the balanced translocation informative SNP genotypes match the derivative chromosome but mismatch the normal chromosome from the unbalanced sibling sample.

Table 4. Statistical analyses of performance when using a 7,311-SNP predictor of hypothyroidism and an 82-SNP predictor of T1D.

| Cohort | n (cases) | AUC | Cohen's d | p-value |
|---|---|---|---|---|
| hypothyroidism | | | | |
| UKBB training set | 285,211 (30,562) | – | – | – |
| UKBB out-of-sample | 40,030 (2,152) | 0.71 | 0.78 | <1e−200 |
| eMERGE | 4,255 (1,084) | 0.64 | 0.47 | 1.49e−187 |
| UKBB siblings | 2,254 (1,127) | – | 0.43 | 2.15e−138 |
| Discovery Life Science | 53 (13) | 0.67 | 0.6 | 5.93e−6 |
| T1D | | | | |
| UKBB training set | 401,845 (2,734) | – | – | – |
| UKBB out-of-sample | 57,294 (388) | 0.68 | 0.86 | <1e−200 |
| T1DBASE | | 0.78 | 0.75 | 9.37e−99 |
| UKBB siblings | 520 (260) | – | 0.86 | 1.92e−9 |
| Coriell Cell Repository | 36 (4) | 0.73 | 1.27 | 1.2e−4 |
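The two summary statistics reported in Table 4 can be computed from case and control scores as sketched below. This is a generic illustration, not the authors' analysis code: the AUC is computed in its rank-based form (the probability that a randomly chosen case scores higher than a randomly chosen control, counting ties as one half), and Cohen's d uses the pooled sample standard deviation. The toy scores are invented.

```python
from math import sqrt
from statistics import mean, variance
from typing import List


def auc(case_scores: List[float], control_scores: List[float]) -> float:
    """Rank-based AUC: P(case score > control score), with ties counted as 0.5."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def cohens_d(case_scores: List[float], control_scores: List[float]) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(case_scores), len(control_scores)
    pooled_variance = (
        (n1 - 1) * variance(case_scores) + (n2 - 1) * variance(control_scores)
    ) / (n1 + n2 - 2)
    return (mean(case_scores) - mean(control_scores)) / sqrt(pooled_variance)


toy_cases = [2.0, 3.0, 4.0]
toy_controls = [1.0, 2.0, 3.0]
toy_auc = auc(toy_cases, toy_controls)     # 7 of 9 pairwise comparisons
toy_d = cohens_d(toy_cases, toy_controls)  # mean difference 1, pooled SD 1
```

The pairwise loop is quadratic in cohort size, which is fine for a toy example but would be replaced by a rank-sum calculation for biobank-scale data.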

Fig. 8. Polygenic Risk Score Performance. Histogram plots (polygenic risk score versus probability density) for hypothyroidism; UKBB out-of-sample dataset (A), eMERGE dataset (B), UKBB sibling cohort (C), and the Discovery Life Science ePGT dataset (D): and for T1D UKBB out-of-sample dataset (E), T1DBASE dataset (F), UKBB sibling cohort (G), and the Coriell Cell Repository ePGT dataset (H). Blue = control, Orange = case, Purple = both. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
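The sibling comparison reported in Section 3.4, where the affected sibling had the higher score in roughly three quarters of discordant pairs, corresponds to a simple pairwise fraction. The sketch below shows that calculation on invented scores; how ties were handled in the study is not stated, so here a tie simply does not count as the case ranking higher.

```python
from typing import List, Tuple


def fraction_case_ranked_higher(pairs: List[Tuple[float, float]]) -> float:
    """Fraction of (case score, control score) sibling pairs in which the case scores higher."""
    if not pairs:
        raise ValueError("at least one sibling pair is required")
    return sum(1 for case, control in pairs if case > control) / len(pairs)


# Hypothetical polygenic risk scores for four discordant sibling pairs.
toy_pairs = [(1.2, 0.3), (0.1, 0.4), (0.9, 0.2), (0.5, 0.1)]
toy_fraction = fraction_case_ranked_higher(toy_pairs)
print(toy_fraction)  # 0.75
```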

Society for Reproductive Medicine has already indicated that PGT for adult onset disorders of lesser severity and lower penetrance is an ethical medical intervention given the argument of reproductive liberty (ASRM-Practice-Committee, 2018). Given the well-established increased risk of disease in the infertile population, application of ePGT may be even better suited to the IVF setting. Furthermore, in families with a history of a polygenic disorder, elective IVF and genetic testing may also be suitable, as well as for couples utilizing oocyte or sperm donation, where several embryos are typically produced during treatment. This additional testing may offer patients an additional measure of embryo health that may be used to prioritize available embryos for transfer. As with all currently available genetic testing (PGT-A, PGT-M, PGT-SR), the utility is in risk reduction, not elimination. The available alternative is to proceed without additional risk scoring, and to select embryos based upon morphology alone. The ability to predict risk of a genetic disorder in the embryo may, in cases without an opportunity for selection, also provide actionable information for more careful long-term monitoring and ultimately improved clinical care for individuals at risk of, or eventually affected by, polygenic conditions (Bloss et al., 2011).

Acknowledgments

The authors would like to thank the embryology team at Reproductive Medicine Associates of Connecticut and the technical and customer support teams at ThermoFisher Scientific. This study has been performed with UK Biobank data under application 15326.

References

ASRM-Practie-Committee, 2018. Use of preimplantation genetic testing for monogenic defects (PGT-M) for adult-onset conditions: an Ethics Committee opinion. Fertil. Steril. 109 (6), 989–992.
Barness, L.A., 1996. 1995 In: seventh ed. In: Scriver, C.R., Beaudet, A.L., Sly, W.S., Valle, D. (Eds.), The Metabolic and Molecular Bases of Inherited Disease, vol 3. McGraw Hill, New York, pp. 4605 66(1), 87-87.
Bloss, C.S., Madlensky, L., Schork, N.J., Topol, E.J., 2011. Genomic information as a behavioral health intervention: can it work? Pers. Med. 8 (6), 659–667.
Burren, O.S., Adlem, E.C., Achuthan, P., Christensen, M., Coulson, R.M., Todd, J.A., 2011. T1DBase: update 2011, organization and presentation of large-scale data sets for type 1 diabetes research. Nucleic Acids Res. 39, D997–D1001 (Database issue).
Chow, C.C., Vattikuti, S., Lee, J.J., Chang, C.C., Hsu, S.D.H., 2014. Applying compressed sensing to genome-wide association studies. GigaScience 3 (1).
Elfil, M., Negida, A., 2017. Sampling methods in clinical research; an educational review. Emergency 5 (1) e52-e52.
Forman, E.J., Hong, K.H., Ferry, K.M., Tao, X., Taylor, D., Levy, B., Treff, N.R., Scott Jr., R.T., 2013. In vitro fertilization with single euploid blastocyst transfer: a randomized controlled trial. Fertil. Steril. 100 (1), 100–107 e101.
Global-Burden-of-Disease-Study, 2016. Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet 388 (10053), 1659–1724.
Hajian-Tilaki, K., 2013. Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation. Caspian J. Intern. Med. 4 (2), 627–635.
Handyside, A.H., Lesko, J.G., Tarin, J.J., Winston, R.M., Hughes, M.R., 1992. Birth of a normal girl after in vitro fertilization and preimplantation diagnostic testing for cystic fibrosis. N. Engl. J. Med. 327 (13), 905–909.
Iews, M., Tan, J., Taskin, O., Alfaraj, S., AbdelHafez, F.F., Abdellah, A.H., Bedaiwy, M.A., 2018. Does preimplantation genetic diagnosis improve reproductive outcome in couples with recurrent pregnancy loss owing to structural chromosomal rearrangement? A systematic review. Reprod. Biomed. Online 36 (6), 677–685.
Khera, A.V., Chaffin, M., Aragam, K.G., Haas, M.E., Roselli, C., Choi, S.H., Natarajan, P., Lander, E.S., Lubitz, S.A., Ellinor, P.T., Kathiresan, S., 2018. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50 (9), 1219–1224.
Lakens, D., 2013. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front. Psychol. 4 863-863.
Lello, L., Avery, S.G., Tellier, L., Vazquez, A.I., de Los Campos, G., Hsu, S.D.H., 2018. Accurate genomic prediction of human height. Genetics 210 (2), 477–497.
Lello, L., Raben, T., Yong, S., Tellier, L., Hsu, S., 2019. Genomic Prediction of Complex Disease Risk. BioRxIV.
Malinowski, J.R., Denny, J.C., Bielinski, S.J., Basford, M.A., Bradford, Y., Peissig, P.L., Carrell, D., Crosslin, D.R., Pathak, J., Rasmussen, L., Pacheco, J., Kho, A., Newton, K.M., Li, R., Kullo, I.J., Chute, C.G., Chisholm, R.L., Jarvik, G.P., Larson, E.B., McCarty, C.A., Masys, D.R., Roden, D.M., de Andrade, M., Ritchie, M.D., Crawford, D.C., 2014. Genetic variants associated with serum thyroid stimulating hormone (TSH) levels in European Americans and African Americans from the eMERGE Network. PLoS One 9 (12), e111301.
Neal, S.A., Franasiak, J.M., Forman, E.J., Werner, M.D., Morin, S.J., Tao, X., Treff, N.R., Scott Jr., R.T., 2017. High relative deoxyribonucleic acid content of trophectoderm biopsy adversely affects pregnancy outcomes. Fertil. Steril. 107 (3), 731–736 e731.
Schulman, J.D., Edwards, R.G., 1996. Preimplantation diagnosis in disease control, not eugenics. Hum. Reprod. (Oxf.) 11 (3), 463–464.
Scott Jr., R.T., Upham, K.M., Forman, E.J., Hong, K.H., Scott, K.L., Taylor, D., Tao, X., Treff, N.R., 2013. Blastocyst biopsy with comprehensive chromosome screening and fresh embryo transfer significantly increases in vitro fertilization implantation and delivery rates: a randomized controlled trial. Fertil. Steril. 100 (3), 697–703.
Sudlow, C., Gallacher, J., Allen, N., Beral, V., Burton, P., Danesh, J., Downey, P., Elliott, P., Green, J., Landray, M., Liu, B., Matthews, P., Ong, G., Pell, J., Silman, A., Young, A., Sprosen, T., Peakman, T., Collins, R., 2015. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12 (3), e1001779.
Treff, N.R., Thompson, K., Rafizadeh, M., Chow, M., Morrison, L., Tao, X., Garnsey, H., Reda, C.V., Metzgar, T.L., Neal, S., Jalas, C., Scott Jr., R.T., Forman, E.J., 2016. SNP array-based analyses of unbalanced embryos as a reference to distinguish between balanced translocation carrier and normal blastocysts. J. Assist. Reprod. Genet. 33 (8), 1115–1119.
Yang, Z., Liu, J., Collins, G.S., Salem, S.A., Liu, X., Lyle, S.S., Peck, A.C., Sills, E.S., Salem, R.D., 2012. Selection of single blastocysts for fresh transfer via standard morphology assessment alone and with array CGH for good prognosis IVF patients: results from a randomized pilot study. Mol. Cytogenet. 5 (1), 24.
