A clinical model predicting the risk of esophageal high-grade lesions in opportunistic screening: a multicenter real-world study in China


Journal Pre-proof

Zhen Liu, PhD, Chuanhai Guo, BS, Yujie He, BS, Yun Chen, MD, Ping Ji, PhD, Zhengyu Fang, PhD, Fenglei Li, BS, Yuefei Tang, BS, Xiujian Chen, BS, Ping Xiao, MD, Chengwen Wang, MD, Weihua Yin, MD, Hai Guo, MD, Mengfei Liu, PhD, Yaqi Pan, BS, Fangfang Liu, PhD, Ying Liu, PhD, Zhonghu He, PhD, Yang Ke, MD

PII: S0016-5107(19)32580-5
DOI: https://doi.org/10.1016/j.gie.2019.12.038
Reference: YMGE 11902
To appear in: Gastrointestinal Endoscopy
Received Date: 9 September 2019
Accepted Date: 22 December 2019

Please cite this article as: Liu Z, Guo C, He Y, Chen Y, Ji P, Fang Z, Li F, Tang Y, Chen X, Xiao P, Wang C, Yin W, Guo H, Liu M, Pan Y, Liu F, Liu Y, He Z, Ke Y, A clinical model predicting the risk of esophageal high-grade lesions in opportunistic screening: a multicenter real-world study in China, Gastrointestinal Endoscopy (2020), doi: https://doi.org/10.1016/j.gie.2019.12.038. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. Copyright © 2019 by the American Society for Gastrointestinal Endoscopy

TITLE PAGE A clinical model predicting the risk of esophageal high-grade lesions in opportunistic screening: a multicenter real-world study in China

Short title: A clinical prediction model for esophageal high-grade lesions

Zhen Liu1# PhD, Chuanhai Guo1# BS, Yujie He2# BS, Yun Chen3,4# MD, Ping Ji5# PhD, Zhengyu Fang5# PhD, Fenglei Li6 BS, Yuefei Tang2 BS, Xiujian Chen7 BS, Ping Xiao5 MD, Chengwen Wang8 MD, Weihua Yin9 MD, Hai Guo8 MD, Mengfei Liu1 PhD, Yaqi Pan1 BS, Fangfang Liu1 PhD, Ying Liu1 PhD, Zhonghu He1* PhD, Yang Ke1* MD

1 Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Laboratory of Genetics, Peking University Cancer Hospital & Institute, Beijing, P.R. China

2 Endoscopy Center, Hua County People’s Hospital, Henan Province, P.R. China

3 Department of Ultrasound, Peking University Shenzhen Hospital, Shenzhen, Guangdong Province, P.R. China

4 Shenzhen Key Laboratory for Drug Addiction and Medication Safety, Shenzhen Peking University-Hong Kong University of Science and Technology Medical Center, Shenzhen, Guangdong Province, P.R. China

5 Clinical Research Institute, Shenzhen Peking University-Hong Kong University of Science and Technology Medical Center, Shenzhen, Guangdong Province, P.R. China

6 Hua County People’s Hospital, Henan Province, P.R. China

7 Department of Pathology, Hua County People’s Hospital, Henan Province, P.R. China

8 Endoscope Group, Department of Gastroenterology, Peking University Shenzhen Hospital, Shenzhen, Guangdong Province, P.R. China

9 Department of Pathology, Peking University Shenzhen Hospital, Shenzhen, Guangdong Province, P.R. China

# These authors contributed equally to this paper
* Authors for correspondence

Yang Ke: Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Laboratory of Genetics, Peking University Cancer Hospital & Institute, No. 52 Fucheng Road, Haidian District, Beijing 100142, PR China. Phone: 86-10-88196762; E-mail: [email protected]

Zhonghu He: Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Laboratory of Genetics, Peking University Cancer Hospital & Institute, No. 52 Fucheng Road, Haidian District, Beijing 100142, PR China. Phone: 86-10-88196762; E-mail: [email protected]

Grant support: This work was supported by the Sanming Project of Shenzhen (SZSM201612061), the National Key R&D Program of China [grant number 2016YFC0901404] and the Digestive Medical Coordinated Development Center of Beijing Municipal Administration of Hospitals [grant number XXZ0204].

Disclosures: There are no conflicts of interest with regard to publication of this study.

Writing assistance: We would like to thank Dr. Michael A. McNutt for editing this manuscript; he has no connection with the funding source of the study.

Author contributions: Conception and design, Z.H.H., Y.K.; Analysis and interpretation of the data, Z.L., C.H.G., Y.J.H., Y.C., P.J., Z.Y.F., F.L.L., Y.F.T., X.J.C., P.X., C.W.W., W.H.Y., H.G., M.F.L., Y.Q.P., F.F.L., Y.L.; Drafting of the article, Z.L., Z.H.H., Y.K.; Critical revision of the article for important intellectual content, Z.H.H., Y.K.; Final approval of the article, Z.H.H., Y.K. Each author has reviewed and approved the final draft submitted.


ABSTRACT

Background and Aims: Prediction models for esophageal squamous cell carcinoma are uncommon, and no model targeting a clinical population has previously been developed and validated. We aimed to develop a prediction model for estimating the risk of high-grade esophageal lesions for application in clinical settings, and to validate the performance of this model in an external population.

Methods: This model was developed based on results of endoscopic evaluation of 5,624 outpatients in one hospital in a high-risk region in northern China, and was validated using 5,765 outpatients who had undergone endoscopy in another hospital in a non-high-risk region in southern China. Predictors were selected with unconditional logistic regression analysis. The Akaike information criterion was used to determine the final model structure. Discrimination was estimated using the area under the receiver operating characteristic curve (AUC). Calibration was assessed using a calibration plot with an intercept and slope.

Results: The final prediction model contained 5 variables: age, smoking, BMI, dysphagia, and retrosternal pain. This model generated an AUC of 0.871 (95% CI, 0.842-0.946) in the development set, with an AUC of 0.862 after bootstrapping. The 5-variable model was superior to a single age model. In the validation population, the AUC was 0.843 (95% CI, 0.793-0.894). This model successfully stratified the clinical population into 3 risk groups and showed high ability for identifying concentrated groups of cases.

Conclusions: Our model for esophageal high-grade lesions has a high predictive value. It has the potential for application in clinical opportunistic screening to aid decision-making for both healthcare professionals and individuals.

Keywords: Esophageal squamous cell carcinoma; Prediction model; Risk stratification; Opportunistic screening

INTRODUCTION

Esophageal cancer (EC) is among the most common cancers in the world [1], and there is striking variation in the distribution of EC over different geographic regions. In China, EC ranks as the third most frequent cancer and the fourth leading cause of cancer death [2], and these Chinese cases account for about 55% of EC cases in the world. There are 2 main histopathologic types of EC, namely esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC), and the former represents more than 90% of EC cases in China. Upper GI endoscopy is the criterion standard for diagnosis of ESCC as well as its precursor lesions, such as esophageal squamous dysplasia and carcinoma in situ (CIS) [3-5]. Early detection and treatment of esophageal lesions can greatly improve the prognosis [6] and reduce mortality [7-9].

Organized screening programs have been initiated in high-prevalence regions of ESCC in China. In contrast to organized screening programs that require central management and monitoring, opportunistic screening, which refers to detection of lesions in people who present to healthcare professionals for various complaints, should be the main screening modality in clinical settings, especially for regions without an organized screening program. Endoscopy is an invasive process that has a high cost and potential for significant side effects. When physicians performing endoscopy do not have specific expectations regarding risk in a given patient before the endoscopy, diagnosis of esophageal lesions may be missed. This is especially true in clinical settings where high-sensitivity detection methods (eg, iodine staining or narrow-band imaging [NBI]) are not routinely used.

Risk prediction modeling provides tools for estimation and stratification of individualized risk of esophageal lesions based on multiple pieces of information regarding readily identifiable predictors [10-12]. This is a crucial basis for establishing a cost-effective opportunistic screening strategy and is in turn essential for increasing the sensitivity of endoscopic examination by alerting endoscopists. Unfortunately, prediction models for ESCC are not common, and no model targeting a real-world clinical population has as yet been developed and externally validated.

In this study, we developed and externally validated a prediction model for estimating risk of high-grade esophageal lesions using data from real clinical settings in China, with the intention of providing an accurate and simple tool to aid decision-making in patient referral and endoscopic examination in opportunistic screening for ESCC.

METHODS

Study Subjects

Study subjects for this investigation were recruited from 2 study centers, which are referred to in this study as the North Center and the South Center. The North Center was established in Hua County People’s Hospital, which is a central hospital in Hua County, Henan Province in Northern China. The South Center was established in Peking University Shenzhen Hospital, which is one of the major hospitals in Shenzhen City, Guangdong Province in Southern China.

There are great economic and social disparities between these 2 regions. Hua County is an agricultural region with a rural population of 1.1 million that represents over 90% of the total population of the county. The per-capita Gross Domestic Product (GDP) of Hua County was $3419 in 2017. This region is well known for its high incidence of ESCC, which is 2 to 5 times as high as the national average. In contrast, Shenzhen City, which was the first special economic zone (SEZ) in China, is one of the largest and most economically dynamic cities in China. By the end of 2017, the permanent resident population had reached 12 million, and the per-capita GDP was as high as $27,100. The structure of the population of Shenzhen City is complex due to the large transient population from all parts of China who live there. The incidence of ESCC in Shenzhen City is similar to the average national incidence level for China [13].

Subjects for this study were recruited from outpatients in both study centers who were undergoing upper GI endoscopy. Inclusion criteria were (1) age 45-69 years, (2) no history of cancer, mental disorder, or any contraindication for endoscopy, and (3) completion of an adequate upper GI endoscopic examination. We consecutively recruited subjects in the North Center from March 1, 2017 to February 20, 2019, and in the South Center from June 19, 2017 to January 14, 2019. In this study, we used data from the North Center and South Center for model development and validation, respectively.

Collection of Predictors and Outcomes

All subjects underwent a computer-aided one-on-one questionnaire investigation before endoscopic examination to collect demographic variables and information regarding potential predictors of ESCC. Candidate predictors were selected on the basis of a review of the literature and included age, gender, socioeconomic status (education level, household income per capita, and job type), cigarette smoking, alcohol drinking, consumption of hot tea, source of drinking water, family history of ESCC, body mass index (BMI), type of fuel used for cooking, exposure to fumes in the kitchen, pesticide exposure, intake of fruit and vegetables, unhealthy dietary habits and GI symptoms. These variables and their coding formats have all been described in a previous publication [14].

All subjects received upper GI endoscopy (Olympus CV-260 series endoscope, Japan). Similar to endoscopy in most typical clinical settings, the entire esophagus and stomach were visually examined under white light. NBI was used when suspicious lesions were observed, and biopsies were taken from all focal lesions. Biopsy specimens were fixed in 10% formaldehyde, embedded in paraffin, sectioned at 5 µm and stained with hematoxylin and eosin. These specimens were reviewed microscopically by 2 pathologists from the department of pathology in each study center, each of whom had at least 5 years of experience in the diagnosis of esophageal cancer. Pathologic diagnoses were generated without knowledge of the endoscopic findings or information from the patient questionnaires.

In this study, the main outcome was defined as lesions showing severe dysplasia and above (SDA) of the esophagus found by pathologic examination of biopsy specimens, which included severe squamous dysplasia (atypical cells in all thirds of the epithelium without full-thickness involvement or invasion [3]), CIS, and ESCC.

Statistical Analysis

We first used univariate logistic regression to assess the association between each potential predictor and the presence of SDA lesions in the esophagus. Predictors with either P<0.05, or P<0.5 together with an odds ratio >1.3, were entered into multivariate logistic regression models. The final model structure was determined by backward elimination using the Akaike information criterion (AIC). Subjects with missing values (BMI was missing in 222 [4.0%] and 10 [0.2%] subjects in the development and validation sets, respectively) were excluded from the analysis.
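The entry rule for the multivariate model can be read as a simple filter over univariate results. The sketch below is illustrative only: the class and function names and the example numbers are hypothetical rather than taken from the study, the rule is interpreted as "P<0.05, or (P<0.5 and odds ratio >1.3)", and the AIC helper uses the standard definition AIC = 2k - 2 ln L rather than any study-specific code.

```python
from typing import NamedTuple


class UnivariateResult(NamedTuple):
    """Univariate logistic regression summary for one candidate predictor."""
    name: str
    p_value: float
    odds_ratio: float


def passes_screening(r: UnivariateResult) -> bool:
    """Entry rule as read from the text: P<0.05, or P<0.5 with odds ratio >1.3."""
    return r.p_value < 0.05 or (r.p_value < 0.5 and r.odds_ratio > 1.3)


def aic(log_likelihood: float, n_params: int) -> float:
    """Standard Akaike information criterion; lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical univariate results, for illustration only (not study data).
candidates = [
    UnivariateResult("predictor_a", 0.001, 1.10),  # kept: P<0.05
    UnivariateResult("predictor_b", 0.20, 1.45),   # kept: P<0.5 and OR>1.3
    UnivariateResult("predictor_c", 0.20, 1.05),   # dropped: OR too small
    UnivariateResult("predictor_d", 0.70, 2.00),   # dropped: P too large
]
selected = [r.name for r in candidates if passes_screening(r)]
print(selected)
```

In a backward elimination, a helper such as `aic` would be evaluated on each refitted candidate model, and a predictor would be dropped only when doing so lowers the AIC.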


The ability of this model to discriminate between patients with and without SDA lesions was evaluated by calculating the area under the receiver operating characteristic curve (AUC) [15], and the discrimination estimates of the final multivariate model and the simple age model were compared using the DeLong test [16]. The bootstrapping technique was used for internal validation with 200 random samples of size 3,000 to estimate the impact of overfitting of the model [17, 18]. For external validation, values of predictors of each subject in the validation set were multiplied by the corresponding coefficients in the final model to generate a predicted probability. New ROCs were then established, and AUCs were estimated based on the predicted probabilities and observed outcomes. To assess the calibration of our model, a calibration plot was drawn to visually represent the relationship between predicted and observed risk [19]. Perfect calibration is characterized by a line with an intercept of 0 and slope of 1.

To evaluate the effectiveness of the model in risk stratification and clinical utility, we divided patients based on their risk probabilities into 3 groups of low, moderate, and high risk. The 2 risk cutoff points that would achieve the desired partitioning were defined as the highest probabilities that maintained sensitivity of 100% and 80%, respectively, in the development set. The number of patients, detection rate and detection rate ratio as compared with universal screening in each risk group were calculated for both development and validation sets. We also calculated the sensitivity and specificity at every decile of the predicted probabilities in the development set, and the same indices in the external validation set at the corresponding probability cutoffs.

In this study, the characteristics of the participants in development and validation sets were compared using the Chi-squared test and the rank-sum test for categorical and continuous variables, respectively. Each patient in this study had only one endoscopic examination. STATA 13.1 and R statistics version 3.5.0 were used for all analyses. All tests were 2-sided and had a significance level of 0.05 unless otherwise specified.

We recognized that there was multiple testing of outcome data arising from individual patients. The P values for the univariate statistical tests were not corrected for multiple testing, because those tests were taken as exploratory. The subsequent multivariable logistic regression analysis was considered the main definitive statistical result, as it determined those variables independently associated with the occurrence of SDA after adjusting for the contributions of the other variables in the model. Other statistical results relating to further refinement of the model were secondary and taken as descriptive only, and did not require correction of their P values for multiple testing.

Ethics Statement

This study was approved by the Institutional Review Board of the Peking University School of Oncology, China. Written informed consent was obtained from each participant.

RESULTS

Description of Study Subjects

In the development set, we consecutively recruited 5,624 subjects during the study period. Of these, 87 (1.55%) were confirmed as having SDA by pathology, including 10 (0.18%) with severe squamous dysplasia, 1 (0.02%) with carcinoma in situ and 76 (1.35%) with ESCC. Sixty-two subjects were excluded from development of the model: 3 with esophageal adenocarcinoma, 9 with unclassified esophageal cancers, 33 with cardial cancers, and 17 with noncardial gastric cancers. For the validation set, 5,765 subjects were successfully recruited and interviewed. Thirty-four (0.60%) subjects had pathologically confirmed SDA lesions (3 severe squamous dysplasia, 1 carcinoma in situ and 30 ESCC). Fifty-four subjects were excluded from the validation dataset: 3 with esophageal adenocarcinoma, 5 with unclassified esophageal cancers, 4 with cardial cancers, and 42 with noncardial gastric cancers.

Selected characteristics of the subjects in both datasets are tabulated in Table 1. These 2 populations were balanced for gender and cigarette smoking, but significant differences still existed for many variables. For example, a higher proportion of subjects in the development set had a family history of ESCC, alcohol drinking, unmarried status, low education level, fumes in the kitchen, use of coal or wood for cooking, and dysphagia. Subjects in the validation set were more likely to have a low BMI and to have weight loss.

Development of Prediction Model

For construction of the model, 15 variables were considered as candidates in the multivariate analysis (listed in Table 1). The final prediction model consisted of 5 predictors: age, cigarette smoking, BMI ≤22, presence of dysphagia and retrosternal pain (Table 2, Equation 1).

Formula for the prediction model for ESCC (Equation 1):

$$
\text{Risk of ESCC} = \frac{1}{1 + \exp\left[-\left(-15.80 + 0.17 \times \text{age} + 0.83 \times \text{smoking} + 0.52 \times \text{BMI}_{\le 22} + 1.61 \times \text{dysphagia} + 0.48 \times \text{retrosternal pain}\right)\right]}
$$

Here the coefficient for each binary predictor (smoking, BMI ≤22, dysphagia, retrosternal pain) is added only when that characteristic is present.

The AUC of the 5-variable model was 0.871 (95% confidence interval, 0.842-0.946), and discrimination was superior to that of the simple model consisting of age only (P<0.001) (Figure 1, Supplementary Figure 1). Internal validation using bootstrapping generated an AUC of 0.862.

External Validation of the Prediction Model

When this model was applied to the validation set, the ROC constructed from the predicted probability and observed outcome yielded an AUC of 0.843 (95% CI, 0.793-0.894) (Figure 2). The AUC of the 5-variable model was significantly higher than that of the age only model (AUC: 0.649, P<0.001). Calibration of the prediction model resulted in a slope of 0.88 and an intercept of -1.35 (Supplementary Figure 2).

Risk Stratification and Clinical Utility

To facilitate the application of this model in practice, we divided patients based on their risk probabilities into 3 groups of low, moderate, and high risk (Table 3). The 2 risk cutoff points that would achieve the desired partitioning were defined as the highest probabilities that maintained sensitivity of 100% and 80%, respectively, in the development set, with predicted probabilities of 0.0026916 and 0.0156786 (Supplementary Table 1). As such, 25.0%, 37.5% and 37.5% of the total population from the development set were assigned to the high-, moderate-, and low-risk groups, respectively. In the validation set, the proportion of subjects classified into each risk group was quite similar: 22.5% in the high-risk group, 41.5% in the moderate-risk group and 36.0% in the low-risk group. According to this strategy for risk stratification, 80% and 73.5% of SDA cases fell into the high-risk group in the development and validation sets, respectively, and no case fell into the low-risk group in either set. At the same time, the detection rate in the high-risk group was elevated to more than 3 times that of the total population in both datasets, which reflects a considerable enrichment for potential SDA patients.

Validation of a General Population Model

In a previous study, we constructed a prediction model for identification of individuals at high risk for ESCC in a general population based on a large-scale randomized controlled trial (RCT) [14]. We also applied this general population model to the clinical datasets used in the present study and evaluated its predictive ability. This model generated an AUC of 0.763 for subjects ≤60 and 0.513 for subjects >60 in the development set. The AUC was 0.489 for subjects ≤60 and 0.698 for subjects >60 in the validation set (Supplementary Figure 3). We further evaluated the performance of
the model developed in this study on the cohort used in the previous study. The AUC was 0.502 for subjects ≤60 and 0.603 for subjects >60.

DISCUSSION

In this study, we developed and validated a prediction model to estimate the risk of high-grade esophageal lesions in endoscopically evaluated outpatients in real-world clinical settings in China. To our knowledge, this is the first prediction modeling study for ESCC with independent external validation. By combining five easily acquired predictors, our final model achieved high discrimination and capacity for risk stratification in both a development and an external validation population with significant heterogeneity. This demonstrated great potential for more general use of this model in clinical practice. Our model will likely be useful in making decisions regarding referral for endoscopy in opportunistic screening efforts and will serve to alert physicians performing endoscopy that close attention must be paid to high-risk individuals.

This study has several strengths. First, we used cross-sectional cohort data, which are preferable for development of a diagnostic risk model [20]. Case and non-case subjects were recruited from the same population in real clinical circumstances without meticulous inclusion or exclusion criteria. This ensured the target population was highly representative and enabled us to better estimate absolute risk and carry out risk stratification. Cross-sectional data were collected before subjects were made aware of their disease status, which for the most part prevented the recall bias that often occurs in case-control studies. Second, we successfully validated the model in an independent external population that showed notable heterogeneity with the development set, which is essential for recommendation of the model for use in clinical practice. No other ESCC-related modeling study has reported use of external validation. In this study, we validated our model in a dataset that incorporated subjects from 32 of the 34 provinces in China. The validation set has characteristics that differ from the development set, including population structure, behavioral factors and detection rate of high-grade esophageal lesions. It came as a surprise that despite the great differences between these 2 datasets, the model retained good capacity for discrimination and stratification, suggesting that it has strong potential for generalization in clinical settings outside of areas of high prevalence. Calibration of this model showed that it has a slight tendency for overestimation of risk, which is mainly due to the significantly lower detection rate in the validation set. However, overestimation of risk had no impact on the ability of the model to stratify individuals, and thus will not limit its utility. The model can be recalibrated to fit given local populations in future use.

Some predictors in this model call for further explanation. Increasing age played an important role in predicting ESCC in this population, and using only age for prediction of ESCC achieved an AUC of 0.797 in the development set. However, the predictive performance of this age only model was poor in subjects >60 years of age. In contrast, the multivariate model performed well, and was superior to the age only model in both the >60 and ≤60 subgroups. Upper GI symptoms are often taken as an important indication for endoscopic examination in clinical practice. However, not all symptoms have predictive value, and the discriminating ability of a given single symptom may be insufficient. In our model, 2 factors associated with upper GI symptoms, namely dysphagia and retrosternal pain, were included. Each of these symptoms alone showed relatively weak predictive value, with an AUC of 0.671 for dysphagia and 0.581 for retrosternal pain. By combining these 2 symptoms with 3 other predictors, the final model demonstrated significantly improved prediction performance.

Our prediction model has strong potential for supporting decision-making in several aspects of clinical practice. For clinical doctors, the estimated risk may help distinguish individuals with different levels of risk and allow tailoring of advice regarding further endoscopic examination. We suggest that individuals who have high or moderate risk be urged to proceed to endoscopy. For individuals who are predicted to have low risk, a decision for or against endoscopy can be made by the physician and patient together, with overall consideration of the patient's willingness, profile of risk factors and suspicious symptoms for upper GI cancers. For physicians who perform endoscopy, we suggest intensive examination of the esophagus for patients whose risk is predicted to be moderate or high. Additional detection methods such as NBI or iodine staining are encouraged for use in patients with high risk. In addition, this model can provide direct information for patients themselves regarding their own risk, and support joint decision-making with their physician.

In one of our previous studies, we constructed a prediction model for identification of individuals at high risk for ESCC in a general population based on a large-scale RCT. This model showed high discrimination accuracy and had potential for application in real screening programs. We also applied this general population model to the clinical datasets used in the present study and evaluated its predictive ability. The results showed that the general population model did not perform well when applied in clinical settings and vice versa. This indicates that essential differences may exist between the clinical population and the general population, resulting from self-selection driven by symptoms and self-risk evaluation and leading to a discrepant predictor pattern. This discrepancy strongly warrants establishment of a specific model for clinical populations.

Two crucial preconditions of opportunistic screening are (1) high-grade evidence proving the effectiveness of such screening and (2) an appropriate risk stratification tool for use in distinguishing high-risk individuals from outpatients in general. The effectiveness of endoscopy in screening for ESCC to reduce mortality has been demonstrated both in a non-randomized study [7] and in our most recent report [9]. In addition, evidence from an RCT can be expected in the near future [21]. The model established in this study may provide clinicians with a low-cost and easy-to-use stratification tool, which will render opportunistic screening more accurate and efficient.

Because it is almost impossible to establish a model in a 100% pure “opportunistic screening population” under real-world circumstances, we further performed a sensitivity analysis by testing the performance of this model in subjects with varied levels of upper GI symptoms. The AUCs for this model in subjects with no more than 2, 3, or 4 symptoms in the validation set were 0.855, 0.830, and 0.851, respectively, showing the robustness of the model.

In summary, we have developed a clinical model for predicting risk of high-grade esophageal lesions in China. This model had high accuracy of prediction and its performance has been validated in an independent population. Our study provides a useful risk-stratification tool for clinical practice that will support both referral for endoscopic examination in opportunistic screening and the endoscopic examination process itself.
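As an illustration of how Equation 1 and the two risk cutoffs reported in the Results might be applied to a single patient, the following minimal sketch computes a predicted probability and assigns a risk group. It is not the authors' code. The function and constant names are hypothetical, age is assumed to be in years, binary predictors are assumed to be coded 1 when present and 0 otherwise, and a probability exactly equal to a cutoff is assumed to fall into the higher group. Because the published coefficients are rounded to 2 decimal places, the probabilities are approximate.

```python
import math

# Coefficients as printed in Equation 1 (rounded to 2 decimals in the text).
INTERCEPT = -15.80
COEF_AGE = 0.17
COEF_SMOKING = 0.83
COEF_BMI_LE_22 = 0.52
COEF_DYSPHAGIA = 1.61
COEF_RETROSTERNAL_PAIN = 0.48

# Cutoffs reported for the development set (sensitivity 100% and 80%).
LOW_MODERATE_CUTOFF = 0.0026916
MODERATE_HIGH_CUTOFF = 0.0156786


def predicted_risk(age: float, smoking: bool, bmi: float,
                   dysphagia: bool, retrosternal_pain: bool) -> float:
    """Predicted probability of a high-grade lesion from Equation 1.

    Assumption: age in years; each binary term contributes its
    coefficient only when the characteristic is present.
    """
    linear = (INTERCEPT
              + COEF_AGE * age
              + (COEF_SMOKING if smoking else 0.0)
              + (COEF_BMI_LE_22 if bmi <= 22 else 0.0)
              + (COEF_DYSPHAGIA if dysphagia else 0.0)
              + (COEF_RETROSTERNAL_PAIN if retrosternal_pain else 0.0))
    return 1.0 / (1.0 + math.exp(-linear))


def risk_group(probability: float) -> str:
    """Map a predicted probability to the low, moderate or high group."""
    if probability < LOW_MODERATE_CUTOFF:
        return "low"
    if probability < MODERATE_HIGH_CUTOFF:
        return "moderate"
    return "high"


# Hypothetical patients, for illustration only.
patients = {
    "50 y, no risk factors": (50, False, 24.0, False, False),
    "60 y, no risk factors": (60, False, 24.0, False, False),
    "65 y, smoker with dysphagia": (65, True, 24.0, True, False),
}
for label, features in patients.items():
    p = predicted_risk(*features)
    print(f"{label}: risk={p:.4f}, group={risk_group(p)}")
```

Keeping the coefficients and cutoffs as named constants makes the recalibration mentioned in the Discussion a matter of changing a few values rather than rewriting the logic.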

REFERENCE 1.

2. 3.

4.

5.

Bray F, Ferlay J, Soerjomataram I, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2018;68:394-424. Chen W, Zheng R, Baade PD, et al. Cancer statistics in China, 2015. CA Cancer J Clin 2016;66:115-32. Wang GQ, Abnet CC, Shen Q, et al. Histological precursors of oesophageal squamous cell carcinoma: results from a 13 year prospective follow up study in a high risk population. Gut 2005;54:187-92. Dawsey SM, Fleischer DE, Wang GQ, et al. Mucosal iodine staining improves endoscopic visualization of squamous dysplasia and squamous cell carcinoma of the esophagus in Linxian, China. Cancer 1998;83:220-31. Mannath J, Ragunath K. Role of endoscopy in early oesophageal cancer. Nat Rev Gastroenterol Hepatol 2016;13:720-730.

6. Wang GQ, Jiao GG, Chang FB, et al. Long-term results of operation for 420 patients with early squamous cell esophageal carcinoma discovered by screening. Ann Thorac Surg 2004;77:1740-4.
7. Wei WQ, Chen ZF, He YT, et al. Long-Term Follow-Up of a Community Assignment, One-Time Endoscopic Screening Study of Esophageal Cancer in China. J Clin Oncol 2015;33:1951-7.
8. Lao-Sirieix P, Fitzgerald RC. Screening for oesophageal cancer. Nat Rev Clin Oncol 2012;9:278-87.
9. Liu M, He Z, Guo C, et al. Effectiveness of Intensive Endoscopic Screening for Esophageal Cancer in China: A Community-Based Study. Am J Epidemiol 2019;188:776-784.
10. Xie SH, Lagergren J. A model for predicting individuals' absolute risk of esophageal adenocarcinoma: Moving toward tailored screening and prevention. Int J Cancer 2016;138:2813-9.
11. Dong J, Buas MF, Gharahkhani P, et al. Determining Risk of Barrett's Esophagus and Esophageal Adenocarcinoma Based on Epidemiologic Factors and Genetic Variants. Gastroenterology 2018;154:1273-1281 e3.
12. Thrift AP, Kendall BJ, Pandeya N, et al. A model to determine absolute risk for esophageal adenocarcinoma. Clin Gastroenterol Hepatol 2013;11:138-44 e2.
13. He J, Chen W. China Cancer Registry Annual Report 2017. Beijing: People’s Medical Publishing House, 2018.
14. Liu M, Liu Z, Cai H, et al. A Model To Identify Individuals at High Risk for Esophageal Squamous Cell Carcinoma and Precancerous Lesions in Regions of High Prevalence in China. Clin Gastroenterol Hepatol 2017;15:1538-1546 e7.
15. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 2010;21:128-38.
16. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837-45.
17. Steyerberg EW, Harrell FE, Jr., Borsboom GJ, et al. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 2001;54:774-81.
18. Smith GC, Seaman SR, Wood AM, et al. Correcting for optimistic prediction in small data sets. Am J Epidemiol 2014;180:318-24.
19. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J 2014;35:1925-31.
20. Wynants L, Collins GS, Van Calster B. Key steps and common pitfalls in developing and validating risk models. BJOG 2017;124:423-432.

21. He Z, Liu Z, Liu M, et al. Efficacy of endoscopic screening for esophageal cancer in China (ESECC): design and preliminary results of a population-based randomised controlled trial. Gut 2019;68:198-206.


Table 1. Selected characteristics of subjects in development and validation sets.

| Variables | Category | Development set (N, %) | Validation set (N, %) | P value^c |
|---|---|---|---|---|
| Age (median, IQR) | | 54 (50-62) | 55 (50-60) | <0.001 |
| Gender | Female | 3,135 (56.4) | 3,178 (55.7) | 0.454 |
| | Male | 2,427 (43.6) | 2,533 (44.3) | |
| ESCC family history^a | 0 | 4,871 (87.6) | 5,534 (96.9) | <0.001 |
| | 1 | 568 (10.2) | 160 (2.8) | |
| | 2 | 96 (1.7) | 15 (0.3) | |
| | 3-4 | 27 (0.5) | 2 (0.0) | |
| Cigarette smoking | No | 4,037 (72.6) | 4,147 (72.6) | 0.986 |
| | Yes | 1,525 (27.4) | 1,564 (27.4) | |
| Consumption of alcohol | No | 4,450 (80.0) | 4,784 (83.8) | <0.001 |
| | Yes | 1,112 (20.0) | 927 (16.2) | |
| BMI^b | >22 | 4,322 (80.9) | 3,672 (64.4) | <0.001 |
| | ≤22 | 1,018 (19.1) | 2,029 (35.6) | |
| No. of family members | 1 | 148 (2.7) | 273 (4.8) | <0.001 |
| | 2 | 1,403 (25.2) | 1,239 (21.7) | |
| | ≥3 | 4,011 (72.1) | 4,199 (73.5) | |
| Marital status | Married | 5,218 (93.8) | 5,618 (98.4) | <0.001 |
| | Unmarried | 344 (6.2) | 93 (1.6) | |
| Education | Middle school or above | 2,611 (46.9) | 4,061 (71.1) | <0.001 |
| | Primary school or below | 2,951 (53.1) | 1,650 (28.9) | |
| High temperature food preference | No | 646 (11.6) | 748 (13.1) | 0.018 |
| | Yes | 4,916 (88.4) | 4,963 (86.9) | |
| Fume exposure in the kitchen | No | 2,154 (38.7) | 4,112 (72.0) | <0.001 |
| | Yes | 3,408 (61.3) | 1,599 (28.0) | |
| Use of coal or wood as primary cooking fuel | No | 1,992 (35.8) | 5,561 (97.4) | <0.001 |
| | Yes | 3,570 (64.2) | 150 (2.6) | |
| Dysphagia | No | 4,563 (82.0) | 4,848 (84.9) | <0.001 |
| | Yes | 999 (18.0) | 863 (15.1) | |
| Retrosternal pain | No | 4,341 (78.0) | 4,363 (76.4) | 0.039 |
| | Yes | 1,221 (22.0) | 1,348 (23.6) | |
| Weight loss | No | 4,578 (82.3) | 3,430 (60.1) | <0.001 |
| | Yes | 984 (17.7) | 2,281 (39.9) | |
| SDA | No | 5,475 (98.4) | 5,677 (99.4) | <0.001 |
| | Yes | 87 (1.6) | 34 (0.6) | |

a. Number of ESCC cases in the immediate family and relatives within 3 generations.
b. BMI was not available in 222 (4.0%) subjects from the development set and in 10 (0.2%) subjects from the validation set.
c. P values were obtained from the Chi-squared test for categorical variables and the rank-sum test for continuous variables.

Abbreviations: ESCC, esophageal squamous cell carcinoma; BMI, body mass index; SDA, severe dysplasia and above
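As an illustration of footnote c, the sketch below runs a Chi-squared test on the Gender rows of Table 1 (female and male counts in each set). The helper name `chi2_2x2_yates` is hypothetical, and the use of Yates' continuity correction is an assumption: the source states only that a Chi-squared test was used and does not name a software package or a correction.

```python
import math

def chi2_2x2_yates(a, b, c, d):
    """Chi-squared test (1 df) for the 2x2 table [[a, b], [c, d]].

    Applies Yates' continuity correction. This is an assumption; the
    source does not say whether a correction was used.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    stat = sum(max(abs(o - e) - 0.5, 0.0) ** 2 / e
               for o, e in zip(observed, expected))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Gender counts from Table 1: (female, male) in the development set,
# then (female, male) in the validation set.
stat, p_value = chi2_2x2_yates(3135, 2427, 3178, 2533)
print(f"chi2 = {stat:.3f}, P = {p_value:.3f}")
```

Under these assumptions the P value comes out at about 0.45, in line with the 0.454 listed for Gender in Table 1.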


Table 2. Predictors and regression coefficients of the prediction model for SDA generated from the development set.^a

| Predictors | Category | Non-SDA (n, %) | SDA (n, %) | Univariate coefficients (95% CI) | Multivariate coefficients (95% CI) | Multivariate ORs^c (95% CI) |
|---|---|---|---|---|---|---|
| Age | | - | - | 0.17 (0.13-0.21) | 0.18 (0.14-0.22) | 1.19 (1.14-1.24) |
| Smoking | No | 3,990 (98.8) | 47 (1.2) | Ref | Ref | Ref |
| | Yes | 1,485 (97.4) | 40 (2.6) | 0.83 (0.40-1.25) | 0.83 (0.38-1.28) | 2.30 (1.46-3.61) |
| BMI^b | >22 | 4,265 (98.7) | 57 (1.3) | Ref | Ref | Ref |
| | ≤22 | 991 (97.4) | 27 (2.6) | 0.71 (0.25-1.18) | 0.52 (0.03-0.99) | 1.68 (1.04-2.73) |
| Dysphagia | No | 4,521 (99.1) | 42 (0.9) | Ref | Ref | Ref |
| | Yes | 954 (95.5) | 45 (4.5) | 1.62 (1.20-2.05) | 1.61 (1.14-2.07) | 4.99 (3.14-7.93) |
| Retrosternal pain | No | 4,287 (98.8) | 54 (1.2) | Ref | Ref | Ref |
| | Yes | 1,188 (97.3) | 33 (2.7) | 0.79 (0.35-1.23) | 0.48 (-0.01 to 0.95) | 1.62 (1.00-2.60) |
| Intercept | | - | - | - | -15.80 (-18.61 to -13.26) | |

a. Predictors were selected by a 2-step selection method in which all candidate predictors were first evaluated in univariate logistic regression models, and variables with either P<0.05 or P<0.5 and odds ratio>1.3 were subjected to multivariate logistic regression analysis where the AIC was used to determine the final predictor pattern.
b. Cases with missing BMI values were excluded from the multivariate analysis.
c. The odds ratio (OR) is obtained from the coefficient as OR = exp(coefficient).

Abbreviations: BMI, body mass index; CI, confidence interval; OR, odds ratio; SDA, severe dysplasia and above
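The relationship in footnote c, and the way a logistic model turns the Table 2 coefficients into an individual predicted probability, can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes that age enters the model in years and that the other predictors are coded 0/1 with the "Ref" category as 0; the function and variable names are hypothetical. Because the printed coefficients are rounded to two decimals, results reproduce the published ORs only approximately.

```python
import math

# Multivariate coefficients as printed in Table 2 (rounded to two decimals).
INTERCEPT = -15.80
COEFFICIENTS = {
    "age": 0.18,                # assumed to be per year of age
    "smoking": 0.83,            # 1 = yes, 0 = no (assumed coding)
    "bmi_le_22": 0.52,          # 1 = BMI <= 22, 0 = BMI > 22 (assumed coding)
    "dysphagia": 1.61,          # 1 = yes, 0 = no (assumed coding)
    "retrosternal_pain": 0.48,  # 1 = yes, 0 = no (assumed coding)
}

def odds_ratio(coefficient):
    """Footnote c of Table 2: OR = exp(coefficient)."""
    return math.exp(coefficient)

def predicted_risk(age, smoking, bmi_le_22, dysphagia, retrosternal_pain):
    """Predicted probability of SDA from the standard logistic formula."""
    values = {
        "age": age,
        "smoking": int(smoking),
        "bmi_le_22": int(bmi_le_22),
        "dysphagia": int(dysphagia),
        "retrosternal_pain": int(retrosternal_pain),
    }
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in values.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))

for name, coef in COEFFICIENTS.items():
    print(f"{name}: OR = {odds_ratio(coef):.2f}")

# Hypothetical subjects, for illustration only.
print(predicted_risk(55, False, False, False, False))
print(predicted_risk(65, True, False, True, False))
```

Under these assumptions, a 55-year-old with none of the other risk factors has a predicted probability of roughly 0.003.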


Table 3. Effectiveness of risk stratification based on the prediction model in development and validation sets.

| Set | Risk stratification | Cutoffs | No. of subjects classified into each risk group, N (%) | No. of cases classified into each risk group, N (%) | Detection rate of esophageal high-grade lesions in each risk group (%) | Detection rate ratio compared with universal examination^a |
|---|---|---|---|---|---|---|
| Development set (n=5,340) | High | [0.0156786, 1] | 1,336 (25.0) | 68 (81.0) | 5.1 | 3.2 |
| | Moderate | [0.0026916, 0.0156786) | 2,000 (37.5) | 16 (19.0) | 0.8 | 0.5 |
| | Low | [0, 0.0026916) | 2,004 (37.5) | 0 (0.0) | 0.0 | 0.0 |
| Validation set (n=5,701) | High | [0.0156786, 1] | 1,282 (22.5) | 25 (73.5) | 2.0 | 3.3 |
| | Moderate | [0.0026916, 0.0156786) | 2,368 (41.5) | 9 (26.5) | 0.4 | 0.6 |
| | Low | [0, 0.0026916) | 2,051 (36.0) | 0 (0.0) | 0.0 | 0.0 |

a. The detection rate ratio was calculated as the detection rate in each risk group divided by the detection rate in the total population.
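The stratification rule and footnote a of Table 3 can be written out as a short sketch. The cutoffs and counts are taken from Table 3; the function names are hypothetical, and the total case counts are obtained here by summing the cases over the three risk groups.

```python
HIGH_CUTOFF = 0.0156786  # lower bound of the high-risk group in Table 3
LOW_CUTOFF = 0.0026916   # lower bound of the moderate-risk group in Table 3

def risk_group(probability):
    """Assign a predicted probability to a Table 3 risk group."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= HIGH_CUTOFF:
        return "High"
    if probability >= LOW_CUTOFF:
        return "Moderate"
    return "Low"

def detection_rate_ratio(group_cases, group_subjects, total_cases, total_subjects):
    """Footnote a: group detection rate divided by overall detection rate."""
    return (group_cases / group_subjects) / (total_cases / total_subjects)

# (cases, subjects) per risk group, as listed in Table 3.
development = {"High": (68, 1336), "Moderate": (16, 2000), "Low": (0, 2004)}
validation = {"High": (25, 1282), "Moderate": (9, 2368), "Low": (0, 2051)}

for label, groups in (("Development", development), ("Validation", validation)):
    total_cases = sum(cases for cases, _ in groups.values())
    total_subjects = sum(subjects for _, subjects in groups.values())
    for group, (cases, subjects) in groups.items():
        ratio = detection_rate_ratio(cases, subjects, total_cases, total_subjects)
        print(f"{label} {group}: rate = {100 * cases / subjects:.1f}%, ratio = {ratio:.1f}")
```

With these inputs the high-risk ratios come out at 3.2 (development) and 3.3 (validation), matching the table.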


FIGURE LEGENDS

Figure 1. Receiver operating characteristic (ROC) curves of the prediction model for SDA in the development set.

Figure 2. Receiver operating characteristic (ROC) curves of the prediction model for SDA in the validation set.

Supplementary Figure 1. Receiver operating characteristic (ROC) curves of the prediction model for SDA in different age groups. A, ROC of prediction models in subjects ≤60 years. B, ROC of prediction models in subjects >60 years.

Supplementary Figure 2. Calibration plot of the prediction model for SDA in the validation set.

Supplementary Figure 3. Validation of a general population model in clinical settings. A, ROC in subjects ≤60 for the development set. B, ROC in subjects >60 for the development set. C, ROC in subjects ≤60 for the validation set. D, ROC in subjects >60 for the validation set.


SUPPLEMENTS

Supplementary Table 1. Sensitivity and specificity of the prediction model for SDA in development and validation sets.

| Cutoffs^a | Development set (n=5,340): proportion of high-risk subjects (%) | Development set: sensitivity (%) | Development set: specificity (%) | Validation set (n=5,701): proportion of high-risk subjects (%) | Validation set: sensitivity (%) | Validation set: specificity (%) |
|---|---|---|---|---|---|---|
| 0 | 100.0 | 100.0 | 0.0 | 100.0 | 100.0 | 0.0 |
| 0.0007898 | 90.0 | 100.0 | 8.9 | 92.5 | 100.0 | 7.6 |
| 0.0013296 | 80.0 | 100.0 | 20.2 | 81.7 | 100.0 | 18.4 |
| 0.0019084 | 70.0 | 100.0 | 29.6 | 72.5 | 100.0 | 27.7 |
| 0.0026916^b | 62.5 | 100.0 | 38.1 | 64.0 | 100.0 | 36.2 |
| 0.0030738 | 60.0 | 98.8 | 40.3 | 60.7 | 100.0 | 39.6 |
| 0.0046858 | 50.0 | 98.8 | 50.7 | 48.2 | 97.1 | 52.1 |
| 0.0077961 | 40.0 | 92.9 | 60.7 | 36.9 | 85.3 | 63.4 |
| 0.0125327 | 30.0 | 83.3 | 70.7 | 27.4 | 76.5 | 72.9 |
| 0.0156786^b | 25.0 | 81.0 | 75.9 | 22.5 | 73.5 | 77.8 |
| 0.0204075 | 20.0 | 77.4 | 80.8 | 18.4 | 70.6 | 81.9 |
| 0.0401848 | 10.0 | 57.1 | 90.7 | 8.5 | 44.1 | 91.7 |

a. Cutoffs were selected at every decile of the predicted probabilities in the development set.
b. Highest cutoff probabilities that ensure 100% and 80% sensitivity, respectively, in the development set.
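As a consistency check, the sensitivity and specificity at the 0.0156786 cutoff can be recomputed from the high-risk counts reported in Table 3. The sketch below is illustrative only. The function name is hypothetical, and the total case counts (84 and 34) are derived here by summing the cases across the Table 3 risk groups rather than quoted from a single cell.

```python
def sensitivity_specificity(high_risk_subjects, high_risk_cases,
                            total_subjects, total_cases):
    """Sensitivity and specificity when 'high risk' is treated as a positive test."""
    true_pos = high_risk_cases
    false_neg = total_cases - high_risk_cases
    false_pos = high_risk_subjects - high_risk_cases
    true_neg = (total_subjects - total_cases) - false_pos
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# High-risk group counts at cutoff 0.0156786, from Table 3.
dev_sens, dev_spec = sensitivity_specificity(1336, 68, 5340, 84)
val_sens, val_spec = sensitivity_specificity(1282, 25, 5701, 34)

print(f"Development: sensitivity {100 * dev_sens:.1f}%, specificity {100 * dev_spec:.1f}%")
print(f"Validation:  sensitivity {100 * val_sens:.1f}%, specificity {100 * val_spec:.1f}%")
```

These counts give 81.0% and 75.9% for the development set and 73.5% and 77.8% for the validation set, which agrees with the corresponding row of Supplementary Table 1.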

Abbreviations: AIC, Akaike information criterion; AUC, area under the curve; BMI, body mass index; CI, confidence interval; EC, esophageal cancer; ESCC, esophageal squamous cell carcinoma; NBI, narrow-band imaging; OR, odds ratio; RCT, randomized controlled trial; ROC, receiver operating characteristic curve; SDA, severe dysplasia and above