Volume 20 Issue 12
Dec.  2022
LI Li-qiu, XU Cheng-yan, WANG Xiao-li, CAO Yong-qi, LI Yan, ZHAO Liang, WANG Zhao-xin, JIA Huan. XGboost prediction model for osteoarthritis risk based on community big data[J]. Chinese Journal of General Practice, 2022, 20(12): 2080-2083. doi: 10.16766/j.cnki.issn.1674-4152.002774

XGboost prediction model for osteoarthritis risk based on community big data

doi: 10.16766/j.cnki.issn.1674-4152.002774






  • Received Date: 2022-03-18
    Available Online: 2023-02-07
  •   Objective  To explore building an osteoarthritis risk warning model from community medical big data with machine learning, providing a quantitative tool for the early warning of osteoarthritis in the community and an efficient management approach for the prevention and treatment of osteoarthritis in the elderly.
Methods  Health records, health examination data, and diagnosis and treatment data from six community health service centres in Shanghai, covering January 1, 2019 to December 31, 2019, were integrated into an original database containing more than 40 000 samples and 126 variables. After data pre-processing and compound feature selection to screen the model features, the XGBoost algorithm was used to construct a risk assessment model for osteoarthritis.
Results  Fourteen features were selected, including a diet balanced between meat and vegetables, height, weight, body mass index (BMI), duration of each exercise session, total cholesterol, high-density lipoprotein, low-density lipoprotein, hypertension and limb trauma. High-density lipoprotein, total cholesterol, BMI, low-density lipoprotein and frequency of drinking were the top five features in the importance ranking, each with a feature importance above 0.1. The XGBoost risk assessment model took 'osteoarthritis' as the output variable and the 14 features selected by feature engineering as the input variables. After training with eightfold cross-validation, the model was validated on the test set, where it reached an accuracy of 92%, a precision of 71%, a recall of 65%, an F1 score of 0.68, an area under the receiver operating characteristic curve of 0.82 and a KS value of 0.48.
Conclusion  In this study, a risk warning model for osteoarthritis was constructed from community medical big data. The model shows good overall fit and reasonable features, which provides a tool for the early warning of osteoarthritis in the community and is conducive to its early diagnosis and treatment.
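As a reading aid for the reported metrics, the sketch below shows how an F1 score follows from precision and recall, and how the area under the ROC curve and the KS value can both be read off the same ROC points. It is a minimal illustration under stated assumptions and not the authors' code: the function names and the toy labels and scores are hypothetical, and KS is taken here in its common sense as the largest gap between the true positive rate and the false positive rate.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping a threshold over the scores."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Sort by descending score; tied scores share a single threshold.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def ks_statistic(points):
    """KS value: largest gap between TPR and FPR over all thresholds."""
    return max(tpr - fpr for fpr, tpr in points)


# Precision and recall as reported in the abstract.
reported_f1 = f1_from_precision_recall(0.71, 0.65)

# Hypothetical toy data (1 = osteoarthritis, 0 = no osteoarthritis).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_points = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_points)
toy_ks = ks_statistic(toy_points)
```

With the reported precision of 71% and recall of 65%, the harmonic mean rounds to the stated F1 score of 0.68, so those three figures are mutually consistent.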


    Figures(3)  / Tables(2)
