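Cite this article: Zhuo Yudi[1,2], Zhu Lingqun[1,2], Zhang Lishan[1,2], Dai Yanyan[1,2], Yang Xiaoming[1,2], Cheng Luyao[1,2], Yuan Yi[1,2], Gan Yena[1,2], Zhou Xun[2], Wu Qianying[2], Guo Ye[2], Li Duoduo[1,2]. Application of the Lasso Regression Model to Traditional Chinese Medicine Clinical Research Data and Its Practice in R [J]. World Chinese Medicine (世界中医药), 2023, (07):.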
Application of the Lasso Regression Model to Traditional Chinese Medicine Clinical Research Data and Its Practice in R
Submitted: 2021-07-02
DOI: 10.3969/j.issn.1673-7202.2023.07.023
Keywords: Lasso regression; model; elastic net; R programming language; regularization; TCM clinical research; high-dimensional data; dimensionality reduction
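Funding: National Natural Science Foundation of China (81673904); Beijing University of Chinese Medicine Undergraduate Innovation and Entrepreneurship Training Program (X202110026048); Beijing University of Chinese Medicine 2019 "Distinguished Teacher Cultivation Program" (京中校发[2019]36号)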
Abstract:
Objective: To build a concise, easily interpretable model and to provide a reference for improving predictive accuracy in similar studies. Methods: Data on heart failure grade and its potentially related factors were drawn from 200 patients with pulmonary fibrosis complicated by heart failure enrolled in an ongoing project funded by the National Natural Science Foundation of China (81673904). Lasso regression was used to select characteristic variables from the candidate independent variables, which included sex, age, body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure, serum total cholesterol (TC), fasting blood glucose, tongue body color, tongue coating color, and traditional Chinese medicine (TCM) constitution. A regression model was then built to explore the relationship between these variables and the severity of heart failure. Results: After confounding factors in the high-dimensional data were removed and characteristic variables were screened, the model retained six characteristic variables with the following coefficients: BMI, 0.006357091; SBP, 0.219695622; TC, 0.229324833; red tongue, 0.004216705; thin white tongue coating, -0.825660057; thick yellow tongue coating, 0.356499153. The probability of severe heart failure in patients with pulmonary fibrosis complicated by heart failure was given as P = -33.632 + 0.006 × BMI + 0.220 × SBP + 0.229 × TC + 0.004 × red tongue - 0.826 × thin white tongue coating + 0.356 × thick yellow tongue coating, where the three tongue terms are yes/no indicators. Conclusion: The resulting model can be used to explain the factors associated with severe heart failure and can be extended to the wider population for prediction. The Lasso regression model is suitable for analyzing high-dimensional data in TCM clinical research and may have value for wider application.
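The reported equation can be evaluated for a single patient as in the minimal sketch below. It uses the rounded coefficients from the abstract. Several things are assumptions, not statements from the article: the patient values are hypothetical, the units (SBP in mmHg, TC in mmol/L, BMI in kg/m²) are assumed, the tongue variables are coded 1 for yes and 0 for no, and the logistic transform is an assumption, since the abstract does not name the link function and gives only the linear expression.

```python
import math

# Coefficients as reported in the abstract (rounded to three decimals there).
INTERCEPT = -33.632
COEFFICIENTS = {
    "bmi": 0.006,
    "sbp": 0.220,
    "tc": 0.229,
    "red_tongue": 0.004,
    "thin_white_coating": -0.826,
    "thick_yellow_coating": 0.356,
}


def linear_predictor(patient):
    """Intercept plus the weighted sum of the six selected variables."""
    return INTERCEPT + sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)


def logistic(z):
    """ASSUMPTION: a logistic link maps the linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical patient; values and units are illustrative assumptions only.
patient = {
    "bmi": 24.0,                # kg/m^2 (assumed unit)
    "sbp": 140.0,               # mmHg (assumed unit)
    "tc": 5.0,                  # mmol/L (assumed unit)
    "red_tongue": 1,            # 1 = yes, 0 = no
    "thin_white_coating": 0,
    "thick_yellow_coating": 1,
}

score = linear_predictor(patient)
probability = logistic(score)
print(f"linear score = {score:.3f}, probability under logistic link = {probability:.3f}")
```

Variables that Lasso shrank to zero (sex, age, diastolic blood pressure, and so on) simply do not appear in the dictionary, which is what makes the fitted model compact.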