Yu Sheng

Sheng Yu

Associate Professor

Research Areas: medical informatics, natural language processing, data analysis of electronic medical records, knowledge graph construction, clinical decision support.

Office: Room 209-A, Weiqing Building, Tsinghua University

Phone: +86-10-62783842

Email: syu@tsinghua.edu.cn

Background
  • Ph.D., George Washington University, Systems Engineering (Operations Research)
  • Postdoctoral Research Fellow, Harvard University, 2012-2015
  • Research Fellow, Brigham and Women's Hospital, 2012-2015
  • Assistant Professor, Center for Statistical Science, Tsinghua University, 2015-2018
  • Associate Professor, Center for Statistical Science, Tsinghua University, 2018-
  • RONG Professor, Institute for Data Science, Tsinghua University, 2018-
Teaching
  • Introduction to Data Science
  • Statistical Methods in Data Mining
  • Big Data Computing
  • Statistical Computing
Patents
  • 《信息检索方法和装置》 (Information Retrieval Method and Apparatus), for precisely identifying electronic medical records according to the meaning of natural language queries. Patent Number: ZL 201310200430.3. Inventor: Yu Sheng.
Publications
  1. Wenxin Ning, Stephanie Chan, Andrew Beam, Ming Yu, Alon Geva, Katherine P Liao, Mary Mullen, Kenneth D Mandl, Isaac S Kohane, Tianxi Cai, Sheng Yu*. Feature Extraction for Phenotyping from Semantic and Knowledge Resources. Journal of Biomedical Informatics (2019); Impact factor 2.882.
  2. Jiaqi Guan, Runzhe Li, Sheng Yu, Xuegong Zhang. Generation of Synthetic Electronic Medical Record Text. Proceedings of IEEE BIBM (2018).
  3. Jessica Gronsbell, Jessica Minnier, Sheng Yu, Katherine Liao, Tianxi Cai. Automated Feature Selection of Predictors in Electronic Medical Records Data. Biometrics (2018); Impact factor 1.524.
  4. Anil Can, Victor Castro, Dmitriy Dligach, Sean Finan, Sheng Yu, Vivian Gainer, Nancy Shadick, Guergana Savova, Shawn Murphy, Tianxi Cai, Scott Weiss, and Rose Du*. Elevated International Normalized Ratio is Associated with Ruptured Aneurysms. Stroke (2018); Impact factor 6.032.
  5. Anil Can, Robert F. Rudy, Victor M. Castro, Sheng Yu, Dmitriy Dligach, Sean Finan, Vivian Gainer, Nancy A. Shadick, Guergana Savova, Shawn Murphy, Tianxi Cai, Scott T. Weiss, Rose Du*. Association between Aspirin Dose and Subarachnoid Hemorrhage from Saccular Aneurysms: A Case-Control Study; Neurology (2018); Impact factor 8.320.
  6. Anil Can, Victor M. Castro, Dmitriy Dligach, Sean Finan, Sheng Yu, Vivian Gainer, Nancy A. Shadick, Guergana Savova, Shawn Murphy, Tianxi Cai, Scott T. Weiss, Rose Du*. Low Serum Calcium and Magnesium Levels and Rupture of Intracranial Aneurysms; Stroke (2018); Impact factor 6.032.
  7. Jennifer A. Sinnott*, Fiona Cai, Sheng Yu, Boris P. Hejblum, Chuan Hong, Isaac S. Kohane, Katherine P. Liao. PheProb: Probabilistic Phenotyping Using Diagnosis Codes to Improve Power for Genetic Association Studies. Journal of the American Medical Informatics Association (2018); Top journal of medical informatics, impact factor 4.270.
  8. Jian Zhang, Anil Can, Srinivasan Mukundan Jr., Michael Steigner, Victor M. Castro, Dmitriy Dligach, Sean Finan, Sheng Yu, Vivian Gainer, Nancy A. Shadick, Guergana Savova, Shawn Murphy, Tianxi Cai, Zhong Wang, Scott T. Weiss, Rose Du*. Morphological Variables Associated with Ruptured Middle Cerebral Artery Aneurysms. Neurosurgery (2018); Impact factor 4.889.
  9. Thomas H. McCoy#, Sheng Yu#, Kamber L. Hart, Victor M. Castro, Hannah E. Brown, James N. Rosenquist, Alysa E. Doyle, Pieter J. Vuijk, Tianxi Cai*, Roy H. Perlis*. High Throughput Phenotyping for Dimensional Psychopathology in Electronic Health Records. Biological Psychiatry (2018); DOI: 10.1016/j.biopsych.2018.01.011; #Co-first author; Top journal in psychiatry, impact factor 11.412. https://www.sciencedaily.com/releases/2018/02/180226103436.htm
  10. Thomas H. McCoy, Victor M. Castro, Kamber L. Hart, Amelia M. Pellegrini, Sheng Yu, Tianxi Cai, Roy H. Perlis*. Genome-wide Association Study of Dimensional Psychopathology Using Electronic Health Records. Biological Psychiatry (2018); DOI: 10.1016/j.biopsych.2017.12.004; Top journal in psychiatry, impact factor 11.412.
  11. Anil Can, Victor Castro, Dmitriy Dligach, Sean Finan, Sheng Yu, Vivian Gainer, Nancy Shadick, Guergana Savova, Shawn Murphy, Tianxi Cai, Scott Weiss, and Rose Du*. Lipid-Lowering Agents and High HDL are Inversely Associated with Intracranial Aneurysm Rupture; Stroke (2018); Impact factor 6.032.
  12. Anil Can, Victor M. Castro, Yildirim H. Ozdemir, Sarajune Dagen, Dmitriy Dligach, Sean Finan, Sheng Yu, Vivian Gainer, Nancy A. Shadick, Shawn Murphy, Tianxi Cai, Guergana Savova, Scott T. Weiss, Rose Du*; Alcohol Consumption and Aneurysmal Subarachnoid Hemorrhage; Translational Stroke Research (2018), 9(1):13-19; Impact factor 4.503.
  13. Anil Can, Victor M. Castro, Sheng Yu, Dmitriy Dligach, Sean Finan, Vivian Gainer, Nancy A. Shadick, Guergana Savova, Shawn Murphy, Tianxi Cai, Scott T. Weiss, Rose Du*; Antihyperglycemic Agents are Inversely Associated with Intracranial Aneurysm Rupture; Stroke (2018), 49(1):34-39; doi: 10.1161/STROKEAHA.117.019249; Impact factor 6.032.
  14. Sheng Yu*, Yumeng Ma, Jessica Gronsbell, Tianrun Cai, Ashwin N. Ananthakrishnan, Vivian S. Gainer, Susanne E. Churchill, Peter Szolovits, Shawn N. Murphy, Isaac S. Kohane, Katherine P. Liao, Tianxi Cai. Enabling Phenotypic Big Data with PheNorm; Journal of the American Medical Informatics Association (2018), 25(1, special issue featuring data science):54-60; doi: 10.1093/jamia/ocx111; Top journal of medical informatics, impact factor 4.270.
  15. Anil Can, Victor M. Castro, Yildirim H. Ozdemir, Sarajune Dagen, Dmitriy Dligach, Sean Finan, Sheng Yu, Vivian Gainer, Nancy A. Shadick, Guergana Savova, Shawn Murphy, Tianxi Cai, Scott T. Weiss, Rose Du*; Heroin Use is Associated with Ruptured Saccular Aneurysms; Translational Stroke Research (2017); doi: 10.1007/s12975-017-0582-y; Impact factor 4.503.
  16. Anil Can, Victor Castro, Yildirim H Ozdemir, Sarajune Dagen, Sheng Yu, Dmitriy Dligach, Sean Finan, Vivian S Gainer, Nancy Shadick, Shawn Murphy, Tianxi Cai, Guergana Savova, Ruben Dammers, Scott T Weiss, and Rose Du*; Association of Intracranial Aneurysm Rupture with Smoking Duration, Intensity, and Cessation; Neurology (2017), 10-1212; Impact factor 8.320.
  17. Sheng Yu*, Abhishek Chakrabortty, Katherine P. Liao, Tianrun Cai, Ashwin N. Ananthakrishnan, Vivian S. Gainer, Susanne E. Churchill, Peter Szolovits, Shawn N. Murphy, Isaac S. Kohane, Tianxi Cai. Surrogate-assisted Feature Extraction for High-throughput Phenotyping; Journal of the American Medical Informatics Association (2017), 24 (e1): e143-e149; doi: 10.1093/jamia/ocw135; Top journal of medical informatics, impact factor 4.270.
  18. Victor M. Castro, Dmitriy Dligach, Sean Finan, Sheng Yu, Anil Can, Muhammad Abd-El-Barr, Vivian Gainer, Nancy A. Shadick, Shawn Murphy, Tianxi Cai, Guergana Savova, Scott T. Weiss, Rose Du*; Large-scale identification of subjects with cerebral aneurysms using natural language processing; Neurology (2017): 88(2), 164-168; Impact factor 8.320.
  19. Florence H. Yong, Lu Tian*, Sheng Yu, Tianxi Cai and L.J. Wei. Optimal stratification in outcome prediction using baseline information; Biometrika, 103.4 (2016): 817-828; Impact factor 1.448.
  20. Tianrun Cai, Andreas A. Giannopoulos, Sheng Yu, Tatiana Kelil, Beth Ripley, Kanako K. Kumamaru, Frank J. Rybicki, and Dimitrios Mitsouras*. Natural Language Processing Technologies in Radiology Research and Clinical Applications. RadioGraphics (2016), 36, no. 1: 176-191. Impact factor: 3.427.
  21. Sheng Yu*, Katherine P. Liao, Stanley Y. Shaw, Vivian S. Gainer, Susanne E. Churchill, Peter Szolovits, Shawn N. Murphy, Isaac Kohane, and Tianxi Cai. Toward High-throughput Phenotyping: Unbiased Automated Feature Extraction and Selection from Knowledge Sources; Journal of the American Medical Informatics Association (2015), 22(5):993-1000; doi: 10.1093/jamia/ocv034; Top journal of medical informatics, impact factor 4.270; EDITOR'S CHOICE.
  22. Castro, V., Shen, Y., Yu, S., Finan, S., Pau, C.T., Gainer, V., Keefe, C.C., Savova, G., Murphy, S.N., Cai, T. and Welt, C.K.*. Identification of subjects with polycystic ovary syndrome using electronic health records. Reproductive Biology and Endocrinology (2015), 13(1), p.116. Impact factor 2.226.
  23. Sheng Yu*, Kanako K. Kumamaru, Elizabeth George, Ruth M. Dunne, Arash Bedayat, Matey Neykov, Andetta R. Hunsaker, Karin E. Dill, Tianxi Cai, and Frank J. Rybicki. Classification of CT Pulmonary Angiography Reports By Presence, Chronicity, and Location of Pulmonary Embolism with Natural Language Processing; Journal of Biomedical Informatics, 52 (2014): 386-393; Impact factor 2.882.
  24. Vishesh Kumar*, Katherine Liao, Su-Chun Cheng, Sheng Yu, Uri Kartoun, Ari Brettman, Vivian Gainer, Andrew Cagan, Shawn Murphy, Guergana Savova, Pei Chen, Peter Szolovits, Zongqi Xia, Elizabeth Karlson, Robert Plenge, Ashwin Ananthakrishnan, Susanne Churchill, Tianxi Cai, Isaac Kohane, Stanley Shaw. Natural Language Processing Improves Phenotypic Accuracy in an Electronic Medical Record Cohort of Type 2 Diabetes and Cardiovascular Disease; Journal of the American College of Cardiology (2014), 63(12):A1359; Impact factor 19.896.
  25. Sheng Yu* and Enrique Campos-Náñez. Adaptive Convex Enveloping for Multidimensional Stochastic Dynamic Optimization; 62nd IIE Annual Conference and Expo. Proceedings. 2012. Best Paper of Operations Research.