[Statistics Forum] Improving hierarchical models using historical data with applications to microarray data analysis

Upcoming Events
Time: 2015-12-17, 16:00-17:00 Speaker: Zhaohui Qin

Title: Improving hierarchical models using historical data with applications to microarray data analysis
Time: 4:00pm-5:00pm, 2015-12-17 (Thursday)
Location: Room 209, Weiqing Building (伟清楼), Center for Statistical Science, Tsinghua University

Speaker: Zhaohui (Steve) Qin, Emory University


Abstract:

Modern high-throughput biotechnologies such as microarrays and next-generation sequencing produce massive amounts of information for each sample assayed. However, in a typical high-throughput experiment, only a very limited amount of data is observed for each individual feature, hence the classical large p, small n problem. Bayesian hierarchical models, capable of borrowing strength across features within the same dataset, have been recognized as an effective tool for analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical models, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that, in the Big Data era, large amounts of historical data are available and should be taken advantage of. Our strategies present a new framework to enhance Bayesian hierarchical models. Through simulation and real data analysis, we demonstrate the superior performance of the proposed strategies.

About the speaker
Steve Qin is an Associate Professor of Biostatistics and Bioinformatics at Emory University. He received his PhD in Statistics from the University of Michigan and was a Postdoctoral Fellow in the Department of Statistics at Harvard University. The major goal of Dr. Qin’s work is to apply his extensive training in statistics to provide analytical tools for the genetics and genomics research community. His current research focuses on developing and evaluating model-based methods to analyze high-throughput genomics and epigenomics data from ChIP-Seq, RNA-Seq, WGBS and Hi-C experiments. In addition to method development, he has collaborated extensively with biologists and clinicians to help them extract novel insights from biomedical data.