
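Recently, the research group of Associate Professor Deng Ke at our center published a paper titled “Partition-Mallows Model and Its Inference for Rank Aggregation” in the Journal of the American Statistical Association (JASA), a top international statistics journal, proposing a new method for inference in rank aggregation. Dr. Zhu Wanchuang, who previously worked in Deng Ke's group, is the first author; Dr. Jiang Yingkai and Professor Liu Jun are co-authors; and Associate Professor Deng Ke is the corresponding author.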

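Rank aggregation asks how to combine rankings of a set of entities obtained from different sources into a single, more “accurate” ranking. For example, suppose m judges each rank n athletes by ability. Rank aggregation analyzes these m rankings jointly to produce a new ranking that reflects the athletes' relative abilities more accurately. In practice, the m judges may differ in reliability, and the less reliable judges can mislead the aggregated result. Developing data-driven methods that automatically identify each judge's reliability and use it to improve the aggregated ranking is therefore of considerable practical importance.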
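To make the setup concrete, here is a minimal sketch of one simple aggregation rule, a weighted Borda count. It is not the method of the paper discussed in this post; the function name `borda_aggregate` and the hand-picked weights are illustrative assumptions, whereas a data-driven method has to infer the judges' reliability from the rankings themselves.

```python
from collections import defaultdict


def borda_aggregate(rankings, weights=None):
    """Aggregate full rankings with a (hypothetical) weighted Borda count.

    rankings: list of lists, each ordering the same items from best to worst.
    weights:  one non-negative weight per judge; defaults to equal weights.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, item in enumerate(ranking):
            # The top item earns n - 1 points, the last item earns 0.
            scores[item] += weight * (n - 1 - position)
    # Highest score first; ties are broken by item name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


judges = [["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]]
print(borda_aggregate(judges))                   # ['B', 'A', 'C']
print(borda_aggregate(judges, [1.0, 0.2, 0.2]))  # ['A', 'B', 'C']
```

In the second call the last two judges are down-weighted by hand, which is enough to change the aggregated ranking; this is why the reliability of each judge matters.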

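In 2014, Deng Ke, Liu Jun, and colleagues published a paper in JASA titled “Bayesian Aggregation of Order-Based Rank Data”, which proposed BARD, a rank aggregation method based on a partition model. BARD divides the ranked entities into two groups, a “relevant group” and a “background group”, and assumes that more reliable judges rank entities in the relevant group ahead of those in the background group with higher probability. The method identifies judge reliability effectively and, by down-weighting the less reliable judges in the aggregation, removes the negative effect they might otherwise have. However, it simply ignores the differences among entities within each group and therefore loses much of the within-group ranking information. From an applied perspective, this is an important limitation.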

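Building on that work, this paper models the within-group rankings with the more refined Mallows model, combining the strengths of the partition model and the Mallows model into a more powerful rank aggregation model, the Partition-Mallows model. The model provides a general framework for the quantitative description of rank data with complex structure. By making full use of both between-group and within-group ranking information, it not only identifies differences in judge reliability effectively but also produces more efficient aggregated rankings. We establish the reliability of the method theoretically and show, through extensive computer simulations and real-data studies, that it has clear advantages for rank aggregation problems with group structure.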
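For readers unfamiliar with the Mallows model, the sketch below shows its most common form, in which the probability of a ranking decays geometrically with its Kendall tau distance from a central ranking. This is only the plain Mallows model, not the Partition-Mallows model of the paper; the function names and the dispersion parameter `phi` (with 0 < phi <= 1) are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_distance(ranking, center):
    """Number of item pairs that the two rankings order differently."""
    position = {item: i for i, item in enumerate(center)}
    # combinations() yields (x, y) with x ranked before y in `ranking`.
    return sum(1 for x, y in combinations(ranking, 2)
               if position[x] > position[y])


def mallows_normalizer(n, phi):
    """Closed-form normalizing constant: prod_{i=1..n} (1 + phi + ... + phi^(i-1))."""
    z = 1.0
    for i in range(1, n + 1):
        z *= sum(phi ** k for k in range(i))
    return z


def mallows_probability(ranking, center, phi):
    """P(ranking | center, phi) = phi^d(ranking, center) / Z_n(phi)."""
    d = kendall_tau_distance(ranking, center)
    return phi ** d / mallows_normalizer(len(center), phi)


center = ("A", "B", "C")
print(mallows_probability(("A", "B", "C"), center, 0.5))  # most likely ranking
print(mallows_probability(("C", "B", "A"), center, 0.5))  # least likely ranking
```

A smaller `phi` concentrates the probability around the central ranking, while `phi = 1` makes all rankings equally likely.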

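Figure source: the original paper

This research was supported by the National Natural Science Foundation of China (Grants 11771242 & 11931001), the Beijing Academy of Artificial Intelligence (Grant BAAI2019ZD0103), and the U.S. National Science Foundation (Grants DMS-1903139 and DMS-1712714).

Paper link: https://doi.org/10.1080/01621459.2021.1930547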

 


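On June 25-26, 2021, the Center for Statistical Science at Tsinghua University initiated and hosted the “2021 Second Tsinghua University Symposium on Statistics Teaching Reform”. The meeting aimed to improve the statistics curriculum and to give statistics educators a platform for communication and exchange. Nearly fifty scholars from 23 universities, including Peking University, Renmin University of China, Central University of Finance and Economics, and Beijing Normal University, attended.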

[Photo: Deng Ke giving the welcome address]
[Photo: Wang Jiangdian chairing the meeting]

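At the start of the meeting, its organizer and host, Dr. Wang Jiangdian, a lecturer at the Center for Statistical Science, Tsinghua University, noted that the symposium has now been held successfully twice. It was launched both to respond to the national call for teaching reform and to increase exchange among front-line statistics teachers, advancing statistics education at universities nationwide. Associate Professor Deng Ke, executive director of the center, then gave a welcome address on behalf of the host and briefly introduced statistics teaching at Tsinghua University.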

[Photo: Attending teachers giving their talks]

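The symposium had two themes. The first was “teaching philosophy and curriculum systems”, on which Wang Feifei (Renmin University of China), Song Yanyan (Shanghai Jiao Tong University), and Zhou Fanyin (Southwestern University of Finance and Economics) gave talks. The second was “teaching methods and experience”, on which Lin Shan (Northeast Normal University), Wang Chen (Tsinghua University), and Li Yang (Renmin University of China) shared what they had learned from their own teaching.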

[Photo: Salon discussion session]

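The salon discussion covered three topics: “designing and using modern teaching tools”, “theory-oriented versus practice-oriented courses”, and “course design and teaching for machine learning and statistical computing courses”, chaired respectively by Wang Jiangdian, Deng Wanlu, and Zhou Zaiying of Tsinghua University. After lively discussion, the attending scholars reached the following consensus: teaching philosophy should be more application-oriented; courses related to machine learning should be added; and new teaching methods should be actively explored to improve teaching outcomes.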

[Photo: Group photo of the attending scholars]

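Teaching reform is a long and demanding task. Only when front-line teachers keep exchanging ideas and reflecting on their own experience can statistics education in China advance steadily and the discipline as a whole develop. The attending scholars fully affirmed the need for this symposium and expressed the hope that it will give real impetus to statistics teaching reform at universities nationwide.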

 

 


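On June 10, 2021, the “Fifth PKU-Tsinghua Statistics Forum” was held in the lecture hall of the Jia-Yi-Bing Building, Jingchunyuan, Peking University. More than a hundred faculty members and students in statistics from Peking University and Tsinghua University attended. The PKU-Tsinghua Statistics Forum is a traditional academic event of the two universities' statistics programs, jointly initiated by the Center for Statistical Science, Peking University, and the Center for Statistical Science, Tsinghua University, and has now been held successfully five times.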

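[Photo: Professor Yao Fang, director of the Center for Statistical Science, Peking University, giving the opening address]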

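The forum opened with an address by Professor Yao Fang, director of the Center for Statistical Science, Peking University. He introduced the history and current state of statistics at Peking University, reviewed the long-standing close cooperation between the two universities in statistics, stressed the importance of the PKU-Tsinghua Statistics Forum, and expressed his hope for even deeper cooperation between the two universities in the future.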

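[Photo: Professor Geng Zhi, School of Mathematical Sciences, Peking University, giving an invited talk]
[Photo: Professor Xu Xianchun, School of Economics and Management, Tsinghua University, giving an invited talk]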

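The invited talks followed. Over the years, internationally renowned statisticians including Professors Yu Bin, Liu Jun, Lin Xihong, Chen Songxi, Zhou Xiaohua, Yang Lijian, Yao Fang, and Donald Rubin have attended the forum as invited guests and given invited talks. This year's invited speakers were Professor Geng Zhi of the School of Mathematical Sciences, Peking University, and Professor Xu Xianchun of the School of Economics and Management, Tsinghua University. Professor Geng is an internationally known expert on causal inference in statistics, and his talk was on “Local learning of causal networks and identifiability of causal effects”. Professor Xu is director of the Center for China's Economic and Social Data at Tsinghua University, former deputy commissioner of the National Bureau of Statistics, and a senior statistician; his talk was titled “Research on the statistics and accounting of data assets”.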

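[Photo: Oral presentations by doctoral students from the two universities (left to right, top to bottom: Huang Kun, Zhang Xinyu, Jin Zijie, Yu Bo, Hu Wenjie, Zhou Hang, Hu Xiaoyu, Zhu Ke)]

[Photo: Poster presentations and discussion]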

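The poster session and poster awards are another classic part of the PKU-Tsinghua Statistics Forum. Doctoral students from both universities gave oral presentations and presented posters on their research, and discussed their work in depth.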

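[Photo: Professor Yao Fang of Peking University presenting Outstanding Poster awards to Huang Kun and Zhang Xinyu]

[Photo: Associate Professor Deng Ke of Tsinghua University presenting Outstanding Poster awards to Hu Xiaoyu and Zhou Hang]
[Photo: Professor Geng Zhi of Peking University presenting the Outstanding Graduate award to Lin Yingqian]
[Photo: Professor Xu Xianchun of Tsinghua University presenting the Outstanding Graduate award to Jiang Feiyu]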

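After a closely contested selection, Zhang Xinyu (doctoral class of 2017) and Huang Kun (doctoral class of 2018) of the Center for Statistical Science, Tsinghua University, and Hu Xiaoyu (doctoral class of 2016) and Zhou Hang (doctoral class of 2017) of the School of Mathematical Sciences, Peking University, received the “Outstanding Poster Award”. Jiang Feiyu (doctoral class of 2016, Tsinghua University) and Lin Yingqian (doctoral class of 2016, Peking University) were named this year's “Outstanding Graduates”.

[Photo: Group photo of attending faculty and students]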

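On May 28-30, the “Tsinghua University 2021 National Exchange Meeting on Statistics and Data Science for Outstanding Undergraduates (Outstanding Undergraduate Summer Camp)” was held successfully. The event received 390 applications from 99 universities nationwide. After several rounds of screening, 43 students from well-known universities including Tsinghua University, Peking University, Beijing Normal University, Zhejiang University, Shanghai University of Finance and Economics, and Xi'an Jiaotong University took part, in a mix of online and in-person formats.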

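[Figure: Distribution of applicants by university]
[Figure: Word cloud of applicants' academic backgrounds]
A tally of the applicants' backgrounds shows that most majored in statistics, mathematics, finance, computer science, or related science and engineering fields as undergraduates, while some came from majors such as English, language management, and marketing. This fits the strongly interdisciplinary character of statistical research. Statistics touches a very wide range of fields, and outstanding statisticians can draw on multidisciplinary knowledge or an interdisciplinary background to support scientific research.
[Photo: Associate Professor Deng Ke, executive director of the center]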

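At the start of the event, Associate Professor Deng Ke, executive director of the center, briefly introduced the achievements of the Center for Statistical Science, Tsinghua University, in discipline building, student training, and research.

Next, four assistant professors who joined the center last year, Wang Tianying, Zhang Jingyi, Yang Pengkun, and Hu Zhirui, talked about their respective research areas.

The participants were then assessed through a written test and a data analysis report. The data analysis was done in teams of two, which tested both professional skills and the ability to work as a team.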

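[Photo: Associate Professor Li Dong of the center summing up the event]
[Photo: Associate Professor Deng Ke of the center presenting certificates to the campers]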

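After several rounds of assessment and selection, 10 students, including Zhang Guoyu of the University of Science and Technology of China, were named “Outstanding Campers”, and 20 students, including Kong Lingjie of Xi'an Jiaotong University, were named “Honorary Campers”.

Full lists of Outstanding Campers and Honorary Campers (in no particular order):

Outstanding Campers (10):

Zhang Guoyu, Yu Haoyang, Li Yicheng, Shi Bowen, Ying Huaiyuan, Fan Xinyuan, Yang Yining, Lin Ziqian, Sun Hongyi, Cai Leheng

Honorary Campers (20):

Kong Lingjie, Gan Weiye, Liu Zhihan, Zhang Canrui, Li Yikang, Wang Baiqing, Liu Chengchang, Jiang Roulan, Xu Tao, Liu Yuanshi, Zhang Zhilong, Dai Yifan, Yang Zhiwen, Xiong Guangzhi, Zhou Chuan, Yuan Huihua, Wang Ruoyan, Yin Jiaheng, Wang Yichen, Zhang Yu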


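In response to the national call for higher-education reform, and to help statistics education in the post-pandemic era meet society's latest needs, the Center for Statistical Science at Tsinghua University will hold the “Second Symposium on Statistics Teaching Reform” on June 25-26, 2021, building on the discussions at the first symposium in 2019. Together with statisticians engaged in front-line teaching, we will explore four dimensions in greater depth: teaching philosophy, curriculum system, training model, and teaching methods. All teachers are welcome to register!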

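[Venue]: on or near the Tsinghua University campus

[Meeting date]: 2021/06/26 (Saturday)

[Check-in date]: 2021/06/25 (Friday)

[Registration period]: from now until 2021/06/15

[Contact]:

Wang Jiangdian: wangjiangdian@tsinghua.edu.cn

To register, send your name, affiliation, courses taught, and contact details to Wang Jiangdian at the email address above.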


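Since 2016, outstanding statistics faculty and students from Peking University and Tsinghua University have gathered to draw on the strengths of both institutions, share what each has to offer, promote the growth of talented young statisticians in China, and support the development of statistics as a discipline in China. To carry on this tradition of solidarity, cooperation, and friendly exchange between the two sister universities, the Fifth PKU-Tsinghua Statistics Forum will be held on June 10, 2021.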

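Time: June 10, 2021, 1:00-5:00 p.m.

Venue: Lecture hall (second floor, east side), Jia-Yi-Bing Building, No. 82 Jingchunyuan, Peking University

Hosts: Center for Statistical Science, Peking University; Center for Statistical Science, Tsinghua University

Registration:

Registration period: from now until May 15, 2021

Registration link: https://docs.qq.com/form/page/DWmhOWHRCZVdxQUli#/fill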

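Invited talks

Professor Geng Zhi, Peking University
Professor, School of Mathematical Sciences, Peking University
President, Beijing Society for Biomedical Statistics and Data Management
Deputy director, Technical Committee on Uncertainty in Artificial Intelligence, Chinese Association for Artificial Intelligence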

 

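Professor Xu Xianchun, Tsinghua University
Professor, School of Economics and Management, Tsinghua University
Director, Center for China's Economic and Social Data, Tsinghua University
Former deputy commissioner of the National Bureau of Statistics; senior statistician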

 

Agenda:

Time Item
13:00-13:30 Check-in
13:30-13:40 Opening ceremony
13:40-14:40 Plenary talk 1 (Professor Geng Zhi)
14:40-15:00 Tea break and group photo
15:00-16:00 Plenary talk 2 (Professor Xu Xianchun)

Talking with Great Minds—Howell Tong

Keywords: International view, Broad-minded, Practical need, Leadership

[Photo: Prof. Tong at Tsinghua]
  1. Childhood and study experience overseas

Q: Can you talk about your childhood?

A: In my childhood, my family and I faced generally tough living conditions, but eventually we came through the hardship and became stronger. I was also very lucky that I always had good teachers. One teacher I remember particularly told us a story about Hua Luogeng when I was very young. That was the time Hua Luogeng returned to China. My teacher told me how he became famous through his studies, and that made quite an impression on me. I think it was because of his story that I decided to study mathematics. I moved to England in 1961, when my father was working there. The secondary school I attended was not a top school, but was very comprehensive. With the support of the school and the headmaster, I picked up English quickly. I was the only boy from that school who went to university.

Q: In that period, did you encounter any challenges in your studies or life?

A: Yes. First of all, I needed to get used to the English way of schooling, for example moving between classrooms for lessons, and the different dietary habits on campus. But luckily the students around me were all very nice, and we became good friends. Because of this experience, I got to know a different culture and its way of speaking. It was quite a big challenge to adapt from the Hong Kong school system to a working-class environment in London at that time.

Q: Is there any particular reason that you chose statistics as your major?

A: Well, as I said, I decided to study mathematics, and I graduated in mathematics at Manchester. We had good statistics teachers and took many statistics courses, which was unusual at that time in England. I also got a chance to listen to a lecture on probability theory given by an eminent probability researcher. I was impressed by his lecture and became interested in probability theory. Because of my family, I decided to suspend my postgraduate study and took a job. During that time, I started to read some papers, and I came across a paper on time series by my former teacher in Manchester. That’s how I became interested in time series. I wrote to him and went back to Manchester. For various reasons, I ended up, by accident, as a university teacher of statistics rather than a postgraduate student. I was very lucky.

  2. Early career as a statistician

Q: How did you finally decide to become a statistics researcher?

A: Once I returned to Manchester, it became quite clear to me that statistics was the career I wanted to pursue. Thanks to my school, I had the opportunity to meet other scientists and technologists, and became interested in control engineering and stochastic control. So, time series became quite a natural subject for me. My early work was mainly in the frequency domain, and I changed to the time domain later on when I met Akaike. He visited us in Manchester for half a year, working on multivariate control systems using multivariate linear AR models, as well as some aspects of AIC (Akaike Information Criterion). We became very good friends, and I wanted to learn more from him, so I applied for a Royal Society Japan Fellowship and went to Japan for 6 months. During that visit, I read a number of papers he had collected, and learned a lot not only from the papers, but also from the marks and personal notes he had made. It was very valuable for me. By talking to him, I learned the background of why he did certain research. He did not do research sitting at his desk, but went out and met other scientists. He did not publish many papers in the first ten years of his career, but did a lot of great work later. He spent time cultivating friendships with engineers and other people. Because of this, he was asked to solve a problem of selecting a suitable model from a number of candidate models for prediction. That’s the original problem behind AIC. So, I got a deep understanding of the whole idea of his research, beyond what I could get from reading papers.

Q: We know that you published many great papers in your early career, so what’s your secret for these fantastic achievements?

A: I remember the words of Mr. Yang Zhenning. He said to do something that you are really passionate about. My father never interfered in my studies, and I never interfered in my children’s careers either. Let the person choose what he or she is really interested in. My mentor was a time series analyst, but he never pushed me, so I had the chance to choose my own area. The reason I chose statistics is that I wanted to produce something new, so I was lucky to be in the right environment, where there was no pressure. I am also very lucky to have a good wife taking care of my family, and lucky to have had the chance to meet other scientists. I am a good learner, and I am able to pick up the things I want to learn. I think passion is what matters, rather than any secret. Remember to be observant and passionate.

  3. About the threshold model

Q: Now let’s talk about one of your most important works in nonlinear time series, the threshold model. Where did the idea come from?

A: When I was visiting Akaike, I learned the way he produced the spectral density estimate. So, I used the approach on the lynx data, which I was very interested in. There was a session at the Royal Statistical Society and I presented this paper. During the discussion, there was one gentleman who made a very, very important comment. He said that the data are cyclical, but the cycle is not symmetric. The lynx population would rise slowly but fall rapidly. If you use a linear Gaussian model, you would never be able to capture it. Also, he said that from the point of view of dynamical systems, the cycle should be considered a limit cycle. So, if you could produce a model that leads to a limit cycle, it would be ideal. David Cox and Akaike also made some similar comments. However, it was very difficult and a big challenge. So, I decided to work on the problem. But my entire education up to that time had been in linear models. So, I needed to teach myself nonlinear dynamical systems.

Then, one day I was in my garden mowing the lawn. When you mow the lawn, you go strip by strip. Suddenly, the idea of piecewise linearity came into my mind. This was because I had been subconsciously thinking about the problem all the time.

Then I started working on the idea, and a student did the programming. One day she brought me some results which were too perfectly periodic to be possible. Then I found that she had forgotten the noise. This was the first time I saw a limit cycle. Then, I said we could also see whether this model could produce other nonlinear phenomena, such as subharmonics, higher harmonics, amplitude-frequency dependency and so on. And it turned out that the model could do that.
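The episode above, a noise-free run settling into a perfectly periodic orbit, can be reproduced with a toy two-regime threshold autoregression. The coefficients and the threshold below are made up for illustration and are not those of Prof. Tong’s lynx model; `simulate_tar` is a hypothetical helper name.

```python
import random


def simulate_tar(n_steps, x0=0.0, noise_sd=0.0, seed=0):
    """Simulate a toy threshold AR(1) model with threshold 1.0.

    Regime 1 (x <= 1): x_next =  1.0 + 0.5 * x
    Regime 2 (x >  1): x_next = -1.0 + 0.5 * x
    All coefficients are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    path = []
    for _ in range(n_steps):
        if x <= 1.0:
            x = 1.0 + 0.5 * x
        else:
            x = -1.0 + 0.5 * x
        if noise_sd > 0:
            # Gaussian innovation; leaving it out gives the noise-free skeleton.
            x += rng.gauss(0.0, noise_sd)
        path.append(x)
    return path


skeleton = simulate_tar(60)              # the "forgotten noise" case
print([round(v, 4) for v in skeleton[-6:]])
noisy = simulate_tar(60, noise_sd=0.1)   # same model with noise added
print([round(v, 4) for v in noisy[-6:]])
```

With `noise_sd=0` the path converges to the period-3 cycle -2/7, 6/7, 10/7: two upward steps followed by one sharp drop, the kind of asymmetric cycle the discussant described. Adding noise perturbs the path around that cycle.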

Q: Did you encounter difficult times with the model?

A: Yes. A lot of people discussed the paper, but I could not say everybody liked it, maybe because the idea was so new. One or two people also attacked it. The model was invented in the 1980s but remained fairly quiet for 10 years. It was in about the 1990s that the model attracted a lot of attention. So, the beginning was not easy.

Q: From your experience, how does one find a good research problem?

A: First of all, you have to be social. To me, statisticians are toolmakers. What tool you want to invent must be dictated by practical needs from people on the ground. So, we should go out, interact and collaborate with other scientists. We should be members of scientific teams. Don’t follow fashion blindly. I never want to follow fashion. When I did nonlinear time series, almost none of the leaders in time series worked on that.

There are probably two types of research. One is run-of-the-mill research, which means an incremental improvement. Such work does not take long, and you can publish it very quickly. The other is revolutionary research. Of course, in one’s lifetime, one would probably not have more than a couple of such revolutions. But you must always keep them in mind and work on them in any spare time.

  4. About leadership

Q: You have been Chair of Statistics at several universities. How do you do a good job in both academic work and management? What’s your secret?

A: I adopted the principle I learned from Lao Tzu (老子) and Sun Tzu’s “Art of War” (孙子兵法). I cannot micromanage, so if there is any big job I will identify a suitable person. Then I will give the person my full support. So if you use a person, you need to trust him (用人不疑,疑人不用).

  5. About statistics in the future

Q: Do you worry about the future of statistics given the competition from Machine Learning and AI?

A: As Lao Tzu said, behind every good fortune there is a misfortune, and misfortune leads to good fortune (祸兮福之所倚,福兮祸之所伏). I think both aspects certainly apply to the challenge statistics is facing in the domain of data science. But if we steer our ship of statistics sensibly, we can benefit. Machine learning is certainly a powerful tool, but some of its ideas are not unknown or uncommon in statistics. In statistics, the basic training is in how to handle randomness, and for anything that requires that, statistics has advantages. On the other hand, we have to be fully prepared and liberate our minds. Some of the old ideas may be too restrictive. We used to deal with small data sets in the days of Fisher, but now we have to deal with large data sets. To meet the new challenge, we have to adopt the attitude in Chinese culture: when foreigners come, we absorb them.

So, I don’t worry. As long as we are broad-minded and ready to adapt, we can survive and grow.

 


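On December 27-29, 2020, the annual meeting of the International Consortium of Chinese Mathematicians was held in Hefei, Anhui. The paper “On the unsupervised analysis of domain-specific Chinese texts”, with Associate Professor Deng Ke of the Center for Statistical Science, Tsinghua University, as first author, received the “2020 International Consortium of Chinese Mathematicians Best Paper Award, Silver Award”. The paper is joint work by Associate Professor Deng Ke, Professor Peter Bol of Harvard University, Professor Liu Jun of Harvard University, and Associate Professor Li Jiayi of Suffolk University, and was published in the Proceedings of the National Academy of Sciences (PNAS).

The paper proposes TopWORDS, a new method for unsupervised Chinese text analysis based on statistical models and principles, which performs word discovery and Chinese word segmentation on domain-specific Chinese texts. The method can also be combined with other text analysis tools, such as word embeddings, topic models, and association rule mining, to extract the main features and information in a text. It is an important advance in Chinese text mining.

[Photo: Professor Shing-Tung Yau (right) and Professor Lin Yong (left) presenting the award to Associate Professor Deng Ke]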

 


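The “Ninth Tsinghua University Young Teachers' Teaching Competition” recently came to a close. Deng Wanlu of our center won first prize in the science and medicine group, and Zhou Zaiying won second prize in the same group.

A total of 140 young faculty members from 47 schools and departments across the university took part in the training and competition. Two months of teaching training preceded the competition, covering teaching content, teaching methods, lesson-plan writing, and more, followed by 10 teaching workshops for exchange.

The summary and exchange meeting for the Fifth National Young Teachers' Teaching Competition and the Ninth Tsinghua University Young Teachers' Teaching Competition was held last week. Vice President Zheng Li and Wang Yan, deputy director of the University Council and chair of the labor union, attended the meeting, which was chaired by Zeng Rong, deputy provost and director of the Academic Affairs Office.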

[Photo: Vice President Zheng Li at the summary and exchange meeting]

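Zheng Li congratulated the young teachers who achieved excellent results in the national and university competitions. He reviewed the history of the competition and commended the work of the coaching team. He described the competition as a festival of learning, in which young teachers improve their teaching ability through the guidance of their coaches and exchange with fellow contestants while preparing and competing; a festival of creation, in which contestants keep pushing their own limits, pursue excellence, and raise their professional standards; and a festival of inheritance, in which young teachers grow quickly under the example of their coaches, apply the teaching skills they have learned in the classroom, and pass on the university's fine traditions and experience in teaching and educating from generation to generation.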

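[Photo: Deng Wanlu (front row, center) speaking at the meeting]
At the meeting, the award-winning teachers spoke readily and shared what they had learned and gained from the competition. They agreed that the competition gave strong guidance and that preparing for and taking part in it provided valuable experience for their future teaching careers. The judges' comments were objective, comprehensive, and accurate, and pointed out detailed problems in the teaching process. This helped the young teachers better understand the disciplinary way of thinking, direction of development, philosophy, and core values of their courses, and made clear where they should direct their efforts.

[Photo: The meeting in progress]

Text and photos | Tsinghua News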


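On November 27, Tsinghua University held a commendation ceremony for the fight against COVID-19. The “Epidemic Transmission Forecasting and Countermeasures” science and technology task force, led by Professors Gong Peng and Xu Bing of the Department of Earth System Science, received the honorary title of “Tsinghua University Advanced Collective in the Fight against COVID-19”. Associate Professor Deng Ke, executive director of our center, Associate Professor Hou Lin, and the center's doctoral students Liu Zhaoyang, Shen Chong, Wang Che, Song Shuang, and Yu Bo, all core members of the task force, attended the ceremony together to receive the commendation.

[Photo: Professor Gong Peng (back row, fifth from left), Associate Professor Deng Ke (front row, fourth from left), Associate Professor Hou Lin (front row, first from right), and the faculty and students of the task force receiving the commendation]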