[Shuimu Data Science Interviews] Academician Howell Tong, Distinguished Visiting Professor at Tsinghua University

Published: March 3, 2021

Talking with Great Minds—Howell Tong

Keywords: International view, Broad-minded, Practical need, Leadership

Prof. Tong in Tsinghua
  1. Childhood and study experience overseas.

Q: Can you talk about your childhood?

A: In my childhood, my family and I faced generally tough living conditions, but eventually we got through the hardship and became stronger. I was also very lucky to always have good teachers. One teacher I remember in particular told us a story about Hua Luogeng when I was very young. That was the time when Hua Luogeng returned to China. My teacher told us how he became famous through study, and that made quite an impression on me. I think it was because of his story that I decided to study Mathematics. I moved to England in 1961, when my father went to work there as a waiter in a Chinese restaurant. The secondary school I attended was not a top school, but it was very comprehensive. With the support of the school and the headmaster, I picked up English quickly. I was the only boy from that school who went to a university.

Q: In that period, did you encounter any challenges in your studies or life?

A: Yes. First of all, I needed to get used to the English way of schooling, for example moving between classrooms for lessons, and to the different dietary habits on campus. But luckily the students around me were all very nice, and we became good friends. Because of this experience, I got to know a different culture and its way of speaking. It was quite a big challenge to adapt from the Hong Kong school system to a working-class environment in London at that time.

Q: Is there any particular reason that you chose Statistics as your major?

A: Well, as I said, I decided to study Mathematics, and I graduated in Mathematics at Manchester. We had good statistics teachers and took many statistics courses, which was unusual at that time in England. I also got a chance to attend a lecture on probability theory given by an eminent probability researcher. I was impressed by his lecture and became interested in probability theory. Because of my family, I decided to postpone post-graduate study and took a job. During that time, I started to read some papers, and I came across a paper on time series by my former teacher in Manchester. That’s how I became interested in time series. I wrote to him and went back to Manchester. For various reasons, I unexpectedly became a university teacher of statistics instead of a post-graduate student. I was very lucky.

  2. Early career as a statistician.

Q: How did you finally decide to become a statistics researcher?

A: Once I returned to Manchester, it became quite clear to me that statistics was the career I wanted to pursue. Thanks to my school, I had the opportunity to meet other scientists and technologists, and I became interested in control engineering and stochastic control. So, time series became quite a natural subject for me. My early work was mainly in the frequency domain, and I moved to the time domain later on, when I met Akaike. He visited us in Manchester for half a year, working on multivariate control systems using the multivariate linear AR model, as well as some aspects of AIC (Akaike Information Criterion). We became very good friends, and I wanted to learn more from him, so I applied for a Royal Society Japan Fellowship and went to Japan for 6 months. During that visit, I read a number of papers he had collected, and learned a lot not only from the papers, but also from the marks and personal notes he had made. It was very valuable for me. By talking to him, I learned the background to why he did certain research. He did not do his research sitting at a desk; he went out and met other scientists. He did not publish many papers in the first ten years of his career, but did a lot of great work later. He spent time cultivating friendships with engineers and other people. Because of this, he was asked to solve the problem of selecting a suitable model from a number of candidate models for prediction. That’s the original problem behind AIC. So, beyond reading his papers, I gained a deep understanding of the whole idea of his research.

Q: We know that you published many great papers in your early career, so what’s the secret behind these fantastic achievements?

A: I remember the words of Mr. Yang Zhenning. He said to do something that you are really passionate about. My father never interfered in my studies, and I never interfered in my children’s careers either. Let the person choose what he or she is really interested in. My mentor was a time series analyst, but he never pushed me, so I had the chance to choose my own area. I chose statistics because I wanted to produce something new, so I was lucky to be in the right environment, where there was no pressure. I am also very lucky to have a good wife taking care of my family, and lucky to have had the chance to meet other scientists. I am a good learner, and I am able to pick up the things I want to learn. I think what matters is passion, rather than any secret. Remember to be observant and passionate.

  3. About the threshold model

Q: Now let’s talk about one of your most important works in non-linear time series, the threshold model. Where did the idea come from?

A: When I was visiting Akaike, I learned the way he produced the spectral density estimate. So, I used the approach on the lynx data, which I was very interested in. There was a session at the Royal Statistical Society where I presented this paper. During the discussion, one gentleman made a very, very important comment. He said that the data are cyclical, but the cycle is not symmetric: the lynx population would rise slowly but fall rapidly. If you use a linear Gaussian model, you would never be able to capture that. He also said that, from the point of view of dynamical systems, the cycle should be considered a limit cycle. So, if you could produce a model that leads to a limit cycle, it would be ideal. David Cox and Akaike also made similar comments. However, it was very difficult and a big challenge. So, I decided to work on the problem. But my entire education up to that time had been in linear models, so I needed to teach myself nonlinear dynamical systems.

Then, one day I was in my garden mowing the lawn. When you mow the lawn, you go strip by strip. Suddenly, the idea of piecewise linearity came into my mind. That was because I had been subconsciously thinking about the problem all the time.

Then I started working on the idea, and a student did the programming. One day she brought me some results which were too perfectly periodic to be possible. Then I found that she had forgotten the noise. This was the first time I saw a limit cycle. Then, I said we could also see whether this model could produce other nonlinear phenomena, such as subharmonics, higher harmonics, amplitude-frequency dependency and so on. And it turned out that the model could do that.
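
The effect described here can be reproduced with a toy example. The sketch below is not the model from the original paper; it is a minimal, assumed two-regime first-order threshold autoregression, and its coefficients, threshold and the function name `simulate_tar` are all chosen purely for illustration. With the noise switched off, the piecewise-linear map settles onto a period-3 limit cycle that climbs in two steps and drops in one, echoing the slow rise and rapid fall mentioned above; with noise added, the perfect periodicity disappears.

```python
import random


def simulate_tar(n, noise_sd=0.0, threshold=1.0, x0=0.0, seed=0):
    """Simulate a hypothetical two-regime threshold AR(1) series of length n.

    Regime is chosen by comparing the previous value with `threshold`.
    All coefficients are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n - 1):
        prev = x[-1]
        if prev <= threshold:
            mean = 1.0 + 0.5 * prev    # lower regime: gradual climb
        else:
            mean = -1.0 + 0.5 * prev   # upper regime: sharp drop
        # Draw a standard normal and scale it, so noise_sd=0 gives no noise.
        x.append(mean + noise_sd * rng.gauss(0.0, 1.0))
    return x


# "Forgetting the noise": the deterministic skeleton repeats every 3 steps.
skeleton = simulate_tar(300)
print([round(v, 4) for v in skeleton[-6:]])

# With noise, the same model no longer repeats exactly.
noisy = simulate_tar(300, noise_sd=0.2)
print([round(v, 4) for v in noisy[-6:]])
```

Each regime on its own is linear and contracting, so neither could sustain a cycle alone; the cycle arises only from switching between the two regimes at the threshold.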

Q: Did you encounter difficult times with the model?

A: Yes. A lot of people discussed the paper, but I could not say everybody liked it, maybe because the idea was so new. One or two people also attacked it. The model was invented in the 1980s but remained fairly quiet for 10 years. It was in about the 1990s that the model attracted a lot of attention. So, the beginning was not easy.

Q: From your experience, how does one find a good research problem?

A: First of all, you have to be social. To me, statisticians are toolmakers. What tool you want to invent must be dictated by practical needs from people on the ground. So, we should go out, interact and collaborate with other scientists. We should be members of scientific teams. Don’t follow fashion blindly. I never want to follow fashion. When I did nonlinear time series, almost none of the leaders in time series worked on that.

There are probably two types of research. One is run-of-the-mill research, which means making incremental improvements. Those things do not take long, and you can publish them very quickly. The other is revolutionary research. Of course, in one’s lifetime, one would probably not have more than a couple of such revolutions. But you must always keep them in mind and work on them in any spare time.

  4. About leadership

Q: You have been Chair of Statistics at several universities. How did you manage to do a good job in both academic work and management? What’s your secret?

A: I adopted the principle I learned from Lao Tzu (老子) and Sun Tzu’s “Art of War” (孙子兵法). I cannot micromanage, so if there is any big job I will identify a suitable person. Then I will give that person my full support. If you employ a person, you need to trust him (用人不疑,疑人不用).

  5. About statistics in the future

Q: Do you worry about the future of statistics given the competition from Machine Learning and AI?

A: As Lao Tzu said, behind every good fortune there is a misfortune, and misfortune leads to good fortune (祸兮福之所倚,福兮祸之所伏). I think both aspects certainly apply to the challenge statistics is facing in the domain of data science. But if we steer our ship of statistics sensibly, we can benefit. Machine learning is certainly a powerful tool, but some of its ideas are not unknown or uncommon in statistics. In statistics, the basic training is in how to handle randomness, and for anything that requires that, statistics has an advantage. On the other hand, we have to be fully prepared and liberate our minds. Some of the old ideas may be too restrictive. We used to deal with small data sets in the days of Fisher, but now we have to deal with large data sets. To meet the new challenge, we have to adopt the attitude in Chinese culture: when foreigners come, we absorb them.

So, I don’t worry. As long as we are broad-minded and ready to adapt, we can survive and grow.

