Stopping for a bit – or for a bit more?

[My blog post published yesterday and cross-posted from Club Soda]

January is a popular month for changing your drinking. Everyone will know at least someone who is making a New Year’s resolution not to drink for a month. Or doing a “Drynuary”, as they say in America. In Club Soda speak, we call this kind of thing stopping for a bit: committing to a period of no drinking, but not quitting permanently. It is generally understood that taking a break from alcohol is good for you and your health. But it may also have longer-lasting benefits, which aren’t as well known yet.

University of Sussex psychologist Dr Richard de Visser has studied the impact of taking a month off booze, and he has found some very interesting results. Most people felt a sense of achievement after the month, which is maybe not such a surprise. Four in five saved money, and more than half had better sleep and more energy. And just under half (49%) also lost weight.

What’s most interesting is that in August, six months after their dry month, three in four were drinking less than they used to, and 4% had stayed completely sober for the whole time! And similar changes were seen not only in the people who completed the dry month, but also in those who started but didn’t finish. It seems therefore that just making that commitment to stop for a bit and trying will have an impact.

Journalist Peter Oborne completed a dry January in 2013, and wrote three columns about it. They tell an interesting story. In the first column, written on 30th December, he says he is “not really that heavy a drinker”, and tells stories of people who have drunk much more than him. He even mentions that Hitler never drank (and that therefore we should all drink in order not to become fascist dictators?). But he clearly felt worried enough about his drinking to decide to stop for a bit, even though he is not looking forward to it.

The second column, dated 20th January, starts by noting how well he is sleeping, and how refreshed he feels each morning, and how his back pain has disappeared. But then follows a long list of complaints: how he struggles to write like before, how meeting friends is miserable, how he’s started smoking (to support not drinking!), and wondering if the month will ever end.

The surprise ending to this story comes in the final column, on 3 February. Our writer still says the month was miserable, and confesses to a few more lapses. But he mentions more positives too: better health and complexion, more sleep, clarity of mind, more work done, loss of weight. And then the realisation: “At the end of it all, I’ve had no choice but to admit that I’m an addict. It’s not that I need a drink to get through an average day. I need several. And this slavish dependency has got to end.” He is now determined to cut down – but not to quit completely, as that would be “terrible”.

At Club Soda we will never say that one size fits all – we feel strongly that everyone must be free to set their own goals, whether to cut down, stop for a bit, quit, or any combination of the three. Peter Oborne’s story is a good illustration of the benefits of this thinking: he would not have been ready to quit, or even to cut down. But after stopping for a bit, he has changed his mind – and his drinking.

So, what do we take away from this? Most importantly, that stopping for a bit will often lead to cutting down in the longer term as well. And that it brings significant changes for almost everyone, even if you don’t achieve your initial goal of a full month without drinking.

And what if you fall off the wagon? Lapse? Slip? Accidentally have a glass of vino? The first thing to remember is to not get upset. Think about the situation that led to the drink. What triggered it? Could you have avoided it? The second thing is to forgive yourself and move on. Decide what to do next. If you’re really determined, you could start from zero again, and still try to go a whole month without a drink, starting from today. Or you can just carry on until the 1st of February as planned. Either way, you will have made a significant change to your drinking. Even Peter Oborne confessed to slipping five times during his month, and still made it to the end. His reasoning: he didn’t drink on 26 out of the 31 days of January!

Some writings, 1998-2004

The following “bibliography” should be a complete listing of all my published academic writings from 1998 to 2004. I’m posting it here both for the world to see, and as a handy place where I can find it myself if I ever need it again! I’ve added brief comments for some of the items; sometimes just to remind myself what the papers were all about. I’ve also included links to everything that is available online (most of the old working papers now only via Outliers vs. nonlinearity in time series econometrics is the main theme here, and there are also several papers on long memory in the form of fractional integration. My non-academic writings, including a rock gig review at Rumba, to follow some other day perhaps!

Outliers in nonlinear time series econometrics. Annales Universitatis Turkuensis, Series B, Number 243. University of Turku

This is my economics PhD dissertation, which contains an introduction and four articles: three published ones (in Communications in Statistics, Finnish Economic Papers and Applied Financial Economics), and an unpublished one analysing the impact of outliers on ARFIMA model estimation, with a simple robust two-stage estimation method.

Peer-reviewed journal articles
The effects of outliers on two nonlinearity tests. Communications in Statistics – Simulation and Computation, vol 29, pp 897-918 (2000)

A simulation study, showing how even a single outlier in a time series of 500 observations can seriously distort some commonly used tests for nonlinearity (ARCH and bilinearity tests here). Previous work had only considered more frequent outliers – this paper shows that the number of outliers can be very small, and the adverse impact still significant.
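To give a flavour of this kind of experiment, here is a minimal sketch. It is not the code from the paper: it assumes Gaussian white noise, a single additive outlier of size 10 placed mid-sample, and a one-lag Engle-type ARCH LM statistic (sample size times the R-squared of a regression of squared observations on their own lags). The function name and all settings are illustrative only.

```python
import numpy as np


def arch_lm_stat(x, lags=1):
    """Engle-type ARCH LM statistic: T * R^2 from regressing the squared,
    demeaned series on an intercept and its own lagged values."""
    e2 = (x - x.mean()) ** 2
    y = e2[lags:]
    # Design matrix: intercept plus `lags` columns of lagged squared values.
    lagged = [e2[lags - k: len(e2) - k] for k in range(1, lags + 1)]
    X = np.column_stack([np.ones(len(y))] + lagged)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centred = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centred @ centred)
    return len(y) * r2


rng = np.random.default_rng(0)
clean = rng.standard_normal(500)   # 500 observations with no ARCH effects
contaminated = clean.copy()
contaminated[250] += 10.0          # one additive outlier (assumed size)

stat_clean = arch_lm_stat(clean)
stat_outlier = arch_lm_stat(contaminated)
print(f"LM statistic, clean series:     {stat_clean:.3f}")
print(f"LM statistic, with one outlier: {stat_outlier:.3f}")
```

Under the null of no ARCH, the one-lag statistic is compared with a chi-squared distribution with one degree of freedom. Repeating the comparison over many simulated series, rather than a single draw as here, is what turns this into a proper simulation study.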

Outliers in eleven Finnish macroeconomic time series. Finnish Economic Papers, vol 14, pp 14-32 (2001)

Evaluating the impact of outliers on macroeconomic time series analysis. Conclusion: outliers can have a significant impact, and their treatment should always be carefully considered. I’m afraid I have yet to come to a completely satisfying conclusion about the best way of handling outliers in empirical work.

Outliers and predictability in monthly stock market returns. The Finnish Journal of Business Economics, vol 4/2002, pp 369-380 (2002)

Do outliers influence whether stock markets are predictable using simple time series forecasting methods? With mixed results.

Long memory and outliers in stock market returns. Applied Financial Economics, vol 13, pp 495-502 (2003)

First, a simulation study showing that the presence of outliers will bias time series (fractional integration) long memory estimates towards zero. An empirical example then shows that long memory is detected in stock market data more often if outliers are first taken into account.

Unemployment persistence of different labour force groups in Finland. Applied Economics Letters, vol 10, pp 455-458 (2003)

Fractional integration long memory models are used to estimate a measure of unemployment persistence for different labour force groups. The results show that unemployment is less persistent for females and young people than for males and the entire labour force.

Long Memory in a Small Stock Market. Economics Bulletin, vol 7, pp 1-13 (2003)

An empirical assessment of the presence of long memory in Finnish stock market data. Depending on the testing method used, statistically significant long memory is detected in 24% to 67% of the series, which is considerably more than what is usually found in data of this kind. This article is based on a working paper with some additional results (see below).

Genetic algorithms for outlier detection and variable selection in linear regression models. Soft Computing, vol 8, pp 527-533 (2004)

Possibly my best idea, and also the most cited thing I’ve published. Proposes a new method for simultaneous outlier detection and variable selection, which overcomes a number of problems in this kind of statistical analysis. I’ve also got an application of this method for economic growth data, which I’ll try to polish and share here soon.

Research reports and working papers
Outliers in time series: A review. Research Reports No. 76, University of Turku, Department of Economics (1998)

My statistics Master’s dissertation. A review of statistics and econometrics research on outliers: their impact, detection, treatment, and modelling.

A nonlinear moving average test as a robust test for ARCH. Research Reports No. 81, University of Turku, Department of Economics (1999)

An idea I had – would using a test for one kind of nonlinearity work in detecting another kind, which may often be difficult to detect? Especially when outliers are involved? The answer: not really…

Small sample properties of a joint ARCH-bilinearity test. Research Reports No. 84, University of Turku, Department of Economics (1999)

Another idea – if you create a simultaneous Lagrange multiplier test for two different types of nonlinearity, how would that compare to the individual tests? The answer: about the same…

Aittokallio, T., O. Nevalainen, J. Tolvi, K. Lertola & E. Uusipaikka: Computation of restricted maximum-penalized-likelihood estimates in hidden Markov models. Turku Centre for Computer Science, Technical Report No. 380 (2000)

My main contribution here was to propose a specific kind of hidden Markov model (HMM) for modelling financial data series. The estimated HMM had two components to model the majority of observations: one with low, one with high volatility, to mimic “normal” and turbulent periods. Additional HMM components were then added to model outliers, or very extreme observations. Sadly this was never published anywhere.

Nonlinear model selection in the presence of outliers. Research Reports No. 90, University of Turku, Department of Economics (2001)

Playing around with model selection and outliers using information criteria, with limited success. But this work led to the later genetic algorithm paper in Soft Computing.

Suomalaisten makrotaloudellisten aikasarjojen stationaarisuus ja pitkän muistin ominaisuudet. Research Reports No. 95, University of Turku, Department of Economics (2002) [Stationarity and long memory properties of Finnish macroeconomic time series]

Showing that once you take outliers and level shifts into account, there is very clear evidence for the presence of long memory in macroeconomic data. I can’t remember why I wrote this one in Finnish, as the results could have been of interest outside of Finland as well. And this paper also does not seem to be available anywhere online any more?

Long memory in the Finnish stock market. Research Reports No. 103, University of Turku, Department of Economics (2002)

The Economics Bulletin article above is based on this working paper, which has additional results for volatility data, and results of estimated ARFIMA-FIGARCH models as well.

Book reviews and short notes
Vielä yksikköjuurista ja työttömyysaikasarjojen tilastollisesta luonteesta. Kansantaloudellinen aikakauskirja, vol 1/1999, pp 159-163 (1999) [A further note on unit roots and the statistical properties of unemployment time series, the Finnish Journal of Economics]

Rationaalisista odotuksista. Sosiologia, vol 2/2000, pp 145-146 (2000) [On rational expectations, Sosiologia – the Journal of Westermarck Society]

Miten olla hyvä taloustieteilijä? Kansantaloudellinen aikakauskirja, vol 2/2001, pp 339-341 (2001) [How to be a good economist? A book review of McCloskey, D. N.: How to be human – though an economist, the Finnish Journal of Economics]

Poikkeavat havainnot epälineaarisessa aikasarjaekonometriassa. Lectio praecursoria. Kansantaloudellinen aikakauskirja, vol 1/2002 [Outliers in nonlinear time series econometrics. Doctoral lecture, the Finnish Journal of Economics]

My introductory lecture at my PhD viva – a brief summary of my dissertation, aimed at the general public. I used my father as a guinea pig to test whether he would get it. (He did!)

Book review of Dhrymes, P. J.: Mathematics for Econometrics (3rd ed.). Journal of the Royal Statistical Society: Series D (The Statistician), vol 51, pp 411-412 (2002)

Book review of Ghysels, E., Swanson, N. R. and Watson, M. W. (eds.): Essays in econometrics: The collected papers of Clive W. J. Granger. Journal of the Royal Statistical Society: Series D (The Statistician), vol 52, pp 113-114 (2003)

Book review of Tsay, R. S.: Analysis of financial time series. Journal of the Royal Statistical Society: Series D (The Statistician), vol 52, pp 128-129 (2003)

Book review of Zivot, E. and Wang, J.: Modeling Financial Time Series with S-Plus. Journal of the Royal Statistical Society: Series D (The Statistician), vol 52, p 705 (2003)

Big data and small

We live in exciting times for sure. “Big data” (enormous databases and methods of analysing them) is creating all kinds of new knowledge. So I’m not saying that it’s all hype, and I did for example enjoy reading Kenneth Cukier and Viktor Mayer-Schönberger’s book Big Data.

But there sure is a lot of hype around as well. One particular meme I’m not so keen about is the claim that we now live in a whole new “N = all” world, where statistics is no longer needed, since we can just check from the data exactly how many x are y (e.g. people who live in London and bought something online last month, or something else that in the past we would have had to estimate from a sample to find out). Yes, there is a lot of information like this that is now easily available, and the big data advocates have many cool anecdotes to tell. And Google probably knows more about us than we do ourselves.

One obvious situation where old-fashioned statistical inference will be needed for some time still is medical research. Say you’re developing a new drug. You will need to do your phase 1, 2, and 3 trials just as before, and convince people at each stage that it’s safe to carry on. Unless you can somehow feed your new prototype drug to everyone in the world, record the outcomes in your data lake, and do your data mining? And there are surely many other situations like it, outside of academia as well. One of my previous jobs was on bank stress testing, which requires econometric modelling using very limited data sets and, yes, plenty of statistical inference.
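For anyone who has forgotten what the old-fashioned approach looks like when “N = all” is not on offer, here is a minimal sketch: estimating a share from a random sample and attaching a 95% confidence interval using the normal approximation. The population size, the true share of 30% and the sample size of 1,000 are all made-up assumptions for illustration, as is the function name.

```python
import math
import random


def proportion_ci(sample, z=1.96):
    """Sample proportion with a normal-approximation confidence interval.

    `sample` is a sequence of booleans; z=1.96 gives roughly 95% coverage.
    """
    n = len(sample)
    p_hat = sum(sample) / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


random.seed(1)
# Hypothetical population: each person has the trait with probability 0.3.
population = [random.random() < 0.3 for _ in range(100_000)]
# We only get to observe a simple random sample of 1,000 people.
sample = random.sample(population, 1_000)

p_hat, lower, upper = proportion_ci(sample)
print(f"Estimated share: {p_hat:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

With a sample of 1,000 the interval is at most about six percentage points wide, which is the kind of statement you still have to make whenever you cannot simply count everyone.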

I suppose in terms of the hype cycle, we are still in the initial peak phase of great expectations. And eventually all of these new methods will find their place in the great toolbox of data analytics. Right next to the boring old regression models, and slightly less old and never boring decision trees and neural networks.