Analyzing Time-Series of Individual Data
If you haven’t been living in a cave for the last couple of years, you have definitely noticed an increase in data collection, data mining and visualization: HRV tracking, jump output tracking, estimating 1RMs from velocity-load data, game statistics, performance analysis, various testing statistics, body weight, Run Keeper, Run Tracker, and the whole quantified-self movement. Collecting data is getting easier and easier, sometimes without one even being aware of it. What is still lagging behind is making sense of all that data. For example, you might have been collecting HRV or resting HR every morning for the last couple of months, or, even better, training load using session RPE and duration. How do you analyze this? How do you visualize this data? How do you make sense of it? How much does a certain statistic need to change before it reflects a worthwhile change with a real-world effect?
Unfortunately, the statistics we learned in school don’t help us much. There is too much reliance on the Fisherian approach (using the p value) and too much use of statistical significance, which doesn’t mean much to a coach. Even worse, lay people with no formal education in inferential statistics misinterpret the term statistical significance as real-world significance, when it actually means a low chance [p<0.05, p<0.01, p<0.001 etc.] of obtaining a score at least that extreme if the null hypothesis is true. If this sounds confusing, it is, and unfortunately, according to Geoff Cumming (author of the excellent book Understanding the New Statistics), even researchers don’t get these concepts right.
If you are interested in these subjects you should definitely read everything ever written by Will Hopkins. As a quick start, here are two presentations one needs to read to understand the important concepts of magnitude-based statistics, SWC (Smallest Worthwhile Change) and TE (Typical Error):
How to Interpret Changes in an Athletic Performance Test [Very Important]
Client Assessment and Other New Uses of Reliability [Very Important]
A couple of great researchers, like Martin Buchheit (@mart1buch), are pushing the envelope in using magnitude-based statistics (SWC, TE and chances), but as far as I know a lot of journal editors are still reluctant to let go of the p value.
The Dance of p values
Anyway, as coaches we are not interested in group averages and making inferences about populations (at least we shouldn’t be, unless we are thinking about a research career). We are interested in individual responses, and unfortunately we have had a lot of flawed thinking over the years, falling for the flaw of averages and assuming that all individuals will respond in a similar and predictable way.
Luckily, a lot more studies are now leaning toward showing inter-individual variability, quantifying it and visualizing it, rather than worrying only about group averages and whether the treatment had a statistically significant effect.
What we need to do is start thinking in terms of individuals and their unique reactions. All training is a single-subject experiment, even if you work in team sports (where it is a bit harder to implement, but still very important).
Taisuke Kinugasa (@umekinu) is one of the few researchers focusing on single-case research design and the analysis of single-subject time series. If you are wondering what single-subject time series are, they are all the data you collect on yourself over time (quantified self), like HRV.
Speaking of HRV, recent papers coauthored by Martin Buchheit and other great researchers have brought to light some very applicable tips that coaches can use on a daily basis. Part of that applicability comes from using SWC and TE (progressive statistics, magnitude-based approach) and single-case design (in some papers).
Evaluating Training Adaptation with Heart Rate Measures: A Methodological Comparison. Int J Sports Physiol Perform. 2013
Heart rate variability in elite triathletes, is variation in variability the key to effective training? A case comparison. Eur J Appl Physiol. 2012 Nov;112(11):3729-41
What they showed is that using either weekly averages or rolling 7-day averages “appears to be superior method for evaluating positive adaption to training compared with assessing its value on a single isolated day”.
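To make the rolling 7-day average concrete, here is a minimal Python sketch. It is not taken from the papers above: the function name `rolling_mean`, the trailing-window choice, and the sample readings are all assumptions made purely for illustration.

```python
from statistics import mean


def rolling_mean(values, window=7):
    """Trailing rolling mean.

    Returns a list the same length as `values`; entries are None until
    a full window of readings is available.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out


# Hypothetical morning readings (illustrative numbers, not real data)
daily = [80, 82, 79, 85, 81, 78, 83, 90, 76, 84]
rolling = rolling_mean(daily)
```

A single odd day (like the 90 followed by the 76 here) moves the rolling value far less than it moves the daily score, which is the whole point of smoothing before interpreting.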
I have written about rolling averages and Z-scores in evaluating wellness data HERE, so I won’t go into too much detail.
Another interesting approach was to estimate a BASELINE for each athlete and the SWC of that baseline. The researchers did this by taking the first two weeks of the intervention as the baseline. This baseline and its SWC (usually 0.3 to 0.5 of the intra-individual SD) are then used to give ‘context’ to the 7-day rolling averages.
Sometimes this approach is used in sports with a certain period of the year taken as the baseline. Another option is to make the baseline a ‘rolling’ average as well, possibly over a longer time frame than the 7-day rolling average. Again, there are pros and cons to each approach, and analyzing time series is more an art than a science. I am not sure there is a single right way to go about it.
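The baseline idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `baseline_band` is made up, the baseline is the first 14 readings, the SD is the sample SD of those readings, and the 0.5 multiplier is just one value from the 0.3 to 0.5 range mentioned above.

```python
from statistics import mean, stdev


def baseline_band(values, baseline_days=14, swc_factor=0.5):
    """Return (baseline mean, SWC, (lower, upper)) from the first readings.

    Assumption: SWC = swc_factor * intra-individual SD of the baseline
    period, using the sample SD.
    """
    base = values[:baseline_days]
    if len(base) < 2:
        raise ValueError("need at least two baseline readings")
    base_mean = mean(base)
    swc = swc_factor * stdev(base)
    return base_mean, swc, (base_mean - swc, base_mean + swc)


# Hypothetical two weeks of readings (illustrative numbers only)
first_two_weeks = [10, 12] * 7
base_mean, swc, (lower, upper) = baseline_band(first_two_weeks)
```

A 7-day rolling average that sits between `lower` and `upper` would be read as a trivial change from baseline; one that leaves the band is a candidate for a worthwhile change.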
The idea is to get the baseline and SWC, and then to use rolling averages and TE (it is beyond me how this is calculated, other than by using the rolling 7-day SD) to get the chances of beneficial/trivial/harmful changes (see the links above from Will Hopkins).
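One possible way to turn a change, an SWC and a TE into chances is sketched below. This is a simplification and not necessarily how the papers or Will Hopkins’ spreadsheets do it: it assumes the true change is normally distributed around the observed change with a spread equal to whatever TE you pass in (for example the rolling 7-day SD, which is itself an assumption), and the function name `change_chances` is hypothetical.

```python
from statistics import NormalDist


def change_chances(observed_change, swc, typical_error):
    """Chances that the true change is below -SWC, within +/-SWC, or above +SWC.

    Assumption: true change ~ Normal(observed_change, typical_error).
    Whether 'higher' means beneficial or harmful depends on the variable.
    """
    if typical_error <= 0:
        raise ValueError("typical_error must be positive")
    dist = NormalDist(mu=observed_change, sigma=typical_error)
    lower = dist.cdf(-swc)           # chance of a substantial decrease
    higher = 1.0 - dist.cdf(swc)     # chance of a substantial increase
    trivial = 1.0 - lower - higher   # chance the change is within +/-SWC
    return {"lower": lower, "trivial": trivial, "higher": higher}


# Hypothetical numbers: rolling average is 5 units above baseline,
# SWC is 1 unit, TE stand-in is 1 unit
chances = change_chances(observed_change=5, swc=1, typical_error=1)
```

With an observed change this far outside the SWC relative to the noise, almost all of the probability ends up in the "higher" bucket; when the observed change is close to zero, most of it lands in "trivial".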
The simplest approach might be to use the percent change between the last score and the rolling average (or a longer baseline). Unfortunately, this approach doesn’t take individual variability into consideration (see more HERE).
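A minimal sketch of the percent-change calculation follows. The function name `percent_change` and the two athletes’ numbers are hypothetical and only there to show the limitation.

```python
def percent_change(last_score, reference):
    """Percent difference of the last score from a reference value
    (e.g. the rolling average or a longer baseline)."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return 100.0 * (last_score - reference) / reference


# Two hypothetical athletes with the same rolling average and the same last score
steady_athlete = percent_change(77, 70)    # day-to-day scores usually vary by ~2
variable_athlete = percent_change(77, 70)  # day-to-day scores usually vary by ~10
```

Both come out at the same percent change, yet a jump of 7 units is unusual for the first athlete and an ordinary day for the second. Percent change alone cannot tell them apart.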
Another approach that takes this into account is the daily Z-Score, which is the number of rolling 7-day SDs by which the last score differs from the rolling average [Z-Score = (Last_Score – Rolling_AVG) / Rolling_SD]. I believe that this is the approach behind the iThlete HRV coding system. If you are outside your normal variability, you get a flag.
What we want to achieve with all these approaches is ‘flags’: what is a normal score and what is abnormal. Again, this is more an art than a science, but I believe the right analysis is a must; one just needs to put it in the right context.
Long story short, I have created an Excel workbook that analyzes time series using some of the approaches above. I want to thank Andrew Flatt (@andrew_flatt) for providing me with his HRV data and Andrew Murray (@cudgie) for giving me the idea of using effect sizes to compare the baseline and the rolling average (same as the daily Z-Score).
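The Z-Score formula above can be sketched in Python as follows. One detail the formula leaves open is whether the rolling window includes the last score; this sketch assumes it does not (the window is the 7 readings before the last one) and uses the sample SD. The function name `daily_z_score` and the readings are hypothetical, and this is not a description of how any particular app computes its flags.

```python
from statistics import mean, stdev


def daily_z_score(values, window=7):
    """Z-Score = (Last_Score - Rolling_AVG) / Rolling_SD.

    Assumption: rolling average and SD come from the `window` readings
    immediately before the last score, not including it.
    """
    if len(values) < window + 1:
        raise ValueError("need window + 1 readings")
    history = values[-(window + 1):-1]
    rolling_sd = stdev(history)
    if rolling_sd == 0:
        raise ValueError("rolling SD is zero; Z-Score is undefined")
    return (values[-1] - mean(history)) / rolling_sd


# Hypothetical readings: seven ordinary days, then an unusually high one
readings = [10, 12, 10, 12, 10, 12, 11, 14]
z = daily_z_score(readings)
```

Here the last score sits three rolling SDs above the rolling average, so it would be flagged under almost any threshold; where exactly to put that threshold is part of the ‘art’.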
Here is the video of me demonstrating the software. Also, below you can find and download the Excel workbook.