Figure 1: example of a permutation test distribution. The red vertical line is our observed data test statistic. Here, 98.2% of our permutation distribution is below our red line, indicating a p-value of 0.018. Image by author.

A recent paper published by researchers at Stanford extends the permutation testing framework to time series data, an area where permutation tests are often invalid. The method is very mathy and brand new, so there's little support and no Python/R libraries. However, it's pretty efficient and can be implemented at scale. In this post we will discuss the basics of permutation tests and briefly outline the time series method.

Permutation tests are non-parametric tests that rely solely on the assumption of exchangeability. To get a p-value, we randomly sample (without replacement) possible permutations of our variable of interest. The p-value is the proportion of samples that have a test statistic larger than that of our observed data. To account for the lack of exchangeability in time series data, we divide our test statistic by an estimate of the standard error, thereby converting our test statistic to a t-statistic. This "studentization" process allows us to run autocorrelation tests on non-exchangeable data.

But how does permutation testing actually work? Let's slow down a bit and really understand permutation tests…

Permutation Tests 101

Permutation tests are very simple, but surprisingly powerful. The purpose of a permutation test is to estimate the population distribution, the distribution where our observations came from. From there, we can determine how rare our observed values are relative to the population.

In figure 2, we see a graphical representation of a permutation test. There are 5 observations, represented by each row, and two columns of interest, Risk and Deaths.

Figure 2: framework for a permutation test. Image by author.

First, we develop many permutations of our variable of interest, labeled P1, P2, …, P120. At the end of this step, we'll have a large number of theoretical draws from our population. Those draws are then combined to estimate the population distribution. Note that we will never see a duplicate permutation: permutation tests sample an array of all possible permutations without replacement.

Using the median as our test statistic (although it can be any statistic derived from our data), we'd follow the steps below:

1. Calculate the median of the observed data (the Deaths column).
2. For each permutation, calculate the median.
3. Determine the proportion of permutation medians that are more extreme than our observed median.

If you like code, here's some pythonic pseudocode for calculating the p-value:

    permutations = permute(data, P=120)
    observed_median = median(data)
    p_medians = [median(p) for p in permutations]
    p_val = sum(p > observed_median for p in p_medians) / len(p_medians)

Now that we understand the method, let's determine when to use it. Permutation tests are appealing because they are non-parametric and only require the assumption of exchangeability. Parametric methods, by contrast, assume an underlying distribution.

Mathematically, we define exchangeability as shown in figure 5. The sequence on the left is the original data and the sequence on the right is a permutation of the original data. If they have the same distribution, they're exchangeable. It's that simple.

Figure 5: definition of exchangeability.

Permutation Tests for Time Series Dependence

Often, time series data are not exchangeable: prior values can help determine future values. If you have strong evidence that your time series data are truly exchangeable, then you can run a regular permutation test, but if not, you'll need to leverage the method below.

The paper discusses a way to run an autocorrelation test on a time series dataset, but the concepts can generalize to other tests. Also note that autocorrelation is simply the correlation between a value and its prior value, for all values.

We're only going to touch on the method briefly because it's really complex:

1. Estimate the standard error of our autocorrelation value.
2. Convert our statistic to a t-statistic by dividing by the standard error.

Section 2 of the paper outlines calculating the standard error, specifically equations 2.9 and 2.12. We are unfortunately not going to cover them here because they're really long.
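For readers who want something concrete, here is a minimal sketch of a regular permutation test for lag-1 autocorrelation. It is not the studentized method from the paper (the standard error equations are not reproduced here), so it is only appropriate when the data can be treated as exchangeable. The function names `lag1_autocorr` and `permutation_pvalue` are made up for illustration, and the demo series is simulated. Two simplifications to keep in mind: the sketch draws random shuffles independently, so duplicates are possible, rather than sampling distinct permutations without replacement; and it uses lag-1 autocorrelation because that statistic depends on the order of the values.

```python
import random


def lag1_autocorr(values):
    """Sample lag-1 autocorrelation: how strongly each value tracks its prior value."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("lag-1 autocorrelation is undefined for a constant series")
    numer = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    return numer / denom


def permutation_pvalue(values, n_permutations=2000, seed=0):
    """One-sided p-value: share of shuffled series whose statistic exceeds the observed one.

    Assumption: shuffles are drawn independently, so repeats can occur.
    """
    rng = random.Random(seed)
    observed = lag1_autocorr(values)
    shuffled = list(values)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # reorder the same values in place
        if lag1_autocorr(shuffled) > observed:
            exceed += 1
    return observed, exceed / n_permutations


# Simulated demo data (an assumption, not from the post): each value is
# 0.8 times the prior value plus noise, so the series is strongly dependent.
demo_rng = random.Random(42)
series = [0.0]
for _ in range(199):
    series.append(0.8 * series[-1] + demo_rng.gauss(0, 1))

observed, p_value = permutation_pvalue(series)
print(f"lag-1 autocorrelation = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

On a strongly dependent series like this one, very few shuffles should come close to the observed autocorrelation, so the p-value comes out small. The catch is exactly the one the post describes: without exchangeability this plain version is not valid in general, which is what the studentized statistic is meant to fix.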