MRW and Ecology – Part III: Autocorrelation
Ideally, when studying ecological aspects of an individual’s whereabouts based on (for example) a series of GPS fixes, N should not only be large. The series of fixes should also be non-autocorrelated, to ensure statistically independent samples of space use. Since these two goals are difficult to fulfill simultaneously (the latter tends to undermine the former), two workarounds are common. Either the autocorrelation issue is recognized but ignored, or space use is analyzed by path-analytical methods rather than the more classical use-availability approach. Both workarounds have drawbacks. In this post I show for the first time a surprisingly simple method to compensate for the over-sampling effect that leads to autocorrelated series of fixes.
Again, as in Part II of this series, I focus on how to improve realism and reduce the statistical error term when studying ecological aspects of habitat selection, given that data compliance with the MRW framework has been verified (see, for example, this post regarding red deer) or can reasonably be assumed. Hence, the individual’s characteristic scale of space use (CSSU) is the primary response variable we are looking for. In Part II the proper proxy for local intensity of space use was described as the inverse of CSSU (actually, the inverse of the parameter c).
However, by default the basic version of the Home range ghost equation, I = c√N = cN^0.5, where I is the total area of fix-embedding virtual grid boxes at the CSSU scale, assumes a data set of N serially non-autocorrelated fixes. This is difficult to achieve, given the simultaneous goal of having a large N available for the analysis. Splitting the data into subsets of N from several habitat classes makes the autocorrelation issue even more challenging. Thus, over-sampling of the animal’s movement seems unavoidable. In the following example I illustrate how such an over-sampling effect on local and temporal CSSU estimates may be accounted for.
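To make the quantities in the Home range ghost equation concrete, here is a minimal Python sketch of how I and c could be computed from a set of fixes. The function names (incidence_area, ghost_c) and the parameter cell_size are hypothetical names introduced for illustration, and the sketch assumes that the grid resolution has already been set to the CSSU scale by other means.

```python
import numpy as np

def incidence_area(xy, cell_size):
    """Total area of the virtual grid cells that contain at least one fix.

    xy        : array-like of shape (N, 2) with fix coordinates
    cell_size : side length of a square grid cell (assumed already tuned)
    """
    xy = np.asarray(xy, dtype=float)
    # Map every fix to the integer index of the grid cell it falls in.
    cells = np.floor(xy / cell_size).astype(np.int64)
    # Count distinct occupied cells, then convert the count to area.
    n_occupied = len(np.unique(cells, axis=0))
    return n_occupied * cell_size ** 2

def ghost_c(area, n_fixes):
    """Solve I = c * sqrt(N) for c, given one (N, I) pair."""
    return area / np.sqrt(n_fixes)

# Toy usage with four made-up fixes, two of which share a cell.
fixes = [[0.1, 0.1], [0.2, 0.3], [1.5, 0.2], [2.5, 2.5]]
area = incidence_area(fixes, cell_size=1.0)
c = ghost_c(area, n_fixes=len(fixes))
```

A single (N, I) pair gives only a crude estimate of c; in practice one would use a range of N, and the estimate is only meaningful if the fixes are serially non-autocorrelated, as the equation assumes.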
As a reference scenario, consider the default MRW condition of non-autocorrelated fix sampling of an animal moving in a homogeneous environment. Non-autocorrelation is achieved by sampling at larger intervals than the average interval between successive return events. In the illustration above the spatial scatter of 10,000 fixes (grey dots) shows a relatively stationary space use when comparing N=100 fixes from the early, middle and late parts of the sampling period (blue, red and yellow dots, respectively). However, return events that took place during the last part of the series have a more spread-out set of historic locations to return to, and this explains why the 100 yellow fixes cover a somewhat larger range than a sample of similar size from the early part of the series.
When sampling a series of fixes from the actual path for a given time period*, two methods may be applied: continuous sampling, which takes a contiguous section of the series of varying length N; and frequency-based sampling, where N fixes are spread uniformly over the entire time interval of the total series (higher sampling frequency implies larger N). With reference to the Home range ghost formula above, I(N) complies with a non-asymptotic power law with exponent z≈0.5 (log-log slope close to 0.5). Grid resolution (pixel size) has been optimized in accordance with a previously described method. The well-behaved pattern in this scenario is due to the lack of strong autocorrelation under both sampling regimes. In other words, the animal’s path has not been over-sampled. Still, the difference between continuous sampling (open triangles) and frequency-based sampling (open squares) shows that the former is more prone to short-term random effects, in this example seen as the “plateau” of I(N) in the range N=2^3 to 2^7. The characteristic average scale, log(c), is given by the I(N) intercept with the y-axis, where log2(N)=0.
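The two sampling schemes and the log-log fit can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code behind the figures: the function names are hypothetical, the continuous sample is taken from the start of the series by default, and the power law is fitted by ordinary least squares on log2-transformed values.

```python
import numpy as np

def continuous_sample(series, n, start=0):
    """A contiguous section of n successive fixes from the series."""
    series = np.asarray(series)
    return series[start:start + n]

def frequency_sample(series, n):
    """n fixes spread uniformly over the whole series (first to last fix)."""
    series = np.asarray(series)
    idx = np.linspace(0, len(series) - 1, n).round().astype(int)
    return series[idx]

def fit_power_law(n_values, areas):
    """Least-squares fit of log2(I) = log2(c) + z * log2(N).

    Returns (z, c): the log-log slope and the back-transformed intercept,
    i.e. the value of I where log2(N) = 0.
    """
    log_n = np.log2(np.asarray(n_values, dtype=float))
    log_i = np.log2(np.asarray(areas, dtype=float))
    z, log_c = np.polyfit(log_n, log_i, 1)
    return z, 2.0 ** log_c
```

For a path stored as an array of shape (N_total, 2), either sampler returns the selected rows. Computing I for each subsample over a doubling sequence of N (1, 2, 4, ...) gives the points to pass to fit_power_law.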
Observe the set of black circles, which represent the average log[I(N)] from the two sampling methods covering the same sampling period at N_total.
Next, consider an example with strongly autocorrelated fixes. The ecological condition will be described below as “semi-punctuated site fidelity”**. Again, the colour codes in the spatial scatter of fixes (above) describe subsets of 100 fixes from the early, middle and late part of the total sampling period.
What is important under this condition is the behaviour of log[I(N)] under the two sampling methods, continuous and frequency. As expected with autocorrelated series, sub-sampling the total series by the frequency method – relative to continuous sampling – will tend to show a larger I for a given N over the middle range of log(N). Similarly, continuous sampling tends to show a smaller area for a given N, relative to expectation from the Home range ghost equation.
However, when averaging the respective log(N,I) points from the two sampling schemes, the compliance with I ∝ √N is restored! Thus, CSSU may be properly estimated also from over-sampled paths. Despite the substantial under-detection of true space use based on N autocorrelated fixes, the averaging trick above recovers the true I(N), and hence also the CSSU, as predicted by the statistical-mechanical theory of MRW.
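In computational terms, averaging the two log[I(N)] values at a given N is the same as taking the geometric mean of the two areas. A minimal Python sketch follows, with hypothetical function names and with the exponent fixed at z = 0.5 as an assumption when solving for c. The inputs are the areas obtained at matching N under the two schemes.

```python
import numpy as np

def averaged_log_area(area_continuous, area_frequency):
    """Mean of log2(I) from the two schemes at matching N.

    Equivalent to log2 of the geometric mean sqrt(I_cont * I_freq).
    """
    log_cont = np.log2(np.asarray(area_continuous, dtype=float))
    log_freq = np.log2(np.asarray(area_frequency, dtype=float))
    return 0.5 * (log_cont + log_freq)

def c_from_averaged(n_values, avg_log_area, z=0.5):
    """Estimate c from the averaged points, with the slope held at z.

    With log2(I) = log2(c) + z * log2(N), the least-squares intercept
    for a fixed slope is the mean of the residuals log2(I) - z * log2(N).
    """
    log_n = np.log2(np.asarray(n_values, dtype=float))
    log_c = np.mean(np.asarray(avg_log_area) - z * log_n)
    return 2.0 ** log_c
```

Holding z at 0.5 is a choice made here for simplicity; fitting the slope freely to the averaged points is an equally reasonable check of whether compliance with I ∝ √N has in fact been restored.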
Why does the average of continuous and frequency-sampled estimates represent the true I(N)? Consider the vertical distance between the respective pairs of log(N,I) points to represent non-observed “ghost area” resulting from over-sampling. The stronger the over-sampling, the larger the ghost area. Had the sampling regime produced a non-autocorrelated series, the ghost area would have been small (as in the first example above), owing to the weak degree of over-sampling. Stronger autocorrelation leads to a larger ghost area. Why is the ghost area split 50/50 between frequency sampling and continuous sampling in log-log terms? This theoretical question requires a deeper statistical-mechanical explanation, which is still being developed. However, the answer is linked to the 50%/50% inward/outward expansion property of MRW (search Archive).
NOTES
*) If the total sampling period is not kept constant (same time period for N_total), CSSU will be influenced by the fact that late return events target a more spread-out scatter of previous locations. Despite this, CSSU will tend to contract somewhat with total observation period (temporal extent). This transient effect will be explored in an upcoming blog post.
**) An extreme form of temporal space use heterogeneity is achieved by “punctuated site fidelity”. For example, for every 1/50th part of the total series length the animal erases its affinity to previous locations and begins developing affinity to newer locations only. Thus, in the third section of such a path, the return events do not target the initial two parts of the series. The first location in each of the 50 successive parts (time sections) is chosen randomly within the total arena, hence a “punctuated” kind of site fidelity. In simplified model terms, this scenario could illustrate GPS sampling of an animal that occasionally changes its space use in accordance with a changing food distribution during the season. It could also illustrate an intrinsic predator avoidance strategy, whereby fitness may improve through occasional abrupt changes of patch use; under specific conditions the benefit may outweigh the cost of occasionally giving up utilization of familiar patches. The scenario could also illustrate patch deterioration with respect to a critical resource: energy profit in utilized patches may deteriorate owing to foraging, and thus trigger a “reset” of overall patch use, in conceptual compliance with a variant of the marginal value theorem.
A less dramatic and more realistic variant of temporal heterogeneity, “partially punctuated site affinity”, is simulated by keeping – for example – the last 10% or 2% of the path locations of the foregoing part of the path as potential return targets on equal footing with the successively emerging locations in the present part. This condition leads to a tendency for a “drifting home range” (Doncaster and Macdonald 1991), with some degree of locking towards previous patch use, similar to the condition that was numerically explored in Gautestad and Mysterud (2006).
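The difference between the fully and the partially punctuated variants comes down to which earlier locations a return event is allowed to target. The Python sketch below shows only that bookkeeping, not the MRW simulation itself. The function name eligible_targets and its parameters are hypothetical, and the series length is assumed to be divisible by the number of sections.

```python
def eligible_targets(t, series_length, n_sections=50, carry_fraction=0.0):
    """Indices of earlier fixes that a return event at time step t may target.

    carry_fraction = 0.0  -> punctuated site fidelity: only locations from
                             the present section are eligible.
    carry_fraction = 0.1  -> partially punctuated: the last 10% of the
                             foregoing section remains eligible as well.
    Assumes series_length is divisible by n_sections.
    """
    section_len = series_length // n_sections
    # First index of the section that time step t belongs to.
    section_start = (t // section_len) * section_len
    # Number of locations carried over from the end of the foregoing section.
    carry = int(round(carry_fraction * section_len))
    first = max(0, section_start - carry)
    return range(first, t)

# Example: 1000 fixes in 50 sections of 20; step 45 lies in the third section.
strict = list(eligible_targets(45, 1000))                      # indices 40..44
partial = list(eligible_targets(45, 1000, carry_fraction=0.1)) # indices 38..44
```

In a simulation, a return event at step t would then pick its target from this index range, giving carried-over locations the same weight as those from the present section.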
REFERENCES
Doncaster, C. P. and D. W. Macdonald. 1991. Drifting territoriality in the red fox Vulpes vulpes. Journal of Animal Ecology 60:423-439.
Gautestad, A. O. and I. Mysterud. 2006. Complex animal distribution and abundance from memory-dependent kinetics. Ecological Complexity 3:44-55.