MRW and Ecology – Part VII: Testing Habitat Familiarity

Suppose you have a series of GPS fixes, and you wonder whether the individual was utilizing familiar space during your observation period – or started building site familiarity around the time you started collecting data. Simulation studies of Multi-scaled Random Walk (MRW) show how you may cast light on this important ecological aspect of space use.

First, you should of course test for compliance with the MRW assumptions: (a) site fidelity with no “distance penalty” on return events, (b) scale-free space use over the spatial range covered by your data, and (c) uniform space utilization, on average, over this scale range. A single test in the MRW Simulator, the A(N) regression, casts light on all these aspects. First, you optimize the pixel resolution for the analysis (estimating the characteristic scale of space use, CSSU). Next, if you find “Home range ghost” compliance, i.e., incidence I expands proportionally with the square root of the sample size of fixes, your data support (a) spatial memory utilization with no distance penalty, due to sub-diffusive and non-asymptotic area expansion, (b) scale-free space use, due to linearity of the log[I(N)] scatter plot, and (c) equal inter-scale weight of space use, due to slope ≈ 0.5.
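As an illustration of this regression step, here is a minimal Python sketch; the function name and the synthetic (N, I) values are my own stand-ins, since in practice the table would come from the MRW Simulator's A(N) output:

```python
import numpy as np

def home_range_ghost_fit(N, I):
    """Fit log10(I) = log10(c) + z*log10(N); return slope z and intercept c.

    N: sample sizes; I: incidence (non-empty grid cells) for each N.
    Home range ghost compliance predicts z close to 0.5.
    """
    z, log_c = np.polyfit(np.log10(N), np.log10(I), 1)
    return z, 10 ** log_c

# Synthetic example: incidence following I = c*sqrt(N) with c = 2
N = np.array([10, 50, 100, 500, 1000, 5000, 11000])
I = 2.0 * np.sqrt(N)
z, c = home_range_ghost_fit(N, I)
# For these idealized data, z is 0.5 and c is 2
```

With real fixes, z clearly above or below 0.5 at the optimized pixel scale would argue against one or more of the three assumptions above.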

Supposing your data confirmed MRW, how do you test for time-dependent strength of habitat familiarity? Consider the following simulation example, mimicking space use during a season under constant environmental conditions.

The red dots show log(N, I) for various sample sizes up to the total set of 11,000 fixes. Each dot represents the average I for the respective N from the two methods, continuous sampling and frequency sampling (counteracting the autocorrelation effect; see a previous post). However, analyzing the first 1,000 fixes separately (black dots) consistently revealed a looser space use in terms of aggregated incidence at a given N, relative to the total season. The next 1,000 fixes, however, were compliant with the total series with respect to both slope and y-intercept (CSSU) (green dots).

The reason for the discrepancy in space use during the initial period of fix sampling* was, in the present scenario, the actual simulation condition: site familiarity was set to develop “from scratch” simultaneously with the onset of fix collection. I define strength of site familiarity as proportional to the total path length from which the model animal collects a previous location to return to**. At the start of the sampling period, the underlying path is short in comparison to the total path traversed during the season, and – crucially – return steps targeted previous locations from the actual simulation period only, not locations prior to this start time. In other words, the animal was assumed to have settled down in the area at the point in time when the simulation commenced.

To conclude, if your data show CSSU and slope of similar magnitude in the early and later phases of data collection, you sampled an individual with a well-established memory map of its environment during the entire observation period. The implicit assumption behind this conclusion is of course that the environmental conditions were constant during the entire sampling period, including the initial phase. Using empirical rather than synthetic data means that additional tests would have to be performed to cast light on this aspect.

NOTE

*) The presentation above reflects the pixel resolution that was optimized for the total series. The first 1,000 fixes showed a more coarse-grained space use, reflected in a 50% larger CSSU scale (not shown: the optimal pixel size was 50% larger for this part of the series), despite constant movement speed and return rate over the entire simulation period. In this scenario a larger CSSU [a coarser optimal pixel for the A(N) analysis] signals a less mature habitat utilization in the home range’s early phase. The CSSU was temporarily inflated during the build-up of site familiarity, but – somewhat paradoxically – the accumulated number of fix-embedding grid cells (incidence) for a given N at this scale was smaller. These two effects, reflecting the degree of habitat familiarity during home range establishment, should be considered transient.

**) Two definitions should be specified:

  • I define strength of site familiarity as proportional to the total path length from which the model animal collects a previous location to return to.
  • I define strength of site fidelity as proportional to the return frequency.

Both definitions rest on the assumptions of no distance penalty on return targets and no time penalty on returns; i.e., infinite spatio-temporal memory horizon relative to the actual sampling period.

The MRW Simulator: Importing Your Own GPS Data

You have a large database of GPS fixes, and you wonder whether your animals have utilized their habitat in accordance with standard theory of mechanistic movement (the null hypothesis) or in compliance with the MRW theory (the alternative hypothesis). The MRW Simulator is tailor-made for this kind of test. If MRW is verified you may proceed with various analyses of behavioural ecology under the alternative statistical-mechanical theory. The initial test procedure is simple: (1) import your data, (2) prepare for a test of model compliance by applying one or more built-in algorithms, and (3) import the generated data tables into third-party packages (R, Excel, etc.) for statistical testing.

You can import data to the MRW Simulator by preparing a two-column text file, using comma or TAB as delimiter between the two coordinate values for successive locations.

By default you should use the file name import.txt, but other names are also allowed (given the correct data structure). Place the file in the data folder (…/mov) and choose the menu “File | Import data from txt or csv”.
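As a sketch of the expected file structure, such a two-column file can be produced with a few lines of Python; the coordinates below are made up for illustration:

```python
# Hypothetical GPS fixes; real data would come from your own records.
fixes = [(1234.5, 887.2), (1240.1, 890.0), (1190.7, 901.3)]

# One line per location: x, then y, TAB-delimited (a comma works as well).
with open("import.txt", "w") as f:
    for x, y in fixes:
        f.write(f"{x}\t{y}\n")
```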

You are asked to define the file name for the imported data. By default, the name is set to “seed1.txt”. During import, the original series is centred on coordinate (0,0), the middle of the arena window. The arena size for the analysis is automatically adjusted to twice the space needed to display the set of imported fixes.

After import, you set a couple of check boxes on the MRW Simulator’s user interface in accordance with the User guide before clicking the run button (the MRW Simulator reformulates your imported data into its own format and saves the result in the text file levy1.txt). In particular, setting the simulation series length to zero and choosing “use seed1.txt” as the first part of the simulated series ensures that only your own data are reformatted. Within a fraction of a second the procedure exits without adding simulated data to the series, and you are ready to perform various analytical tasks on the levy1.txt file (see menu “Analyze”).

The procedure “A(N) regression” is typically applied to analyze space use at the home range scale. It is a convenient choice to test for MRW compliance of your data.

You are asked which of the Levy*.txt files to analyze for fix-filling area as a function of sample size N (number of fixes in the Levy*.txt file). Next, the analysis is executed in accordance with the scales set in “Arena extent for analysis”, “Arena grain for analysis” and “Pixel (intra-grain resolution)” in the MRW Simulator’s user interface.

In this procedure, set extent = grain. The pixel setting is a ratio: the relative resolution grain/pixel. For example, setting pixel = 10 performs the analysis at a virtual grid scale 1/10 of the arena scale; i.e., 10×10 grid cells. See the User guide for details.

The progress is shown below the arena window. The algorithm counts incidence over a range of sample sizes N at the given pixel resolution; first by sequential (continuous) sampling up to Ntotal, and then by frequency (uniform) sampling over the total series. Search my blog or read my book for these concepts.

The result is saved in a text file containing a table of incidence (non-empty grid cells at the given pixel resolution) as a function of sample size N under the two sampling conditions. These data may then be imported into, for example, Excel for graphical presentation and statistical analysis, such as a regression.
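The incidence count itself is simple to reproduce outside the Simulator. The sketch below (my own function names, with uniform random fixes standing in for real data) counts non-empty grid cells at a given pixel resolution for increasing N under continuous sampling; frequency sampling would instead subsample uniformly over the whole series:

```python
import numpy as np

def incidence(fixes, pixel, extent):
    """Count non-empty grid cells (incidence) for a set of fixes.

    fixes: (n, 2) array of (x, y); pixel: grid divisions per side
    (e.g. 10 gives a 10x10 grid); extent: arena side length, with the
    arena centred on (0, 0) as after import.
    """
    cell = extent / pixel
    idx = np.floor((fixes + extent / 2) / cell).astype(int)
    idx = np.clip(idx, 0, pixel - 1)          # guard against edge fixes
    return len({tuple(ij) for ij in idx})

def incidence_vs_N(fixes, pixel, extent, sample_sizes):
    """Continuous sampling: incidence of the first N fixes, for each N."""
    return [incidence(fixes[:N], pixel, extent) for N in sample_sizes]

rng = np.random.default_rng(1)
fixes = rng.uniform(-50, 50, size=(1000, 2))  # stand-in for levy*.txt data
I_of_N = incidence_vs_N(fixes, pixel=10, extent=100.0,
                        sample_sizes=[10, 100, 1000])
```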

If you find a discrepancy between the scatter from the two sampling methods (you normally do!), your data are probably serially autocorrelated. To remove the autocorrelation effect, take the average incidence for respective magnitudes of N, as explained in a previous post. Conveniently, the MRW Simulator does this task for you (you find the averaging table below the tables for continuous and frequency sampling). This averaging procedure also adjusts for a “drifting home range” scenario, which also produces autocorrelation.

Does the result support MRW? First, you must verify the presence of a characteristic scale of space use (CSSU), which is a property of scale-free movement under the influence of spatial memory, given the “parallel processing” postulate.

To test for CSSU you should experiment with various pixel resolutions and see if the log(N, incidence) pattern converges to a slope ≈ 0.5 at a given scale. If so, CSSU ≈ (pixel scale)² = c.

If you don’t find reasonably good compliance with linearity of log[I(N)] = log(c) + 0.5*log(N), or the slope exceeds 0.5, try a coarser pixel resolution. If the slope is smaller than 0.5, try a finer pixel resolution.
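This trial-and-error search over pixel resolutions can be expressed as a small script. The incidence curves below are synthetic stand-ins; in practice each curve would come from re-running the A(N) analysis at the given pixel setting:

```python
import numpy as np

def fitted_slope(N, I):
    """Slope z of the log-log regression of incidence I on sample size N."""
    z, _ = np.polyfit(np.log10(N), np.log10(I), 1)
    return z

def closest_to_half(results):
    """results: {pixel: (N_array, I_array)}. Return the pixel setting whose
    fitted slope is closest to the Home range ghost prediction z = 0.5."""
    slopes = {p: fitted_slope(N, I) for p, (N, I) in results.items()}
    best = min(slopes, key=lambda p: abs(slopes[p] - 0.5))
    return best, slopes

# Made-up incidence curves at three pixel resolutions:
N = np.array([10, 100, 1000, 10000])
results = {
    5:  (N, 3.0 * N ** 0.35),   # too coarse a grid: slope below 0.5
    10: (N, 2.0 * N ** 0.50),   # near CSSU: slope ~ 0.5
    20: (N, 1.5 * N ** 0.75),   # too fine a grid: slope drifts towards 1
}
best_pixel, slopes = closest_to_half(results)
# best_pixel is 10 for these synthetic curves
```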

If this test of convergence to log-linearity with slope ≈ 0.5 fails, you have probably supported either one of the null models, i.e., Brownian motion-like or Lévy-like space use void of spatial memory influence (slope ≈ 1, which is quite insensitive to change of pixel scale), or the classic paradigm: home range movement under influence of a constraining border zone [I(N) showing an area asymptote rather than a power-law expansion with exponent close to 0.5].


The MRW Simulator 2.0 will now be made available as a free add-on tool for all buyers of my book. If you purchase it through my shopping cart at www.thescalingcube.com, you will get the program and its user guide bundled with the book. Existing book owners: contact me at arild@gautestad.com and I’ll fix you a personal download link – free of charge. You may purchase by invoice – see top of this page!


The MRW Simulator – Finally Available!

Back in 1997 I started programming the foundation for a personal simulation environment for Multi-scaled Random Walk, the MRW Simulator. Through countless updates over these 20 years the program has gradually matured into a version that is finally ready for limited distribution to peers in the field of animal space use research.

The MRW Simulator is a Windows©-compliant tool to generate various classes of animal movement (self-produced data series) or to import existing data series. The generated or imported data – consisting of a sequence of (x,y) coordinates – may then be subjected to various kinds of statistical protocols through simple menu clicks. The generated text files are then typically exported for detailed analyses and presentation of results in other applications, like the R package or Excel©.

While R is based on an interpreted language, the MRW Simulator is a fully compiled program. Thus, movement paths up to 20 million steps long may be simulated within minutes of execution time, rather than hours or days. A multi-scaled analysis of data over a substantial scale range is almost prohibitive in an interpreted system due to the algorithm’s long execution time. In the MRW Simulator such analyses are performed in a fraction of this time. Thus, R and the MRW Simulator may supplement each other: R is strong on statistics and algorithmic freedom; the MRW Simulator is strong on time-effective execution of a small set of basic but typically time-consuming algorithms.

The opening screen contains menus (1), a window where the simulated or imported set of fixes is displayed (2) and various command buttons, check boxes and information fields (3–14).

To get your first experience with the system, try out the most basic settings for a simulation. First, choose among the classes of movement: Levy walk/MRW, Correlated random walk, and Composite random walk (a superposition of two correlated random walks) (3). The difference between LW and MRW is explained below.

For your first test, choose Levy walk / MRW (3), with the default settings for fractal dimension (D=1) and maximum displacement length between successive steps (truncation = 1,000,000 length units). D=1 simulates the condition where the animal on average utilizes its environment with similar scale-free weight at each intermediate scale, from unit step length to maximum step length (setting 1<D≤2 skews space use towards finer-scale space use at the expense of coarser scales, again in average terms).

In a column of text fields (4) you may define conditions like series length, properties of the simulated path, size of the arena, and grid resolution for the subsequent analysis. For example, the difference between Levy walk and MRW is given by setting a return frequency >0 for MRW (implying targeted return events to previous locations at the chosen average frequency). For this first run, just keep the default values.

Later you will learn how to additionally modify the conditions by including a pre-defined series of coordinates (in a file called seed*.txt, where * denotes an incremental number) (5). At this stage, just keep the default settings.

By default the simulation runs in a homogeneous environment. The set of “Habitat heterogeneity” fields (6) allows you to define the corners of a rectangle where the model animal behaves in a more “fine-grained” manner by reducing its average movement speed. Other ecological aspects may also be defined, like a method to account for temporal and local resource exhaustion. As a start, just keep the defaults.

Now, click the “Single-series” command button (7). You should see a number of fixes appearing as dots in the arena window.

The number of fixes reflects the ratio of the total series length to the observation interval on this series; i.e., “Number of fixes” (Norig = 1,000,000) multiplied by an average “Observation frequency” (p = 0.001). This leads to an observed series length – a path sample – of ca. 1,000 fixes, which are displayed in the observation window.

Before moving on to your first data analysis, observe that the simulation’s default settings are defined by “schemes”, which can be pre-loaded from a dropdown menu (8). You may also run a number of replicate simulations in an automated sequence (9). The arena may be copied to the clipboard (10) for subsequent pasting into other applications like a Word document, an Excel sheet, etc.

The “Data path” field (11) displays the folder where the system saves and retrieves data. By default, the data resides in a subfolder, “\mov”, under the location of the MRW simulator’s EXE file. This location is set during program setup.

The field “Fractal resolution range” (12) defines the scale range over which a subsequent analysis of the scatter of fixes – selected from the Analysis menu – will be performed by the so-called box counting method.

The field “A(N)” (13) shows the progress of another analysis, total area (incidence) as a function of sample size, N.

The counter (14) is automatically incremented each time you click the “Single-series” button (7). TIP: To repeat (and overwrite) an existing series, edit the counter number (14) to one decrement below the actual series. For example, to re-execute data series number 5, edit the counter field to “4” before clicking the button (7). To re-execute series 1, edit the field to “-1” (the number zero is reserved as the initial setting number).

The data file containing “observed” fixes resides in the \mov folder (see above), with name “levy*.txt” (* = 1, 2, 3, …). It contains three columns of data: x-coordinate, y-coordinate, and inter-step distance.




In the next blog post I’ll show some of the menu procedures of the MRW Simulator, including how to import your own GPS space use series for on-the-fly analysis.

MRW and Ecology – Part VI: The Statistical Property of Return Events

Animals that combine scale-free space use with targeted returns to previous locations generate a self-organized kind of home range. In short, the home range becomes an emergent property of such self-reinforcing revisits. Obviously, any space use pattern from complex processes outside the domain of Markov (mechanistic) theory needs to be analyzed using methods that are coherent with this kind of behaviour. Below I further exemplify the versatility of the MRW approach in adjusting for serial autocorrelation (see Part III). I also show the quite surprising model property that the sub-set of inter-fix displacement lengths for return events seems to have a similar statistical distribution to the over-all pattern of exploratory step lengths. This additional emergent property of space use may lead to methods to test a wide range of behaviour-ecological hypotheses, for example to what extent an animal calculates an energy cost with respect to distance to potential target locations for returns.

In ecological research it is traditionally considered logical that an animal considers a return to a distant familiar location to be less preferred than revisiting closer locations. On the other hand, by default (a priori) the MRW model does not include such a distance penalty on long-distance returns. Recently, the realism of this model premise has gained empirical support from studies on bison and toads (Merkle et al. 2014, 2017; Marchand et al. 2017). In the MRW model’s standard version, a given return step targets any previous location with equal probability, except for the additive effect of the number of previous visits to a given site, which increases the statistical probability of future revisits (self-reinforcing site fidelity). The implicit assumption is that the added energetic cost from long-distance returns either is negligible relative to other parts of the energy budget, or that the fitness value of keeping in touch with familiar locations regardless of current distance far exceeds the energy consideration. While this property regards a homogeneous environment, it is trivial to adjust it to a heterogeneous scenario without loss of the general principle. In this post I present more details on the return step property of MRW from a theoretical angle, as a starting point for testing the model’s default condition on real data.

First, consider that the robustness of the MRW-based method to estimate an individual’s characteristic scale of space use (CSSU) within a given time and space extent is key to understanding the energy aspect of return events as outlined above. The property of return events imposes a characteristic scale, i.e., CSSU, on space use, despite the scale-free nature of exploratory steps. For a given period, CSSU is a combined function of average movement speed and average return frequency. In a previous post I proposed how CSSU may be estimated even in autocorrelated (“over-sampled”) data series of location fixes. In this post I present a pilot analysis which strengthens this approach.

Consider the two simulated Home range ghost results to the right: incidence, I, as a function of number of fixes, N. The first set of fixes (circles) regards a weakly autocorrelated series of fixes from return rate 1:10 and fix sampling at 1:100, while the second series (squares) resulted from a strongly autocorrelated path sampling (return rate 1:100 and sampling at 1:10). As shown in Part III of this set of blog posts, by performing the “averaging trick” on log(I, N) from frequency and continuous sampling (open symbols for the respective sets), the average log-log slope remains close to z = 0.5 (area expanding proportionally with the square root of sample size) even for the strongly autocorrelated series. The slight deviation from z = 0.5 in the two series should be considered normal variability to be expected from one simulated series to the next (averaging over large sets of series would bring z closer to 0.5).

Critically, the present result also shows compliance with the expected change of the characteristic scale of space use (CSSU, represented by the parameter c in the Home range ghost formula I = cN^0.5) as a function of the ratio between the frequency of return events and exploratory moves (assuming constant average movement speed). In other words, observation frequency, which represents a sub-set of all displacements along a path (sampling of fixes), should not influence the CSSU estimate despite influencing the degree of autocorrelation. According to MRW theory, fewer returns during constant average movement speed lead to a larger CSSU*. In the analysis of the present two series, a ten times smaller return rate led to an optimized unit pixel size (I ≡ 1) of magnitude √10 ≈ 3.2 times larger than for the weakly autocorrelated series with higher return frequency.

In the Figure above, the two CSSU scales have both been rescaled to c = 1 [log(c) = 0], but the respective series’ unit scale (I = 1) is de facto correctly found to be very different in absolute terms. In the present examples, CSSU was estimated to c1 = 125² area units for the high-frequency return scenario and c2 = 400² area units for the second series with fewer returns (and a stronger degree of autocorrelation).

To conclude, after optimizing pixel size in the respective series by analyzing I(N) over a range of pixel resolutions, as previously described in my book and other posts, this preliminary analysis verifies a strong coherence between return step frequency and the magnitude of CSSU in accordance with the theoretical parameter prediction, despite a strong difference in the degree of serial autocorrelation in the sample of relocations. In other words, the CSSU estimate is quite resilient to the researcher’s choice of fix sampling scheme.

However, another aspect of the return step component may turn out to be valuable for testing the opposing energy hypotheses with respect to distance penalty, as outlined above.

Quite surprisingly, I must admit, even considering the implicit “no distance penalty” model design, the tail part of the step length distribution of returns is quite similar to the tail of observed step lengths (fixes) sampled from the total series of steps!

The example series above with the weakest degree of autocorrelation (circles in the top Figure) shows a similar functional form for the over-all distribution of binned step lengths [log(L); red circles below] and return distances (open symbols)**.

As expected from the weakly autocorrelated series, the fit of the exploratory steps to the power law with Lévy exponent β = 2 shows a clear “hump” in the extreme part of the log(L) distribution of fixes, due to influence from intermediate return events.
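For readers who want to reproduce this kind of tail comparison on their own step lengths, here is a log-binning sketch in Python; the inverse-transform sampler is just a stand-in for simulator output with β = 2, and the function names are my own:

```python
import numpy as np

def log_binned_density(lengths, n_bins=15):
    """Log-binned density of step lengths, for tail comparison on
    log-log axes. Returns geometric bin mid-points and densities."""
    lengths = np.asarray(lengths)
    edges = np.logspace(np.log10(lengths.min()),
                        np.log10(lengths.max()), n_bins + 1)
    counts, _ = np.histogram(lengths, bins=edges)
    widths = np.diff(edges)
    mids = np.sqrt(edges[:-1] * edges[1:])    # geometric mid-points
    density = counts / (widths * len(lengths))
    return mids, density

# Power-law lengths with Levy exponent beta = 2: pdf ~ L^(-2) for L > 1,
# drawn by inverse-transform sampling (truncated at 1/1e-4 length units).
rng = np.random.default_rng(7)
lengths = 1.0 / rng.uniform(1e-4, 1.0, size=20000)
mids, dens = log_binned_density(lengths)
# On log-log axes, (mids, dens) should fall close to a line of slope -2.
```

Applying the same function to the sub-set of return distances and to the over-all fixes allows a direct visual comparison of the two tails.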

For the more strongly autocorrelated series with N = 10,000 fixes from a total series of 100,000 steps and a lower return frequency of 1:100 (Figure below), we see – as theoretically expected – a more subdued hump for the fixes, due to less influence from return events***. The hump would be even less pronounced if the fix sampling frequency had been even larger (Gautestad and Mysterud 2013; in particular Figure A2 in the Supplementary material).


Again the tail distribution of return lengths – where the total set of 1,000 events is shown as triangles – is similar to the over-all distribution of fixes (the 1,000 first and 1,000 last of the N = 10,000 fixes, shown as red and green circles). The median length for return steps is larger under this scenario (740 length units, versus 262) due to the ten times lower return frequency in relative terms. On the other hand, the median length for the actual set of fixes is strongly reduced as a consequence of the ten times larger fix sampling frequency.

To summarize, while the estimate of CSSU is quite resilient to fix sampling frequency, the (observed) median step length of fixes and the (unobserved) length of return steps are influenced by fix sampling rate and return rate, respectively. Despite independence between the median length of the observed series and the hidden return lengths, both aspects of movement show a similar distribution of lengths.

Finally, what if the return step targets had not been set a priori to be independent of distance; i.e., by invoking a distance penalty on return events? I have not yet tested this aspect in a modified MRW simulation model, but intuitively I predict the distribution of return steps to morph towards a negative exponential function rather than a power law, as in the exploratory kind of moves. As a consequence, the “hump” effect in the distribution of fixes should also be more subdued. Hence, by testing the difference in functional form between return steps and the step lengths of observed fixes, one may have a method to test empirically the energy hypothesis outlined above.

The challenge, of course, is to develop a method to distinguish between exploratory moves and return events in empirical data. In simulation data it is simple to filter out the returns; in true space use data it is necessary to distinguish returns from path crossing by chance. More on this methodology in an upcoming post.

NOTES

*) Thus, the ratio of returns to exploratory moves has a similar influence on CSSU as a change in average movement speed, where speed is expressed as the average staying time in a given grid cell. In Gautestad and Mysterud 2010, Eq. 4, we defined the expected length of step x, Lx, as a function of a scaling parameter for movement speed, δ, and the fractal dimension of the path, d:

Lx = (δ[1 − Rnd])^(−1/d)         (Eq. 4)

where Rnd is a random number 0 ≤ Rnd < 1 and δ is a scaling parameter. In some sense δ may be interpreted as a parameter for expected staying time in a given patch, since larger δ implies smaller Lx and thus increased local fix contagion.
Gautestad and Mysterud 2010, p2744

Thus, by defining the space use’s fractal dimension D as D ≡ d, we have the relationship between CSSU’s Home range ghost parameter, c, and movement speed:

c ∝ 1/√δ  |  D = 1          (Eq. 5)
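Eq. 4 is straightforward to sample numerically. A sketch, assuming the same conventions as above (δ as the speed/staying-time scaling parameter, d as the path's fractal dimension):

```python
import numpy as np

def step_length(delta, d, rng):
    """Draw one step length from Eq. 4: Lx = (delta * (1 - Rnd))**(-1/d).

    rng.random() gives Rnd in [0, 1), so 1 - Rnd is in (0, 1] and the
    expression is always finite and positive.
    """
    return (delta * (1.0 - rng.random())) ** (-1.0 / d)

rng = np.random.default_rng(42)
d = 1.0                     # scale-free weighting across scales (D = 1)
small_delta, large_delta = 0.01, 1.0

short_stay = [step_length(small_delta, d, rng) for _ in range(5000)]
long_stay = [step_length(large_delta, d, rng) for _ in range(5000)]
# Larger delta implies shorter expected steps (stronger local fix
# contagion), consistent with interpreting delta as staying time.
```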

**) Due to a return step frequency of 1:10 and an actual fix sampling frequency of 1:100, the total set of return events exceeds the fix sample by a factor of 10. Thus, I have compared the distribution of return lengths from the early part of the simulated path (open squares) with return lengths towards the end of the path (open triangles), keeping both samples at the same size as the set of observed fixes. Red circles in the Figure above represent 10,000 fixes from a total series of 1 million steps. When studying the first and the last part of the 100,000 hidden return steps specifically, their distribution looks indistinguishable from the series of “observed” fixes. Triangles show the result for the first 10,000 return events, and squares show the result for the last 10,000 returns during the total of 1 million steps.

***) In this example, where observation frequency exceeds the intrinsic return frequency by a factor of 10, the first and last parts of the set of fixes (red and green circles, respectively) were used for comparison with the total set of return steps (open triangles).

REFERENCES

Gautestad, A. O., and I. Mysterud. 2010. Spatial memory, habitat auto-facilitation and the emergence of fractal home range patterns. Ecological Modelling 221:2741-2750.

Gautestad, A. O., and A. Mysterud. 2013. The Lévy flight foraging hypothesis: forgetting about memory may lead to false verification of Brownian motion. Movement Ecology 1:1-18.

Marchand, P, M. Boenke and D. M. Green. 2017. A stochastic movement model reproduces patterns of site fidelity and long-distance dispersal in a population of Fowler’s toads (Anaxyrus fowleri). Ecological Modelling 360:63–69.

Merkle, J. A., D. Fortin and J. M. Morales. 2014. A memory-based foraging tactic reveals an adaptive mechanism for restricted space use. Ecology Letters 17:924–931.

Merkle, J. A., J. R. Potts and D. Fortin. 2017. Energy benefits and emergent space use patterns of an empirically parameterized model of memory-based patch selection. Oikos 126:185–195.

Statistical-mechanical Details on Space Use Intensity

While stronger intensity of space use in the standard (Markovian/mechanistic) biophysical model framework is equal to the proxy variable fix density, density = N/area, the complex system analogue is 1/c. This alternative expression for intensity is derived from the Home range ghost formula I = cN^0.5 (= c√N). Below I illustrate the biophysical difference between the two intensity concepts by a simple Figure and some basic mathematics of the respective processes. The extended statistical mechanics of complex space use underscores the importance of estimating and applying a realistic spatial resolution, close to the magnitude of CSSU, when analyzing individual habitat utilization within various habitat classes. The traditional density variable for space use intensity will invoke a large noise term and even spurious results in ecological use/availability analyses of home range data.

A spatial dispersion of a small and a large sample of fixes is shown in the upper and lower row, respectively. Two resolutions (spatial scales) are shown; the spatial extent (large squares) and a virtual grid scale (dotted lines, shown in the upper right square only). For interpretation of low and high intensity of complex space use, 1/c, see the main text.

In statistical-mechanical terms, one of the main discrepancies between the traditional space use models (mechanistic modelling) and complex movement (MRW) regards the representation of locally varying intensity of space use.

Classical space use intensity may be calculated from a single scale, and trivially extrapolated to a coarser resolution up to the full area extent.

Why is this “freedom to zoom” feasible and mathematically allowed? Consider an example where the system extent is represented by the demarcation of a specific habitat type within a home range, simplified by a square under four conditions in the Figure to the right. Due to assumed compliance with standard statistical mechanics under classical space use analysis, we are specifically assuming finite system variance within the given spatial extent,

Var(X1) + Var(X2) + … + Var(Xn) = σ²

where [X] is the set of spatial elements from sectioning the system’s extent into sub-sets 1, 2, 3, …, n; and sub-sets into sub-sub-sets to find the respective sub-set variances. Thus, Var(Xi) is the i‘th element’s second-moment variability (variance). For example, σ² could be the intrinsic variance of the inter-cell number of fixes in the virtual grid cells in the upper right scenario of the Figure above (sub-sub grid cells not shown).

The variance also changes proportionally with density. In other words, the variance is stationary upon scaling and can thus be assumed to change proportionally with grid scale and density. This implies compliance with the central limit theorem. Even if the intra-cell variance is not constant between grid cells at a given resolution within the given extent, as is expected in a heterogeneous habitat where local density varies, the sum of the variances of these local parts is independent of this finer-scale variability between sub-components. Once again I underscore that this enormously simplifying system property regards scenarios under the standard statistical-mechanical framework!

On the other hand, the local variability of fix density from complex space use does not comply with the central limit theorem. Intensity of use needs to be calculated over a scale range – from “grain” to extent – rather than at any single scale, and the grain scale must be chosen with care.

Traditionally, space use may be quantified by the magnitude of “free space” (area/N) in a sample of N relocations (fixes) of an individual, owing to compliance with the central limit theorem, as explained above. On the other hand we have complex space use; i.e., scale-free movement under the influence of spatial memory and in compliance with the parallel processing postulate. Under this biophysical framework free space is expressed by the ratio area/√N rather than area/N, and is quantified by the characteristic scale of space use (CSSU). CSSU is a function of average movement speed and average return rate to previous locations. The system complexity behind CSSU implies that the sum of the system parts’ standard deviations – rather than their variances – is stationary upon re-scaling; i.e.,

s.d.(X₁) + s.d.(X₂) + … + s.d.(Xₙ) = √σ²

In other words, by default the spatial statistics follow a Cauchy distribution with scale parameter γ=1 rather than the classical Gaussian distribution. CSSU is proportional to the parameter c in the Home range ghost formula I = c√N, where I is the number of fix-embedding virtual grid cells at spatial scale c ≈ CSSU.
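In practice c and the exponent can be estimated by ordinary least squares on the log-transformed relation log I = log c + z·log N. The sketch below uses synthetic incidence values (c_true and the noise level are illustrative assumptions, not output from the MRW Simulator) and recovers z ≈ 0.5 and c:

```python
import math
import random

random.seed(42)

# Synthetic incidence values obeying I = c * sqrt(N), with mild noise
c_true = 4.0
N_values = [2 ** k for k in range(4, 11)]  # N = 16, 32, ..., 1024 fixes
I_values = [c_true * math.sqrt(N) * random.uniform(0.95, 1.05) for N in N_values]

# Ordinary least squares on log I = log c + z * log N
x = [math.log(N) for N in N_values]
y = [math.log(I) for I in I_values]
mx, my = sum(x) / len(x), sum(y) / len(y)
z = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
c_est = math.exp(my - z * mx)

print(f"z = {z:.2f}, c = {c_est:.2f}")  # z close to 0.5, c close to 4
```

A fitted slope close to 0.5 supports Home range ghost compliance; the intercept yields the c estimate and thereby the CSSU proxy.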

What if we “lose focus” by studying the system (applying the grain scale) at a coarser or finer resolution than the CSSU? In the illustration above it is assumed that the superimposed virtual grid in the upper right corner reflects a spatial resolution close to this system’s true CSSU. If the system’s CSSU had been higher (1/c implicitly lower, as in the upper left-hand scenario), applying the same observer-defined grid resolution as in the upper right scenario would show deviance from the Cauchy distribution. The Cauchy scale parameter and the Home range ghost exponent are both inflated*) due to this “out of focus” situation; i.e., γ >> 1 and 0.5 << z < 1.

In short, by superimposing a virtual grid at a scale << CSSU, we will observe I ≈ cN^z with z ≈ 1 rather than z ≈ 0.5. The parameter c, and thus the true CSSU, has then been erroneously estimated. The power exponent z → 1 as the grid scale is successively decreased by using cells smaller than the true CSSU scale. However, compliance with z = 0.5 may be regained under a “Low 1/c” scenario (upper left) by sufficiently increasing the grid scale relative to the cell sizes in the “large 1/c” scenario shown in the upper right example. We can then re-estimate CSSU by such scale-zooming towards a coarser resolution and find that z → 0.5 as the coarse-graining approaches the true CSSU. By comparing scenarios with low and high CSSU; i.e., high and low intensity of space use (1/CSSU), we can raise behavioural-ecological hypotheses about these differences. One obvious example regards the strength of intra-home range habitat selection, where intensity of space use is expressed by habitat-specific 1/c rather than by density of fixes.
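The incidence I at a given grain scale is simply the number of virtual grid cells containing at least one fix. A minimal sketch of this counting step, using a hypothetical uniform scatter of fixes (the zooming procedure then repeats the count over a range of cell sizes until the fitted exponent approaches z ≈ 0.5):

```python
import random

random.seed(7)

# A hypothetical scatter of 1,000 fixes in a 100 x 100 arena
fixes = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(1000)]

def incidence(fixes, cell):
    """Number of grid cells of side length `cell` holding at least one fix."""
    return len({(int(x // cell), int(y // cell)) for x, y in fixes})

# Coarse-graining the virtual grid reduces incidence
for cell in (1, 5, 25):
    print(cell, incidence(fixes, cell))
```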

On the other hand, starting with a too coarse grid cell scale to estimate CSSU will lead to 0 < z << 0.5. Observing the system at a scale substantially larger than the true CSSU scale means that I will increase extremely slowly, or not at all, with increasing N. In Cauchy terms, the scale parameter 0 < γ < 1. Hence, the chance of needing an extra grid cell to cover all fixes when increasing the sample size to N+1 is very small, but not negligible: occasional sallies of surprising magnitude happen! Surprising from the standard statistical-mechanical framework, but just part of the picture in a space use system that obeys parallel processing principles.

To summarize, while a “Gauss-compliant” (non-complex) kind of space use allows the average intensity of space use to be considered trivially constant upon zooming and linear rescaling within the system extent, “Cauchy-compliant” space use requires a search for the correct grain scale in order to estimate the system’s average CSSU within the given extent.

More details on the statistical-mechanical system description of complex space use are found in my book.

NOTE

*) Apparently but erroneously, the variability under a too fine-grained pixel resolution (grid cell scale), leading to z ≈ 1 and Cauchy scale parameter γ ≈ 2, may be interpreted as Gauss-compliant statistics. However, the Cauchy distribution does not have finite moments of any order. Thus, in strict terms, the reference to √σ² under the γ=1 scenario is not correct, since variance is a term under standard statistical mechanics; it represents a commonly applied approximation (Mandelbrot 1983, Schroeder 1991).

REFERENCES

Mandelbrot, B. B. 1983. The fractal geometry of nature. New York: W. H. Freeman and Company.

Schroeder, M. 1991. Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise. New York: W. H. Freeman and Company.

MRW and Ecology – Part III: Autocorrelation

Ideally, when studying ecological aspects of an individual’s whereabouts based on (for example) a series of GPS fixes, N should not only be large; the series of fixes should also be non-autocorrelated, to ensure statistically independent samples of space use. Since these two goals are difficult to fulfill simultaneously (the latter tends to undermine the former), two workarounds are common: either the autocorrelation issue is recognized but ignored, or space use is analyzed by path-analytical methods rather than the more classical use-availability approach. Both workarounds have drawbacks. In this post I show for the first time a surprisingly simple method to compensate for the oversampling effect that leads to autocorrelated series of fixes.

Again, as in Part II of this series, I focus on how to improve realism and reduce the statistical error term when studying ecological aspects of habitat selection, given that data compliance with the MRW framework has been verified (see, for example, this post regarding red deer) or can feasibly be assumed. Hence, the individual’s characteristic scale of space use (CSSU) is the primary response variable we are looking for. In Part II the proper proxy for local intensity of space use was described as the inverse of CSSU (actually, the inverse of the parameter c).

However, by default the basic version of the Home range ghost equation I = c√N, where I is the total area of fix-embedding virtual grid cells at the CSSU scale, assumes a data set of N serially non-autocorrelated fixes. This is difficult to achieve, due to the simultaneous goal of having a large N available for the analysis. Splitting the data into sub-sets of N from several habitat classes makes the autocorrelation issue even more challenging. Thus, over-sampling of the animal’s movement seems unavoidable. In the following example I illustrate how such an oversampling effect on local and temporal CSSU estimates may be accounted for.

As a reference scenario, consider the default MRW condition of non-autocorrelated fix sampling of an animal moving in a homogeneous environment. Non-autocorrelation is achieved by sampling at larger intervals than the average interval between successive return events. In the illustration above the spatial scatter of 10,000 fixes (grey dots) shows a relatively stationary space use when comparing N=100 fixes from early, middle and late part of the sampling period (blue, red and yellow dots, respectively). However, return events that took place during the last part of the series have a more spread-out set of historic locations to return to, and this explains why the 100 yellow fixes cover a somewhat larger range than a similar sample size from the series’ early part.

When sampling a series of fixes from the actual path for a given time period*, two methods may be applied: continuous sampling, containing a section of the series varying in length N; and frequency-based sampling, where N fixes are uniformly spread over the entire time interval for the total series (higher sampling frequency implies larger N). With reference to the Home range ghost formula above, I shows compliance with a non-asymptotic power law with exponent z ≈ 0.5 (log-log slope close to 0.5). Grid resolution (pixel size) has been optimized in accordance with the previously described method. The well-behaved pattern in this scenario is due to lack of strong autocorrelation under both sampling regimes. In other words, the animal’s path has not been over-sampled. Still, the difference between continuous sampling (open triangles) and frequency-based sampling (open squares) shows that the former is more prone to short-term random effects, in this example seen as the “plateau” of I(N) in the range N = 2³ to 2⁷. The characteristic average scale, log(c), is given by the I(N) intercept with the y-axis, where log2(N) = 0.
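The two sub-sampling schemes can be sketched as follows (illustrative helper functions, not the simulator’s actual implementation):

```python
def continuous_sample(series, N, start=0):
    """Continuous sampling: a contiguous section of N successive fixes."""
    return series[start:start + N]

def frequency_sample(series, N):
    """Frequency-based sampling: N fixes spread uniformly over the series."""
    step = len(series) / N
    return [series[int(i * step)] for i in range(N)]

series = list(range(1000))  # stand-in for 1,000 time-ordered fixes
print(continuous_sample(series, 4))  # [0, 1, 2, 3]
print(frequency_sample(series, 4))   # [0, 250, 500, 750]
```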

Observe the set of black circles, which represent the average log[I(N)] from the two sampling methods covering the same sampling period at Ntotal.

Next, consider an example with strongly autocorrelated fixes. The ecological condition will be described below as “semi-punctuated site fidelity”**. Again, the colour codes in the spatial scatter of fixes (above) describe subsets of 100 fixes from the early, middle and late part of the total sampling period.

What is important under this condition is the behaviour of log[I(N)] under the two sampling methods, continuous and frequency. As expected with autocorrelated series, sub-sampling the total series by the frequency method – relative to continuous sampling – will tend to show a larger I for a given N over the middle range of log(N). Similarly, continuous sampling tends to show a smaller area for a given N, relative to expectation from the Home range ghost equation.

However, when averaging the respective log(N, I) points, compliance with I ∝ √N is restored! Thus, CSSU may be properly estimated also from over-sampled paths. Despite the substantial under-detection of true space use based on N autocorrelated fixes, the statistical-mechanical theory of MRW in fact predicts the true I(N), and hence also the CSSU, through the averaging trick above.
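The averaging trick itself is simple arithmetic on the paired log(N, I) estimates. In the sketch below the numbers are purely illustrative: continuous sampling is assumed to under-estimate and frequency sampling to over-estimate log2(I) symmetrically, so their average restores the z = 0.5 slope:

```python
def averaged_log_I(logI_cont, logI_freq):
    """Average the paired log2(I) estimates from the two sampling methods."""
    return [(a + b) / 2 for a, b in zip(logI_cont, logI_freq)]

log2_N = [4, 6, 8]            # sample sizes N = 16, 64, 256
logI_cont = [1.6, 2.6, 3.6]   # continuous sampling: biased low (illustrative)
logI_freq = [2.4, 3.4, 4.4]   # frequency sampling: biased high (illustrative)

avg = averaged_log_I(logI_cont, logI_freq)
slope = (avg[-1] - avg[0]) / (log2_N[-1] - log2_N[0])
print(avg, slope)  # slope restored to z = 0.5
```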

Why does the average of the continuous and frequency-sampled estimates represent the true I(N)? Consider the vertical distance between the respective pairs of log(N, I) points to represent un-observed “ghost area” resulting from over-sampling: the stronger the over-sampling, the larger the ghost area. If the sampling regime had regarded non-autocorrelated series, the ghost area would have been small (as in the first example above), due to a weak degree of over-sampling. Stronger autocorrelation leads to a larger ghost area. Why does the ghost area split the difference between frequency sampling and continuous sampling 50/50 in log-log terms? This theoretical question requires a deeper statistical-mechanical explanation, which is still work in progress. However, the answer is linked to the 50%/50% inward/outward expansion property of MRW (see this post).

NOTES

*) If the total sampling period is not kept constant (same time period for Ntotal), CSSU will be influenced by the fact that late return events target a more spread-out scatter of previous locations. Despite this, CSSU will tend to contract somewhat with total observation period (temporal extent). This transient effect will be explored in an upcoming blog post.

**) An extreme form of temporal space use heterogeneity is achieved by “punctuated site fidelity”. For example, for every 1/50th part of the total series length the animal erases its affinity to previous locations and begins developing affinity to newer locations only. Thus, in the third section of such a path, return events do not target the initial two parts of the series. The first location in each of the 50 successive parts (time sections) is chosen randomly within the total arena; hence a “punctuated” kind of site fidelity. This scenario could, in model-simplistic terms, illustrate GPS sampling of an animal that occasionally changes its space use in accordance with changing food distribution during the season. It could also illustrate an intrinsic predator avoidance strategy, whereby fitness may improve by occasional abrupt changes of patch use; under specific conditions this may outweigh the cost of occasionally giving up utilization of familiar patches. The scenario could also illustrate patch deterioration with respect to a critical resource: energy profit in utilized patches may deteriorate owing to foraging, and thus trigger a “reset” of over-all patch use, in conceptual compliance with a variant of the marginal value theorem.

A less dramatic and more realistic variant of temporal heterogeneity, “partially punctuated site affinity”, is simulated by keeping, for example, the last 10% or 2% of the path locations of the foregoing part of the path as potential return targets, on equal footing with the successively emerging locations in the present part. This condition leads to a tendency for a “drifting home range” (Doncaster and Macdonald 1991), with some degree of locking towards previous patch use, similar to the condition that was numerically explored in Gautestad and Mysterud (2006).
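A toy version of such partially punctuated site affinity can be sketched as a memory-based walk where, at each punctuation event, only the most recent fraction of remembered locations is kept as potential return targets (all parameter values below are illustrative assumptions, not the settings used in the simulations above):

```python
import random

random.seed(3)

def punctuated_walk(steps=5000, return_prob=0.1,
                    punctuate_every=100, keep_frac=0.1):
    """Random walk with memory-based returns; at each punctuation event
    only the newest fraction `keep_frac` of remembered locations is kept."""
    pos = (0.0, 0.0)
    memory = [pos]   # remembered locations: potential return targets
    path = [pos]
    for t in range(1, steps):
        if t % punctuate_every == 0:
            keep = max(1, int(len(memory) * keep_frac))
            memory = memory[-keep:]              # partial "reset" of affinity
        if random.random() < return_prob:
            pos = random.choice(memory)          # return event
        else:
            x, y = pos
            pos = (x + random.gauss(0, 1), y + random.gauss(0, 1))
        memory.append(pos)
        path.append(pos)
    return path

path = punctuated_walk()
print(len(path))  # 5000
```

Setting keep_frac = 0 (with the max(1, …) guard removed) would approximate the fully punctuated scenario in the preceding note.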

REFERENCES

Gautestad, A. O. and I. Mysterud. 2006. Complex animal distribution and abundance from memory-dependent kinetics. Ecological Complexity 3:44-55.

Doncaster, C. P. and D. W. Macdonald. 1991. Drifting territoriality in the red fox Vulpes vulpes. Journal of Animal Ecology 60:423-439.