A Statistical-Mechanical Perspective on Site Fidelity – Part V

Two important statistical-mechanical properties need to be connected to the home range concept under the parallel processing postulate: micro- and macrostates. When a system is in its equilibrium state, (a) all microstates are equally probable and (b) the observed macrostate is the one with the most microstates. The equilibrium state implies that entropy is maximized.

First consider a physical system consisting of indistinguishable particles. For example, in a spatially constrained volume G of gas (a classical, non-complex system) that consists of two kinds of molecules, at equilibrium you expect a homogeneous mixture of these components. At every virtually defined locality g within the volume, the local density (N/g) of each type of molecule is expected to be the same, independently of the resolution of the two- or three-dimensional virtual grid cell g we use to test this prediction.

This homogeneous state is in compliance with a prediction based on the most probable macrostate for N particles, due to the system’s equilibrium condition. Entropy is the same everywhere within our system, both when comparing localities at a given resolution g within G, and when comparing entropy at different resolutions. In the latter case, the coarser-scale entropy is a simple sum of the entropy of the embedded grid cells at finer cell sizes. However, these properties and linear relationships require “Markov-compliant” dynamics at micro-resolutions (see Preface of my book, or previous posts). In this post I describe the statistical-mechanical micro-/macrostate concepts from the classical home range model’s perspective (the convection model; see previous posts), which satisfies this Markov assumption.

The description below thus serves as preparation for follow-up posts in this series, where I intend to present properties with respect to micro-/macrostates under the non-Markovian condition for complex space use; i.e., under the premise of parallel processing. One can then compare classic system properties, as they are generally assumed in statistical analysis of animal space use, with the alternative set of properties under the parallel processing conjecture.

Hence, first consider the standard statistical-mechanical description, based on a series of N serially non-autocorrelated fixes from an individual’s movement within its home range. This implies that the condition of a “deep” hidden layer is satisfied, meaning that the data may be interpreted in the context of statistical mechanics.


The image above illustrates how the hidden layer emerges from sampling an animal’s path in the serially autocorrelated domain. This condition implies that ergodicity is satisfied only at relatively fine resolutions g within G, not at the scales of the path as a whole as represented by G (see Part I of this group of posts). To achieve full ergodicity at scale G the sampling interval (lag) should be large enough to make at least one return event probable within this interval. The sampled series would then have reached the non-autocorrelated domain of statistical-mechanical system representation.

Let us in the following assume this coarser range of temporal sampling scales. At this fully expressed ergodic condition of space use at the home range scale G, the micro/macrostate aspects of a complex home range process may most easily be compared to classic predictions. In this state the animal’s present location gives no indication of its next location within the home range. The utilization distribution (UD, the “density surface of fixes”) is then expected – due to the unpredictability of the animal’s next location based on the present location – to properly reflect the spatially varying probability of the next fix’s actual location.

Simply stated, this probability function – whether a simple 2-dimensional Gaussian distribution or a more complicated but realistic multi-modal form (reflecting intra-home range habitat heterogeneity) – is the most probable outcome from N fixes, given that N is very large. Why? The answer requires a statistical-mechanical system description. Thus, let us superimpose a virtual grid onto the spatial scatter of N fixes.

Respective grid cells of size g have local density of fixes varying from zero to (quite unlikely) the theoretical maximum N within the given set of cells. Then consider one of these cells, with density N’/g, where g is grid cell area at the chosen resolution. Since we assume a classical system, N’ represents the collection of mutually independent revisit events to this cell during the total sampling time T.
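The grid superposition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fixes are hypothetical, drawn independently from a 2-dimensional Gaussian (the simplest classical case), and the names `grid_counts` and `g_side` are mine, not part of any published method.

```python
import random
from collections import Counter

def grid_counts(fixes, cell_side):
    """Count fixes per virtual grid cell; keys are (column, row) indices."""
    counts = Counter()
    for x, y in fixes:
        counts[(int(x // cell_side), int(y // cell_side))] += 1
    return counts

# Hypothetical data: N mutually independent fixes (assumption: Gaussian UD).
rng = random.Random(1)
N = 2000
fixes = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(N)]

g_side = 0.5                 # side length of a cell, so cell area g = g_side**2
counts = grid_counts(fixes, g_side)

# Local density N'/g for every non-empty cell.
densities = {cell: n / g_side**2 for cell, n in counts.items()}

# Incidence: the number of non-empty cells at this resolution.
incidence = len(counts)
```

Changing `g_side` repeats the exercise at another resolution; the per-cell counts are the N’ values that the micro-/macrostate argument below refers to.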

  • Microstates: A total of N relocations of the animal implies N-1 time steps of sampling during T, with intervals (“lag”) of length t = T/(N-1) ≈ T/N. Within this range of time scales, from t to T, consider the local series of events in a given grid cell, leading eventually to density N’/g during T. In one extreme case, the N’ fixes could appear successively, from the moment the animal was observed in the cell for the first time at the n’th increment (0<n<N-N’) and then during each of the following N’-1 observations. In another scenario, only N’-1 observations took place in succession, while the last one appeared at any time during the remaining intervals, t, between n=0 and n=N. Each of these combinations represents a given microstate (permutation under a given condition). Hence, the “(N’-1)+1” set of microstates is larger and thus more probable than the “N’+0” set of microstates, given that N’<<N. Similarly, a “(N’-2)+2” scenario is more probable than the two foregoing scenarios.
  • Macrostate: The scenario with the largest set of microstates is the one that we most likely will find in a local subset of N’ fixes from a total set of N. Under this specific condition we can calculate the system’s entropy as the sum of this maximized magnitude of microstates. If we split the series of GPS fixes into N/x equal parts (x is a positive integer and x<<N), we expect to find N’/x fixes in the given cell, with some statistical variation given by locally varying staying times (a function of the animal’s behaviour and its interaction with the local environment within g). In other words, the local density of fixes is expected to change inversely proportional with x. Hence, the utilization distribution (UD) of the animal’s space use is in this scenario in principle independent of N. However, the larger the sample size the closer we find the estimated UD from N fixes to satisfy the true UD from “infinite” N. The standard error of respective local mean densities over subsets N’/x decreases trivially (square root-kind of change) with increasing N and thus with increasing N’.
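The counting argument in the two bullet points can be checked by brute force for a toy case. The sketch below assumes, in line with the classical premise, that every way of placing N’ in-cell observations among N sampling steps is equally probable. Grouping the arrangements by their longest unbroken run of in-cell observations is my own shorthand for the “N’+0”, “(N’-1)+1” and “(N’-2)+2” scenarios; the values N=12 and N’=4 are arbitrary.

```python
from collections import Counter
from itertools import combinations

def longest_run(occupied, n_steps):
    """Length of the longest unbroken succession of in-cell observations."""
    best = current = 0
    for step in range(n_steps):
        current = current + 1 if step in occupied else 0
        best = max(best, current)
    return best

def count_microstates(n_steps, n_in_cell):
    """Number of equally probable arrangements, grouped by longest run."""
    counts = Counter()
    for occupied in combinations(range(n_steps), n_in_cell):
        counts[longest_run(set(occupied), n_steps)] += 1
    return counts

# Toy case (assumed values): N = 12 sampling steps, N' = 4 fixes in the cell.
counts = count_microstates(n_steps=12, n_in_cell=4)
for run_length in sorted(counts, reverse=True):
    print(run_length, counts[run_length])
```

For this toy case the fully successive scenario has 9 arrangements, the “(N’-1)+1” scenario has 72 and the “(N’-2)+2” scenario has 288, out of 495 in total. The more dispersed scenarios dominate, which is the sense in which the macrostate with the largest set of microstates is the one we most likely observe.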

In a later post in this series you will get a glimpse of what this picture may look like from an alternative modelling approach, under the premise of complex space use (MRW-like), where the classical micro-/macrostate theory requires modifications. In other words, what is the micro-/macrostate description under the parallel processing premise, where the N’-1 revisits to a given locality g are not mutually independent?

In that respect, one aspect that many of you may find outright disturbing is my argument that the floor of the illustration above should not, under a broad range of animal space use conditions, be termed “mechanistic”. Because parallel processing violates the Markovian principle, which represents the foundation both for a mechanistic process and its statistical-mechanical representation, the lower level needs to be renamed and the upper level will need an extended theory.

I call the process non-mechanistic, which is quite different from “static”. It means a generalized form of dynamics, where mechanistic behaviour is embedded as a special case where the process complies with the Markovian principle.

Three Important MRW Model Assumptions Confirmed

My statistical-mechanical model for simulating parallel processing – the Multi-scaled random walk (MRW) – unifies two traditionally disparate directions of research: theory for site fidelity (area-restricted space use, the home range) and theory for scale-free movement (Lévy walk-like). Space use is represented by a set of relocations (fixes). The fixes are assumed to be collected at a sufficiently large interval (lag) to ensure a statistical-mechanical representation of the system. MRW is characterized by a wide range of system properties. Some have been explored by simulations and subsequently supported by empirical pilot tests on a wide range of vertebrates. However, verification of some model assumptions has been left behind for another day. Here I catch up with three of them.

I have previously described the individual’s characteristic scale of space use, CSSU, as the MRW model’s proposed substitute for the problematic “home range size” concept.

Using serially non-autocorrelated samples to study individual space use, CSSU describes the “balancing scale” where outwards area expansion under increased sample size of relocations N (“the Home range ghost”) is counteracted exactly by the inwards expansion from resolving finer-grained details of spatial use as N is increased. CSSU has a great potential for ecological inference. For example, it is expected to be smaller under more favourable conditions (larger return frequency to known patches). This effect – smaller CSSU and smaller home range for a given N and average movement speed due to stronger homing – has already been verified in simulations (Gautestad and Mysterud 2013). Smaller CSSU for a smaller movement speed for a given homing strength has also been explored (Gautestad and Mysterud 2010).

[Figure: Obs scale diff test collective]

Assumption 1: CSSU is independent of sampling frequency

The image above shows simulated space use, produced specifically for this post. It confirms the MRW model’s assumption that CSSU is not only independent of sample size of relocations, N, but also independent of sampling frequency 1/t, in the domain of non-autocorrelated fixes (the ergodic condition at the home range scale).

Both the slope z and the intercept c are very similar in the two series [z≈0.5, log2(c)≈0.3] at the pixel resolution for studying incidence, g, which is indicated by the red squares and equals g=1/(100*100) of arena size, G (1/100 of arena scale √G). The parameters z and c are similar despite a 10-fold difference in sampling interval, t.

However, the intercept log2(c) is not exactly zero for the actual choice of g, and this implies that a slight adjustment is needed to find CSSU:

log2(c) = 0.3 →
c = 1.23 →
CSSU = c’ = g*1.23 →
CSSU = (1/(100*100))*1.23G = 0.000123G   | z≈0.5; i.e., D≈1

where g is the two-dimensional unit grid size (pixel size), which was applied to produce the I(N) result to find c in this demonstration. D is the fractal dimension of the scatter of fixes. Due to the small discrepancy between the estimated c=1.23 and the ideal c’=1 (“area per relocation”), the adjustment from c to c’ may be performed with simple arithmetic as shown. Larger discrepancies would require a “zooming” exercise on grid resolution, g, to achieve an intercept closer to c=1, which represents CSSU.
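The adjustment from the intercept to CSSU can be written out as a small helper. This is a sketch of the arithmetic only; the function name `cssu_from_intercept` is hypothetical, and CSSU is returned in the same area units as g (here g is simply set to 1, so the result reads as a multiple of the pixel area).

```python
def cssu_from_intercept(log2_c, g):
    """Convert a log2 intercept found at grid cell area g into (c, CSSU).

    Assumes z is close to 0.5 at this resolution, so that the simple
    rescaling CSSU = c * g is adequate.
    """
    c = 2.0 ** log2_c
    return c, c * g

c, cssu = cssu_from_intercept(log2_c=0.3, g=1.0)
print(round(c, 2), round(cssu, 2))
```

With log2(c) = 0.3 this gives c ≈ 1.23, i.e., a CSSU about 23% larger than the pixel area used for counting incidence.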

Assumption 2: superposition of space use by several individuals sharing the same area does not corrupt the CSSU estimate for the pooled set of fixes


When two or more individuals are utilizing the same area, the MRW theory assumes that the respective sets of relocations (inset in the illustration above) may be pooled prior to estimating the individuals’ average CSSU in this area and under the given conditions. The present “quick and dirty” pilot test using three simulated series indicates that this is indeed the case. Both the slope parameter z and the intercept log2(c) are of similar magnitude for the average regression of the three sets separately (filled circles) and the pooled set (open circles). A practical dependence on this assumption was shown in this post.

In due course a more formal test should be performed with respect to both MRW assumptions above, though, to verify that in the limit of large samples and large number of pooled sets the slight discrepancy shown here is indeed vanishing.

Assumption 3: habitat heterogeneity does not influence CSSU

[Figure: CSSU homogeneous]

In a previous post I explored the kernel density estimate (KDE) under conditions of habitat heterogeneity, which was implemented in a simplistic manner by adding some spatial “attraction points”. It was shown that the KDE’s inherent N-paradox was quite resilient to the effect from changing from a homogeneous to a heterogeneous environment.

[Figure: CSSU heterogeneous]

In the illustration above and to the right I have explored whether this resilience to environmental conditions also applies to the CSSU estimate for the same MRW data sets (non-autocorrelated fixes). The open symbols show incidence collected at different spatial resolutions (pixel scale), and the series with filled symbols reflect the best approximation for CSSU (see the procedure above for how to adjust for the final calculation of CSSU at c=1). Theoretically, since the CSSU reflects the average space use scale within the given spatial extent, such an average should not depend on its variance (e.g., from embedded habitat heterogeneity).

The present pilot test seems to confirm this assumption. Using the same MRW series as in this post and this post, the respective CSSU estimates are quite similar. Follow-up studies are needed, though, to confirm this kind of resilience under large-sample data conditions and a broader range of habitat heterogeneity.


Gautestad, A. O., and I. Mysterud. 2010. Spatial memory, habitat auto-facilitation and the emergence of fractal home range patterns. Ecological Modelling 221:2741-2750.

Gautestad, A. O., and A. Mysterud. 2013. The Lévy flight foraging hypothesis: forgetting about memory may lead to false verification of Brownian motion. Movement Ecology 1:1-18.

Spatial Analysis of Serially Autocorrelated Fixes

In previous posts I have mostly assumed serially non-autocorrelated fixes, and simulations have also reflected this coarse temporal sampling scale. However, in this post I started exploring the autocorrelation effect’s interesting statistical-mechanical properties. In this post I elaborate further on this theme, in particular its enhanced effect on the N-paradox under the kernel density estimation (KDE).

From a statistical-mechanical perspective a high degree of coarse-graining of an animal’s space use is an advantage, due to a deep “hidden layer” and – consequently – better compliance with the ergodic principle in statistical mechanics. In short, system parameters are more precisely estimated when some spatio-temporal scale distance from the animal’s true path is achieved. In practice GPS series are often collected at high frequency in order to give large series for statistical analysis. This is an advantage for detailed ecological inference at the behavioural “micro-scale” of path analysis, but may hinder a proper statistical-mechanical description of system properties. However, even in the domain of serially autocorrelated data the statistical-mechanical analysis makes sense, given that the system extent is similarly constrained (spatial range needs to be constrained to reflect finer temporal resolution, and thereby reinstate better compliance with system ergodicity; see this post).

[Figure: KDE autocorr MemRW]

I have previously criticized the KDE for lack of power to distinguish scale-specific from scale-free space use. In the two illustrations to the right I show a similar KDE analysis based on paths sampled at higher frequency, leading to serial autocorrelation. I acknowledge that KDE is strictly meant for non-autocorrelated series. However, since many are applying this approach also to high-frequency series, I briefly comment on this scenario here.

[Figure: KDE autocorr MRW]

The N-paradox – KDE area becoming larger for small samples of fixes – is clearly visible also for autocorrelated series: a smaller sample tends to show a larger area – yellow colour – for a given 99% isopleth both for scale-specific (MemRW) and scale-free (MRW) movement. As I found for non-autocorrelated series, the discrepancy is even larger for smaller percentage “slices” of the utilization distribution (not shown). The KDE results for autocorrelated series are shown in the illustration below (open symbols).

I have previously documented how the KDE is unable to differentiate between a scale-specific and a scale-free kind of space use. Thus, under the condition of scale-free space use (and consequently applying the proposed incidence method – counting non-empty virtual grid cells for a given N – as an alternative to KDE), auto-correlated series require some additional caution when estimating the important MRW parameters c and z. These parameters are estimated from the home range ghost formula for incidence, I(N) = cN^z. Exploration of simulations leads to the following rules-of-thumb:

  • Incidence (I) should be calculated separately over a range of spatial resolutions (grid scales) to find optimal scale, as outlined in this post.
  • If the data are MRW compliant one should observe that z will tend to increase slightly beyond z=0.5 for small grid scales, and decrease below z=0.5 at larger scales relative to the optimal scale. These tendencies are visible both for non-autocorrelated and autocorrelated series, but are more pronounced in the latter case.
  • The best parameter estimates are achieved by applying the grid scale with best compliance with z=0.5 to find c from the intercept at this scale (this is not a tautology, due to extra statistical information that can be extracted from the regressions; I will explain in a later post). The characteristic scale of space use, CSSU, can then be estimated by the simple calculation shown here.
  • Observe that CSSU is dependent on sampling frequency when data are autocorrelated. Thus, CSSU is time-dependent under this condition. A modifying factor is thus needed to correct for the strength of autocorrelation. See explanation in this post.
  • If the underlying process is not MRW compliant but both scale-specific and serially auto-correlated (illustrated below by the MemRW with a z=0.69 estimate; filled circles), the incidence approach is apparently inconsistent with the expected z = 0.25, which was previously shown for non-autocorrelated sampling. The reason for the large discrepancy of z in this particular scenario is easy to explain. At the fine-grained sampling scale in the domain of autocorrelation, a 2-speed movement is revealed under the present conditions (instantaneous relocation during a return event, like a ground-foraging bird with sudden return flights to previous patches). This speed variability is “hidden” if the sampling interval is increased relative to the present autocorrelated fixes, thereby achieving a non-autocorrelated series from sub-sampling. Consequently, z=0.25 should then be expected to be maintained also for this type of 2-speed movement – as was in fact documented in the previous post. In other words, the inflated z in the result below for MemRW is due to interference from path dynamics at fine resolutions into the “hidden layer” scales, leading to an intermittent process between micro- and meso-scale dynamics. Thus, z=0.69 is a statistical artifact – a curiosity – from this mixture of movement modes at the chosen observation frequency of fixes.
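The first three rules-of-thumb can be sketched as a small routine: fit log2(I) against log2(N) at each grid scale, then pick the scale whose slope is closest to z=0.5 and read c from the intercept there. The function names and the synthetic incidence values below are hypothetical and only serve to show the mechanics; this is ordinary least squares on log-transformed values, not necessarily the exact procedure behind the figures in these posts.

```python
import math

def fit_power_law(ns, incidences):
    """Least-squares fit of log2(I) = log2(c) + z*log2(N); returns (z, c)."""
    xs = [math.log2(n) for n in ns]
    ys = [math.log2(i) for i in incidences]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    z = sxy / sxx
    c = 2.0 ** (my - z * mx)
    return z, c

def best_grid_scale(results, target_z=0.5):
    """results maps grid cell area -> (ns, incidences).

    Returns the cell area whose fitted z is closest to target_z,
    together with its (z, c) estimate.
    """
    fits = {g: fit_power_law(ns, inc) for g, (ns, inc) in results.items()}
    g_best = min(fits, key=lambda g: abs(fits[g][0] - target_z))
    return g_best, fits[g_best]

# Synthetic illustration (assumed values): z drifts above 0.5 at the finest
# scale and below 0.5 at the coarsest scale, as described in the list above.
ns = [16, 32, 64, 128, 256, 512, 1024]
results = {
    0.25: (ns, [2.0 * n ** 0.6 for n in ns]),
    1.0: (ns, [1.23 * n ** 0.5 for n in ns]),
    4.0: (ns, [0.8 * n ** 0.4 for n in ns]),
}
g_best, (z_best, c_best) = best_grid_scale(results)
```

In this synthetic case the routine selects the middle scale, where z=0.5 and c=1.23; the remaining small offset of c from 1 is then handled by the simple CSSU adjustment described earlier.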

[Figure: KDE autocorr]


Lévy Walking or Not Lévy Walking – That’s the Question

As summarized and reviewed by many, one of the hot topics in the field of movement ecology regards to what extent – and under which ecological conditions – an animal performs scale-free movement. As readers of my book and my blog definitely have observed, I advocate a distinction between several kinds of scale-free movement (summarized by the Scaling cube); in particular, (a) the standard battleground of Lévy walk vs. Brownian motion (scale-free vs. scale-specific movement), (b) Lévy walk vs. composite random walk (a “Lévy walk look-alike”, i.e., pseudo-scale-free movement), and (c) Multi-scaled random walk (MRW; scale-free movement with site fidelity). The third variant has so far not received much attention in this debate. This is in my view unfortunate, because this approach apparently has the potential to resolve much of the controversy!

It is a fact that empirical research on scale-free vs. scale-specific movement generally has ignored the effect from animal site fidelity. This important and very universal property of animal space use seems to have got lost in convoluted and heated discussions on statistical methods and statistical interpretation of results. Why is this unfortunate? Simply because an inclusion of site fidelity when modeling space use allows a given movement pattern to appear scale-free or scale-specific, simply as a function of sampling frequency.

[Figure: Figure3A]

Research in this field tends to focus on the Lagrangian aspect of scale-free vs. scale-specific movement; i.e., the distribution F(L) of inter-fix displacement lengths L in a set of relocations from sampling at interval t along the true path. Based on our simulation results, a visual inspection of F(L) may apparently give a good clue about the actual generating process, as illustrated conceptually by the three classic candidate models in the left-hand part of the illustration above (the figure is copied from Gautestad and Mysterud, 2013). For example, with log-transformed axes, linearity over a range of 10-100 length units is needed to make the distribution acceptable as statistically “scale-free” and thus Lévy walk compliant over this range, based on a first visual inspection (more formal MLE statistics will then typically confirm this). Such linearity implies a power law distribution, with log-log slope of magnitude –β and 1<β<3.
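As a minimal sketch of the “more formal MLE statistics” mentioned above, the exponent β of a continuous power law F(L) ∝ L^(–β) for L ≥ L_min can be estimated with the standard maximum-likelihood formula β = 1 + n / Σ ln(L_i/L_min). The step lengths below are synthetic (drawn from a pure power law with β=2), and the choice of L_min is treated as known, which it rarely is in real data.

```python
import math
import random

def sample_power_law(n, beta, l_min, rng):
    """Inverse-transform sampling from F(L) ~ L**(-beta), L >= l_min, beta > 1."""
    return [l_min * (1.0 - rng.random()) ** (-1.0 / (beta - 1.0)) for _ in range(n)]

def beta_mle(lengths, l_min):
    """Maximum-likelihood estimate of beta from step lengths >= l_min."""
    tail = [l for l in lengths if l >= l_min]
    return 1.0 + len(tail) / sum(math.log(l / l_min) for l in tail)

# Synthetic step lengths (assumption: pure power law, beta = 2, L_min = 1).
rng = random.Random(42)
lengths = sample_power_law(20000, beta=2.0, l_min=1.0, rng=rng)
beta_hat = beta_mle(lengths, l_min=1.0)
```

An estimate within 1<β<3 is what would conventionally be read as Lévy walk compliant; the sections below explain why that reading may depend on the sampling interval when site fidelity is present.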

As the slope becomes steeper than -3 in log-log plot, the data is most likely reflecting scale-specific movement – apparently! Why “apparently”? Because the standard set of candidate models all define this specific transition as a critical magnitude of β to distinguish scale-free from scale-specific movement, without considering the influence from the sampling frequency 1/t; i.e., without considering a site fidelity effect on observed β.

The traditional set of models are implicitly based on an assumption that the animal’s path is not influenced by site fidelity. In other words, the animal is assumed to self-cross its historic path by chance only. Very unlikely, under a broad range of ecological conditions! MRW, which in my view should be included in the set of candidate models in this melting pot, will force upon the analysis a modification of how this scale-specific/scale-free transition at β≈3 is interpreted. In the MRW model, scale-free space use is combined with site fidelity. At present, MRW is the only model that provides this combination! 

Site fidelity implies a sensitivity to sampling frequency. If 1/t is substantially larger (interval smaller) than the animal’s average return frequency towards previously visited locations, we do not expect much effect from site fidelity on the distribution F(L). There is only a minute chance for a return event taking place during a given sampling interval. The MRW process will at these temporal resolutions appear scale-free and Lévy walk compliant. However, if the ratio ρ between the animal’s average return interval and the researcher’s choice of sampling interval is approaching unit size from above – and perhaps even passing over to ρ<1 (high probability of at least one return event during t) – the distribution F(L) will no longer look scale-free. Under this condition the effect from spatial memory on space use comes into focus and will tend to blur any log-log linearity in the F(L) distribution.
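To put rough numbers on “minute chance” and “high probability”, one can assume, purely for illustration, that return events occur as a Poisson process with a mean interval equal to the animal’s average return interval. This assumption is mine and is not a stated property of the MRW model. The probability of at least one return event during a sampling interval t is then 1 – exp(–1/ρ).

```python
import math

def rho(mean_return_interval, sampling_interval):
    """Ratio between the average return interval and the sampling interval t."""
    return mean_return_interval / sampling_interval

def prob_return_within_lag(rho_value):
    """P(at least one return event during one lag t).

    ASSUMPTION: return events follow a Poisson process; this is an
    illustrative simplification only.
    """
    return 1.0 - math.exp(-1.0 / rho_value)

for r in (100.0, 10.0, 1.0, 0.1):
    print(r, round(prob_return_within_lag(r), 3))
```

Under this assumption the probability is about 1% at ρ=100, about 63% at ρ=1, and close to certainty at ρ=0.1, which matches the qualitative description of the three regimes.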

The site fidelity effect is illustrated in the right-hand part of the image above. At high frequency sampling relative to the individual’s average return interval, the effect from return events – the site fidelity effect – is basically concentrated in the outermost part of the tail of the distribution (blue area). In the MRW model, this “hump” pattern is reflecting that return events, whenever they have happened during a sampling interval, will tend to produce a quite large displacement at the next relocation in the series, relative to the median step length from sampled displacements that are void of such returns. Next, consider that ρ is decreased from e.g. 100 towards a smaller ratio using a larger t. For example, in the range 1<ρ<10 simulations show that the slope may be “artificially” steepened by the site fidelity effect, making the functional form increasingly similar to a truncated Lévy walk as ρ→1. A statistical analysis may be expected to show a somewhat reduced β in the small-L range and an increased β in the large-L range. Under this condition the influence from return events becomes more broadly distributed over the F(L) range (red area). When ρ<1, F(L) may give the impression of Brownian motion compliance due to a concentration of return steps in the left-hand part of the distribution (green area).

It seems like a distributional mixture in the zone between truncated Lévy walk and Brownian motion is the typical pattern observed in real F(L) distributions, making it appear like an animal that may be mode-shifting between truncated Lévy walk and Brownian motion. However, considering the alternative MRW model introduces another interpretation: the movement could potentially be scale-free under influence of site fidelity, which makes the functional form of F(L) observer-dependent. The latter implies that F(L) becomes a function of ρ (where the observer defines the denominator). Based on the typically observed intermittent appearance of F(L) in real data, the discussion continues circling back and forth, year after year: is the pattern complying best with truncated Lévy walk, or a scale-specific Brownian motion? Adding MRW to the set of candidate models for Bayesian analysis might perhaps resolve the gridlock and bring us closer to consensus?

It should be easy enough to test MRW’s potential strength in this regard, given a sufficiently large data set, collected at a sufficiently high sampling frequency. Studying the effect on F(L) from different ρ then becomes a matter of sub-sampling the series.
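The sub-sampling itself is straightforward. The sketch below assumes a regularly sampled series of (x, y) fixes; keeping every k’th fix multiplies the sampling interval t by k and thus divides ρ by k. The helper names and the tiny example path are hypothetical.

```python
import math

def subsample(fixes, k):
    """Keep every k'th fix; the lag becomes k*t and rho becomes rho/k."""
    return fixes[::k]

def step_lengths(fixes):
    """Displacement lengths L between successive fixes in the series."""
    return [math.hypot(bx - ax, by - ay)
            for (ax, ay), (bx, by) in zip(fixes, fixes[1:])]

# Tiny hypothetical path, only to show the mechanics.
path = [(0, 0), (3, 4), (6, 8), (6, 8), (9, 12)]
steps_full = step_lengths(path)                 # lag t
steps_half = step_lengths(subsample(path, 2))   # lag 2t, rho halved
```

Repeating this for a sequence of k values on a long real series gives one F(L) per ρ, which can then be compared for the change in functional form that the MRW model predicts.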

Both truncated Lévy walk and MRW will show a steepened slope upon such sub-sampling, but these two alternative hypotheses may be easily distinguished by supplementing the F(L) analysis with fractal analysis of the spatial fix pattern (MRW will maintain D≈1, while truncated LW will tend towards larger D). One could also look for the home range ghost (incidence increasing proportionally with √N for MRW and incidence increasing proportionally with N for LW).

[Figure: Appendix Fig (large rho)]

When ρ>≈10, simulations of MRW show that return events tend to inflate the occurrence of very long steps, giving the impression of a “hump”, for example as shown above for ρ≈10. Several researchers in the field of Lévy research have in fact anecdotally observed such a hump, without having a plausible behavioural explanation for it within the context of the traditional candidate models. However, if ρ>>10, the hump may in the MRW scenario almost disappear, making F(L) even more Lévy-like. The illustration to the right shows the simulation result for ρ=100 (Gautestad and Mysterud, 2013).

Already in Gautestad and Mysterud (2005) we illustrated the ρ effect on the “problematic”  F(L) distribution. In Gautestad and Mysterud (2013) we elaborated further on this aspect. However, so far it looks like nobody has grasped this approach to see to what extent this “process-oriented” MRW framework might contribute to bringing the controversy out of the quagmire. In the meantime, the Lévy controversy rolls on, based on models void of spatial memory effects on F(L).

The MRW approach might also contribute to resolving other hot topics in movement ecology. For example, according to the Lévy walk/flight hypothesis for optimal foraging (a more specific aspect of the issue outlined above), Brownian motion is expected in a relatively productive environment with a relatively predictable resource distribution. In contrast, the MRW working hypothesis would be: in a relatively productive environment the individual’s home range is expected to be smaller, due to higher return frequency (stronger site fidelity, and thus a smaller ρ for a given sampling frequency). Hence, based on MRW one may predict that the site fidelity effect on F(L) will be more pronounced under these circumstances. If observation frequency, 1/t, is not increased accordingly, one may erroneously find support for the Lévy flight foraging hypothesis. Supplementary tests should be performed before concluding.


Gautestad, A. O., and I. Mysterud. 2005. Intrinsic scaling complexity in animal dispersion and abundance. The American Naturalist 165:44-55.

Gautestad, A. O., and A. Mysterud. 2013. The Lévy flight foraging hypothesis: forgetting about memory may lead to false verification of Brownian motion. Movement Ecology 1:1-18.