Are You a Conformist?

All the new and exciting developments in movement ecology demonstrate that a new era for the science of animal space use may be in the making. However, not all ecologists are that enthusiastic. On the one hand we are confronted with new theoretical ideas, criticism of old dogma and off-piste hypotheses that are still in the process of being empirically scrutinized. On the other hand these developments also bring the expected dose of controversy, confusion, and plain ignorance of what is on the table. Taking the latter attitude implies a risk of becoming a conformist, which should not be confounded with sound scientific skepticism. A conformist typically observes a new theoretical direction and its empirical support, but tends to reject its implications “anyway” – even before testing the new approach and methods on one’s own data, or without waiting for others to do the job on theirs.

In a recent post, Stepping Away From the Markov Lamppost, I focused on one particular aspect from the heterogeneous zoo of new theoretical proposals. In this post I will be more concrete in my criticism of conformist attitudes in wildlife ecology. Consider the kernel density estimate (KDE), which holds a central position in studies of animal space use as an estimator of the utilization distribution (UD).

… we consider that the animals’ use of space can be described by a bivariate probability density function, the UD, which gives the probability density to relocate the animal at any place according to the coordinates (x, y) of this place. Calenge (2015), p14.

Many kinds of kernel functions exist, but there is general agreement that the so-called “smoothing parameter”, h, is more crucial to the UD estimate than the qualitative choice of kernel function. Hence, the bivariate normal kernel is a common choice (and is the default in – for example – the adehabitatHR package for R).

Figure: KDE from MRW_100 (Levy2 sampled).

KDE produces isopleths for a given h; contour lines representing expected equal intensity of space use along the specified demarcation, given the actual sample of relocations (fixes), N. So – what is the problem?
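
For readers who want to reproduce this kind of demarcation on their own fixes, a minimal sketch follows, using the adehabitatHR functions kernelUD and getverticeshr. The coordinates are hypothetical stand-ins, and the grid/extent settings are illustrative assumptions rather than recommendations.

```r
# Minimal sketch: KDE isopleths from a set of fixes (hypothetical coordinates).
# Assumes the sp and adehabitatHR packages are installed.
library(sp)
library(adehabitatHR)

set.seed(1)
# Stand-in for real relocation data: N = 100 fixes drawn from a bivariate normal.
fixes <- SpatialPoints(cbind(rnorm(100, sd = 100), rnorm(100, sd = 100)))

# Bivariate normal kernel with the default ("reference") smoothing parameter h;
# a generous grid and extent help the outer isopleths close properly.
ud <- kernelUD(fixes, h = "href", grid = 200, extent = 2)

# Demarcate the 50% and 99% isopleths and report their areas (package default units).
core  <- getverticeshr(ud, percent = 50)
outer <- getverticeshr(ud, percent = 99)
core$area
outer$area
```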

If the dynamics behind the pattern do not comply with standard statistical assumptions, the area demarcated by the KDE (for a given h) will overcompensate for the effect of a change in the sample size of fixes, N. This surprising tendency for shrinking area with increasing N when using the KDE has repeatedly been documented in many species, such as brown bear (Belant and Follmann 2002), cerulean warbler (Barg et al. 2005), white-tailed deer (Fieberg and Börger 2012), and mule deer and coyotes (Schuler et al. 2014).
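
A simple way to probe this on one’s own material is to demarcate the same isopleth from subsamples of increasing size and compare the resulting areas. The sketch below is one way such a check might be set up; the fix matrix is a hypothetical stand-in, and the helper name area_at_N is of course my own.

```r
# Sketch: does the KDE-demarcated area change systematically with sample size N?
# Replace 'fixes_all' with your own serially non-autocorrelated relocations.
library(sp)
library(adehabitatHR)

set.seed(2)
fixes_all <- cbind(rnorm(1000, sd = 100), rnorm(1000, sd = 100))  # stand-in data

area_at_N <- function(N, percent = 99) {
  sub <- SpatialPoints(fixes_all[sample(nrow(fixes_all), N), ])
  ud  <- kernelUD(sub, h = "href", grid = 200, extent = 2)
  getverticeshr(ud, percent = percent)$area
}

Ns <- c(100, 250, 500, 1000)
sapply(Ns, area_at_N)   # under the artifact discussed above, these may shrink with N
```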

Despite this range of empirical studies showing non-trivial N-dependence of the KDE contour lines (op. cit.), the majority of ecologists seem to remain quite ignorant of it. In other words, they show a conformist attitude and continue applying the standard method as if the documented shortcomings did not exist – or, at best, were acceptable. 10-15 years of wake-up calls should perhaps be enough? Obviously not. As a result, we continue to see home range analyses with subjective and quasi-objective choices of KDE smoothing and other adjustments. This may block progress towards alternative space use models with potentially stronger predictive power (for example, with improved value for studies on habitat selection).

What does the empirically observed “overcompensation for the effect of change of N” imply? For example, it means that the home-range ghost paradox (non-asymptotic space-use demarcation with increasing N) under the KDE protocol may be camouflaged by a statistical artifact, due to a specific statistical assumption inherent in the KDE. Instead of area increasing with N – as under the repeatedly verified home range ghost expectation when using a range of other non-parametric demarcation methods (incidence, MCP, R/SD, …) – the KDE tends to “buffer” this pattern, and even to make home range area appear to shrink with increasing N.

The dubious KDE assumption has two facets:

  • animal movement and space use, even under site fidelity conditions, are assumed to be mechanistic and thus Markov compliant (see several previous posts). This follows statistical-mechanically from the choice of (for example) the bivariate normal kernel, whether h is small or large. The Markov assumption is also apparent from the choice of model for simulations, for example the bivariate Ornstein-Uhlenbeck design (see below).
  • N-dependency is assumed to be statistically trivial, implying that given a sufficiently large N, a home range area asymptote is expected. The KDE is in fact ideally designed to “calibrate” for this small-sample artifact (Fieberg and Börger 2012).

Figure: KDE from MRW_1000 (Levy2).

My book devotes several chapters to arguing and documenting – both theoretically and empirically – that both assumptions may be wrong. The home-range generating process may not be Markovian, and a home range area asymptote may not be expected.

What is at stake, if one allows oneself to consider these nuisances to be more than mere generators of statistical noise? Stepping away from the Markov framework means that one has to consider non-mechanistic theory and accompanying statistical methods when studying home range data. Under this alternative framework, core concepts like “home range area asymptote”, “outlier fixes relative to core area fixes”, “home range area”, “home range overlap” and many others need to be critically evaluated and adjusted – possibly replaced – by corresponding concepts under the alternative theory. The standard framework should definitely not be accepted a priori simply because a 50-100 year tradition exists for doing so. Given the mounting indications of a shaky theory and dubious methods, just moving on as usual will cement a conformist attitude.

Criticizing a broadly applied method in wildlife ecology should of course be supplemented with a feasible alternative. With respect to the two dubious KDE assumptions above, in my book I propose these alternatives:

  • Exploring the bivariate Cauchy distribution as an alternative candidate model for the kernel function, or – more simply – applying incidence (virtual grid cells containing at least one fix, at the optimized grid resolution, CSSU; a minimal sketch follows below this list). These approaches are better suited to cope with multi-scaled (even, to some extent, scale-free) space use under the influence of spatial memory.
  • Acknowledging that memory-influenced space use is expected to generate a statistical fractal with respect to spatial fix dispersion. Consequently, one should explore 1/c from the Home range ghost formula as an expression for “space use intensity”, rather than auto-applying the traditional density-of-fixes approach (the UD), which is inherently non-fractal. As a first-level approach the density-based UD provides a good representation of space use in coarse terms, but it may not be a proper tool for more detailed analyses of habitat utilization and other aspects of ecological inference. I will be more specific in an upcoming post.
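
As a minimal illustration of the incidence alternative (first bullet above), counting non-empty virtual grid cells at a chosen resolution requires only a few lines. The helper name incidence and the stand-in path below are my own illustrative assumptions.

```r
# Sketch: incidence I = number of virtual grid cells containing at least one fix,
# at a chosen grid resolution (cell side length). 'xy' is a stand-in fix matrix.
incidence <- function(xy, cell) {
  ids <- paste(floor(xy[, 1] / cell), floor(xy[, 2] / cell))
  length(unique(ids))
}

set.seed(3)
xy <- cbind(cumsum(rnorm(1000)), cumsum(rnorm(1000)))  # stand-in path
incidence(xy, cell = 5)
```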

Finally, what do this post’s illustrations show? Using a commonly applied method in R (KDE under the adehabitatHR package), the respective 50-99% isopleths for small and large samples of serially non-autocorrelated fixes tend to show a contraction of home range area for larger N. For example, while the 99% isopleth for N=1,000 demarcated an area ca 15% smaller than the 99% isopleth for N=100, the difference was 49% for the 90% isopleths! According to the theory behind the actual simulation, the MRW model, this contraction is in fact predicted when applying a demarcation method which implies a “smooth” and differentiable function (represented by the density surface of the UD) upon a “rugged” and non-differentiable kind of density surface (a fractal dispersion of fixes, emerging from a tendency for self-reinforcing re-visits to previous locations).

Observe that “rugged” in the context of multi-scaled space use is a different concept from a multi-modal UD. The latter is inherently smooth despite revealing more peaks in the UD at fine resolutions, regardless of the actual magnitude of the KDE’s smoothing parameter.

Interestingly, the KDE’s N-dependency issue has also been documented in other simulation studies, for example based on the bivariate Ornstein-Uhlenbeck model, a mechanistic convection model for the home range (Fieberg 2007). The bivariate Ornstein-Uhlenbeck model, due to its combination of Markov compliance and scale-specificity, simulates data that are statistical-mechanically compliant with the concept of a “smooth” UD and also with the classical area asymptote. However, in that case the N-dependency may be due to serially autocorrelated samples (see Table 1 in Fieberg 2007, and an upcoming post in prep.).
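
For readers who want to reproduce this kind of test, a bivariate Ornstein-Uhlenbeck path can be simulated with a few lines of Euler discretization; the parameter values below are arbitrary assumptions chosen for illustration only.

```r
# Sketch: bivariate Ornstein-Uhlenbeck positions (Euler discretization).
# theta = strength of attraction towards the home-range centre mu; sigma = noise level.
set.seed(4)
n <- 5000; dt <- 0.1
theta <- 0.5; sigma <- 1; mu <- c(0, 0)
pos <- matrix(0, n, 2)
for (t in 2:n) {
  pos[t, ] <- pos[t - 1, ] +
    theta * (mu - pos[t - 1, ]) * dt +
    sigma * sqrt(dt) * rnorm(2)
}
# 'pos' can now be fed into the same KDE protocol as above, e.g. to explore the
# N-dependency under autocorrelated sampling discussed by Fieberg (2007).
```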

The MRW example above is based on serially non-autocorrelated samples. Thus, MRW is the first theoretical model able to demonstrate coherence between simulations and real data with respect to the KDE’s N-dependency issue, in addition to showing coherence with the empirically observed fractal structure of the UD and the non-asymptotic home range area property (the home range ghost).

So here you have it again. Are you ignoring the empirically verified weaknesses (or paradoxes) inherent to the KDE and similar statistical models, like the Brownian bridge? Are you ignoring a theory that claims to offer a solution to these paradoxes under a broader statistical-mechanical approach? In other words, are you a conformist, or are you ready to engage in a sound critical evaluation of the proposed alternative methods?

If you are reading these questions, you are not a conformist. The conformists have already jumped off further up.

REFERENCES

Barg, J. J., J. Jones, and R. J. Robertson. 2005. Describing breeding territories of migratory passerines: suggestions for sampling, choice of estimator, and delineation of core areas. Journal of Animal Ecology 74:139-149.

Belant, J. L., and E. H. Follmann. 2002. Sampling considerations for American black and brown bear home range and habitat use. Ursus 13:299-315.

Calenge, C. 2015. Home Range Estimation in R: the adehabitatHR Package.

Fieberg, J. 2007. Kernel density estimators of home range: smoothing and the autocorrelation red herring. Ecology 88:1059-1066.

Fieberg, J., and L. Börger. 2012. Could you please phrase “home range” as a question? Journal of Mammalogy 93:890-902.

Schuler, K. L., G. M. Schroeder, J. A. Jenks, and J. G. Kie. 2014. Ad hoc smoothing parameter performance in kernel estimates of GPS-derived home ranges. Wildlife Biology 20:259-266.

A Statistical-Mechanical Perspective on Site Fidelity – Part III

In Part I of this group of posts I introduced physical concepts like entropy and ergodicity, and I contrasted these concepts under classic and extended (complex) system conditions of animal space use. In Part II I followed up on this theme by adding a novel piece of theory to the jigsaw puzzle, where “inward contraction” of the magnitude of entropy towards finer spatial scales of a home range was counterbalanced by “outward expansion” of entropy, as a reflection of the area expansion under increasing sample size N under the Home range ghost concept. In this manner I showed how the MRW model is coherent with a key premise for a sound statistical-mechanical framework, despite its qualitatively different structure in comparison to classic Boltzmann-Gibbs theory. Basically, the extended theory rests on a conjecture about parallel processing rather than Markov-compliant mechanics at micro-scales. In this Part III follow-up I take additional steps towards a theory for complex statistical mechanics of animal space use.

A statistical-mechanical clarification of the parallel processing conjecture is important for many reasons, not least from an ecological modelling perspective. For example, it may provide a theoretical justification for the concept I introduced in another post – the individual’s characteristic scale of space use (represented by the parameter c in the Home range ghost equation, A = c√N). In short,

  • Why does a given individual’s aggregated movement over the actual fix sampling period self-organize towards a balancing level – the characteristic scale of space use?
  • What is so special about this scale, which obviously is larger for an elephant than for a mouse, and – for a given individual – larger in a harsh environment than in a more optimal habitat?
  • Why does observed space use expand approximately with the square root of the sample size of fixes, N (a power law with exponent 0.5), as we have verified in many data sets covering a wide range of species?

Balance of entropy

The illustration to the left recapitulates the general property of the Multi-scaled random walk (MRW) model (the home range ghost, due to exponent >0) from the perspective of observed area accumulation as a function of N. In this case, area refers to the number of non-empty virtual grid cells, incidence, I, at a chosen grid resolution. This resolution is indicated by the respective squares on the right hand side. “The characteristic scale of space use” is marked as CSSU. Both axes are log-transformed to better visualize the model’s power law property (green line). Log(c) is found as the intercept on the log(I) axis.
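
In code terms, the green line amounts to counting incidence for accumulating subsets of fixes and regressing log(I) on log(N). A rough sketch, with a stand-in fix matrix and my own helper names, could look like this.

```r
# Sketch: incidence I as a function of N at a fixed grid resolution ('cell'),
# fitted on log-log axes; the intercept estimates log(c) and the slope estimates z.
incidence <- function(xy, cell) {
  length(unique(paste(floor(xy[, 1] / cell), floor(xy[, 2] / cell))))
}

set.seed(5)
xy  <- cbind(cumsum(rnorm(1000)), cumsum(rnorm(1000)))  # stand-in for real fixes, in sampling order
Ns  <- round(10^seq(1, 3, by = 0.25))
I_N <- sapply(Ns, function(N) incidence(xy[1:N, , drop = FALSE], cell = 5))

fit <- lm(log(I_N) ~ log(Ns))
coef(fit)    # intercept = estimate of log(c); slope = estimate of z (near 0.5 under MRW)
```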

First, a trivial clarification. The green line does not expand with N “forever”, despite the lack of an asymptote in the model formulation. Within a given time frame T (for example, a season), the animal has obviously limited its space use to a given range. Thus, if sampling frequency is increased sufficiently during a fixed T, the green line will ultimately have to level off. However, due to the “absorbing” effect of inward area expansion and the small power exponent of 0.5, N typically has to be very large to reveal this limit.

In fact, the frequency of fix collection, 1/Δt, within T will have to be increased to a magnitude where the hidden layer (see definition under The scaling cube) becomes too narrow for a proper statistical-mechanical system description. In that case one has approached the micro-scale path resolution, which implies that there are insufficient “degrees of freedom” left on the animal’s true movement path to make the animal’s next position sufficiently uncertain in statistical terms. In short, the ergodic property – explained in Part I of this group – is no longer satisfied at sampling interval Δt under this condition.

Returning to the conceptual graph above, the slope z=0.5 at resolution CSSU (green line) illustrates the standard model. At this pixel scale, inward contraction equals outward expansion (see above); this balance additionally expresses the condition where the animal has distributed its temporally scaled goals with equal statistical weight over the scale range (see Parts I-II of this set of posts). In other words, over the actual period T and temporal resolution Δt the animal has on average executed short-, medium- and longer-term goals in a specific and self-organized manner that leads to geometric scaling rather than just an arithmetic accumulation of spatial displacements (a power law, rather than an asymptotic growth of incidence with increasing N). I refer to the book and other blog posts for conceptual details on the parallel processing conjecture.

CSSU is the key “balancing point” of space use with respect to spatial scale, and it provides a great potential for ecological interpretation. This is how to estimate it. If the chosen pixel size is too fine- or too coarse-scaled relative to the true CSSU, the relationship I(N) breaks down unless N is very large. At too fine resolutions, new fixes will all tend to fall outside the existing set of non-empty cells (red line) until a sufficiently large N is reached and z=0.5 is achieved. Thus, I(N) grows proportionally with N rather than with √N in this range of N: the given sample size is not sufficiently large to reveal the animal’s fractal space use pattern from “inward” entropy contraction at these fine scales. Further, the intercept log(c) becomes superficially inflated (see dotted line with arrow towards the y-intercept). On the other hand, if grid resolution is chosen too coarse relative to the true CSSU, new fixes will all tend to fall inside the existing home range demarcation (blue line). Thus, I(N) is independent of N in this range of N (zero slope). Again, a large N will be needed for the fractal dispersion of fixes to escape this zone of statistical artifacts. However, the estimated y-intercept is in this case too low relative to the true CSSU.

As outlined in my book, optimizing grid resolution for I(N) to estimate CSSU is a matter of “zooming” towards the virtual grid cell (pixel) size where log(c) is close to zero. At this scale, the power exponent z is predicted to be close to 0.5 (MRW under the default condition β=2 for exploratory moves).
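
A rough sketch of this zooming procedure – looping over candidate pixel sizes and picking the one whose fitted intercept log(c) lies closest to zero – might look as follows. The stand-in data, helper names and the guard on the fitted slope are my own illustrative assumptions, not a prescription from the book.

```r
# Sketch of the "zooming" procedure: among candidate cell sizes, choose the one
# where the fitted intercept log(c) is closest to zero; the slope there estimates z.
incidence <- function(xy, cell) {
  length(unique(paste(floor(xy[, 1] / cell), floor(xy[, 2] / cell))))
}

estimate_cssu <- function(xy, cells, Ns) {
  fits <- t(sapply(cells, function(cell) {
    I_N <- sapply(Ns, function(N) incidence(xy[1:N, , drop = FALSE], cell))
    coef(lm(log(I_N) ~ log(Ns)))                 # intercept = log(c), slope = z
  }))
  ok   <- fits[, 2] > 0.2 & fits[, 2] < 0.8      # discard degenerate flat/steep fits (my own guard)
  best <- which(ok)[which.min(abs(fits[ok, 1]))] # |log(c)| closest to zero
  list(cssu = cells[best], log_c = fits[best, 1], z = fits[best, 2])
}

set.seed(6)
xy <- cbind(cumsum(rnorm(2000)), cumsum(rnorm(2000)))   # stand-in for real fixes
estimate_cssu(xy, cells = 2^(0:6), Ns = round(10^seq(1.5, 3.3, by = 0.2)))
```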

In a follow-up post (Part IV) I plan to add yet another novel brick to the extended statistical-mechanical theory for complex space use, by introducing the characteristics of micro- and macrostates as we move from a Markovian condition (the standard theory) to a parallel processing system.


Bridge Building Needed

Large databases of animal space use (e.g., GPS fixes), environmental conditions (e.g., GIS layers) and sophisticated models for statistical and behavioural analysis are now advancing our field of research at a rapid pace. However, there still exists an unfortunate level of skepticism among some field ecologists towards some classes of theoretical models. On the other hand, there also exists a similar despair among many theoreticians over the lack of consensus on common concepts that seek to link true behaviour to simplistic model representations. A better foundation of animal ecology as a hard science with high predictive power could be achieved if both “camps” collaborated more closely to clarify (a) what constitutes sound criticism from empiricists of some models’ realism, (b) some unfortunate misconceptions of theoretical terms among some field ecologists, and (c) the reasons for a reluctance among many theoreticians to explore more complex kinds of space use, in particular related to spatial memory outside the “mechanistic” modelling approach.

Roe deer. Photo by AOG.

Below, I single out three aspects in particular that could benefit from better bridge building between empirical work and theory – and also within the theoretical camp.

The first bridge: Homogeneous vs. heterogeneous environment. Some ecologists (both field ecologists and theoreticians) have criticized some simulated movement conditions for lack of realism because they do not include a variable habitat. In a recent post I elaborated on this theme, and advocated that a homogeneous environment in many contexts represents an ideal initial playground for disentangling influences on movement of intrinsic (cognitive) and extrinsic (environmental) origin. In addition to simulation models, homogeneous conditions are for this reason also utilized in biological experiments. I have already posted a nice example showing how lab mice – i.e., real animals – revealed specific statistical-mechanical properties of movement in a totally homogeneous field. Likewise, simplistic simulations in a homogeneous arena may contribute to exploring and illuminating the qualitative difference between a mechanistic (“Markovian”) and a multi-scaled (“parallel processing”) kind of memory map utilization, which is an intrinsic process property. This fundamental clarification shows the power of modelling as a tool to improve theoretical ecology and produce clear and testable hypotheses. Follow-up simulations – using either the Markovian or the parallel processing framework as an assumption – should then add realism and explore more specific hypotheses by invoking habitat heterogeneity (see my book for details).

The second bridge: Random movement. In this post I described various classes of stochastic movement – for example Brownian motion, Lévy walk and Multi-scaled random walk – with their respective corner positions in the Scaling cube. In particular, Brownian motion (classic random walk) and its variant, correlated random walk, have been applied extensively in theoretical ecology to represent animal movement. However, as all field ecologists know – and all theoreticians should know – animals do not move stochastically. A physicist may describe a particle performing Brownian motion as a stochastic kind of motion, while still acknowledging that the particle’s jagged path is the result of a series of completely deterministic collision events. Each of these motion-influencing events could in principle (albeit not in practice) be described by an extensive deterministic equation. Thus, a stochastic model like Brownian motion represents an immensely effective system simplification, replacing deterministic process details with a probabilistic description of high predictive power (consider the successful theory of diffusion). Similarly, the application of stochastic representations of movement in ecological models does not necessarily imply that the modeller considers movement to be non-deterministic at the behavioural (micro-scale) level. Stochastic movement is a complementary process abstraction, allowing for a range of ecological parameter estimates from real data; for example, the diffusion rate as a function of various biological and ecological conditions.
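
As a concrete illustration of such a parameter estimate, the sketch below fits a diffusion rate from the mean squared displacement of a stand-in fix series, using the textbook relation MSD ≈ 4·D·lag for two dimensions. The data and time units are hypothetical.

```r
# Sketch: estimating a diffusion coefficient D under the Brownian motion abstraction,
# via mean squared displacement (MSD) as a function of time lag (MSD ~ 4*D*lag in 2D).
set.seed(7)
dt   <- 1                                                  # time between fixes (hypothetical units)
path <- cbind(cumsum(rnorm(2000)), cumsum(rnorm(2000)))    # stand-in for real relocations

lags <- 1:50
msd  <- sapply(lags, function(lag) {
  d <- path[(1 + lag):nrow(path), ] - path[1:(nrow(path) - lag), ]
  mean(rowSums(d^2))
})

fit <- lm(msd ~ I(lags * dt))
coef(fit)[2] / 4    # slope / 4 = estimated diffusion rate D (two-dimensional case)
```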

The third bridge: Extending the framework of stochastic modelling. All field ecologists working on vertebrates (and many invertebrates) know from first-hand experience that animals generally (a) have the cognitive capacity to utilize a memory map and (b) consequently show indications of relating to their environment over a range of spatial and temporal scales. However, from the theoretical side it is an unfortunate fact that many stochastic models either …

  • disregard spatio-temporal memory effects,
  • disregard the biophysical difference between a mechanistic and a non-mechanistic kind of movement (Markovian vs. non-Markovian; see above), or
  • show a reluctance to explore theory for a combination of memory and the observed tendency for scale-free space use.

Site fidelity and the consequent emergence of a home range – including a potential for self-reinforcing patch utilization – are prime examples where non-Markovian model design should be considered.

The common denominator for all three bridges is a need for ecologists in general to pay closer attention to the biophysics of animal space use; i.e., acknowledging the complementarity between animal behaviour and statistical-mechanical representations of this behaviour, for the sake of developing novel models with high predictive power.

Stepping Away From the Markov Lamppost

From a theoretical angle MRW represents the backbone of my proposed framework for an extended statistical-mechanical interpretation of individual movement, building on the parallel processing postulate. What is the conceptual realism of the MRW model relative to the historically and currently dominant approach – mechanistic models for animal space use?

Regarding the core question – “what is the conceptual realism of the MRW model” – I’m in this post tempted to answer it by turning the question on its head:

What is the conceptual realism of the classic framework for movement ecology?

  • The standard approach in space use modelling is to apply the mathematics and statistics from the domain of a mechanistic system representation.
  • A mechanistic system implies that the process is Markov compliant.
  • Moving away from this Markov/mechanistic approach means a need to take bold steps into the darkness away from the lamppost.
Figure: lamppost comic strip (click image for source).

Why a need to leave the comfort zone? Let us take some steps in succession. First step: What is Markov compliance?

“A Markov process can be thought of as ‘memoryless’: loosely speaking, a process satisfies the Markov property if one can make predictions for the future of the process based solely on its present state just as well as one could knowing the process’s full history. i.e., conditional on the present state of the system, its future and past are independent.” (Wikipedia).

A typical example of a Markov process is a randomly moving particle. To satisfy “random” in the apparently contradictory context of totally deterministic movement, consider that the particle’s path is sampled at a sufficiently large interval to allow for a transition from deterministic “micro-scale” behaviour (Newtonian mechanics in the present example) to a stochastic meso-scale path. At this coarser scale we have invoked a deep “hidden layer”; i.e., a temporal-scale distance from the successive movement-influencing collisions with other particles or obstacles along the fine-grained (true) path. Consequently, at this coarser temporal scale we easily get a Brownian motion (or standard correlated random walk) representation of the path – with properties that satisfy classic diffusion.
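
A toy demonstration of this transition – not a model of any particular animal – is to generate a strongly correlated fine-grained path and then sub-sample it at a coarse interval. The parameter values below are arbitrary assumptions.

```r
# Sketch: invoking a deep "hidden layer" by sub-sampling a strongly correlated
# micro-scale path at a coarse interval; the coarse-scale path behaves diffusively.
set.seed(8)
n       <- 100000
heading <- cumsum(rnorm(n, sd = 0.1))       # small turning angles: near-smooth micro-scale path
path    <- cbind(cumsum(cos(heading)), cumsum(sin(heading)))

coarse <- path[seq(1, n, by = 1000), ]      # fixes sampled far apart in time
steps  <- diff(coarse)
cor(steps[-nrow(steps), 1], steps[-1, 1])   # close to zero: successive coarse steps ~ uncorrelated
```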

Recall from above the present premise that behaviour at the micro-scale needs to be Markov-compliant. Thus, a typical mechanistic process gives rise to a typical diffusion process at the meso-scale, whatever rules are applied at the micro-scale – deterministic or stochastic, simple or more complicated.

Observe the dual mode of process description: a deterministic-mechanistic execution of movement rules and physical laws at micro-scales, and a stochastic representation of the same process at a coarser temporal resolution (you may call the latter statistical-mechanical). The Markov/mechanistic principle applies at both levels; only the observer’s temporal focus has changed and – consequently – the model formulation has flipped from the detailed behaviour-oriented description to a statistical-mechanical one (the Brownian motion/diffusion theory).

Regardless of temporal resolution, under what biological/ecological circumstances are Markov compliance and the mechanistic principle not applicable? Consider any animal that constrains its space use in a manner which we can all agree represents execution of home range behaviour. In other words, some kind of spatial memory has to be involved. One may thus ask: is this memory utilization mechanistic (Markov-compliant) or non-mechanistic?

“Animals typically are expected to move in a more complex manner than obeying a simple diffusion law. Verification of constrained space use (non-random self-crossing of the path) indicates site fidelity at the individual level and thus utilization of long-term spatial memory.” (My book, page 16).

This citation does in fact cover memory influence both under a special formulation of Markov-compliant behaviour and under non-Markovian behaviour. First, consider the Markov approach. A rapidly increasing number of mechanistic models that include site fidelity have indeed been developed. The “trick” has been to move away from the traditional centre-biased kind of movement (the Markov-compliant and memory-less convection model for home range dynamics), either by

  1. …assuming central place foraging (spatial memory, with returns to a specific location), or
  2. …letting the centre of attraction change dynamically; i.e., in a step-by-step manner (for example, Börger et al. 2008, van Moorter et al. 2009).

Operationally, to satisfy the property of a mechanistic/Markovian process, the model animal is under both modelling approaches “equipped” with an increased cognitive capacity, allowing it to “scan” its environment beyond its present perceptual field and to store past experiences in a spatially explicit manner (building a memory map in the process). In this manner the animal may successively adjust its direction and movement speed to allow for (a) its internal state (e.g., hungry or not), (b) its local conditions (what it perceives through its senses at this point in time), and (c) the added information from its memory map.

Since this adjustment process involves recalculation of the next-step directional vector at every time increment, and this recalculation at a given point in time gives the same output whether the complete database of previous step decisions is included or not, the process is indeed mechanistic and Markov-compliant – “paradoxically”, since it involves spatio-temporal memory. However, the strictly sequentially resolved movement decisions in this model design resolve the paradox. Another confirmatory sign of a Markov process is the fact that the respective influences from the terms under (a), (b) and (c) are summed (superimposed) to produce the updated next-step vector. The superposition principle only applies for a mechanistic/Markovian process.
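
To make the superposition point concrete, here is a deliberately simplified toy update rule – not the model of Börger et al. (2008) or van Moorter et al. (2009) – in which the next-step vector is a weighted sum of the three terms. All weights, helper names and the stand-in “gradient” are my own illustrative assumptions.

```r
# Toy sketch: a Markov-compliant, memory-including next-step rule where the step is a
# weighted sum (superposition) of (a) an internal-state drift, (b) a locally perceived
# resource gradient, and (c) an attraction towards remembered locations.
next_step <- function(pos, memory, w = c(a = 0.2, b = 0.5, c = 0.3)) {
  drift   <- c(0, 0)                             # (a) internal state (e.g. resting: no bias)
  grad    <- c(cos(pos[1]), sin(pos[2]))         # (b) stand-in for a local resource gradient
  towards <- colMeans(memory) - pos              # (c) pull towards the remembered patch centre
  pos + w["a"] * drift + w["b"] * grad + w["c"] * towards + rnorm(2, sd = 0.1)
}

set.seed(9)
memory <- matrix(rnorm(20, sd = 2), ncol = 2)    # previously visited locations
pos    <- c(5, 5)
for (i in 1:100) {
  pos    <- next_step(pos, memory)
  memory <- rbind(memory, pos)                   # the memory map grows with experience
}
pos
```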

In a context of site fidelity, consider that the lamppost in the comic strip above is represented by the classic convection model. Then consider that the spatial memory-extended mechanistic/Markov class of models represents the twilight zone, which the kind of approach described above contributes to lighting up. All ecologists working on animal space use should appreciate the strength of this approach in comparison to the convection model for home ranges.

However, are Markov-based, memory-including models sufficiently flexible to reflect site fidelity in a realistic manner? By realism I of course refer to the acid test: confronting model output with real data, like GPS space use representations. Since you are reading my blog and perhaps also my papers and my book, you are already aware that the Markov condition (mechanistic modelling) is not the only way to implement site fidelity. I advocate that a third step away from the lamppost should be explored, entering a zone where the mechanistic modelling approach is replaced by a framework that allows for a different kind of spatio-temporal utilization of memory.

In this specific model design, at every time increment the recalculation of the next move does not necessarily give the same output whether the complete database of previous step decisions is included or not. The current process is still connected to the past, in a manner which a Markov/mechanistic process cannot handle.

I started this post with the upside-down question “What is the conceptual realism of the classic framework for movement ecology?” Based on our own pilot tests involving a range of vertebrate species (see excerpt in my book), we have found that a realistic modelling framework should be able to reproduce the following three statistical properties of home range data:

  • The home range ghost: log[I(N)] = log(c) + z*log(N), where 0.4<z<0.6 and the grid scale for incidence calculation has been optimized (see this post and this post).
  • The scale-free space use property: Fractal dimension D of spatial scatter, 0.9<D<1.3 (for example, using the box counting method).
  • The scale-free property of the step length distribution F(L): log[F(L)] ∝ –β*log(L), where 1.8<β<2.3 and fix sampling has been performed in the relevant frequency domain (Gautestad & Mysterud 2013).

Call it the acid test of the Markov paradigm as null model in the context of animal site fidelity. So far, all our empirical pilot studies have shown compliance with these three properties. What about your data?
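
For readers who want to run this acid test themselves, a rough, self-contained sketch of diagnostics (2) and (3) is given below (z was sketched under the CSSU procedure above). The bin choices, helper names and the stand-in fix matrix are my own assumptions, and a proper analysis would of course fit the scaling range with more care.

```r
# Sketch of two of the three "acid test" diagnostics for a fix matrix 'xy' (sampling order):
# (2) fractal dimension D of the spatial scatter by box counting, and
# (3) the step-length exponent beta from log F(L) vs log L.
box_dimension <- function(xy, cells = 2^(0:6)) {
  counts <- sapply(cells, function(cell)
    length(unique(paste(floor(xy[, 1] / cell), floor(xy[, 2] / cell)))))
  -coef(lm(log(counts) ~ log(cells)))[2]          # D = -slope
}

step_exponent <- function(xy, nbins = 20) {
  L      <- sqrt(rowSums(diff(xy)^2))
  L      <- L[L > 0]
  breaks <- exp(seq(log(min(L)), log(max(L)), length.out = nbins + 1))
  F_L    <- hist(L, breaks = breaks, plot = FALSE)$density
  mid    <- sqrt(breaks[-1] * breaks[-(nbins + 1)])  # geometric bin midpoints
  keep   <- F_L > 0
  -coef(lm(log(F_L[keep]) ~ log(mid[keep])))[2]   # beta = -slope
}

set.seed(10)
xy <- cbind(cumsum(rnorm(2000)), cumsum(rnorm(2000)))  # stand-in for real fixes
box_dimension(xy)     # MRW-compliant data expected in the range 0.9 < D < 1.3
step_exponent(xy)     # MRW-compliant data expected in the range 1.8 < beta < 2.3
```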

REFERENCES:

Börger, L., B. Dalziel, and J. Fryxell. 2008. Are there general mechanisms of animal home range behaviour? A review and prospects for future research. Ecology Letters 11:637-650.

Gautestad, A. O., and A. Mysterud. 2013. The Lévy flight foraging hypothesis: forgetting about memory may lead to false verification of Brownian motion. Movement Ecology 1:1-18.

van Moorter, B., D. Visscher, S. Benhamou, L. Börger, M. S. Boyce, and J.-M. Gaillard. 2009. Memory keeps you at home: a mechanistic model for home range emergence. Oikos 118:641-652.