Interesting Science

Reservoir concepts: axioms and prepositions – Guest post by Rebecca Mancy

Haydon et al.’s (2002) “reservoirs framework” paper provides a structure for understanding reservoirs of infection by distinguishing between maintenance, source and target populations and clarifying the relationship between them (see Viana et al. 2014 Box 1, Figure I for an open access explanation). Reading it for the first time a few years ago, I found myself drawn into testing the structures and the relationships between them, generating new examples, verifying that the framework provided was sufficient to describe them, and checking whether they were topologically equivalent to those depicted in the figure. It felt like learning a new axiomatic system in mathematics. At the time, I paid very little attention to the terminology. The language of squares, circles and arrows seemed sufficient.


The Black Death, probably caused by Yersinia pestis and depicted here on a panel of the Great Tapestry of Scotland, has long been one of the most iconic examples of a pandemic implicating a reservoir of infection, and explanations of its mechanisms still attract considerable scientific interest. Photo: Alex Hewitt/Trustees of the Great Tapestry of Scotland (GTOS)

Although our recent update (Viana et al., 2014) focused primarily on reviewing threads of evidence and discussing ways in which they might be woven into a tapestry to allow us to better identify reservoir systems, its writing has also led to new discussions about the framework and the structural relationships it encompasses, and to discussions of terminology. Most of the latter have focused on our use of the term ‘reservoir’. I’ve spent the last couple of hours trawling online dictionaries and etymological sources in order to better understand our use of the word: a foray into another of my favourite worlds, language. Understanding the different meanings of the word might also help us to understand some of the uncertainties that have been raised about the framework.

Most free online English language resources, such as Merriam-Webster and Online Etymology Dictionary, provide fairly limited information on the etymology of the word reservoir beyond a reference to its French origins. The Centre National de Ressources Textuelles et Lexicales, a CNRS resource centre, helps us to trace the term a little further back. According to the entry in their etymological dictionary, the first recorded use of the term dates back to 1510 when it was used to refer to a receptacle for holding a liquid. By 1547, it was being used more generally as a space fitted out for the conservation and storage of provisions, and by 1601 had adopted a figurative meaning, being used to refer to anything capable of serving as a repository. Despite these subtle changes, these uses all relate to the notion of a container employed for purposeful storage. Perhaps surprisingly, it was only in 1742 (in French; and slightly earlier in English according to the OED) that it took on the meaning of a place serving as a natural reserve of something. Yet among modern definitions, even if we exclude epidemiological meanings, we find a third use of the term as a supply of something in which the reservoir no longer refers to the container but to a resource that is contained. I suspect this is the sense in which Acheson employed it in explaining that Winston Churchill

“… still had his glorious sense of words drawn from the special reservoir from which Lincoln also drew, fed by Shakespeare and those Tudor critics who wrote the first Prayer Book of Edward VI and their Jacobean successors who translated the Bible.”

Dean Gooderham Acheson (1961) Of Winston Churchill in Sketches from Life of Men I Have Known.

Actually, what alerted me to the different meanings was not the etymology at all, but prepositions, something else that Churchill is reputed to have been sensitive to. What none of the above sources note is that the distinction between the different meanings of the word reservoir can be detected in its association with particular prepositions. When using it in the sense of a purposeful receptacle, we use the preposition for, such as when we refer to a ‘reservoir for heating oil’; in the sense of a natural reservoir, we would generally employ the preposition of, as we might if talking about a ‘subterranean reservoir of natural gas’. In the case where we want to emphasise the idea that a natural reservoir serves as a supply, we use both prepositions, but the meaning of the word for now changes. In the phrase ‘a subterranean reservoir of natural gas for the population of Scotland’, the word for refers not to the natural gas (as it did in the heating oil example) but to the population due to receive the gas.

In the epidemiological context, the equivalent of a reservoir of natural gas for a population would look something like

A reservoir of [infectious agent] for [target population].

And yet, the epidemiological literature is replete with examples of the equivalent of ‘a reservoir for natural gas’ (i.e. a receptacle into which one puts natural gas). A search in my Mendeley library brings up a list of examples: ‘a potential reservoir for Leishmania’, ‘a reservoir for a coronavirus’, ‘the reservoir for the origin of the SARS epidemic’, ‘a reservoir for emerging infectious diseases’, ‘a reservoir for rabies’ and ‘a reservoir for bovine tuberculosis’. When we write in this way, I am sure that we are simply being imprecise rather than implying a sense of human purpose in the maintenance of these reservoirs. But we really should try to use language a bit better than that.

But how does this distinction relate to the question of how we describe structures using the reservoir framework? Firstly, it explains why we choose to refer to the target in the definition of a reservoir. Basing our definition on that of Haydon et al. (2002), we explain that “A ‘reservoir of infection’ is defined with respect to a target population as ‘one or more epidemiologically connected populations or environments in which a pathogen can be permanently maintained and from which infection is transmitted to the target population’”. Thus, according to the framework, referring to a reservoir without reference to a target constitutes under-specification. Obviously, without maintenance there would be no reservoir; but equally, if there were no target population into which disease spills over then the term maintenance population would fully characterise the system and there would be no need to refer to a reservoir. For example, for a multi-host pathogen such as the virus causing foot-and-mouth disease, referring to buffalo as ‘the reservoir’ makes little sense because the system is under-specified: if we complete the definition by specifying a target, the factual accuracy of the statement “buffalo are the reservoir of FMDV for <target>” depends on the particular target we choose.

More precisely, the framework in Haydon et al. (2002) should be thought of as serving to describe not just reservoirs, but target-reservoir systems. According to the framework, populations and communities are classified in two ways. Firstly, according to their maintenance status as either capable of maintaining the pathogen in the long term or incapable of doing so; and secondly, according to their role in transmission between populations within the target-reservoir system as target, source, or neither. The simplest way to characterise the full system is then to view these dimensions as orthogonal: every population has an attribute from each of the two dimensions.
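The two-way classification can be written down as a small data structure. What follows is only a sketch: the class and function names are my own inventions for illustration, not anything defined in Haydon et al. (2002), and the populations T, V and P are hypothetical labels rather than real hosts.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Maintenance(Enum):
    """First dimension: can the population maintain the pathogen long term?"""
    MAINTENANCE = "maintenance"
    NON_MAINTENANCE = "non-maintenance"


class Role(Enum):
    """Second dimension: role in transmission within the target-reservoir system."""
    TARGET = "target"
    SOURCE = "source"
    NEITHER = "neither"


@dataclass(frozen=True)
class Population:
    name: str
    maintenance: Maintenance
    role: Role


def reservoir_phrase(agent: str, target: Population) -> str:
    """Render the fully specified form: 'a reservoir of <agent> for <target>'."""
    if target.role is not Role.TARGET:
        raise ValueError(f"{target.name} is not the target population")
    return f"a reservoir of {agent} for {target.name}"


# Hypothetical populations, labelled as in a schematic.
target = Population("T", Maintenance.MAINTENANCE, Role.TARGET)  # a target may itself be maintenance
vector = Population("V", Maintenance.NON_MAINTENANCE, Role.SOURCE)
other = Population("P", Maintenance.MAINTENANCE, Role.NEITHER)

# Because the dimensions are orthogonal, every pairing is a legitimate label.
all_combinations = list(product(Maintenance, Role))
print(len(all_combinations))
print(reservoir_phrase("pathogen X", target))
```

Making the target a required argument of `reservoir_phrase` mirrors the point about under-specification: in this sketch there is simply no way to name a reservoir without saying which target it is a reservoir for.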

This construction helps to answer a number of questions that have arisen in discussion with colleagues. For example, it means we may still wish to refer to a reservoir even if the target population is capable of maintenance (or R0 in the target is greater than one). This would be the case if some infections in the target came from other maintenance populations in the system. Furthermore, a source population can be maintenance or non-maintenance. Source populations that are not capable of maintaining a pathogen alone can form an essential or inessential part of a maintenance community, or simply assist in the transfer from the maintenance population to the target. In fact, all three possibilities might be involved in the transmission and persistence of the plague bacterium, Yersinia pestis, in relation to different flea species and mammalian host communities (Eisen & Gage, 2009; Webb, Brooks, Gage, & Antolin, 2006). One might ask, as Ashford (2003) has, whether or not vectors that do not contribute to pathogen maintenance should be included in the reservoir. As Ashford notes, this particular point could be argued either way; nonetheless, distinguishing between types of vectors is important when designing interventions.

Fundamentally, there are two ways to protect the target: either we prevent maintenance, or we prevent transmission from the maintenance community to the target. As we explain in Viana et al. (2014), there are various ways to achieve these aims, which we refer to as press, pulse and block. However, this categorisation focuses on the implementation rather than the aim. For example, a pulse intervention may consist of culling (to prevent maintenance) or vaccination (to prevent transmission to the target); a block action may employ fences erected between non-target, non-maintenance populations (to prevent community-level maintenance) or between a maintenance community and the target (to prevent transmission to the target).


Figure 1. Simple reservoir-target systems showing the three kinds of vectors. T denotes the target population, V the vector source and P an additional population involved in the target-reservoir system. Arrows indicate transmission between populations, circles represent non-maintenance populations while squares are maintenance populations; maintenance communities are shown with a dashed outline.

To come back to the importance of distinguishing between vectors that are involved in maintenance and those that are not, an interesting case arises. In Figure 1, although eliminating the vector is effective for different reasons in the three cases, the set of interventions is actually identical (eliminate population P, eliminate the vector population V, block transmission link a, block transmission link b). In the first and third cases, eliminating V is effective because it breaks the transmission link to the target; in the second case, its elimination also prevents maintenance in the community consisting of the vector V2 and population P2.

Ultimately, perhaps the most important questions about the definition and associated framework relate not to the word reservoir in the definition, but to how generally the framework applies. For example, should we use it for situations such as environmental persistence without pathogen reproduction? It seems fairly natural to apply it to parasites, but should it extend to organisms such as toxin-producing algae or fungi that do not require living matter in order to reproduce? These are all fascinating questions and it should be fun thinking about whether and how best to integrate them.

(With thanks to Daniel Haydon and Mafalda Viana for comments.)

[RRK – Comments can be made here, or addressed directly to rebecca.mancy A T glasgow.ac.uk]

References

Ashford, R. W. (2003). When Is a Reservoir Not a Reservoir? Emerging Infectious Diseases, 9(11), 1495–1496.

Eisen, R. J., & Gage, K. L. (2009). Adaptive strategies of Yersinia pestis to persist during inter-epizootic and epizootic periods. Veterinary Research, 40(1), 1–14.

Haydon, D. T., Cleaveland, S., Taylor, L. H., & Laurenson, M. K. (2002). Identifying reservoirs of infection: a conceptual and practical challenge. Emerging Infectious Diseases, 8(12), 1468–73. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2738515&tool=pmcentrez&rendertype=abstract

Viana, M., Mancy, R., Biek, R., Cleaveland, S., Cross, P. C., Lloyd-Smith, J. O., & Haydon, D. T. (2014). Assembling evidence for identifying reservoirs of infection. Trends in Ecology & Evolution, 29(5), 270–279. doi:10.1016/j.tree.2014.03.002

Webb, C. T., Brooks, C. P., Gage, K. L., & Antolin, M. F. (2006). Classic flea-borne transmission does not drive plague epizootics in prairie dogs. Proceedings of the National Academy of Sciences of the United States of America, 103(16), 6236–41. doi:10.1073/pnas.0510090103

 

Sherlock Holmes and the deductive paradigm of forensic epidemiology.

No blogs for ages, and then two in one week …

Deductive logic at its finest

Earlier this year, Dom Mellor was giving a talk to the epidemiology group at Glasgow, where he started by saying that, in his view, Sherlock Holmes represented the perfect example of forensic epidemiology. In a sense he was right, and at least some of you will know that it is commonly believed that the Holmesian forensic technique was based on Conan Doyle’s experiences as an Edinburgh medical student, where the medical doctor and University Professor Joseph Bell impressed the young student in his lectures. It was said that “all Edinburgh medical students remember Joseph Bell – Joe Bell – as they called him. Always alert, always up and doing, nothing ever escaped that keen eye of his. He read both patients and students like so many open books. His diagnosis was almost never at fault.” Sherlock Holmes’s most famous quote, taken from The Sign of the Four, “When you have eliminated the impossible, whatever remains, however improbable, must be the truth”, is the iconic expression of deductive logic, and it could be said to be the ultimate goal of forensic epidemiology. I recall a colleague saying to me “Identify and eliminate the source of infection, and you eliminate the epidemic”, apparently quoting from the highly respected veterinary epidemiologist, Prof. Mike Thrusfield at the Dick Vet School in Edinburgh, though I cannot comment on the accuracy of the quote. Of course, it is also well recognised that it would usually be impossible to be as sure as Sherlock Holmes in real life, but this nevertheless represents a sort of platonic ideal of forensics.

“Balance of probabilities, little brother” Mycroft Holmes, The Empty Hearse (from http://www.bbc.co.uk)

Move forward a century and more, and the hugely popular TV series ‘Sherlock’ presents a modern updating of the old stories, an updating which, to my great surprise, I have thoroughly enjoyed. In the third series, in the episode ‘The Empty Hearse’, Sherlock and his older, more intelligent brother Mycroft are engaged in a contest to characterise a man from only his woolly hat. In this contest Sherlock queries one of Mycroft’s “deductions”, to which Mycroft replies “Balance of probabilities, little brother.” Now this statement is decidedly un-Holmesian – in the world of Arthur Conan Doyle’s Sherlock Holmes, probabilities have nothing to do with it. This statement is, in fact, one of inductive logic. And it could be argued that the mathematical and statistical modelling of infectious diseases lies very much more in this inductive tradition. Not so much concerned with identifying the single chain of transmission, modelling traditionally concentrates on the identification of general, population-level principles of transmission, and an overall ‘balance of probabilities’ of getting the right pattern.

These two traditions, that of the forensic epidemiologist and that of the mathematical/statistical epidemiologist, do not sit easily together, and indeed it could be argued that much of the controversy over the 2001 foot-and-mouth disease (FMD) epidemic in Great Britain can be attributed to precisely that clash of cultures.


Phylodynamic reconstruction of a cluster of cases from the 2001 FMD epidemic in Great Britain. (A) Identified likelihood that a particular infected premises was the source of another infected premises based on a space–time–genetic model. Circle size is proportional to the relative likelihood of that event. (B) Spatial relationships among premises in the dataset. Adapted from Morelli et al. PLoS Pathogens 2012.

Now, however, the integration of rapid high-throughput sequencing of pathogens allows us to trace at a very fine scale the movement of pathogens from place to place, and even from individual to individual. Combined with mathematical models, this can often lead to very precise identification of likely sources of infection. The figure here is taken from a paper by Marco Morelli, written while he was working with Dan Haydon at Glasgow, illustrating precisely that kind of analysis using data from the 2001 FMD epidemic. Of course, the most likely source under one model of transmission is not necessarily proof that the relationship is the true one (e.g. what if another model gives an equally strong but different prediction?) and there are many challenges still to be addressed. Despite these issues, the future is bright and it is just possible that, through these new technologies and approaches, we can at last approach that Holmesian ideal.

The Goldsboro Incident really happened! Nonlinearity, mathematical and statistical models

 

Opening up the paper today, I was pleased to see this story on the front page of the Guardian, about the Goldsboro incident in January 1961. Why pleased? Well, for years the Goldsboro incident has been my analogy of choice for explaining the difference between linearity and nonlinearity, based on an interpretation of nonlinearity inspired by George Sugihara on physical vs. biological noise. I’ve always prefaced this analogy by saying that it was unconfirmed but useful – and now it appears to be true! So what happened in Goldsboro? From the companion piece in the Guardian:

The document, obtained by the investigative journalist Eric Schlosser under the Freedom of Information Act, gives the first conclusive evidence that the US was narrowly spared a disaster of monumental proportions when two Mark 39 hydrogen bombs were accidentally dropped over Goldsboro, North Carolina on 23 January 1961. The bombs fell to earth after a B-52 bomber broke up in mid-air, and one of the devices behaved precisely as a nuclear weapon was designed to behave in warfare: its parachute opened, its trigger mechanisms engaged, and only one low-voltage switch prevented untold carnage.


The conventional interpretation of nonlinearity. Doubling the input either more than doubles (e.g. oversteer in a car) or less than doubles (e.g. understeer) the response.

Our formal understanding of nonlinearity is based on the idea that, if we consider a response to an input, doubling the input will result, if there is a linear response, in a doubling of the response. Thus if I press the accelerator on my car twice as hard, I might expect to travel (approximately) twice as fast. In a nonlinear response, the return is either more than or less than twice the original response. However, an alternative understanding of nonlinearity is illustrated by the Goldsboro Incident, where the difference between 5 of 6 safeties failing, and 6 of 6, is the difference between an incident quietly swept under the rug for 50 years, and a monumental disaster.


The difference between 5 switches being triggered and 6 is the difference between a hole in the ground and a nuclear explosion.

This interpretation of nonlinearity can be viewed in terms of the difference between multiplication and addition. We are quite good at predicting additive phenomena; the problem is, we are less proficient when it comes to multiplication. The recent story of the death of four-year-old Daniel Pelka (and this is a type of story repeated with tragic Sisyphean regularity) is a case in point. How could this happen? How could so many safety checks fail? How could so many people miss the warning signs? The truth of the matter is likely to be that there are many, many more cases where “the system” almost fails, but with no observable consequence. Overburdened, pressurised staff, sometimes undermotivated or under pressure not to raise alarms unnecessarily, may cut corners or make mistakes far more often than we are aware. It is also likely true that because there is no immediate consequence to these actions (the effect of nonlinearity) the potential for disaster is missed. The question may in fact not be why this happens, but why it does not happen more often.
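The contrast between addition and multiplication, and the point about near misses, can be put into a few lines of code. This is a toy sketch only: it assumes six safeties that fail independently with the same probability, and the value 0.1 is invented for illustration rather than taken from any account of Goldsboro or of any real system.

```python
from math import comb, prod


def additive_response(failed: list[bool]) -> float:
    """Each failed safety contributes an equal share of the outcome."""
    return sum(failed) / len(failed)


def multiplicative_response(failed: list[bool]) -> int:
    """The outcome occurs only if every safety fails (a product of 0/1 terms)."""
    return prod(int(f) for f in failed)


def prob_exactly_k_fail(n: int, k: int, p: float) -> float:
    """Binomial probability that exactly k of n independent safeties fail."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


five_of_six = [True] * 5 + [False]
six_of_six = [True] * 6

# Additive view: 5 of 6 looks almost as bad as 6 of 6.
print(additive_response(five_of_six), additive_response(six_of_six))
# Multiplicative view: nothing at all happens until the last safety fails.
print(multiplicative_response(five_of_six), multiplicative_response(six_of_six))

# Assumed, purely illustrative failure probability per safety.
p = 0.1
near_miss = prob_exactly_k_fail(6, 5, p)
disaster = prob_exactly_k_fail(6, 6, p)
print(near_miss / disaster)  # expected number of near misses per disaster
```

Under these assumptions the ratio of near misses to disasters is 6(1 - p)/p, so the rarer the individual failures, the more unobserved near misses stand behind each visible disaster.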

And this leads us to the concept of extrapolation and mathematical and statistical models. Statistical models are fantastically valuable tools for rigorously describing relationships in data. However, they are fundamentally ontological in nature; that is, built to classify rather than to explain mechanisms, and thus the ultimate arbiter of the quality of a statistical model is the fit to the data. Of course, in designing the statistical model and in interpreting it, a good scientist will be aware of the existence of these underlying mechanisms. This awareness will drive both experimental design and observation, and the interpretation of the statistics. However, these considerations lie outside the statistical model itself. In contrast, mathematical models should be phenomenological, i.e. built to directly describe the often nonlinear relationships between variables, and therefore they are better suited to extrapolate or predict away from the data, rather than interpolate. What is often not understood is that even very good mathematical models may give an inferior fit to a statistical model within close bounds of the data – the aim is not to develop the best fit to the data, but to be better able to predict what may occur when moving farther away from known data.


Mathematical models can often provide a poor fit to the data, but, if formulated to appropriately describe a fundamental aspect of the data, can provide insight into possible trends as we move away from the known data.
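The trade-off between fit and extrapolation can be sketched with synthetic data. Everything here is assumed for illustration: the data come from a logistic curve with made-up parameters plus a small alternating error, the "descriptive" model is an interpolating polynomial (a deliberately extreme stand-in for a model judged purely on fit, not a realistic statistical model), and the logistic model is fitted by a crude grid search.

```python
from itertools import product
from math import exp


def logistic(t: float, K: float, r: float, t0: float) -> float:
    """Model with built-in structure: growth that saturates at a ceiling K."""
    return K / (1.0 + exp(-r * (t - t0)))


def lagrange(ts: list[float], ys: list[float], t: float) -> float:
    """Descriptive curve: the polynomial passing through every data point."""
    total = 0.0
    for i, (ti, yi) in enumerate(zip(ts, ys)):
        weight = 1.0
        for j, tj in enumerate(ts):
            if j != i:
                weight *= (t - tj) / (ti - tj)
        total += yi * weight
    return total


def sse(pred: list[float], obs: list[float]) -> float:
    """Sum of squared errors between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs))


# Synthetic data: a known logistic curve plus a small alternating error.
TRUE = (100.0, 1.0, 5.0)  # assumed K, r, t0
ts = [float(t) for t in range(7)]
obs = [logistic(t, *TRUE) + (1.0 if i % 2 == 0 else -1.0) for i, t in enumerate(ts)]

# Fit the logistic model by a coarse grid search over its three parameters.
grid = product([80.0, 90.0, 100.0, 110.0, 120.0], [0.8, 0.9, 1.0, 1.1, 1.2], [4.5, 5.0, 5.5])
best = min(grid, key=lambda prm: sse([logistic(t, *prm) for t in ts], obs))

fit_mech = sse([logistic(t, *best) for t in ts], obs)
fit_desc = sse([lagrange(ts, obs, t) for t in ts], obs)

t_new = 10.0  # well outside the observed range 0..6
truth = logistic(t_new, *TRUE)
err_mech = abs(logistic(t_new, *best) - truth)
err_desc = abs(lagrange(ts, obs, t_new) - truth)

print(f"in-sample SSE: descriptive {fit_desc:.3f}, logistic {fit_mech:.3f}")
print(f"error at t={t_new}: descriptive {err_desc:.1f}, logistic {err_mech:.1f}")
```

The polynomial reproduces every observation exactly while the logistic curve does not, yet away from the data the polynomial's error is far larger, because nothing in it encodes the saturation that generated the data.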

Of course, this is at best a caricature of both mathematical and statistical models, with modern quantitative sciences using combinations of both in various ways. Nevertheless, there is a fundamental difference between models that aim to describe and models that aim to explain, a difference that must be considered when evaluating the interpretation of any model.

Nonlinearity is a critical concept in ecology, evolution and epidemiology. The emergence of new pathogens is one example of this. In a paper a few years ago, Nim Pathy and Angela McLean used a theoretical model to ask whether or not a pathogen (in this case, avian influenza) that has caused hundreds of cases but with little transmission indicates that the species barrier cannot be crossed. Another way of looking at this question is to ask which is worse, 4 introductions of avian flu into humans from birds, or a single introduction, where a chain of 4 infections in humans occurs but the disease then fails? Extrapolation from currently observed data requires an insight into the underlying mechanisms that drive the phenomenon to be understood (in this case, the emergence of a new human pathogen). What Pathy and McLean showed using nonlinear mathematical models was that a lack of demonstrated transmission cannot rule out the possibility of adaptability, regardless of how many zoonoses have occurred – thus even when we think we are safe, we are not necessarily so.

Of course, while I am (unsurprisingly) a keen proponent of the use of mathematical models, it must always be kept in mind that prophecy is difficult, and the biblical admonition against following false prophets reflects the popularity of trying to predict the future, the frequency of our failures, and the ease with which we can be led into following those predictions, especially when espoused by recognised experts.