Ecological Modelling 174 (2004) 19–24
Ecosystem as a text: information analysis of the global vegetation pattern Yuri M. Svirezhev∗ Potsdam Institute for Climate Impact Research, P.O. Box 601203, D-14412 Potsdam, Germany
Abstract

Let us consider a text written in the English language. At the zero level of perception (or description), we know only the number of symbols ($n = 29$: 26 letters, 1 blank, 1 comma and 1 full stop). Then the information per symbol is $I^{(0)} = \log_2 29 \approx 4.85$ bits. At the first level of perception we take into account the frequencies of the symbols (letters); then $I^{(1)} \approx 4.03$ bits. At the next levels we take into account double, triple, etc. correlations, i.e., words of two letters, three letters, etc., so that the information per letter is $I^{(2)}/2 \approx 3.32$ bits, $I^{(3)}/3 \approx 3.10$ bits, and so on. The redundancy of information, $R^{(i)}$, and its cost, $C^{(i)}$, are equal to $R^{(0)} = 0$, $R^{(1)} = 0.15$, $R^{(2)} = 0.30$, $R^{(3)} = 0.35$, and $C^{(0)} = 1$, $C^{(1)} = 1.18$, $C^{(2)} = 1.43$, $C^{(3)} = 1.54$.

Let us consider a description of the global vegetation pattern (GVP). At the zero level of description we have a number of significant biomes or vegetation types, $n$. In accordance with Bazilevich, $n = 30$; then $I^{(0)} = \log_2 30 \approx 4.9$ bits. At the first level we take into account the relative areas covered by biomes. Then $I^{(1)} \approx 4.41$ bits, $R_1 = 0.1$, and $C_1 = 1.11$. At the next level of description we consider the spatial correlations between different pairs of biomes; then $I^{(2)}/2 \approx 3.6$ bits, $R_2 = 0.265$, and $C_2 = 1.36$.

Note that in addition to their areas, the biomes are also characterised by three parameters: annual productivity $P_i$, living biomass $B_i$ and dead organic matter $D_i$. Weighting them in relation to the biome area $\sigma_i$, we define the frequencies $p_i^P = P_i\sigma_i / \sum_{j=1}^{30} P_j\sigma_j$, $p_i^B = B_i\sigma_i / \sum_{j=1}^{30} B_j\sigma_j$ and $p_i^D = D_i\sigma_i / \sum_{j=1}^{30} D_j\sigma_j$. The description in terms of productivity, living or dead organic matter can be considered as a description at the first level with information per letter defined as $(I^{(1)})^{P,B,D} = -\sum_{i=1}^{n} p_i^{P,B,D} \log_2 p_i^{P,B,D}$. Then (1) for productivity: $(I^{(1)})^P = 3.61$ bits, $C_2^P = 1.32$, $R_2^P = 0.24$; (2) for living biomass: $(I^{(1)})^B = 3.27$ bits, $C_2^B = 1.5$, $R_2^B = 0.33$; (3) for dead organic matter: $(I^{(1)})^D = 4.13$ bits, $C_2^D = 1.17$, $R_2^D = 0.16$. One can see that information about the biome productivity, living biomass and dead organic matter is more valuable than the information about the distribution of biome areas. Information about the distribution of living biomass has the maximal cost ($C_2^B = 1.5$).

© 2004 Elsevier B.V. All rights reserved.

Keywords: Information theory; Global vegetation pattern
∗ Tel.: +49-331-288-2671; fax: +49-331-288-2695.
0304-3800/$ – see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.ecolmodel.2003.12.041

1. Introduction: some basic concepts

Ludwig Wittgenstein said that any physical object or system could be represented as a text, written in a special language with a proper alphabet and grammar.

Let the text be a single word of length $N$. If the alphabet contains $n$ symbols (for English $n = 29$: 26 letters, 1 blank, 1 comma and 1 full stop), then the $i$th symbol is repeated in the word $N_i$ times, $i = 1, \ldots, n$, with $\sum_{i=1}^{n} N_i = N$. The total number of different words that can be formed from these $N$ symbols of the $n$-symbol language is equal to $W = N!/\prod_{i=1}^{n} N_i!$. Then the total information contained in the word (text) is equal to $I = -N \sum_{i=1}^{n} p_i \log_2 p_i$ (in bits), where $p_i = N_i/N$. The specific information, i.e., the information per symbol,
will be equal to

$$I_s = -\sum_{i=1}^{n} p_i \log_2 p_i. \tag{1}$$

This is the so-called Shannon measure of information, or the information entropy (Shannon and Weaver, 1963). Since receiving information reduces uncertainty, the Shannon concept can be formulated as: information per symbol = mean value of uncertainty per symbol.

In the Shannon concept symbols (letters) are considered as the primary elements of a language. However, a text can consist of separate words, so that the words (not letters) could be considered as such primary elements. For instance, if the alphabet contains $n$ letters then $n^r$ words of $r$ letters can be constructed from the alphabet. If $p_{ij\ldots k}$ (with $r$ indices) is the probability of formation of a given $r$-letter word, then the information entropy of $r$th order will be equal to

$$I^{(r)} = -\sum_{i,j,\ldots,k=1}^{n} p_{\underbrace{ij\ldots k}_{r}} \log_2 p_{\underbrace{ij\ldots k}_{r}}, \quad r = 1, 2, \ldots. \tag{2}$$

It is obvious that $I^{(1)} = I_s$, i.e., Shannon's entropy. Under the assumption that the source of information is stationary and generates an ergodic Markov sequence it can be shown that

$$I^{(r+1)} \le \frac{r+1}{r}\, I^{(r)} \quad \text{and} \quad I = \lim_{r\to\infty} \frac{I^{(r)}}{r}. \tag{3}$$

Let us consider a text written in the English language. At the zero level of perception (or description) we know only the number of symbols ($n = 29$). Then the information per symbol is $I^{(0)} = \log_2 29 \approx 4.85$ bits. At the next level of perception we take into account the frequencies of the symbols (letters); then $I^{(1)} \approx 4.03$ bits. At the following levels, when we take into account double, triple, etc. correlations, i.e., words of two letters, three letters, etc., we get the following values of information per symbol (Ebeling et al., 1990):

$$\frac{I^{(2)}}{2} \approx 3.32 \text{ bits}; \quad \frac{I^{(3)}}{3} \approx 3.10 \text{ bits}; \quad \ldots$$

We see that the amount of information per symbol decreases as we pass from lower to higher levels. This implies that at each level there is redundant information. Certainly, only non-redundant information has a cost, but the repetition of information, its redundancy, provides the reliability of its transmission, protecting the text from errors and from destruction by noise. The redundancy of information at the $r$th level of perception (or description) can be defined (Klix, 1974) as

$$R^{(r)} = 1 - \frac{I^{(r)}}{\max I^{(r)}}, \quad r = 1, 2, \ldots, \tag{4}$$

where $\max I^{(r)} = r \log_2 n$. For the zero level the redundancy is defined as $R^{(0)} = 0$. Then the corresponding redundancies in English will be equal to $R^{(0)} = 0$, $R^{(1)} = 0.15$, $R^{(2)} = 0.30$, $R^{(3)} = 0.35$. The latter, for instance, implies that only 35% of letters are redundant at the third level, i.e., 65% of randomly distributed letters are sufficient for the understanding of the text. The cost of information can be defined as the degree of non-redundancy (Volkenstein, 1988):

$$C^{(r)} = \frac{1}{1 - R^{(r)}}. \tag{5}$$

Then for each level we have $C^{(0)} = 1$, $C^{(1)} = 1.18$, $C^{(2)} = 1.43$, $C^{(3)} = 1.54$.

We used here one of the simplest definitions of the cost of information. In fact, the question "What is the cost of information?" is, in spite of continuing discussion, still far from settled. This discussion falls outside the framework of our article, but nevertheless we shall cite one example. Suppose there is some aim. Let the probabilities of its attainment before and after receiving information be equal to $P_0$ and $P_1$, respectively. Then the cost of information is equal to $C = \log_2(P_1/P_0)$ (Kharkevich, 1963). However, if the aim is unattainable without information ($P_0 = 0$) then the cost of any finite information is equal to infinity. This is difficult to interpret.
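The Shannon measure, the redundancy and the cost defined above can be illustrated with a minimal Python sketch. The function names (`entropy_per_symbol`, `redundancy`, `cost`) and the toy string are illustrative assumptions, not part of the original analysis; the redundancy values passed to `cost` are the ones quoted in the text for English.

```python
import math
from collections import Counter


def entropy_per_symbol(text):
    """Shannon information per symbol, -sum p_i log2 p_i, in bits."""
    counts = Counter(text)
    n_total = len(text)
    return -sum((c / n_total) * math.log2(c / n_total) for c in counts.values())


def redundancy(info_r, n_symbols, r=1):
    """Redundancy R = 1 - I(r) / max I(r), with max I(r) = r * log2(n)."""
    return 1.0 - info_r / (r * math.log2(n_symbols))


def cost(redundancy_value):
    """Cost of information C = 1 / (1 - R)."""
    return 1.0 / (1.0 - redundancy_value)


# Toy text (hypothetical example): 5 distinct symbols, 11 symbols in total.
sample = "abracadabra"
i_s = entropy_per_symbol(sample)
print(f"information per symbol of the toy text: {i_s:.3f} bits")

# Costs that correspond to the redundancies quoted in the text for English.
costs = [round(cost(r), 2) for r in (0.0, 0.15, 0.30, 0.35)]
print("costs for R = 0, 0.15, 0.30, 0.35:", costs)
```

Rounded to two decimals, the costs obtained from the quoted redundancies are 1.0, 1.18, 1.43 and 1.54, which agrees with the values given above.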
2. Information analysis of the global vegetation pattern

If we look at a standard botanical description of some territory we can see that it contains, firstly, a list of species (types, forms, etc.) of plants represented in the territory, and secondly, the percentage of cover, $p_i$, i.e., the percentage of the total territory covered by the $i$th species. This is a typical linguistic construction, in which the alphabet of the corresponding language is formed by the names of all the species contained in the list. It is therefore natural to apply information methods to its analysis, in particular to estimate the redundancy and the cost of the information contained in the text at different levels of description.

Now let us consider a description of the global vegetation pattern (GVP). At the zero level of description we have a list of biomes, or vegetation types. In accordance with Walter (1964, 1968) and Bazilevich (1973, 1993) (see also Svirezhev, 2002), the number of different biomes is equal to 30; they are listed in Table 1. The corresponding biome maps can be found in the above-cited books. If we know nothing about these biomes except their names and number, then the natural assumption at the zero level of description (and perception) is that they are absolutely equivalent in the list. If we consider the list as an alphabet with 30 letters, then the information per letter is equal to $I^{(0)} = \log_2 30 \approx 4.9$ bits.

However, the biomes are not in fact equivalent. Certain biomes occupy large areas, while the areas of others are negligibly small (for instance, the 9th and 28th biomes in Table 1), the production of some biomes is much higher than the production of others, etc. In other words, in reality the GVP is a hierarchical structure
with respect to different characteristics, many of which are known (see Table 2). Therefore, at the next (first) level of description all these properties might be taken into account. For instance, using the data from Table 2 we can calculate the relative areas of biomes, $p_i = \sigma_i / \sum_{j=1}^{30} \sigma_j$, where $\sigma_i$ is the total area of the $i$th biome. The information per letter at this level of description is equal to $I^{(1)} = -\sum_{i=1}^{30} p_i \log_2 p_i = 4.41$ bits. The redundancy of information at this level is equal to $R^{(1)} = 1 - (4.41/4.9) = 0.1$, and its cost is $C^{(1)} = 1/(1 - 0.1) = 1.11$. These values are calculated by using Eqs. (4) and (5).

Table 1
Different types of vegetation (biomes)

| No. | Vegetation type | No. | Vegetation type |
|---|---|---|---|
| 1 | Polar desert | 16 | Dry steppe |
| 2 | Tundra | 17 | Sub-boreal desert |
| 3 | Mountainous tundra | 18 | Sub-boreal saline desert |
| 4 | Forest tundra | 19 | Subtropical semi-desert |
| 5 | North taiga | 20 | Subtropical desert |
| 6 | Middle taiga | 21 | Mountainous desert |
| 7 | South taiga | 22 | Alpine and sub-alpine meadows |
| 8 | Temperate mixed forest | 23 | Evergreen tropical rain forest |
| 9 | Aspen-Birch lower taiga | 24 | Deciduous tropical forest |
| 10 | Deciduous forest | 25 | Tropical xerophytic woodland |
| 11 | Subtropical mixed forest | 26 | Tropical savannah |
| 12 | Xerophytic woods and shrubs | 27 | Tropical desert |
| 13 | Forest steppe | 28 | Mangrove forest |
| 14 | Temperate dry steppe | 29 | Saline land |
| 15 | Savannah | 30 | Subtropical and tropical woodland |

Table 2
Annual net primary production P (kg C/(m² year)), density of living biomass B (kg C/m²) and density of dead organic matter D (kg C/m², in 1 m soil) for different types of vegetation (biomes)

| Biome type (No. in Table 1) | Biome area (×10⁶ km²) | P | B | D |
|---|---|---|---|---|
| 1 | 2.55 | 0.068 | 0.148 | 0.938 |
| 2 | 2.93 | 0.144 | 0.76 | 3.08 |
| 3 | 2.23 | 0.15 | 0.76 | 3.06 |
| 4 | 1.55 | 0.26 | 1.5 | 5.02 |
| 5 | 5.45 | 0.22 | 3.2 | 4.52 |
| 6 | 5.73 | 0.25 | 6.2 | 6.06 |
| 7 | 6.60 | 0.26 | 7.4 | 11.5 |
| 8 | 2.12 | 0.35 | 8.0 | 16.1 |
| 10 | 7.21 | 0.53 | 15.0 | 16.9 |
| 11 | 5.75 | 0.71 | 14.2 | 14.4 |
| 12 | 3.91 | 0.23 | 1.5 | 8.4 |
| 13 | 3.72 | 0.3 | 0.76 | 23.3 |
| 14 | 4.29 | 0.32 | 0.76 | 18.1 |
| 15 | 1.66 | 0.44 | 1.5 | 14.8 |
| 16 | 2.66 | 0.15 | 0.32 | 7.04 |
| 17 | 2.08 | 0.18 | 0.45 | 6.8 |
| 18 | 2.59 | 0.096 | 0.18 | 4.56 |
| 19 | 1.99 | 0.14 | 0.32 | 4.94 |
| 20 | 7.16 | 0.044 | 0.096 | 0.87 |
| 21 | 1.15 | 0.18 | 0.32 | 9.49 |
| 22 | 3.54 | 0.3 | 0.76 | 13.4 |
| 23 | 10.4 | 1.3 | 18.0 | 13.4 |
| 24 | 7.81 | 0.95 | 16.0 | 13.1 |
| 25 | 9.18 | 0.54 | 2.4 | 10.6 |
| 26 | 17.1 | 0.5 | 2.4 | 10.2 |
| 27 | 11.5 | 0.068 | 0.144 | 1.4 |
| 29 | 0.37 | 0.068 | 0.15 | 2.75 |
| 30 | 0.9 | 0.78 | 16.0 | 12.1 |

Until this point, we have not taken into account the spatial pattern of global vegetation, i.e., the spatial correlation between different pairs of biomes, which is a very important characteristic of the GVP. For instance, the correlation between the Tundra and North taiga biomes is very high, whereas the correlation between Tundra and Evergreen tropical rain forest is very low. In order to estimate the spatial correlation between different pairs of biomes the Walter–Bazilevich biome map is used. For this we introduce a new concept, the border $\Gamma_{ij}$ between the $i$th and $j$th biomes. The border contains points belonging both to the $i$th and to the $j$th biome. The total length of the border, which in the general case can consist of several separate parts, is denoted by the same symbol. It is obvious that the longer the border, the more intensive the interaction between the neighbouring biomes. Using the biome map, the lengths of all borders are calculated. By defining $p_{ij} = \Gamma_{ij} / \sum_{k,l=1}^{30} \Gamma_{kl}$ we get the following expression for the amount of information per word at this level of description:

$$I^{(2)} = -\sum_{i,j=1}^{30} p_{ij} \log_2 p_{ij} \approx 7.2 \text{ bits}.$$
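The first-level and second-level figures for the GVP can be checked numerically. The sketch below is an illustration under stated assumptions: the area and productivity columns are transcribed from Table 2 (which has no rows for biomes 9 and 28), the alphabet size is taken as 30 as in the text, $I^{(2)} = 7.2$ bits is taken from the text because the border lengths are not tabulated, and all function and variable names are hypothetical. The last line applies the same entropy routine to the products $P_i\sigma_i$, i.e., to productivity-weighted frequencies.

```python
import math

# Columns transcribed from Table 2 (biomes 1-8, 10-27, 29, 30).
AREA = [2.55, 2.93, 2.23, 1.55, 5.45, 5.73, 6.60, 2.12, 7.21, 5.75,
        3.91, 3.72, 4.29, 1.66, 2.66, 2.08, 2.59, 1.99, 7.16, 1.15,
        3.54, 10.4, 7.81, 9.18, 17.1, 11.5, 0.37, 0.9]        # 10^6 km^2
NPP = [0.068, 0.144, 0.15, 0.26, 0.22, 0.25, 0.26, 0.35, 0.53, 0.71,
       0.23, 0.3, 0.32, 0.44, 0.15, 0.18, 0.096, 0.14, 0.044, 0.18,
       0.3, 1.3, 0.95, 0.54, 0.5, 0.068, 0.068, 0.78]         # kg C/(m^2 year)

N_BIOMES = 30  # alphabet size used in the text


def entropy_bits(weights):
    """Entropy (bits) of the frequencies obtained by normalising the weights."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def redundancy(info, order=1, n=N_BIOMES):
    """R = 1 - I / (order * log2 n)."""
    return 1.0 - info / (order * math.log2(n))


def cost(r):
    """C = 1 / (1 - R)."""
    return 1.0 / (1.0 - r)


# First level: relative areas of the biomes.
i1_area = entropy_bits(AREA)
r1_area = redundancy(i1_area)
c1_area = cost(r1_area)
print(f"areas: I(1) = {i1_area:.2f} bits, R(1) = {r1_area:.2f}, C(1) = {c1_area:.2f}")

# Second level: I(2) = 7.2 bits per two-letter word, value taken from the text.
r2 = redundancy(7.2, order=2)
c2 = cost(r2)
print(f"borders: R(2) = {r2:.3f}, C(2) = {c2:.2f}")

# Same routine with productivity-weighted frequencies P_i * sigma_i.
i1_npp = entropy_bits([p * s for p, s in zip(NPP, AREA)])
print(f"productivity: I(1) = {i1_npp:.2f} bits")
```

Because the inputs are rounded table entries, the computed values need not coincide exactly with the rounded figures quoted in the text.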
For the redundancy and the cost of information at this second level we obtain, correspondingly, $R^{(2)} = 0.265$ and $C^{(2)} = 1.36$. Since at this level the elementary unit is a two-letter word, the information per letter will be equal to $I^{(2)}/2 \approx 3.6$ bits.

Note that in addition to their areas, the biomes are also characterised by three values: annual productivity $P_i$, living biomass $B_i$ and dead organic matter $D_i$ (see Table 2). We can define the frequencies $p_i^P$, $p_i^B$ and $p_i^D$ for each value as

$$p_i^P = \frac{P_i\sigma_i}{\sum_{j=1}^{30} P_j\sigma_j}, \quad p_i^B = \frac{B_i\sigma_i}{\sum_{j=1}^{30} B_j\sigma_j}, \quad p_i^D = \frac{D_i\sigma_i}{\sum_{j=1}^{30} D_j\sigma_j}.$$

The description in terms of productivity, living or dead organic matter can be considered as a description at the first level with information per letter (formally per one-letter word) defined as

$$(I^{(1)})^{P,B,D} = -\sum_{i=1}^{30} p_i^{P,B,D} \log_2 p_i^{P,B,D}.$$

Using the data from Table 2 we can calculate the corresponding value of information (per letter), its cost and redundancy:

1. For productivity: $(I^{(1)})^P = 3.61$ bits, $C_2^P = 1.32$, $R_2^P = 0.24$.
2. For living biomass: $(I^{(1)})^B = 3.27$ bits, $C_2^B = 1.5$, $R_2^B = 0.33$.
3. For dead organic matter: $(I^{(1)})^D = 4.13$ bits, $C_2^D = 1.17$, $R_2^D = 0.16$.

Note that to reduce notational complexity we have replaced the order superscript with a parallel subscript (e.g., $C^{(2)}$ has become $C_2$). These values show that information about the biome productivity, living biomass and dead organic matter is more valuable than the information about the distribution of the biome areas. Information about the distribution of living biomass has the maximal cost ($C_2^B = 1.5$).

3. Global and local information: speculations

It is necessary to say a few words about information at the global and regional levels of scaling. If we keep in mind that the total number of species in the contemporary biosphere is equal to $n \approx 10^6$, then the information per species at the zero level of description and at the global scale is equal to $(I^{(0)})_B = \log_2 10^6 \approx 19.9$ bits (on the thermodynamic basis of this value, see Svirezhev and Svirejeva-Hopkins, 1997). At this level of description all specimens of a community differ from each
other in only one respect, namely membership of one species or another. However, as we saw above, if the biosphere is represented as a composition of higher taxonomic units (terrestrial biomes and aquatic ecosystems; later on we shall consider only the terrestrial part of the biosphere), which in turn represent compositions of species, i.e., words are formed by groups of letters representing species, then the information per word is less than 19.9 bits.

The same effect is observed when we estimate the amount of information per species for some taxonomic unit lower than the biosphere. The point is that when we describe a community belonging to some ecotype (for instance, an aquatic community in a lake, or a plant community of some biome, etc.) the alphabet may be significantly shorter. Then the amount of information per letter will be less than in the previous case. This seeming loss of information is a result of its redundancy at this "regional" or "local" level in comparison with the global level. In fact, the information about all the species present in the biosphere is absolutely redundant when we describe some regional or local community. Here we need only the information about those species that are typical for the locality considered. Note that the "lost" information is not actually lost; it is used to single out the locality within the biosphere.

Note that the elementary units of the biosphere, populations of different species, are organised within the biosphere in such "hard" structures as trophic chains, trophic levels, ecosystems and biogeocoenoses, which are more or less "spread" over the Earth's surface. In this case, the information per species belonging to the "hard" structure will be less than 19.9 bits. We can expect that its redundancy and cost have to increase. Apparently, the concept of dominant species in a biological community exploits precisely this redundancy of information.

Let us consider the following example. The number of different plant species typical for the Russian dry steppe is equal to 150. Then the information per species contained in the botanical description of any steppe community (list of species) is equal to $(I^{(0)})_R = \log_2 150 \approx 7.23$ bits. The redundancy of information at the regional level in relation to the global one at the zero level of description is $R_R = 1 - (7.23/19.9) \approx 0.64$, and its cost is $C_R = 1/(1 - R_R) \approx 2.8$.
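The arithmetic of this example can be retraced in a few lines of Python. This is only a sketch: the species counts are those assumed in the text, and the variable names are illustrative.

```python
import math

n_global = 10**6   # species in the contemporary biosphere, as assumed in the text
n_steppe = 150     # plant species typical for the Russian dry steppe

# Zero-level information per species at the global and the regional scale.
i_global = math.log2(n_global)
i_steppe = math.log2(n_steppe)

# Redundancy of the regional description relative to the global one, and its cost.
r_regional = 1.0 - i_steppe / i_global
c_regional = 1.0 / (1.0 - r_regional)

print(f"I_global = {i_global:.1f} bits, I_steppe = {i_steppe:.2f} bits")
print(f"R_R = {r_regional:.2f}, C_R = {c_regional:.1f}")
```

The cost follows directly from the ratio of the two logarithms, $C_R = \log_2 10^6 / \log_2 150$, so it depends only on the sizes of the global and regional species lists.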
Certainly, we can give another interpretation of these results. Indeed, what does a value of information equal to 19.9 bits per species imply? Note that this is a global value. Since the system considered is the Globe, this statement is equivalent to the following: the probability of finding a given species in the list of all species that can be found on the Globe is equal to $2^{-19.9}$. The same probability, but calculated for the list of steppe species (150) and the region of the Russian steppe, will be equal to $2^{-7.23}$.

An analogous interpretation holds for biomes. At the zero level of description the probability of finding one or more plants representing one given biome (of 30) is equal to $2^{-I^{(0)}}$. In this case we properly apply the following probabilistic model. Given any (completely random) reordering of the list of all species, the chance of finding a particular species at a particular location in the list is the value stated. That is because each species occurs exactly once in the list of all species. Whether for the Earth or for a biome, if we select a location and look at the first living organism encountered, the absolute density would need to be considered. In other words, we have to pass to the first level, where the density characteristics may be defined.

At the first level, when we know the area of each biome, the same probability but calculated for a single plant will be higher, $2^{-I^{(1)}}$, since $I^{(1)} < I^{(0)}$. Hence, the larger the area of a certain biome, the higher the probability of meeting its representative at an arbitrary point. At the second level the probability of finding a pair of plants representing a given pair of biomes in the close vicinity of any point of land is equal to $2^{-I^{(2)}/2}$. It is also higher than the previous probability.

In addition to area, each biome can be characterised by productivity, living biomass or dead organic matter. Then the corresponding probabilities $\Pr_1^{P,B,D} = 2^{-(I^{(1)})^{P,B,D}}$ are the probabilities of finding at any point of land a plant with characteristics that are typical for a given biome. Note that all these statements can be paraphrased as "... the probability of finding a plant representing an arbitrary biome at a given point of land ...". Alexander von Humboldt has formulated the concept in his "Flora Freibergiensis" as "Das Sein das Sein des Seinden sei".

There are empirical facts that lead us to think about some very special properties of information in application to biological communities and about "linguistic" analogies between "natural" alphabetic languages (English, Russian, etc.), social systems and biological communities. The information per individual for many communities (not only biological) tends to concentrate within a fairly narrow interval with a supremum of about 5 bits per individual (Margalef, 1995). There is an impression that Nature avoids both very low and very high diversity as described by this value. The same picture is seen in alphabetic languages, where the information per letter, as a rule, does not exceed five bits (Ebeling et al., 1990). If Shannon's entropies are estimated for the distribution of the human population with respect to professional groups in developed countries, then their values also do not exceed this limit. However, the analogous estimation made, for instance, on a beach gives a much higher value (Margalef, 1995).
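As a numerical companion to the probabilistic interpretation given earlier in this section, the following sketch converts the information values quoted in the text into probabilities of the form $2^{-I}$. The helper name `probability_from_information` and the dictionary labels are illustrative assumptions; the information values themselves are taken from the text.

```python
import math


def probability_from_information(bits):
    """Probability 2**(-I) associated with I bits of information."""
    return 2.0 ** (-bits)


# Information per letter at the three levels of description of the GVP.
levels = {
    "zero level, I(0) = log2 30": math.log2(30),
    "first level, I(1)": 4.41,
    "second level, I(2)/2": 3.6,
}

probabilities = {name: probability_from_information(bits)
                 for name, bits in levels.items()}

for name, prob in probabilities.items():
    print(f"{name}: probability = {prob:.4f}")

# Species lists: the whole biosphere versus the Russian dry steppe.
p_species_global = probability_from_information(19.9)
p_species_steppe = probability_from_information(7.23)
print(f"global species list: {p_species_global:.2e}")
print(f"steppe species list: {p_species_steppe:.2e}")
```

At the zero level the probability is simply 1/30, and it grows as the information per letter decreases from one level of description to the next.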
4. Conclusion

In this work, we tried to show that the concepts and methods of information theory could be useful for the description of vegetation patterns. A standard method of describing a plant community in social botany is a list of species. The next level of description is the percentage of cover by each species. Then the spatial organisation of the community (aggregation of specimens, neighbourhood of species, etc.) is described at the third level. At every level, such information measures as the amount of information per letter of the alphabet used, the redundancy of information and its cost can be defined. By comparing them at the different levels of description we can judge the efficiency of one description or another.

References

Bazilevich, N.I., 1973. Biogeochemistry of the main types of global vegetation. In: Proceedings of the V Meeting of the USSR Botanical Society, Kiev, pp. 239–244.
Bazilevich, N.I., 1993. Biological Productivity of Ecosystems of Northern Eurasia. Nauka, Moscow, 293 pp.
Ebeling, W., Engel, A., Feistel, R., 1990. Physik der Evolutionsprozesse. Akademie-Verlag, Berlin, 374 pp.
Kharkevich, A.A., 1963. The cost of information. In: Problems of Cybernetics, vol. 9. Nauka, Moscow, pp. 71–102.
Klix, F., 1974. Organismische Informationsverarbeitung. Akademie-Verlag, Berlin.
Margalef, R.A., 1995. Information theory and complex ecology. In: Patten, B.C., Jørgensen, S.E. (Eds.), Complex Ecology. Prentice-Hall, New Jersey, pp. 40–50.
Shannon, C.E., Weaver, W., 1963. The Mathematical Theory of Communication (first published in 1949). University of Illinois Press, Champaign, IL, USA, 388 pp.
Svirezhev, Yu.M., 2002. Simple spatially distributed model of the global carbon cycle and its dynamic properties. Ecol. Model. 155, 53–69.
Svirezhev, Yu.M., Svirejeva-Hopkins, A., 1997. Diversity of the biosphere. Ecol. Model. 97, 145–146.
Volkenstein, M.V., 1988. Biophysics. Nauka, Moscow, 592 pp.
Walter, H., 1964. Die Vegetation der Erde in öko-physiologischer Betrachtung. Die tropischen und subtropischen Zonen, vol. 1. Fischer, Jena, 538 pp.
Walter, H., 1968. Die Vegetation der Erde in öko-physiologischer Betrachtung. Die gemässigten und arktischen Zonen, vol. 2. Fischer, Jena, 1001 pp.