
Pattern Recognition Letters 17 (1996) 1349-1359

Active fusion - A new method applied to remote sensing image interpretation

Axel Pinz *, Manfred Prantl, Harald Ganster, Hermann Kopp-Borotschnig

Institute for Computer Graphics, Technical University Graz, A-8010 Graz, Austria

Received 18 July 1996

Abstract

Today's computer vision applications often have to deal with multiple, uncertain and incomplete visual information. In this paper, we introduce a new method, termed "active fusion", which provides a common framework for active selection and combination of information from multiple sources in order to arrive at a reliable result at reasonable costs. The implementation of active fusion on the basis of probability theory, the Dempster-Shafer theory of evidence and fuzzy sets is discussed. In a sample experiment, active fusion using Bayesian networks is applied to agricultural field classification from multitemporal Landsat imagery. This experiment shows a significant reduction of the number of information sources required for a reliable decision.

Keywords: Information fusion; Image understanding; Active fusion; Probability theory; Bayesian networks; Dempster-Shafer theory of evidence; Fuzzy sets; Fuzzy measures; Entropy

1. Motivation

Information fusion deals with the integration of information from several different sources, aiming at an improved quality of results (e.g. a better decision (Dasarathy, 1994), the robust behavior of an algorithm, improved classification accuracies). Major applications of fusion are medical image interpretation (e.g. Bloch, 1996b; Pinz et al., 1995) and remote sensing (e.g. Maître, 1995; Clément et al., 1993). Examples of the successful application of fusion in remote sensing are reported for

• improved classification (e.g. Schoenmakers and Vuurpijl, 1995),
• monitoring (e.g. Goodenough et al., 1995),
• integration of diverse information (images, maps, GIS, knowledge) (e.g. Zhuang et al., 1991).

In computer vision, recovering a description of a 3D scene from a 2D image is an underdetermined or ill-posed problem. The number of possible solutions to ill-posed problems in vision can be reduced by information fusion (Aloimonos and Shulman, 1989; Clark and Yuille, 1990; Hager, 1990; Abidi and Gonzalez, 1992; Pinz and Bartl, 1992).

In remote sensing applications of fusion the main concern is complexity. Many different sources at different spatial and radiometric resolutions are available, all of them tainted with uncertainty, ambiguity and incompleteness. Important questions are: which sources to select, which processing strategy to follow, and how to react in case of unexpected problems (e.g. poor classification accuracy)?

* Corresponding author. E-mail: [email protected]

0167-8655/96/$12.00 Copyright © 1996 Elsevier Science B.V. All rights reserved. PII S0167-8655(96)00092-X


In terms of soft computing, uncertainty should be modeled, alternatives should be left open, and a "hard" decision should be drawn only towards the end of the processing. The new method proposed in this paper is able to deal effectively with such situations. The main goal of this paper is to introduce the "active fusion" method based on three different mathematical frameworks (probability theory, Dempster-Shafer evidence theory, and fuzzy sets). We present in some detail an implementation using Bayesian networks and show first results for a small remote sensing application.

2. Active fusion

Active fusion extends the paradigm of information fusion: it is concerned not only with the methodology of how to combine information, but also with mechanisms for selecting the information sources to be combined. This method leads to reliable results at reasonable costs (as compared to a "brute force" combination of all available data sources) and opens new possibilities in active visual information acquisition.

Fig. 1 illustrates the concept of active fusion controlling a general image understanding framework. The active fusion component constitutes a kind of expert system/control mechanism, which has knowledge about data sources, processing requirements, and about effective selection and combination of multisource data. The upper half of the drawing corresponds to the real world situation, while the lower half reflects its mapping in the computer. Boxes and ellipses denote levels of representation and levels of processing, respectively. Solid arrows represent the dataflow, dashed ones the controlflow in the image understanding system.

The range of actions available to the active fusion component includes the selection of viewpoints (i.e. scene selection and exposure), the activation of image analysis modules (e.g. segmentation, grouping), up to the direct interaction with the environment (exploratory vision). The individual modules are activated by the active fusion component and, upon termination, they report their results including confidence measures and an indication of success or failure. These partial results are then integrated into the system's current description of the situation (image, scene, world description).

3. Design of an active fusion module

Fig. 2 depicts the main components of the active fusion module with an emphasis on the control aspect. Once the problem to be tackled is expressed in a form suitable for the mathematical framework used, the actions available to the system are ranked according to their costs and expected benefit. The results of the most promising action are subsequently fused into the current solution, and it is checked whether more actions should be performed. This check terminates processing if the uncertainty in the current solution is already low enough for a decision to be taken, or if no improvement can be expected from further actions. If more actions are indicated, the whole process of ranking, fusing and assessing repeats itself until the system comes up with a solution or an indication of failure.

Fig. 1. The concept of active fusion controlling a general image understanding framework.

Fig. 2. Top-level design of an active fusion module.

The subsequent three sections present suggestions for implementing key parts of the above framework using probability theory, the Dempster-Shafer theory of evidence, and fuzzy set theory. Depending on the actual mathematical tool used to realize the general functionality of active fusion, the details of the individual parts are slightly different.

3.1. Active fusion with probability theory

In order to understand why probability theory is a useful tool for realizing the active fusion framework, we recall that many problems in computer vision can be formulated as the problem of deciding among many competing hypotheses (e.g. object recognition, land-use classification, texture analysis). Even problems of parameter estimation (e.g. stereo measurements, robot navigation) can be regarded as belonging to this problem class, if we allow for infinitely many hypotheses (i.e. the possible values for a certain parameter). In order to decide among those hypotheses, the system obtains sensor observations, incorporates them into its view of the world and thus constrains the set of possible results. In the language of probability theory the set of hypotheses as well as the sensor observations are formulated as random variables (discrete or continuous) with an associated probability distribution over the possible values. The probability distribution characterizes the uncertainty the system has about the true value of the hypothesis variables as well as the sensor observations.

Fusion. Bayes' rule provides the formalism of how to reduce the uncertainty associated with a hypothesis, $H$, by incorporating observations, $O_1, \ldots, O_n$:

$$P(H \mid O_1, \ldots, O_n) = \alpha\, P(O_1, \ldots, O_n \mid H)\, P(H). \tag{1}$$

$P(H \mid O_1, \ldots, O_n)$ denotes the probability of hypothesis $H$ once we have made certain observations, $P(H)$ denotes the probability prior to any observations, $P(O_1, \ldots, O_n \mid H)$ summarizes the characteristics of the sensors (or process chains), and $\alpha$ is merely a normalizing constant.

Representation. In practice, however, joint probabilities such as $P(O_1, \ldots, O_n \mid H)$ are very hard to estimate. Hence, we normally try to formulate problems in such a way that observations do not influence the top-level hypotheses directly, but are connected to them via a number of mediating variables that can be thought of as subgoals. This leads to a problem description where the effects of observations are more localized (i.e. they directly influence only a smaller subgoal) and the necessary probability distributions can be estimated much more easily. A very popular representation for problems formulated in this way is the Bayesian network (see Section 4 for an example and description). As Bayesian networks make the dependency between the variables explicit, the factors directly influencing the state of a random variable are localized and the creation of such a model is much easier than the formulation of the general joint probabilities.

Control. Even though Bayes' rule defines a way of fusing information in order to reduce the uncertainty in some hypothesis variables, it does not give a direct prescription of what or how much information should be combined before a decision is made. On the other hand, if we assume that our computer vision problem has been formulated in the probabilistic framework described above, decision theory provides us with tools to realize the "active" part of our active fusion system (i.e. the selection and activation of information sources). It is exactly the uncertainty inherent in the system variables that provides us with a gauge to control the information acquisition and the termination criterion.
Which source is queried first depends on a trade-off between how valuable its information is likely to be in reducing the uncertainty of a set of target hypotheses and how expensive it is to acquire that information (e.g. in terms of money or computation time). In the language of decision theory the selection of an information source or the activation of processes to compute new information are simply regarded as a set of actions available to the decision maker (i.e. the active fusion system). Choosing an action will have consequences (e.g. altering the confidence in a hypothesis variable). If we are able to assign a utility measure to each possible consequence, indicating its degree of desirability, this can be used as a basis for the system to decide what action to perform next. However, the consequences of an action are an uncertain quantity. All that is known to the decision maker is a probability distribution over the possible consequences for a given action. So the best thing the decision maker can do is to choose the action which maximizes its "expected utility" (i.e. the utilities of the consequences weighted by their probability distribution). If we do not have utility information we can still rank the data sources based on their information content towards resolving the uncertainty contained in a target hypothesis (see Section 4.3 for an example).
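The expected-utility rule described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the action names, probabilities, utilities and costs are invented for illustration, and subtracting the acquisition cost from the expected utility is only one simple way of expressing the cost/benefit trade-off.

```python
def expected_utility(outcomes):
    """Expected utility of one action.

    outcomes: list of (probability, utility) pairs describing the
    possible consequences of the action (probabilities sum to one).
    """
    return sum(p * u for p, u in outcomes)


def best_action(actions):
    """Return the action name with the highest expected utility net of cost.

    actions: dict mapping a (hypothetical) action name to a pair
    (outcomes, cost). Subtracting the cost is an assumption of this sketch.
    """
    return max(actions,
               key=lambda a: expected_utility(actions[a][0]) - actions[a][1])


# Hypothetical actions with made-up numbers.
actions = {
    "query_channel_5": ([(0.7, 10.0), (0.3, 2.0)], 1.0),
    "query_channel_3": ([(0.5, 12.0), (0.5, 0.0)], 0.5),
    "run_segmentation": ([(0.9, 8.0), (0.1, 0.0)], 3.0),
}
chosen = best_action(actions)
```

With these numbers the first action is chosen, because its expected utility after subtracting the cost is the largest of the three.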

Termination. Once the system is able to specify a schedule on how to activate information sources, the question arises of when to stop data acquisition. Again, the management of uncertain information associated with our hypotheses set provides a basis for answering this question. The system should continue to fetch new information until one of the following conditions is met:

1. all information sources have been queried;
2. the confidence in our hypothesis variable is sufficiently high, so that we can warrant taking a decision;
3. no significant improvement in reducing the uncertainty of the hypotheses is expected when querying an information source; or the costs of the information do not justify its acquisition.
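The three termination conditions can be combined into a simple control loop. The following Python sketch is an illustration under strong assumptions, not the authors' implementation: each source carries a fixed, precomputed expected gain and cost (a real system would re-estimate the gain after every fusion step), the hypothesis names are hypothetical, and `query` stands in for whatever process produces the likelihood of the observation actually obtained.

```python
def normalize(dist):
    """Scale the values of a dict so that they sum to one."""
    total = sum(dist.values())
    return {h: v / total for h, v in dist.items()}


def bayes_fuse(belief, likelihood):
    """Posterior is proportional to likelihood times prior (one observation)."""
    return normalize({h: belief[h] * likelihood[h] for h in belief})


def active_fusion(prior, sources, confidence=0.9, min_gain=0.01):
    """Query sources until one of the three termination conditions holds.

    sources: name -> (expected_gain, cost, query), where query() returns
    P(observation | hypothesis) for the observation actually obtained.
    All three entries are assumptions of this sketch.
    """
    belief, used = dict(prior), []
    remaining = dict(sources)
    while remaining:                                  # condition 1
        if max(belief.values()) >= confidence:        # condition 2
            break
        name = max(remaining,
                   key=lambda s: remaining[s][0] - remaining[s][1])
        gain, cost, query = remaining.pop(name)
        if gain - cost < min_gain:                    # condition 3
            break
        belief = bayes_fuse(belief, query())
        used.append(name)
    return belief, used


# Hypothetical two-class example with made-up likelihoods.
prior = {"wheat": 0.5, "maize": 0.5}
sources = {
    "s1": (0.6, 0.1, lambda: {"wheat": 0.9, "maize": 0.2}),
    "s2": (0.4, 0.1, lambda: {"wheat": 0.8, "maize": 0.1}),
    "s3": (0.05, 0.1, lambda: {"wheat": 0.5, "maize": 0.5}),
}
final_belief, used_sources = active_fusion(prior, sources)
```

In this example the loop stops after two of the three sources, because the confidence threshold is reached before the least useful source is considered.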

3.2. Active fusion with the Dempster-Shafer theory of evidence

Representation. In Dempster-Shafer evidence theory (Shafer, 1976) the main representation scheme is the frame of discernment, $\Theta$. It defines the working space for the desired application, since this set consists of all propositions for which the information sources can provide evidence. This set is finite and consists of mutually exclusive propositions that span the hypotheses space.

Information sources can distribute mass values on subsets of the frame of discernment, $A_i \in 2^\Theta$ (Eq. (2)). An information source assigns mass values only to those hypotheses for which it has direct evidence; i.e. if an information source cannot distinguish between two propositions, it assigns a mass value to the set including both propositions. The derivation of the mass distribution is the most crucial step since it represents the knowledge about the actual application as well as the uncertainty incorporated in the selected information source.

$$0 \le m(A_i) \le 1, \qquad A_i \in 2^\Theta. \tag{2}$$

The mass distribution has to fulfill the following conditions:

(i) $m(\emptyset) = 0$;

(ii) $\sum_{A_i \in 2^\Theta} m(A_i) = 1$.

Fusion. Mass distributions from two different information sources, $m_i$, $m_j$, are combined with Dempster's orthogonal rule (Eq. (3)). The result is a new distribution, $m_{i,j} = m_i \oplus m_j$, which incorporates the joint information provided by the sources.

$$m_{i,j}(A_k) = (1 - K)^{-1} \sum_{A_i \cap A_j = A_k} m_i(A_i)\, m_j(A_j), \qquad \text{with } K = \sum_{A_i \cap A_j = \emptyset} m_i(A_i)\, m_j(A_j). \tag{3}$$
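Dempster's rule of combination can be written compactly in Python. The sketch below represents a mass distribution as a dictionary from `frozenset` to mass; this representation, the three-element frame and the mass values are assumptions made for illustration only.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass distributions with Dempster's rule (Eq. (3)).

    m1, m2: dicts mapping frozenset (a subset of the frame) to its mass.
    """
    joint, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb            # contributes to K in Eq. (3)
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in joint.items()}


# Hypothetical frame of discernment and mass assignments.
theta = frozenset({"wheat", "maize", "beet"})
m_a = {frozenset({"wheat"}): 0.6, frozenset({"wheat", "maize"}): 0.3, theta: 0.1}
m_b = {frozenset({"maize"}): 0.5, theta: 0.5}
m_ab = combine(m_a, m_b)
```

Here the only conflicting pair is {wheat} against {maize}, so K = 0.3 and every remaining product is divided by 0.7.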

From a mass distribution, numerical values can be calculated that characterize the uncertainty in and the support of certain hypotheses. Belief (Eq. (4a)) measures the minimum or necessary support, whereas plausibility (Eq. (4b)) reflects the maximum or potential support for that hypothesis. These two measures span an uncertainty interval $[bel(A_i), pl(A_i)]$ for this hypothesis.

$$\text{(a)}\quad bel(A_i) = \sum_{A_j \subseteq A_i} m(A_j), \qquad \text{(b)}\quad pl(A_i) = 1 - bel(\bar{A}_i), \tag{4}$$

where $\bar{A}_i$ denotes the complement of $A_i$ with respect to the frame of discernment.
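Belief and plausibility follow directly from a mass distribution. In the Python sketch below, plausibility is computed as the total mass of all sets that intersect the hypothesis, which is equal to one minus the belief in the complement of the hypothesis. The dictionary representation, the frame and the mass values are illustrative assumptions.

```python
def belief(m, a):
    """Sum of the masses of all focal sets contained in a."""
    return sum(w for s, w in m.items() if s <= a)


def plausibility(m, a):
    """Sum of the masses of all focal sets that intersect a."""
    return sum(w for s, w in m.items() if s & a)


# Hypothetical frame of discernment and mass distribution.
theta = frozenset({"wheat", "maize", "beet"})
m = {frozenset({"wheat"}): 0.5, frozenset({"wheat", "maize"}): 0.3, theta: 0.2}

wheat = frozenset({"wheat"})
maize = frozenset({"maize"})
interval_wheat = (belief(m, wheat), plausibility(m, wheat))
interval_maize = (belief(m, maize), plausibility(m, maize))
```

The interval for `maize` is wide because no mass is committed to `maize` alone, although half of the total mass does not rule it out.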

Control. The selection of the information source is achieved by a cost-benefit analysis of integrating a specific information source. Measures of entropy adapted to evidence theory (measures of dissonance, confusion and nonspecificity (Klir and Folger, 1988, pp. 169-188), belief entropy and core entropy (Stephanou and Lu, 1988)) can be used to measure different aspects of information content in a mass distribution. Measures of dissonance and of confusion (Eqs. (5) and (6)) display the uncertainty in a mass distribution, whereas the measure of nonspecificity (Eq. (7)) shows the ability of the information sources to distinguish between propositions. These measures are used to calculate entropies before and after including a specific information source. The information source which maximizes the difference of entropies is selected as the next one for fusion.

$$E(m) = -\sum_{A_i} m(A_i) \log_2 pl(A_i), \tag{5}$$

$$C(m) = -\sum_{A_i} m(A_i) \log_2 bel(A_i), \tag{6}$$

$$V(m) = \sum_{A_i} m(A_i) \log_2 |A_i|, \tag{7}$$

with $|A|$ = cardinality of set $A$.

Termination. The partial result is assessed by calculating confidence measures which can be derived from the mass distribution directly and/or from the belief, plausibility and uncertainty measures. These confidence measures act as termination criteria (e.g. belief value above a certain threshold, uncertainty value below a limit) for the iteration process of selecting an information source, deriving its mass distribution and combining it with the already developed mass distribution. If the result is satisfactory, i.e. the confidence measure reaches a desired value, the iteration terminates and a final decision can be taken according to the task the system had to solve (e.g. choose a class assignment in a classification task according to the highest degree of belief).

3.3. Active fusion with fuzzy sets

Within the framework of fuzzy methodologies the problem of deciding among competing hypotheses can be tackled using fuzzy sets and fuzzy logic (Zadeh, 1973; Klir and Folger, 1988) and also by means of fuzzy measures (Wang and Klir, 1992). These two conceptually different approaches are both promising candidates for incorporating fuzzy methods into the active fusion paradigm. In the following we will concentrate on fuzzy set theoretic modeling and describe key parts of a fuzzy expert system for active fusion.

Representation. Using fuzzy sets $A_i = \{(x, \mu_{A_i}(x)) \mid x \in X\}$ we can represent hypotheses depending on feature parameters $x \in X$ in a way quite similar to probability distributions over the set of possible feature values $X$.¹ From sensor observations the application calculates feature values $x$, which in turn give immediate conclusions regarding the confidence $\mu_{A_i}(x)$ in specific hypotheses $A_i$. At the highest recognition level, a fuzzy set over the set of possible object hypotheses is used to represent the confidence in the presence of the known objects.

Fusion. More complicated hypotheses relying on many sensor observations and intermediate results can be formulated with the help of fuzzy fusion operators (e.g. min, max) corresponding to the linguistic expressions and, or, not etc. New hypotheses can be generated by applying fuzzy reasoning to the set of given hypotheses. Fuzzy reasoning or approximate reasoning is an inference procedure used to derive conclusions from a set of fuzzy if-then rules and one or more conditions. The rules fire simultaneously, providing output values for a final composition step. Fuzzy relations are the language to formulate the compositional rule of inference, which is the essential rationale behind fuzzy reasoning. This fusion step will usually influence the confidence values for some object hypotheses.

Control. The selection of new acquisition and evaluation processes is carried out after analyzing the result of fuzzy reasoning over the given hypotheses and sensor observations. The newly generated hypotheses in particular are examined to arrive at a decision for the next step. We want our fuzzy expert system to choose among the possible next actions the one which maximizes its expected utility. If feasible and sensible, this utility can be estimated beforehand using measures of information content. For more demanding tasks this approach might turn out to be too simplistic, since there may be no easy way to estimate the expected utility robustly with explicit formulae. In such cases it is possible to incorporate implicit knowledge about the utility of actions in another rule base. Following this line, another fusion step is carried out to complete the selection. This time the hypotheses serve as input conditions, while fuzzy if-then rules code meta-knowledge on the utility of specific actions given specific hypotheses. The output of this fuzzy reasoning step is then used to arrive at a crisp decision for the action with the highest expected utility. The result of this meta-reasoning might also indicate that there is no action available which would result in significant improvements and thereby give a clue for termination.

Termination. While the selection part of the control unit analyzes the hypotheses to arrive at a decision for the next action, the termination part of the control unit analyzes the current hypotheses and proposed actions to find out whether specific termination conditions are met. The most desirable way to finish is to have the expert system gather enough information to be able to decide unambiguously for one specific object hypothesis. This is the case if the fuzzy set representing the confidence in the object hypotheses becomes sufficiently concentrated around one specific hypothesis. Other, less desirable causes for termination have already been mentioned above, i.e. no more information sources, costs that are too high, or no significant improvements expected.

¹ Note, however, that in contrast to probability distributions the membership values do not need to sum to one.

Fig. 3. Landsat TM images of a region east of Vienna captured at three different times of the year (from left to right: April, June and August 1985).
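The fuzzy representation, the min/max fusion operators and the "sufficiently concentrated" termination test of Section 3.3 can be sketched in a few lines of Python. Everything specific in this sketch is an assumption made for illustration: the triangular membership functions, the linguistic terms, the two rules, the feature values and the thresholds in `is_concentrated` do not come from the paper.

```python
def triangular(a, b, c):
    """Return a triangular membership function that peaks at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu


def fuzzy_and(*degrees):
    """Fuzzy conjunction implemented with the min operator."""
    return min(degrees)


def fuzzy_or(*degrees):
    """Fuzzy disjunction implemented with the max operator."""
    return max(degrees)


def is_concentrated(confidence, high=0.8, low=0.3):
    """True if one hypothesis reaches `high` and all others stay below `low`."""
    ranked = sorted(confidence.values(), reverse=True)
    return ranked[0] >= high and all(v <= low for v in ranked[1:])


# Hypothetical linguistic terms over two features of a field.
bright = triangular(80, 120, 160)
dark = triangular(20, 60, 100)
uniform = triangular(0, 5, 15)
textured = triangular(10, 20, 30)

mean_value, std_value = 110.0, 4.0   # made-up feature values

# Two illustrative rules; memberships need not sum to one.
confidence = {
    "wheat": fuzzy_and(bright(mean_value), uniform(std_value)),
    "maize": fuzzy_or(dark(mean_value), textured(std_value)),
}
```

With these values the confidence in `wheat` is 0.75 and in `maize` 0.0, so the result counts as concentrated for `high=0.7` but not for the stricter default of 0.8, in which case more information would be requested.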

4. A simple active fusion system for field classification using Bayesian networks

The experiment we present in this section is intended to illustrate some of the ideas proposed in our active fusion framework. We decided to base this sample implementation on probability theory as this concept is best developed within our research team (compared to Dempster-Shafer and fuzzy set theory). This should, however, not indicate a general preference of probability theory over the two other methodologies. Having chosen a probabilistic framework as our mathematical toolkit, our problem representation is based on Bayesian networks (Pearl, 1988). We would like to emphasize that the presented active fusion example only serves demonstration purposes and, therefore, contains many simplifications as compared to "real-world" applications.

4.1. Field classification with multitemporal and multispectral satellite images

The data material we worked on consists of Landsat TM images in five spectral channels (channels 1 to 5) at three different times of the year (April, June and August) as shown in Fig. 3. All the images are registered to each other.

From these satellite images we manually selected a number of regions corresponding to agricultural fields (see Fig. 4). The task now was to decide what crop category was grown on a specific test field presented to the network for classification. In order to solve this classification problem, characteristic values for each field have to be calculated. In our case this amounted to calculating the mean and standard deviation of the pixel values constituting each field. Every field, therefore, has 30 numerical values associated with it (i.e. mean and standard deviation for each spectral channel at the three observed dates).

Fig. 4. Regions selected as agricultural fields.

4.2. Model design using Bayesian networks

4.2.1. A brief introduction to the ideas of Bayesian networks

Bayesian networks are directed acyclic graphs where each node represents a random variable and the links between the nodes are quantified by conditional probabilities. The structure of a Bayesian network encodes the dependency relations between the variables in the network. As the links in the network are established through causal relations pointing from cause to effect, the network provides an intuitively pleasing way of formulating a model of a specific problem. Bayesian networks have become popular over the last years mainly because of the development of efficient algorithms (Pearl, 1988) for updating the probability distribution of the random variables in the network in the light of new observations. Additionally, it is not only possible to reason from measurements in a bottom-up fashion towards the most likely interpretation of the observed data, but also top-down from a hypothesis towards measurements to be expected. As the network encodes the relations between a specific problem and the information needed for its answer, one can simulate the contributions of an information source to this answer and thus rank the actions to be performed by the system. Examples of using Bayesian networks for an active approach to computer vision are given in (Rimey and Brown, 1994; Levitt et al., 1989).

4.2.2. A Bayesian network model for multispectral and multitemporal field classification

In the framework of Bayesian classification the task of assigning a field to one of $N$ crop classes, $C_1, \ldots, C_N$, given a set of measurements from $n$ sensors, $X_k$, $k \in \{1, \ldots, n\}$, is represented as

$$P(C_j \mid X_1, X_2, \ldots, X_n) = \alpha\, P(X_1, X_2, \ldots, X_n \mid C_j)\, P(C_j), \tag{8}$$

with $j \in \{1, 2, \ldots, N\}$, $\alpha$ a normalizing constant, and $P(C_j)$ being the a priori probabilities for the different classes. $P(C_j \mid X_1, X_2, \ldots, X_n)$ is the posterior probability that $C_j$ is the correct class given the observed data, $X_1, \ldots, X_n$, from the sensors. Assuming conditional independence between the $n$ sensors this can be reformulated as

$$P(C_j \mid X_1, X_2, \ldots, X_n) = \alpha\, P(X_1 \mid C_j)\, P(X_2 \mid C_j) \cdots P(X_n \mid C_j)\, P(C_j). \tag{9}$$

Conditional independence means that if we know the type of crop we observe, as well as the typical sensor response for this class (summarized in $P(X_i \mid C_j)$), then the values $X_1$ of sensor 1 tell us little new about the values $X_i$ of sensor $i$. This assumption does not necessarily always hold. Especially if the images were taken at the same time by all sensors and with similar wavelengths, there might well be a significant correlation between deviations in the sensor responses as, for example, caused by peculiar weather conditions. However, the conditional independence assumption greatly reduces the mathematical effort required and, therefore, is widely used.

In cases where we not only have observations at one specific time, but rather have access to multitemporal data, we can use this additional information to improve our classification result. The easiest way to incorporate multitemporal data in the Bayesian network formalism is via a mediating variable which summarizes the influence of previous observations and thus ensures conditional independence between the different time instances. In our specific case the random variable corresponding to the classification result at a different date, denoted $C^t$, acts as a mediating variable. Mathematically this is stated as (again assuming conditional independence)

$$P(C_j \mid X_1, X_2, \ldots, X_n, C^t) = \alpha\, P(X_1 \mid C_j)\, P(X_2 \mid C_j) \cdots P(X_n \mid C_j) \times \underbrace{P(C^t \mid C_j)\, P(C_j)}_{\propto\, P(C_j \mid C^t)}. \tag{10}$$

The equation shows that the influence of $C^t$ can be regarded as changing the a priori probability of $C_j$. Fig. 5 gives the network model considering multispectral and multitemporal data sources.
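Eqs. (9) and (10) can be illustrated with a small Python sketch that uses Gaussian class-conditional densities. It is a simplified illustration, not the authors' Bayesian network: the class names, feature names and Gaussian parameters are invented, and the result of the earlier date is simply reused as the prior for the later date, which is a simplification of the mediating variable $C^t$ in Eq. (10).

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior(prior, class_params, observations):
    """Eq. (9): P(C_j | X) is proportional to prod_k P(X_k | C_j) * P(C_j).

    prior: {class: P(class)}
    class_params: {class: {feature: (mean, variance)}}  (hypothetical values)
    observations: {feature: observed value}
    """
    scores = {}
    for c, p in prior.items():
        likelihood = 1.0
        for feature, x in observations.items():
            mean, var = class_params[c][feature]
            likelihood *= gaussian_pdf(x, mean, var)
        scores[c] = likelihood * p
    total = sum(scores.values())          # plays the role of 1 / alpha
    return {c: s / total for c, s in scores.items()}


# Made-up Gaussian parameters for two hypothetical crop classes.
class_params = {
    "wheat": {"april_ch5_mean": (100.0, 100.0), "june_ch5_mean": (80.0, 64.0)},
    "maize": {"april_ch5_mean": (130.0, 100.0), "june_ch5_mean": (60.0, 64.0)},
}
uniform_prior = {"wheat": 0.5, "maize": 0.5}

# Classify with the April observation, then reuse the result as the prior
# for the June observation (the simplified reading of Eq. (10)).
after_april = posterior(uniform_prior, class_params, {"april_ch5_mean": 105.0})
after_june = posterior(after_april, class_params, {"june_ch5_mean": 78.0})
```

Because both observations favor the same class, the second update sharpens the distribution obtained from the first one.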

4.3. Information management

In a general setup the acquisition of new information or the activation of processes would have to be based on the notion of costs and the desirability of possible results of these actions. In our classification experiment, however, we are only concerned with the simpler question of choosing the most informative data source first. A simple measure for the information content of an information source $I$ towards resolving the uncertainty contained in a target variable $T$ (i.e. its "value of information") is given by Shannon's measure of mutual information (see (Pearl, 1988) for a discussion on this and several other measures). The measure of mutual information is based on the assumption that the uncertainty regarding a variable $I$ with a probability distribution $p(i)$ is represented by its entropy

$$H(I) = -\sum_i p(i) \log[p(i)].$$

This means that the remaining uncertainty of a target variable $T$, given the instantiation of the information variable $I$ to $i$, is characterized by

$$H(T \mid i) = -\sum_t p(t \mid i) \log[p(t \mid i)],$$

and the average remaining uncertainty, summed over all possible values for $I$, is given as

$$H(T \mid I) = \sum_i H(T \mid i)\, p(i).$$

The difference between $H(T)$ and $H(T \mid I)$ is a measure of the potential of $I$ to reduce the uncertainty in $T$ or its "value of information". That is, we calculate Shannon's measure of mutual information,

$$H(T) - H(T \mid I),$$

for each information node and choose the node which maximizes this term.² In our experiment the set of available information sources are the leaf nodes of Fig. 5, namely the mean and standard deviation values for a specific field at one of the three times of the year using one of the five available satellite channels.

² For convenience of notation we dropped the dependency of the above terms on all the evidence $e$ that was previously entered into the network. Hence, for example, the term $H(T \mid I)$ should actually read $H(T \mid I, e)$.

Fig. 5. The Bayesian network model for field classification considering multispectral and multitemporal data.

4.4. A sample run

From the fields selected in Fig. 4 we estimated the conditional probabilities for our model of Fig. 5. As we did not have enough training fields to accurately estimate the needed probabilities, we assumed a Gaussian distribution for the mean and standard deviation values within a specific class and just estimated the mean and variance parameters for this Gaussian distribution. We then selected a test field and the system decided which information source to query first, in order to decide on the class of the test field. After a data source was selected, the result of that source was entered as a finding into the network and a new source was selected (now also based on previous results). This process was repeated until the class node indicated a decision on the field's identity with a confidence level above a certain threshold (90% in our case). Fig. 6 shows some of the states the system evolves through until the final answer is given. Fig. 6(a) depicts the training fields. Figs. 6(b)-(d) illustrate the probabilities assigned to the different classes during several stages of information acquisition. The test field is shown in outline, whereas the shaded areas corresponding to the training fields indicate the probability that the test field is of the same class type as the underlying training area. Bright grey values mean high probabilities and dark grey values indicate low probability values.

Fig. 6. Several stages in the evaluation of the classification of a test field. (a) Fields used for training the Bayes net. (b) Initial probability distribution for a test field. (c) Updated probability after data from first source (Landsat TM channel 5 of June: mean). (d) Final distribution after fifth source (Landsat TM channel 3 of June: standard deviation).
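The source-ranking criterion of Section 4.3, $H(T) - H(T \mid I)$, can be computed for discrete variables as in the following Python sketch. It is an illustration under assumptions: the information nodes are treated as discrete with two invented states, logarithms are taken to base 2, and the class names, source names and probabilities are made up.

```python
import math


def entropy(dist):
    """Shannon entropy (base 2) of a distribution given as {value: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0.0)


def mutual_information(p_t, p_i_given_t):
    """Compute H(T) - H(T | I) for a discrete target T and source I.

    p_t: {t: P(t)}
    p_i_given_t: {t: {i: P(i | t)}}
    """
    values = {i for dist in p_i_given_t.values() for i in dist}
    h_conditional = 0.0
    for i in values:
        joint = {t: p_t[t] * p_i_given_t[t].get(i, 0.0) for t in p_t}
        p_i = sum(joint.values())
        if p_i > 0.0:
            p_t_given_i = {t: v / p_i for t, v in joint.items()}
            h_conditional += p_i * entropy(p_t_given_i)   # H(T | i) p(i)
    return entropy(p_t) - h_conditional


# Hypothetical target distribution and two candidate sources.
p_class = {"wheat": 0.5, "maize": 0.5}
candidates = {
    "june_ch5_mean": {"wheat": {"low": 0.9, "high": 0.1},
                      "maize": {"low": 0.1, "high": 0.9}},
    "april_ch1_std": {"wheat": {"low": 0.5, "high": 0.5},
                      "maize": {"low": 0.5, "high": 0.5}},
}
scores = {name: mutual_information(p_class, cpt) for name, cpt in candidates.items()}
first_source = max(scores, key=scores.get)
```

The second candidate responds identically for both classes, so its mutual information is zero and the first candidate is queried first.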

Acknowledgements The authors gratefully acknowledge the support of this work by the Austrian "Fonds zur FiSrderung der wissenschaftlichen Forschung" under grant $7003. We also thank Rainer Kalliany (FWF grant $7001) for supplying the remote sensing images used in this experiment.

References 5. Conclusion and outlook We have presented the general outline of a system for "active fusion" that selectively acquires and integrates new information in order to refine its answer to a user query. The processes of information fusion and information source selection (or process control) are strongly coupled and should, in our view, be treated simultaneously. Doing so results in an integrated system where the border lines between fusion and control start to disappear. Three ways to implement the active fusion system using three well established mathematical tools were described: probability theory, evidence theory, fuzzy set and measure theory. The three methodologies differ in the way computer vision problems are expressed and subsequently also in the way the activation and integration o f information sources is accomplished. All three techniques have advantages and disadvantages regarding their expressive power or the ease of process control. A thorough comparison of the three approaches for active fusion in image understanding is a future research goal of our group. Such a comparison will have to be based on theoretical properties of each theory (Bloch, 1996a) as well as on practical considerations in relation to computer vision problems (e.g. incorporating spatial information, complexity, fragility of algorithms). A first sample application has demonstrated our active fusion approach using Bayesian networks. The results show that the selection of only the most informative data sources can lead to a significant reduction in terms of the number of resources that have to be activated. Still, the results are robust and reliable. This experiment constitutes a successful demonstration of soft computing in remote sensing data analysis. Future experiments of increasing complexity are under way.

Abidi, M. and R. Gonzalez, Eds. (1992). Data Fusion in Robotics and Machine Intelligence. Academic Press, New York.
Aloimonos, J. and D. Shulman (1989). Integration of Visual Modules. An Extension of the Marr Paradigm. Academic Press, New York.
Bloch, I. (1996a). Information combination operators for data fusion: a comparative review with classification. IEEE Trans. Syst. Man Cybernet., A: Systems and Humans 26 (1), 52-67.
Bloch, I. (1996b). Some aspects of Dempster-Shafer evidence theory for classification of multi-modality medical images taking partial volume effect into account. Pattern Recognition Lett. 17 (8), 905-919.
Clark, J. and A. Yuille (1990). Data Fusion for Sensory Information Processing. Kluwer Academic Publishers, Dordrecht.
Clément, V., G. Giraudon, S. Houzelle and F. Sandakly (1993). Interpretation of remotely sensed images in a context of multisensor fusion using a multispecialist architecture. IEEE Trans. Geoscience Remote Sensing 31 (4), 779-791.
Dasarathy, B.V. (1994). Decision Fusion. IEEE Computer Society Press, Silver Spring, MD.
Goodenough, D.G., P. Bhogal, D. Charlebois, S. Matwin and O. Niemann (1995). Intelligent data fusion for remote sensing. In: Proc. Internat. Geoscience and Remote Sensing Symp., IGARSS'95, Vol. 3, 2157-2160.
Hager, G.D. (1990). Task-Directed Sensor Fusion and Planning. Kluwer Academic Publishers, Dordrecht.
Klir, G. and T. Folger (1988). Fuzzy Sets, Uncertainty, and Information. Prentice-Hall, Englewood Cliffs, NJ.
Levitt, T., T. Binford, G. Ettinger and P. Gelband (1989). Probability-based control for computer vision. In: Image Understanding Workshop, Palo Alto, CA, 23-26 May 1989. Morgan Kaufmann, San Mateo, CA, 355-369.
Maitre, H. (1995). Image fusion and decision in a context of multisource images. In: G. Borgefors, Ed., Proc. 9th SCIA, Scandinavian Conf. on Image Analysis, Vol. I, 139-153.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, CA.
Pinz, A. and R. Bartl (1992). Information fusion in image understanding. In: Proc. 11th ICPR, Vol. 1, 366-370. IEEE Computer Soc. Press, Silver Spring, MD.
Pinz, A., H. Ganster, M. Prantl and P. Datlinger (1995). Mapping the retina by information fusion of multiple medical datasets. In: Human Vision, Visual Processing, and Digital Display VI, IS&T/SPIE Proc. 2411, 321-332.
Rimey, R.D. and C.M. Brown (1994). Control of selective perception using Bayes nets and decision theory. Internat. J. Computer Vision 12, 173-207.
Schoenmakers, R.P. and L.G. Vuurpijl (1995). Segmentation and classification of combined optical and radar imagery. In: Proc. Internat. Geoscience and Remote Sensing Symp., IGARSS'95, Vol. 3, 2151-2153.
Shafer, G. (1976). A Mathematical Theory of Evidence. Princeton University Press, Princeton, NJ.
Stephanou, H.E. and S.-Y. Lu (1988). Measuring consensus effectiveness by a generalized entropy criterion. IEEE Trans. Pattern Anal. Mach. Intell. 10 (4), 544-554.
Wang, Z. and G. Klir (1992). Fuzzy Measure Theory. Plenum Press, New York.
Zadeh, L. (1973). Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Syst. Man Cybernet. 3 (1), 28-44.
Zhuang, X., B.A. Engel, M.F. Baumgardner and P.H. Swain (1991). Improving classification of crop residues using digital land ownership data and Landsat TM imagery. Photogrammetric Engineering and Remote Sensing 57 (11), 1487-1492.