Analytica Chimica Acta, 134 (1982) 139-151
Elsevier Scientific Publishing Company, Amsterdam - Printed in The Netherlands

POTENTIAL METHODS IN PATTERN RECOGNITION
Part 5. ALLOC, Action-orientated Decision Making

D. COOMANS and D. L. MASSART*
Farmaceutisch Instituut, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels (Belgium)

I. BROECKAERT
Dienst Gastro-Enterologie, Sint Pieter Hospitaal, Hoogstraat 323, B-1000 Brussels (Belgium)

(Received 20th July 1981)

SUMMARY

The possibilities of action-orientated pattern recognition with the supervised pattern recognition technique, ALLOC, are discussed. The emphasis is on the importance of the definition of overlapping regions between classes as a way of obtaining more information about the separation between classes. Action-orientated classification and feature selection with ALLOC are discussed using the results obtained for two data bases concerning the characterization of the functional state of the thyroid and the determination of the origin of milk samples.

In a previous article of this series [1], classification of objects with the supervised pattern recognition method, ALLOC, of Hermans and Habbema [2] was discussed. A single boundary between each pair of related classes is then needed in the pattern space. The boundary corresponds with the minimum a posteriori probability of error [1, 2]. This means that an object is classified in the class to which it seems to belong with the largest a posteriori probability (as obtained with the Bayes equation). The use of the minimum a posteriori probability of error rule is based on the assumption that it is not worse to misclassify an object from one class than to misclassify an object from another class. The usual way to evaluate the performance of such a classification rule is to calculate the correct classification rate. Such a classification problem is often called an identification problem [2]. In analytical chemistry, classification problems are often pure identification problems. For identification problems, one often uses non-probabilistic methods such as KNN (the k-nearest neighbour rule) [3-5] and LLM (the linear learning machine) [3, 4, 6, 7]. In the related domain of medical decision-making based on the results of laboratory tests, however, non-probabilistic methods seldom find application. The reason is that a medical decision is never related to a simple identification but to an action which may have important consequences. For instance, a physician will not consider a patient

© 1982 Elsevier Scientific Publishing Company

as healthy when the probability that he is healthy is only slightly larger than the probability that he is not; a more certain diagnosis is needed because of the risk associated with a misclassification. Such a risk is usually not equal for the classification of an ill patient as normal and of a healthy one as ill. All this leads to the use of action-orientated decision boundaries and the construction of overlapping regions (regions of doubt) between the diagnostic classes. Moreover, it is often seen that patients belonging to the overlapping regions require an action distinct from those for whom the diagnosis has been established with a high degree of certainty. On the same basis, some classification problems encountered in non-clinical analytical chemistry may also need an action-orientated approach. For instance, the determination of the origin of milk samples [8, 9] may be considered action-orientated. This happens when the decisions are associated with food quality control where, for instance, pure samples have to be differentiated from adulterated samples. The aim of this paper is to discuss the action-orientated possibilities of ALLOC on the basis of the THYROID [1, 10-12] and MILK [8, 9] examples used in previous papers. It is also shown that, even for identification problems, an action-orientated approach, and more especially the evaluation of the amount of overlap between classes, gives rise to a more complete picture of the performance of a probabilistic classification technique.

ACTION-ORIENTATED PROCEDURES

Action-orientated classification

An action-orientated classification procedure starts with the determination of the a posteriori probabilities of class membership. This is done on the basis of the Bayes equation [13, 14]. In a previous article of this series [1], it was seen that ALLOC differs from other probabilistic techniques in the way in which the probability densities in the Bayes equation are estimated. Action-orientated classification with ALLOC can be carried out according to two different philosophies. First, the minimum probability of error classification rule is replaced by the minimum overall risk rule. This means that the boundary between two classes is displaced so that it is closer to the class with the smaller misclassification risk (meaning that it is considered a smaller risk that a patient who belongs in reality to this class is classified into another one). Alternatively, a region containing doubtful cases is defined, which means that the boundary is now a zone instead of a line.

The minimum overall risk rule. The minimum overall risk classification corresponds with the classification of an object in the class with the smallest conditional risk. The conditional risk of an object i with measurements x_i for a class ω_p, out of K classes, is given by the equation

R(ω_p/x_i) = Σ_{q=1}^{K} l_pq P(ω_q/x_i)    (1)

where l_pq is the loss incurred when an object of class ω_q is assigned to class ω_p.
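As an illustration of the minimum overall risk rule, the following sketch estimates the class densities with a simple Parzen (Gaussian-kernel) estimator, a much-simplified stand-in for ALLOC's density estimation, and evaluates Eqn. (1); the data, smoothing parameter and loss values are all invented:

```python
import numpy as np

def kernel_density(x, train, h=1.0):
    """Parzen estimate with a spherical Gaussian kernel (a simplification
    of ALLOC's per-feature smoothing)."""
    d = train.shape[1]
    diff = train - x                              # (n, d) differences
    norm = (2 * np.pi * h**2) ** (d / 2)
    return np.exp(-(diff**2).sum(1) / (2 * h**2)).mean() / norm

def conditional_risks(x, classes, priors, loss):
    """Eqn (1): R(w_p|x) = sum_q l_pq P(w_q|x) for every class p."""
    dens = np.array([kernel_density(x, c) for c in classes])
    post = priors * dens
    post /= post.sum()                            # Bayes a posteriori probabilities
    return loss @ post, post

# Two artificial one-feature classes centred at 0 and 4
rng = np.random.default_rng(0)
w1 = rng.normal(0, 1, (30, 1))
w2 = rng.normal(4, 1, (30, 1))
loss = np.array([[0.0, 1.0],                      # l_pq: zero loss on the diagonal
                 [1.0, 0.0]])
risks, post = conditional_risks(np.array([0.5]), [w1, w2],
                                np.array([0.5, 0.5]), loss)
print(np.argmin(risks))                           # minimum overall risk rule: prints 0
```

With equal off-diagonal losses, as here, minimizing the conditional risk is the same as picking the largest posterior; unequal losses shift the boundary, which is the point of the action-orientated approach.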

When all misclassification losses are equal, the minimum conditional risk classification rule reduces to the minimum a posteriori probability of error rule; in other words, the action-orientated approach reduces to the identification approach. If one assumes that the prior probabilities in the Bayes equation are the same, the decision boundary between the classes is situated where the two probability density curves cross each other (boundary II in Fig. 1). A boundary region can be obtained by considering two different ratios a/b. The boundary which is closest to the ω2 class is determined by considering b > a. This means that a misclassification of an object belonging to class ω1 is considered to be worse than a misclassification of an object belonging to ω2. Therefore the decision boundary with this loss matrix (III in Fig. 1) is displaced to the right of the identification boundary II. The second boundary can be obtained by taking b < a. This boundary, I, is closer to ω1 than the identification boundary II. The region between boundaries I and III is the overlapping region ω12. On the basis of these boundaries, it is possible to obtain an estimate of the degree of overlap of classes ω1 and ω2. The number of objects situated in the boundary region is counted and the correct classification rate is calculated. The latter refers to objects which are classified in the correct class (not in the overlapping region). A more complete picture is obtained when the degree of overlap is explored in a dynamic way by varying the boundaries I and III.

Up to now, only two adjoining learning classes have been considered. However, the concept of overlapping regions can be extended to more adjoining classes and to the multivariate case. ALLOC defines for the boundaries I and III of the two-class problem of Fig. 1 a threshold value δ for the a posteriori probabilities. The decision rule for a K-class problem then becomes: classify i in class ω_p if

P(ω_p/x_i) = max_q [P(ω_q/x_i)],  q = 1, . . ., K    (8)

and

P(ω_p/x_i) > δ    (9)
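Expressions (8) and (9) amount to the following decision function (a schematic sketch; the posterior values are invented):

```python
import numpy as np

def classify_with_doubt(posteriors, delta):
    """Expressions (8) and (9): assign the class with the largest a
    posteriori probability only when it exceeds the threshold delta;
    otherwise the object falls in the boundary (doubt) zone."""
    p = int(np.argmax(posteriors))
    return p if posteriors[p] > delta else None   # None marks the overlap region

print(classify_with_doubt(np.array([0.97, 0.03]), 0.95))  # prints 0 (certain)
print(classify_with_doubt(np.array([0.60, 0.40]), 0.95))  # prints None (doubt)
```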

Fig. 1. Classification boundaries between two classes ω1 and ω2 according to the loss matrix given by Eqn. (6). Boundaries: I, a > b; II, a = b; III, a < b.


If object i is not classified in any class by application of expressions (8) and (9), it is a boundary zone case. A close relationship exists between the loss matrix for the binary decision problem and δ. It can be shown that the relationship between a, b and δ is

δ = b/(a + b) for P(ω1/x_i)    (10)

and

δ = a/(a + b) for P(ω2/x_i)    (11)
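Eqns. (10) and (11) are a one-line conversion from the losses a and b to probability thresholds (a sketch, not part of the ALLOC program):

```python
def delta_from_losses(a, b):
    """Eqns (10)-(11): the probability thresholds on P(w1|x) and P(w2|x)
    that correspond to off-diagonal losses a and b."""
    return b / (a + b), a / (a + b)

print(delta_from_losses(1, 19))  # prints (0.95, 0.05)
```

Note that a strongly asymmetric loss ratio (here 1:19) translates directly into a demanding certainty threshold of 0.95.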

When symmetrical overlap regions are used, two boundaries are considered, so that for the first boundary the loss matrix of Eqn. (6) is used (loss matrix L with b > a) and for the second boundary a loss matrix in which a is replaced by b and b by a (loss matrix L̄ with a > b). In this situation, the decision rule as formulated in expressions (8) and (9) for two classes is equivalent to: classify i in class ω1 if

R(ω1/x_i) < R(ω2/x_i)    (12)

and in class ω2 if

R̄(ω2/x_i) < R̄(ω1/x_i)    (13)

If object i is not classified, it is a boundary zone case. R and R̄ indicate that the loss matrices L and L̄, respectively, are used. The latter decision rule (expressions 12 and 13) can be considered as an extension of the rule given by expression (5). Further extensions of the concept of overlapping regions have been discussed by Habbema et al. [16] and Hermans et al. [17], but they have not been implemented in ALLOC.
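The symmetric two-loss-matrix rule can be sketched as follows. The indexing convention for the loss matrix entries is an assumption here, chosen so that the resulting boundaries agree with the threshold δ = b/(a + b) of Eqn. (10); the loss values are invented:

```python
import numpy as np

def risk_vec(post, L):
    """R(w_p|x) = sum_q l_pq P(w_q|x) for every p (Eqn 1)."""
    return L @ post

def symmetric_overlap(post, a, b):
    """Expressions (12) and (13) with loss matrix L and its swapped
    counterpart L-bar. Returns 0, 1 or None (boundary zone)."""
    L    = np.array([[0.0, b], [a, 0.0]])   # assumed l_pq indexing
    Lbar = np.array([[0.0, a], [b, 0.0]])   # a and b interchanged
    r, rbar = risk_vec(post, L), risk_vec(post, Lbar)
    if r[0] < r[1]:                          # rule (12): class w1
        return 0
    if rbar[1] < rbar[0]:                    # rule (13): class w2
        return 1
    return None                              # boundary zone case

for p1 in (0.98, 0.50, 0.02):
    print(symmetric_overlap(np.array([p1, 1 - p1]), a=1, b=19))
# prints 0, None, 1
```

With a = 1 and b = 19 the doubt zone produced by this sketch is exactly "largest posterior not above 0.95", i.e. the δ = b/(a + b) of Eqn. (10).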

Action-orientated feature selection

It has already been shown [9] that the forward feature selection procedure of ALLOC is directly based on classification rates. An accurate way to express a classification rate in an identification problem is the probability of error estimated by means of the leave-one-out classification of the members of the learning set. The probability of error is then given by

P(error) = Σ_{q=1}^{K} P(ω_q) n_q(error)/n_q    (14)

where for ω_q, q = 1, . . ., K (the K learning classes), P(ω_q) is the prior probability of ω_q (for explanation, see [9]), n_q(error) is the number of objects belonging to ω_q which are misclassified on the basis of the leave-one-out procedure, and n_q is the total number of objects belonging to ω_q. The feature selection procedure described earlier [9] may also be applied in action-orientated feature selection, on the condition that P be replaced by R, the overall conditional risk estimated by means of the leave-one-out classification of the members of the learning set. The overall conditional risk is given by

R = Σ_{q=1}^{K} P(ω_q) l_q n_q(error)/n_q    (15)

where n_q(error) again indicates the number of objects belonging to ω_q but which are misclassified, and l_q is the loss associated with such a misclassification.
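Eqns. (14) and (15), and the forward selection built on them, can be sketched as follows. This is not the ALLOC implementation: a nearest-mean classifier stands in for the kernel-density rule, and the two-feature data set is invented:

```python
import numpy as np

def nearest_mean(x, classes):
    """Stand-in classifier (ALLOC itself uses kernel density estimates)."""
    return int(np.argmin([np.linalg.norm(x - c.mean(axis=0)) for c in classes]))

def loo_error_counts(classes, feats):
    """n_q(error): leave-one-out misclassifications per class, using only
    the feature columns listed in `feats`."""
    errs = []
    for q, cls in enumerate(classes):
        e = 0
        for i in range(len(cls)):
            reduced = [np.delete(c, i, axis=0)[:, feats] if k == q else c[:, feats]
                       for k, c in enumerate(classes)]
            e += nearest_mean(cls[i, feats], reduced) != q
        errs.append(e)
    return errs

def overall_risk(classes, priors, losses, feats):
    """Eqn (15): R = sum_q P(w_q) l_q n_q(error)/n_q.
    With all l_q = 1 this is the P(error) of Eqn (14)."""
    errs = loo_error_counts(classes, feats)
    return sum(p * l * e / len(c)
               for p, l, e, c in zip(priors, losses, errs, classes))

def forward_select(classes, priors, losses, n_feat):
    """Greedy forward selection minimizing the leave-one-out risk."""
    chosen, remaining = [], set(range(n_feat))
    while remaining:
        best = min(remaining,
                   key=lambda j: overall_risk(classes, priors, losses, chosen + [j]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Two invented classes: feature 0 is informative, feature 1 is pure noise
rng = np.random.default_rng(2)
c1 = np.column_stack([rng.normal(0, 1, 15), rng.normal(0, 1, 15)])
c2 = np.column_stack([rng.normal(6, 1, 15), rng.normal(0, 1, 15)])
order = forward_select([c1, c2], [0.5, 0.5], [1.0, 1.0], 2)
print(order[0])   # the informative feature is selected first
```

Changing the `losses` vector biases the selection toward variables that protect the class with the larger loss, which is what the action-orientated selection of the text does.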
DATA BASES

Differentiation of pure milk from different species and mixtures (MILK)

Statistical linear discriminant analysis was applied to these data by Smeyers-Verbeke et al. [8], and the same data were used earlier [9] for discussion of the feature selection procedure of ALLOC. The data base consists of different milk classes, each of which has 20 samples characterized by 15 gas-chromatographic peak measurements. Besides the three pure milk classes, cow (C), goat (G) and sheep (S), synthetic binary mixtures are also considered. In the present study, the same binary differentiations as studied earlier [9] are investigated, i.e., the differentiation between a pure milk class and another pure milk class or a related 9/1 mixture class. The MILK differentiation problem is a typical example in analytical chemistry where an action-orientated approach may be of interest in practice.

Functional state of the thyroid gland (THYROID)

This data base consists of three learning classes: EU, HYPER and HYPO. For each patient (blood sample), five laboratory results (RT3U, T4, T3, TSH, ΔTSH) are available. From the clinical point of view, the differentiation EU/HYPER and EU/HYPO is of interest. Studies on this data base have been reported earlier [1, 10-12].

RESULTS AND DISCUSSION

Evaluation of the overlap between classes

For this purpose, symmetrical overlap regions are used. For symmetrical overlap regions, the methods discussed above under Action-orientated procedures are the same, and so the method corresponding to expressions (8) and (9) is used. The threshold probability δ defines the symmetrical overlap region. In the MILK example as well as in the THYROID example, the overlaps are evaluated on the basis of the percentage of objects which are correctly classified with a δ degree of certainty, i.e., objects that are classified in the correct class (not in the overlapping region). In a binary decision problem, the first value (0.500) corresponds to a single identification boundary.

For the milk example, the optimal subset of variables selected by the ALLOC selection procedure [9, 18] was used initially. Table 1 shows the number of samples which are correctly classified with a δ degree of certainty as a function of δ. It can be seen that more information can be obtained about the separation of the classes by considering different δ values instead of one single identification boundary with δ = 0.500. For instance, although the pure milk classes seem to be separated completely from each other on the basis of δ = 0.500, a further comparison can be made on the basis of the degree of certainty of the classification; for the differentiation S/G (sheep/goat), no overlap is observed up to a δ value equal to 0.950, but for the other differentiations C/S (cow/sheep) and C/G (cow/goat) the separation is not so clear-cut. It can be concluded that S/G can be better distinguished than C/G and C/S, and C/S better than C/G. Furthermore, it can be seen that differentiation of the mixture C9G1 from C is not so obvious as indicated by the straightforward identification results. Moreover, the results obtained with large δ values show that the separation C/C9G1 is somewhat more difficult than the separation C/G. Although this was expected, it could not be observed from the classification rate of the simple identification approach. When a safe overlap region is created by taking, e.g., δ = 0.95, Table 1 shows that very few, if any, samples can be identified with certainty for the differentiations C/C9S1, S/C1S9 and G/S1G9. This means that, although a relatively good classification rate is obtained for δ = 0.500, in practice predictions are very difficult and doubtful in these situations. The THYROID example also shows clearly that the evaluation of overlap regions gives more complete information about the separability of the

TABLE 1

MILK differentiations showing the number of samples (the maximum, i.e., 100% correct classification, is 40) correctly classified with δ degree of certainty

δ: 0.500, 0.600, 0.750, 0.800, 0.900, 0.950, 0.975, 0.990, 0.999
Differentiations: C/G (1)a, C/S (1), S/G (1), C/C9S1 (3), S/C1S9 (3), C/C9G1 (3), G/S1G9 (2)
[The individual entries could not be recovered from the scanned text.]

a The number in brackets is the number of variables used in the particular differentiation problem.


learning classes and about the confidence that one may have in the classification of new objects. For the ALLOC classification on the basis of the original set of 5 laboratory tests, Table 2 reveals that a better classification rate is obtained with δ = 0.500 and with the 5 original laboratory tests for the differentiation EU/HYPER than for the differentiation EU/HYPO. However, as the δ values are increased, it can be seen that for the HYPO class, in contrast to the HYPER and EU classes, there is no decrease of the correct classification rate. This means that a larger number of HYPO cases can be detected with a higher degree of certainty (see δ > 0.95). This way of presenting results is much more useful in (medical) decision-making. In fact, non-probabilistic pattern recognition techniques such as the linear learning machine (LLM) are not useful at all in medical decision-making and in many problems in analytical chemistry where a degree of certainty is required for a decision, even when excellent classification results are obtained, such as have been shown earlier [1].

Another example concerns the difference between an ALLOC classification on original data or after preprocessing by statistical linear discriminant analysis (SLDA + ALLOC); the latter has been discussed [12]. Table 2 shows that, on the basis of a simple leave-one-out classification of the learning classes (δ = 0.500), no important difference is obtained between ALLOC and SLDA + ALLOC for both the EU/HYPER and EU/HYPO differentiations. However, SLDA + ALLOC performs less well than ALLOC, because the correct classification rate decreases faster with increasing δ value for SLDA + ALLOC; the EU cases are usually not detected with a high degree of certainty on the basis of SLDA + ALLOC. A reason for the worse results of SLDA + ALLOC is certainly the loss of information. In the case of binary decision problems, the SLDA + ALLOC procedure supposes that the different probability levels (chosen by means of the δ values) are parallel hyperplanes in the pattern space. In reality, this is seldom the case. The original ALLOC classification procedure, however, is able to discover the real form of the decision boundaries. For the examples given in Table 2, the loss of information was observed only when higher δ values were considered instead of simply 0.500. The loss for EU/HYPO is larger than for EU/HYPER. For practical reasons and in situations discussed earlier [12], however, SLDA + ALLOC may still be preferred to ALLOC applied on the original variables, although the performance is somewhat less good.

Feature selection in the case of symmetrical overlapping regions

The feature selection procedure discussed earlier [9] was based on the maximum correct classification rate without considering overlapping regions. The criterion for the selection of the optimal subset of variables is given above by Eqn. (14). The question whether the selected optimal subset of variables also minimizes the overlap between the classes, i.e., maximizes the correct classification rate according to "classification with δ degree of certainty", is of some importance. In the MILK example this was evaluated by calculating

TABLE 2

THYROID example showing the classification rates (%) with δ degree of certainty

δ: 0.500, 0.600, 0.750, 0.800, 0.900, 0.950, 0.975, 0.990, 0.999
[The individual entries could not be recovered from the scanned text.]
at each step of the feature selection procedure the number of samples which were correctly classified with δ degree of certainty for δ = 0.500, 0.750, 0.950 and 0.999. Figure 2 shows the diagrams for the binary differentiation of the pure milk samples C/G, C/S and S/G, and Fig. 3 shows the diagrams for binary differentiations between mixtures and the related pure milk class. Figure 2 shows that the pure milk classes are completely distinguishable for each degree of overlap, but higher δ values require more variables than considered in the optimal subsets of Table 2. This is clearest for the differentiation C/S. For a simple classification, one variable suffices, but at least 4 and 8 variables are respectively necessary in order to obtain certainties of 0.950 and 0.999. For the differentiation between a pure milk and a mixture (Fig. 3), the behaviour of the curves is more complicated. When the four binary differentiations are observed, the following conclusions can be reached. First, an important overlap is observed between the classes in each step of the selection procedure; even where a complete separation is obtained with δ = 0.500 (see C/C9G1), the number of samples which are classified with a high degree of certainty is rather small. Secondly, in order to optimize the classification rate for a higher degree of certainty than 0.5, more variables are needed, i.e., more information is required. Thirdly, the negative influence observed on the classification rate for low δ levels when the number of variables used for ALLOC classification is increased up to the original set of 15 variables is not observed for higher δ levels. This means that lower δ level classifications are more sensitive to noise, possibly because the more doubtful (i.e., closer to the borderline) the objects are, the more sensitive they are to noise.

The feature selection in the THYROID example was done in the same way as in the MILK example. Figure 4 shows for the EU/HYPER and EU/HYPO differentiations the relationship for δ = 0.500, 0.750, 0.950 and 0.999 between the percentage of correctly classified patients (leave-one-out procedure) and the number of selected variables in each step of the procedure. It can be seen for the EU/HYPER differentiation that at all levels of certainty, a set of 3 variables (i.e., T4, ΔTSH and T3) is the most appropriate choice. No important change was obtained when the other variables were added to the selected set. The separation is very acceptable: on the basis of the three variables, 74% of the cases can be diagnosed with a certainty of 0.999. For the EU/HYPO differentiation, it was concluded in previous papers [9, 10] that the combination of two variables (T4 + ΔTSH) provides the best classification rate and that a further addition of variables causes a slight decrease of the classification rate. This is true for the simple identification (δ = 0.500) or with δ values such as δ = 0.750. However, Fig. 4 shows that for the larger δ values, a slight increase of the classification rate is observed when more variables are added to the proposed set of two. It can therefore be concluded that the first two variables are essential for a good separation between the classes; further addition of variables produces greater certainty about the classification. This may not be trivial for medical diagnosis. In this respect, the set of five variables is theoretically the most appropriate choice because 77% of the cases can be diagnosed with a certainty of 0.999. As in the MILK example, a decrease of the classification rate for small δ values together with an increase for the larger δ values is observed in the EU/HYPO differentiation. However, this phenomenon is here less pronounced.

Fig. 2. Differentiation between pure MILK classes showing the relationship between the number of correctly classified samples with δ degree of certainty and the number of variables (n) selected by the ALLOC feature selection procedure according to Eqn. (14). (•) δ = 0.500; (×) δ = 0.750; (○) δ = 0.950; (□) δ = 0.999. The final lines shown up to n = 10 continue to n = 15.

Fig. 3. Differentiation between a pure MILK class and a 9/1 mixture class showing the relationship between the number of correctly classified samples with δ degree of certainty and the number of variables (n) selected by the ALLOC feature selection procedure according to Eqn. (14). (•) δ = 0.500; (×) δ = 0.750; (○) δ = 0.950; (□) δ = 0.999.

Action-orientated feature selection

In the previous section, the feature selection procedure for simple classification given by Eqn. (14) was evaluated in view of minimizing the amount of overlap between classes. In this section, the action-orientated feature selection procedure given by Eqn. (15) is discussed on the basis of the THYROID example. It is assumed in this case that the losses associated with a correct diagnosis are zero. Five different ratios between a and b of the loss matrix of Eqn. (6) were considered, i.e., a/b = 100/1, 20/1, 1/1, 1/20 and 1/100. The value b is the loss associated with classifying an individual of the EU class in the pathological class, and the value a is the loss associated with the inverse situation. In this way, five single boundaries are assumed. For the ratio 1/1, the feature selection is equivalent to the simple classification by Eqn. (14); the ratios 100/1 and 20/1 correspond to a selection of variables which are essential for screening pathological cases, and the ratios 1/20 and

Fig. 4. EU/HYPO and EU/HYPER differentiation of the THYROID example showing the relationship between % correct classification with δ degree of certainty and the number of variables selected by the ALLOC feature selection procedure according to Eqn. (14). (•) δ = 0.500; (×) δ = 0.750; (○) δ = 0.950; (□) δ = 0.999.
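The quantity tabulated in Tables 1 and 2 and plotted in Figs. 2-4, the number of objects correctly classified with a δ degree of certainty, can be computed as in this sketch (the posterior values are invented):

```python
import numpy as np

def correct_with_certainty(posteriors, labels, delta):
    """Number of objects assigned to the correct class with a posterior
    above delta; objects in the overlap region are not counted."""
    pred = posteriors.argmax(axis=1)
    sure = posteriors.max(axis=1) > delta
    return int(((pred == labels) & sure).sum())

# invented posteriors for six objects that all belong to class 0
post = np.array([[0.99, 0.01], [0.97, 0.03], [0.90, 0.10],
                 [0.70, 0.30], [0.55, 0.45], [0.40, 0.60]])
labels = np.zeros(6, dtype=int)
for d in (0.500, 0.750, 0.950):
    print(d, correct_with_certainty(post, labels, d))
# counts: 5, 3 and 2 respectively
```

Sweeping δ in this way is exactly the "dynamic" exploration of the overlap described in the text: the drop in the count with increasing δ measures how much of the apparent separation rests on doubtful objects.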

TABLE 3

Action-orientated selection of features for the EU/HYPO differentiation according to different loss ratios a/b

Step  a/b = 100       a/b = 20        a/b = 1         a/b = 0.05      a/b = 0.01
1     T4 (0.026)a     T4 (0.213)      T4 (0.087)      TSH (0.183)     TSH (0.012)
2     TSH (0.015)     TSH (0.057)     ΔTSH (0.027)    T4 (0.167)      T4 (0.010)
3     ΔTSH (0.005)    RT3U (0.050)    T3 (0.040)      ΔTSH (0.150)    T3 (0.010)
4     RT3U (0.005)    T3 (0.060)      RT3U (0.053)    RT3U (0.150)    RT3U (0.010)
5     T3 (0.005)      ΔTSH (0.677)    TSH (0.073)     T3 (0.150)      ΔTSH (0.042)

a Numbers in brackets are the overall conditional risks obtained in the particular step of the selection (see Eqn. 15).

1/100 correspond to a selection of variables essential for screening the EU cases. Table 3 shows the sequence in which the variables were selected for the EU/HYPO differentiation according to the different loss ratios; the appropriate subset of variables is shown separated from the superfluous variables by underlines. It can be seen that the optimal subset of variables differs according to the ratio a/b. This means that screening of HYPO cases is preferentially done by measuring the variables T4, TSH and ΔTSH (a/b = 100) or T4, TSH and RT3U (a/b = 20). For screening of EU cases, only TSH and T4 are needed. It is surprising that T4 is most important for the HYPO screening, while TSH is most important for the EU screening. These results can be explained on a biochemical basis, but this is outside the scope of this paper. Furthermore, it is possible to define overlapping regions on the basis of two screening boundaries (e.g., the ratios 100/1 and 1/100, which in fact correspond to δ = 0.99). However, this approach differs from the one discussed in the previous section because it leads to a classification based on two different subsets of variables, one for each boundary. With this approach, the classification rate found was only slightly better than with the single optimal set of variables proposed above (see Fig. 4). For the EU/HYPER differentiation, the influence of the loss ratio on the selection of variables was similar.

CONCLUSIONS

The action-orientated approach can be utilized in two ways, involving either the construction of a decision boundary which takes the different risks associated with the decision into account, or the construction of overlapping regions. Even in identification problems, a more detailed picture is obtained of the separability of the classes when overlapping regions are taken into account. In this respect, ALLOC is a very attractive software package for


pattern recognition. It combines an accurate method for probability estimation with the possibility of action-orientated classification and feature selection. A disadvantage, however, is the absence of criteria for the detection of outliers.

REFERENCES

1 D. Coomans, I. Broeckaert, A. Tassin and D. L. Massart, Anal. Chim. Acta, 133 (1981) 215.
2 J. Hermans and J. D. F. Habbema, Manual for the ALLOC-discriminant analysis program, Department of Medical Statistics, University of Leiden, P.O. Box 2060, Leiden, Netherlands, 1976.
3 D. L. Duewer, J. R. Koskinen and B. R. Kowalski, ARTHUR (available from B. R. Kowalski, University of Washington, Seattle).
4 A. M. Harper, D. L. Duewer and B. R. Kowalski, in B. R. Kowalski (Ed.), Chemometrics: Theory and Application, Am. Chem. Soc. Symp. Ser., No. 52, 1977.
5 B. R. Kowalski and C. F. Bender, Pattern Recognition, 8 (1976) 1.
6 N. J. Nilsson, Learning Machines, McGraw-Hill, New York, 1965.
7 D. R. Preuss and P. C. Jurs, Anal. Chem., 46 (1974) 520.
8 J. Smeyers-Verbeke, D. L. Massart and D. Coomans, J. Assoc. Off. Anal. Chem., 60 (1977) 1383.
9 D. Coomans, M. P. Derde, I. Broeckaert and D. L. Massart, Anal. Chim. Acta, 133 (1981) 241.
10 D. Coomans, I. Broeckaert, M. Jonckheer, P. Blockx and D. L. Massart, Anal. Chim. Acta, 103 (1978) 409.
11 D. Coomans, L. Kaufman and D. L. Massart, Anal. Chim. Acta, 112 (1979) 97.
12 D. Coomans, I. Broeckaert and D. L. Massart, Anal. Chim. Acta, 132 (1981) 69.
13 G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, New York, 1973.
14 R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis, Wiley-Interscience, New York, 1973.
15 E. A. Patrick, Decision Analysis in Medicine: Methods and Applications, CRC Press, Boca Raton, Florida, 1979.
16 J. D. F. Habbema, J. Hermans and A. T. van der Burgt, Biometrika, 61 (1974) 313.
17 J. Hermans, J. D. F. Habbema and A. T. van der Burgt, Bull. I.S.I., 45 (1973) 523.
18 J. D. F. Habbema and J. Hermans, Technometrics, 19 (1977) 487.