Pattern recognition methods for optimizing multivariate tissue signatures in diagnostic ultrasound


ULTRASONIC IMAGING 8, 165-180 (1986)

Michael F. Insana^1, Robert F. Wagner^1, Brian S. Garra^2, Reza Momenan^3, and Thomas H. Shawker

^1 Office of Science and Technology, Center for Devices and Radiological Health, FDA, Rockville, MD 20857

^2 Dept. of Diagnostic Radiology, National Institutes of Health, Bethesda, MD 20205

^3 Dept. of Electrical Eng. and Computer Science, George Washington University, Washington, DC 20052

Described is a supervised parametric approach to the detection and classification of disease from acoustic data. Statistical pattern recognition techniques are implemented to design the best ultrasonic tissue signature from a set of measurements and for a given task, and to rate its performance in a way that can be compared with other diagnostic tools. In this paper, we considered combinations of four ultrasonic tissue parameters to discriminate, in vivo, between normal liver and chronic active hepatitis. The separation between normal and diseased samples was made by application of the Bayes decision rule for minimum risk, which includes the prior probability for the presence of disease and the cost of misclassification. Large differences in classification performance of various tissue parameter combinations were demonstrated using the Hotelling trace criterion (HTC) and receiver operating characteristic (ROC) analysis. The ability of additional measurements to increase or decrease discriminability, even measurements from other diagnostic modalities, can be evaluated directly in this manner. © 1986 Academic Press, Inc.

Key words: Classification, discriminant analysis, hepatic disease, Hotelling trace criterion, pattern recognition, principal components, quantitative ultrasound, ROC analysis.

I. INTRODUCTION

The fundamental problem in diagnostic ultrasound is the detection and classification of target signals in noisy backgrounds. In a detection task, data is analyzed and, based on specified criteria, a decision is made between two classes, e.g., disease or no disease. In a classification task, the choice is among any number of possible classes, e.g., normal tissue types or disease conditions. Conventional B-scan ultrasonography is currently limited by considerable variation in observer performance of such tasks, particularly in the case of liver disease [1,2]. However, recent studies have shown that quantitative ultrasonography, i.e., acoustic and image parameter estimation, can improve the diagnostic performance of an ultrasound exam [2,3]. In quantitative ultrasound, parameters are estimated from the acoustic data to form a tissue signature that, in an ideal implementation, would uniquely define the tissue type and state of health. Past efforts to quantitatively differentiate between normal and diseased tissues using single ultrasound parameters have not produced a consistent diagnostic tool, suggesting that a combination of measurement parameters or features may be required to uniquely identify each tissue type encountered.

0161-7346/86 $3.00. Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.

Obviously, the best features for classifying tissues have a large variability between or among the different tissue types and a small variability within a tissue type. Likely candidates are parameters that describe physical properties of the tissues, such as ultrasonic attenuation, speed of sound, and scattering. These have demonstrated potential clinical effectiveness beyond that of conventional ultrasonography by revealing quantitative information that may be hidden or otherwise unavailable to the viewer of the image.

Recently, a number of papers have appeared that describe multiparameter methods for characterizing liver and breast tissue directly from purely statistical features of the B-scan texture [2,4-8]. The general approach has been to first measure a large number of statistical properties or features for a region of interest, typically between forty [2] and several hundred [4], for a population of patients with known disease states. The multivariate methods of statistical pattern recognition have been used to reduce the feature space by selecting the least correlated features which best discriminate among the various disease classes of the population. Invariably, these investigations have revealed that fewer than ten image features were required to accurately classify the tissue states of interest, suggesting that the intrinsic dimensionality of the tissue characterization problem (at least for liver and breast tissue) is probably between three and ten. Examples of image features that were found to be significant for classifying disease states are gray level mean, and measures of correlation, entropy and skewness. Unfortunately, this approach does not readily yield a physical interpretation of the statistical features selected.
We have recently described a method of obtaining a tissue signature from the statistical properties of the acoustic speckle that describes the structure of the organ scanned. This work is an extension of our investigations into the statistics of the radiofrequency (rf) echo signal, its envelope (B-scan or magnitude signal), and the squared envelope (intensity signal) for simple [9] and more complex [10-12] scattering media found in diagnostic imaging. Three parameters were derived for use in grading subtle changes in image texture [13] and for machine detection of low-contrast lesions [11]. In this paper we describe methods of optimizing performance and automating the use of a multidimensional diagnostic tissue signature. The goal of our analysis is to determine the smallest number of physically-based tissue parameters required to provide the physician with objective criteria to detect and classify the presence of disease.

II. METHODS

1. Describing tissue structure from image texture

Histological studies have shown that tissue scatterers vary in size and shape, and that the different structures have varying degrees of spatial order. The simplest biological scattering medium is unclotted blood, which is completely disordered, consisting of randomly distributed Rayleigh scatterers. At the other extreme is the very complex anisotropic structure of skeletal muscle tissue. This tissue is highly ordered, with nearly periodic scatterers that repeat over a long range. The organization of scattering structures for most biological media falls somewhere in between blood and skeletal muscle. In this analysis, soft tissues are represented as an acoustically uniform medium in which sets of identical discrete scatterers of three classes are positioned (Fig. 1).

Fig. 1. Schematic diagram of our model of a soft tissue scattering medium. [Diagram labels: Class I, random ($I_d$); Class II, long-range order ($\bar{I}_s$, var($I_s$)); Class III, short-range order, e.g., blood vessels ($I_s$).]

Class I consists of small randomly positioned scatterers of sufficient concentration to give an echo signal with circular Gaussian statistics [14], i.e., the echo signal is a complex, zero-mean Gaussian random variable in which the real and imaginary parts have equal variance. Approximately seven to ten scatterers per resolution cell are sufficient to claim Gaussian statistics to second order. A histogram of the gray-scale pixel values for the B-scan (magnitude) data will follow a Rayleigh probability density function (pdf) and the point signal-to-noise ratio (SNR$_0$), i.e., the ratio of mean to standard deviation, will be 1.91 [9,1?]. The histogram of pixels in the squared B-scan (intensity) image is distributed exponentially with an SNR of 1.0. The medium is entirely characterized by first-order statistics such as the average incoherent backscattered intensity from this diffuse tissue component, $I_d$.

Class II consists of small but nonrandomly distributed tissue scatterers with regular (quasi-periodic) and long-range order. This class of scatterer contributes a coherent or specular component to the echo signal that is spatially varying with mean $\bar{I}_s$ and variance var($I_s$). If both classes of scatterers, I and II, are present and if the dimension of the regular structure is well below the resolution of the imaging system, then var($I_s$) is zero and the SNRs for magnitude and intensity are larger than if only class I scatterers are present. The nature of the image texture is now more dependent on the properties of the medium. The medium can be described by two first-order statistical quantities, the average backscattered intensities $I_d$ and $\bar{I}_s$, and the statistics of the speckle are described by a Rician pdf [10,11].

If the quasi-periodic class II scatterers are resolved, then an average scatterer spacing $\bar{d}$ can be estimated and var($I_s$) is greater than zero. A modified Rician pdf, generalized to include a spatially structured specular signal, describes the statistics of the speckle and has been derived in [10,11]. Examples of specular scatterers in tissues with resolvable long-range order are the portal triads in liver parenchyma and the collagenous sheaths that surround muscle fascicles [16,17]. The parameters $\bar{d}$ and var($I_s$) are calculated from the second-order statistics of the intensity data and, when resolvable structures are present, are necessary to uniquely identify the medium. This topic is further discussed in section IV.
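The class I statistics quoted above can be checked numerically. The following is a minimal sketch, assuming each pixel's echo is an independent zero-mean circular Gaussian phasor with unit variance per quadrature; the function name `speckle_snr`, the sample count and the seed are illustrative choices, not part of the original work.

```python
import math
import random


def speckle_snr(n_pixels=200_000, seed=0):
    """Simulate fully developed (class I) speckle; return (magnitude SNR, intensity SNR).

    Assumption: each pixel is a zero-mean circular Gaussian phasor whose real
    and imaginary parts are independent with equal variance.
    SNR is the ratio of the mean to the standard deviation.
    """
    rng = random.Random(seed)
    magnitudes, intensities = [], []
    for _ in range(n_pixels):
        re, im = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        intensity = re * re + im * im          # squared envelope
        intensities.append(intensity)
        magnitudes.append(math.sqrt(intensity))  # envelope (B-scan) value

    def snr(values):
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
        return mean / math.sqrt(var)

    return snr(magnitudes), snr(intensities)


snr_magnitude, snr_intensity = speckle_snr()
# Closed-form Rayleigh value of mean/std, for comparison with the simulation
rayleigh_snr = math.sqrt(math.pi / (4.0 - math.pi))
```

The simulated magnitude SNR should sit close to the Rayleigh value $\sqrt{\pi/(4-\pi)} \approx 1.91$, and the intensity SNR close to 1.0, which is the exponential-pdf result.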

Class III scatterers are nonrandom and specular, but with short-range order, such as organ surfaces and blood vessels. These short-ranged specular structures produce deterministic signals that must be eliminated from the data if tissue homogeneity is to be expected. A simple matched filter technique to automatically identify suspected blood vessels from the image data has been reported [12].

Three scattering features that describe the tissue structure are measured from the autocorrelation and power spectrum, i.e., second-order statistical properties, of the squared B-scan signal. These are $\bar{d}$, $r = \bar{I}_s/I_d$ and $\sigma'_s = \mathrm{var}^{1/2}(I_s)/I_d$, and have been described in detail previously [12]. To this three-dimensional feature space we add a fourth feature, the slope of the ultrasonic attenuation coefficient with frequency, $\alpha_0$, having the units dB/cm-MHz. This quantity is measured from the rf signals using a spectral difference method originally described by Kuc [18], with modifications to eliminate the effects of acoustic beam diffraction [19].

The ratio $r$ is an indication of the relative scattering intensities of class I and class II scatterers. Like the attenuation coefficient, $r$ is a first-order statistical measure. The parameters $\bar{d}$ and $\sigma'_s$, on the other hand, are principally second-order tissue features. They describe the structure of the scattering tissue, in particular the average characteristic scale and magnitude of specular scatterers.

2. Data collection and processing

In this work, rf echo signals are recorded directly from a specially modified Diasonics mechanical sector scanner (Diasonics model DS-20). A 3.5 MHz / 19 mm diameter transducer, focused at 8 cm, was used. The spectrum of the pulse is approximately Gaussian-shaped with a FWHM bandwidth of 0.88 MHz, corresponding to a FWHM pulse length of 0.6 mm. An operator scanning the patient and viewing the real-time B-scan display monitor selects a region of interest (ROI) for analysis. Logarithmic amplification is then disabled, and the rf signals for that ROI are digitized at 8 bits and at a rate of 22.1 MHz. Before recording the signals, the operator has the opportunity to adjust the (linear) depth-gain compensator to visually obtain a constant overall image brightness. The four parameters are calculated off-line [20] from the average power spectrum for each ROI and the results are averaged. In this study, four to six ROIs were collected. The ROIs were approximately 4 cm in depth and 2 cm in width (~20 data vectors) and were recorded during suspended respiration.

An intensity signal is calculated for each data vector recorded by calculating the squared modulus of the analytic signal [14]. In a preprocessing step, blood vessels are eliminated from each data vector when detected by matched filtering and the result is detrended to eliminate low frequency variance due to incomplete depth-gain compensation. The data vectors are multiplied by a cosine taper window [21] and zeros are added to interpolate the spectrum and obtain vector lengths which are a power of two for FFT analysis. The power spectrum is calculated for each vector in the ROI and averaged.
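The processing chain just described can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' off-line code [20]: the detrending is taken to be a straight-line fit, the taper is a raised-cosine ramp over 10 percent of each end, the matched-filter vessel removal is omitted, and all function names (`analytic_signal`, `cosine_taper`, `roi_power_spectrum`) are hypothetical.

```python
import numpy as np


def analytic_signal(rf):
    """Analytic signal via the FFT: zero the negative frequencies, double the positive ones."""
    n = rf.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(rf) * weights)


def cosine_taper(n, fraction=0.1):
    """Window equal to 1 in the middle with raised-cosine ramps at both ends (assumed shape)."""
    window = np.ones(n)
    ramp = int(fraction * n)
    if ramp > 0:
        k = np.arange(ramp)
        edge = 0.5 * (1.0 - np.cos(np.pi * (k + 0.5) / ramp))
        window[:ramp] = edge
        window[-ramp:] = edge[::-1]
    return window


def roi_power_spectrum(rf_vectors):
    """Average power spectrum of the intensity signal over the data vectors of one ROI."""
    rf_vectors = np.asarray(rf_vectors, dtype=float)
    n = rf_vectors.shape[1]
    n_fft = 1 << (n - 1).bit_length()            # next power of two >= n (zero padding)
    depth = np.arange(n)
    spectra = []
    for rf in rf_vectors:
        intensity = np.abs(analytic_signal(rf)) ** 2     # squared modulus
        slope, offset = np.polyfit(depth, intensity, 1)  # linear trend (assumption)
        detrended = intensity - (slope * depth + offset)
        tapered = detrended * cosine_taper(n)
        spectra.append(np.abs(np.fft.rfft(tapered, n=n_fft)) ** 2)
    return np.mean(spectra, axis=0)


# Synthetic stand-in for one ROI: 20 data vectors of 1000 samples each
rng = np.random.default_rng(0)
demo_rf = rng.standard_normal((20, 1000))
demo_spectrum = roi_power_spectrum(demo_rf)
```

Passing `n=n_fft` to `np.fft.rfft` performs the zero padding, so the 1000-sample vectors above are transformed at length 1024.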

III. DETECTION AND CLASSIFICATION OF DIFFUSE LIVER DISEASE

The specific clinical objective in this paper is to discriminate with the smallest error between two classes of ultrasound data: 31 studies from individuals with no known clinical evidence of liver disease and 48 studies from patients with clinically-proven chronic hepatitis. This data is plotted in figure 2a for two of the four possible dimensions. Initially we apply principal components analysis to this training data set to study the statistical properties of each class of data in 4-space. Then, a discriminant function is calculated to partition the feature space into the two classes. By varying the decision threshold of the discriminant function, an ROC (Receiver Operating Characteristic) curve can be generated to compare the diagnostic performance of features individually and in combinations. Features are selected for a given diagnostic task based on a scalar index of performance, $A_z$, which is the area under the ROC curve. A large $A_z$ indicates good diagnostic performance. The ROC results are compared with another scalar performance measure, the Hotelling trace criterion.

Fig. 2. (a) Two-dimensional scatter diagrams of $\sigma'_s$ vs. $\bar{d}$ for liver data from 31 normal subjects (○) and 48 patients with chronic hepatitis (△). (b) Scatter diagrams of the first two principal components calculated for all four dimensions of the data population in figure 2a. The corresponding eigenvalues for $y_1$ and $y_2$ represent 71 percent of the total variance in the data. Discriminability between the two classes for the original and rotated feature spaces remained essentially unchanged. [Horizontal axes: $\bar{d}$ (mm) in panel (a), $y_1$ in panel (b).]

1. Principal components analysis

An efficient and effective tissue characterization program is one which reduces the dimension of its feature space to include only those features that contribute toward detection and classification of disease states. The feature space, of course, is defined by the number and range of parameters used in the tissue signature. Ideally, we would like each feature to contain uncorrelated information about the tissues. Several highly correlated features may provide good classification performance while conveying essentially the same information due to the high degree of correlation. Principal components (PC) analysis is a tool for studying correlations between features and establishing confidence intervals for predicting where in feature space the members of a class may be expected to fall.

The first principal component of a multidimensional observation is the linear combination of observed features that accounts for the largest fraction of the total feature variance [22]. It can also be described as a vector in the direction of the best least squares line through the data. The second principal component is orthogonal to the first and is the linear combination that accounts for the next largest fraction of variance, and so on. Each is found from the covariance matrix of the sample populations. PC analysis determines sets of features that are optimal for representing the data. They are not optimal for distinguishing among the classes of data. This can be seen by plotting the four-dimensional patient data in the plane of the first two principal components (Fig. 2b). We have used PC analysis to graphically visualize the degree of correlation between features.

To begin, the tissue signature is defined by the feature vector $\mathbf{x} = (x_1, x_2, x_3, x_4)^T$, where $x_1 = \bar{d}$, $x_2 = r$, $x_3 = \sigma'_s$, and $x_4 = \alpha_0$. The data are assumed to be normally distributed in each dimension, i.e.,

$$p(\mathbf{x}) = (2\pi)^{-n/2}\,|\boldsymbol{\Sigma}|^{-1/2}\exp\!\left[-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})\right] \tag{1}$$

where $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are the mean vector and covariance matrix of the parent population and $n$ is the number of features. The sample mean is $\bar{\mathbf{x}} = \langle \mathbf{x} \rangle$ and the sample covariance matrix is given by

$$\mathbf{S} = \langle (\mathbf{x}-\bar{\mathbf{x}})(\mathbf{x}-\bar{\mathbf{x}})^{T} \rangle = [\sigma_{ij}^{2}] \tag{2}$$

where $\sigma_{ij}^{2} = \langle (x_i-\bar{x}_i)(x_j-\bar{x}_j) \rangle$ and $\langle\ \rangle$ denotes expected value. By convention, $\mathbf{x}$ is a column vector and $(\mathbf{x}-\bar{\mathbf{x}})^{T}$, a row vector, is its transpose.

The first principal component of the set of observations is a linear combination of the measurement features

$$y_1 = \mathbf{a}_1^{T}\mathbf{x} = a_{11}x_1 + a_{21}x_2 + a_{31}x_3 + a_{41}x_4 \tag{3}$$

The coefficient vector $\mathbf{a}_1$ is an eigenvector of the covariance matrix. That is, $\mathbf{a}_1$ satisfies the equation

$$(\mathbf{S} - \lambda_1 \mathbf{I})\,\mathbf{a}_1 = 0 \tag{4}$$

The quantity $\lambda_1$ is the greatest eigenvalue that satisfies the determinant equation $|\lambda \mathbf{I} - \mathbf{S}| = 0$. The subscript 1 denotes the first principal component and here $\mathbf{I}$ is the identity matrix. The vector $\mathbf{a}_1$ is normalized so that $\mathbf{a}_1^{T}\mathbf{a}_1 = 1$.

The first principal component is interpreted geometrically as the axis in the direction of greatest scatter in the measurement data. The coefficients of $\mathbf{a}_1$ are in fact the directional cosines that determine the orientation of the first principal axis relative to the axes in the original space,

$$a_{11} = \cos\theta_1, \quad a_{21} = \cos\theta_2, \ \ldots \tag{5}$$

Figure 3 shows the class of normal data represented in two dimensions and the corresponding two principal axes, $y_1$ and $y_2$. Along the major axis, $y_1$, the elliptical boundary marks the contour of $\pm 1.5\sigma$ (standard deviations). If the two roots $\lambda_1$ and $\lambda_2$ are equal the ellipse is circular. The variance in the direction of the first principal axis is given by [22]

[equation illegible in source]   (6)

where $m$ is the number of patients in the class. The contour in figure 3 represents the 87 percent ($1.5\sigma$) confidence interval for the normal data population in this two-space and was selected arbitrarily. The center of the cluster is given by the mean vector, $\bar{\mathbf{x}}$.

Fig. 3. Normal liver data plotted in two of the four original measurement dimensions. $\theta_1$ and $\theta_2$ are the angles between the first principal axis $y_1$ and the original axes, $\bar{d}$ and $\sigma'_s$, respectively. The ellipse is the constant density contour containing 87 percent of the probability mass for data that is normally distributed. [Horizontal axis: $\bar{d}$ (mm).]
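The eigenvector construction above maps directly onto a symmetric eigen-decomposition of the sample covariance matrix. The sketch below is illustrative only: the data are synthetic stand-ins (not the liver measurements), and `principal_components` is a hypothetical helper name.

```python
import numpy as np


def principal_components(samples):
    """Eigen-decomposition of the sample covariance matrix S.

    samples: array of shape (m, n), one row per patient, one column per feature.
    Returns (eigenvalues, eigenvectors) sorted by decreasing eigenvalue;
    column k of `eigenvectors` is the unit-length coefficient vector a_k.
    """
    samples = np.asarray(samples, dtype=float)
    cov = np.cov(samples, rowvar=False)       # (n, n) sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


# Synthetic four-feature data with two correlated pairs (illustrative values only)
rng = np.random.default_rng(1)
true_cov = np.array([[1.0, 0.6, 0.0, 0.0],
                     [0.6, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 0.5, 0.1],
                     [0.0, 0.0, 0.1, 0.2]])
data = rng.multivariate_normal(np.zeros(4), true_cov, size=500)

lam, a = principal_components(data)
scores = (data - data.mean(axis=0)) @ a       # y_k = a_k^T (x - mean) for every sample
explained = lam[:2].sum() / lam.sum()         # fraction of variance carried by y1 and y2
direction_cosines = a[:, 0]                   # cos(theta_i) of the first principal axis
```

The sample variance of each column of `scores` equals the corresponding eigenvalue, and the columns are mutually uncorrelated, which is the property used when correlated features are reduced to a smaller uncorrelated set.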

The principal axes in figure 3 are nearly parallel to the original feature axes, showing that the two features are weakly correlated. (The correlation coefficient, $\rho$, between $\bar{d}$ and $\sigma'_s$ was found to be 0.2.) Often the number of representative samples in the training sets is small and several of the features may be partially or entirely correlated; in this case, the covariance matrix may be ill-conditioned or singular. One method of improving the condition of the covariance matrix is to increase the size of the training set. Another method is to reduce the order of the matrix by combining or eliminating correlated features. Since the principal components are uncorrelated linear combinations of the original features, features may be eliminated by using only those principal components $y_i$ with the greatest eigenvalues $\lambda_i$. This way a large number of correlated features can be reduced to a smaller number of uncorrelated features. Usually feature reduction is then performed on the uncorrelated features to avoid redundant information. We have used PC analysis to identify correlated features but have chosen to retain the unrotated feature space to allow for a more direct physical interpretation of the results.

Previously [12] we proposed the parameter $v$ = var($I_s$)/... [denominator illegible in source] instead of $\sigma'_s = \mathrm{var}^{1/2}(I_s)/I_d$. We replaced $v$ with $\sigma'_s$ when it was discovered that $v$ and $r$ were highly correlated ($\rho$ = 0.7) whereas $\sigma'_s$ and $r$ were less correlated ($\rho$ < 0.1).

2. Quadratic and linear discriminant functions

A discriminant function is a rule for classifying a multivariate observation into one of several classes [23,24]. The decision rules or discriminants are developed from a training set by what are called supervised methods. A training set of data is a set of observations with known class membership. The form of the decision rule depends on characteristics of the distribution of $\mathbf{x}$, which we have assumed to be multivariate normal. The two classes in our application are $\omega_1$ (normal liver tissue) and $\omega_2$ (chronic active hepatitis).

Fig. 4. (a) Normal (○) and chronic hepatitis (△) liver data in two dimensions. The line is the decision boundary, Eq. (7), for $\ell' = 0$, i.e., equal prior probabilities and misclassification costs. The accuracy at this threshold is 83 percent. (b) Same as figure 4a, except that the data has been replaced by constant density contours. [Horizontal axes: $\bar{d}$ (mm).]

We have used the standard Bayes classifier for two classes with different mean vectors $\bar{\mathbf{x}}_1$ and $\bar{\mathbf{x}}_2$ and different covariance matrices $\mathbf{S}_1$ and $\mathbf{S}_2$. This rule is quadratic in the data $\mathbf{x}$ and has the form

$$\tfrac{1}{2}(\mathbf{x}-\bar{\mathbf{x}}_1)^{T}\mathbf{S}_1^{-1}(\mathbf{x}-\bar{\mathbf{x}}_1) - \tfrac{1}{2}(\mathbf{x}-\bar{\mathbf{x}}_2)^{T}\mathbf{S}_2^{-1}(\mathbf{x}-\bar{\mathbf{x}}_2) + \tfrac{1}{2}\ln\frac{|\mathbf{S}_1|}{|\mathbf{S}_2|} \le \ell' \tag{7}$$

Eq.(7) is a decision rule that assigns the measurement $\mathbf{x}$ to class $\omega_1$ if true, and otherwise assigns $\mathbf{x}$ to $\omega_2$. $P(\omega_1)$ and $P(\omega_2)$ are the prior probabilities that the patient is normal and has hepatitis, respectively, where $P(\omega_1) + P(\omega_2) = 1$. The term $\ell'$ is the threshold value for the decision rule and is equal to

$$\ell' = \ln\left[\frac{(c_{21}-c_{11})\,P(\omega_1)}{(c_{12}-c_{22})\,P(\omega_2)}\right],$$

where $c_{ij}$ is the cost of misclassifying responses from class $j$ as those of class $i$. Eq.(7) is known as the Bayes decision rule for minimum risk since it was derived to minimize the expected cost of misclassifying data. As we shall see later, the performance of the classifier is studied for a range of $\ell'$ values. Therefore precise estimates of disease prevalence and misclassification costs are not critical. The normal and hepatitis data are plotted in two of the four dimensions in figure 4. The line is the resulting quadratic decision function of Eq.(7) for equal prior probabilities and costs, i.e., the right side of Eq.(7) is zero.

Eq.(7) can be reduced to a linear function if we assume that the two classes have the same covariance matrix. In general, the sample covariance matrices will be different and therefore an average covariance matrix is used: $\mathbf{S} = P(\omega_1)\mathbf{S}_1 + P(\omega_2)\mathbf{S}_2$. The linear Bayes classifier is expressed as

$$(\bar{\mathbf{x}}_2-\bar{\mathbf{x}}_1)^{T}\mathbf{S}^{-1}\mathbf{x} - \tfrac{1}{2}(\bar{\mathbf{x}}_2-\bar{\mathbf{x}}_1)^{T}\mathbf{S}^{-1}(\bar{\mathbf{x}}_1+\bar{\mathbf{x}}_2) \le \ell' \tag{8}$$
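The quadratic rule of Eq. (7), its threshold, and the linear reduction of Eq. (8) can be sketched as follows. This is a minimal illustration, assuming class 1 is "normal" and class 2 is "hepatitis"; the function names and the default costs ($c_{12} = c_{21} = 1$, $c_{11} = c_{22} = 0$) are assumptions made for the example.

```python
import numpy as np


def quadratic_discriminant(x, mean1, cov1, mean2, cov2):
    """Left side of Eq. (7); x is assigned to class 1 when the value is <= the threshold."""
    d1 = x - mean1
    d2 = x - mean2
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    return (0.5 * d1 @ np.linalg.solve(cov1, d1)
            - 0.5 * d2 @ np.linalg.solve(cov2, d2)
            + 0.5 * (logdet1 - logdet2))


def linear_discriminant(x, mean1, mean2, pooled_cov):
    """Left side of Eq. (8), using one pooled covariance matrix for both classes."""
    m = np.linalg.solve(pooled_cov, mean2 - mean1)   # prewhitened difference of means
    return m @ x - 0.5 * m @ (mean1 + mean2)


def bayes_threshold(p1, p2, c12=1.0, c21=1.0, c11=0.0, c22=0.0):
    """Threshold for minimum risk; c_ij is the cost of calling class i when class j is true."""
    return float(np.log(((c21 - c11) * p1) / ((c12 - c22) * p2)))


# Illustrative two-feature example (values are made up)
mean_normal = np.array([1.0, 0.5])
mean_disease = np.array([1.4, 0.8])
cov_common = np.array([[0.04, 0.01],
                       [0.01, 0.02]])
x_new = np.array([1.1, 0.55])

value = linear_discriminant(x_new, mean_normal, mean_disease, cov_common)
assigned_class = 1 if value <= bayes_threshold(0.5, 0.5) else 2
```

When both classes share the same covariance matrix the two functions return the same value, which is the reduction from Eq. (7) to Eq. (8).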

Eq.(8) says to average the noise in the two classes and use the combined covariance matrix to prewhiten the data. The first term in Eq.(8) is the linear discriminant function, the second term is the balance point half way between the means, $\tfrac{1}{2}(\bar{\mathbf{x}}_1 + \bar{\mathbf{x}}_2)$, and the right side is the threshold. If the discriminant function is less than the balance point plus the threshold, the measurement is assigned to class $\omega_1$. It may be instructive to view the vector $\mathbf{m} = \mathbf{S}^{-1}(\bar{\mathbf{x}}_2 - \bar{\mathbf{x}}_1)$ as a prewhitening matched filter [25], in which case

$$\mathbf{m}^{T}\mathbf{x} - \tfrac{1}{2}\mathbf{m}^{T}(\bar{\mathbf{x}}_1+\bar{\mathbf{x}}_2) \le \ell' \tag{9}$$

More information on the likelihood function approach to statistical decision making can be found in [26-28].

In our experience with small patient numbers (currently less than 50 patients per class), the linear and quadratic Bayes classifiers have given approximately the same net discriminability.

3. Feature selection

Features are selected based on diagnostic performance. The performance of a classifier can be evaluated for any combination of measurement features by testing data points for condition $\omega_2$ (positive for hepatitis) and measuring the true positive fraction (TPF) and false positive fraction (FPF). One possible summary measure is the diagnostic accuracy [29]. This requires, in addition, the prior probability $P(\omega_2)$ and is defined by the equation

$$A = (\mathrm{TPF})\,P(\omega_2) + (1-\mathrm{FPF})\,(1-P(\omega_2)) \tag{10}$$

$A$ is the accuracy of diagnosis at a fixed decision threshold. The true positive fraction (TPF) is the sensitivity of the test and the true negative fraction (TNF = 1 - FPF) is the specificity of the test.
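Eq. (10) is simple enough to evaluate directly. The short sketch below uses made-up operating-point values to show how the same sensitivity and specificity give different accuracies at different prevalences; `diagnostic_accuracy` is a hypothetical helper name.

```python
def diagnostic_accuracy(tpf, fpf, prevalence):
    """Eq. (10): accuracy at one operating point, given disease prevalence P(w2)."""
    return tpf * prevalence + (1.0 - fpf) * (1.0 - prevalence)


# Same operating point (TPF = 0.9, FPF = 0.3) evaluated at two prevalences
accuracy_balanced = diagnostic_accuracy(tpf=0.9, fpf=0.3, prevalence=0.5)
accuracy_rare = diagnostic_accuracy(tpf=0.9, fpf=0.3, prevalence=0.1)
```

With these illustrative numbers the accuracy is 0.80 at 50 percent prevalence and 0.72 at 10 percent prevalence, although the classifier itself is unchanged.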

If $A$ is measured using the same set of data that was used to train the classifier, obviously the results will be biased. Using another set of data with known clinical findings is a better test, but one that is not always available. As a compromise, we have chosen the "round robin" approach to performance estimation suggested by Castleman [30] when labeled data is at a premium. In this method, one patient point is withheld from the calculation of the decision rule. The withheld point is then classified from the result and the score is kept. When this is done iteratively for each point in both classes, the overall performance of the set of measurement parameters is estimated for a set of decision thresholds. For example, the accuracy of the classifier shown in figure 4 is 83 percent. The decision boundary curve in figure 4 results when Eq.(7) is an equality and $\ell' = 0$, i.e., equal prior probabilities and misclassification costs.

Accuracy depends on disease prevalence $P(\omega_2)$ [31,32]. If, for example, 90 percent of all patients tested do not have disease, i.e., $P(\omega_2) = 0.10$, and the decision, regardless of the data, was always negative, then this highly biased test would be 90 percent accurate!

4. ROC analysis

The performance of the tissue parameters may be studied by measuring the accuracy over the range of possible decision thresholds. This can be accomplished by varying the right side of the decision function, Eqs. (7), (8), or (9). The actual level of the decision threshold will depend on the task and the penalties for a wrong answer. In one screening procedure, for example, where the prior probability for disease may be small, the false positive fraction should be kept small and therefore a strict threshold is used [33]. When testing a high risk population, one may wish to relax the strict threshold and accept a higher false positive fraction to be sure of detecting as many of the positive cases as possible.

Varying the right side of the decision function will generate a family of (TPF, FPF) pairs which span the range of operating thresholds, and ROC (Receiver Operating Characteristic) curves are a convenient way of displaying these results. ROC curves for three combinations of the four ultrasonic features are given in figure 5. These curves are plotted using a probability scale that is linear in the normal deviate [29]. With this scale, the ROC curve is generally a straight line [43]. A single scalar index of performance is the area under the ROC curve, $A_z$, evaluated when the data is plotted on a linear (not probability) scale. Values for $A_z$ range between 0.5 (guessing) and 1.0 (perfect discrimination) and may be described as the sensitivity of the tissue signature averaged over all specificities [32]. $A_z$ values [34] are plotted in figure 6 for all possible combinations of the four measurement features, ($\bar{d}$, $r$, $\sigma'_s$, $\alpha_0$).

Fig. 5. ROC curves for (a) all four ultrasound parameters considered, ($\bar{d}$, $r$, $\sigma'_s$, $\alpha_0$), and the pairs (b) ($\bar{d}$, $\sigma'_s$) and (c) ($r$, $\alpha_0$). $A_z$ is the area under the ROC curve (on a linear scale [29]) and has the range $0.5 \le A_z \le 1.0$. [Axes: normal deviate, $z$(FP), with the corresponding probability scale from .01 to .99.]
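The round-robin scoring and the threshold sweep can be combined in one short sketch. It makes several assumptions that should be read as illustrative: the linear rule of Eq. (8) is used as the classifier, priors are taken as the class fractions of the training points, the data are synthetic, and the area is the empirical trapezoidal area rather than the fitted binormal $A_z$ reported in the paper. All function names are hypothetical.

```python
import numpy as np


def leave_one_out_scores(class1, class2):
    """Round-robin scores: each point is scored by a linear rule trained without it."""
    class1 = np.asarray(class1, dtype=float)
    class2 = np.asarray(class2, dtype=float)
    data = np.vstack([class1, class2])
    labels = np.concatenate([np.zeros(len(class1), dtype=int),
                             np.ones(len(class2), dtype=int)])
    index = np.arange(len(data))
    scores = np.empty(len(data))
    for i in index:
        keep = index != i                       # withhold one patient point
        x1 = data[keep & (labels == 0)]
        x2 = data[keep & (labels == 1)]
        p1 = len(x1) / (len(x1) + len(x2))      # prior taken as class fraction (assumption)
        pooled = (p1 * np.atleast_2d(np.cov(x1, rowvar=False))
                  + (1.0 - p1) * np.atleast_2d(np.cov(x2, rowvar=False)))
        m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
        w = np.linalg.solve(pooled, m2 - m1)
        scores[i] = w @ data[i] - 0.5 * w @ (m1 + m2)
    return scores, labels


def roc_points(scores, labels):
    """(FPF, TPF) pairs from sweeping the threshold; 'positive' means score > threshold."""
    thresholds = np.concatenate([[np.inf], np.sort(scores)[::-1], [-np.inf]])
    positives = scores[labels == 1]
    negatives = scores[labels == 0]
    tpf = np.array([(positives > t).mean() for t in thresholds])
    fpf = np.array([(negatives > t).mean() for t in thresholds])
    return fpf, tpf


def roc_area(fpf, tpf):
    """Empirical area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))


# Synthetic two-feature classes (not patient data)
rng = np.random.default_rng(2)
normals = rng.normal(0.0, 1.0, size=(31, 2))
diseased = rng.normal(2.0, 1.0, size=(48, 2))
scores, labels = leave_one_out_scores(normals, diseased)
fpf, tpf = roc_points(scores, labels)
area = roc_area(fpf, tpf)
```

Because every point is scored by a rule that never saw it, the resulting curve avoids the optimistic bias of testing on the training set.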

5. Hotelling trace criterion

Another measure of overall diagnostic performance is the Hotelling trace criterion (HTC) [35-37]. The HTC is expressed by the scalar quantity

$$J = \mathrm{tr}\{\mathbf{S}_w^{-1}\mathbf{S}_b\} \tag{11}$$

where $\mathbf{S}_w$ is the within-class scatter matrix, given by

$$\mathbf{S}_w = \sum_{i=1}^{L} P(\omega_i)\,\mathbf{S}_i,$$

$\mathbf{S}_b$ is the between-class scatter matrix, given by

$$\mathbf{S}_b = \sum_{i=1}^{L} P(\omega_i)\,(\bar{\mathbf{x}}_i-\bar{\mathbf{x}}_0)(\bar{\mathbf{x}}_i-\bar{\mathbf{x}}_0)^{T},$$

and $\bar{\mathbf{x}}_0$ is the expected vector of the mixture of all $L$ classes, i.e., the mean of the means. The operator tr{ } is the trace of the matrix within the brackets.

Fig. 6. Two summary measures of the performance of the four ultrasound parameters considered to discriminate between normals and patients with chronic hepatitis: the Hotelling trace criterion, $J$, and the area under the ROC curve, $A_z$. The error bar on the $A_z$ measurement indicates one standard deviation. Note: $a \equiv \alpha_0$, $\sigma \equiv \sigma'_s$, and $d \equiv \bar{d}$. [Horizontal axis: feature combinations grouped as 1 feature ($a$; $r$; $\sigma$; $d$), 2 features ($\sigma,a$; $r,a$; $d,a$; $r,\sigma$; $d,\sigma$; $d,r$), 3 features ($r,\sigma,a$; $d,\sigma,a$; $d,r,a$; $d,r,\sigma$), and 4 features ($d,r,\sigma,a$).]

The HTC may be considered as a generalized SNR$^2$ to be used in selecting the best features for classification. $J$ will be large when the difference in the class means is large as compared to the within-class variability in the data. Therefore, features may be selected by maximizing $J$. One criterion for an economical reduction in dimensionality involves finding a subspace $k$, where $k \le n$ and $n$ is the total number of features, such that the sum of the eigenvalues of $\mathbf{S}_{wk}^{-1}\mathbf{S}_{bk}$ in the subspace is larger than the sum in any other $k$-dimensional subspace. Experimentally, this can be done simply by comparing $J$ values for all proposed tissue signatures. HTC is a simple method of evaluating the intrinsic separability of a tissue signature that is faster and easier to calculate than ROC curves, but provides less detailed information.
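Comparing $J$ across all candidate signatures, as described above, can be sketched as follows. The code assumes the scatter matrices are the prior-weighted sums commonly used with the HTC, with priors taken as class fractions when none are supplied; the data are synthetic and the helper names (`hotelling_trace`, `rank_feature_subsets`) are hypothetical.

```python
import itertools

import numpy as np


def hotelling_trace(classes, priors=None):
    """J = tr(Sw^-1 Sb) for a list of (m_i, n) sample arrays, one array per class."""
    classes = [np.asarray(c, dtype=float) for c in classes]
    if priors is None:
        total = sum(len(c) for c in classes)
        priors = [len(c) / total for c in classes]   # assumption: priors = class fractions
    means = [c.mean(axis=0) for c in classes]
    grand_mean = sum(p * mu for p, mu in zip(priors, means))
    n = classes[0].shape[1]
    s_w = np.zeros((n, n))
    s_b = np.zeros((n, n))
    for p, c, mu in zip(priors, classes, means):
        s_w += p * np.atleast_2d(np.cov(c, rowvar=False))   # within-class scatter
        diff = (mu - grand_mean)[:, None]
        s_b += p * (diff @ diff.T)                          # between-class scatter
    return float(np.trace(np.linalg.solve(s_w, s_b)))


def rank_feature_subsets(classes, names):
    """Evaluate J for every non-empty subset of features, largest J first."""
    results = []
    for k in range(1, len(names) + 1):
        for subset in itertools.combinations(range(len(names)), k):
            cols = list(subset)
            j = hotelling_trace([np.asarray(c)[:, cols] for c in classes])
            results.append((tuple(names[i] for i in cols), j))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Synthetic stand-in for two classes measured on four features
rng = np.random.default_rng(3)
class_a = rng.normal(0.0, 1.0, size=(40, 4))
class_b = rng.normal(0.8, 1.0, size=(50, 4))
ranking = rank_feature_subsets([class_a, class_b], ["d", "r", "sigma", "alpha"])
```

For four features this evaluates 15 subsets, the same set of combinations shown along the horizontal axis of figure 6.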

Unlike ROC analysis, HTC may be extended to the $L \ge 2$ class discrimination problem. The HTC is simply the multi-class generalization of the Hotelling $T^2$ statistic, and the Hotelling $T^2$ statistic is the multivariate generalization of the univariate Student $t$ test statistic [?]. The evolution of the $T^2$ statistic to the HTC is straightforward, but no such direct evolution to the multi-class discrimination problem appears imminent in applications of ROC analysis [29].

Fukunaga [23] and Barrett et al. [36] have shown that $J$ can be related to the minimum probability of error for classification, $\varepsilon$, by the inequality

$$\varepsilon \le [P(\omega_1)\,P(\omega_2)]^{1/2}\,\exp\!\left[-J/(8\,P(\omega_1)\,P(\omega_2))\right].$$

Values of $J$ are plotted in figure 6 for all possible combinations of the four measurement features.
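The inequality relating $J$ to the classification error can be evaluated numerically. The sketch below simply codes the bound as stated; the $J$ values and the priors (taken here as the 31/79 and 48/79 class fractions of the study population) are illustrative assumptions, and `error_bound` is a hypothetical name.

```python
import math


def error_bound(j, p1, p2):
    """Upper bound on the minimum probability of classification error for a given J."""
    return math.sqrt(p1 * p2) * math.exp(-j / (8.0 * p1 * p2))


# Priors taken as class fractions of the study population (assumption)
p_normal, p_hepatitis = 31 / 79, 48 / 79
bounds = {j: error_bound(j, p_normal, p_hepatitis) for j in (0.0, 0.5, 1.0, 2.0)}
```

At $J = 0$ with equal priors the bound is 0.5, and it decreases monotonically as $J$ grows, which is why a larger $J$ indicates a more separable signature.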

IV. DISCUSSION

We infer from the results of figure 6 that the overall performance of this diagnostic procedure is equivalently summarized by either the HTC index $J$ or by the binormal ROC area $A_z$. A fundamental relationship between $A_z$ and $J$ may be derived for the two-class case when the underlying data distributions are multivariate normal [44]. This relationship can be expressed in terms of the error integral

$A_z$ = [equation illegible in source]   (12)

The index $J$ has a range of 0 to $\infty$, corresponding to a range in $A_z$ of 0.5 to 1.0. We have compared our measurements of $A_z$ with those predicted by Eq. (12) in which measured $J$ values are used. The agreement between the measured and predicted values was very good, well within the one standard deviation error bar on $A_z$ (Fig. 6).
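For orientation, one textbook relation of this kind can be written down under explicit assumptions; it is offered as a sketch and is not a transcription of Eq. (12). Assume two multivariate normal classes sharing one covariance matrix, scored by the linear discriminant. The detectability index is then $d'^2 = (\bar{\mathbf{x}}_2-\bar{\mathbf{x}}_1)^T\mathbf{S}^{-1}(\bar{\mathbf{x}}_2-\bar{\mathbf{x}}_1)$, the prior-weighted two-class scatter matrices give $J = P(\omega_1)P(\omega_2)\,d'^2$, and the ROC area is $\tfrac{1}{2}[1+\mathrm{erf}(d'/2)]$. The function name `predicted_roc_area` is hypothetical.

```python
import math


def predicted_roc_area(j, p1=0.5, p2=0.5):
    """ROC area implied by J for two equal-covariance normal classes (assumed model).

    Uses d'^2 = J / (p1 * p2) and area = 0.5 * (1 + erf(d' / 2)).
    """
    d_prime = math.sqrt(j / (p1 * p2))
    return 0.5 * (1.0 + math.erf(d_prime / 2.0))


# With equal priors this reduces to 0.5 * (1 + erf(sqrt(J)))
areas = [predicted_roc_area(j) for j in (0.0, 0.25, 1.0, 4.0)]
```

Under these assumptions $J = 0$ maps to an area of 0.5 and the area approaches 1.0 as $J$ grows without bound, matching the ranges quoted in the text.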

The HTC is an overall measure of class separability that is well suited, and presently used [37], to design diagnostic systems. If one wishes to study and compare performance over a broad range of decision thresholds, then ROC analysis is ideal. To illustrate this point, consider the data in figure 6. If resources are limited to measuring one feature, the obvious choice for detecting chronic hepatitis is the average scatterer spacing, d. Including σ_s', r and α_0 in the tissue signature may increase overall detectability by an amount that may be compared with the added cost of making the additional measurements. Intuitively, one might expect the performance to improve, or at least remain unchanged, by making additional measurements on the tissue. However, as Devijver [45] points out, the overall detectability may decrease by including a correlated feature with marginal or no discriminability. This behavior was not observed in our data to any significant degree since the correlation coefficients were all less than 0.3. Now examine the ROC curves in figure 5 and note that curve (a) is for features (d, r, σ_s', α_0), curve (b) is for (d, σ_s'), and curve (c) is for (r, α_0). Including the features (r, α_0) in the signature has little effect at low false positive fractions but does add to the detectability at larger FPFs. This information would be missed with such summary measures as A_z or J. The ROC curve can be very important in the complete evaluation of diagnostic performance. As with any analysis, the results are highly task dependent. When studying other patient populations, the preliminary results have indicated that the other parameters can be the dominant discriminating features. This is the topic of a future report.
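The point that a single summary index can hide differences between ROC curves can be sketched with the standard binormal model, in which TPF = Φ(a + b Φ^-1(FPF)) and A_z = Φ(a / sqrt(1 + b^2)). The two parameter pairs below are hypothetical values chosen only so that both curves have the same area. They are not fitted to the data of figures 5 or 6.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def binormal_tpf(fpf: float, a: float, b: float) -> float:
    """True positive fraction of a binormal ROC curve at a given FPF (0 < FPF < 1)."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpf))


def binormal_area(a: float, b: float) -> float:
    """Area under a binormal ROC curve."""
    return _STD_NORMAL.cdf(a / math.sqrt(1.0 + b * b))


# Hypothetical curves with identical area but different shapes.
CURVE_X = (1.5, 1.0)
CURVE_Y = (1.5 * math.sqrt(1.25) / math.sqrt(2.0), 0.5)

if __name__ == "__main__":
    for name, (a, b) in (("X", CURVE_X), ("Y", CURVE_Y)):
        print(f"curve {name}: A_z={binormal_area(a, b):.4f}  "
              f"TPF(FPF=0.05)={binormal_tpf(0.05, a, b):.3f}  "
              f"TPF(FPF=0.50)={binormal_tpf(0.50, a, b):.3f}")
```

In this sketch the two curves cross: one is better at low false positive fractions and the other at high ones, although their areas agree. A comparison based on A_z or J alone would not show this.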

The performance of a classifier is also strongly affected by the ratio of the training set size, m, to the dimensionality of the feature space, n. It has been shown for the two-class problem that if the classifier is to yield meaningful generalizations beyond the data, the ratio m/n must be greater than two [38], and where possible on the order of twenty [39]. For the data considered in this paper, m/n is greater than or equal to twenty. Jurs [40] points out that classification results are a function of m/n and should be cautiously interpreted. In his example for m/n = 5, the probability is one-half that 77 percent of the members of the training set will be correctly classified as a result of chance alone. A less biased estimate of the performance may be found with increasing m/n. Given the economics of increasing the training set, the importance of optimizing the dimensionality becomes evident.
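The m/n > 2 threshold can be illustrated with Cover's counting result [38]: of the 2^m ways to label m points in general position in n dimensions, a linear threshold through the origin can realize 2 Σ_{k=0}^{n-1} C(m-1, k). The sketch below evaluates that fraction. The function name is hypothetical, and n is assumed to count the free weights of the discriminant (a bias term, if used, adds one to n).

```python
from math import comb


def separable_fraction(m: int, n: int) -> float:
    """Fraction of the 2**m labelings of m points in general position in
    n dimensions that a homogeneous linear threshold function can realize."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    # comb(m - 1, k) is zero for k > m - 1, so the sum is safe when n > m.
    count = 2 * sum(comb(m - 1, k) for k in range(n))
    return count / 2 ** m


if __name__ == "__main__":
    n = 4  # four-dimensional feature vector, as in the text
    for ratio in (1, 2, 5, 20):
        m = ratio * n
        print(f"m/n={ratio:2d}  fraction separable by chance={separable_fraction(m, n):.3g}")
```

At m/n = 2 exactly half of all random labelings are linearly separable, so perfect separation of the training set carries little information. The fraction falls quickly as m/n grows, which is consistent with the recommendation to work at ratios well above two.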


We searched for important characterization features by investigating the strong and weak points of human observers of textured images for detecting and classifying disease. Based on the work of Julesz [41], Burgess et al. [42], and our own simulations [13], it seems that human observers are very efficient at discriminating differences in first-order image properties, e.g., image brightness, while their efficiency is much lower for higher-order detection tasks. Therefore, image processing using second-order statistics, e.g., correlation properties of the image, might offer new information. This hypothesis is consistent with our data for normal-hepatitis discrimination. Hepatitis cannot be consistently diagnosed from B-scans, possibly because the viewer does not have full access to the second-order statistical information in the image. Using our quantitative analysis, the second-order features (d, σ_s') clearly outperform the first-order features (r, α_0). Psychophysical experiments are currently under way to better understand the role of second-order statistical properties in conventional diagnostic ultrasonography.

Using exclusively first-order statistical measures to characterize soft tissue often gives ambiguous results. For example, in normal liver tissues we have found that the variance in the B-scan image from resolvable class II scatterers (ordered structure) reduces the SNR from 1.9 to 1.7 or less, falsely indicating a non-Gaussian medium, i.e., few scatterers per resolution cell. Whenever class II scatterers are present in tissues, second-order statistical measures are needed to separate the specular variance (variance in the image that is due to regularly-spaced coherent scatterers) from the classical Rician variance to obtain a unique signature [12]. The importance of using features that are associated with physical variables is emphasized in this case.
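The SNR value of 1.9 quoted above is the mean-to-standard-deviation ratio of a Rayleigh-distributed envelope, sqrt(π / (4 − π)) ≈ 1.91, which applies to fully developed speckle. The sketch below checks this by simulation and then adds a toy periodic modulation of the scattering strength. The modulation is only an illustrative stand-in for extra structural variance. It is not the generalized Rician model of [12], and the `depth` and `period` parameters are hypothetical.

```python
import math
import random


def rayleigh_point_snr() -> float:
    """Analytic mean/std of a Rayleigh-distributed envelope (about 1.91)."""
    return math.sqrt(math.pi / (4.0 - math.pi))


def simulated_envelope_snr(n_samples: int, depth: float = 0.0,
                           period: int = 16, seed: int = 0) -> float:
    """Mean/std of a simulated envelope.

    Each sample is the magnitude of a complex Gaussian signal whose
    standard deviation is modulated by 1 + depth * cos(2*pi*i/period).
    depth = 0 gives pure Rayleigh statistics; 0 < depth < 1 adds variance.
    """
    rng = random.Random(seed)
    envelope = []
    for i in range(n_samples):
        scale = 1.0 + depth * math.cos(2.0 * math.pi * i / period)
        in_phase = rng.gauss(0.0, scale)
        quadrature = rng.gauss(0.0, scale)
        envelope.append(math.hypot(in_phase, quadrature))
    mean = sum(envelope) / n_samples
    variance = sum((e - mean) ** 2 for e in envelope) / (n_samples - 1)
    return mean / math.sqrt(variance)


if __name__ == "__main__":
    print(f"analytic Rayleigh SNR : {rayleigh_point_snr():.3f}")
    print(f"simulated, depth=0.0  : {simulated_envelope_snr(50_000):.3f}")
    print(f"simulated, depth=0.3  : {simulated_envelope_snr(50_000, depth=0.3):.3f}")
```

In this toy model any added variance lowers the first-order SNR below the Rayleigh value, so a first-order measure alone cannot tell whether a reduced SNR comes from few scatterers per resolution cell or from organized structure.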

V. CONCLUSIONS

Pattern recognition techniques are an effective way to design and evaluate multivariate tissue signatures. For designing a tissue signature, the HTC was shown to be a fast and simple method of selecting the best measurements for detection and classification based on maximizing object-class separability. ROC analysis may be used to evaluate the diagnostic performance of the tissue signature for comparison with other diagnostic tools. We have applied this formalism to diagnostic ultrasound to automatically discriminate between normal liver and chronic active hepatitis, in vivo. Tissue classification was based on a four-dimensional feature vector. Three of the four measurements are derived from the first- and second-order statistical properties of the acoustic data and describe structural and scattering properties of the organ. The fourth is an estimate of the ultrasonic attenuation. When evaluating this feature vector as a tissue signature, we found that the second-order statistical properties of the image provide diagnostic information that is not otherwise accessible to observers.

ACKNOWLEDGEMENTS

The authors gratefully acknowledge the sustained efforts of Mary Ann Russell at the National Institutes of Health in acquiring and managing the growing volume of patient data. We also wish to thank Charles E. Metz and colleagues at the University of Chicago for sharing with us programs for statistically analyzing ROC curves and Harry Barrett for many helpful discussions.

The mention of commercial products herein is not to be construed as either an actual or implied endorsement of such products by the Department of Health and Human Services.


REFERENCES

[1] Gosink, B.B., Lemon, S.K., Scheible, W., and Leopold, G.R., Accuracy of ultrasonography in diagnosis of hepatocellular disease, AJR 133, 19-23 (1979).

[2] Raeth, U., Schlaps, D., Limberg, B., Zuna, I., Lorenz, A., van Kaick, G., Lorenz, W.J., and Kommerell, B., Diagnostic accuracy of computerized B-scan texture analysis and conventional ultrasonography in diffuse parenchymal and malignant liver disease, J. Clin. Ultrasound 13, 87-99 (1985).

[3] Insana, M.F., Wagner, R.F., Garra, B.S., and Shawker, T.H., A Statistical Approach to an Expert Diagnostic Ultrasonic System, in Proc. SPIE, Vol. 626, pp. 24-29 (1986).

[4] Finette, S., Bleier, A., and Swindell, W., Breast tissue classification using diagnostic ultrasound and pattern recognition techniques: I. Methods of pattern recognition, Ultrasonic Imaging 5, 55-70 (1983).

[5] Finette, S., Bleier, A.R., Swindell, W., and Haber, K., Breast tissue classification using diagnostic ultrasound and pattern recognition techniques: II. Experimental results, Ultrasonic Imaging 5, 71-86 (1983).

[6] Nicholas, D., Nassiri, D.K., Garbutt, P., and Hill, C.R., Tissue characterization from ultrasonic B-scan data, Ultrasound Med. Biol. 12, 135-143 (1986).

[7] Cloostermans, M.J.T.M., Mol, H., Verhoef, W.A., and Thijssen, J.M., In vitro estimation of acoustic parameters of the liver and correlations with histology, Ultrasound Med. Biol. 12, 39-51 (1986).

[8] Crawford, D.C., Morris, D.T., Fenton, D.W., and Pryce, W.I., Possible applications of digital analysis of ultrasonic images of the placenta, Ultrasound Med. Biol. 11, 79-84 (1985).

[9] Wagner, R.F., Smith, S.W., Sandrik, J.M., and Lopez, H., Statistics of speckle in ultrasound B-scans, IEEE Trans. Sonics Ultrason. SU-30, 156-163 (1983).

[10] Wagner, R.F., Insana, M.F., and Brown, D.G., The statistics of radio frequency and envelope detected signals, J. Opt. Soc. Am. (submitted).

[11] Wagner, R.F., Insana, M.F., and Brown, D.G., Unified approach to the detection and classification of speckle texture in diagnostic ultrasound, Optical Eng. 25, 738-742 (1986).

[12] Insana, M.F., Wagner, R.F., Garra, B.S., Brown, D.G., and Shawker, T.H., Analysis of ultrasound texture via generalized Rician statistics, Optical Eng. 25, 743-748 (1986).

[13] Wagner, R.F., Insana, M.F., and Brown, D.G., Progress in Signal and Texture Discrimination in Medical Imaging, in Proc. SPIE, Vol. 535, pp. 57-64 (1985).

[14] Goodman, J.W., Statistical Optics, (John Wiley & Sons, New York, 1985).

[15] Burckhardt, C.B., Speckle in ultrasonic B-mode scans, IEEE Trans. Sonics Ultrason. SU-25, 1-6 (1978).

[16] Insana, M.F., Wagner, R.F., Garra, B.S., Hall, T.J., Stong, G.C., Shawker, T.H., and Smith, S.W., Identification of periodic, specular, and diffusely scattering structures in skeletal muscle and TM phantoms via generalized Rician statistics, Ultrasonic Imaging 7, 87 (1985) (Abstract only).

[17] Shawker, T.H., Garra, B.S., Insana, M.F., Wagner, R.F., Stong, G.C., and Jones, B., Detection of tissue texture in human skeletal muscle: Preliminary results of an in vivo and in vitro study, Ultrasonic Imaging 8, 71-72 (1986) (Abstract only).

[18] Kuc, R., Clinical application of an ultrasound attenuation coefficient estimation technique for liver pathology characterisation, IEEE Trans. Biomed. Eng. BME-27, 312-319 (1980).

[19] Insana, M.F., Zagzebski, J.A., and Madsen, E.L., Improvements in the spectral difference method for measuring ultrasonic attenuation, Ultrasonic Imaging 5, 331-345 (1983).

[20] Although the methods described in this paper are targeted for off-line processing, a high speed device for estimating the statistical properties of B-scan or intensity images at real-time rates has been proposed. U.S. Patent Application filed November 18, 1985.

[21] Bendat, J.S. and Piersol, A.G., Random Data: Analysis and Measurement Procedures, Chapt. 9, (Wiley Interscience, New York, 1971).

[22] Morrison, D.F., Multivariate Statistical Methods, Chapt. 7, (McGraw-Hill, New York, 1967).

[23] Fukunaga, K., Introduction to Statistical Pattern Recognition, Chapts. 3, 4, 9, (Academic Press, New York, 1972).

[24] Morrison, D.F., Multivariate Statistical Methods, Chapt. 4, (McGraw-Hill, New York, 1967).

[25] Wagner, R.F. and Brown, D.G., Unified SNR analysis of medical imaging systems, Phys. Med. Biol. 30, 489-518 (1985).

[26] Van Trees, H.L., Detection, Estimation, and Modulation Theory, Vol. 1, Section 2.2, (John Wiley and Sons, New York, 1968).

[27] Whalen, A.D., Detection of Signals in Noise, Chapt. 6, (Academic Press, New York, 1971).

[28] Green, D.M. and Swets, J.A., Signal Detection Theory and Psychophysics, Chapt. 1, (Robert E. Krieger Pub., Huntington, NY, 1974).

[29] Swets, J.A. and Pickett, R.M., Evaluation of Diagnostic Systems: Methods from Signal Detection Theory, Chapt. 1, (Academic Press, New York, 1982).

[30] Castleman, K.R., Digital Image Processing, p. 323, (Prentice-Hall, Inc., Englewood Cliffs, NJ, 1979).

[31] Kundel, H.L., Disease prevalence and radiological decision making, Investigative Radiology 17, 107-109 (1982).

[32] Wagner, R.F., Fundamentals and Applications of Signal Detection Theory in Imaging, in Proc. SPIE, Vol. 626, pp. 765-761 (1986).

[33] Metz, C.E., Basic principles of ROC analysis, Seminars Nuclear Med. 8, 283-298 (1978).

[34] A_z values were calculated using a FORTRAN program ROCFIT modified by C.E. Metz, P.L. Wang, and H.B. Kronman, at the University of Chicago from the program RSCORE II written by D.D. Dorfman. RSCORE II may be found in appendix D of [29].

[35] Gu, Z.H. and Lee, S.H., Optical implementation of the Hotelling trace criterion for image classification, Optical Eng. 23, 727-731 (1984).

[36] Barrett, H.H., Myers, K.J., and Wagner, R.F., Beyond Signal Detection Theory, in Proc. SPIE, Vol. 626, pp. 231-239 (1986).

[37] Smith, W.E. and Barrett, H.H., Hotelling trace criterion as a figure of merit for the optimization of imaging systems, J. Opt. Soc. Am. A 3, 717-725 (1986).

[38] Cover, T.M., Geometrical and statistical properties of systems of linear inequalities with applications to pattern recognition, IEEE Trans. Electronic Computers EC-14, 326-334 (1965).

[39] Tou, J.T. and Gonzalez, R.C., Pattern Recognition Principles, pp. 186-187, (Addison-Wesley, Reading, MA, 1974).

[40] Jurs, P.C., Pattern recognition used to investigate multivariate data in analytic chemistry, Science 232, 1219-1224 (1986).

[41] Julesz, B., Textons, the elements of texture perception, and their interactions, Nature 290, 91-97 (1981).

[42] Burgess, A.E., Wagner, R.F., Jennings, R.J., and Barlow, H.B., Efficiency of human visual discrimination, Science 214, 93-94 (1981). Also see J. Appl. Photog. Eng. 5, 76-780.

[43] Swets, J.A., Form of empirical ROCs in discrimination and diagnostic tasks: Implications for theory and measurement of performance, Psychological Bulletin 99, 181-198 (1986).

[44] Private communications: C.E. Metz to R.F. Wagner, 1980; between H.H. Barrett and R.F. Wagner, 1985-86; C.E. Metz to H.H. Barrett, 1986; and Fiete, R.D., Barrett, H.H., Smith, W.E., and Myers, K.J., The Hotelling trace criterion and its correlation with human performance, J. Opt. Soc. Am. (submitted).

[45] Devijver, P.A., Statistical pattern recognition, in Applications of Pattern Recognition, K.S. Fu, ed. (CRC Press Inc., Florida, 1982).