The conceptual basis of ethnic group terminology and classifications


Soc. Sci. Med. Vol. 45, No. 5, pp. 689-698, 1997
PII: S0277-9536(96)00386-3
© 1997 Elsevier Science Ltd. All rights reserved. Printed in Great Britain
0277-9536/97 $17.00 + 0.00

THE CONCEPTUAL BASIS OF ETHNIC GROUP TERMINOLOGY AND CLASSIFICATIONS

PETER J. ASPINALL

South East Institute of Public Health, United Medical and Dental Schools (of Guy's and St Thomas's Hospitals), Broomhill House, Tunbridge Wells TN3 0XT, U.K.

Abstract--"Ethnic group" is a problematic variable in health-related research. While self-identification is now widely accepted as the appropriate mode of assignment, the impracticalities of a free response in the collection of ethnic group data mean that categorisation into a limited set of choices must take place. The substantial and increasing number of persons in minority ethnic groups who identify through non-standard responses emphasises the need to develop classifications that accommodate salient vernacular terminology. The use of informants in cognitive settings and the monitoring of open-ended responses appear to offer the best way of determining which group labels to employ. The recommended approach addresses the research priorities for accurate, consistent, and high quality data and is also responsive to the dynamic nature of ethnic group and the growing ethnic diversity of the population. © 1997 Elsevier Science Ltd

Key words--ethnic group, terminology, classifications, self-identification, census

INTRODUCTION

It is now five years since Bhopal et al. (Bhopal et al., 1992) invited researchers to debate the appropriateness of ethnic group terminology within the context of scientific writing. McKenzie and Crowcroft have also argued that "A thorough investigation of the validity of current classifications is urgently needed" (McKenzie and Crowcroft, 1994), more recently contributing with the British Medical Journal a set of guidelines covering usage of ethnicity, race, and culture (McKenzie and Crowcroft, 1996; BMJ, 1996). Much of this pragmatic advice has been useful but there has been a lack of focus in the literature on linking the ethnic group categories used with the conceptual thinking implicit in the classifications.

A pursuit of this issue is now timely for a number of reasons. During the last few years the research community has had unprecedented access to small-area statistics on the socioeconomic position of ethnic groups in Great Britain through the inclusion for the first time of an ethnic group question in the 1991 Census. In addition researchers have been able to utilise individual and household data released as Samples of Anonymised Records (SARs) (Marsh, 1993), ethnic group and a range of other variables from the 1971, 1981 and 1991 Censuses and vital statistics for the same individuals in the longitudinal study, and specially customised abstracts of output, all of which have produced a challenging body of analytical work on the construction of ethnic group in the 1991 Census (Holdsworth and Dale, 1996; Dale and Holdsworth, 1995; Berrington, 1996). Over the same period the Department of Health has, amongst several initiatives, introduced the mandatory collection by providers of ethnic group data on admitted patients (NHS Executive, 1994) and commissioned research on the practicality of collecting this information through general practitioners.

This activity has given rise to a burgeoning interest in ethnic health issues by public health practitioners and a growing sub-specialty of ethnic health researchers which has, arguably, outpaced the critical and considered examination of the terminology used to identify ethnic groups and its conceptual basis.

The fact that the U.K. is now at the midpoint between censuses and actively planning for the 2001 enumeration--with new classifications under consideration (Aspinall, 1996)--presents a situation of both exigency and opportunity. In addition, the Government Statistical Service's Committee on Surveys of Persons and Households is now involved in work on harmonising questions for Government Social Surveys, classifications used in outputs from these surveys, and definitions used in administrative resources (Government Statistical Service, 1995). A similar initiative is the Definitions Projects, currently being undertaken by the Scottish Social Work Inter-regional Statistics Group on behalf of the Confederation of Scottish Local Authorities and the Scottish Office, which is seeking to achieve nationally agreed classifications (Stead, 1996). Clarification of these classification issues, particularly with regard to the production of appropriate input and output categories and access to consistent and compatible denominator data, is vital for the community of public health researchers given the very substantial and growing volume of published research on ethnic health issues.

While this paper primarily addresses the British experience, it is important to acknowledge that similar problems face the U.S.A. and other societies. There is, for example, an increasing U.S. literature on the topic, as American researchers and the federal government attempt to formulate a strategy for the approaching 2000 Census. Census agencies in Canada and Australia are also reviewing the categories they use to collect and present ethnicity data.

SOCIAL GROUP CATEGORISATION

At the centre of their debate Bhopal et al. located the need to establish a set of principles guiding definition and description of ethnic minority identities in any country so as to help achieve precision, familiarity, acceptability, and accuracy in the terminology used. The requirement to marry the need for scientific clarity with acceptability to those described constitutes the essential challenge in our efforts to derive appropriate terminology. It reflects the two prevailing strategies used in the social sciences for social group categorisation. Firstly, ethnic group identity can be established by eliciting respondents' categorisations of themselves or of their family members through their own self-descriptions or terms that they regard as appropriate, whether or not observers find them to be ambiguous or contradictory. Secondly, such identities can be established upon the basis of phenomenal distinctions judged appropriate by a community of scientific observers, their adequacy being "... ultimately a matter of the extent to which they contribute to the construction of cross-culturally testable hypotheses and theories" [emphasis added] (Harris et al., 1993). Arguably, such categorisations are not falsified if they are rejected or deemed unsatisfactory by the respondents themselves. However, in practice, this standard of leading to "testable hypotheses and theories" may be difficult to attain, in the context of our societal prejudices about race and ethnicity (Kaufman and Cooper, 1995). The difficulties of harmonising these two approaches and the failure to understand the distinction between them lie at the source of much of the utilisation of inappropriate or unsatisfactory terminology.

THE PRINCIPLE OF SELF-IDENTIFICATION

The majority of ethnic group data collection systems, including national censuses, in the industrialised Western world now recognise the principle of self-identification. Some observers have even argued that provision for individuals to categorise themselves and their children according to their own sense of identity is a matter of civil rights (Davis, 1991). Observer-assignment was abandoned in the U.S. Census in 1971 (Lee, 1993) and in the United Kingdom General Household Survey (OPCS, 1989) and British Crime Survey (NOP/SCPR, 1989), the only official surveys that used this method, in 1986 and 1988, respectively. The position that the individual is, inalienably, the arbiter of their ethnic group gains credence from the nature of what constitutes ethnic group or racial identity. It is now almost universally accepted that such concepts are socially constructed and that individuals' choices with regard to terms are the outcome of appraisal of many factors, including ancestry or origins, physical characteristics, religious beliefs, culture, and language use, but also their relationship to the host culture. Such choices are intrinsically subjective and impenetrable by observation. Moreover, the term "ethnic group" invokes a sense of belonging or group identity, determined by social pressures and psychological needs. This perspective clearly argues against observer assignment.

However, assignment by such means to broad categories may have some advantages. A large percentage of observed ethnic group differences in disease and mortality is likely to be mediated by discrimination, differential socio-economic status and other forms of racism encountered by ethnic minority groups. One might argue, therefore, that self-definition is not as consequential as perception by the wider society. Self-definition may become important as a determinant of psychological state and coping mechanisms (Neighbors et al., 1996), but the structured disadvantages experienced by minorities may be impervious to variations in self-description. Interviewer observation, therefore, may be a better predictor of group differences in health outcomes than self-definition, despite the fact that agencies would prefer to yield to individuals the right to define themselves, a right wanted by the majority of the population (Pringle and Rothera, 1996).

Choices about ethnic group are known to be dynamic and, in some cases, proximate. "Ethnic group" is clearly different from the more restrictive "ethnic origin" which focuses the question back in time and conveys an historical and geographic context, although the terms are sometimes treated synonymously (Government Statistical Service, 1995). Again, the clarity of the concept may make it a better indicator of health differences than ethnic group for some people. While some agencies still favour concepts based on "family origin" (Smith, 1991), ethnic group has become the pervasive term. Given the centrality of self-identification to ethnic group categorisation, it can be argued that data collection should be based on or, at least, sensitive to, salient vernacular terms if it is to avoid producing artefactual data of questionable meaning. If this principle is accepted, it clearly depends on agencies knowing what constitutes appropriate terminology from the viewpoint of respondents.

Bhopal et al., for example, declare that the use of the term "Asian" is not a self-description of the peoples of the Indian subcontinent, although some recent survey and focus group research indicates that persons of such origins in Britain do use this term (and also "British Asian") as the identifier of choice in an unprompted context (Pringle and Rothera, 1996; Mortimer and White, 1996). In North America, the terms "African American" (and, to a lesser extent, "Afro-American") and "Irish American" have emerged as descriptions widely used by, and acceptable to, these groups.* Terminology appears to develop through a complex process of experimentation, growing acceptability, and customary usage, so eventually gaining everyday familiarity. However, the precise and distinctive features of the cognitive system by which members of different groups express their ethnic identity are frequently poorly understood. In Britain the knowledge base in this area is more limited than in the U.S.A., being based primarily on the analysis of responses to survey questions devised by agencies. While such an approach can clarify acceptability of terminology, it is intrinsically reactive and inevitably embodies the value systems of such agencies, thereby locating the reference point for self-identification outside the individual's frame of reference. Moreover, the accumulated "investment" of resources in the process provides a justification for conservatism, tending to accord disproportionate value to historical knowledge. Only rarely has this approach been eschewed for one that seeks to explore with individuals their construction of ethnic/racial group and, by its nature, much of this research is based on small samples (Modood et al., 1994; Parker, 1995; Mortimer and White, 1996).

SELF-DESCRIPTION

Yet we know from a variety of sources that respondents attach importance to the principle of self-identification through self-description.
In the 1991 Census 740,257 persons or one in four in ethnic group categories other than White in Great Britain eschewed the seven predesignated categories ("White", "Black-African", "Black-Caribbean", "Indian", "Pakistani", "Bangladeshi" and "Chinese") and utilised the "Black-Other" and "Any other ethnic group" free-text fields to write in a description of their ethnic group. 70.4% of this number used the latter option unconstrained by a colour term (OPCS/GRO(S), 1993a). Survey evidence corroborates these findings. Of the 548 residents from ethnic minority groups in South East London Health Authority who responded to Healthquest South East (SE Thames RHA, 1993), 17% did not feel that one of the listed ethnic groups (the same as those in the Census with the exception that "Black-Other" did not offer a write-in response) best described them and wrote in their own descriptions for the "Other" group. Of the 264 respondents from ethnic groups other than White in the Welsh Health Survey, 101 or 38% ticked the "Other" category with a free-text option (again in a 1991 Census classification but with no free-text field for the "Black-Other" category), 81 writing in a description, including 17 who included "Welsh" in the term used.† In the case of the 1993 Health Survey for England (which used the full Census classification), just over a fifth (21%) of respondents from ethnic groups other than White utilised the free-text fields (25% in 1994).‡ Moreover, the proportions using free text fields appear to vary in time and space. While multiple cross-sectional data for Great Britain have not yet been accumulated, in the 1990 U.S. Census, for example, 9.8 million (3.9% of the population or 20.0% of the minority groups) reported in the "Other race" free text field, compared with 6.8 million (3.0% or 17.7% of the minority population) in 1980, an increase of 45.1% (McKenney and Bennett, 1994). However, there is strong evidence in British Census data of geographical variability.

*In an interview survey of almost 60,000 households from different racial backgrounds, an opportunity for people to express their preferences for specific designations (identified in a series of cognitive interviews) revealed that the Black population is almost evenly split on "black" (44.2%) versus "African American" (28.1%) or "Afro-American" (12.1%). See Tucker et al. (1996).
†Unpublished, from the Welsh Health Survey, undertaken by the South East Institute of Public Health for the Welsh Office.
‡I am grateful to Nikki Bennett (Social Survey Division, OPCS) and Patricia Prescott-Clarke (Social and Community Planning Research) for providing information on the 1993 and 1994/1995 surveys, respectively.

While 16.9% of persons in minority ethnic groups utilised the colour neutral "Any other ethnic group" free text field, the proportion across the 458 local authorities varied from 3.3% (Pendle) to 64.6% (Ross and Cromarty), over four-fifths of local authorities exceeding the Great Britain figure and two-fifths exceeding 32% (OPCS/GRO(S), 1993a). The extent of utilisation of this free-text field appears to be negatively associated with the number (Spearman Rank Correlation Coefficient rs = -0.54778) and proportion (rs = -0.53634) of persons in minority ethnic groups and the size of local authorities (rs = -0.39886), (p = 0.0001) (see Fig. 1). This very interesting result--that use of open response for ethnic designation is correlated with the density of minority ethnic groups in the community--shows that self-definition is contextual and that respondents change their self-definition in response to external conditions. This is an important point, both because it illuminates somewhat the psychological and social determination of self-definition and because it demonstrates a possible source


of differential misclassification in health surveys with fixed response options only.

[Fig. 1. Scatterplot: "free text responders" and ethnic minority population in Great Britain local authorities, 1991. Horizontal axis: ethnic minority population (log scale). Legend: LAs under 80,000 pop; LAs 80,000 to under 120,000 pop; LAs 120,000+ pop.]

In order to take proper cognisance of salient vernacular terminology, our limited stock of knowledge needs to be enhanced by a programme of research to elicit free choice (unprompted) self-descriptions through cognitive research and larger-scale population surveys based on purposive sampling to establish representative opinion in the different ethnic groups and how such identifications map to Census categories. The self-descriptions provided in a number of health surveys show that respondents are readily able to articulate their ethnic group identity through the use of many different terms. The following dozen expressions are randomly drawn from the self-descriptions in the Welsh Health Survey: "European (I am not a colour)"; "Part African, Part Welsh"; "Maltese"; "Caribbean-Indian"; "Welsh"; "White-Muslim"; "Spanish descended Latin American"; "Kashmiri"; "Anglo-Asian"; "White Father, Half Nigerian Mother"; "Sicilian Italian"; and "Kenyan Asian". Responses to the "Black-Other" free text field in the Health Survey for England (1995), totalling 48, indicate that about a third of respondents used the term "Black British"; the 308 responses to the "Other" free text field reveal a diversity of terms, including "mixed race" (27), "mixed" (19), an identity comprising two named ethnic groups (46), and also frequent use of terms like "Asian", "Japanese", "Oriental", "Eurasian", "Vietnamese", "British Indian", "Mediterranean" and "Sri Lankan". While the responses represent a "residual" group, unwilling or unable to express their ethnic group through the use of pre-designated categories, they indicate that the concept is a personal construction incorporating differing domains

of ethnicity. Pringle and Rothera's study (Pringle and Rothera, 1996) provides evidence that the way respondents articulate their ethnic group by self-description differs from the choices they make in a Census classification. When general practice staff asked patients to give their ethnic group without any prompting 851 (96.7%) offered a response. Significantly, when prompted with the Office of Population Censuses and Surveys' classification 855 (97.2%) of patients selected a category, but in only 236 (27.7%) of valid cases was this an exact or very close match to the self-reported description.
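The rank correlations reported above between use of the free-text field and the size of the minority population across local authorities can be illustrated with a minimal sketch of Spearman's coefficient. The function names and the five local authority values below are hypothetical and chosen only to show the calculation; they are not Census data, and the sketch makes no attempt to reproduce the published coefficients.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    for pos in range(1, len(order) + 1):
        # Close a run of tied values when the value changes or the data end.
        if pos == len(order) or values[order[pos]] != values[order[start]]:
            mean_rank = (start + pos - 1) / 2 + 1
            for k in range(start, pos):
                ranks[order[k]] = mean_rank
            start = pos
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical local authorities (illustrative values only).
minority_population = [500, 2000, 8000, 30000, 120000]
free_text_percent = [60.0, 35.0, 20.0, 12.0, 5.0]

rho = spearman_rho(minority_population, free_text_percent)
```

In this toy example the free-text share falls steadily as the minority population grows, so the coefficient sits at the negative extreme; real local authority data would give a weaker association of the kind reported in the text.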

ETHNIC GROUP CLASSIFICATIONS

While the principle of sensitivity to vernacular terminology should be paramount, ethnic group data collection systems must be responsive to the principle of self-identification. Clearly, a classification based on a limited set of designated categories is likely to amount to forced self-identification for many persons. Responsiveness may be construed as the provision of a set of categories which reflects the main ethnic group composition of the population and which does not ignore or override the distinctive features of the cognitive system by which individuals express their ethnic group identity. Clearly, since a classification cannot list all ethnic minority groups as separate categories, an integral part of any classification must be the provision of one or more free-text or self-description fields. Those who do not feel that their ethnic group has received official recognition through the provision of a listed category must have the right to provide a description of their identity. The denial of that right, effectively limiting the choice of respondents to pre-defined terms, is a violation of the principle of self-identification, marginalising those who do not belong to large or frequently recognised ethnic groups. The standard offered by the Department of Health for the purpose of the collection of ethnic group data on admitted patients was flawed in this respect (Aspinall, 1995; Stead, 1996), entirely omitting free-text fields in spite of offering as a working definition of ethnic group "... the individual's own perception of themselves in response to all the cultural and other factors making up ethnic group" (NHS Executive IMG, 1994). It is also a matter of concern that the "harmonised" ethnic group question for Government social surveys replaces the "Black-Other" and "Any other ethnic group" free text fields in the 1991 Census with the negatively defined predesignated categories of "Black-neither Caribbean nor African" and "None of these", respectively, eschewing write-in provision (Government Statistical Service, 1995). In addition to the 1991 Great Britain Census, the 1980 and 1990 U.S. Censuses properly permitted respondents to self-identify through the provision of free-text fields, as currently do other national surveys such as the Labour Force Survey, General Household Survey, and Health Survey for England.

The value of free-text fields also lies in the scope they offer to users for monitoring the adequacy of pre-designated ethnic group categories in classifications. They give information on the numbers of persons who eschew the predesignated categories but also the terms people use when they write in a description. By looking at this information in a longitudinal perspective, we can identify the extent to which classifications are meeting the needs of the minority ethnic group population and the demand for new categories as expressed in the free-text fields.
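The monitoring role of free-text fields described above can be sketched as a simple tally of write-in terms. This is a minimal illustration under stated assumptions: the helper names, the normalisation rule (lower-casing and collapsing spaces) and the threshold are hypothetical choices, not a procedure used by any census agency, and the sample responses are invented.

```python
from collections import Counter


def normalise(term):
    """Lower-case a write-in term and collapse runs of whitespace."""
    return " ".join(term.strip().lower().split())


def tally_write_ins(responses):
    """Count normalised write-in descriptions, skipping blank entries."""
    return Counter(normalise(r) for r in responses if r and r.strip())


def candidates_for_new_category(tally, min_share):
    """Terms whose share of all write-ins reaches the chosen threshold."""
    total = sum(tally.values())
    return [term for term, n in tally.most_common() if n / total >= min_share]


# Invented write-in responses (illustrative only).
responses = ["Black British", "black british ", "Mixed race",
             "Kashmiri", "Black  British", ""]

tally = tally_write_ins(responses)
frequent_terms = candidates_for_new_category(tally, min_share=0.5)
```

Repeating such a tally for successive surveys would give the longitudinal view the text calls for; how spelling variants and near-synonyms are grouped is itself a classification decision and would need to be made openly.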
In the 1991 Census, for example, there is evidence that persons of mixed parentage wished to identify themselves as such and that many persons--over 88,000 in fact--identifying in the two free-text fields used the term "British" in their descriptions (OPCS/GRO(S), 1993a).

TECHNICAL FEASIBILITY

The collection of ethnic group data using terminology that is both acceptable and familiar is a necessary, but not sufficient, requirement. Technical feasibility is an issue that pervades the concerns of all those who collect or utilise ethnic group data, be they members of the scientific community or government agencies. Its constituency encompasses attributes like accuracy, coherence, precision, and exclusivity and is user-focussed. Nearly all ethnic group data collection has eschewed free choice (or unprompted) self-description--as exemplified by the type of question that was used for religion in the Northern Ireland 1991 Census--for a framework of predesignated and free-text options as a means of obtaining consistent information on distributions.


This is necessary from a scientific and policy point of view since a significant minority of the population do not have an understanding of "ethnic group" out of context and the views of individuals may vary widely on what characteristic(s) an ethnic group question is aimed at measuring. Moreover, an open-ended question would force agencies to aggregate similar responses into broader categories for analysis purposes. While this approach might offer users some flexibility in how they tabulate data, it is likely to have a negative effect on analysis when trying to compare results across administrative records and surveys. Further, if researchers allow a free response but then proceed to aggregate according to their own biases and assumptions, it is questionable whether the rights of self-definition of respondents have been maintained. Arguably, if someone is going to collapse these categories, it might be more accurately done by the respondents themselves by asking them to classify themselves into one of a list of possible responses that most closely represents their ethnic identification. However, people may behave differently when confronted with precoded options, since these categories become the reference point for how people self-identify, as demonstrated by Pringle and Rothera's study (Pringle and Rothera, 1996). In the mutually exclusive categorisation of the 1991 Census, for example, individuals wishing to declare their Irish ethnic group identity in the final "Any other ethnic group" free-text field would have had to forego the first precoded category of "White", a primary identifier for many of them.
Given also the greater propensity for respondents in surveys to tick boxes rather than write in descriptions, the estimate of 11,000 persons in Great Britain who did so undoubtedly substantially undercounts those who would have declared as Irish in an unforced choice, even if one takes into account the estimated further 20,000 people in Greater London who, ignoring the question instruction, ticked the "White" box on the Census form and also wrote in "Irish" (these people being coded as White in the full classification) (OPCS/GRO(S), 1994a). For example, in a recent study by Hickman and Walter of first-generation Irish immigrants, 59% of respondents said that they thought the Irish should be recognised as an ethnic group in Britain (Hickman and Walter, 1996). Moreover, in a survey by Ullah of second-generation Irish teenagers living in Britain, over half felt "half English, half Irish" and a fifth "mainly Irish" (Ullah, 1985). The inadequacy of the Census count of persons born in Ireland is demonstrated by recent research findings that the health disadvantage extends to second-generation members of the group (Harding and Balarajan, 1996), the possible health selection biases among immigrants restricting any attempts to generalise other findings to the Irish ethnic population as a whole. The case for separately identifying the Irish is also strengthened by the Commission for Racial Equality's recent recommendation that it should be included as a predesignated category in ethnic monitoring systems.

Clearly the value of classifications lies in the extent to which they accommodate the terminology of individuals' choice and yield scientifically meaningful groups. The tension is manifest in the Census categories provided for the Black groups: "Black-Caribbean", "Black-African" and "Black-Other". While terms of self-description include "West Indian", "Afro-Caribbean", "African" and "Black British", the first two Census categories capture groups of Caribbean and African origins who have distinctive socio-economic positions in British society, but at the expense of a residual free-text field of "Black-Other" comprising a diversity of self-descriptions (British, Black/White, and other mixed) in a group of persons most of whom (85%) were born in Britain. While the Census categories provide greater precision than a term like "Afro-Caribbean"--used variously to describe persons of African descent in or from the Caribbean, those who are black and of Caribbean ancestry, or people of African or Caribbean descent--they are not sensitive to emerging vernacular terminology and provide in "Black-Other" a group of questionable analytical value. Moreover, we do not know to what extent a "Black British" category might also have coherence in terms of socio-economic position and cultural practices. The Census categories also leave in an equivocal position those Caribbeans who are not of African origins. Trinidad, for example, has one of the most ethnically diverse populations in the Caribbean, a legacy of its colonial history, with 43% of African and 36% of East Indian ancestry, the remainder comprising many islanders of mixed ancestry and notable minorities of European, Chinese, Syrian and Lebanese people.
In simple Census classifications, it becomes difficult to disentangle complex ethnicity, as encompassed by terms like "Indo-Caribbean", from mixed parentage.

INCONSISTENT REPORTING

Whilst it has been argued that self-identification results in more consistent reporting of ethnic or racial group, particularly for persons of mixed parentage, than that recorded by observers (McKenney and Cresce, 1993), it is not free from such problems. Such inconsistency may result from changing ethnic identity, a phenomenon termed "ethnic flux" by Lieberson (Lieberson and Waters, 1986), or misreporting for such reasons as ambiguity in the respondent's perception or understanding of the concept in question. Moreover, there is recognition of the dynamic nature of the concept in the general population: a minority (49%) of 489 general practice patients thought that ethnicity never changed (Pringle and Rothera, 1996). Inconsistency can be measured in several ways, such as comparing counts over time for a population group and, more rigorously, by comparing a response at reinterview with the original response from a census or survey for the same group of persons.

The Census Validation Survey (the follow-up survey carried out by OPCS to check the accuracy of the data collected by the Census) revealed a gross error rate--that is, the proportion of times the response on the Census form was not the same as that given in the CVS interview--for the four main ethnic groups of "White", "Black", "Indian subcontinent" (including Pakistani and Bangladeshi), and "Other groups", of only 0.8% in a total sample size of 13,080 persons resident in households (OPCS/GRO(S), 1995). However, the gross error for ethnic group, excluding those who answered "White" in both the Census and the CVS--the vast majority of respondents in the CVS--was significantly higher at 13.2%. Inconsistent reporting is not uniformly manifested across the different groups. While 99.6% of those who identified as White in the 1991 Census so identified in the CVS, the proportion consistently identifying in the Black groups was 88.0%, in the Indian subcontinent groups 98.7%, and in other groups, 78.1%. Other national census reinterview surveys have found similar errors. An evaluation study comparing responses in the 1990 U.S. Census with those reported in a 1990 Census reinterview for identical persons revealed considerable inconsistent reporting in the American Indian category, most of the inconsistency among this group involving persons who identified as White in either the Census or the reinterview (McKenney et al., 1993). The context for such inconsistency is an increase in the frequency of the reporting of "American Indian" ethnicity, a response to a shift in the perceived popularity of this designation. While further empirical evidence is needed to resolve the question of whether free responses lead to greater inconsistency than fixed categories, research from the 1980 and 1990 U.S. censuses indicates high levels of inconsistent responses to the open-ended ancestry question and strong "example" effects.
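The gross error rate defined above, and the effect of excluding respondents who answered "White" on both occasions, can be sketched as follows. The function names and the paired responses are hypothetical and are not CVS data; the sketch simply applies the definition given in the text (the proportion of paired responses that differ) to a small invented sample.

```python
def gross_error_rate(pairs, exclude_both=None):
    """Proportion of (census, reinterview) pairs whose two answers differ.

    If exclude_both is given, pairs with that answer on both occasions
    are dropped from the calculation first.
    """
    if exclude_both is not None:
        pairs = [p for p in pairs
                 if not (p[0] == exclude_both and p[1] == exclude_both)]
    if not pairs:
        raise ValueError("no paired responses to compare")
    discordant = sum(1 for census, reinterview in pairs if census != reinterview)
    return discordant / len(pairs)


def consistency_by_group(pairs):
    """Share of each Census group giving the same answer at reinterview."""
    totals, same = {}, {}
    for census, reinterview in pairs:
        totals[census] = totals.get(census, 0) + 1
        same[census] = same.get(census, 0) + (census == reinterview)
    return {group: same[group] / totals[group] for group in totals}


# Invented paired responses (illustrative only).
pairs = ([("White", "White")] * 95 + [("Black", "Black")] * 3
         + [("Black", "Other")] + [("Other", "White")])

overall = gross_error_rate(pairs)
excluding_white = gross_error_rate(pairs, exclude_both="White")
by_group = consistency_by_group(pairs)
```

Because concordant White responses dominate the sample, the overall rate in this toy example is low while the rate among the remaining pairs is much higher, which mirrors the pattern the text describes for the 0.8% and 13.2% figures.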


So far the discussion of ethnic categorisation has concerned concepts which are inputs, that is, survey questions and answer categories. However, the issue also involves outputs, that is, analysis variables derived from the inputs. Users may derive different output classifications from a common set of questions by using different algorithms to reassign answers in free-text fields. Again, the issue of scientific validity is invoked. The algorithm used by OPCS assigns the write-in descriptions in the two free-text fields, identified by 28 codes, to output categories, comprising the seven predesignated categories, the "Black-Other" group, a manufactured "Other Asian" group, and a residual "Other" category. In the 1991 Census a total of 740,257 persons used the free-text field options, 110,329 or 14.9% of whom were reassigned from the category they ticked to produce the output categories (treating the "Other-other" group and manufactured "Other Asian" group as one) (OPCS/GRO(S), 1993a). Only four groups--Indian, Pakistani, Bangladeshi and Chinese--contained counts in the output classification which were the same as those on the Census form. The legitimacy of reassignment of selected ethnic group coding categories, although explicit, can be questioned on the grounds that it compromises the principle of self-identification.

The treatment of persons of mixed ethnic group is instructive. In the Census persons descended from more than one ethnic group were invited to tick one group to which they felt they belonged or tick the "Any other ethnic group" box and write in a description of their ancestry. By the use of the algorithm the identity of the 230,000 or so persons who declared through self-descriptions as being of mixed origin is lost. In accord with the principle of self-identification, of those utilising the "Black-Other" free-text field, almost all--nearly 25,000 "Black/White" and over 50,000 "Other Mixed"--remain in that group. Similarly, nearly all the 150,000 or so persons of mixed origin utilising the "Any other ethnic group" free text field remain in a residual "other group" category, including almost 30,000 persons of "Black/White" origin. Only persons of "mixed White" origin, less than 4000 in number, get assigned to the White group.
Arguably, a large proportion of "mixed" persons in Great Britain chose to declare their ethnic group in self-descriptions: perhaps two-thirds, based on an average of 309,000 persons (1989-91) in the "mixed" group of the Labour Force Survey (LFS) (OPCS, 1992) and allowing for some undercount. This should be recognised in a separate "mixed" output category of the kind the LFS provided until its adoption of 1991 Census categories in 1992. Certainly, the group has a much more youthful age structure and is less concentrated in metropolitan counties than other minority ethnic groups. Moreover, cognitive studies that have explored self-identification amongst samples of young persons of mixed parentage suggest that they identify as "mixed", rather than "Black" or some other group: under half of the 59 young people with one White and one Afro-Caribbean or African parent in Tizard and Phoenix's sample stated that they thought of people of mixed parentage, including themselves, as black (Tizard and Phoenix, 1993).
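The proportions quoted above can be checked with a few lines of arithmetic. The sketch below uses only figures given in the text; the variable names are illustrative, and the two "or so" figures are approximations, so the second ratio should be read as a rough guide only.

```python
# Figures quoted in the text for the 1991 Census free-text fields.
free_text_users = 740_257   # persons who used a free-text field
reassigned = 110_329        # of whom were reassigned from the category they ticked

share_reassigned = reassigned / free_text_users
print(f"{share_reassigned:.1%}")  # 14.9%

# Approximate figures: mixed-origin write-ins in the Census and the
# 1989-91 average size of the LFS "mixed" group.
mixed_write_ins = 230_000
lfs_mixed_average = 309_000

# Ratio before any allowance for undercount in the LFS figure.
raw_ratio = mixed_write_ins / lfs_mixed_average
print(f"{raw_ratio:.2f}")  # 0.74
```

The unadjusted ratio is about three-quarters. The text's "perhaps two-thirds" follows once some allowance is made for undercount in the LFS figure, since a larger total lowers the share.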

These concerns over data aggregation algorithms pose a dilemma. Statistics stratified by a myriad of small categories are of questionable utility, while those aggregated into large heterogeneous categories are of questionable validity. This set of trade-offs can be resolved in a number of ways. The use of categories based on cognitive research and small-scale testing and the provision of more such options in a single classification seem likely to reduce the number of persons who, unable to locate themselves in a specific category, resort to the use of free-text responses. Decision rules about aggregations of detailed categories based on these responses could be designed through open discussion to minimise cross-category reassignment, thereby preserving as far as possible the rights of self-definition of respondents.
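One way to make the idea of minimising cross-category reassignment concrete is to count, for a proposed set of decision rules, how many respondents end up outside the category they ticked. The sketch below is illustrative only: the function name, the rule-table format and the toy records are assumptions made for this example and do not reproduce the OPCS algorithm or its 28 codes.

```python
from collections import Counter

def cross_category_reassignments(records, rules):
    """Count records whose output category differs from the box ticked.

    records: iterable of (ticked_category, write_in) pairs.
    rules:   dict mapping (ticked_category, write_in) to an output category;
             pairs not listed keep the ticked category.
    """
    moved = Counter()
    for ticked, write_in in records:
        output = rules.get((ticked, write_in), ticked)
        if output != ticked:
            moved[(ticked, output)] += 1
    return moved

# Toy records, invented for illustration only.
records = [
    ("Black-Other", "Black/White"),
    ("Black-Other", "Other Mixed"),
    ("Any other ethnic group", "Mixed White"),
    ("Any other ethnic group", "Black/White"),
]

# Rule set A moves "Mixed White" write-ins to "White".
rules_a = {("Any other ethnic group", "Mixed White"): "White"}
# Rule set B leaves every respondent in the category they ticked.
rules_b = {}

moved_a = cross_category_reassignments(records, rules_a)
moved_b = cross_category_reassignments(records, rules_b)
print(sum(moved_a.values()), sum(moved_b.values()))  # 1 0
```

Comparing such counts across candidate rule sets would give an open discussion of decision rules a simple, shared measure of how far each proposal departs from respondents' own choices.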


Where Census ethnic group data are used as denominators, there is an obvious requirement for numerator data obtained from morbidity recording systems to be compatible. The Department of Health's national minimum standard for the mandatory collection of data on ethnic group of inpatients, by its exclusion of free-text fields, precludes comparison with Census statistics for those ethnic groups comprising free-text responses, notably, Black-Other and Other Asian (Aspinall, 1995). Moreover, the difficulties in aggregating more detailed local categories to the standard raise concerns about consistency of reporting and quality of data. Data on births and deaths obtained from registration systems in Britain are only available by country of birth (and there are currently no plans to change this). The lack of this essential information by ethnic group makes the calculation of population estimates and projections for these groups impracticable. This means that Census denominator data will only be of value around the year of the enumeration. Similar issues with respect to Census and public health surveillance data have been addressed by researchers in the U.S. (Hahn and Stroup, 1994). These drawbacks clearly compound researchers' existing concerns about the quality of denominator data and compatibility with numerator data. Notable among these concerns are imputation in the 1991 Great Britain Census of residents of wholly absent households (substantially higher in Inner London and for ethnic groups) (OPCS/GRO(S), 1993b) and underenumeration, especially for the male population in ethnic minority groups in the age bands 20-24 and 25-29. Preliminary work by OPCS suggests that undercoverage varies more among ethnic groups than is explained simply by their age, sex and geographical distributions (OPCS/GRO(S), 1994b).




Finally, in reporting on ethnic groups, the aggregating of groups and the labelling of such aggregates with inappropriate idioms like "Asian", "Asian and Oriental", "black and Asian", and "Asian and Chinese", to which might be added the term "Black", again compromise the principle of self-identification. King's Fund use the term "Black populations" to refer to people from racial or other minorities in Britain who may be disadvantaged because of their racial backgrounds, even though, as they readily acknowledge, there are people who do not identify themselves as Black but who share a common experience of racism (Gunarathnam, 1994). Equally, there are those who belong to minority ethnic groups who do not feel they are subject to racial discrimination, and this idiosyncratic use of the word "Black" as an umbrella term has also been criticised by Modood (Modood, 1994) on the grounds that it falsely equates racial discrimination with colour discrimination. Indeed, much of the epidemiological literature of North American provenance employs rigid dichotomous (White/Nonwhite) or trichotomous (White/Black/Hispanic) categories, the former even finding its way into government reports (Leginski et al., 1991). The term "nonWhite"--having widespread currency in scientific writing in the U.S., an EMBASE* search revealing 312 instances of its use in medical literature (frequently as a reporting category) by U.S. authors in the period 1981-1996--is now insinuating itself into publications in Britain. Not only does the term lump together indiscriminately widely different ethnic groups but it also sets "White" as the standard, invoking an ethnocentric perspective. Other difficulties arise from inexactness of expression (e.g. "Asian") or the combination of terms that are not mutually exclusive (e.g. "Asian and Chinese"). In addition, there is much variation in the way in which ethnic groups are combined in British Government social and health surveys, which makes comparison of findings problematic. 
For example, the General Household Survey reports on the "White", "Indian", "Pakistani and Bangladeshi", "Black Caribbean" and "Remaining groups" ("Black-African", "Black-Other", "Chinese" and "Any other group"), the Labour Force Survey "White", "Black" (comprising "Black-Caribbean", "Black-African" and "Black-Other"), "Indian", "Pakistani and Bangladeshi" and "Other" ("Chinese" and "Any other group"), and the Family Resources Survey "White", "Black-Caribbean", "Black-African", "Indian", "Pakistani and Bangladeshi" and "Other" (comprising "Black Other", "Chinese" and "Any other group"). Clearly, the efforts by the Government Statistical Service to harmonise survey outputs on the ethnic group question are to be welcomed.

*EMBASE database, Release 3.3.05 (Elsevier Science B.V.), accessed via the BIDS EMBASE Data Service.
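The practical effect of these differing aggregations can be illustrated by encoding the three reporting schemes and asking which detailed categories are grouped identically in all of them. This is a minimal sketch: the groupings are transcribed from the paragraph above, but the function names, the normalised hyphenated labels and the treatment of "Pakistani" and "Bangladeshi" as separate detailed categories are assumptions made for the example.

```python
# Reporting group assigned to each detailed category, per survey.
GHS = {
    "White": "White",
    "Indian": "Indian",
    "Pakistani": "Pakistani and Bangladeshi",
    "Bangladeshi": "Pakistani and Bangladeshi",
    "Black-Caribbean": "Black Caribbean",
    "Black-African": "Remaining groups",
    "Black-Other": "Remaining groups",
    "Chinese": "Remaining groups",
    "Any other group": "Remaining groups",
}
LFS = {
    "White": "White",
    "Indian": "Indian",
    "Pakistani": "Pakistani and Bangladeshi",
    "Bangladeshi": "Pakistani and Bangladeshi",
    "Black-Caribbean": "Black",
    "Black-African": "Black",
    "Black-Other": "Black",
    "Chinese": "Other",
    "Any other group": "Other",
}
FRS = {
    "White": "White",
    "Indian": "Indian",
    "Pakistani": "Pakistani and Bangladeshi",
    "Bangladeshi": "Pakistani and Bangladeshi",
    "Black-Caribbean": "Black-Caribbean",
    "Black-African": "Black-African",
    "Black-Other": "Other",
    "Chinese": "Other",
    "Any other group": "Other",
}
SURVEYS = {"GHS": GHS, "LFS": LFS, "FRS": FRS}

def members(mapping, group):
    """Return the set of detailed categories a survey reports under `group`."""
    return frozenset(cat for cat, grp in mapping.items() if grp == group)

def consistently_grouped(surveys):
    """List detailed categories whose reporting group has the same
    membership in every survey, so that findings can be compared."""
    detailed = list(next(iter(surveys.values())))
    comparable = []
    for cat in detailed:
        memberships = {members(m, m[cat]) for m in surveys.values()}
        if len(memberships) == 1:
            comparable.append(cat)
    return comparable

print(consistently_grouped(SURVEYS))
# ['White', 'Indian', 'Pakistani', 'Bangladeshi']
```

On this encoding only "White", "Indian" and the combined "Pakistani and Bangladeshi" group are reported on the same basis in all three surveys. Every Black group, "Chinese" and "Any other group" fall into differently composed aggregates.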

CONCLUSIONS

"Ethnic group" is a problematic variable in health-related research. While an effective strategy for monitoring ethnic differences in health and disease is essential, the components of that strategy, particularly terminology and classifications, need to be clearly identified. The issue of how ethnicity is measured is crucial. Variables are either scaled or categorical and, given that ethnicity has no logical scale, it must be categorical. Furthermore, if ethnicity is to be used, it must be categorised either by the respondent or the researcher. The myriad small categories that would result from a completely open response are of no practical use. Moreover, collapsing a large number of freely coded responses into researcher-defined categories would compromise self-definition and jeopardise the comparison of results among data sets. Some aggregation is clearly required, even though this places limitations upon respondents who might want to classify themselves in some unique way. Given that categorisation into a limited set of choices must take place, it seems reasonable to suggest that respondents themselves are most qualified to pick the most suitable box to tick and that reliance on algorithms to reassign free-text responses to broad categories should be minimised. The use of informants in cognitive research settings and of open-ended responses in order to determine which group labels to employ appears to provide the best way of developing classifications that accommodate the salient vernacular terminology of the different groups. In addition, there is no reason why agencies should be unduly parsimonious in the number of categories they provide. While there is no simple solution to the problem of how to classify persons by ethnic group, this strategy would appear to offer a number of advantages. 
It enables categories to be used that are sensitive to the way people identify themselves, thereby minimising non-response that results from lack of suitable options. Moreover, it addresses the problem of undercount for groups frequently forced to make choices that are inappropriate or represent only part of their ethnic identity. By such means the large and growing proportion of respondents from minority ethnic groups who currently choose non-standard responses to describe their ethnic group may be checked. Such responses do not provide an accurate count of the specific groups who utilise this method of identification and the use by agencies of different algorithms to assign free text responses reduces the analytical usefulness of the data. The approach is responsive to the dynamic nature of ethnic group identity and the growing ethnic diversity of the population.

Clearly, there is no single way of classifying ethnic and racial groups. Given the many settings in which monitoring and data collection take place, some flexibility in approach and in terminology is to be desired. For example, the utilisation of categories used in decennial censuses may not meet the need for sociologically rich subgroup information required for the analysis of health risks. However, in many health contexts, the analytical usefulness of the data must be maintained across administrative records and surveys. Moreover, there must be compatibility with Census data, the source of denominators for much of this work. These requirements focus the need upon the broader issues of appropriateness of terms, consistency of responses, and the development of useful output categories. They also point to an approach in which sensitivity to respondents' categorisations of themselves supplants standardisation in service to the demands of historical continuity as the driving force.

Acknowledgements--I am grateful to U.S. reviewers of this article for their very informative comments.

REFERENCES

Aspinall, P. J. (1995) Department of Health's requirement for mandatory collection of data on ethnic group of inpatients. British Medical Journal 311, 1006-1009.
Aspinall, P. J. (1996) The Development of an Ethnic Group Question for the 2001 Census: The Findings of a Consultation Exercise with Members of the OPCS 2001 Census Working Subgroup. UMDS (South East Institute of Public Health), London.
Berrington, A. (1996) Marriage patterns and inter-ethnic unions. In Ethnicity in the 1991 Census. Vol. 1. Demographic characteristics of the ethnic minority groups, eds D. Coleman and J. Salt, pp. 178-212. HMSO, London.
Bhopal, R. S., Phillimore, P. and Kohli, H. S. (1992) Inappropriate use of the term "Asian": an obstacle to ethnicity and health research. Journal of Public Health Medicine 123, 244-246.
British Medical Journal (1996) Ethnicity, race, and culture: guidelines for research, audit, and publication. British Medical Journal 312, 1094.
Dale, A. and Holdsworth, C. (1995) The construction of ethnicity in the 1991 British Census: evidence from the microdata. In Proceedings of the Annual Research Conference of the American Bureau of the Census, pp. 403-434. Department of Commerce, U.S.
Davis, F. J. (1991) Who is Black: One Nation's Definition. Pennsylvania State University Press, University Park, PA.
Government Statistical Service (1995) Harmonised Questions for Government Social Surveys. HMSO, London.
Gunarathnam, Y. (1994) Health and Race. A Starting Point for Managers on Improving Services for Black Populations. King's Fund, London.
Hahn, R. A. and Stroup, D. F. (1994) Race and ethnicity in public health surveillance: criteria for the scientific use of social categories. Public Health Reports 109(1), 7-15.
Harding, S. and Balarajan, R. (1996) Patterns of mortality in second generation Irish living in England and Wales: longitudinal study. British Medical Journal 312, 1389-1392.
Harris, M., Consorte, J. G., Lang, J. and Byrne, B. (1993) Who are the Whites? Imposed Census categories and the racial demography of Brazil. Social Forces 72(2), 451-462.
Hickman, M. and Walter, B. (1996) Discrimination and the Irish Community in Britain. A Report for the Commission for Racial Equality. Forthcoming.
Holdsworth, C. and Dale, A. (1996) Ethnic Homogeneity and Family Formation: Evidence from the 1991 Household SAR. Occasional Paper 7. Census Microdata Unit, Manchester.
Kaufman, J. S. and Cooper, R. S. (1995) Epidemiologic research on minority health: in search of the hypothesis. Public Health Reports 110, 662-666.
Lee, S. M. (1993) Racial classifications in the US census: 1890-1990. Ethnic and Racial Studies 16(1), 75-94.
Leginski, W. A., Manderscheid, R. W. and Henderson, P. R. (1991) Patients served in state mental hospitals: results from a longitudinal data base. In Mental Health, United States, 1990, eds R. W. Manderscheid and M. A. Sonnenschein. DHHS pub (ADM) 90-1708, National Institute of Mental Health, Rockville, MD.
Lieberson, S. and Waters, M. (1986) Ethnic groups in flux: the changing ethnic responses of American whites. Annals of the American Academy of Political and Social Science 487, 79-91.
Marsh, C. (1993) The sample of anonymised records. In The 1991 Census User's Guide, eds A. Dale and C. Marsh, pp. 295-311. HMSO, London.
McKenney, N. R. and Bennett, C. E. (1994) Issues regarding data on race and ethnicity: the Census Bureau experience. Public Health Reports 109(1), 16-25.
McKenney, N. R., Bennett, C. E., Harrison, R. J. and del Pinal, J. (1993) Evaluating Racial and Ethnic Reporting in the 1990 Census. Paper presented at the annual meeting of the American Statistical Association, San Francisco, CA, August.
McKenney, N. R. and Cresce, A. R. (1993) Measurement of ethnicity in the United States: experiences of the US Census Bureau. In Challenges of Measuring an Ethnic World: Science, Politics and Reality. Proceedings of the Joint Canada-United States Conference on the Measurement of Ethnicity, April 1992, pp. 173-222. U.S. Government Printing Office, Washington, DC.
McKenzie, K. J. and Crowcroft, N. S. (1994) Race, ethnicity, culture, and science. British Medical Journal 309, 286-287.
McKenzie, K. J. and Crowcroft, N. S. (1996) Describing race, ethnicity, and culture in medical research. British Medical Journal 312, 1054.
Modood, T. (1994) Political Blackness and British Asians. Sociology 28(4), 859-876.
Modood, T., Beishon, S. and Virdee, S. (1994) Changing Ethnic Identities. Policy Studies Institute, London.
Mortimer, L. and White, A. (1996) Ethnic group question: findings from focus group discussions (unpublished). Office for National Statistics (Social Survey Division), London.
Neighbors, H. W. et al. (1996) Racism and the mental health of African Americans. The role of self and system blame. Ethnicity and Disease 6, 167-175.
NHS Executive (1994) Collection of Ethnic Group Data for Admitted Patients (Letter EL(94)77). NHSE, Leeds.
NHS Executive, Information Management Group (1994) Collecting Ethnic Group Data for Admitted Patient Care: Implementation Guidance and Training Material. Department of Health, Leeds.
NOP/SCPR (1989) 1988 British Crime Survey (England and Wales) Technical Report. NOP/SCPR, London.
OPCS (1989) General Household Survey 1986. HMSO, London.



OPCS (1992) Labour Force Survey 1990 and 1991. Series LFS No. 9. HMSO, London.
OPCS/GRO(S) (1993a) 1991 Census. Ethnic Group and Country of Birth, Great Britain, Vol. 2 of 2 [Table A Ethnic group (full and summary classifications)]. HMSO, London.
OPCS/GRO(S) (1993b) 1991 Census Report for Great Britain, Part 1, Vol. 1 of 3 (Table 18). HMSO, London.
OPCS/GRO(S) (1994a) 1991 Census Supplement to Report on Ethnic Group and Country of Birth (Table A). OPCS, London.
OPCS/GRO(S) (1994b) 1991 Census User Guide No. 58. Undercoverage in Great Britain. OPCS, London.
OPCS/GRO(S) (1995) 1991 Census General Report, Great Britain. HMSO, London.
Parker, D. (1995) Through Different Eyes: The Cultural Identities of Young Chinese People in Britain. Avebury, Aldershot.
Pringle, M. and Rothera, I. (1996) Practicality of recording patient ethnicity in general practice: descriptive intervention study and attitude survey. British Medical Journal 312, 1080-1082.
Smith, P. (1991) Ethnic Minorities in Scotland. Scottish Office Central Research Unit, Edinburgh.
South East Thames Regional Health Authority (1993) Healthquest South-East Regional Report. SETRHA, Bexhill.
Stead, B. (1996) Towards Standard Definitions for Social Work/Social Services: Ethnic Monitoring and Recording. Inter-Regional Statistics Group, Edinburgh.
Tizard, B. and Phoenix, A. (1993) Black, White or Mixed Race? Race and Racism in the Lives of Young People of Mixed Parentage. Routledge, London.
Tucker, C. et al. (1996) BLS Statistical Notes No. 40. Testing Methods of Collecting Racial and Ethnic Information: Results of the Current Population Survey Supplement on Race and Ethnicity. Bureau of Labor Statistics, Washington, DC.
Ullah, P. (1985) Second-generation Irish youth: identity and ethnicity. New Community 12(2), 310-320.