


Objectivity, realism, and psychometrics
Trisha Nowland, Alissa Beath, Simon Boag
Macquarie University, Sydney, Australia

Article history: Received 21 September 2018; Received in revised form 19 January 2019; Accepted 11 May 2019; Available online 16 May 2019
Keywords: Psychometrics; Generalised latent variable model; Conceptual framework; Set theory

Abstract
The aim of this paper is to raise and address questions regarding the status of objectivity for the generalized latent variable model (GLVM) in psychometric research, given the conceptual, logical, and mathematical problems of circularity, conditional independence, and factor indeterminacy, respectively. The question of objectivity for the model is examined with respect to measurement and realist perspectives. Drawing on insights from the measurement and systems dynamics literature, a proposal for a conceptual framework is presented that integrates: i) inference from the best systematisation; and ii) axiomatic set theory. This conceptual framework, which addresses the whole of a research project, invites specification of the expected relations, conditions, and assumptions relevant to the implementation of the GLVM. While this does not eliminate the problems for the GLVM, it provides future researchers with maximal objective information in standardized form, supporting minimization of definitional and instrumental uncertainty in psychological modelling practices.
© 2019 Elsevier Ltd. All rights reserved.

1. Introduction
Psychometrics today embraces a broad and perhaps eclectic church of techniques, but simply defined it is the scientific practice of integrating instruments, practices, models, and theory, with the aim of measuring psychological phenomena [49]. The earliest use of the term psychometrics appears to occur with Galton [22], who defined psychometrics as "[t]he art of imposing measurement and number upon operations of the mind". Galton's choice of the term "art" provides insight into the degree to which subjective creativity is necessary when using measurement techniques to assess unobservable psychological phenomena. Any reliance on subjective interpretation creates challenges in a field that emphasises the objectivity of its methods under the scientist-practitioner model [2,62]. Where subjective interpretations are not systematically disclosed, controversy may persist regarding interpretation of reported outcomes from research projects. Such controversy is evident in the psychometric literature, for example, in the question of whether unobservable psychological phenomena are quantitative in nature (cf. [68–69,52,14,13]). This controversy arises at least in part because psychological phenomena remain notoriously difficult to detect, let alone measure, given their purportedly unobservable status [26,13,14]. As such, there is little direct or tangible empirical evidence available to researchers regarding psychological processes [13,66].

Extensive use has been made of the mathematical latent variable model in psychometric practices, in an effort to investigate such unobservable psychological phenomena. A latent variable model posits some common, indirect or underlying variable as related to two or more manifest variables, for which there are available data realisations [8,67]. An example is where a general or common level of intelligence is assumed to be present, no matter whether the participant is completing items for mathematical, vocabulary, or spatial tasks [82,19]. Latent variable modelling has changed substantially since its first introduction to psychometric practices, yet many of the original assumptions needed for the practice remain (see [18]). To date, however, the status of objectivity for the practices involved in latent variable modelling has received relatively little attention. The aim of this paper is to examine the question of objectivity for the psychometric latent variable model in the context of the conduct of measurement of psychological phenomena, and to demonstrate the role for a conceptual framework in addressing some of the problems associated with objectivity for latent variable modelling. A cognitive-historical perspective [59] is adopted throughout, which makes sense of science both in terms of the cognitive processes and skills relevant for any single research practitioner, and in terms of the historical origins of the practices and the shared meaning that these practices have, as discernible in both the theoretical and substantive psychometric literature. In this way the proposal for the conceptual framework is both informed by present practice, and optimised for what can be understood as needed for robust, and sustainable, future research outcomes.


T. Nowland et al. / Measurement 145 (2019) 292–299

2. Latent variable modelling in psychometrics
The latent variable model has recently been described as a "gold standard" in psychometric practice [3]; cf. [50]. This model has roots in Spearman's [68] investigation into definitions of intelligence in the paper "General Intelligence," Objectively Determined and Measured. Following in-depth conceptual analysis and review of earlier literature, Spearman noted that there was no concordance about what intelligence actually is. Spearman's factor theory attempted to provide a correlational solution to this problem of defining intelligence, by way of what, for Spearman at least, constituted an act of measuring it. Spearman's observation of the positive manifold (see [18]), a pattern of positive correlations among different kinds of cognitive ability test scores for children, led him to propose that a common underlying factor must exist which accounts for this pattern. Attempts to provide a mathematical structure for this underlying factor resulted in the development of the tetrad equations method, first presented in Hart and Spearman [28]. This method used rankings in linear matrix algebra, and while Spearman worked on the problem for the remainder of his career, a proof for the split of test scores into latent and manifest variable components was never found [18]. A mathematical proof would have meant it was possible to split test scores into two parts, one representing the common factor g, or cognitive ability, the other representing the unique test administered, plus error, without need for independent confirmation that the underlying phenomena for the latent and manifest variables existed as two separate kinds of phenomena. Yet no psychometrician beyond Spearman seems to have pursued a proof for the model, even while the influence of factor analysis on the field is apparent from the time of Spearman, and shows no sign of abating (see for example [21]).
The assumption of conditional independence is perhaps Spearman's key innovation in factor analysis, and remains a core feature of latent variable modelling throughout its history. It is an assumption which states that, given the latent variable, the manifest variables in a model remain independent of each other [56,57]. Spearman's factor theory became the target of controversy soon after its publication, in connection to the assumption of conditional independence on just one latent variable (cf. [78,79]). Thurstone [80] for example noted that Spearman had chosen to limit his interpretation of rankings based on the first ranking aligning with the idea of general ability, or a common factor for intelligence. There was no mathematical proof that entailed a limit to this first factor or latent variable [18]. Thurstone's subsequent use of rankings to represent other variables became the ground for his proposal for multiple factor analysis, or a multiple latent variable model. Further critiques included Thomson's (1916) bonds model and Thorndike's [79] hierarchical model of intelligence. Together, these approaches constituted the very beginnings of a common framework of latent variable modelling, a method which still sits at the core of psychometric practice today. The important point to take from these earliest critiques, however, is that from the beginning, subjective decision-making entered into the choice of a notionally objective syntactical mathematical model, to answer simple questions about how many latent variables were given to exist. The computing revolution brought with it new researcher problems to solve, even as model types understood as more relevant to different types of psychological phenomena were developing, such as those that addressed categorical variable structures (see [36]). These different types were ultimately gathered under the generalised latent variable model (GLVM: [67]).
The models that are combined under the generalised account are diverse, and include item response theory (IRT: [38,63]), latent structure analysis [36], exploratory and confirmatory factor analysis (EFA, CFA: [31,32]) and structural equation modelling (SEM: [86]), among others. Researchers were now presented with a greater number of subjective decisions to make and assumptions to address regarding both the syntactic mathematical model and the semantic interpretation of the model (see [11]). These included, for example, deciding which kind of model to use, and how to account for assumption checking and/or violation, which typically presents with use of the model (see [77]). At present, however, there is no logically-informed or systematised way for researchers to account for the assumptions that they make, and the decisions made in connection to their use of the GLVM. In what follows we define the GLVM, and look at three core aspects of it, in order to know something about what we would need to account for in any logical system. These three aspects are always present in any deployment of the model, and have remained unchanged since the original work of Spearman. First, however, we briefly define objectivity, for the purpose of reviewing the GLVM in reference to it.
3. Objectivity
Objectivity remains a valued property of scientific processes and findings [27]. The scientific metrology literature has recently distinguished between two important properties of measurement results – objectivity, as object-relatedness, and intersubjectivity, as subject-independence [48]. Objectivity, defined here, is "the extent to which the conveyed information is about the property object of measurement and nothing else" [48]. Two important components are listed – definitional specificity, with the property in question characterised as clearly as possible, and instrumental veracity, with a requirement to identify and, where possible, eliminate undue influence properties and forms of spurious information from the reported output.
In this way, definitional uncertainty and instrumental uncertainty about the output of a measurement process can be addressed and, potentially, assessed [44]. To the extent that subjective choices may influence definitional uncertainty or instrumental uncertainty, then, and to the degree these remain undisclosed in the reporting of research project outcomes, we can understand the objectivity of the proffered measurement to be at risk.
4. The GLVM
A familiar form of the GLVM is the response model [67], which integrates both random coefficient and factor models for a particular item j as:

y_j = X_j β + Λ_j η_j + ε_j,

where y_j is the response function, X_j β stands for the intercept terms, Λ_j η_j represents the matrix of both parameters and variables for the latent variable in question, and ε_j is the error term for the jth response. No matter whether the specific application of the GLVM takes form as IRT, SEM, EFA, CFA, or some other latent variable analysis, there are three common conceptual, logical, and mathematical elements present in all GLVM forms. These are conceptual circularity, the conditional independence assumption, and the mathematical problem of factor indeterminacy. These elements can be understood as influencing the objectivity of outputs from application of the GLVM.
5. Circularity
Boring [10] noted evident conceptual circularity in intelligence testing as reflected in the model above – without intelligence tests,



or responses from individuals to items we have already set as intelligence items, we have no way of saying what intelligence actually is (see also [7,61]). Other difficult questions followed, concerning the connection between the latent variable in the mathematical model and the empirical psychological phenomenon, as defined under the research construct (see [45]). The mathematical model itself can say nothing about this relationship, and in psychology research we rely on extra-statistical criteria to support inferences about these relationships, such as reliability and validity criteria (see [9]; cf. [47]). Conflation of the real-world phenomenon, the research construct, and the mathematical model is noted as a persistent problem in the psychometric literature [43,45]. Without systematised delineation of the distinctions between phenomenon, construct, and model, definitional uncertainty may be at stake for the outputs of the GLVM, threatening the scientific objectivity of findings.
6. Conditional independence
It is not only the case that we cannot say more about our psychological constructs than exactly what is given in our construction of the manifest variables which are independently related to the latent variable in the model. It is also the case that, logically, there is a question regarding use of the latent variable as evidence of the existence of psychological phenomena. This is because of the role of the assumption of conditional independence in the use of the model. The conditional independence assumption states that there must exist a common latent variable that accounts for the pattern observed in the positive manifold, with all other manifest variables rendered independent of each other. To rely on the assumption of conditional independence, then, is to assume that the latent variable exists.
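The logical structure of the conditional independence assumption can be illustrated with a small simulation (a sketch with hypothetical loadings and sample size, not any published model): once a single latent variable is built into the data-generating process, the manifest variables correlate marginally (positive manifold) but become uncorrelated after conditioning on the latent scores.

```python
import random

random.seed(2)

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def corr(x, y):
    return cov(x, y) / (cov(x, x) * cov(y, y)) ** 0.5

# Hypothetical one-latent-variable model: y_j = lambda_j * eta + e_j
eta = [random.gauss(0, 1) for _ in range(5000)]
y1 = [0.8 * e + random.gauss(0, 0.6) for e in eta]
y2 = [0.7 * e + random.gauss(0, 0.6) for e in eta]

def residualise(y, x):
    # residuals of y after simple linear regression on x
    b = cov(y, x) / cov(x, x)
    my, mx = mean(y), mean(x)
    return [yi - my - b * (xi - mx) for yi, xi in zip(y, x)]

marginal = corr(y1, y2)                                      # clearly positive
partial = corr(residualise(y1, eta), residualise(y2, eta))   # near zero given eta
print(round(marginal, 2), round(partial, 2))
```

Note the asymmetry the text identifies: the simulation exhibits conditional independence only because eta was assumed into the generating process; observing the same correlational pattern in real data cannot, by itself, establish that such an eta exists.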
Logically, in this regard, existence is assumed for the latent variable prior to the modelling practice notionally employed to provide evidence for the phenomenon represented by the latent variable. [91] have demonstrated that it is always possible to find scalar-valued latent variables for joint distributions such that conditional independence holds. What this means is that independent evidence, beyond what is included in the model, is needed about the existence of the phenomenon at the level of definitional certainty, before we can begin to consider instrumental questions in connection to it.
7. Factor indeterminacy
There is another key difficulty for the GLVM which reappears in the research literature roughly once a generation (see [83,23,58,42]). This is the problem of factor indeterminacy. Factor indeterminacy is a mathematico-grammatical problem [42] that arises because any act of obtaining a solution for a latent variable logically applies for an infinite number of other possible latent variables [56,57,42]. Factor indeterminacy is not a function of error of measurement for a latent variable. It is founded in the inability to secure representational foundations for a link between the phenomena at stake in the research project and the latent variable in the syntactic mathematical model. In the words of Mulaik and McDonald [58]: "Factor indeterminacy is the inability to determine uniquely the common and unique factor variables of the common factor model from the uniquely defined 'observed variables' because the number of observed variables is smaller than the number of common and unique factors. Factor indeterminacy occurs when the multiple correlation for predicting a factor, using the correlations of the observed variables with the factor (factor structure coefficients) and the correlations among the observed variables, is less than unity." In practice, the parameter in question is always less than unity. This means it is possible that "more than one random variable can have the appropriate correlations with the observed variables as given by the factor structure coefficients, yet these multiple solutions for the factor need not be perfectly correlated" [58]. Recognition of the problem of factor indeterminacy has a long history. When reviewing "The Abilities of Man" [90], Wilson [83,84] noted the impossibility of deriving a unique solution for g, or cognitive ability [57]. Wilson [83] further demonstrated, using vector space analysis, that latent variables lie partly outside the space described by the linear combinations of the manifest variables, and thus cannot be uniquely determined. What this means is that no matter how exact or precise the estimated solution for the latent variable, there remains no certain way to link this to the psychological phenomena, or to exclude the solution from being any other, potentially contradictory, latent variable [57]. Consequently, when we use the latent variable model, we have no certain way to connect the outcomes from the analysis back to the construct or concept we were interested in at the beginning of the research project. For Spearman, this meant that even though he proposed his factor theory as a way to clarify the concept of intelligence, there was no direct way to connect the concept of intelligence to the outcomes of his factor approach. Spearman [69] responded to Wilson [83] suggesting that indeterminacy could be solved by adding, to the already included manifest variables, a variable exactly correlated with g, but what this offered was an observable substitute for the latent variable, not a solution to factor indeterminacy [57].
McDonald and [88] in fact set out a proof demonstrating that increasing the number of variables included to infinity does not eliminate the problem of factor indeterminacy for latent variable modelling. Thus, the addition of any number of variables does not overcome the innumerability of solutions for g [58]. Despite the conceptual challenge that factor indeterminacy presents, the problem has been largely overlooked in the field [70]. Mulaik and McDonald [58] offer a possible solution to the problem of factor indeterminacy, the infinite behaviour domain position (see also [46]). In this proposal, a conceptual domain is declared by the researcher, infinite in structure, from which the items that are considered to represent the psychological phenomena are selected. Such a practice may go some way towards addressing definitional uncertainty for the latent variable, although the distinction between the structure of non-empirical infinity and empirical questionnaire items, typically evaluated on point-scales, does not lend itself to easy reconciliation. Without the researcher explicitly accounting for the relationship they intend between an infinite domain and finite test items in a systematic conceptual framework, definitional uncertainty to a large extent remains unresolved, even with the adoption of an infinite behaviour domain. Despite a substantial review by seasoned psychometricians of the time in a Special Issue of Multivariate Behavioral Research in 1996, initiated by Michael Maraun, factor indeterminacy remains an aspect of the GLVM that is unaddressed in the reporting of research project outcomes. Instead, exploration of latent variable models as approximations has typically focused more on the process of estimating parameters than on resolving conceptual difficulties in applied contexts.
This has resulted in the researcher being presented with a sizeable array of decisions about the kind of model they want to use, and how or whether to account for an assessment of their use of it.
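Mulaik and McDonald's multiple-correlation criterion can be checked numerically. The sketch below (with hypothetical loadings, not drawn from any cited study) computes the squared multiple correlation for predicting the factor from the observed variables, λ′R⁻¹λ, for a one-factor model whose implied correlation matrix is R = λλ′ + diag(1 − λ²); the value falling short of unity is precisely the indeterminacy at issue.

```python
# Hypothetical one-factor model with three indicators
lam = [0.8, 0.6, 0.7]  # factor loadings (lambda)

# Model-implied correlation matrix R = lambda lambda' + diag(1 - lambda^2)
n = len(lam)
R = [[lam[i] * lam[j] if i != j else 1.0 for j in range(n)] for i in range(n)]

def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix
    A = [row[:] + [bi] for row, bi in zip(A, b)]
    m = len(A)
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(m):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return [A[i][m] / A[i][i] for i in range(m)]

# Squared multiple correlation for predicting the factor: lambda' R^{-1} lambda
x = solve(R, lam)
rho2 = sum(l * xi for l, xi in zip(lam, x))
print(round(rho2, 4))  # strictly less than 1: the factor is indeterminate
```

Even with strong loadings the bound stays below unity, which is why the text notes that "the parameter in question is always less than unity" in practice.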


8. Modelling applications
The GLVM umbrella includes techniques that allow for the specification of modelled relations between latent and manifest variables, as well as error-structures for the model. Typical practice in, for example, CFA then includes testing to see whether data fit with the specified model. Under this approach, goodness-of-fit heuristics are employed to help evaluate whether the data fit the model. Heuristics provide guidance on the use of models and interpretation of model outcomes, but in and of themselves cannot prescribe disclosure of assumption violations, for example. There are various heuristics to choose from, for example in solving for parameter values for latent variables. The researcher may need to consider: treatment of model identification and cut-off criteria for goodness-of-fit [1,30]; solutions for measurement invariance concerns [51]; connecting analyses from between-subjects data to within-subjects processes [50,11]; and decisions regarding which and how many variables to include, as well as techniques to calculate model parameters such as maximum likelihood or weighted least squares methods (see [17,6]), among other decisions. It is noted in the literature that both the methods [5] and the heuristics [33] remain open to subjective use and interpretation, although there are standards that have been set in the profession regarding application of these heuristics (see [16]). Any researcher utilising the GLVM consequently has an increasingly large number of subjective decisions to make about how their activities contribute towards psychological measurement, and while guidance is provided about what to use and where, there is no standardised format for accounting for these decisions as they are made throughout the completion of a research project. The conceptual framework aims to provide such a standardised format.
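One minimal, hypothetical form such a standardised format could take (the field names and cutoff values here are illustrative, not a prescribed schema) is a machine-readable log in which each subjective decision is recorded alongside the alternatives considered and the rationale:

```python
import json

# Hypothetical sketch: a standardised record of subjective modelling
# decisions made while applying the GLVM in a single project.
decisions = [
    {"step": "model choice", "choice": "one-factor CFA",
     "alternatives": ["bifactor model", "two-factor EFA"],
     "rationale": "theory posits a single common cause"},
    {"step": "estimator", "choice": "maximum likelihood",
     "alternatives": ["weighted least squares"],
     "rationale": "indicators treated as approximately continuous"},
    {"step": "fit cutoff", "choice": "CFI >= 0.95",
     "alternatives": ["CFI >= 0.90"],
     "rationale": "conventional goodness-of-fit heuristic"},
]

log = json.dumps(decisions, indent=2)  # archivable, standardised record
print(len(decisions), "decisions recorded")
```

A serialised log of this kind is trivially shareable alongside research outputs, which is the point of the standardised format: the decisions are disclosed as they are made, not reconstructed afterwards.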
We have not, however, addressed the question of what psychological measurement is, and so turn now to examine aspects of measurement highlighted in the psychometric literature. This discussion is followed by exploration of what is needed for an objective measurement process, and what is needed to capture the subjective art form of latent variable modelling in objective terms.

9. Measurement and objectivity
We have said above that recent metrology literature emphasises object-relatedness, or objectivity, and subject-independence, or intersubjectivity, as essential features of measurement [48,44]. Evaluation of object-relatedness and subject-independence for the GLVM is facilitated by the researcher's completion of a standardised and systematised conceptual framework for the whole of the research project. An account of the basic structure for this framework is outlined in what follows. Going back in history, we see Spearman's inclusion of the term "objectively determined" in the title of his 1904 paper introducing factor analysis. A key claim throughout the history of psychology as a science is that the outcomes of psychological research must be consistent with scientific objectivity [55–73]. Mulaik directly contrasts objectivity and subjectivity in his characterisation of latent variables, as "objective variables that serve to synthesise diverse manifest indicator variables (appearances) according to specified rules of how they are related functionally to the latent variables" (1991, p. 194). He further notes that the status of objectivity for the latent variable depends on shared intersubjective understanding, amongst psychometricians, of the outcomes of goodness-of-fit heuristics for the model. Objectivity is revealed, he says, when the phenomenon appears in the same way through multiple points in time or from different perspectives. In this way for Mulaik [55], we can understand that objectivity and intersubjectivity are intertwined – we rely on intersubjective sharing of knowledge in support of scientific objectivity (see [27]). Systematised declaration of subjective interpretations of the relation between the phenomenon, construct, and GLVM, for example, provides a basis on which objective agreement may be secured regarding the application of processes and the veracity of analysis outcomes. One potential solution here, then, is to develop a logical and systematic conceptual framework which provides a foundation for recording such information. We will describe such a framework shortly, but before that will consider some philosophical stances towards the latent variable beyond that of Mulaik's objective variable, described above.
10. Enter realism
One concern consonant with that of objectivity for latent variable modelling is the issue regarding the ontological status of the latent variable. Ontology orients to questions of 'what is' regarding the nature of reality – something with ontological import can be understood as having objective, mind-independent existence [29]. Borsboom [11] suggests that we should look to how psychometricians use the latent variable model in practice, to understand what sort of philosophical stance best underpins the use of the latent variable model – looking specifically at how the research output is treated. Here Borsboom proposes entity realism for the latent variable model. Entity realism in Borsboom's account is described as a variant of the combination of entity and theory realism in the scientific realism of Hacking [24] and Devitt [20]. Entity realism "ascribes an ontological status to the latent variable in the sense that it is assumed to exist independent of measurement" [11]. Entity realism is presented by Borsboom [11] in contrast to an operationalist interpretation of the latent variable model, which would take the latent variable as completely determined by the steps used in any one particular study to construct it.
Entity realism is also presented in contrast to constructivism, which for Borsboom would render the latent variable as nothing more than a hypothesis in the mind of the researcher. In this sense, says Borsboom, entity realism is the best fit for the latent variable model, since psychometricians treat the latent variable as something real and logically independent of their models. There are good reasons to question whether Borsboom's [11] account of the relevance of entity realism for latent variable modelling holds, if only because any ontology built only from the practices of researchers leaves us with no scope for making judgements about the ontological viability of the practices, or the assumptions behind the practices, themselves. There is a logical priority for ontology, in that to evaluate any epistemological claim regarding how we can know about 'what is', we must be able to say something about what exists [29]. This is an important point because, as we have seen above, problems for the GLVM mean that subjective decision-making about practices is inextricably wound up in any analysis conducted using the GLVM. This means our inferences are more rather than less likely to be subject to definitional uncertainty and instrumental uncertainty. Where either of these properties is increasing, we are less likely rather than more likely to achieve the scientific objectivity that we seek. We need methods that allow future researchers to clearly discern the inferential processes involved in adopting assumptions, checking assumptions, and dealing with violations when they occur. A systematic and universal conceptual framework, grounded in realism of the research situation, can provide such foundations for disclosure and maximise the capacity to ensure the object-relatedness of any outcomes that are reported as measurements.1
1 Note there are good reasons to carefully consider whether outputs from the GLVM can properly be considered measurements. See for example Markus and Borsboom [59], who re-frame the outcomes of such processes as assessment rather than measurement.



To be consistent with objectivity as it is described in the measurement literature (see [44]), what is needed for the latent variable is systematic evidence that could provide a basis on which a logical test for the existence of the phenomena may be founded. With systematised historical evidence in standardised conceptual frameworks, we shift the playing field for claims of existence beyond the circular ground of Boring [10], the logical difficulty of the conditional independence assumption, and the problem of factor indeterminacy, as discussed above. In his work on the theory of measurement and scientific modelling practices, Suppes [71,74,76] offers insight regarding the use of axioms in connection to representations made in scientific practices that aim to identify invariant patterns in nature, such as would be represented in systematised historical evidence for psychological phenomena. We turn now to consider how we might bring this forward for latent variable modelling.
11. Need for a systematic conceptual framework
The issues highlighted above regarding the use of the GLVM in psychometric practices indicate the need for a logical conceptual framework, suitable for any research project making use of this model. Defined, a conceptual framework is understood as "the system of concepts, assumptions, expectations, beliefs, and theories that supports and informs your research" ([87], p. 9). To the extent that systematic cohesion is demonstrable for a conceptual framework, we can expect that it will describe the ideology, theory, models, relations, variables, data, and phenomena, with attention to the specific context of the research and the novel content presented (see [85]). Taken as a whole, the conceptual framework for a research project should define the local research situation, and each module in the framework should individually lend itself to scrutiny by the scientific community.
The conceptual framework sets a domain over which inferences from the best systematisation (see [64], and below) can be adduced. One benefit of such a framework is that it helps to exemplify evidence of the constraints set, or constructed, in a project. With this evidence, the consistency of the entire system for the project can be evaluated. Once consistency is evaluated, it is possible to provide coherent overall evidence for the object-relatedness of the GLVM outputs, as well as evidence regarding the address of definitional and instrumental uncertainty, specifically. Such a framework should thus facilitate clarification of any relations between the elements of a research project. What should follow from the implementation of such a conceptual framework is local confirmation of: (i) the context of the evidence garnered in the project; (ii) the capacity for outputs of the project to be counted as measurement; and (iii) the logical conditions relevant to identity constraints for the psychological phenomena in question, in order of logical necessity [62]. Such constraints provide guiding principles in support of scientific reasoning following research outcomes, in a way consistent with the operation of constraints in constructive mathematics [40] and axiomatic conditions in set theory [40,76]. We turn now to look at how the work of Suppes is drawn into this framework, which places objectivity in the cradle of detected and evidenced invariances, articulated in a series of cascading constraints.
12. Suppes – axioms for models
Suppes's contribution to science is often framed in terms of axiomatic set theory [74], or the representational theory of measurement (see [27,16,48]). In this paper we follow Boumans [15] in suggesting that there is further benefit in adopting principles from Suppes's body of work that characterised theory as a function of models. In this way, set theory is utilised not in strict axiomatic form, as in the representational theory of measurement. Instead, we make use of the logic of set theory to set constraints for recognition of what is invariant for some psychological phenomena, in order that some activity like assessment may become possible. This makes it possible to utilise set theory without committing to the existence of sets in a realist fashion, by, for example, ensuring that research elements are defined within structures that facilitate set-theoretical representations (see [45,65]). As Markus [45] notes, "[w]ithin a given framework defined by a set of individuals and a set of properties, one can refer to the actual individuals with their actual properties as the actual state of affairs". We take the research situation, or the research project, conceptualised, as the foundation for what determines which individuals, which properties, and what states of affairs need to be described, in model form, in the conceptual framework. Once these constraints are systematically specified, hypothesis testing for any individual element of the project, or for the project as a whole, can be supported, whether the hypotheses are investigated using qualitative or quantitative techniques (see [4] on the need for qualitative techniques for models). A systematic account of a conceptual framework for a project that adopts the GLVM in its analysis will, at a minimum, need evidence to substantiate the theory, model(s), variable(s), relations, data, and instantiations of phenomena, as well as clarification of researcher ideology and metatheoretical commitments, located for a specific geohistorical situation.
The aim in systematising this information is to provide some analogue of the correspondence rules used to define an axiomatic theory of measurement (see [35]), but in a way that facilitates the inclusion of imprecise descriptors in ordered subsets, or models (see [72]), relevant to each project element. These descriptors are placeholders for the researcher's commitments relevant to the particular project element. The model for each element is stated in the standardised form of a structure, or ordered set, S = (U, O, R), where U = the universe or domain over which the model applies for this element, O = any operations on the domain, and R = relations both within the domain for this element and to other elements (see [54]). Operations will include a listing of any assumptions associated with the particular element in question, whether theory, model, variable, relation, phenomena, data, or other. In this way, even though models of distinct elements may be of different logical types, an axiomatised approach facilitates specification of the fundamental ideas relevant to an element domain, such that we can locate areas of problem or concern and raise their profile for redress (see [75]). Empirical situations will include some elements that are not amenable to axiomatisation because, for example, they have unique properties which do not place them in a set with any other elements, except to the degree that they are an aspect (or sub-element) of the particular research project element being modelled. This will include, for example, ‘‘ceteris paribus conditions” [15,73], which may influence the performance of any of the sub-elements in the set for the present element in a way, or at a level, that would be scientifically of interest.
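The standardised structure S = (U, O, R) described above can be rendered as a simple record type. The following sketch is our own, and every concrete name in it (the items, the listed assumptions, the relation labels) is a hypothetical illustration rather than a prescribed vocabulary:

```python
from dataclasses import dataclass

@dataclass
class Structure:
    """Suppes-style structure S = (U, O, R) for one project element (see [54,72]).

    U: the universe or domain over which the model of this element applies.
    O: operations on the domain, including any listed assumptions.
    R: relations within the domain and to other project elements.
    """
    U: set
    O: list
    R: list

# Hypothetical example: the "variable" element of a project that models a
# latent trait with the GLVM. All names here are illustrative assumptions.
variable_element = Structure(
    U={"item_1", "item_2", "item_3"},
    O=["assume: conditional independence of items given the latent variable",
       "assume: items are reflective indicators"],
    R=[("variable", "is_specified_by", "theory"),
       ("variable", "is_estimated_from", "data")],
)
```

Stating each element in this uniform shape is what allows models of different logical types (theory, variable, data, and so on) to be recorded, and their assumptions inspected, in one standardised format.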
With an accumulation of standardised records in conceptual frameworks for projects gathered over time, these sub-elements may eventually be revealed as patterns that do have some role of import for the psychological phenomena of interest. At present, there is no requirement for a researcher to report such potential external influences, and indeed, there may be reward for a researcher in keeping them hidden, where, for example, commercial ties are evident to a particular model formulation of some kind of psychological phenomena which does not include the influence in question. Acknowledging that there are elements of a psychology research project that do not lend themselves to full axiomatisation for all values within a domain, a modular approach to the conceptual framework is supported, following the systems dynamics approach of Barlas [4]. Barlas notes that white-box modelling exists where we are able to achieve something like deductive closure for a model of both the representations used in the measurement and the real-world empirical correlates, or a complete check of the axiom set which yields a true model (see [92]). This is rarely achievable in practices of empirical research (see [44]). More realistic is the approach of grey-box modelling as described by Barlas [4]. For Barlas [4], there are an infinite number of ways of asserting what he calls model validity, both qualitative and quantitative. Some of the methods are well known, some are barely known. More important than a method's acceptance in the field, however, is the researcher's account of how the method supports the interpretation of both research process and research outcome. In the formulation of a grey-box model, it is vital that the inputs are specified in a way whereby they can be checked against the outputs, and the representational outputs can be checked against real-world phenomena [4]. The axiomatic formulation of these in the standardised model format instituted by Suppes [72], as described above, creates the conditions for this confirmation. In psychology research, we have a number of well-recognised practices that assist us with evaluations of our quantitative inferences from psychometric models, such as reliability and validity techniques ([46]; cf. [47]). In this proposal for a conceptual framework with particular attention to the question of scientific objectivity, we seek to add one property to the evaluation of each of the project elements beyond checks of reliability and validity – that of credibility. Credibility is a property addressed in the qualitative methodology literature [81], as a value that addresses consistency as a foundation for dependability [37], as well as the characteristic of trustworthiness [34].
Where we are challenged to secure correspondence between the psychological phenomena and the mathematical model, because of the non-observable status of the phenomena themselves, we must rely more heavily on the consistency of both our practices and the statements that we make following the execution of practices that adopt the GLVM. With the three criteria of reliability, validity, and credibility assessed for each element of the conceptual framework, research project outcomes can be evaluated in what is described in the systems literature as a research efficacy triptych [81]. These properties lend themselves to evaluation of the ontological appropriateness of research claims [81], cohered over a research project as a whole. Such an evaluation then allows for discernment of the objectivity of findings, given the researcher's subjective self-disclosures, in a standardised format. All this takes place in a way facilitative of scientific inference made from the best systematisation that the researcher can produce for their project.
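As a minimal sketch of how such a triptych evaluation might be operationalised, consider the following. This is our own construction, not a published procedure: it assumes binary pass/fail judgements per criterion, and the list of project elements is an invented illustration of the element domains discussed above:

```python
# Illustrative sketch: score each project element on the three criteria of
# the efficacy triptych [81], and count the project's evaluation as cohered
# only when every element passes all three.
ELEMENTS = ["theory", "model", "variables", "relations", "data", "phenomena"]
CRITERIA = ["reliability", "validity", "credibility"]

def project_coherent(evaluations):
    """evaluations: dict mapping (element, criterion) -> bool.
    Missing judgements count as failures, so silence is never evidence."""
    return all(evaluations.get((e, c), False)
               for e in ELEMENTS for c in CRITERIA)

# A fully affirmative (hypothetical) evaluation record coheres:
record = {(e, c): True for e in ELEMENTS for c in CRITERIA}
assert project_coherent(record)

# Withdrawing credibility for a single element breaks coherence overall:
record[("data", "credibility")] = False
assert not project_coherent(record)
```

The design choice worth noting is that unreported judgements default to failure, mirroring the argument above that undisclosed subjective decisions undermine claims to objectivity.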

13. Why inference from the best systematisation?

There has been recent endorsement of inference to the best explanation as relevant to the inferences made from psychological research (IBE: [25]). Such an approach relies on explanatory criteria in the selection of inferences from an array of possible conjectures relevant to psychological phenomena or events (see Harman, 1965; [25]). However, an initial question arises with respect to the explanatory condition of IBE, for the GLVM. Use of the GLVM may be involved in any of the activities of description, explanation, or prediction, each a vital aspect of scientific inquiry and none reducible, by necessity, to the others (see [7]). To adequately address the full extent of adoption of the GLVM in psychological research, in a way that simultaneously maximises the efficiencies wrought in systematisation of the reporting of practices and outcomes, inference from the best systematisation (IBS: [64]) recommends use of logical criteria, which facilitate evaluation of the systematicity of a research project. IBS wraps around any use of the GLVM, whether it is made in a descriptive, predictive, or explanatory capacity. A systematic approach to inferences from a well-evidenced foundation invites scrutiny of logical links between research elements, articulation of known theoretical and methodological constraints, and clarification of possible gaps in our research program. A further strength of IBS is that it asks us to account for our research projects in a way akin to the pre-registration processes becoming predominant in psychology research practices (see [60]). Although pre-registration databases aimed originally to secure some ground to assess reproducibility of research outcomes in the psychological sciences, a welcome by-product of pre-registration is increased transparency with respect to psychometric assumptions and practices. Nevertheless, current versions of these pre-registration databases do not address logical connections between elements of research projects in a way facilitative of overall judgements about the veracity of the reported outcomes of a research project. The conceptual framework proposed here for psychometric practices does exactly these things, and nurtures a ground for sustained assurance about the scientific objectivity of outcomes, in light of transparency regarding the subjective judgements necessary, in any research project, for each research element domain. One final consideration for us is the set of conditions that most benefit the clarification of constraints, as set out in what is described here as a realist constructivist stance. We now clarify this position in respect of a conceptual framework for the use of the latent variable model in psychology research.

14. Why do we need a realist constructivism?

In Markus and Borsboom [46] we see a shift from Borsboom's [11] earlier stance of scientific or entity realism, to a constructivist realism. Whereas Borsboom [11] characterised a constructivist stance as one where latent variables existed only in the mind of the researcher using the model, Markus and Borsboom [46] propose that constructs are ‘‘approximations of existing entities constructed through scientific practice” [46]. Here we see an embrace of the reliance on intersubjective understandings of the meaning of constructs in psychological research, particularly given the non-observability of psychological phenomena. In characterising our stance here as realist constructivist, we seek to emphasise a concern with the discovery of the true nature of psychological phenomena, over and above placing ontological import with or in the constructs we may use in an effort to say something about them. Realism is maintained in acknowledgement of the situation of the research project, and the way that this provides geo-historical co-ordinates for what must be included in the conceptual framework for a project. We remain realist in adopting the position of Maddy [40], accepting the best of what other sciences have to tell us about what reality is, and also adopt realist attitudes towards the phenomena in focus for the field of psychology (i.e., psychological events), in infinitely complex situations [90]. We also remain constructivist, following the mathematical constructivism exemplified in Ferreirós (2016). Ferreirós argues that the certainty we usually attribute to mathematics comes about as a function of the setting of constraints in axioms that function as working hypotheses, which ‘‘establish links that restrict the admissible” (p. 248).
It is these links that ensure the objectivity or invariance of mathematical findings, in exactly what Ferreirós (2016) describes as ‘‘a peculiarly strong form of intersubjectivity—very likely, the strongest there is for humans” (p. 160). By grounding the conceptual framework in a realist constructivist stance, we acknowledge both the mind-independent nature of empirical occasions, as well as the fact that human knowledge comes about in a network of relations founded at best in logically coherent practices. This approach makes possible the adoption of any preferred philosophical stance on behalf of a researcher for the research project, which can be fully accounted for in the conceptual framework. Such an attitude is consistent with the notion of local realism presented by Mäki [41], or thin realism as presented by Maddy [40], each of which privileges a contextualised philosophy that is considered by the researcher to be most relevant to the project at hand. A conceptual framework founded on a realist constructivist perspective makes use of the best of what science has to tell us about the nature of reality, but offers clear insight into the unique sets of assumptions utilised in the construction of such knowledge for the present research project. In such a way the framework offers the possibility of sustained coherent practice in connection with the tracking of invariances in systems of knowledge, occurring in but not limited by time.

15. Conclusion

The approach to a conceptual framework as presented here describes a theory of expected relations across project element domains, relevant to any research project that makes use of the GLVM for psychological research. Latent variable modelling still today dominates the field of psychometrics as a leading practice in the garnering of evidence as a function of peer-reviewed publication. Reliance on the assumption of conditional independence, foundational to the latent variable model, must be supported by demonstration of non-statistical evidence and evidentiary methods that support identification of the phenomena, by elaborating the logical constraints most relevant to the existence of the phenomena. Only by undertaking and accounting for such practices in a conceptual framework can latent variable modelling be in any way thought to contribute to objective measurement outcomes, as a practice in the stable of psychometric techniques.
By integrating instruments, practices, models, and theory in a single overarching conceptual framework custom-built for every research project, researchers using the latent variable model are well placed to offer, perhaps for the first time, insight into the depth and breadth of the critical inquiry that makes up their research endeavour. The degree to which latent variable modelling is not just a statistical endeavour (see [13,55]) becomes apparent in sustainable accounting of and for the full spectrum of researcher decisions needed just to use the models. We envisage that making all assumptions involved in the process of using the GLVM open for scrutiny will revolutionise our present-day practices in ways we cannot yet fully foresee.

References

[1] H. Akaike, A new look at the statistical model identification, IEEE Trans. Autom. Control 19 (1974) 716–723.
[2] D.B. Baker, L.T. Benjamin Jr, The affirmation of the scientist-practitioner: a look back at Boulder, Am. Psychol. 55 (2) (2000) 241–247.
[3] D.L. Bandalos, Measurement Theory and Applications for the Social Sciences, Guilford, New York, 2018.
[4] Y. Barlas, Formal aspects of model validity and validation in system dynamics, Syst. Dyn. Rev. 12 (3) (1996) 183–210.
[5] D.J. Bartholomew, Factor analysis for categorical data, J. Roy. Stat. Soc. B (1980) 293–321.
[6] P.M. Bentler, C.P. Chou, Practical issues in structural modeling, Sociol. Methods Res. 16 (1) (1987) 78–117.
[7] S. Boag, Explanation in personality psychology: ‘‘Verbal magic” and the five-factor model, Philos. Psychol. 24 (2) (2011) 223–243.
[8] K.A. Bollen, Latent variables in psychology and the social sciences, Annu. Rev. Psychol. 53 (2002) 605–634.
[9] K. Bollen, R. Lennox, Conventional wisdom on measurement: a structural equation perspective, Psychol. Bull. 110 (2) (1991) 305–314.
[10] E.G. Boring, Intelligence as the tests test it, New Republic 36 (1923) 35–37.
[11] D. Borsboom, Measuring the Mind: Conceptual Issues in Contemporary Psychometrics, Cambridge University Press, Cambridge, 2005.

[12] D. Borsboom, The attack of the psychometricians, Psychometrika 71 (2006) 425–440.
[13] D. Borsboom, Latent variable theory, Measurement 6 (2008) 25–53.
[14] D. Borsboom, G.J. Mellenbergh, Why psychometrics is not pathological: a comment on Michell, Theory Psychol. 14 (1) (2004) 105–120.
[15] M. Boumans, Suppes's outlines of an empirical measurement theory, J. Econ. Methodol. 23 (3) (2016) 305–315.
[16] L. Cai, Factor analysis of tests and items, in: K. Geisinger, B. Bracken, J. Carlson, J. Hansen, N. Kuncel, S. Reise, M. Rodriguez (Eds.), APA Handbook of Testing and Assessment in Psychology, Vol. 1: Test Theory and Testing and Assessment in Industrial and Organizational Psychology, American Psychological Association, Washington DC, 2013.
[17] N. Cliff, Article commentary: abstract measurement theory and the revolution that never happened, Psychol. Sci. 3 (3) (1992) 186–190.
[18] M. Cowles, Statistics in Psychology: An Historical Perspective, Psychology Press, London, 2001.
[19] I.J. Deary, Looking Down on Human Intelligence: From Psychometrics to the Brain, Oxford University Press, Oxford, 2000.
[20] M. Devitt, Realism and Truth, second ed., Blackwell, Cambridge, UK, 1991.
[21] S. Epskamp, M. Rhemtulla, D. Borsboom, Generalized network psychometrics: combining network and latent variable models, Psychometrika 82 (4) (2017) 904–927.
[22] F. Galton, Psychometric experiments, Brain 11 (1879) 149–162.
[23] L. Guttman, The determinacy of factor score matrices with implications for five other basic problems of common-factor theory, Br. J. Stat. Psychol. 8 (2) (1955) 65–81.
[24] I. Hacking, Representing and Intervening, Cambridge University Press, Cambridge, 1983.
[25] B.D. Haig, Inference to the best explanation: a neglected approach to theory appraisal in psychology, Am. J. Psychol. (2009) 219–234.
[26] B.D. Haig, Investigating the Psychological World: Scientific Method in the Behavioral Sciences, MIT Press, Boston, 2014.
[27] J.F. Hanna, The scope and limits of scientific objectivity, Philos. Sci. 71 (3) (2004) 339–361.
[28] B. Hart, C. Spearman, General ability, its existence and nature, Br. J. Psychol. 1904-1920 5 (1) (1912) 51–84.
[29] F.J. Hibberd, The metaphysical basis of a process psychology, J. Theor. Philos. Psychol. 34 (3) (2014) 161–186.
[30] L.T. Hu, P.M. Bentler, Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives, Struct. Eq. Model. 6 (1) (1999) 1–55.
[31] K.G. Jöreskog, Some contributions to maximum likelihood factor analysis, Psychometrika 32 (4) (1967) 443–482.
[32] K.G. Jöreskog, A general method for estimating a linear structural equation system, in: A.S. Goldberger, O.D. Duncan (Eds.), Structural Equation Models in the Social Sciences, Seminar Press, New York, 1973, pp. 85–112.
[33] J.O. Kim, C.W. Mueller, Factor Analysis: Statistical Methods and Practical Issues (No. 14), Sage, London, 1978.
[34] L. Krefting, Rigor in qualitative research: the assessment of trustworthiness, Am. J. Occup. Ther. 45 (3) (1991) 214–222.
[35] D.H. Krantz, R.D. Luce, P. Suppes, A. Tversky, Foundations of Measurement (Additive and Polynomial Representations), vol. 1, Academic Press, New York, 1971.
[36] P. Lazarsfeld, N. Henry, Latent Structure Analysis, Houghton Mifflin, New York, 1968.
[37] Y.S. Lincoln, E.G. Guba, Naturalistic Inquiry, Sage, London, 1985.
[38] F. Lord, A Theory of Test Scores (Psychometric Monograph No. 7), Psychometric Corporation, Richmond, VA, 1952.
[39] R.D. Luce, Quantification and symmetry: commentary on Michell, quantitative science and the definition of measurement in psychology, Br. J. Psychol. 88 (3) (1997) 395–398.
[40] P. Maddy, Defending the Axioms: On the Philosophical Foundations of Set Theory, Oxford University Press, Oxford, 2011.
[41] U. Mäki, Reglobalizing realism by going local, or (how) should our formulations of scientific realism be informed about the sciences?, Erkenntnis 63 (2) (2005) 231–251.
[42] M.D. Maraun, Metaphor taken as math: indeterminacy in the factor analysis model, Multivar. Behav. Res. 31 (4) (1996) 517–538.
[43] M.D. Maraun, S.M. Gabriel, Illegitimate concept equating in the partial fusion of construct validation theory and latent variable modeling, New Ideas Psychol. 31 (1) (2013) 32–42.
[44] L. Mari, P. Carbone, A. Giordani, D. Petri, A structural interpretation of measurement and some related epistemological issues, Stud. History Philos. Sci. A 65 (2017) 46–56.
[45] K.A. Markus, Constructs, concepts and the worlds of possibility: connecting the measurement, manipulation, and meaning of variables, Measurement 6 (1–2) (2008) 54–77.
[46] K.A. Markus, D. Borsboom, Frontiers of Test Validity Theory: Measurement, Causation, and Meaning, Routledge, New York, 2013.
[47] A. Maul, Rethinking traditional methods of survey validation, Measurement 15 (2) (2017) 51–69.
[48] A. Maul, L. Mari, D.T. Irribarra, M. Wilson, The quality of measurement results in terms of the structural features of the measurement process, Measurement 116 (2018) 611–620.
[49] R.P. McDonald, Test Theory, Lawrence Erlbaum Associates, Mahwah, NJ, 1999.

[50] R.J. McNally, D.J. Robinaugh, G.W. Wu, L. Wang, M.K. Deserno, D. Borsboom, Mental disorders as causal systems: a network approach to posttraumatic stress disorder, Clin. Psychol. Sci. 3 (6) (2015) 836–849.
[51] W. Meredith, Measurement invariance, factor analysis and factorial invariance, Psychometrika 58 (4) (1993) 525–543.
[52] J. Michell, The quantitative imperative: positivism, naïve realism and the place of qualitative methods in psychology, Theory Psychol. 13 (1) (2003) 5–31.
[53] J. Michell, Conjoint measurement and the Rasch paradox: a response to Kyngdon, Theory Psychol. 18 (1) (2008) 119–124.
[54] C.U. Moulines, Ontology, reduction, emergence: a general frame, Synthese 151 (3) (2006) 313–323.
[55] S.A. Mulaik, Factor analysis, information-transforming instruments, and objectivity: a reply and discussion, Br. J. Philos. Sci. 42 (1) (1991) 87–100.
[56] S.A. Mulaik, Objectivity and multivariate statistics, Multivar. Behav. Res. 28 (2) (1993) 171–203.
[57] S.A. Mulaik, Foundations of Factor Analysis, second ed., Chapman & Hall/CRC, Boca Raton, FL, 2010.
[58] S.A. Mulaik, R.P. McDonald, The effect of additional variables on factor indeterminacy in models with a single common factor, Psychometrika 43 (2) (1978) 177–192.
[59] N.J. Nersessian, The cognitive basis of model-based reasoning in science, Cogn. Basis Sci. (2002) 133–153.
[60] Open Science Collaboration, Estimating the reproducibility of psychological science, Science 349 (6251) (2015) aac4716.
[61] J.A. Passmore, The nature of intelligence, Aust. J. Psychol. Philos. 13 (4) (1935) 279–289.
[62] A. Petocz, G. Newbery, On conceptual analysis as the primary qualitative approach to statistics education research in psychology, Stat. Educ. Res. J. 9 (2010) 123–145.
[63] G. Rasch, Studies in Mathematical Psychology: I. Probabilistic Models for Some Intelligence and Attainment Tests, Nielsen & Lydiche, Oxford, England, 1960.
[64] N. Rescher, Inference from the best systematization, Mind Soc. 15 (2) (2016) 147–154.
[65] D. Sherry, Thermoscopes, thermometers, and the foundations of measurement, Stud. History Philos. Sci. A 42 (4) (2011) 509–524.
[66] K. Sijtsma, Psychometrics in psychological research: role model or partner in science?, Psychometrika 71 (3) (2006) 451–455.
[67] A. Skrondal, S. Rabe-Hesketh, Generalized Latent Variable Modelling: Multilevel, Longitudinal, and Structural Equation Modelling, Chapman & Hall/CRC, Boca Raton, FL, 2004.
[68] C. Spearman, ‘‘General Intelligence,” objectively determined and measured, Am. J. Psychol. 15 (2) (1904) 201–292.
[69] C. Spearman, The uniqueness of ‘‘g”, J. Educ. Psychol. 20 (3) (1929) 212–216.
[70] J.H. Steiger, Factor indeterminacy in the 1930's and the 1970's: some interesting parallels, Psychometrika 44 (2) (1979) 157–167.
[71] P. Suppes, A set of independent axioms for extensive quantities, Portugaliae Math. 10 (4) (1951) 163–172.
[72] P. Suppes, A comparison of the meaning and uses of models in mathematics and the empirical sciences, Synthese 12 (1960) 287–301.
[73] P. Suppes, Models of data, in: E. Nagel, P. Suppes, A. Tarski (Eds.), Logic, Methodology, and Philosophy of Science: Proceedings of the 1960 International Congress, Stanford University Press, Stanford, 1962, pp. 252–261.
[74] P. Suppes, Axiomatic Set Theory, Courier Corporation, New York, 1972.
[75] P. Suppes, Introduction to Logic, Dover Publications, New York, 1999.
[76] P. Suppes, Representation and Invariance of Scientific Structures, CSLI Publications, Stanford, 2002.
[77] B. Thompson, Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications, American Psychological Association, Washington DC, 2004.
[78] G.H. Thomson, A hierarchy without a general factor, Br. J. Psychol. 8 (3) (1916) 271–281.
[79] R.L. Thorndike, Factor analysis of social and abstract intelligence, J. Educ. Psychol. 27 (3) (1936) 231–233.
[80] L.L. Thurstone, The vectors of mind, Psychol. Rev. 41 (1) (1934) 1–32.
[81] W. Varey, Apithology systems inquiry: evaluation from a generativist ontology, Systems 5 (1) (2017) 22.
[82] R.T. Warne, C. Burningham, Spearman's g found in 31 non-western nations: strong evidence that g is a universal phenomenon, Psychol. Bull. 145 (3) (2019) 237–272.
[83] E.B. Wilson, Review of 'The Abilities of Man, Their Nature and Measurement' by C. Spearman, Science 67 (1928) 244–248.
[84] E.B. Wilson, Comment on Professor Spearman's note, J. Educ. Psychol. 20 (1929) 217–223.
[85] J. Creswell, Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, Sage Publications, Thousand Oaks, 2012.
[86] K.G. Jöreskog, Analysis of covariance structures, in: P.R. Krishnaiah (Ed.), Multivariate Analysis–III, Academic Press, Cambridge, 1973, pp. 263–285.
[87] J.A. Maxwell, Qualitative Research Design: An Interactive Approach, Sage Publications, 2012.
[88] S.A. Mulaik, R.P. McDonald, The effect of additional variables on factor indeterminacy in models with a single common factor, Psychometrika 43 (2) (1978) 177–192.
[89] A. Petocz, N. Mackay, Unifying psychology through situational realism, Rev. Gen. Psychol. 17 (2) (2013) 216–223.
[90] C. Spearman, The Abilities of Man: Their Nature and Measurement, Macmillan, London, 1927.
[91] P. Suppes, M. Zanotti, When are probabilistic explanations possible?, Synthese 48 (2) (1981) 191–199.
[92] A. Tarski, What is elementary geometry?, in: L. Henkin (Ed.), The Axiomatic Method: With Special Reference to Geometry and Physics: Proceedings of an International Symposium Held at the University of California, North Holland Publishing Company, Amsterdam, 1959.