Studies in Educational Evaluation 32 (2006) 401–409 www.elsevier.com/stueduc
Gardner, J. (Ed.) (2006). Assessment and learning. London: Sage.
Fadia Nasser-Abu Alhija, School of Education, Tel Aviv University, Israel
0191-491X/04/$ – see front matter © 2006 Published by Elsevier Ltd. doi:10.1016/j.stueduc.2006.10.006

During the last two decades there has been a flourishing of texts on different aspects of assessment and learning (e.g., Black, Harrison, Lee, Marshall, & Wiliam, 2003; Clarke, 2005; Segers, Dochy, & Cascallar, 2003; Torrance & Pryor, 1998; Weeden, Winter, & Broadfoot, 2002; Wiggins, 1998). Assessment and Learning is yet another such publication, with a broader scope, providing a non-technical and comprehensive overview of key concepts, issues, and topics in the field of student assessment, in particular assessment for learning. This edited volume has contributions from ten scholars with long-standing interest and experience in assessment and learning, amongst other things through their activity within the Assessment Reform Group (ARG) in the UK. The book is organized in four parts comprising 11 chapters, the content of which I shall discuss briefly. I shall also offer a more general appraisal.

Part I. Practice

This part of the book includes two chapters. In Chapter 1, "Assessment for learning in the classroom", Black and Wiliam present the essence of what research has had to say about formative assessment. Although the findings of their review are all well known and documented in other texts on classroom assessment, they highlight the difficulty of changing teachers' classroom practices, particularly implementing new methods of classroom assessment. The literature review is followed by an excellent presentation of how the King's-Medway-Oxfordshire Formative Assessment Project (KMOFAP) sought to translate research findings into practice. No less important and illuminating are the authors' reflections on their work with teachers, the participating teachers' reflections, and the implications of the findings from the project. Their discussion of practical issues is particularly helpful to teachers, researchers, and policy makers as the authors do not shy away from concrete concerns and challenges relating to assessment for learning practices.

Chapter 2 focuses on "professional learning as a condition for assessment for learning". In this chapter, James and Pedder examine the issue of teachers' learning in support of assessment for learning. Evidence from responses of teachers and school principals regarding the association between classroom assessment practices and values, and professional learning practices and values is clearly presented. The chapter concludes by arguing for an interpretation that draws parallels between processes of assessment for student learning and inquiry-based learning by teachers. James and Pedder report the findings regarding the structures of the assessment and learning constructs, together with the associations among them. However, information regarding the method of factor analysis used to extract factors, the range of the magnitude of factor loadings, and whether the items are ordered in the tables according to the magnitude of their factor loadings, is not reported. This information is crucial for evaluating the findings and for enabling replication studies by other researchers. Implications of assessment for learning for classroom roles, and the importance of teacher learning for promoting this type of assessment, are clearly presented. The importance of teachers' professional development for acquiring assessment for learning skills, which was evident in the authors' presentation, is addressed with alacrity and wit.

Part II. Theory

This part includes three chapters (3-5) that focus on the theoretical underpinnings of learning and assessment. In Chapter 3, Mary James presents a clear and effective discussion of the relationship between assessment, teaching, and theories of learning.
The chapter begins with a section about alignment between assessment and learning, followed by a description of three different examples of assessment practices illustrating the alienation between learning theories and assessment practices. In the section about the theoretical foundations of learning and assessment practices the author presents three clusters of learning theories and provides a good discussion of their implications for assessment practices. The chapter concludes with a discussion of whether eclectic or synthetic models of assessments aligned to learning are feasible. This chapter is a highly instructive resource for teachers, school principals, policy makers, and researchers seeking a lucid synopsis of theories of learning and how they can be used to explain the relationship between assessment practices, teaching, and learning outcomes. It is particularly valuable for teachers who have only taken a course on theories of learning during their training. Teacher training, as the author justifiably indicates, does not provide teachers with the knowledge and skills that are needed to align their teaching and assessment practices with their understanding of learners, learning, and subject knowledge. One of the appealing features of this chapter is its organization. The brief descriptions are conceptually organized to match the main theme of the chapter and the book in general. I also want to praise the way the gap between theory and practice is discussed and agree with the author's conclusion that "psychological, social-psychological,
sociological, and epistemological dimensions all need to be taken into consideration at some level in the framing of assessment practice" (p. 48).

Chapter 4 presents a clear and inspiring text on the role of assessment in developing motivation for learning. Wynne Harlen provides a lucid and useful discussion of some key components of motivation for learning and some related theories. This discussion is followed by reference to research evidence relating to the negative impact that some summative assessment practices, particularly high-stakes tests, have on motivation for learning. The chapter concludes by drawing together implications for assessment policy at the school, local and national levels. This chapter is useful as well as important. It is useful because it looks into the association between assessment and motivation, draws attention to the risk or negative impact of some assessment practices, and suggests some ways of using assessment to enhance motivation for learning. These three topics should be of great interest to anyone involved in educational assessment. The importance of this chapter is rooted in the fact that affective correlates and outcomes of learning and assessment have been given little attention in previous assessment texts. The contribution of this chapter is therefore great, illuminating the association between motivation and assessment and bringing this topic to the forefront of assessment research. The clear description and discussion of the main themes are beautifully bolstered by attention to theoretical associations between assessment and motivation and by evidence from studies of the impact of assessment practices on components of motivation in different contexts. One shortcoming, however, is that some of the important literature on motivation variables is ignored. For example, Bandura's (1997) masterful exposition on self-efficacy is not even mentioned.
This seems to be an oversight and one that will hopefully be addressed in a future edition.

In Chapter 5, Black and Wiliam report on their attempt to develop a theory of formative assessment. They discuss changes in the practice of teachers who were involved in a formative assessment project in terms of the relationship between the teacher's role and the nature of the subject discipline, teachers' beliefs about their roles in the regulation of the learning process, feedback and student-teacher interaction, and students' roles in learning. The authors acknowledge the limited explanatory power of a theoretical framework based on only four elements; however, they rightly believe these elements should be taken into account in any attempt to understand the association between assessment and learning. The application of activity theory to explain the relationship between tools, subjects and objects, and outcomes in the classroom, to which they refer as an activity system, seems very plausible. Changes through two assessment interventions (the KMOFAP and BEAR projects) are used as complementary bases for theory development, and three related research and development studies are presented to illustrate possibilities of classroom pedagogy. The limited applicability of the proposed theoretical framework, which was derived inductively from experiences and lessons learned from implementing a few projects in specific contexts, is acknowledged by the authors. Hence it is important to develop a theoretical framework that will be applicable to a wide array of contexts and under various circumstances. The proposed theoretical framework should therefore be viewed as an inspiring lead to be followed by further work, providing further empirical support for the
proposed relationship and exploring other potential elements that can be incorporated within a more comprehensive theory of formative assessment.

Part III. Formative and Summative Assessment

This part includes three chapters (6-8) focusing on the purposes of assessment, reliability and validity. In Chapter 6, Wynne Harlen addresses the relationship between assessment for formative and assessment for summative purposes. The author makes a successful attempt to highlight the question of whether it is useful to consider summative assessment (assessment of learning) and formative assessment (assessment for learning) as conceptually or pragmatically distinct activities. The characteristics of each type of assessment, how each can be used for both purposes, and the limitations of using evidence gathered for one purpose to serve the other are skillfully presented. The chapter concludes that at present the distinction between formative and summative purposes of assessment should be maintained, whilst assessment systems should include the provisions that make it possible for information gathered by teachers to be used for both purposes. I especially appreciate the charts offered, which summarize the relationship between elements in the assessment for learning (p. 105) and assessment of learning processes (p. 106). By using these organizing charts the author makes explicit the distinction between assessment for learning and assessment of learning in terms of the development process on the one hand, and the different roles that the student plays in each of the two modes of assessment on the other. A final comment is in order. Despite the aforementioned strengths of this chapter, its contribution to the book's central argument and to the potential reader is limited.
This is because its basic message, to the effect that the same information cannot be used for summative and formative purposes due to the asymmetric relationship between the two, is not new. Furthermore, the author's justified concern that formative assessment might turn into a set of summative assessment practices is repeated in other chapters in the book.

Chapter 7 focuses on "the reliability of assessments". In the first section, Black and Wiliam discuss the meaning of the reliability of scores obtained from summative tests. In the second section they present published evidence about test reliabilities. In the third section they focus on decision consistency, that is, the effects of low reliability on the errors that result in assigning candidates to specific grades or levels on the basis of test scores. In the next three sections they broaden the scope of their discussion, looking at the overlap between reliability and validity, the reliability of formative assessments, and the broader issue of dependability. The non-technical style of this chapter is well suited to presenting an overview of the topics covered. I also liked the idea of describing studies that demonstrate the negative impact of using unreliable tests. I found this very effective, providing lively evidence for the abstract notion of reliability. Another captivating feature of the chapter is that it is clearly organized and uses a light, down-to-earth manner that makes most of the material easy to grasp. Some topics could benefit from further clarification. One of these is the explanation of the discriminatory power of an item. Furthermore, it seems to me that the argument "all
reliability threats are removable if the cost and time needed to produce the results are secured" and the claim that "the public gets what it is prepared to pay for," are not completely true. Although time and cost are critical for reducing threats to reliability, other measurement considerations are also imperative even when time and cost are secured. Also, other psychometric concepts deserve some consideration, such as generalizability, through which all sources of measurement error are assessed, and reliability estimates in item response theory (IRT), which do not assume equal measurement error for all performance levels and can be taken into consideration in assessing selection errors. Some ways of enhancing reliability and the problematic issues attached to such practices are discussed and clearly presented; however, a more comprehensive exploration of what can be done to enhance reliability and how to avoid potential measurement pitfalls and reliability threats is needed.

Chapter 8 addresses "the validity of formative assessment". The main topics of this chapter are current understandings of validity in relation to summative and formative assessment, and factors inside and outside the classroom that either promote assessment for learning or hinder it. Gordon Stobart identifies the context in which learning takes place (socio-cultural and policy environment as well as what goes on in the classroom) and the quality of the feedback as two key factors that can promote or hinder assessment for learning. In discussing the current understanding of validity the author skillfully provides an overview of the relevant, up-to-date literature about validity, such as Messick (1989) and the American Educational Research Association's Standards for Educational and Psychological Testing. Stobart accurately discusses the meaning of validity, emphasizing that validity refers to the interpretation of scores rather than to the test itself; attributing validity to the test is a common error made by many graduate students and some researchers.
The author stresses that consequential validity is important in the context of both summative and formative assessments. He also makes clear that validity in formative assessment is about consequences, i.e., whether further learning takes place as a result of assessment. Although consequential validity is crucial, the importance of the content aspect also deserves mention. A final note relates to Black and Wiliam's promise that the issue of dependability, an overall integrating concept that includes both validity and reliability, is comprehensively considered in this chapter on validity. This promise is fulfilled only in part.

Part IV. Policy

This part includes three chapters (9-11) focusing on policy in relation to assessment for learning in different contexts. In Chapter 9, "Constructing assessment for learning in the UK policy environment", Daugherty and Ecclestone explore the rise of assessment for learning as a characteristic of educational policy in the four countries of the UK (England, Scotland, Wales, and Northern Ireland). They also demonstrate how assessment policies are a key element in these countries' different policy environments. Furthermore, the authors address the rising prominence of assessment for learning within the broader educational policy
scene as one of the major ways in which governments aim to alter professional and public expectations of assessment systems. The authors cleverly take as their starting point Broadfoot's (1996) reminder that assessment practices and discourses are embedded in and emanate from cultural, social and political traditions and assumptions. These affect policies and teachers' practices in subtle, complex and often contradictory ways. The authors skillfully present the complexity and irony of policies and indicate how recommendations are distorted through interpretation and implementation. The arguments regarding policy are supported with examples and explanations reflecting the knowledge and expertise of the authors concerning policy issues in relation to assessment practices. The fact that it took about a decade to establish assessment for learning as an element in the official policy discourse in each of the four countries of the UK indicates that such a change is difficult but not impossible. The observation that assessment for learning is reflected in the official educational policy in the UK countries in ways that mirror their distinctive cultures, policy environments, and policy processes implies that the translation of research findings into official policy, and from there into practice, cannot simply be transferred from one context to another.

In Chapter 10, "Assessment for Learning: Why no profile in US policy?", Wiliam describes the current position with regard to assessment for learning in the USA in the light of the more general history of assessment. The chapter begins with a brief description of the establishment of the College Entrance Examination Board and its impact on university admission examinations. The following sections describe the history of the Scholastic Aptitude Test and its current prominence in university admissions in the USA. The final sections discuss assessment practices in schools in the latter part of the twentieth century, including some alternative methods.
The proliferation of written aptitude testing is understandable given the concerns about cost, objectivity, and accountability that shape the decision to use assessment for learning or assessment of learning. Despite the dominance of aptitude and multiple-choice tests, several attempts have been made in the USA to implement other modes of assessment. The use of alternative assessment methods in the early to mid-1990s deserves more consideration within this chapter. The negative effects of grading practices on motivation and learning are discussed in previous chapters. However, the unique features of the US testing practices as described in this chapter provide a clear illustration of how constraints, as well as the movers and shakers of policy considerations, impact assessment practices at all levels of the education system.

Chapter 11 draws on illustrative examples of practice from selected countries that participated in the OECD study of formative assessment. The author competently outlines the major tensions that challenge efforts to draw interpretations and inferences from comparisons of assessment for learning across countries. In addition, readers who are interested in a more extensive discussion of this issue in the literature of comparative education can benefit from the references provided throughout the chapter. Jody Sebba makes a successful attempt at drawing parallels between the experiences of assessment for learning across the different countries involved in the study. Particularly
useful and illuminating is the evidence of the influence of culture on the educational organization and processes in different countries. Despite limitations outlined by the author, the OECD study provides rich descriptions of the variety of practices in assessment for learning in a range of different contexts. This offers interesting insights and suggests that some classroom practices and challenges may have more similarities than differences. While the Assessment Reform Group's (ARG, 2002a) definition puts considerably greater emphasis on the use to be made by learners of the assessment information, the OECD definition of formative assessment stresses adjustment of teaching in light of assessment. These are complementary rather than clashing aspects of formative assessment and emphasize the link between assessment and learning. As with previous chapters, the description of the case studies that compare formative assessment practices across countries is beautifully bolstered by excellent attention to practical and problematic aspects of strategies of formative assessment in the participating countries.

General Appraisal

The idea of aligning instruction, learning, and assessment is not new and is acknowledged as an important issue in nearly all programs for academic development (Biggs, 1999). This book provides the readers with research-based evidence and practical implications for the introduction of new modes of assessment in order to reach greater consistency between the learning and assessment environments and practices. A total of ten scholars in the field of assessment contribute 11 chapters in this book. In addition to these chapters, John Gardner has done a great job introducing the concept of assessment for learning, its definition, and how it is distinct from assessment of learning. He also provides an excellent summary and comments on the main issues addressed throughout the book.
The relatively large number of contributors is reflected in the scope of the book: the key aspects of educational assessment are covered, giving the reader a good overview of contemporary developments and issues in the field of assessment. This book is a very important contribution to the field of assessment for learning, throwing light on its various potentials for improving learning. The argument in favor of assessment for learning is coherent and well established both theoretically and empirically. Also, the authors' strong belief in their approach and their keen attempts to make it work deserve praise. The contribution of the volume is manifold. Assessment and Learning provides a crucial link between theory and practice and between innovation and theorizing. It is a must for policy makers concerned with assessment, and certainly for anyone in education working on projects related to assessment and learning. It can be used easily as a textbook at the undergraduate and graduate levels, as most of the chapters stand alone as discussions of assessment-for-learning issues. The volume is also of value as a reference book on assessment and learning, providing an interesting source for readings. It provides an excellent overview of key issues, concepts and topics in the field of student assessment, in particular assessment for learning. This major contribution essentially delivers what the
title promises in that it carries the link between assessment practice and learning forward by providing a state-of-the-art account and by advancing our understanding of the complex array of assessment issues and factors that are at play in assessment practices. It will be intriguing to see how the projects described in this volume become daily practice at school.

A closing comment is in order. Theoretically there is a consensus that assessment for learning can improve attainment; however, the issue is practice. That is, how can assessment for learning be effectively implemented in order to meet standards of teaching quality? The experience in the USA and elsewhere indicates that assessment carried out by teachers does not meet acceptable criteria. Therefore, it would be a good thing to explore the reasons that underlie the failure of this approach more deeply rather than apportioning blame, on the one hand, or attempting to convince readers that this approach is important, on the other. A more comprehensive exploration of what can be done to enhance the practice of assessment for learning and how to avoid potential assessment pitfalls and misuse is needed.

To conclude, Assessment and Learning is a useful and inspiring book for everyone concerned with the field of assessment and provides a comprehensible overview of the contemporary developments in the field of educational assessment, particularly assessment for learning. The long reference list at the end of the book (364 entries) is of great value for readers who are interested in expanding their knowledge of the topics covered in the 11 chapters of the book, as well as in the introduction and the closing notes by the editor. This book has potential appeal to researchers, policy makers, students, teachers, and professionals alike. It should be clear, however, that the book provides neither solutions to the limited implementation of assessment for learning nor remedies for its misuse.
References

Bandura, A. (1997). Self-efficacy: The exercise of control. New York: Freeman.
Biggs, J. (1999). What the student does: Teaching for enhanced learning. Higher Education Research and Development, 18 (1), 57-75.
Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003). Assessment for learning: Putting it into practice. Buckingham: Open University Press.
Broadfoot, P. (1996). Education, assessment and society. Buckingham: Open University Press.
Clarke, S. (2005). Formative assessment in the secondary classroom. London: Hodder Murray.
Messick, S. (1989). Validity. In R.L. Linn (Ed.), Educational measurement (3rd ed.). New York: American Council on Education/Macmillan.
OECD (2005). Formative assessment: Improving learning in secondary classrooms. Paris: OECD.
Segers, S., Dochy, F., & Cascallar, E. (Eds.) (2003). Optimizing new modes of assessment: In search of qualities and standards. Dordrecht: Kluwer.
Torrance, H., & Pryor, J. (1998). Investigating formative assessment: Teaching, learning and empirical questions. Buckingham: Open University Press.
Weeden, P., Winter, J., & Broadfoot, P. (2002). Assessment: What's in it for schools? London: Routledge Falmer.
Wiggins, G. (1998). Educative assessment: Designing assessments to inform and improve student performance. San Francisco: Jossey-Bass, Wiley.
The Author

FADIA NASSER-ABU ALHIJA is a senior lecturer and head of the Program on Research, Measurement, and Evaluation Methods at the Tel Aviv University School of Education. Main research topics: measurement and evaluation of gender- and culture-related achievements, evaluation of teachers and teaching, and structure validity (methodological studies). Correspondence: