A Formative Evaluation of a Task-based EFL Programme for Korean University Students


1.1 Introduction

The following study is an investigation and an evaluation of an EFL English Conversation Programme designed and implemented by the author, for students of Andong National University, in the Republic of Korea. The programme, which entered its pilot year in March 1997, and which reached its first year of full implementation in 1999, was commissioned by the then president of the university, Dr. Lee Jin-seol, with the intention of promoting communicative competence in English (in line with government policy, cf. Li 1998:681) for all university students in their first three years of study[1]. Given this purpose, plus the particular characteristics of the learners (section 5.3, page 181) and their socio-cultural and learning background (sections 2.3.2, 2.3.3, 2.4, pages 39, 40, 42), the author decided to offer a task-based infrastructure, founded on a "process" view of learning and on humanistic goals and methods (Chapter 4, page 173). This approach aimed to promote communicative competence in a holistic setting, focusing on development of learning skills and of learning attributes (confidence, motivation and independence) which would facilitate students' future learning endeavours.

The task-based "process" infrastructure was seen as an appropriate "vehicle" for a "language-learning as education" package, because it allowed a focus on affective, psycho-social and socio-cultural aspects of learning (cf. literature review, section 3.3, page 99), it promoted a problem-solving approach to language learning and to learner training (section 3.4.3, page 144), and it encouraged learner autonomy (cf. literature review, section 3.2, page 55). Such an expanded view of the programme's original purpose implied curricular attention to learner training (section, page 231), self-direction/autonomy (section 3.2, page 55), self-assessment (section 3.2.3, page 92) and reflection (section, page 94), along with attention to affective variables such as lack of confidence (section, page 108), language-learning anxiety (section 3.3.4, page 118), and unrealistic expectations (section, page 117), all considered to be barriers to learning in the Korean classroom. Since there were no commercially available textbooks with similar goals and methods (cf. Lee 1996:167) at that time, three books of task-based activities and projects (sections 2.6.3, 7.3, pages 52, 223) were co-authored by the author and the then Director of the Language Centre at ANU, Dr. Hyun Tae-duck, and were further adapted in consultation with teaching staff during the first three years of the programme (1997, 1998, 1999), finally being published in colour in 2000.

Assessment of the students (and therefore of the programme) recognised research findings that learners do not "learn" what teachers "teach" (Allwright 1984; Williams & Burden 1997) and that learner beliefs and perceptions determine the content and efficacy of learning, representing reality for the students (Rogers 1951:484). It was therefore decided to incorporate self-evaluation and reflection in the curriculum, and to encourage students to monitor their progress through the programme. In this way, a record of changes in perceptions would be constructed, providing information on how learners perceived their own learning progress and the programme's effectiveness.

The evaluation method chosen (section 1.3, page 29) was "formative" (Breen & Candlin 1980), and to some extent "illuminative" (Parlett 1981), since the programme designer was also the evaluator, involved in the day-to-day running and development of the programme:

A genuinely communicative use of evaluation will lead towards an emphasis on formative or ongoing evaluation rather than summative or end-of-course evaluation. (Breen & Candlin 1980:105-6)[2]

Qualitative and quantitative results obtained through questionnaires, interviews, learner journals and self-assessment instruments, provided data on whether students became (i.e. perceived themselves as) more confident, motivated and independent, as a measure of programme success.

1.2 Research questions

Although "the very process of evaluation helps to shape the nature of the project itself" (Williams & Burden 1994:22), it was considered appropriate to formulate research questions, even though these would be subject to formative change (cf. Parlett 1981:421, cited in Williams & Burden 1994:23). At the beginning of this research Long & Crookes' observation that "No complete programme that we know of has been implemented and evaluated [3] which has fully adopted even the basic characteristics of TBLT ..., much less the detailed principles for making materials design and methodological decisions" (Long & Crookes 1993:43) made evaluation of the design and effectiveness of a task-based programme the main priority. However, since Mohamed's findings that "the TBL syllabus and method is more effective than the PPP method and syllabus" (Mohamed 1998:251), the focus of attention shifted to evaluation of a task-based programme as a vehicle for promotion of learner autonomy, positive affect, and (consequently) positive attitude change in students and teachers[4]. Since the programme was to be evaluated in terms of participant (students/teachers) perceptions, these research questions emphasised qualitative measurement of attitude change:

  1. Research question 1: Did learner/teacher beliefs change during the research period?
    • If "Yes", how did they change?
    • If "Yes", to what extent was this due to the programme?
  2. Research question 2: Did learners/teachers become more confident, motivated and independent during the research period?
    • If "Yes", how did this manifest itself?
    • If "Yes", to what extent was this due to the programme?
  3. Research question 3: Did learners perceive an improvement in their oral skills during the research period?
    • If "Yes", to what extent was this due to the programme?
  4. Research question 4: Did the research affect the programme?
    • If "Yes", how did this manifest itself?

1.3 Programme evaluation

1.3.1 Definitions

Rea-Dickins (1994:72) points out that there have been many different ways of interpreting educational evaluation and offers a selection of definitions:  

  • Educational evaluation is the process of delineating, obtaining and providing useful information for judging decision alternatives. (Stufflebeam 1971:43)
  • Evaluation is the process of marshalling information and arguments which enable interested individuals and groups to participate in the critical debate about a specific programme. (Kemmis 1986)
  • Educational evaluation is a systematic description of educational objectives and/or an assessment of their merit or worth. (Hopkins 1989:14)
  • Evaluation is the principled and systematic collection of information for purposes of decision-making. (Rea-Dickins & Germaine 1992)
  • Evaluation is the process of determining the merit, worth and value of things, and evaluations are the products of that process ... evaluation is here treated as a key analytical process in all disciplined intellectual and practical endeavours. ... It is said to be one of the most powerful and versatile of the 'transdisciplines' - tool disciplines such as logic, design, and statistics - that apply across broad ranges of the human investigative and creative effort while maintaining the autonomy of a discipline in their own right. (Scriven 1991:1)

These definitions can be interpreted according to either the "psychometric" or "ethnographic" research paradigm[5] (cf. section 1.3.2, page 30), though terms such as "process" and "description" suggest a naturalistic tendency, which Rea-Dickins & Germaine (1998) take further, stating that evaluation is "about innovation" (Rea-Dickins & Germaine 1998:15), "about the worth of something" (Rea-Dickins & Germaine 1998:17), about accountability, curriculum development, awareness-raising (cf. Stenhouse 1975) and managing (Rea-Dickins & Germaine 1998:17), about forming a judgement and providing evidence, and "about stimulating learning and understanding" (Rea-Dickins & Germaine 1998:11), in addition to the more traditional view of evaluation as a tool for assessing programme impact and value for money.

1.3.2 Quantitative vs. qualitative

Core issues of language programme evaluation have traditionally included: i) what is evaluated; ii) who does the evaluating; iii) when the evaluation occurs; and iv) how the evaluation is carried out (Rea-Dickins 1994:77), though (especially in light of recent attention to process in evaluation) questions relating to: v) who the evaluation is intended for (e.g. financial sponsors, executive administrators, students, teachers, programme designer, curriculum designer); and vi) how it will be used (e.g. to help in improving the programme for participants, or to assist decision-making regarding its continued feasibility) are also pertinent. Approaches to these issues have developed from a positivistic perspective of "objective" judgement in "experimental" conditions, with reference to target objectives, to one which includes educational process, is "participative, concerned with communication and critical debate and ... is principled, systematic and an integral part of curriculum planning and implementation" (Rea-Dickins 1994:72). This development has been identified as a paradigm dialogue (Guba 1990) between a positivistic (quantitative) and a naturalistic (qualitative) view of evaluation (cf. Lynch 1996), showing similarities to the propositional/process paradigm debate in syllabus design (cf. Breen 1987a; 1987b; Section 3.4, page 139).

From the conventional positivistic point of view, reality is objective, and facts must be separated from values by researchers who remain distant from the observed data, the assumption being that a "laboratory" approach to learning conditions, and "scientific" measurement of observable and verifiable "facts" can isolate and measure relevant variables. Such "classical" evaluation usually depends on objectives having been clearly stated at the outset, and consists of a pre-programme test, the teaching programme, and then a post-programme test, involving initial, formative and summative procedures (Sharp 1990:132/3). This approach can be very appealing to the researcher, except that statistical models can be shown to have social biases, and the "most objective of procedures ... can be shown to require subjective interpretations" (Lynch 1996:17). There is also the danger (as with the "scientific" approach to testing – section 6.4, page 203) that such evaluations "can only measure that which is measurable" (Van Lier 1996:120), and can focus on "fixed characteristics" (e.g. proficiency levels) that do not necessarily exist, or that are in fact variable (cf. Weir 1988:7), to the exclusion of other indicators of learning. Other problems inherent in the classical "experimental" approach include: history (unpredictable events affecting the study); maturation (changes in the students); familiarisation (sensitisation of students, learning of testing materials); instrumentation (testing instruments can determine the outcome); selection (important differences between students in comparison groups); and mortality (students may drop out of a programme before the evaluation is complete) (cf. Long 1984:411).

In the naturalistic paradigm the emphasis is on observing, describing, interpreting and understanding the continuously changing process of the programme being evaluated, using techniques such as in-depth interviews, participant observation and journals. This approach (originating in phenomenology and the interpretative approach to social inquiry of the late 19th century) generally requires emergent, variable design, in contrast to the pre-ordinate[6] designs of positivistic research (Guba 1978:14), and stems from a non-objective definition of reality as dependent on mind and interpretation, in which there is no meaningful separation of facts from values, and "phenomena can be understood only within the context in which they are studied" (Guba & Lincoln 1989:45). Thus Guba states that the more an investigator does not impose constraints on antecedent conditions and outputs, "the more he is naturalistically inclined" (Guba 1978:80). Acknowledging that such methodology has problems of subjectivity, and (as with all methodologies) relies on the faith of its practitioners and users ("all truth depends, in part, on persuasion" Guba 1978:81), Guba offers a number of arguments in favour of naturalistic inquiry:

... to enlarge the arsenal of investigative strategies available for dealing with emergent questions of interest; to provide an acceptable basis for studying process; to provide an alternative where it is impossible to meet the technical assumptions of the experimental approach in the real world; to better assess the implications of treatment-situation interaction; to address the balance between reconstructed logic and logic-in-use; to avoid the implicit shaping of possible outcomes; to optimise generalizability; and to meet certain practical criteria defined as fitting, working and communicating. (Guba 1978:80-81)

This is not to claim that one approach is intrinsically superior to another, or that one paradigm will naturally replace the other, though there has been a move from the evaluator as external specialist, to evaluation (and the evaluator) being integral parts of the project, feeding back into decision-making at all stages (White 1988:148), "observing effects in context" (Cronbach 1975) rather than making predictive generalisations, and improving the course while it is still fluid (Cronbach 1963:403). However, as Guba points out, the choice of evaluation method reflects differing goals, situations and educational perspectives, and depends on 'dimensions' such as philosophical base (logical positivism vs. phenomenology), paradigm (experimental physics vs. ethnography or investigative journalism), purpose (verification vs. discovery), stance (reductionist vs. expansionist), framework/design (preordinate vs. emergent), style (intervention vs. selection), reality manifold (singular vs. multiple), value structure (singular vs. pluralistic), setting (laboratory vs. nature), context (unrelated vs. relevant), conditions (controlled vs. invited interference), (stable vs. variable), (molecular vs. molar), and (intersubjective agreement vs. objective factual/confirmable) (Guba 1978:80). In addition to these pedagogical and theoretical considerations, there are also ethical issues which are relevant to all evaluations. These are discussed by Simons (1979), who lists five factors which the evaluator must take into account: i) impartiality; ii) confidentiality and control over the data by participants; iii) negotiation among all parties involved; iv) collaboration by all concerned; and v) accountability by all levels in the organisational hierarchy (cf. White 1988:149). Lynch (1996) suggests a combination or a mix of evaluation strategies (e.g. qualitative analysis of quantitative data) or a mixed design (positivistic and naturalistic):

... the systematic attempt to gather information in order to make judgements or decisions ... can be both qualitative and quantitative in form, and can be gathered through different methods such as observation or the administration of pencil-and-paper tests. (Lynch 1996:2)

For Rea-Dickins however, there is no preferred approach, and "the issue is ... to choose the method for the specific purpose" (1994:77).

1.3.3 Forms

Long (1984:419) identifies four types of evaluation (summative, product, formative and process) differing in focus, timing, purpose and theoretical motivation, and reflecting different perspectives and goals (Appendix A-1, below):

Appendix A-1

Traditional, summative approaches to evaluation are used to decide on whether a project has been 'successful' (however this is defined), but are unable to comment on why or how, or on what should happen next (cf. Cronbach 1976; Parlett 1976; 1981; Williams & Burden 1994:22), while formative evaluation is often used by programme evaluators in order to increase the likelihood of the project's successful implementation (Breen & Candlin 1980:106; Williams & Burden 1994:22). Parlett (1981) also proposes a further form of "illuminative" evaluation, which can be used summatively or formatively. According to this 'social-anthropological' model, the evaluator is involved in the day-to-day working of the project, and data (collected through interviews, questionnaires, observation, diaries, student records, etc.) assist decision-making and guide implementation. The role of the evaluator in this model is to produce an "interpretation of a highly complex system" (Parlett 1981:421), addressing questions raised by participants, and investigating background, culture, politics, aims, hidden curricula, and varying opinions (cf. Williams & Burden 1994:23). Systematic observation of classroom behaviour, which is the essence of process evaluation (Long 1984:415), can also contribute to this approach. Other recent models of evaluation include the "professional" teacher-as-researcher approach (deriving from the work of Stenhouse, and related to action research as proposed by Elliott 1981; 1985; and Cohen & Manion 1980), and the "case study" (cf. Adelman, Jenkins & Kemmis 1976; Yin 1984), which allows readers to make judgements for themselves (cf. Lawton 1980).

Parlett & Hamilton (1975) stress that any evaluation, whatever its parameters, must take into account the "social-psychological and material environment in which students and teachers work together", which they call the "learning milieu", and which includes legal, administrative, occupational, architectural and financial constraints, operating assumptions, individual teachers' characteristics, and student perspectives and preoccupations. These cultural, social, institutional and psychological variables interact in complex ways in the classroom (cf. complexity theory, section 6.6, page 212) to produce:

... a unique pattern of circumstances, pressure, customs, and work styles which suffuse the teaching and learning. ... The introduction of an innovation sets off a chain of repercussions throughout the learning milieu. In turn these unintended consequences are likely to affect the innovation itself, changing its form and moderating its impact. (Parlett & Hamilton 1975:145)

[1] Undergraduates in Korea are required to study for four academic years before graduating, though most male students do their compulsory military service (six months to two years) during this time, and can therefore take up to six years to graduate.
[2] Quoted materials in this study are reproduced in the original form, irrespective of spelling differences (US/UK) or of grammatical ambiguities.
[3] Original italics.
[4] These issues were present in the original research design, but received more prominence following Mohamed's work.
[5] Breen defines 'paradigm' as "a consensus within a professional community concerning which ideas are considered important" (1987a:157; cf. Kuhn 1970).
[6] i.e. constructed before the fact.