[RUME] How Can We Measure Student Learning?

Richard Hake rrhake at earthlink.net
Sat May 13 17:41:02 EDT 2006


If you reply to this long (19kB) post, please don't hit the reply 
button unless you prune the copy of this post that may appear in 
your reply down to a few relevant lines; otherwise the entire, 
already archived post may be needlessly resent to subscribers.

***************************************
ABSTRACT: It is argued that the direct measurement of students' 
higher-level *domain-specific* learning through pre/post testing, 
using (a) valid and consistently reliable tests *devised by 
disciplinary experts* and (b) traditional courses as controls, can 
provide a crucial complement to the top-down assessment of 
broad-ability areas advocated by Hersh (2005) and Klein et al. (2005).
***************************************

Michael Sylvester, in his TIPS [Teaching in the Psychological 
Sciences, with archives at 
<http://www.mail-archive.com/tips%40acsun.frostburg.edu/>] post of 4 
May 2006 14:03:13-0000 titled "Learning Evaluation," wrote [bracketed 
by lines "SSSSSSSS. . . "; slightly edited]:

SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
Can anyone recommend a learning evaluation instrument - one that would
assess the extent of students' learning in the classroom?

Relevant items could be: (a) the teacher stimulates my thinking, (b) 
I am really learning a lot from this course, (c) I would recommend 
this course to others, (d) I learn more in this course than my grade 
would indicate, etc.

The problem that I find with the current teacher evaluation is that 
it does not address issues as to how the course contributes to 
students' learning.
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS

To which David Wasieleski, in his TIPS post of 04 May 2006 
07:13:21-0700, responded:

"Aren't exams and assignments learning evaluations?"

In this response to Wasieleski & Sylvester (no, I didn't pay them to 
serve as straight men), I'll draw upon "The Physics Education Reform 
Effort: A Possible Model for Higher Education" [Hake (2005a)]. My 
apologies to the few outliers who have read that article.

Regarding David Wasieleski's apparent belief that course exams 
constitute learning evaluations, Wilbert McKeachie (1987) has pointed 
out that the time-honored gauge of student learning - course exams 
and final grades - typically measures lower-level educational 
objectives such as memory of facts and definitions rather than 
higher-level outcomes such as critical thinking and problem solving.

Regarding Michael Sylvester's criticism of Student Evaluations of 
Teaching (SET's): the same criticism - that they assess only 
lower-level learning - applies to SET's, even those that contain 
questions such as those suggested by Sylvester, since their primary 
justification as measures of student learning appears to lie in the 
modest correlations of overall ratings of the course (+0.47) and of 
the instructor (+0.43) with "achievement" *as measured by course 
exams or final grades* (Cohen 1981).
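
For concreteness, here is a minimal Python sketch of the kind of 
calculation that underlies such figures - the Pearson correlation 
between section-average ratings and section-average exam scores in a 
multisection course. The function name and all numbers below are 
invented for illustration and are not drawn from Cohen's data:

import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Invented section-level data for one multisection course:
# mean SET rating and mean final-exam score for six sections.
ratings = [3.2, 3.8, 4.1, 2.9, 3.5, 4.4]
exam_means = [68, 74, 79, 65, 71, 83]

print("rating/exam correlation r = %.2f" % pearson_r(ratings, exam_means))

Cohen's +0.47 and +0.43 are, roughly speaking, averages of such 
rating/achievement correlations across many multisection validity 
studies.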

HOW THEN CAN WE MEASURE STUDENTS' HIGHER-LEVEL LEARNING IN COLLEGE COURSES?

Several *indirect* (and therefore in my view problematic) gauges have 
been developed; e.g., Reformed Teaching Observation Protocol (RTOP), 
National Survey Of Student Engagement (NSSE), Student Assessment of 
Learning Gains (SALG), and Knowledge Surveys (KS's) (Nuhfer & Knipp 
2003). For a discussion and references for all but the last see Hake 
(2005b). RTOP and NSSE contain questions of the type desired by 
Sylvester.

On the other hand, *direct* measures of student learning have been 
developed by Hersh (2005) and Klein et al. (2005). Hersh codirects 
the "Learning Assessment Project" 
<http://www.cae.org/content/pro_collegiate.htm> that "evaluates 
students' ability to articulate complex ideas, examine claims and 
evidence, support ideas with relevant reasons and examples, sustain a 
coherent discussion, and use standard written English." But Shavelson 
& Huang (2003) warn that:

". . . learning and knowledge are highly domain-specific - as, 
indeed, is most reasoning. Consequently, **the direct impact of 
college is most likely to be seen at the lower levels of Chart 1 - 
domain-specific knowledge and reasoning** . . . [of the Shavelson & 
Huang 2003 "Framework of Cognitive Objectives" (SHFCO)]."

Klein et al. have devised tests that compare student learning across 
institutions in both domain-specific and broad-ability areas of the 
SHFCO.

In sharp contrast to the invalid (course exams, final grades, 
SET's), indirect (RTOP, NSSE, SALG, KS's), and general-ability 
[Hersh (2005), Klein et al. (2005)] measures discussed above is the 
DIRECT MEASURE OF STUDENTS' HIGHER-LEVEL *DOMAIN-SPECIFIC* LEARNING 
THROUGH PRE/POST TESTING using (a) valid and consistently reliable 
tests *devised by disciplinary experts* and (b) traditional courses 
as controls. It should be realized that domain-specific learning is 
probably coupled to the broad-ability areas of the SHFCO, as 
suggested for physics by the recent research of Coletta & Phillips 
(2005).
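
To make the arithmetic of such pre/post testing concrete, here is a 
minimal Python sketch - not part of the original argument, and with 
invented scores - of the class-average normalized gain 
<g> = (%post - %pre)/(100 - %pre), the statistic analyzed in the 
Coletta & Phillips (2005) paper cited above:

def average_normalized_gain(pre_scores, post_scores):
    """Class-average normalized gain <g> = (<post> - <pre>)/(100 - <pre>),
    where <pre> and <post> are class-average percentage scores."""
    pre_avg = sum(pre_scores) / len(pre_scores)
    post_avg = sum(post_scores) / len(post_scores)
    return (post_avg - pre_avg) / (100.0 - pre_avg)

# Invented percentage scores for eight students on the same test
# given before (pre) and after (post) instruction.
pre = [35, 40, 28, 50, 45, 30, 38, 42]
post = [60, 72, 55, 80, 70, 58, 65, 68]

print("class-average normalized gain <g> = %.2f"
      % average_normalized_gain(pre, post))

Comparing <g> for a reformed course with <g> for a traditional 
control course is the sort of direct, domain-specific comparison 
argued for here.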

Yes, I know, as discussed in Hake (2002), content learning should not 
be the sole measure of the value of a course. But I think most would 
agree that a gauge of content learning is *necessary*, if not 
sufficient.

In my opinion, the physics-education reform model - measurement and 
improvement of cognitive gains by faculty disciplinary experts *in 
their own courses* - can provide a crucial complement to the top-down 
approaches of Hersh (2005) and Klein et al. (2005). Such pre/post 
testing, pioneered by economists [Paden & Moyer (1969)] and 
physicists [Halloun & Hestenes (1985a,b)], is rarely employed in 
higher education, in part because of the tired old canonical 
objections recently lodged by Suskie (2004) and countered by Hake 
(2004a) and Scriven (2004).

Despite the nay-sayers, pre/post testing is gradually gaining a 
foothold in introductory astronomy, biology, chemistry, computer 
science, economics, engineering, and physics courses [see Hake 
(2004b) for references].

Unfortunately, psychologists, as a group, have shown zero or even 
negative interest in assessing the effectiveness of their own 
introductory courses by means of definitive pre/post testing [see 
e.g. Hake (2005c,d,e)].

IMHO, this is especially discouraging because psychologists and 
psychometricians seem to be in control of (a) the U.S. Dept. of 
Education's "What Works Clearinghouse" (WWC) <http://www.w-w-c.org/> 
and (b) NCLB testing of "science achievement" to commence in 2007. 
The latter threatens to promote California's direct instruction of 
science throughout the U.S. [Hake (2005f)]. Why should psychologists be 
the arbiters of "What Works" and NCLB testing when, as far as I know, 
they haven't even bothered to meaningfully research "What Works" in 
their own courses?

For recent scathing criticism of the WWC see Schoenfeld (2006a,b).

Richard Hake, Emeritus Professor of Physics, Indiana University
24245 Hatteras Street, Woodland Hills, CA 91367
<rrhake at earthlink.net>
<http://www.physics.indiana.edu/~hake>
<http://www.physics.indiana.edu/~sdi>


REFERENCES [Tiny URL's courtesy <http://tinyurl.com/create.php>]
Cohen, P.A. 1981. "Student ratings of Instruction and Student 
Achievement: A Meta-analysis of Multisection Validity Studies," 
Review of Educational Research 51: 281. For references to Cohen's 
1986 and 1987 updates see Feldman (1989).

Coletta, V.P. and J.A. Phillips. 2005. "Interpreting FCI Scores: 
Normalized Gain, Preinstruction Scores, & Scientific Reasoning 
Ability," Am. J. Phys. 73(12): 1172-1182; online at 
<http://scitation.aip.org/dbt/dbt.jsp?KEY=AJPIAS&Volume=73&Issue=12>.

Feldman, K.A. 1989. "The Association Between Student Ratings of 
Specific Instructional Dimensions and Student Achievement: Refining 
and Extending the Synthesis of Data from Multisection Validity 
Studies," Research on Higher Education 30: 583.

Hake, R.R. 2002. "Assessment of Physics Teaching Methods," Proceedings 
of the UNESCO-ASPEN Workshop on Active Learning in Physics, Univ. of 
Peradeniya, Sri Lanka, 2-4 Dec. 2002; also online as ref. 29 at
<http://www.physics.indiana.edu/~hake/>, or download directly by clicking on
<http://www.physics.indiana.edu/~hake/Hake-SriLanka-Assessb.pdf> (84 kB)

Hake, R.R. 2004a. "Re: pre-post testing in assessment," online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0408&L=pod&P=R9135&I=-3>. 
Post of 19 Aug 2004 13:56:07-0700 to POD.

Hake, R.R. 2004b. "Re: Measuring Content Knowledge," POD posts of 14 
&15 Mar 2004, online at
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0403&L=pod&P=R13279&I=-3> and
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0403&L=pod&P=R13963&I=-3>.

Hake, R. R. 2005a. "The Physics Education Reform Effort: A Possible 
Model for Higher Education," online at 
<http://www.physics.indiana.edu/~hake/NTLF42.pdf> (100 kB). This is a 
slightly updated version of an article that was
(a) published in the National Teaching and Learning Forum 15(1), 
December 2005, online to subscribers at 
<http://www.ntlf.com/FTPSite/issues/v15n1/physics.htm>, and (b) 
disseminated by the Tomorrow's Professor list 
<http://ctl.stanford.edu/Tomprof/postings.html> as Msg. 698 on 14 Feb 
2006.

Hake, R.R. 2005b. "Re: Measuring Teaching Performance," POD post of 
13 May 2005; online at 
<http://listserv.nd.edu/cgi-bin/wa?A2=ind0505&L=pod&P=R9303&I=-3>.

Hake, R.R. 2005c. "Re: Why Don't Psychologists Research the Effectiveness
of Their Own Introductory Courses?" online at 
<http://tinyurl.com/muvy6>. Post of 20 Jan 2005 16:29:56-0800 to 
PsychTeacher (rejected) & PhysLrnR.

Hake, R.R. 2005d. "Do Psychologists Research the Effectiveness of
Their Own Introductory Courses?" TIPS post of 19 Feb 2005 
07:58:43-0800; online at 
<http://www.mail-archive.com/tips@acsun.frostburg.edu/msg13133.html>.

Hake, R.R. 2005e. "Do Psychologists Research the Effectiveness of 
Their Courses? Hake Responds to Sternberg," online at
<http://tinyurl.com/n9dp6>. Post of 21 Jul 2005 22:55:31-0700 to 
AERA-C, AERA-D, AERA-J, AERA-L, ASSESS, EvalTalk, PhysLrnR, POD, 
STLHE-L, & TeachingEdPsych.

Hake, R.R. 2005f. "Will the No Child Left Behind Act Promote Direct 
Instruction of Science?" Bull. Am. Phys. Soc. 50: 851 (2005); APS March 
Meeting, Los Angeles, CA, 21-25 March; online as ref. 36 at 
<http://www.physics.indiana.edu/~hake>, or download directly by 
clicking on 
<http://www.physics.indiana.edu/~hake/WillNCLBPromoteDSI-3.pdf> (256 
kB).

Halloun, I. & D. Hestenes. 1985a. "The initial knowledge state of 
college physics students," Am. J. Phys. 53: 1043-1055; online at
<http://modeling.asu.edu/R&E/Research.html>. Contains the "Mechanics 
Diagnostic" test (omitted from the online version), precursor to the 
widely used "Force Concept Inventory" [Hestenes et al. (1992)].

Halloun, I. & D. Hestenes. 1985b. "Common sense concepts about 
motion," Am. J. Phys. 53: 1056-1065; online at 
<http://modeling.asu.edu/R&E/Research.html>.

Hersh, R.H. 2005. "What Does College Teach? It's time to put an end 
to 'faith-based' acceptance of higher education's quality," Atlantic 
Monthly 296(4): 140-143, November; freely online at (a) the Atlantic 
Monthly <http://tinyurl.com/dwss8>, and (b) (with hot-linked academic 
references) at <http://tinyurl.com/9nqon> (scroll to the APPENDIX).

Hestenes, D., M. Wells, & G. Swackhamer. 1992. "Force Concept 
Inventory," Phys. Teach. 30: 141-158; online (except for the test 
itself) at
<http://modeling.asu.edu/R&E/Research.html>. The 1995 revision by 
Halloun, Hake, Mosca, & Hestenes is online (password protected) at 
the same URL, and is available in English, Spanish, German, 
Malaysian, Chinese, Finnish, French, Turkish, Swedish, and Russian.

Klein, S.P., G.D. Kuh, M. Chun, L. Hamilton, & R. Shavelson. 2005. "An 
Approach to Measuring Cognitive Outcomes Across Higher Education 
Institutions." Research in Higher Education 46(3): 251-276; online at
<http://www.stanford.edu/dept/SUSE/SEAL/> // "Reports/Papers" scroll 
to "Higher Education," where "//" means "click on."

McKeachie, W.J. 1987. "Instructional evaluation: Current issues and 
possible improvements," Journal of Higher Education 58(3): 344-350.

Nuhfer, E. & D. Knipp. 2003. "The Knowledge Survey: A Tool for All 
Reasons," in To Improve the Academy 21: 59-78; online at
<http://www.isu.edu/ctl/facultydev/KnowS_files/KnowS.htm>.

Paden, D.W. & M.E. Moyer. 1969. "The Relative Effectiveness of 
Teaching Principles of Economics," Journal of Economic Education 1: 
33-45.

Scriven, M. 2004. "Re: pre- post testing in assessment," AERA-D post 
of 15 Sept 2004 19:27:14-0400; online at <http://tinyurl.com/942u8>.

Shavelson, R.J. & L. Huang. 2003. "Responding Responsibly To the 
Frenzy to Assess Learning in Higher Education," Change Magazine, 
January/February; online at <http://www.stanford.edu/dept/SUSE/SEAL/> 
// "Reports/Papers" scroll to "Higher Education," where "//" means 
"click on."

Suskie, L. 2004. "Re: pre- post testing in assessment," ASSESS post 
of 19 Aug 2004 08:19:53-0400; online at <http://tinyurl.com/akz23>.
