Posted: September 19th, 2022

Crmj

need help with question

good quality, good work

Distribution of Chi Square
Probability

df .99 .98 .95 .90 .80 .70 .50

1 .000157 .000628 .00393 .0158 .0642 .148 .455
2 .0201 .0404 .103 .211 .446 .713 1.386
3 .115 .185 .352 .584 1.005 1.424 2.366
4 .297 .429 .711 1.064 1.649 2.195 3.357
5 .554 .752 1.145 1.610 2.343 3.000 4.351

6 .872 1.134 1.635 2.204 3.070 3.828 5.348
7 1.239 1.564 2.167 2.833 3.822 4.671 6.346
8 1.646 2.032 2.733 3.490 4.594 5.528 7.344
9 2.088 2.532 3.325 4.168 5.380 6.393 8.343
10 2.558 3.059 3.940 4.865 6.179 7.267 9.342

11 3.053 3.609 4.575 5.578 6.989 8.148 10.341
12 3.571 4.178 5.226 6.304 7.807 9.034 11.340
13 4.107 4.765 5.892 7.042 8.634 9.926 12.340
14 4.660 5.368 6.571 7.790 9.467 10.821 13.339
15 5.229 5.985 7.261 8.547 10.307 11.721 14.339

16 5.812 6.614 7.962 9.312 11.152 12.624 15.338
17 6.408 7.255 8.672 10.085 12.002 13.531 16.338
18 7.015 7.906 9.390 10.865 12.857 14.440 17.338
19 7.633 8.567 10.117 11.651 13.716 15.352 18.338
20 8.260 9.237 10.851 12.443 14.578 16.266 19.337

21 8.897 9.915 11.591 13.240 15.445 17.182 20.337
22 9.542 10.600 12.338 14.041 16.314 18.101 21.337
23 10.196 11.293 13.091 14.848 17.187 19.021 22.337
24 10.856 11.992 13.848 15.659 18.062 19.943 23.337
25 11.524 12.697 14.611 16.473 18.940 20.867 24.337

26 12.198 13.409 15.379 17.292 19.820 21.792 25.336
27 12.879 14.125 16.151 18.114 20.703 22.719 26.336
28 13.565 14.847 16.928 18.939 21.588 23.647 27.336
29 14.256 15.574 17.708 19.768 22.475 24.577 28.336
30 14.953 16.306 18.493 20.599 23.364 25.508 29.336

For larger values of df, the expression $\sqrt{2\chi^2} - \sqrt{2\,df - 1}$ may be used as a normal deviate with unit variance,
remembering that the probability of $\chi^2$ corresponds with that of a single tail of the normal curve.
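
The large-df rule above can be sketched in a few lines of Python. This is only an illustration, not part of the original table: the function name is hypothetical, and it assumes the probabilities in the table are upper-tail areas, that is, the chance of a chi-square value at least as large as the tabled one. It uses only the standard library.

```python
import math
from statistics import NormalDist


def chi_square_upper_tail_approx(chi_square: float, df: int) -> float:
    """Approximate P(X >= chi_square) using the normal-deviate rule for large df.

    Hypothetical helper: z = sqrt(2 * chi_square) - sqrt(2 * df - 1) is treated
    as a standard normal deviate, and the single (upper) tail area is returned.
    """
    z = math.sqrt(2.0 * chi_square) - math.sqrt(2.0 * df - 1.0)
    return 1.0 - NormalDist().cdf(z)


if __name__ == "__main__":
    # Two tabled values for df = 30 (probabilities .99 and .50 in the table).
    for chi_square in (14.953, 29.336):
        p = chi_square_upper_tail_approx(chi_square, df=30)
        print(f"df=30, chi-square={chi_square}: approximate probability {p:.3f}")
```

Even at df = 30, the largest df in the table, the approximation should land close to the tabled probabilities; the rule is intended for df beyond the table, where agreement is typically better.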

continued on the inside back cover

Basics of Research
Methods for

CRIMINAL JUSTICE
and CRIMINOLOGY

Second Edition

Michael G. Maxfield
Rutgers University

Earl Babbie
Chapman University

Australia • Brazil • Japan • Korea • Mexico • Singapore
Spain • United Kingdom • United States

© 2009, 2006 Wadsworth, Cengage Learning

ALL RIGHTS RESERVED. No part of this work covered by the copyright
herein may be reproduced, transmitted, stored, or used in any form or by
any means graphic, electronic, or mechanical, including but not limited
to photocopying, recording, scanning, digitizing, taping, Web distribu-
tion, information networks, or information storage and retrieval systems,
except as permitted under Section 107 or 108 of the 1976 United States
Copyright Act, without the prior written permission of the publisher.

For product information and technology assistance, contact us at
Cengage Learning Customer & Sales Support, 1-800-354-9706.

For permission to use material from this text or product,
submit all requests online at cengage.com/permissions.

Further permissions questions can be emailed to
permissionrequest@cengage.com.

Library of Congress Control Number:

ISBN-13: 978-0-495-50385-9
ISBN-10: 0-495-50385-1

Wadsworth
10 Davis Drive
Belmont, CA 94002-3098
USA

Cengage Learning is a leading provider of customized learning solutions
with office locations around the globe, including Singapore, the
United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local
office at international.cengage.com/region.

Cengage Learning products are represented in Canada by
Nelson Education, Ltd.

For your course and learning solutions, visit academic.cengage.com.

Purchase any of our products at your local college store or at our
preferred online store www.ichapters.com.

Basics of Research Methods for Criminal
Justice and Criminology, Second Edition

Michael G. Maxfield and Earl Babbie

Senior Editor, Criminal Justice: Carolyn
Henderson Meier

Assistant Editor: Meaghan Banks

Editorial Assistant: John Chell

Technology Project Manager: Bessie Weiss

Marketing Manager: Michelle Williams

Marketing Assistant: Jillian Myers

Marketing Communications Manager:
Tami Strang

Project Manager, Editorial Production:
Jennie Redwitz

Creative Director: Rob Hugel

Art Director: Maria Epes

Print Buyer: Paula Vang

Permissions Editor: Bob Kauser

Production Service: Linda Jupiter
Productions

Copy Editor: Lunaea Weatherstone

Proofreader: Henrietta Bensussen

Indexer: Katherine Simpson

Illustrator: Newgen

Cover Designer: Yvo Riezebos, Riezebos
Holzbaur Design Group

Cover Image: (c) George Hammerstein/
Solus-Veer/Corbis

Compositor: Newgen

Printed in Canada
1 2 3 4 5 6 7 12 11 10 09 08


To Max Jacob Fauth


About the Authors
Michael G. Maxfield is Professor of Criminal
Justice at Rutgers University, Newark. He is the
author of numerous articles and books on a
variety of topics, including victimization, policing,
homicide, community corrections, and
long-term consequences of child abuse and neglect.
He is the coauthor (with Earl Babbie) of
the textbook Research Methods for Criminal Justice
and Criminology, now in its fifth edition, and coeditor
(with Mike Hough) of Surveying Crime in
the 21st Century, in the Crime Prevention Studies
series. Other recent work includes a POP Center
guide on the problem of abandoned vehicles
(forthcoming) and a special issue of Criminal
Justice Policy Review on environmental criminology.
Formerly a Visiting Fellow at the National
Institute of Justice, Maxfield works with a variety
of public agencies and other organizations,
acting as a consultant and advocate of frugal
evaluation for justice policy. Recent projects
initiated collaboration with police departments
and other justice agencies in the areas of repeat
domestic violence, performance measurement
systems, and auto theft. Maxfield received his
Ph.D. in political science from Northwestern
University.

Earl Babbie grew up in small-town Vermont
and New Hampshire, venturing into the outer
world by way of Harvard, the U.S. Marine
Corps, the University of California, Berkeley,
and 12 years of teaching at the University of
Hawai’i. Along the way, he married Sheila (two
months after their first date), and created Aaron
three years after that. He resigned from teaching
in 1980 and wrote full-time for seven years,
until the call of the classroom became too loud
to ignore. To him, teaching is like playing jazz:
even if you perform the same number over and
over, it never comes out the same twice, and you
don’t know exactly what it’ll sound like until
you hear it. Teaching is like writing with your
voice. Recently he has rediscovered his roots in
summer trips to Vermont. Rather than a return
to the past, it feels more like the next turn in a
widening spiral, and he can’t wait to see what’s
around the next bend.


Brief Contents
PART ONE: An Introduction to Criminal Justice Inquiry

Chapter 1: Criminal Justice and Scientific Inquiry

Chapter 2: Ethics and Criminal Justice Research

PART TWO: Structuring Criminal Justice Inquiry

Chapter 3: General Issues in Research Design

Chapter 4: Concepts, Operationalization, and Measurement

Chapter 5: Experimental and Quasi-Experimental Designs

PART THREE: Modes of Observation

Chapter 6: Sampling

Chapter 7: Survey Research and Other Ways of Asking Questions

Chapter 8: Field Research

Chapter 9: Agency Records, Content Analysis, and Secondary Data

PART FOUR: Application and Analysis

Chapter 10: Evaluation Research and Problem Analysis

Chapter 11: Interpreting Data

Contents
Preface

PART ONE: An Introduction to Criminal Justice Inquiry

Chapter 1: Criminal Justice and Scientific Inquiry

Introduction

HOME DETENTION

What Is This Book About?
Two Realities
The Role of Science

Personal Human Inquiry
Tradition
Authority

ARREST AND DOMESTIC VIOLENCE

Errors in Personal Human Inquiry
Inaccurate Observation
Overgeneralization
Selective Observation
Illogical Reasoning
Ideology and Politics
To Err Is Human

Foundations of Social Science
Theory, Not Philosophy or Belief
Regularities
What about Exceptions?
Aggregates, Not Individuals
A Variable Language
Variables and Attributes
Variables and Relationships

Purposes of Research
Exploration
Description
Explanation
Application

Differing Avenues for Inquiry
Idiographic and Nomothetic Explanations
Inductive and Deductive Reasoning
Quantitative and Qualitative Data
Knowing through Experience: Summing Up and Looking Ahead
Main Points

Chapter 2: Ethics and Criminal Justice Research

Introduction
Ethical Issues in Criminal Justice Research

No Harm to Participants

ETHICS AND EXTREME FIELD RESEARCH

Voluntary Participation
Anonymity and Confidentiality
Deceiving Subjects
Analysis and Reporting
Legal Liability
Special Problems

Promoting Compliance with Ethical Principles
Codes of Professional Ethics
Institutional Review Boards
Institutional Review Board Requirements and Researcher Rights

ETHICS AND JUVENILE GANG MEMBERS

Ethical Controversies
The Stanford Prison Experiment
Discussion Examples

Main Points

PART TWO: Structuring Criminal Justice Inquiry

Chapter 3: General Issues in Research Design

Introduction
Causation in the Social Sciences

Criteria for Causality
Necessary and Sufficient Causes

Validity and Causal Inference
Statistical Conclusion Validity

Internal Validity
External Validity
Construct Validity
Validity and Causal Inference Summarized
Does Drug Use Cause Crime?

CAUSATION AND DECLINING CRIME IN NEW YORK CITY

Introducing Scientific Realism
Units of Analysis

Individuals
Groups
Organizations
Social Artifacts
The Ecological Fallacy
Units of Analysis in Review

UNITS OF ANALYSIS IN THE NATIONAL YOUTH GANG SURVEY

The Time Dimension
Cross-Sectional Studies
Longitudinal Studies
Approximating Longitudinal Studies
The Time Dimension Summarized

How to Design a Research Project
The Research Process
Getting Started
Conceptualization
Choice of Research Method
Operationalization
Population and Sampling
Observations
Analysis
Application
Research Design in Review

The Research Proposal
Elements of a Research Proposal

Answers to the Units-of-Analysis Exercise
Main Points

Chapter 4: Concepts, Operationalization, and Measurement

Introduction
Conceptions and Concepts
Conceptualization
Indicators and Dimensions

WHAT IS RECIDIVISM?

Creating Conceptual Order
Operationalization Choices

Measurement as Scoring

JAIL STAY

Exhaustive and Exclusive Measurement
Levels of Measurement
Implications of Levels of Measurement

Criteria for Measurement Quality
Reliability
Validity

Measuring Crime
General Issues in Measuring Crime

UNITS OF ANALYSIS AND MEASURING CRIME

Measures Based on Crimes Known to Police

Victim Surveys
Surveys of Offending
Measuring Crime Summary

Composite Measures
Typologies
An Index of Disorder

Measurement Summary
Main Points

Chapter 5: Experimental and Quasi-Experimental Designs

Introduction
The Classical Experiment

Independent and Dependent Variables
Pretesting and Posttesting
Experimental and Control Groups
Double-Blind Experiments
Selecting Subjects
Randomization

Experiments and Causal Inference
Experiments and Threats to Validity
Threats to Internal Validity

Ruling Out Threats to Internal Validity
Generalizability and Threats to Validity

Variations in the Classical Experimental Design

Quasi-Experimental Designs
Nonequivalent-Groups Designs
Cohort Designs
Time-Series Designs
Variations in Time-Series Designs
Variable-Oriented Research and Scientific Realism
Experimental and Quasi-Experimental Designs Summarized
Main Points

PART THREE: Modes of Observation

Chapter 6: Sampling

Introduction
The Logic of Probability Sampling

Conscious and Unconscious Sampling Bias

Representativeness and Probability of Selection

Probability Theory and Sampling Distribution
The Sampling Distribution of 10 Cases
From Sampling Distribution to Parameter Estimate
Estimating Sampling Error
Confidence Levels and Confidence Intervals
Probability Theory and Sampling Distribution Summed Up
Populations and Sampling Frames
Types of Sampling Designs

Simple Random Sampling
Systematic Sampling
Stratified Sampling
Disproportionate Stratified Sampling
Multistage Cluster Sampling
Multistage Cluster Sampling with Stratification
Illustration: Two National Crime Surveys

The National Crime Victimization Survey
The British Crime Survey
Probability Sampling in Review

Nonprobability Sampling
Purposive Sampling
Quota Sampling
Reliance on Available Subjects
Snowball Sampling
Nonprobability Sampling in Review

Main Points

Chapter 7: Survey Research and Other Ways of Asking Questions

Introduction
Topics Appropriate to Survey Research

Counting Crime
Self-Reports
Perception and Attitudes
Targeted Victim Surveys
Other Evaluation Uses

Guidelines for Asking Questions
Open-Ended and Closed-Ended Questions
Questions and Statements
Make Items Clear
Short Items Are Best
Avoid Negative Items
Biased Items and Terms
Designing Self-Report Items

Questionnaire Construction
General Questionnaire Format
Contingency Questions
Matrix Questions
Ordering Items in a Questionnaire

DON’T START FROM SCRATCH!

Self-Administered Questionnaires
Mail Distribution and Return
Warning Mailings and Cover Letters
Follow-Up Mailings
Acceptable Response Rates
Computer-Based Self-Administration

In-Person Interview Surveys
The Role of the Interviewer

Coordination and Control
Computer-Assisted In-Person Interviews

Telephone Surveys
Computer-Assisted Telephone Interviewing
Comparison of the Three Methods
Strengths and Weaknesses of Survey Research
Other Ways of Asking Questions

Specialized Interviewing
Focus Groups

Should You Do It Yourself?
Main Points

Chapter 8: Field Research

Introduction
Topics Appropriate to Field Research
The Various Roles of the Observer
Asking Questions
Gaining Access to Subjects

Gaining Access to Formal Organizations
Gaining Access to Subcultures
Selecting Cases for Observation
Purposive Sampling in Field Research

Recording Observations
Cameras and Voice Recorders
Field Notes
Structured Observations
Linking Field Observations and Other Data
Illustrations of Field Research

Field Research on Speeding and Traffic Enforcement

CONDUCTING A SAFETY AUDIT

Bars and Violence
Strengths and Weaknesses of Field Research

Validity
Reliability
Generalizability

Main Points

Chapter 9: Agency Records, Content Analysis, and Secondary Data

Introduction
Topics Appropriate for Agency Records and Content Analysis

Types of Agency Records
Published Statistics
Nonpublic Agency Records
New Data Collected by Agency Staff

IMPROVING POLICE RECORDS OF DOMESTIC VIOLENCE

Reliability and Validity
Sources of Reliability and Validity Problems

HOW MANY PAROLE VIOLATORS WERE THERE LAST MONTH?

Content Analysis
Coding in Content Analysis
Illustrations of Content Analysis

Secondary Analysis
Sources of Secondary Data
Advantages and Disadvantages of Secondary Data
Main Points

PART FOUR: Application and Analysis

Chapter 10: Evaluation Research and Problem Analysis

Introduction
Topics Appropriate for Evaluation Research and Problem Analysis
The Policy Process
Linking the Process to Evaluation

Getting Started
Evaluability Assessment
Problem Formulation
Measurement

Designs for Program Evaluation
Randomized Evaluation Designs
Home Detention: Two Randomized Studies
Quasi-Experimental Designs
Other Types of Evaluation Studies

Problem Analysis and Scientific Realism
Problem-Oriented Policing
Auto Theft in Chula Vista

Other Applications of Problem Analysis
Space- and Time-Based Analysis
Scientific Realism and Applied Research

The Political Context of Applied Research
Evaluation and Stakeholders

WHEN POLITICS ACCOMMODATES FACTS

Politics and Objectivity
Main Points

Chapter 11: Interpreting Data

Introduction
Univariate Description

Distributions
Measures of Central Tendency
Measures of Dispersion
Comparing Measures of Dispersion and Central Tendency
Computing Rates
Describing Two or More Variables
Bivariate Analysis

MURDER ON THE JOB

Multivariate Analysis
Inferential Statistics

Univariate Inferences
Tests of Statistical Significance
Visualizing Statistical Significance
Chi Square
Cautions in Interpreting Statistical Significance
Main Points

Glossary
References
Name Index
Subject Index

Preface
Since the first edition of Research Methods for
Criminal Justice and Criminology (RMCJC) was
published in 1995, we have been delighted to
hear comments from instructors who have used
the text (and from a few who do not use it!).
Though it is always gratifying to learn of positive
reactions, we have also listened to suggestions
for revising the book through its five editions.
Some colleagues suggested trimming the text
substantially to focus on the most important
principles of research methods in criminal justice.
Students and instructors are also increasingly
sensitive to the cost of college texts.

As a result, we introduced Basics of Research
Methods for Criminal Justice and Criminology about
three years ago. Our objective in producing that
text was fivefold: (1) retain the key elements
of the parent text; (2) concentrate on fundamental
principles of research design; (3) appeal
to a broad variety of teaching and learning
styles; (4) retain salient examples that illustrate
various methods; (5) reduce less-central points
of elaboration and the examples used to illustrate
them. That proved to be more challenging
than we initially thought. At one point we were
tempted to do something simple like drop two
chapters, wrap the result in a soft cover, and declare
what was left to be the basics. Fortunately
that sentiment was reined in and we pursued a
more deliberate approach that involved planning
from the ground up.

Basics is shorter, more concise, and focused
on what we believe is the most central material
for introductory courses in research methods.
Rather than simply offering a truncated version
of the full text, Basics has been crafted to appeal
to those seeking a more economical alternative
while retaining the big book’s highly successful
formula. Many instructors teaching shorter
courses, or courses where students are better
served by concentrating on basic principles of
criminal justice research, have used the Basics
edition. Others, especially instructors teaching
introductory graduate courses, prefer the more
extensive coverage offered in RMCJC.

Organization
The overall organization of Basics follows
RMCJC. Part One introduces research methods.
Chapter 1 begins with a brief treatment of
the epistemology of social science. We then describe
the role of theory and different general
approaches to empirical research in criminal
justice. Chapter 2 considers the ethics of conducting
research in such a sensitive area of social
life. We trace the foundations of efforts to
protect human subjects, then describe different
ways ethical principles are operationalized by
researchers.

Part Two examines the main elements of
planning empirical research. Chapter 3 describes
three important topics: causation, units of analysis,
and the time dimension. This chapter concludes
with a step-by-step consideration of how
to plan research and prepare a research proposal.
In Chapter 4 we describe measurement in
general, including a brief version of material on
measuring crime from RMCJC. Even with the
abbreviated presentation here, we believe the
coverage of measurement concerns is the most
rigorous available in any undergraduate text on
criminal justice research methods. Chapter 5
examines research design, with extensive treatment
of experimental and quasi-experimental
approaches. As in RMCJC, we also consider scientific
realism as an approach to research that
complements traditional treatments of design.

Part Three covers data collection in some detail,
albeit more concisely than in the larger text.
Chapter 6 describes sampling, with its foundations
in probability theory. We also discuss different
techniques of nonprobability sampling.
Finally, we consider combined approaches such
as adaptive sampling. Each of the next three
chapters centers on a general category of data

collection: survey research and other ways of
asking questions (Chapter 7); field observation,
including systematic and ethnographic
approaches (Chapter 8); existing data collected
by justice agencies, and secondary data analysis
(Chapter 9).

In Part Four we present chapters on applied
research and an introduction to data analysis.
This follows the organization of RMCJC, though
these chapters are somewhat briefer than in the
larger book. Our treatment of applied research
(Chapter 10) has always been well received by
instructors and students using RMCJC.

Features of the New Edition
We are gratified that both texts have been so
well received. At the same time, we are grateful to
have been given a number of ideas from colleagues
about how to improve Basics. Some of
the changes in this new edition stem from suggestions
by reviewers or colleagues who have
used the text. Other revisions are drawn from
the fifth edition of RMCJC.

• Our discussion of theory and criminal justice
research has been streamlined and moved
into Chapter 1, “Criminal Justice and Scientific
Inquiry.” This responds to suggestions
that a more concise presentation of theory
would aid student understanding.

• Chapter 2, “Ethics and Criminal Justice Research,”
is now devoted exclusively to ethics
in criminal justice research, drawing on the
more complete discussion in RMCJC. Students
will benefit from the more complete
consideration of ethics.

• Chapter 3, “General Issues in Research Design,”
includes guidelines on developing a
research proposal. This chapter also includes
updated examples of scientific realism. Each
of these features, adapted from the larger
text, will help students better understand
the research process.

• Sorting out validity in measurement and
causal inference can be difficult for students.
We present a more concise discussion of
these topics in Chapter 4, “Concepts, Operationalization,
and Measurement,” and Chapter 5,
“Experimental and Quasi-Experimental
Designs,” which will help students grasp
these important concepts.

• The introductory material on data collection
modes has been cut from Chapter 6,
“Sampling,” which now focuses entirely on
sampling. We have added new material on
probability sampling from RMCJC. Likewise,
Chapter 7, “Survey Research and Other
Ways of Asking Questions,” includes revised
guidance on computer-assisted interviewing
and the scope for greater use of web-based
survey techniques. Updated material draws
largely on a book edited by Mike Hough and
Mike Maxfield: Surveying Crime in the 21st
Century (Monsey, NY: Criminal Justice Press;
London: Willan; 2007). Together, these revisions
highlight the important role of case
selection, while presenting updated material
on different approaches to sampling.

• Chapter 10, “Evaluation Research and Problem
Analysis,” follows the RMCJC shift in
focus from policy analysis to problem analysis.
This reflects the growing use of evidence-based
planning by justice agencies. Among
other things, this produces broader coverage
of applied research methods.

Popular features from the first edition have
been retained, resulting in an up-to-date, concise
presentation of evolving methods in criminal
justice research. We are happy to present
this revised edition and look forward to hearing
from instructors and students.

Learning Tools
As has always been the case in RMCJC, our approach
to this text is student-centered. We
combine a solid discussion of principles with a
number of examples. Over the five editions of
RMCJC we have struck a good balance, and that
is carried over into this edition of Basics. The
end of each chapter presents additional tools

to aid student learning. Main Points summarizes
topics with a brief statement that should trigger
student retention. This is followed by Key
Terms, each of which is introduced in the chapter.
Each key term is also presented in a glossary
at the end of the book. We offer a few Review
Questions and Exercises that are designed for
class discussion. In our own teaching, we ask
students to review these items before class.

Ancillaries
A number of supplements are provided by Wadsworth
to help instructors use Basics of Research
Methods for Criminal Justice and Criminology, Second
Edition, in their courses and to aid students
in preparing for exams. Supplements are
available to qualified adopters. Please consult
your local sales representative for details.

Instructor Resources
• Instructor’s Resource Manual with Test Bank:
Fully updated and revised, the Instructor’s
Resource Manual with Test Bank for this
edition includes learning objectives, detailed
chapter outlines, chapter summaries,
key terms, class discussion exercises, lecture
suggestions, and a complete test bank.
Each chapter’s test bank contains multiple-choice,
true-false, fill-in-the-blank, and essay
questions (approximately 75 questions in
all), along with a complete answer key.

• eBank Microsoft® PowerPoint® slides:
Microsoft PowerPoint slides are provided
to assist you in preparing for your lectures.
Available online, the slides are fully customizable
to your course.

• Classroom Activities for Criminal Justice:
Stimulate student engagement with a
compilation of the best of the best in criminal
justice classroom activities. Novice and
seasoned instructors will find this booklet a
powerful course customization tool containing
tried-and-true favorites and exciting new
projects drawn from the spectrum of criminal
justice subjects, including introduction
to criminal justice, criminology, corrections,
criminal law, policing, and juvenile justice.

Student Resources
• Crime Scenes 2.0: Bring criminal justice to
life with this interactive simulation CD-ROM
featuring six scenarios of various crimes (juvenile
murder, prostitution, assault, arresting
force/DUI, search and seizure, and embezzlement/white-collar
crime) to illustrate
all the stages of the criminal justice system.
Students make choices about the outcomes
at various decision points in each scenario,
illustrating the consequences of each choice.
Use the scenarios to introduce or review concepts,
spark class discussion, or as a basis for
group research projects. Written by Bruce
Berg (California State University, Long
Beach), this CD-ROM was awarded gold and
silver medals by New Media magazine.

• Current Perspectives: Designed to give students
a deeper understanding of special topics
in criminal justice, the timely articles in
the Current Perspectives readers are selected
by experts in each topic from within InfoTrac®
College Edition. Each reader includes
access to InfoTrac® College Edition. Topics
available include:

• Juvenile Justice

• Cybercrime

• Terrorism and Homeland Security

• Public Policy and Criminal Justice

• New Technologies and Criminal Justice

• Racial Profiling

• White Collar Crime

• Victimology (publishing 2008)

• Forensics and Criminal Investigation
(publishing 2008)

• Ethics and Criminal Justice (publishing
2008)

• Guide to Careers in Criminal Justice,
Third Edition: This handy guide, compiled
by Caridad Sanchez-Leguelinel of John Jay
College of Criminal Justice, gives students
information on a wide variety of career paths,

including requirements, salaries, training,
contact information for key agencies, and
employment outlooks.

• Handbook of Selected Supreme Court
Cases for Criminal Justice: This supplementary
handbook covers almost 40 landmark
cases, each of which includes a full case
citation, an introduction, a summary from
Westlaw, excerpts from the case, and the decision.
The updated edition includes Hamdi
v. Rumsfeld, Roper v. Simmons, Ring v. Arizona,
Atkins v. Virginia, Illinois v. Caballes, and much
more.

• Internet Activities for Criminal Justice:
In addition to providing a wide range of
activities for any criminal justice class, this
booklet familiarizes students with Internet
resources useful both to students of and
professionals in criminal justice. Internet
Activities for Criminal Justice integrates Internet
resources and addresses with important
topics such as criminal and police law, policing
organizations, policing challenges, corrections
systems, juvenile justice, criminal
trials, and current issues in criminal justice.

• Internet Guide for Criminal Justice, Second
Edition: Intended for the novice user,
this guide provides students with background
and vocabulary necessary to navigate
and understand the Web, then provides
them with a wealth of criminal justice websites
and Internet project ideas.

• Writing and Communicating for Criminal
Justice: This booklet provides students
with a basic introduction to academic, professional,
and research writing in criminal
justice. It contains articles on writing skills,
a basic grammar review, and a survey of verbal
communication on the job that will benefit
students in their professional careers.

• Companion Website: The book-specific
website at academic.cengage.com/criminaljustice/maxfield
offers students a variety of
study tools and useful resources such as a
tutorial quiz, glossary, flash cards, and additional
study aids.

Acknowledgments
Several reviewers made perceptive and useful
comments on the first edition of Basics. We
thank them for their insights and suggestions:

Brian Forst, American University
Shaun Gabbidon, Pennsylvania State University, Harrisburg
David Jenks, California State University, Los Angeles
Elizabeth McConnell, University of Houston, Downtown
J. Mitchell Miller, University of South Carolina
Wayne Pitts, University of Memphis
Sudipto Roy, Indiana State University
Michael Sabath, San Diego State University
Theodore Skotnicki, Niagara County Community College
Clete Snell, University of Houston, Downtown
Dennis Stevens, University of Southern Mississippi

This edition continues to benefit from contributions
by students at the Rutgers University
School of Criminal Justice. We thank Dr.
Carsten Andresen (now at the Travis County Department
of Community Corrections and Supervision),
Dr. Gisela Bichler (now at California
State University, San Bernardino), Dr. Sharon
Chamard (now at University of Alaska), Shuryo
Fujita, Galma Jahic (now at Istanbul Bilgi University,
Turkey), Dr. Jarret Lovell (now at California
State University, Fullerton), Dr. Marie
Mele (now at Monmouth University), Dr. Dina
Perrone (now at Bridgewater State College), and
Dr. Christopher Sullivan (now at the University
of South Florida).

We are especially grateful for the excellent
support and assistance we can always count
on from people at Cengage Learning: Carolyn
Henderson Meier, Jennie Redwitz, Michelle
Williams, and Meaghan Banks. Special thanks
to copy editor Lunaea Weatherstone and pro-
duction coordinator Linda Jupiter.

Part One

An Introduction to
Criminal Justice Inquiry

What comes to mind when you encounter
the word science? What do you think of
when we describe criminal justice as a social
science? For some people, science is
mathematics; for others, it is white coats and
laboratories. Some confuse it with technology
or equate it with difficult high school or
college courses.

Science is, of course, none of these things
per se, but it is difficult to specify what
exactly science is. Scientists, in fact, disagree
on the proper definition. Some object to the
whole idea of social science; others question
more specifically whether criminal justice
can be a social science.

For the purposes of this book, we view
science as a method of inquiry, a way of
learning and knowing things about the
world around us. Like other ways of learning
and knowing about the world, science has
some special characteristics. We’ll examine
these traits in this opening set of chapters.
We’ll also see how the scientific method of
inquiry can be applied to the study of crime
and criminal justice.

Part One lays the groundwork for the rest
of the book by examining the fundamental
characteristics and issues that make science
different from other ways of knowing
things. Chapter 1 begins with a look at native
human inquiry, the sort of thing all of us
have been doing all our lives. Because people
sometimes go astray in trying to understand
the world around them, we’ll consider the
primary characteristics of scientific inquiry
that guard against those errors.

Chapter 2 deals with the ethics of social
science research. The study of crime and
criminal justice often presents special
challenges with regard to ethics. We’ll see that
most ethical questions are rooted in two
fundamental principles: (1) research subjects
should not be harmed, and (2) their
participation must be voluntary.

The overall purpose of Part One, therefore,
is to construct a backdrop against which to
view more specific aspects of research design
and execution. By the time you complete the
chapters in Part One, you’ll be ready to look
at some of the more concrete aspects of
criminal justice research.

Chapter 1

Criminal Justice and
Scientific Inquiry
People learn about their world through a variety of methods, and they often
make mistakes along the way. Science is different from other ways of learning
and knowing. We’ll consider the foundations of social science, different pur-
poses of research, and different general approaches to social science.

Introduction
HOME DETENTION
What Is This Book About?
Two Realities
The Role of Science
Personal Human Inquiry
Tradition
Authority
ARREST AND DOMESTIC VIOLENCE
Errors in Personal Human Inquiry
Inaccurate Observation
Overgeneralization
Selective Observation
Illogical Reasoning
Ideology and Politics
To Err Is Human
Foundations of Social Science
Theory, Not Philosophy or Belief
Regularities
What about Exceptions?
Aggregates, Not Individuals
A Variable Language

Chapter 1 Criminal Justice and Scientific Inquiry

Introduction
Criminal justice professionals are both consumers
and producers of research.

Spending a semester studying criminal justice
research methodology may not be high on your
list of “Fun Things to Do.” Perhaps you are or
plan to be a criminal justice professional and
are thinking, “Why do I have to study research
methods? When I graduate, I’ll be working in
probation (or law enforcement, or corrections,
or court services), not conducting research! I
would benefit more from learning about probation
counseling (or police management, or corrections
policy, or court administration).” Fair
enough. But as a criminal justice professional,
you will need to be a consumer of research. One
objective of this book is to help you become an
informed consumer of research.

For example, findings from an experimental
study of policing, the Kansas City Preventive
Patrol Experiment, appeared to contradict
a fundamental belief that a visible police patrol
force prevents crime. Acting as a consumer of
research findings, a police officer, supervisor, or
executive should be able to understand how that
research was conducted and how the study’s
findings might apply in his or her department.

Most criminal justice professionals, espe-
cially those in supervisory roles, routinely re-
view various performance reports and statisti-
cal tabulations. A continually growing number
of research reports may now be found on the
Internet. For example, the National Criminal
Justice Reference Service (NCJRS) was estab-
lished to archive and distribute research reports
to criminal justice professionals and research-
ers around the world. Many such reports are
prepared specifically to keep the criminal justice
community informed about new research
developments and may be downloaded from the
NCJRS website (www.ncjrs.gov, accessed May 6,
2008). An understanding of research methods
can help decision makers critically evaluate
such reports and recognize when methods are
properly and improperly applied. The box ti-
tled “Home Detention” describes an example of
how knowledge of research methods can help
policy makers avoid mistakes.

Another objective of this book is to help you
produce research. In other courses you take
or in your job, you may become a producer of
research. Probation officers sometimes test
new approaches to supervising or counseling
clients, and police officers try new methods of
dealing with recurring problems. Many cities

Variables and Attributes
Variables and Relationships
Purposes of Research
Exploration
Description
Explanation
Application
Differing Avenues for Inquiry
Idiographic and Nomothetic Explanations
Inductive and Deductive Reasoning
Quantitative and Qualitative Data
Knowing through Experience: Summing Up and Looking Ahead


4 Part One An Introduction to Criminal Justice Inquiry

Two Realities
Ultimately, we live in a world of two realities.
Part of what we know could be called our “ex-
periential reality”—the things we know from di-
rect experience. If you dive into a glacial stream
flowing down through the Canadian Rockies,
you don’t need anyone to tell you the water
is cold; you notice that all by yourself. And if
you step on a piece of broken glass, you know
it hurts without anyone telling you. These are
things you experience.

The other part of what we know could be
called our “agreement reality”—the things
we consider real because we’ve been told they’re
real, and everyone else seems to agree they are
real. A big part of growing up in any society,
in fact, is learning to accept what everybody
around us “knows” to be true. If we don’t know
those same things, we can’t really be a part of
society. If you were to seriously question a ge-
ography professor as to whether the sun really

and states have a compelling need to evaluate
services provided to offenders released from
prison or jail. Determining whether changes or
existing programs are effective is an example of
applied research. A problem-solving approach,
rooted in systematic research, is being used
in more and more police departments and in
many other criminal justice agencies as well.
Therefore criminal justice professionals need
to know not only how to interpret research
accurately but also how to produce accurate
research.

What Is This Book About?
This book focuses on how we know what we know.

This book focuses on how we learn. Although
you will come away from the book knowing
many things you don’t know right now, our
primary purpose is to help you look at how you
know things, not what you know.

HOME DETENTION

Home detention with electronic monitoring
(ELMO) was widely adopted as
an alternative punishment in the United States
in the 1980s. The technology for this new sanc-
tion was made possible by advances in telecom-
munications and computer systems. Prompted by
growing prison and jail populations, not to men-
tion sales pitches by equipment manufacturers,
criminal justice officials embraced ELMO. Questions
about the effectiveness of these programs
quickly emerged, however, and led to research to
determine whether the technology worked. Com-
prehensive evaluations were conducted in Marion
County (Indianapolis), Indiana. Selected findings
from these studies illustrate the importance of
understanding research methods in general and
the meaning of various ways to measure program
success in particular.

ELMO programs directed at three groups of
people were studied: (1) convicted adult offenders,
(2) adults charged with a crime and awaiting trial,
and (3) juveniles convicted of burglary or theft.
People in each of the three groups were
assigned to home detention for a specified time.
They could complete the program in one of three
ways: (1) successful release after serving their
term, (2) removal due to rule violations, such as
being arrested again or violating program rules,
or (3) running away, or absconding. The agencies
that administered each program were required
to submit regular reports to county officials on
how many individuals in each category completed
their home-detention terms. The accompanying
table summarizes the program-completion types
during the evaluation study.

                  Convicted     Pretrial
                  Adults (%)    Adults (%)    Juveniles (%)

Success               81            73             99
Rule violation        14            13              1
Abscond                5            14              0


hended while he is committing a crime or im-
mediately thereafter.”

Seven years later, the Police Foundation, a
private research organization, published results
from an experimental study that presented a
dramatic challenge to the conventional wisdom
on police patrol. Known as the Kansas City Pre-
ventive Patrol Experiment, this study compared
police beats with three levels of preventive
patrol: (1) control beats, with one car per beat;
(2) proactive beats, with two or three cars per
beat; and (3) reactive beats, with no routine pre-
ventive patrol. After almost one year, research-
ers examined data from the three types of beats
and found no differences in crime rates, citizen
satisfaction with police, fear of crime, or other
measures of police performance (Kelling, Pate,
Dieckman, and Brown 1974).

Additional studies conducted in the 1970s
cast doubt on other fundamental assump-
tions about police practices. A quick response
to crime reports made no difference in arrests,

sets in the west, you’d quickly find yourself set
apart from other people. The first reality is a
product of our own experience; the second is a
product of what people have told us.

To illustrate the difference between agree-
ment and experiential realities, consider preven-
tive police patrol. The term “preventive” implies
that when police patrol their assigned beats they
prevent crime. Police do not prevent all crime,
of course, but it is a commonsense belief that
a visible, mobile police force will prevent some
crimes. In fact, the value of patrol in preventing
crime was a fundamental principle of police
operations for many years. A 1967 report on
policing for President Lyndon Johnson by the
President’s Commission on Law Enforcement
and Administration of Justice (p. 1) stated that
“the heart of the police effort against crime is
patrol. . . . The object of patrol is to disperse
policemen in a way that will eliminate or reduce
the opportunity for misconduct and to increase
the probability that a criminal will be appre-

These percentages, reported by agencies to county
officials, indicate that the juvenile program was a
big success; virtually all juveniles were successfully
released.

Now consider some additional information
on each program collected by the evaluation
team. Data were gathered on new arrests of pro-
gram participants and on the number of success-
ful computerized telephone calls to participants’
homes.

                    Convicted     Pretrial
                    Adults (%)    Adults (%)    Juveniles (%)

New arrest               5             1             11
Successful calls        53            52             17

As the table shows, many more juveniles were
arrested, and juveniles successfully answered a
much lower percentage of telephone calls to their
homes. What happened?

The simple answer is that the staff responsible
for administering the juvenile program were not
keeping track of offenders. The ELMO equipment
was not maintained properly, and police were
not visiting the homes of juveniles as planned.
Because staff were not keeping track of program
participants, they were not aware that many juve-
niles were violating the conditions of home deten-
tion. And because they did not detect violations,
they naturally reported that the vast majority of
young burglars and thieves completed their home
detention successfully.

A county official who relied on only agency
reports of program success would have made a
big mistake in judging the juvenile program to be
99 percent successful. In contrast, an informed
consumer of such reports would have been skep-
tical of a 99 percent success rate and searched
for more information.

Source: Adapted from Maxfield and Baumer (1991)
and Baumer, Maxfield, and Mendelsohn (1993).
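
The comparison an informed consumer might make can be sketched in a few lines of Python. This is only an illustration, not part of the Marion County evaluation: the percentages are copied from the two tables in this box, while the "gap" measure (reported success minus successful monitoring calls) and the names used in the code are assumptions introduced here. The two percentages measure different things, so the gap is a rough warning sign rather than a formal statistic.

```python
# Percentages copied from the two Home Detention tables.
reported_success = {"Convicted adults": 81, "Pretrial adults": 73, "Juveniles": 99}
successful_calls = {"Convicted adults": 53, "Pretrial adults": 52, "Juveniles": 17}
new_arrests = {"Convicted adults": 5, "Pretrial adults": 1, "Juveniles": 11}


def success_call_gap(program: str) -> int:
    """Reported success minus successful monitoring calls, in percentage points.

    This gap is a hypothetical warning indicator made up for this sketch.
    """
    return reported_success[program] - successful_calls[program]


gaps = {program: success_call_gap(program) for program in reported_success}
most_suspect = max(gaps, key=gaps.get)

for program, gap in gaps.items():
    print(
        f"{program}: reported success {reported_success[program]}%, "
        f"successful calls {successful_calls[program]}%, "
        f"gap {gap} points, new arrests {new_arrests[program]}%"
    )
print("Largest gap between agency report and monitoring data:", most_suspect)
```

The juvenile program shows by far the largest gap and also the highest new-arrest percentage, which is the pattern that should prompt a search for more information.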


The Role of Science
Science offers an approach to both agreement
reality and experiential reality. Scientists have
certain criteria that must be met before they will
agree on the reality of something they haven’t
personally experienced. In general, an assertion
must have both logical and empirical support: it
must make sense, and it must agree with actual
observations. For example, why do earthbound
scientists accept the assertion that it’s cold on
the dark side of the moon? First, it makes sense
because the surface heat of the moon comes
from the sun’s rays. Second, scientific measurements
made on the moon’s dark side confirm
the assertion. Therefore scientists accept the
reality of things they don’t personally experi-
ence—they accept an agreement reality—but
they have special standards for doing so.

More to the point of this book, however, sci-
ence offers a special approach to the discovery
of reality through personal experience. Epistemology
is the science of knowing; methodology
(a subfield of epistemology) might be called
the science of finding out. This book focuses on
criminal justice methodology: how social scientific
methods can be used to better understand
crime and criminal justice policy. To understand
scientific inquiry, let’s first look at the
kinds of inquiry we all do each day.

Personal Human Inquiry
Everyday human inquiry draws on personal experi-
ence and secondhand authority.

Most of us would like to be able to predict how
things are going to be for us in the future. We
seem quite willing, moreover, to undertake this
task using causal and probabilistic reasoning.
First, we generally recognize that future circum-
stances are somehow caused or conditioned by
present ones. For example, we learn that get-
ting an education will affect what kind of job
we have later in life and that running stoplights
may result in an unhappy encounter with an
alert traffic officer. As students, we learn that

according to a research study in Kansas City
(Van Kirk 1977). And criminal investigation
by police detectives rarely resulted in an arrest
(Greenwood 1975).

We mention these examples not to attack
routine law enforcement practices but to show
that systematic research on policing has illustrated
how traditional beliefs, as examples of
agreement reality, can be misleading. Simply
increasing the number of police officers on patrol
does not reduce crime because police patrol
often lacks direction. Faster response time to
calls for police assistance does not increase ar-
rests because there is often a long delay between
the time when a crime occurs and when it is re-
ported to police. Clever detective work seldom
solves crimes because investigators get most of
their information from reports prepared by patrol
officers, who in turn get their information
from victims and witnesses.

Traditional beliefs about patrol effective-
ness, response time, and detective work are ex-
amples of agreement reality. In contrast, the re-
search projects that produced alternative views
about each law enforcement practice represent
experiential reality. These studies are exam-
ples of empirical research, the production of
knowledge based on experience or observation.
In each case, researchers conducted studies of
police practices and based their conclusions
on observations and experience. Empirical re-
search is a way of learning about crime and
criminal justice, and explaining how to con-
duct empirical research is the purpose of this
book.

In focusing on empirical research, we do
not intend to downplay the importance of
other ways of knowing things. Law students are
trained in how to interpret statutes and judicial
opinions. Historians take courses on methods
of historical interpretation, mathematics ma-
jors learn numerical analysis, and students of
philosophy study logic. If you are a criminal
justice major, many of the other courses you
take—say, a course on theories of crime and de-
viance—will add to your agreement reality.


jumping-off point for the development of more
knowledge.

Authority
Despite the power of tradition, new knowledge
appears every day. Throughout life we learn
about new discoveries and understandings
from others. However, our acceptance of this
new knowledge often depends on the status of
the discoverer. For example, you are more likely
to believe a judge who declares that your next
traffic violation will result in a suspension of
your driver’s license than your parents when
they say the same thing.

Like tradition, authority can both help and
hinder human inquiry. We do well to trust the
judgment of individuals who have special train-
ing, expertise, and credentials in a matter, es-
pecially in the face of contradictory arguments
on a given question. At the same time, inquiry
can be greatly hindered by the legitimate au-
thorities who err within their own special prov-
ince. Biologists, after all, do make mistakes in
the field of biology, and biological knowledge
changes over time. Criminal justice research
sometimes yields mistaken results, and we are
wise to not uncritically accept research findings
only because they come from experts. The
box titled “Arrest and Domestic Violence” illus-
trates the problems that can result when crimi-
nal justice policy makers accept too quickly the
results from criminal justice research.

Inquiry is also hindered when we depend
on the authority of experts speaking outside
their realm of expertise. Consider a political or
religious leader, lacking any biochemical exper-
tise, who declares marijuana to be a dangerous
drug. The advertising industry plays heavily on
this misleading use of authority by having pop-
ular athletes discuss the value of various sports
drinks and having movie stars evaluate the per-
formance of automobiles.

Both tradition and authority, then, are
double-edged swords in the search for knowl-
edge about the world. Simply put, they provide
us with a starting point for our own inquiry,

studying hard will result in better examination
grades.

Second, we recognize that such patterns of
cause and effect are probabilistic in nature: the
effects occur more often when the causes occur
than when the causes are absent—but not al-
ways. Thus, as students, we learn that studying
hard produces good grades in most instances,
but not every time. We recognize the danger of
ignoring stoplights without believing that every
such violation will produce a traffic ticket.

The concepts of causality and probability play
a prominent role in this book. Science makes
causality and probability more explicit and pro-
vides techniques for dealing with them more
rigorously than does casual human inquiry.

However, our attempts to learn about the
world are only partly linked to personal inquiry
and direct experience. Another, much larger,
part comes from the agreed-on knowledge that
others give us. This agreement reality both assists
and hinders our attempts to find out
things for ourselves. Two important sources of
agreement reality—tradition and authority—
deserve brief consideration here.

Tradition
Each of us is born into and inherits a culture
made up, in part, of firmly accepted knowledge
about the workings of the world. We may learn
from others that planting corn in the spring
will result in the greatest assistance from the
gods, that the circumference of a circle is ap-
proximately 3.14 times its diameter, or that
driving on the left side of the road (in the
United States) is dangerous. We may test a few
of these “truths” on our own, but we simply ac-
cept the great majority of them. These are the
things that “everybody knows.”

Tradition, in this sense, has some clear ad-
vantages for human inquiry. By accepting
what everybody knows, we are spared the over-
whelming task of starting from scratch in our
search for regularities and understanding.
Knowledge is cumulative, and an inherited
body of information and understanding is the


In contrast to casual human inquiry, scientific
observation is a conscious activity. Simply
making observations in a more deliberate
way helps to reduce error. If you had gone to
the first class meeting with a conscious plan to
observe and record what your instructor was
wearing, you’d have increased your chances of
accuracy.

In many cases, using both simple and com-
plex measurement devices helps to guard
against inaccurate observations. Suppose that
you had taken color photographs of your instructor
on the first day. The photos would
have added a degree of precision well beyond
that provided by unassisted human memory.

Overgeneralization
When we look for patterns among the specific
things we observe around us, we often assume
that a few similar events are evidence of a gen-
eral pattern. The tendency to overgeneralize
is probably greatest when there is pressure to
reach a general understanding, yet overgeneral-
ization also occurs in the absence of pressure.

but they may lead us to start at the wrong point
or push us in the wrong direction.

Errors in Personal
Human Inquiry
Everyday personal human inquiry reveals a number
of potential biases.

Aside from the potential dangers of relying on
tradition and authority, we often stumble when
we set out to learn for ourselves. Let’s consider
some of the common errors we make in our own
casual inquiries and then look at the ways sci-
ence provides safeguards against those errors.

Inaccurate Observation
The keystone of inquiry is observation. But
quite frequently we fail to observe things right
in front of us or mistakenly observe things
that aren’t so. Do you recall what your instructor
was wearing on the first day of this class? If
you had to guess now, what are the chances you
would be right?

ARREST AND
DOMESTIC VIOLENCE

In 1983, preliminary results were re-
leased from a study on the deterrent effects of
arrest in cases of domestic violence. The study re-
ported that male abusers who were arrested were
less likely to commit future assaults than offenders
who were not arrested. Conducted by researchers
from the Police Foundation, the study used rigor-
ous experimental methods adapted from the nat-
ural sciences. Criminal justice scholars generally
agreed that the research was well designed and executed.
Public officials were quick to embrace the
study’s findings that arresting domestic violence
offenders deterred them from future violence.

Here, at last, was empirical evidence to sup-
port an effective policy in combating domestic
assaults. Results of the Minneapolis Domestic
Violence Experiment were widely disseminated, in

part because of aggressive efforts by the researchers
to publicize their findings (Sherman and Cohn
1989). The attorney general of the United States
recommended that police departments make arrests
in all cases of misdemeanor domestic violence.
Within five years, more than 80 percent of
law enforcement agencies in U.S. cities adopted
arrest as the preferred way of responding to do-
mestic assaults (Sherman 1992, 2).

Several things contributed to the rapid adop-
tion of arrest policies to deter domestic violence.
First, the experimental study was conducted care-
fully by highly respected researchers. Second,
results were widely publicized in newspapers, in
professional journals, and on television programs.
Third, officials could understand the study, and
most believed that its findings made sense. Finally,
mandating arrest in less serious cases of domestic
violence was a straightforward and politically at-
tractive approach to a growing problem.


tion means repeating a study, checking to see
whether similar results are obtained each time.
The study may also be repeated under slightly
different conditions or in different locations.
The box titled “Arrest and Domestic Violence”
describes an example of why replication can be
especially important in applied research.

Selective Observation
Another danger of overgeneralization is that it
may lead to selective observation. Once we have
concluded that a particular pattern exists and
have developed a general understanding of why,
we will be tempted to pay attention to future
events and situations that correspond with the
pattern and to ignore those that don’t. Racial,
ethnic, and other prejudices are reinforced by
selective observation.

Research plans often specify in advance the
number and kind of observations to be made
as a basis for reaching a conclusion. For exam-
ple, if we wanted to learn whether women were
more likely than men to support long prison
sentences for sex offenders, we would have to

Whenever overgeneralization does occur, it can
misdirect or impede inquiry.

Imagine you are a rookie police officer newly
assigned to foot patrol in an urban neighbor-
hood. Your sergeant wants to meet with you at
the end of your shift to discuss what you think
are the major law enforcement problems on the
beat. Eager to earn favor with your supervisor,
you interview the manager of a popular store in
a small shopping area. If the manager mentions
vandalism as the biggest concern, you might
report that vandalism is the main problem on
your beat, even though other business owners
and area residents believe that drug dealing
contributes to the neighborhood problems of
burglary, street robbery, and vandalism. Over-
generalization leads to misrepresentation and
simplification of the problems on your beat.

Criminal justice researchers guard against
overgeneralization by committing themselves in
advance to a sufficiently large sample of observations
and by being attentive to how representative
those observations are. The replication
of inquiry provides another safeguard. Replica-

Sherman and Berk (1984), however, urged
caution in uncritically embracing the results of
their study. Others urged that similar research be
conducted in other cities to check on the Minneapolis
findings (Lempert 1984). Recognizing
the need for more research, the U.S. National
Institute of Justice sponsored more experiments—
known as replications—in six other cities. Not
everyone was happy about the new studies. For
example, a feminist group in Milwaukee opposed
the replication in that city because it believed
that the effectiveness of arrest had already been
proved (Sherman and Cohn 1989, 138).

Results from the replication studies brought
into question the effectiveness of arrest policies.
In three cities, no deterrent effect was found in
police records of domestic violence. In other cit-
ies, there was no evidence of deterrence for lon-
ger periods (6 to 12 months), and in three cities
researchers found that violence actually escalated

when offenders were arrested (Sherman 1992,
30). For example, Sherman and associates (1992,
167) report that in Milwaukee “the initial deter-
rent effects observed for up to thirty days quickly
disappear. By one year later [arrests] produce an
escalation effect.” Arrest works in some cases but
not in others. In responding to domestic assaults,
as in many other cases, it’s important to carefully
consider the characteristics of offenders and the
nature of the relationship between offender and
victim.

After police departments throughout the
country embraced arrest policies following the
Minneapolis study, researchers were faced with
the difficult task of explaining why initial results
must be qualified. Arrest seemed to make sense;
officials and the general public believed what they
read in the papers and saw on television. Changing
their minds by reporting complex findings was
more difficult.


bias in police practices and sentencing policies.
Ideological or political views on such issues can
undermine objectivity in the research process.
Criminal justice professionals may have particular
difficulty separating ideology and politics
from a more detached, scientific study of
crime.

Criminologist Samuel Walker (1994, 16)
compares ideological bias in criminal justice
research to theology: “The basic problem . . . is
that faith triumphs over facts. For both liber-
als and conservatives, certain ideas are unchal-
lenged articles of faith, almost like religious be-
liefs that remain unshaken by empirical facts.”

Most of us have our own beliefs about
public policy, including policies for dealing
with crime. The danger lies in allowing such
beliefs to distort how research problems are defined
and how research results are interpreted.
The scientific approach to the study of crime
and criminal justice policy guards against, but
does not prevent, ideology and theology color-
ing the research process. In empirical research,
so-called articles of faith are compared with
experience.

To Err Is Human
We have seen some of the ways that we can go
astray in our attempts to know and understand
the world and some of the ways that science
protects its inquiries from these pitfalls. Social
science differs from our casual, day-to-day inquiry
in two important respects. First, social scientific
inquiry is a conscious activity. Although
we engage in continual observation in daily life,
much of it is unconscious or semiconscious. In
social scientific inquiry, we make a conscious
decision to observe, and we stay alert while we
do it. Second, social scientific inquiry is a more
careful process than our casual efforts; we are
more wary of making mistakes and take special
precautions to avoid doing so.

Do social scientific research methods offer
total protection against the errors that people
commit in personal inquiry? No. Not only
do individuals make every kind of error we’ve
looked at, but social scientists as a group also

make a specified number of observations on
that question. We might select a thousand people
to be interviewed. Even if the first 10 women
supported long sentences and the first 10 men
opposed them, we would continue to interview
everyone selected for the study and record each
observation. We would base our conclusion
on an analysis of all the observations, not just
those first 20.
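
To see why the full set of observations matters, consider a minimal sketch in Python. Every number below is invented purely for illustration (it is not survey data): we assume 500 women and 500 men, with the first 20 interviews turning out exactly as described above, and the function and variable names are hypothetical.

```python
def support_rate(responses, group):
    """Share of respondents in `group` who support long sentences."""
    answers = [supports for g, supports in responses if g == group]
    return sum(answers) / len(answers)


# Hypothetical, made-up responses: (group, supports_long_sentences).
# The first 20 interviews match the scenario in the text.
first_20 = [("woman", True)] * 10 + [("man", False)] * 10

# The remaining 980 invented interviews tell a much less dramatic story.
rest = (
    [("woman", True)] * 290 + [("woman", False)] * 200
    + [("man", True)] * 290 + [("man", False)] * 200
)
all_responses = first_20 + rest

for label, sample in (("First 20", first_20), ("All 1,000", all_responses)):
    women = support_rate(sample, "woman")
    men = support_rate(sample, "man")
    print(f"{label}: women {women:.0%}, men {men:.0%}")
```

In this made-up example the first 20 interviews suggest a complete split between women and men, while the full sample shows the two groups only two percentage points apart. Stopping early would have locked in a pattern that the remaining observations do not support.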

Illogical Reasoning
People have various ways of handling observa-
tions that contradict their judgments about the
way things are. Surely one of the most remark-
able creations of the human mind is the maxim
about the exception that proves the rule, an idea
that makes no sense at all. An exception can
draw attention to a rule or to a supposed rule,
but in no system of logic can it prove the rule it
contradicts. Yet we often use this pithy saying to
brush away contradictions with a simple stroke
of illogic.

What statisticians call the gambler’s fallacy is
another illustration of illogic in day-to-day rea-
soning. According to this fallacy, a consistent
run of good or bad luck is presumed to fore-
shadow its opposite. An evening of bad luck at
poker may kindle the belief that a winning hand
is just around the corner; many a poker player
has stayed in a game too long because of that
mistaken belief. Conversely, an extended period
of good weather may lead us to worry that it is
certain to rain on our weekend picnic.

Although we all sometimes use embarrass-
ingly illogical reasoning, scientists avoid this
pitfall by using systems of logic consciously and
explicitly. Chapters 2 and 4 examine the logic of
science in more depth.

Ideology and Politics
Crime is, of course, an important social prob-
lem, and a great deal of controversy surrounds
policies for dealing with crime. Many people
feel strongly one way or another about the
death penalty, gun control, and long prison
terms for drug users as approaches to reducing
crime. There is ongoing concern about racial

Chapter 1 Criminal Justice and Scientific Inquiry 11

more or less universal principles. Barney Gla-
ser and Anselm Strauss (1967) coined the term
grounded theory to describe this method of
theory construction. Field research—the direct
observation of events in progress—is frequently
used to develop theories, or survey research
may reveal patterns of attitudes that suggest
particular theoretical explanations.

Once developed, theories provide general
statements about social life that are used to
guide research. For example, routine activ-
ity theory states that crimes are more likely
to occur when a motivated offender encoun-
ters a suitable victim in the absence of a capa-
ble guardian (Cohen and Felson 1979). Mike
Townsley and associates used routine activity
theory to guide their research on “contagious”
burglaries (Townsley, Homel, and Chaseling
2003). They argued that once burglars struck
one house in a neighborhood, they were more
likely to break into nearby houses because they
had become more familiar with an area and
its potential targets. The research results were
generally consistent with these expectations—
burglary tended to cluster around houses of
similar type in a neighborhood.

Townsley and associates used routine activity
theory to generate a hypothesis about patterns
of burglary. A hypothesis is a specified
expectation about empirical reality. Taking a
different example, a theory might contain the
hypothesis “Working-class youths have higher
delinquency rates than upper-class youths.”
Such a hypothesis could then be tested through
research.

Drawing on theories to generate hypotheses
that are tested through research is the tradi-
tional image of science, illustrated in Figure 1.2
on page 14. Here we see the researcher beginning
with an interest in something or an idea about
it. Next comes the development of a theoretical
understanding of how a number of concepts,
represented by the letters A, B, C, and so on,
may be related to each other. The theoretical
considerations result in a hypothesis, or an
expectation about the way things would be in
the world if the theoretical expectations were

succumb to the pitfalls and stay trapped for
long periods.

Foundations of Social Science
Social scientific inquiry generates knowledge through
logic and observation.

The two pillars of science are (1) logic, or rationality,
and (2) observation. A scientific understanding
of the world must make sense and
must agree with what we observe. Both of these
elements are essential to social science and relate
to three key aspects of the overall scientific
enterprise: theory, data collection, and data
analysis.

As a broad generalization, scientific theory
deals with the logical aspect of science, data
collection deals with the observational aspect,
and data analysis looks for patterns in what is
observed. This book focuses mainly on issues related
to data collection, demonstrating how to
conduct empirical research, but social science
involves all three elements. With this in mind,
the theoretical context of designing and execut-
ing research is an important part of the overall
process. Chapter 11 presents a conceptual intro-
duction to the statistical analysis of data. Figure
1.1 offers a schematic view of how the book ad-
dresses these three aspects of social science.

Let’s turn now to some of the fundamen-
tal things that distinguish social science from
other ways of looking at social phenomena.

Theory, Not Philosophy or Belief
Social scientifi c theory has to do with what is,
not what should be. A theory is a systematic ex-
planation for the observed facts and laws that
relate to a particular aspect of life (juvenile delinquency,
for example, or perhaps social stratification
or political revolution). Joseph Maxwell
(2005, 42) defines theory as “a set of concepts
and the proposed relationships among these, a
structure that is intended to represent or model
something about the world.”

Often, social scientists begin constructing
a theory by observing aspects of social life,
seeking to discover patterns that may point to

12 Part One An Introduction to Criminal Justice Inquiry

Figure 1.1 Social Science = Theory + Data Collection + Data Analysis
[Figure: schematic linking THEORY, DATA COLLECTION (Planning to do research, Chapters 3–5; Sampling, Chapter 6; Observation, Chapters 7–10), and DATA ANALYSIS (Chapter 11). The theory panel sketches a model relating media exposure, victimization experience, personal communication networks, personal and household vulnerability, neighborhood conditions, knowledge of events, and knowledge of victims to fear of crime.]


What about Exceptions?
The objection that there are always exceptions
to any social regularity misses the point. The
existence of exceptions does not invalidate the
existence of regularities. Thus it is not important
that a particular police officer earns more
money than a particular judge if, overall, judges
earn more than police officers. The pattern still
exists. Social regularities represent probabilistic
patterns, and a general pattern does not have
to be reflected in 100 percent of the observable
cases to be a pattern.
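
The point about exceptions can be sketched in a few lines of Python. The salary figures below are invented purely for illustration (they are not drawn from any data source), and the variable names are hypothetical. The sketch checks whether the aggregate pattern holds and then lists the individual exceptions to it.

```python
from statistics import mean

# Hypothetical annual salaries, invented for illustration only.
judge_salaries = [150_000, 160_000, 90_000]
officer_salaries = [60_000, 70_000, 95_000]

# The aggregate pattern: on average, judges earn more than police officers.
pattern_holds = mean(judge_salaries) > mean(officer_salaries)

# Individual exceptions: (officer, judge) pairs in which the officer earns more.
exceptions = [
    (officer, judge)
    for officer in officer_salaries
    for judge in judge_salaries
    if officer > judge
]

print(pattern_holds)  # the regularity survives
print(exceptions)     # even though at least one exception exists
```

Under these assumed figures, one officer out-earns one judge, yet the overall comparison of averages is unchanged, which is all a probabilistic pattern claims.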

This rule applies in the physical as well as
the social sciences. In genetics, the mating of
a blue-eyed person with a brown-eyed person
will probably result in brown-eyed offspring.
The birth of a blue-eyed child does not chal-
lenge the observed regularity, however. Rather,
the geneticist states only that brown-eyed off-
spring are more likely and, furthermore, that
a brown-eyed offspring will be born in only a
certain percentage of cases. The social scientist
makes a similar, probabilistic prediction, that
women overall are less likely to murder any-
body, but when they do, their victims are most
often males.

Aggregates, Not Individuals
Social scientists primarily study social patterns
rather than individual ones. All regular patterns
reflect the aggregate, or combined, actions and
situations of many individuals. Although social
scientists study motivations that affect individuals,
aggregates are more often the subject of
social scientific research.

A focus on aggregate patterns rather than
on individuals distinguishes the activities of
criminal justice researchers from the daily rou-
tines of many criminal justice practitioners.
Consider the task of processing and classifying
individuals newly admitted to a correctional
facility. Prison staff administer psychologi-
cal tests and review the prior record of each
new inmate to determine security risks, pro-
gram needs, and job options. A researcher who
is studying whether white inmates tend to be

correct. The notation Y = f(X) is a conventional
way of saying that Y (for example, auto theft
rate) is a function of or is in some way caused
by X (for example, availability of off-street parking).
At that level, however, X and Y have general
rather than specific meanings.

In the operationalization process, general
concepts are translated into specific indicators.
Thus the lowercase x is a concrete indicator of
capital X. As an example, census data on the
number of housing units that have garages (x)
are a concrete indicator of off-street parking (X).
This operationalization process results in the
formation of a testable hypothesis: is the rate of
auto theft higher in areas where fewer housing
units have garages? Observations aimed at finding
out are part of what is typically called hypothesis
testing. We consider the logic of hypothesis
testing more fully in Chapters 3 and 5.
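
As a rough illustration of moving from the general Y = f(X) to a testable y = f(x), the following Python sketch compares auto theft rates across areas grouped by the share of housing units with garages. The six areas and their figures are invented for illustration only, and the 0.5 cutoff and function names are assumptions, not part of the text. A real test would use census and crime data and the statistical tools discussed later in the book.

```python
from statistics import mean

# Hypothetical areas, invented for illustration only:
# (share of housing units with garages = x, auto thefts per 1,000 residents = y)
areas = [
    (0.20, 9.0),
    (0.30, 8.0),
    (0.40, 7.0),
    (0.60, 5.0),
    (0.70, 4.0),
    (0.80, 3.0),
]

def compare_by_garage_share(rows, cutoff=0.5):
    """Return mean theft rates for areas below and at or above the garage-share cutoff."""
    fewer_garages = [rate for share, rate in rows if share < cutoff]
    more_garages = [rate for share, rate in rows if share >= cutoff]
    return mean(fewer_garages), mean(more_garages)

low_garage_rate, high_garage_rate = compare_by_garage_share(areas)

# The hypothesis expects higher theft rates where fewer units have garages.
hypothesis_supported = low_garage_rate > high_garage_rate
print(low_garage_rate, high_garage_rate, hypothesis_supported)
```

With these made-up numbers the comparison favors the hypothesis, but the sketch only shows the logic of the comparison, not evidence about auto theft.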

Regularities
Ultimately, social scientific theory aims to find
patterns of regularity in social life. This assumes,
of course, that life is regular, not chaotic
or random. That assumption applies to all science,
but it is sometimes a barrier for people
when they first approach social science.

A vast number of norms and rules in soci-
ety create regularity. Only persons who have
reached a certain age may obtain a driver’s li-
cense. In the National Hockey League, only
men participate on the ice. Such informal and
formal prescriptions regulate, or regularize, so-
cial behavior.

In addition to regularities produced by
norms and rules, social science is able to identify
other types of regularities. For example, teen-
agers commit more crimes than middle-aged
people. When males commit murder, they usu-
ally kill another male, but female murderers
more often kill a male. On average, white urban
residents view police more favorably than nonwhites
do. Judges receive higher salaries than
police officers. Probation officers have more
empathy for the people they supervise than
prison guards do.


That’s just the way we think. Suppose someone
says to you, “Women are too soft-hearted and
weak to be police officers.” You are likely to hear
that comment in terms of what you know about
the speaker. If it’s your old Uncle Albert, who,
you recall, is also strongly opposed to daylight
saving time, zip codes, and computers, you are
likely to think his latest pronouncement simply
fits into his dated views about things in general.
If the statement comes from a candidate for
sheriff who is trailing a female challenger and
who has begun making other statements about
women being unfit for public office, you may
hear his latest comment in the context of this
political challenge.

In both of these examples, you are trying to
understand the thoughts of a particular, con-
crete individual. In social science, however, we
go beyond that level of understanding to seek
insights into classes or types of individuals. In
the two preceding examples, we might use terms

assigned to more desirable jobs than nonwhite
inmates would be more interested in pat-
terns of job assignment. The focus would be
on aggregates of white and nonwhite persons
rather than the assignment for any particular
individual.

Social scientific theories, then, typically deal
with aggregate, not individual, behavior. Their
purpose is to explain why aggregate patterns
of behavior are so regular even when the indi-
viduals who perform them change over time. In
another important sense, social science doesn’t
seek to explain people. Rather, it seeks to un-
derstand the systems within which people op-
erate, the systems that explain why people do
what they do. The elements in such systems are
not people but variables.

A Variable Language
Our natural attempts at understanding usually
take place at the concrete, idiosyncratic level.

Figure 1.2 The Traditional Image of Science
[Figure: Idea/Interest, then THEORETICAL UNDERSTANDING (a network of related concepts labeled A through L, plus X, Y, and Z), then HYPOTHESIS Y = f(X), then [Operationalization] to y = f(x), then [Hypothesis testing].]


employed, and intoxicated. Any quality we might
use to describe ourselves or someone else is an
attribute.

Variables are logical groupings of attri-
butes. Male and female are attributes, and gender
is the variable composed of the logical grouping
of those two attributes. The variable occupation
is composed of attributes such as dentist, pro-
fessor, and truck driver. Prior record is a variable
composed of a set of attributes such as prior
convictions, prior arrests without convictions, and no
prior arrests. It’s helpful to think of attributes
as the categories that make up a variable. See
Figure 1.3 for a schematic view of what social
scientists mean by variables and attributes.
Panel A lists both variables and attributes, mix-
ing them together. Panel B separates these con-
cepts to distinguish variables from attributes.
Panel C presents variables together with the at-
tributes they carry.

The relationship between attributes and
variables lies at the heart of both description
and explanation in science. We might describe a
prosecutor’s office in terms of the variable gender
by reporting the observed frequencies of the
attributes male and female: “The office staff is
60 percent men and 40 percent women.” An
incarceration rate can be thought of as a de-
scription of the variable incarceration status of
a state’s population in terms of the attributes
incarcerated and not incarcerated. Even the report
of family income for a city is a summary of at-
tributes composing the income variable: $27,124,
$44,980, $76,000, and so forth.
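
One way to picture the relationship between variables and attributes is as a mapping from each variable to the attributes that compose it. The Python sketch below takes its variables and attributes from Panel C of Figure 1.3. The `describe` helper and the ten-person office are hypothetical, invented only to show how a description reports the frequency of each attribute.

```python
from collections import Counter

# Variables mapped to their attributes, following Panel C of Figure 1.3.
VARIABLES = {
    "gender": ["female", "male"],
    "age": ["young", "middle-aged", "old"],
    "sentence": ["fine", "prison", "probation"],
    "property crime": ["auto theft", "burglary", "larceny"],
    "occupation": ["judge", "lawyer", "thief"],
}

def describe(observations, variable):
    """Return the percentage of observations carrying each attribute of a variable."""
    counts = Counter(observations)
    unknown = set(counts) - set(VARIABLES[variable])
    if unknown:
        raise ValueError(f"not attributes of {variable}: {sorted(unknown)}")
    total = sum(counts.values())
    return {attr: 100 * counts[attr] / total for attr in VARIABLES[variable]}

# Hypothetical office of 10 staff members (invented for illustration).
staff_gender = ["male"] * 6 + ["female"] * 4
office = describe(staff_gender, "gender")
print(office)
```

Here the description of the office is simply the distribution of staff across the attributes of one variable, matching the "60 percent men and 40 percent women" style of statement in the text.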

The relationship between attributes and
variables becomes more complicated as we try
to explain how concepts are related to each
other. Here’s a simple example involving two
variables: type of defense attorney and sentence.
For the sake of simplicity, let’s assume that the
variable defense attorney has only two attributes:
private attorney and public defender. Similarly,
let’s give the variable sentence two attributes:
probation and prison.

Now let’s suppose that 90 percent of people
represented by public defenders are sentenced
to prison and the other 10 percent are sentenced

like old-fashioned or bigoted to describe the per-
son who made the comment. In other words, we
try to identify the actual individual with some
set of similar individuals, and that identification
operates on the basis of abstract concepts.

One implication of this approach is that it
enables us to make sense out of more than one
person. In understanding what makes the big-
oted candidate think the way he does, we can also
learn about other people who are like him. This
is possible because we have not been studying
bigots as much as we have been studying bigotry.

Bigotry is considered a variable in this case
because the level of bigotry varies; that is, some
people in an observed group are more bigoted
than others. Social scientists may be interested
in understanding the system of variables that
causes bigotry to be high in one instance and low
in another. However, bigotry is not the only vari-
able here. Gender, age, and economic status also
vary among members of the observed group.

Here’s another example. Consider the prob-
lem of whether police should make arrests in
cases of domestic violence. The object of a police
officer’s attention in handling a domestic
assault is the individual case. Of course, each
case includes a victim and an offender, and police
are concerned with preventing further harm
to the victim. The officer must decide whether
to arrest an assailant or to take some other ac-
tion. The criminal justice researcher’s subject
matter is different: does arrest as a general pol-
icy prevent future assaults? The researcher may
study an individual case (victim and offender),
but that case is relevant only as a situation in
which an arrest policy might be invoked, which
is what the researcher is really studying.

Variables and Attributes
Social scientists study variables and the attributes
that compose them. Social scientific
theories are written in a variable language, and
people get involved mostly as the carriers of
those variables.

Attributes are characteristics or qualities
that describe some object, such as a person.
Examples are bigoted, old-fashioned, married, un-


we bet on your ability to guess whether a per-
son is sentenced to prison or probation. We’ll
pick the people one at a time (not telling you
which ones we’ve picked), and you have to
guess which sentence each person receives. We’ll
do it for all 20 people in Figure 1.4A. Your best
strategy in this case is to always guess prison
because 12 out of the 20 people are categorized
that way. You’ll get 12 right and 8 wrong, for a
net success score of 4.

Now suppose that we pick a person from
Figure 1.4A and we have to tell you whether the
person has a private attorney or a public de-
fender. Your best strategy now is to guess prison
for each person with a public defender and pro-

to probation. And let’s suppose that 30 percent
of people with private attorneys go to prison
and the other 70 percent receive probation. This
is shown visually in Figure 1.4A.

Figure 1.4A illustrates a relationship be-
tween the variables defense attorney and sentence.
This relationship can be seen by the pairings of
attributes on the two variables. There are two
predominant pairings: (1) persons represented
by private attorneys who are sentenced to pro-
bation and (2) persons represented by public
defenders who are sentenced to prison. But
there are two other useful ways of viewing that
relationship.

First, imagine that we play a game in which

Figure 1.3 Variables and Attributes

A. Some Common Criminal Justice Concepts

Female, Probation, Thief, Gender, Sentence, Property crime, Middle-aged, Age, Auto theft, Occupation

B. Two Different Kinds of Concepts

Variables        Attributes
Gender           Female
Sentence         Probation
Property crime   Auto theft
Age              Middle-aged
Occupation       Thief

C. The Relationship between Variables and Attributes

Variables        Attributes
Gender           Female, male
Age              Young, middle-aged, old
Sentence         Fine, prison, probation
Property crime   Auto theft, burglary, larceny
Occupation       Judge, lawyer, thief


in Figure 1.4B. Notice that half the people have
private attorneys and half have public defenders.
Also notice that 12 of the 20 (60 percent)
are sentenced to prison: 6 who have private attorneys
and 6 who have public defenders. The
equal distribution of those sentenced to proba-
tion and those sentenced to prison, no matter
what type of defense attorney each person had,
allows us to conclude that the two variables are
unrelated. Here, knowing what type of attorney
a person had would not be of any value to you
in guessing whether that person was sentenced
to prison or probation.

bation for each person represented by a private
attorney. If you follow that strategy, you will
get 16 right and 4 wrong. Your improvement in
guessing the sentence on the basis of knowing
the type of defense attorney illustrates what it
means to say that the variables are related. You
would have made a probabilistic statement on
the basis of some empirical observations about
the relationship between type of lawyer and
type of sentence.
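
The guessing game can be worked through in a short Python sketch. The counts below are derived from the percentages and totals given in the text, assuming 10 defendants per attorney type (the only split consistent with 12 of 20 going to prison in Figure 1.4A). The function names are hypothetical.

```python
# Counts derived from the text, assuming 10 defendants per attorney type.
FIGURE_1_4A = {
    "public defender": {"prison": 9, "probation": 1},   # 90% / 10%
    "private attorney": {"prison": 3, "probation": 7},  # 30% / 70%
}
FIGURE_1_4B = {
    "public defender": {"prison": 6, "probation": 4},
    "private attorney": {"prison": 6, "probation": 4},
}

def correct_without_attorney(table):
    """Best score when always guessing the single most common sentence."""
    totals = {}
    for sentences in table.values():
        for sentence, n in sentences.items():
            totals[sentence] = totals.get(sentence, 0) + n
    return max(totals.values())

def correct_with_attorney(table):
    """Best score when guessing the most common sentence within each attorney type."""
    return sum(max(sentences.values()) for sentences in table.values())

for name, table in [("1.4A", FIGURE_1_4A), ("1.4B", FIGURE_1_4B)]:
    print(name, correct_without_attorney(table), correct_with_attorney(table))
```

For the Figure 1.4A counts, knowing the attorney type raises the number of correct guesses from 12 to 16. For the Figure 1.4B counts it stays at 12 either way, which is what it means for two variables to be unrelated.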

Second, let’s consider how the 20 people
would be distributed if type of defense attorney
and sentence were unrelated. This is illustrated

Figure 1.4 Illustration of Relationships between Two Variables
A. Defendants represented by public defenders are sentenced to prison more often than those represented by private attorneys.
B. There is no relationship between type of attorney and sentence.
[Each panel plots SENTENCE (prison, probation) against DEFENSE ATTORNEY (private attorney, public defender).]


on what we know about each. For example, we
know that private attorneys tend to be more
experienced than public defenders. Many law
school graduates gain a few years of experience
as public defenders before they enter private
practice. Logically, then, we would expect the
more experienced private attorneys to be bet-
ter able to get more lenient sentences for their
clients. We might explore this question directly
by examining the relationship between attorney
experience and sentence, perhaps comparing
inexperienced public defenders with public de-
fenders who have been working for a few years.
Pursuing this line of reasoning, we could also
compare experienced private attorneys with pri-
vate attorneys fresh out of law school.

Notice that the theory has to do with the
variables defense attorney, sentence, and years of
experience, not with individual people per se.
People are the carriers of those variables. We
study the relationship between the variables by
observing people. Ultimately, however, the the-
ory is constructed in terms of variables. It de-
scribes the associations that might logically be
expected to exist between particular attributes
of different variables.

Purposes of Research
We conduct criminal justice research to serve vari-
ous purposes.

Criminal justice research, of course, serves many
purposes. Explaining associations between two
or more variables is one of those purposes; oth-
ers include exploration, description, and appli-
cation. Although a given study can have several
purposes, it is useful to examine them sepa-
rately because each has different implications
for other aspects of research design.

Exploration
Much research in criminal justice is conducted
to explore a specific problem. A researcher or
official may be interested in a crime or criminal
justice policy issue about which little is known.
Or perhaps an innovative approach to policing,

Variables and Relationships
We will look more closely at the nature of the re-
lationships between variables later in this book.
For now, let’s consider some basic observations
about variables and relationships that illustrate
the logic of social scientific theories and their
use in criminal justice research.

Theories describe relationships that might
logically be expected among variables. This ex-
pectation often involves the notion of causa-
tion: a person’s attributes on one variable are
expected to cause or encourage a particular at-
tribute on another variable. In the example just
given, having a private attorney or a public de-
fender seemed to cause a person to be sentenced
to probation or prison, respectively. Apparently
there is something about having a public de-
fender that leads people to be sentenced to
prison more often than if they are represented
by a private attorney.

Type of defense attorney and sentence are ex-
amples of independent and dependent variables,
respectively. In this example, we assume that
criminal sentences are determined or caused
by something; the type of sentence depends on
something and so is called the dependent vari-
able. The dependent variable depends on an
independent variable; in this case, sentence de-
pends on type of defense attorney.

Notice, at the same time, that type of defense
attorney might be found to depend on something
else: our subjects’ employment status,
for example. People who have full-time jobs
are more likely to be represented by private at-
torneys than those who are unemployed. In this
latter relationship, the type of attorney is the de-
pendent variable, and the subject’s employment
status is the independent variable. In cause-and-
effect terms, the independent variable is the
cause and the dependent variable is the effect.

How does this relate to theory? Our discus-
sion of Figure 1.4 involved the interpretation
of data. We looked at the distribution of the
20 people in terms of the two variables. In con-
structing a theory, we form an expectation about
the relationship between the two variables based


or public official observes and then describes
what was observed. Criminal justice observa-
tion and description, methods grounded in the
social sciences, tend to be more accurate than
the casual observations people may make about
how much crime there is or how violent teen-
agers are today. Descriptive studies are often
concerned with counting or documenting ob-
servations; exploratory studies focus more on
developing a preliminary understanding about
a new or unusual problem.

Descriptive studies are frequently conducted
in criminal justice. The FBI has compiled Uni-
form Crime Reports (UCR) since 1930. UCR
data are routinely reported in newspapers and
widely interpreted as accurately describing
crime in the United States. For example, 2006
UCR figures (Federal Bureau of Investigation
2007) showed that Nevada had the highest rate
of auto theft (1080.4 per 100,000 residents) in
the nation and South Dakota had the lowest
(91.8 per 100,000 residents).

Descriptive studies in criminal justice have
other uses. A researcher may attend meetings
of neighborhood anticrime groups and observe
their efforts to organize block watch commit-
tees. These observations form the basis for a
case study that describes the activities of neigh-
borhood anticrime groups. Such a descriptive
study might present information that officials
and residents of other cities can use to promote
such organizations themselves. Or consider
research by Richard Wright and Scott Decker
(1994), in which they describe in detail how
burglars search for and select targets, how they
gain entry into residences, and how they dis-
pose of the goods they steal.

Explanation
A third general purpose of criminal justice re-
search is to explain things. Recall our earlier ex-
ample, in which we sought to explain the rela-
tionship between type of attorney and sentence
length. Reporting that urban residents have
generally favorable attitudes toward police is
a descriptive activity, but reporting why some

court management, or corrections has been
tried in some jurisdiction, and the researcher
wishes to determine how common such prac-
tices are in other cities or states. An exploratory
project might collect data on a measure to es-
tablish a baseline with which future changes
will be compared.

For example, heightened concern over drug
use might prompt efforts to estimate the level
of drug abuse in the United States. How many
people are arrested for drug sales or possession
each year? How many high school seniors report
using marijuana in the past week or the past
month? How many hours per day do drug deal-
ers work, and how much money do they make?
These are examples of research questions in-
tended to explore different aspects of the prob-
lem of drug abuse. Exploratory questions may
also be formulated in connection with criminal
justice responses to drug problems. How many
cities have created special police or prosecutor
task forces to crack down on drug sales? What
sentences are imposed on major dealers or on
casual users? How much money is spent on
treatment for drug users? What options exist
for treating different types of addiction?

Exploratory studies are also appropriate
when a policy change is being considered. One
of the first questions public officials typically
ask when they consider some new policy is how
other cities (or states) have handled this problem.

Exploratory research in criminal justice can
be simple or complex, using a variety of meth-
ods. A mayor anxious to learn about drug ar-
rests in his or her city might simply phone the
police chief and request a report. Estimating
how many high school seniors have used mari-
juana requires more sophisticated survey meth-
ods. Since the early 1970s, the National Insti-
tute on Drug Abuse has supported nationwide
surveys of students regarding drug use.

Description
A key purpose of many criminal justice studies
is to describe the scope of the crime problem or
policy responses to the problem. A researcher


Rather than observing and analyzing current or
past behavior, policy analysis tries to anticipate
the future consequences of alternative actions.

Similarly, justice organizations are increas-
ingly using techniques of problem analysis to
study patterns of cases and devise appropriate
responses. Problem-oriented policing is perhaps
the best-known example, in which crime ana-
lysts work with police and other organizations
to examine recurring problems. Ron Clarke and
John Eck (2005) have prepared a comprehensive
guide for this type of applied research.

Our brief discussion of distinct research
purposes is not intended to imply that research
purposes are mutually exclusive. Many criminal
justice studies have elements of more than one
purpose. Suppose you want to evaluate a new
program to reduce bicycle theft at your univer-
sity. First, you need some information that de-
scribes the problem of bicycle theft on campus.
Let’s assume your evaluation finds that thefts
from some campus locations have declined but
that there was an increase in bikes stolen from
racks outside dormitories. You might explain
these findings by noting that bicycles parked
outside dorms tend to be unused for longer pe-
riods and that there is more coming and going
among bikes parked near classrooms. One op-
tion to further reduce thefts would be to pur-
chase more secure bicycle racks. A policy anal-
ysis might compare the costs of installing the
racks with the predicted savings resulting from
a reduction in bike theft.
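
The cost comparison in such a policy analysis reduces to simple arithmetic, sketched below in Python. Every figure here (rack price, number of racks, thefts prevented, average bicycle value) is hypothetical and invented for illustration, as is the `net_savings` function name. A real analysis would need defensible estimates for each input.

```python
def net_savings(rack_cost, racks, thefts_prevented, avg_bike_value):
    """Predicted savings from prevented thefts minus the cost of installing racks."""
    installation_cost = rack_cost * racks
    predicted_savings = thefts_prevented * avg_bike_value
    return predicted_savings - installation_cost

# All inputs are hypothetical, chosen only to show the calculation.
result = net_savings(rack_cost=500, racks=20, thefts_prevented=40, avg_bike_value=300)
print(result)  # positive means predicted savings exceed installation costs
```

Because policy analysis looks forward, the number of thefts prevented is a prediction rather than an observation, so the result is only as good as that estimate.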

Differing Avenues for Inquiry
Social scientific research is conducted in a variety
of ways.

There is no one way of doing criminal justice
research. If there were, this would be a much
shorter book. In fact, much of the power and
potential of social scientific research lies in the
many valid approaches it comprises.

Three broad and interrelated distinctions
underlie many of the variations of social scientific
research: (1) idiographic and nomothetic

people believe that police are doing a good job
while other people do not is an explanatory ac-
tivity. Similarly, reporting why Nevada has the
highest auto-theft rate in the nation is expla-
nation; simply reporting auto-theft rates for
different states is description. A researcher has
an explanatory purpose if he or she wishes to
know why the number of 14-year-olds involved
in gangs has increased, as opposed to simply
describing changes in gang membership.

Application
Researchers also conduct criminal justice stud-
ies of an applied nature. Applied research stems
from a need for specific facts and findings with
policy implications. Another purpose of crimi-
nal justice research, therefore, is its application
to public policy. We can distinguish two types
of applied research: evaluation and policy/
problem analysis.

Applied research is often used to evaluate
the effects of specific criminal justice programs.
Determining whether a program designed to
reduce burglary actually had the intended effect
is an example of evaluation. In its most basic
form, evaluation involves comparing the goals
of a program with the results. If one goal of in-
creased police foot patrol is to reduce fear of
crime, then an evaluation of foot patrol might
compare levels of fear before and after increasing
the number of police officers on the beat
on foot. In most cases, evaluation research uses
social scientific methods to test the results of a
program or policy change.

The second type of applied research is pol-
icy and problem analysis. What would hap-
pen to court backlogs if we designated a judge
and prosecutor who would handle only drug-
dealing cases? How many new police officers
would have to be hired if a department shifted to
community policing? These are examples of
what if questions addressed by policy analysis.
Answering such questions is sort of a counter-
part to program evaluation. Policy analysis is
different from other forms of criminal justice
research primarily in its focus on future events.

efficiently, using only one or just a few explanatory
factors. Finally, it settles for a partial rather
than a full explanation of a type of situation.

In each of the preceding nomothetic ex-
amples, you might qualify your causal state-
ments with phrases such as “on the whole” or
“usually.” You usually do better on exams when
you’ve studied in a group, but there have been
exceptions. Your team has won some games
on the road and lost some at home. And last
week you got a speeding ticket on the way to
Tuesday’s chemistry class, but you did not get
one over the weekend. Such exceptions are an
acceptable price to pay for a broader range of
overall explanation.

Both idiographic and nomothetic ap-
proaches to understanding can be useful in
daily life. They are also powerful tools for crim-
inal justice research. The researcher who seeks
an exhaustive understanding of the inner work-
ings of a particular juvenile gang or the rulings
of a specific judge is engaging in idiographic research.
The aim is to understand that particular
group or individual as fully as possible.

Rick Brown and Ron Clarke (2004) sought
to understand thefts of a particular model of
Nissan trucks in the south of England. Most
stolen trucks were never recovered. Their re-
search led Brown and Clarke to a shipping yard
where trucks were taken apart and shipped
to ports in France and Nigeria as scrap metal.
They later learned that trucks were reassembled
and sold to individuals and small companies.
In the course of their research, they linked most
thieves in England and most resellers abroad to
legitimate shipping and scrap metal businesses.
Even though Brown and Clarke sought answers
to the idiosyncratic problem of stolen trucks in
one region of England, they came to some ten-
tative conclusions about loosely organized in-
ternational theft rings.

Sometimes, however, the aim is a more gen-
eralized understanding across a class of events,
even though the level of understanding is inevitably
more superficial. For example, researchers
who seek to uncover the chief factors that lead to

explanations, (2) inductive and deductive rea-
soning, and (3) quantitative and qualitative
data. Although it is possible to see them as
competing choices, a good researcher masters
each of these orientations.

Idiographic and Nomothetic
Explanations
All of us go through life explaining things; we
do it every day. You explain why you did poorly
or well on an exam, why your favorite team is
winning or losing, and why you keep getting
speeding tickets. In our everyday explanations,
we engage in two distinct forms of causal rea-
soning—idiographic and nomothetic expla-
nation—although we do not ordinarily distin-
guish them.

Sometimes we attempt to explain a single
situation exhaustively. You might have done
poorly on an exam because (1) you had forgot-
ten there was an exam that day, (2) it was in
your worst subject, (3) a traffic jam caused you
to be late to class, and (4) your roommate kept
you up the night before with loud music. Given
all these circumstances, it is no wonder that
you did poorly on the exam.

This type of causal reasoning is idiographic
explanation. Idio in this context means “unique,
separate, peculiar, or distinct,” as in the word
idiosyncrasy. When we complete an idiographic
explanation, we feel that we fully understand
the many causes of what happened in a par-
ticular instance. At the same time, the scope of
our explanation is limited to the case at hand.
Although parts of the idiographic explanation
might apply to other situations, our intention
is to explain one case fully.

Now consider a different kind of explana-
tion. For example, every time you study with
a group, you do better on an exam than if you
study alone. Your favorite team does better at
home than on the road. You get more speeding
tickets on weekends than during the week. This
type of explanation, called nomothetic, seeks
to explain a class of situations or events rather
than a single one. Moreover, it seeks to explain

or in a group? It suddenly occurs to you that
you almost always do better on exams when
you studied with others than when you studied
alone. This is known as the inductive mode of
inquiry.

Inductive reasoning (induction) moves
from the specific to the general, from a set of
particular observations to the discovery of a
pattern that represents some degree of order
among the varied events under examination.
Notice, incidentally, that your discovery doesn’t
necessarily tell you why the pattern exists—
merely that it does.

There is a second, and very different, way
you might reach the same conclusion about
studying for exams. As you approach your first
set of exams in college, you might wonder about
the best ways to study. You might consider how
much you should review the readings and how
much you should focus on your class notes.
Should you study at a measured pace over time
or pull an all-nighter just before the exam?
Among these musings, you might ask whether
you should get together with other students
in the class or study on your own. You decide
to evaluate the pros and cons of both options.
On the one hand, studying with others might
not be as efficient because a lot of time might
be spent on material you already know. Or the
group might get distracted from studying. On
the other hand, you can understand something
even better when you’ve explained it to someone
else. And other students might understand ma-
terial that you’ve been having trouble with and
reveal perspectives that might have escaped you.

So you add up the pros and the cons and
conclude, logically, that you’d benefit from
studying with others. This seems reasonable to
you in theory. To see whether it is true in prac-
tice, you test your idea by studying alone for
half your exams and studying with others for
half. This second approach is known as the de-
ductive mode of inquiry.
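
The test described above (study alone for half the exams, with others for the other half, then compare) can be sketched as follows. The exam scores and the function name are hypothetical, chosen only to illustrate the logic of checking a theoretical expectation against observations.

```python
from statistics import mean

# Hypothetical exam scores, split by how you studied for each exam
scores_alone = [72, 68, 75, 70]
scores_group = [81, 78, 85, 80]

def supports_expectation(group, alone):
    """True if the average group-study score exceeds the average solo score."""
    return mean(group) > mean(alone)

print("Average studying alone:", mean(scores_alone))
print("Average studying with others:", mean(scores_group))
print("Expectation supported:", supports_expectation(scores_group, scores_alone))
```

With only a handful of exams, a difference like this is suggestive at best; the sketch shows the direction of the reasoning, not a rigorous test.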

Deductive reasoning (deduction) moves
from the general to the specific. It moves from
a pattern that might be logically or theoreti-

juvenile delinquency are pursuing a nomothetic
inquiry. They might discover that children who
frequently skip school are more likely to have
records of delinquency than those who attend
school regularly. This explanation would extend
well beyond any single juvenile, but it would
do so at the expense of a complete explanation.

In contrast to the idiographic study of Nis-
san truck theft, Pierre Tremblay and associ-
ates (2001) explored how a theory of offending
helped explain different types of offender net-
works. Examining auto thefts over 25 years in
Quebec, the authors concluded that different
types of relationships were involved in different
types of professional car theft. However, Trem-
blay and associates found that persons involved
in legitimate car sales and repair businesses
were key members in all networks. The research-
ers showed how complex relationships among
people involved in legitimate and illegitimate
activities helped explain patterns of car theft
over a quarter-century. This is an illustration of
the nomothetic approach to understanding.

Thus social scientists have access to two dis-
tinct logics of explanation. We can alternate be-
tween searching for broad, albeit less detailed,
universals (nomothetic) and probing more
deeply into more specific cases (idiographic).

Inductive and Deductive Reasoning
The distinction between inductive and deduc-
tive reasoning exists in daily life, as well as in
criminal justice research. You might take two
different routes to reach the conclusion that
you do better on exams if you study with others.
Suppose you find yourself puzzling, halfway
through your college career, over why you
do so well on exams sometimes and poorly at
other times. You list all the exams you’ve taken,
noting how well you did on each. Then you try
to recall any circumstances shared by all the
good exams and by all the poor ones. Do you
do better on multiple-choice exams or essay
exams? Morning exams or afternoon exams?
Exams in the natural sciences, the humanities,
or the social sciences? After you studied alone

may tend to be a little younger than you in age
but to act more mature. Or we might have been
thinking of how young or old your friends look
or of the variation in their life experiences, their
worldliness. All these other meanings are lost in
the numerical calculation of average age.

In addition to greater detail, nonnumerical
observations seem to convey a greater richness
of meaning than do quantified data. Think
of the cliché “he is older than his years.” The
meaning of that expression is lost in attempts
to specify how much older. In this sense, the
richness of meaning is partly a function of am-
biguity. If the expression meant something to
you when you read it, that meaning came from
your own experiences, from people you have
known who might fit the description of being
older than their years.

This concept can be quantified to a certain
extent, however. For example, we could make a
list of life experiences that contribute to what
we mean by worldliness:

Getting married
Getting divorced
Having a parent die
Seeing a murder committed
Being arrested
Being fired from a job
Running away with a rock band

We could quantify people’s worldliness by
counting how many of these experiences they
have had: the more such experiences, the more
worldly we say they are. If we think that some
experiences are more powerful than others, we
can give those experiences more points than
others. Once we decide on the specific experiences
to be considered and the number of points
each warrants, scoring people and comparing
their worldliness is fairly straightforward.
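
As a minimal sketch of that scoring scheme, the Python below assigns points to each experience in the list and sums them for a person. The point values are hypothetical (the text leaves the weights to the researcher), as are the function and variable names.

```python
# Hypothetical point values; more "powerful" experiences get more points.
EXPERIENCE_POINTS = {
    "getting married": 1,
    "getting divorced": 1,
    "having a parent die": 2,
    "seeing a murder committed": 3,
    "being arrested": 2,
    "being fired from a job": 1,
    "running away with a rock band": 1,
}

def worldliness_score(experiences):
    """Sum the points for each listed experience a person has had.

    Each experience counts once, even if it is reported more than once.
    """
    return sum(EXPERIENCE_POINTS[e] for e in set(experiences))

person_a = ["getting married", "being fired from a job"]
person_b = ["being arrested", "seeing a murder committed", "getting divorced"]

print("Person A:", worldliness_score(person_a))
print("Person B:", worldliness_score(person_b))
```

Setting every point value to 1 reduces the score to a simple count of experiences, the unweighted version described first.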

To quantify a concept like worldliness, we
must be explicit about what we mean. By focusing
specifically on what we will include in our
measurement of the concept, as we did here,
we also exclude the other possible meanings.
Inevitably, then, quantitative measures will be

cally expected to observations that test whether
the expected pattern actually occurs in the
real world. Notice that deduction begins with
why and moves to whether, whereas induction
moves in the opposite direction.

Both inductive and deductive reasoning are
valid avenues for criminal justice and other
social scientific research. Moreover, they work
together to provide ever-more powerful and
complete understandings.

Quantitative and Qualitative Data
Simply put, the distinction between quantita-
tive and qualitative data is the distinction be-
tween numerical and nonnumerical data. When
we say that someone is witty, we are making a
qualitative assertion. When we say that that
person has appeared three times in a local com-
edy club, we are attempting to quantify our
assessment.

Most observations are qualitative at the out-
set, whether it is our experience of someone’s
sense of humor, the location of a pointer on a
measuring scale, or a check mark entered in a
questionnaire. None of these things is inher-
ently numerical. But it is often useful to convert
observations to a numerical form. Quantification
often makes our observations more explicit,
makes it easier to aggregate and summarize
data, and opens up the possibility of statistical
analyses, ranging from simple descriptions to
more complex testing of relationships between
variables.

Quantification requires focusing our attention
and specifying meaning. Suppose someone
asks whether your friends tend to be older or
younger than you. A quantitative answer seems
easy. You think about how old each of your
friends is, calculate an average, and see whether
it is higher or lower than your own age. Case
closed.
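
The quantitative answer just described amounts to a short calculation. In this sketch the ages and the function name are hypothetical, used only to make the steps concrete.

```python
from statistics import mean

def friends_older_or_younger(own_age, friend_ages):
    """Compare the average age of friends with your own age."""
    average = mean(friend_ages)
    if average > own_age:
        return "older"
    if average < own_age:
        return "younger"
    return "same"

# Hypothetical ages
print(friends_older_or_younger(20, [19, 22, 25]))
```

The calculation treats "older or younger" strictly as years alive, which is exactly the narrowing of meaning that quantification requires.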

Or is it? Although we focused our attention
on “older or younger” in terms of the number
of years people have been alive, we might mean
something different with that idea—for exam-
ple, “maturity” or “worldliness.” Your friends

entists sometimes use tensiometers, spectro-
graphs, and other such equipment for measure-
ment, criminal justice researchers use a variety
of techniques, examined in Part Three.

The other key to criminal justice research
is interpretation. Much of interpretation is
based on data analysis, which is introduced in
Part Four. More generally, however, interpre-
tation very much depends on how observa-
tions are structured, a point we will encounter
repeatedly.

As we put the pieces together—measure-
ment and interpretation—we are in a position
to describe, explain, or predict something. And
that is what social science is all about.

✪ Main Points
• Knowledge of research methods is valuable to
criminal justice professionals as consumers and
producers of research.

• The study of research methods is the study of
how we know what we know.

• Inquiry is a natural human activity for gaining
an understanding of the world around us.

• Much of our knowledge is based on agreement
rather than direct experience.

• Tradition and authority are important sources
of knowledge.

• Empirical research is based on experience
and produces knowledge through systematic
observation.

• In day-to-day inquiry, we often make mistakes.
Science offers protection against such mistakes.

• Whereas people often observe inaccurately, sci-
ence avoids such errors by making observation
a careful and deliberate activity.

• Sometimes we jump to general conclusions on
the basis of only a few observations. Scientists
avoid overgeneralization through replication.

• Scientists avoid illogical reasoning by being as
careful and deliberate in their thinking as in
their observations.

• The scientific study of crime guards against,
but does not prevent, ideological and political
beliefs influencing research findings.

• Social science involves three fundamental as-
pects: theory, data collection, and data analysis.

• Social scientific theory addresses what is, not
what should be.

more superficial than qualitative descriptions.
This is the trade-off.

What a dilemma! Which approach should
we choose? Which is better? Which is more ap-
propriate to criminal justice research?

The good news is that we don’t have to
choose. In fact, by choosing to undertake a
qualitative or quantitative study, researchers
run the risk of artificially limiting the scope of
their inquiry. Both qualitative and quantitative
methods are useful and legitimate. And some
research situations and topics require elements
of both approaches.

Knowing through
Experience: Summing
Up and Looking Ahead
Empirical research involves measurement and
interpretation.

This chapter introduced the foundation of
criminal justice research: empirical research, or
learning through experience. Each avenue for
inquiry—nomothetic or idiographic descrip-
tion, inductive or deductive reasoning, quali-
tative or quantitative data—is fundamentally
empirical. It’s worth keeping that in mind as
we examine the various forms criminal justice
research can take.

It is also helpful to think of criminal jus-
tice research as organized around two basic
activities: measurement and interpretation. Re-
searchers measure aspects of reality and then
draw conclusions about the meaning of what
they have measured. All of us are observing
all the time, but measurement refers to some-
thing more deliberate and rigorous. Part Two
of this book describes ways of structuring ob-
servations to produce more deliberate, rigorous
measures.

Our ability to interpret observations in crim-
inal justice research depends crucially on how
those observations are structured. After decid-
ing how to structure observations, we have to
actually measure them. Whereas physical sci-

related violence, it is easy to assume that these
are real problems identified by systematic study.
Choose a criminal justice topic or claim that’s
currently prominent in news stories or enter-
tainment. Consult a recent edition of the Sour-
cebook of Criminal Justice Statistics (citation below)
for evidence to refute the claim.

✪ Additional Readings
Babbie, Earl, The Sociological Spirit (Belmont, CA:
Wadsworth, 1994). This primer on some sociological
points of view introduces some of the
concepts commonly used in the social sciences.

Hoover, Kenneth R., and Todd Donovan, The Elements
of Social Scientific Thinking, 9th edition
(Belmont, CA: Wadsworth, 2007). This book
provides an excellent overview of the key elements
in social scientific analysis.

Levine, Robert, A Geography of Time: The Temporal
Misadventures of a Social Psychologist (New York:
Basic Books, 1997). Most of us think of time
as absolute. Levine’s book is fun and fascinat-
ing as he explores how agreement reality plays
a major role in how people from different cul-
tures think about time.

Pastore, Ann L., and Kathleen Maguire (eds.), Sour-
cebook of Criminal Justice Statistics (Washington,
DC: U.S. Department of Justice, Office of Justice
Programs, Bureau of Justice Statistics, annual;
www.albany.edu/sourcebook; accessed
May 6, 2008). For 30 years this annual publica-
tion has been a source of basic data on criminal
justice. If you’re not yet familiar with this com-
pendium, Chapter 1 is a good place to start.

Tilley, Nick, and Gloria Laycock, Working Out What
to Do: Evidence-Based Crime Reduction (London:
Home Office Policing and Reducing Crime
Unit, Crime Reduction Series, no. 11, 2002;
www.homeoffice.gov.uk/rds/crimreducpubs1.html;
accessed May 6, 2008). One of many excellent
publications from the British Home
Office, this guide helps justice professionals
develop policies based on empirical experi-
ence. The guide is clearly written and useful as
an illustration of how practitioners use social
science.

• Theory guides research. In grounded theory, ob-
servations contribute to theory development.

• Social scientists are interested in explaining ag-
gregates, not individuals.

• Although social scientists observe people, they
are primarily interested in discovering relation-
ships that connect variables.

• Explanations may be idiographic or nomothetic.

• Data may be quantitative or qualitative.

• Theories may be inductive or deductive.

✪ Key Terms
These terms are defined in the chapter where they
are set in boldface and can also be found in the
glossary at the end of the book.

aggregate, p. 13
attribute, p. 15
deductive reasoning, p. 22
dependent variable, p. 18
empirical, p. 6
grounded theory, p. 11
hypothesis, p. 11
hypothesis testing, p. 13
idiographic, p. 21
independent variable, p. 18
inductive reasoning, p. 22
nomothetic, p. 21
replication, p. 9
theory, p. 11
variable, p. 15

✪ Review Questions and Exercises
1. Review the common errors of personal inquiry
discussed in this chapter. Find a newspaper or
magazine article about crime that illustrates
one or more of those errors. Discuss how a sci-
entist would avoid making that error.

2. Briefly discuss examples of descriptive research
and explanatory research about changes in
crime rates in some major city.

3. Often things we think are true and supported
by considerable experience and evidence turn
out not to be true, or at least not true with the
certainty we expected. Criminal justice seems
especially vulnerable to this phenomenon, per-
haps because crime and criminal justice policy
are so often the subjects of mass and popular
media attention. If news stories, movies, and
TV shows all point to growing gang- or drug-


Chapter 2

Ethics and Criminal
Justice Research
We’ll examine some of the ethical considerations that must be taken into
account along with the scientific ones in the design and execution of research.
We’ll consider different types of ethical issues and ways of handling them.

Introduction

Ethical Issues in Criminal Justice Research

No Harm to Participants

ETHICS AND EXTREME FIELD RESEARCH

Voluntary Participation

Anonymity and Confidentiality

Deceiving Subjects

Analysis and Reporting

Legal Liability

Special Problems

Promoting Compliance with Ethical Principles

Codes of Professional Ethics

Institutional Review Boards

Institutional Review Board Requirements and Researcher Rights

ETHICS AND JUVENILE GANG MEMBERS

Ethical Controversies

The Stanford Prison Experiment

Discussion Examples

Introduction
Despite our best intentions, we don’t always recog-
nize ethical issues in research.

Most of this book focuses on scientific procedures
and constraints. We’ll see that the logic
of science suggests certain research procedures,
but we’ll also see that some scientifically “perfect”
study designs are not feasible, because
they would be too expensive or take too long to
execute. Throughout the book, we’ll deal with
workable compromises.

Before we get to scientific and practical constraints
on research, it’s important to explore
another essential consideration in doing criminal
justice research in the real world: ethics.
Just as certain designs or measurement proce-
dures are impractical, others are constrained by
ethical problems.

All of us consider ourselves ethical—not
perfect perhaps, but more ethical than most
of humanity. The problem in criminal justice
research—and probably in life—is that ethical
considerations are not always apparent to us.
As a result, we often plunge into things with-
out seeing ethical issues that may be obvious
to others and even to ourselves when they are
pointed out. Our excitement at the prospect of
a new research project may blind us to obstacles
that ethical considerations present.

Any of us can immediately see that a study
that requires juvenile gang members to dem-
onstrate how they steal cars is unethical. You’d
speak out immediately if we suggested inter-
viewing people about drug use and then pub-
lishing what they said in the local newspaper.
But, as ethical as we think we are, we are likely to
miss the ethical issues in other situations—not
because we’re bad, but because we’re human.

Ethical Issues in Criminal
Justice Research
A few basic principles encompass the variety of ethi-
cal issues in criminal justice research.

In most dictionaries and in common usage,
ethics is typically associated with morality, and

both deal with matters of right and wrong. But
what is right and what is wrong? What is the
source of the distinction? Depending on the
individual, sources vary from religion to politi-
cal ideology to pragmatic observations of what
seems to work and what doesn’t.

Webster’s New World Dictionary (4th ed.) is
typical among dictionaries in defining ethical
as “conforming to the standards of conduct
of a given profession or group.” Although the
relativity embedded in this definition may frustrate
those in search of moral absolutes, what
we regard as moral and ethical in day-to-day life
is no more than a matter of agreement among
members of a group. And, not surprisingly, dif-
ferent groups have agreed on different ethical
codes of conduct. If someone is going to live
in a particular society, it is extremely useful to
know what that society considers ethical and
unethical. The same holds true for the criminal
justice research “community.”

Anyone preparing to do criminal justice
research should be aware of the general agree-
ments shared by researchers about what’s
proper and improper in the conduct of scientific
inquiry. Ethical issues in criminal justice
can be especially challenging because our re-
search questions frequently address illegal be-
havior that people are anxious to conceal. This
is true of offenders and, sometimes, people who
work in criminal justice agencies.

The sections that follow explore some of the
more important ethical issues and agreements
in criminal justice research. Our discussion is
restricted to ethical issues in research, not in
policy or practice. Thus, we will not consider
such issues as the morality of the death pen-
alty, acceptable police practices, the ethics of
punishment, or codes of conduct for attorneys
and judges. If you are interested in substantive
ethical issues in criminal justice policy, consult
Jocelyn Pollock (2003) or Richard Hall and as-
sociates (1999) for an introduction.

No Harm to Participants
Weighing the potential benefits from doing
research against the possibility of harm to the

as well as embarrassment. Although the likeli-
hood of physical harm may seem remote, it is
worthwhile to consider possible ways it might
occur.

Harm to subjects, researchers, or third parties
is possible in field studies that collect
information from or about persons engaged in
criminal activity; this is especially true for field
research. Studies of drug crimes may involve lo-

people being studied (or harm to other people)
is a fundamental ethical dilemma in all re-
search. For example, biomedical research can
involve potential physical harm to people or
animals. Social research may cause psychologi-
cal harm or embarrassment in people who are
asked to reveal information about themselves.
Criminal justice research has the potential to
produce both physical and psychological harm,

ETHICS AND EXTREME
FIELD RESEARCH

Dina Perrone
Bridgewater State College

As a female ethnographer studying active drug
use in a New York dance club, I have encountered
awkward and difficult situations. The main
purpose of my research was to study the use of
ecstasy and other drugs in rave club settings. I be-
came a participant observer in an all-night dance
club (The Plant) where the use of club drugs was
common. I covertly observed activities in the club,
partly masking my role as a researcher by assum-
ing the role of club-goer.

Though I was required to comply with uni-
versity institutional review board guidelines, pub-
lished codes and regulations offered limited guid-
ance for many of the situations I experienced. As
a result, I had to use my best judgment, learning
from past experiences to make immediate deci-
sions regarding ethical issues. I was forced to
make decisions about how to handle drug epi-
sodes, so as not to place my research or my infor-
mants in any danger. Because my research was
conducted in a dance club that is also a place
for men to pick up women, I faced problems in
getting information from subjects while watching
out for my physical safety.

Drug Episodes and Subject Safety
I witnessed many drug episodes—adverse reac-
tions to various club drugs—in my visits to The
Plant. I watched groups trying to get their friends
out of K-holes resulting from ketamine, or Spe-

cial K. I even aided a subject throwing up. Being
a covert observer made it difficult to handle these
episodes. There were times in the club when I felt
as though I was the only person not under the
influence of a mind-altering substance. This led
me to believe that I had better judgment than the
other patrons. Getting involved in these episodes,
however, risked jeopardizing my research.

During my first observation, I tried to intervene
in what appeared to be a serious drug episode
but was warned off by an informant. I was
new to the club and unsure what would happen
if I got involved. If I sought help from club staff
or outsiders in dealing with acute drug reactions,
patrons as well as the bouncers would begin to
question why I kept coming there. I needed to
gain the trust of the patrons to enlist participants
in my research. Furthermore, the bouncers could
throw me out of the club, fearing I was a trouble-
maker who would summon authorities.

As a researcher, I have an ethical responsibil-
ity to my participants, and as a human being, I
have an ethical responsibility to my conscience. I
decided to be extra cautious during my research
and to pay close attention to how drug episodes
are handled. I would first consult my informants
and follow their suggestions. But if I ever thought
a person suffering a drug episode was at risk while
other patrons were neither able nor inclined to
help, I would intervene to the best of my ability.

Sexual Advances in the Dance Club
The Plant is also partly a “meat market.” Unlike
most bars and dance clubs, the patrons’ attire and
the dance club entertainment are highly erotic.
Most of the males inside the club are shirtless,
and the majority of females wear extremely reveal-

Potential danger to field researchers should
also be considered. For instance, Peter Reu-
ter and associates (1990) selected their drug
dealer subjects by consulting probation depart-
ment records. The researchers recognized that
sampling persons from different Washington,
D.C., neighborhoods would have produced a
more generalizable group of subjects, but they
rejected that approach because mass media

cating and interviewing active users and dealers.
Bruce Johnson and associates (1985) studied
heroin users in New York, recruiting subjects
by spreading the word through various means.
Other researchers have studied dealers in Detroit
(Mieczkowski 1990) and St. Louis (Jacobs
1999). Collecting information from active crim-
inals presents at least the possibility of violence
against research subjects by other drug dealers.

ing clothes. In staged performances, males and
females perform dances with sexual overtones,
and clothing is partly shed. This atmosphere
promotes sexual encounters; men frequently ap-
proach single women in search of a mate. Men
had a tendency to approach me—I appeared to
be unattached, and because of my research role, I
made it a point to talk to as many people as possible.
It’s not difficult to imagine how this behavior
could be misinterpreted.

There were times when men became sexu-
ally aggressive and persistent. In most instances,
I walked away, and the men usually got the hint.
However, some men are more persistent than oth-
ers, especially when they are on ecstasy. In situa-
tions in which men make sexual advances, Terry
Williams and colleagues (1992) suggest devel-
oping a trusting relationship with key individuals
who can play a protective role. Throughout my
research, I established a good rapport with my
informants, who assumed that protective role.
Unfortunately, acting in this role has had the po-
tential to place my informants in physically dan-
gerous circumstances.

During one observation, “Tom” grabbed me
after I declined his invitation to dance. Tom per-
sisted, grabbed me again, then began to argue with
“Jerry,” one of my regular informants, who came
to my aid. This escalated to a fistfight broken up
only after two bouncers ejected Tom from the club.

I had placed my informant and myself in a
dangerous situation. Although I tried to convince
myself that I really had no control over Jerry’s actions,
I felt responsible for the fight. A basic principle
of field research is to not invite harm to participants.
In most criminal justice research, harm
is associated mainly with the possibility of arrest

or psychological harm from discussing private is-
sues. Afterward, I tried to think about how the in-
cident escalated and how I could prevent similar
problems in the future.

Ethical Decision Rules Evolving from
Experience
Academic associations have formulated codes
of ethics and professional conduct, but limited
guidance is available for handling issues that arise
in some types of ethnographic research. Instead,
like criminal justice practitioners, those research-
ers have to make immediate decisions based on
experience and training, without knowing how a
situation will unfold. Throughout my research,
I found myself in situations that I would nor-
mally avoid and would probably never confront.
Should I help the woman over there get through a
drug episode? If I don’t, will she be okay? If I walk
away from this aggressive guy, will he follow me?
Does he understand that I wanted to talk to him
just for research?

The approach I developed to tackle these is-
sues was mostly gained by consulting with col-
leagues and reading other studies. An overarch-
ing theme regarding all codes of ethics is that
ethnographers must put the safety and interests
of their participants first, and they must recognize
that their informants are more knowledgeable
about many situations than they are. Through-
out the research, I used my judgment to make the
best decisions possible when handling these situ-
ations. To decide when to intervene during drug
episodes, I followed the lead of my informants.
Telling men that my informant was my boyfriend
and walking away were successful tactics in turn-
ing away sexual advances.


computer questionnaires in the British Crime
Survey (Mirrlees-Black 1999). Rather than ver-
bally respond to questions from interviewers,
respondents read and answer questions on
a laptop computer. This procedure affords a
greater degree of privacy for research subjects.

Although the fact often goes unrecognized,
subjects can also be harmed by the analysis
and reporting of data. Every now and then, re-
search subjects read the books published about
the studies they participated in. Reasonably
sophisticated subjects can locate themselves
in the various indexes and tables of published
studies. Having done so, they may find themselves
characterized—though not identified by
name—as criminals, deviants, probation violators,
and so forth.

Largely for this reason, information on the
city of residence of victims identified in the
National Crime Victimization Survey is not
available to researchers or the public. The relative
rarity of some types of crime means that if
crime victimization is reported by city of residence,
individual victims might recognize the
portrayal of their experience or might be identified
by third parties.
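
The disclosure risk described above can be made concrete with a small
sketch. It is an illustration only, not the procedure used for the National
Crime Victimization Survey: it assumes a simple rule that any city-by-crime
count below a threshold is withheld, and the function name, the threshold of
5, and the sample data are all hypothetical.

```python
from collections import Counter


def suppressed_counts(reports, min_cell=5):
    """Tabulate reports by (city, crime) and withhold small cells.

    min_cell is an assumed threshold for illustration; a count below it
    is replaced by None so that rare events cannot be singled out.
    """
    counts = Counter((r["city"], r["crime"]) for r in reports)
    return {cell: (n if n >= min_cell else None) for cell, n in counts.items()}


# Made-up data: a common crime and a rare one in the same city.
reports = ([{"city": "City A", "crime": "burglary"}] * 12
           + [{"city": "City A", "crime": "carjacking"}] * 2)
table = suppressed_counts(reports)
for cell, n in table.items():
    print(cell, "withheld" if n is None else n)
```

The rare cell is the one a victim or neighbor could most easily match to a
specific incident, which is why it is the one withheld in this sketch.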

Recent developments in the use of crime-
mapping software have raised similar concerns.
Many police departments now use some type
of computer-driven crime map, and some have
made maps of small areas available to the pub-
lic on the Web. As Tom Casady (1999) points
out, this raises new questions of privacy as in-
dividuals might be able to identify crimes di-
rected against their neighbors. Researchers and
police alike must recognize the potential for
such problems before publishing or otherwise
displaying detailed crime maps. See crime maps
for cities in the San Diego metropolitan area
for examples (www.arjis.org; accessed January
16, 2008).

By now, it should be apparent that virtually
all research runs some risk of harming other
people somehow. A researcher can never com-
pletely guard against all possible injuries, yet
some study designs make harm more likely

reports of widespread drug-related violence gen-
erated concern about the safety of research staff
(Reuter, MacCoun, and Murphy 1990, 119).
Whether such fears were warranted is unclear,
but this example does illustrate how safety is-
sues can affect criminal justice research.

Other researchers acknowledge the potential
for harm in the context of respect for ethical
principles. The box titled “Ethics and Extreme
Field Research” gives examples of subtle and
not-so-subtle ethical dilemmas encountered by
a Rutgers University graduate in her study of
drug use in rave clubs.

More generally, John Monahan and associ-
ates (1993) distinguish three different groups
at potential risk of physical harm in their re-
search on violence. First are research subjects
themselves. Women at risk of domestic violence
may be exposed to greater danger if assailants
learn they have disclosed past victimizations to
researchers. Second, researchers might trigger
attacks on themselves when they interview sub-
jects who have a history of violent offending.
Third, and most problematic, is the possibility
that collecting information from unstable indi-
viduals might increase the risk of harm to third
parties. The last category presents a new di-
lemma if researchers learn that subjects intend
to attack some third party. Should researchers
honor a promise of confidentiality to subjects
or intervene to prevent the harm?

The potential for psychological harm to
subjects exists when interviews are used to col-
lect information. Crime surveys that ask re-
spondents about their experiences as victims
of crime may remind them of a traumatic, or
at least an unpleasant, experience. Surveys may
also ask respondents about illegal behaviors
such as drug use or crimes they have commit-
ted. Talking about such actions with interview-
ers can be embarrassing.

Some researchers have taken special steps to
reduce the potential for emotional trauma in
interviews of domestic violence victims (Tjaden
and Thoennes 2000). One of the most interest-
ing examples involves the use of self-completed


periment; they are told that participation is
completely voluntary; and they are further in-
structed that they can expect no special rewards
(such as early parole) for participation. Even
under these conditions, volunteers often are
motivated by the belief that they will personally
benefit from their cooperation. In other cases,
prisoners—or other subjects—may be offered
small cash payments in exchange for participa-
tion. To people with very low incomes, small
payments may be an incentive to participate in
a study they would not otherwise endure.

When an instructor in an introductory
criminal justice class asks students to fill out
a questionnaire that she or he plans to ana-
lyze and publish, students should always be
told that their participation in the survey is
completely voluntary. Even so, students might
fear that nonparticipation will somehow affect
their grade. The instructor should therefore
be especially sensitive to the implied sanctions
and make provisions to obviate them, such as
allowing students to drop the questionnaires in
a box near the door prior to the next class.

Notice how this norm of voluntary participation
works against a number of scientific
concerns or goals. In the most general terms,
the goal of generalizability is threatened if ex-
perimental subjects or survey respondents are
only the people who willingly participate. The
same is true when subjects’ participation can be
bought with small payments. Research results
may not be generalizable to all kinds of people.
Most clearly, in the case of a descriptive study, a
researcher cannot generalize the study findings
to an entire population unless a substantial
majority of a scientifically selected sample actually
participates—both the willing respondents
and the somewhat unwilling.

Field research (the subject of Chapter 8) has
its own ethical dilemmas in this regard. Often,
a researcher who conducts observations in the
field cannot even reveal that a study is being
done, for fear that this revelation might significantly
affect what is being studied. Imagine that
you are interested in whether the way stereo

than others. If a particular research procedure
seems likely to produce unpleasant effects for
subjects—such as asking survey respondents to
report deviant behavior—the researcher should
have firm scientific grounds for doing so. If researchers
pursue a design that is essential and
also likely to be unpleasant for subjects, they
will find themselves in an ethical netherworld,
forced to do some personal agonizing.

As a general principle, possible harm to subjects
may be justified if the potential benefits of
the study outweigh the harm. Of course, this
raises a further question of how to determine
whether possible benefits offset possible harms.
There is no simple answer, but as we will see,
the research community has adopted certain
safeguards that help subjects to make such de-
terminations themselves.

Not harming people is an easy norm to accept
in theory, but it is often difficult to ensure
in practice. Sensitivity to the issue and experi-
ence in research methodology, however, should
improve researchers’ efforts in delicate areas of
inquiry. Review Dina Perrone’s observations in
the box “Ethics and Extreme Field Research”
for examples.

Voluntary Participation
Criminal justice research often intrudes into
people’s lives. The interviewer’s telephone call or
the arrival of a questionnaire via e-mail signals
the beginning of an activity that respondents
have not requested and that may require a significant
portion of their time and energy. Being
selected to participate in any sort of research
study disrupts subjects’ regular activities.

A major tenet of medical research ethics is
that experimental participation must be vol-
untary. The same norm applies to research in
criminal justice. No one should be forced to
participate. But this norm is far easier to accept
in theory than to apply in practice.

For example, prisoners are sometimes used
as subjects in experimental studies. In the most
rigorously ethical cases, prisoners are told the
nature—and the possible dangers— of the ex-


been interviewed, because researchers did not
record their names. Nevertheless, in some situ-
ations, the price of anonymity is worth pay-
ing. In a survey of drug use, for example, we
may decide that the likelihood and accuracy
of responses will be enhanced by guaranteeing
anonymity.

Respondents in many surveys cannot be con-
sidered anonymous because an interviewer col-
lects the information from individuals whose
names and addresses are known. Other means
of data collection may similarly make it impos-
sible to guarantee anonymity for subjects. If we
wished to examine juvenile arrest records for a
sample of ninth-grade students, we would need
to know their names even though we might not
be interviewing them or having them fill out a
questionnaire.

Confidentiality Confidentiality means that
a researcher is able to link information with a
given person’s identity but essentially promises
not to do so publicly. In a survey of self-reported
drug use, the researcher is in a position to make
public the use of illegal drugs by a given respon-
dent, but the respondent is assured that this
will not be done. Similarly, if field interviews
are conducted with juvenile gang members, researchers
can certify that information will not
be disclosed to police or other officials. Studies
using court or police records that include individuals’
names may protect confidentiality by
not including any identifying information.

Some techniques ensure better performance
on this guarantee. To begin, field or survey interviewers
who have access to respondent identifications
should be trained in their ethical
responsibilities. As soon as possible, all names
and addresses should be removed from data
collection forms and replaced by identification
numbers. A master identification file should be
created linking numbers to names to permit
the later correction of missing or contradictory
information. This file should be kept under
lock and key and be made available only for legitimate
purposes.
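
The procedure just described (strip names and addresses, substitute
identification numbers, keep a separate master file) might look something
like the following minimal sketch. The field names, the "S0001" numbering
format, and the sample responses are hypothetical, and the sketch says
nothing about how the master file is physically secured. Sequential numbers
are used for readability; a real project might prefer random ones so that
the order of data collection is not revealed.

```python
import csv
import io


def split_identifiers(records, id_fields=("name", "address")):
    """Separate identifying fields from research data.

    Returns (research_rows, master_rows). Research rows keep only an
    identification number and the non-identifying answers; master rows
    link each number back to the identifying fields.
    """
    research_rows, master_rows = [], []
    for number, record in enumerate(records, start=1):
        subject_id = f"S{number:04d}"  # hypothetical ID format
        master_rows.append({"subject_id": subject_id,
                            **{f: record[f] for f in id_fields}})
        research_rows.append({"subject_id": subject_id,
                              **{k: v for k, v in record.items()
                                 if k not in id_fields}})
    return research_rows, master_rows


def to_csv(rows):
    """Serialize rows to CSV text (the master file would be stored apart)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up responses, for illustration only.
responses = [
    {"name": "Respondent A", "address": "1 Example St", "used_drugs": "yes"},
    {"name": "Respondent B", "address": "2 Example St", "used_drugs": "no"},
]
research, master = split_identifiers(responses)
print(to_csv(research))
```

Analysts would work only with the research rows; the master rows are
consulted when a missing or contradictory answer has to be traced back to a
respondent.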

Whenever a survey is confidential rather

headphones are displayed in a discount store af-
fects rates of shoplifting. Therefore, you plan a
field study in which you will make observations
of store displays and shoplifting. You cannot
very well ask all shoppers whether they agree to
participate in your study.

The norm of voluntary participation is an
important one, but it is sometimes impossible
to follow. In cases in which researchers ultimately
feel justified in violating it, it is all the
more important to observe the other ethical
norms of scientific research.

Anonymity and Confidentiality
The clearest concern in the protection of the
subjects’ interests and well-being is the protec-
tion of their identity. If revealing their behav-
ior or responses would injure them in any way,
adherence to this norm becomes crucial. Two
techniques—anonymity and confidentiality—
assist researchers in this regard, although the
two are often confused.

Anonymity A research subject is considered
anonymous when the researcher cannot as-
sociate a given piece of information with the
person. Anonymity addresses many potential
ethical difficulties. Studies that use field observation
techniques are often able to ensure
that research subjects cannot be identified.
Researchers may also gain access to nonpublic
records from courts, corrections departments,
or other criminal justice agencies in which the
names of persons have been removed.

One example of anonymity is a web-based
survey where no login or other identifying in-
formation is required. Respondents anony-
mously complete online questionnaires that
are then tabulated. Likewise, a telephone survey
is anonymous if residential phone numbers are
selected at random and respondents are not
asked for identifying information. Interviews
with subjects in the field are anonymous if the
researchers neither ask for nor record the names
of subjects.

Assuring anonymity makes it difficult to
keep track of which sampled respondents have


pate in a study of human development. She also
prepared a brochure describing her research on
human development that was distributed to
respondents.

Although we might initially think that
concealing our research purpose by deception
would be particularly useful in studying active
offenders, James Inciardi (1993), in describing
methods for studying “crack houses,” makes a
convincing case that this is inadvisable. First,
concealing our research role when investigat-
ing drug dealers and users implies that we are
associating with them for the purpose of ob-
taining illegal drugs. Faced with this situation,
a researcher would have the choice of engaging
in illegal behavior or offering a convincing ex-
planation for declining to do so. Second, mas-
querading as a crack-house patron would have
exposed the researcher to the considerable dan-
ger of violence that was found to be common in
such places. Because the choice of committing
illegal acts or becoming a victim of violence is
really no choice at all, Inciardi (1993, 152) ad-
vises researchers who study active offenders in
field settings: “Don’t go undercover.”

Analysis and Reporting
As criminal justice researchers, we have ethi-
cal obligations to our subjects of study. At the
same time, we have ethical obligations to our
colleagues in the scientific community; a few
comments on those obligations are in order.
In any rigorous study, the researcher should be
more familiar than anyone else with the tech-
nical shortcomings and failures of the study.
Researchers have an obligation to make those
shortcomings known to readers. Even though
it’s natural to feel foolish admitting mistakes,
researchers are ethically obligated to do so.

Any negative findings should be reported.
There is an unfortunate myth in social scientific
reporting that only positive discoveries are
worth reporting (and journal editors are some-
times guilty of believing that as well). And this
is not restricted to social science. Helle Krogh
Johansen and Peter Gotzsche (1999) describe
how published research on new drugs tends

than anonymous, it is the researcher’s respon-
sibility to make that fact clear to respondents.
He or she must never use the term anonymous to
mean confidential. Note, however, that research
subjects and others may not understand the dif-
ference. For example, a former assistant attor-
ney general in New Jersey once demanded that
Maxfield disclose the identities of police officers
who participated in an anonymous study.
It required repeated explanations of the difference
between anonymous and confidential before
the lawyer finally understood that it was not
possible to identify participants. In any event,
subjects should be assured that the information
they provide will be used for research purposes
only and not be disclosed to third parties.

Deceiving Subjects
We’ve seen that the handling of subjects’ iden-
tities is an important ethical consideration.
Handling our own identity as researchers can be
tricky, too. Sometimes it’s useful and even neces-
sary to identify ourselves as researchers to those
we want to study. It would take a master con art-
ist to get people to participate in a laboratory
experiment or complete a lengthy questionnaire
without letting on that research was being con-
ducted. We should also keep in mind that de-
ceiving people is unethical; in criminal justice
research, deception needs to be justified by compelling
scientific or administrative concerns.

Sometimes, researchers admit that they are
doing research but fudge about why they are
doing it or for whom. Cathy Spatz Widom and
associates interviewed victims of child abuse
some 15 years after their cases had been heard
in criminal or juvenile courts (Widom, Weiler,
and Cottler 1999). Widom was interested in
whether child abuse victims were more likely
than a comparison group of nonvictims to have
used illegal drugs. Interviewers could not ex-
plain the purpose of the study without poten-
tially biasing responses. Still, it was necessary
to provide a plausible explanation for asking
detailed questions about personal and family
experiences. Widom’s solution was to inform
subjects that they had been selected to partici-


data may be subject to subpoena by a criminal
court. Because disclosure of research data that
could be traced to individual subjects violates
the ethical principle of confidentiality, a new
dilemma emerges.

Fortunately, federal law protects researchers
from legal action in most circumstances, pro-
vided that appropriate safeguards are used to
protect research data. Research plans for 2002
published by organizations in the Office of
Justice Programs summarized this protection:
“[Research] information and copies thereof
shall be immune from legal process, and shall
not, without the consent of the person furnish-
ing such information, be admitted as evidence
or used for any purpose in any action, suit, or
other judicial, legislative, or administrative pro-
ceedings” (42 U.S. Code §22.28a). This not only
protects researchers from legal action but also
can be valuable in assuring subjects that they
cannot be prosecuted for crimes they describe
to an interviewer or field worker. Bruce Johnson
and associates (1985, 219) prominently displayed
a Federal Certificate of Confidentiality
at their research office to assure heroin dealers
that they could not be prosecuted for crimes
disclosed to interviewers. More savvy than
many people about such matters, the heroin us-
ers were duly impressed.

Note that such immunity requires confidential
information to be protected. We have
already discussed the principle of confidentiality,
so this bargain should be an easy one to
keep.

Somewhere between legal liability and physical
danger lies the potential risk to field researchers
from law enforcement. Despite being
up-front with crack users about his role as a re-
searcher, Inciardi (1993) points out that police
could not be expected to distinguish him from
his subjects. Visibly associating with offenders
in natural settings brings some risk of being
arrested or inadvertently being an accessory
to crime. On one occasion, Inciardi fled the
scene of a robbery and on another was caught
up in a crack-house raid. Another example

to focus on successful experiments. Unsuc-
cessful research on new formulations is less
often published, which leads pharmaceutical
researchers to repeat studies of drugs already
shown to be ineffective. Largely because of this
bias, researchers at the Johns Hopkins Univer-
sity Medical School have established the Jour-
nal of Negative Observations in Genetic Oncology
(NOGO), dedicated to publishing negative findings
from cancer research (www.path.jhu.edu/nogo/;
accessed May 6, 2008). In social science,
as in medical research, it is often as important
to know that two things are not related as to
know that they are.

In general, science progresses through hon-
esty and openness, and is retarded by ego de-
fenses and deception. We can serve our fellow
researchers—and the scientific community as
a whole—by telling the truth about all the pit-
falls and problems experienced in a particular
line of inquiry. With luck, this will save others
from the same problems.

Legal Liability
Two types of ethical problems expose research-
ers to potential legal liability. To illustrate the
first, assume you are making field observations
of criminal activity, such as street prostitution,
that is not reported to police. Under criminal
law in many states, you might be arrested for
obstructing justice or being an accessory to a
crime. Potentially more troublesome is the situ-
ation in which participant observation of crime
or deviance draws researchers into criminal or
deviant roles themselves—such as smuggling
cigarettes into a lockup in order to obtain the
cooperation of detainees.

The second and more common potential
source of legal problems involves knowledge
that research subjects have committed illegal
acts. Self-report surveys or field interviews may
ask subjects about crimes they have committed.
If respondents report committing offenses they
have never been arrested for or charged with,
the researcher’s knowledge of them might be
construed as obstruction of justice. Or research


We will tell you what the researchers decided
at the end of this chapter. You should recog-
nize, however, how applied research in criminal
justice agencies can involve a variety of ethical
issues.

Research Causes Crime Because criminal
acts and their circumstances are complex and
imperfectly understood, some research projects
have the potential to produce crime or influence
its location or target. Certainly, this is a
potentially serious ethical issue for researchers.

Most people agree that it is unethical to en-
courage someone to commit an offense solely
for the purpose of a research project. What’s
more problematic is recognizing situations in
which research might indirectly promote of-
fending. Scott Decker and Barrik Van Winkle
(1996) discuss such a possibility in their re-
search on gang members. Some gang mem-
bers offered to illustrate their willingness to
use violence by inviting researchers to witness
a drive-by shooting. Researchers declined all
such invitations (1996, 46). Another ethical is-
sue was the question of how subjects used the
$20 cash payments they received in exchange
for being interviewed (1996, 51):

We set the fee low enough that we were
confident that it would not have a criminogenic
effect. While twenty dollars is not a
small amount of money, it is not sufficient
to purchase a gun or bankroll a large drug
buy. We are sure that some of our subjects
used the money for illegal purposes. But,
after all, these were individuals who were
regularly engaged in delinquent and crimi-
nal acts.

You may or may not agree with the authors’
reasoning in the last sentence. But their consid-
eration of how cash payments would be used by
active offenders represents an unusually care-
ful recognition of the ethical dilemmas that
emerge in studying active offenders.

A different type of ethical problem is the
possibility of crime displacement in studies of

is the account Bruce Jacobs (1996) gives of
his contacts with police while he was study-
ing street drug dealers. Exercises presented at
the end of the chapter ask you to think more
carefully about the ethical issues involved in
Jacobs’s contact with police.

Special Problems
Certain types of criminal justice studies present
special ethical problems in addition to those we
have mentioned. Applied research, for example,
may evaluate some existing or new program.
Evaluations frequently have the potential to
disrupt the routine operations of agencies be-
ing studied. Obviously, it is best to minimize
such interferences whenever possible.

Staff Misbehavior While conducting ap-
plied research, researchers may become aware
of irregular or illegal practices by staff in public
agencies. They are then faced with the ethical
question of whether to report such informa-
tion. For example, investigators conducting an
evaluation of an innovative probation program
learned that police visits to the residences of
probationers were not taking place as planned.
Instead, police assigned to the program had
been submitting falsified log sheets and had
not actually checked on probationers.

What is the ethical dilemma in this case? On
the one hand, researchers were evaluating the
probation program and so were obliged to re-
port reasons it did or did not operate as planned.
Failure to deliver program treatments (home
visits) is an example of a program not operat-
ing as planned. Investigators had guaranteed
confidentiality to program clients—the offenders
assigned to probation—but no such agreement
had been struck with program staff. On
the other hand, researchers had assured agency
personnel that their purpose was to evaluate the
probation program, not individuals’ job perfor-
mance. If researchers disclosed their knowledge
that police were falsifying reports, they would
violate this implied trust.

What would you have done in this situation?


Suppose researchers believe that diverting do-
mestic violence offenders from prosecution to
counseling reduces the possibility of repeat vio-
lence. Is it ethical to conduct an experiment in
which some offenders are prosecuted but oth-
ers are not?

You may recognize the similarity between
this question and those faced by medical re-
searchers who test the effectiveness of experi-
mental drugs. Physicians typically respond
to such questions by pointing out that the ef-
fectiveness of a drug cannot be demonstrated
without such experiments. Failure to conduct
research, even at the potential expense of sub-
jects not receiving the trial drugs, would there-
fore make it impossible to develop new drugs
or distinguish beneficial treatments from those
that are ineffective and even harmful.

One solution to this dilemma is to interrupt
an experiment if preliminary results indicate
that a new policy, or drug, does in fact produce
improvements in a treatment group. Michael
Dennis (1990) describes how such plans were
incorporated into a long-term evaluation of en-
hanced drug treatment counseling. If prelimi-
nary results had indicated that the new coun-
seling program reduced drug use, researchers
and program staff were prepared to provide
enhanced counseling to subjects in the control
group. Dennis recognized this potential ethi-
cal issue and planned his elaborate research de-
sign to accommodate such midstream changes.
Similarly, Martin Killias and associates (2000)
planned to interrupt their experimental study
of heroin prescription in Switzerland if compelling
evidence pointed to benefits from that
approach to treating drug dependency.

Mandatory Reporting The situation is some-
what murkier for researchers studying certain
kinds of family violence. Following the Federal
Child Abuse Prevention and Treatment Act of
1974, all states developed child protection agen-
cies and adopted mandatory reporting laws.
Specific provisions vary, but in general, people
who learn about possible cases of child abuse
must report them to designated state agencies.

crime prevention programs. Consider an ex-
perimental program to reduce street prostitu-
tion in one area of a city. Researchers studying
such a program might designate experimental
target areas for enhanced enforcement, as well
as nearby comparison areas that will not receive
an intervention. If prostitution is displaced
from target areas to adjacent neighborhoods,
the evaluation study contributes to an increase
in prostitution in the comparison areas.

In a review of more than 50 evaluations of
crime prevention projects, René Hesseling (1994)
concludes that displacement tended to be asso-
ciated with programs targeting street prostitu-
tion, bank robbery, and certain combinations
of offenses. The type of crime prevention action
also made a difference, with displacement more
common for target-hardening programs. For
example, installing security screens on ground-floor
windows in some buildings seemed to
displace burglary to less protected structures.
Similarly, adding steering column locks to new
cars tended to increase thefts of older cars (Fel-
son and Clarke 1998).

At the same time, Hesseling demonstrates
that displacement is by no means inevitable.
Ronald Clarke and John Eck (2005) further
argue that researchers and public officials unrealistically
assume a deterministic model of
offending behavior. Instead, offenders are eas-
ily dissuaded by a variety of crime prevention
measures.

In any event, when it does occur, displace-
ment tends to follow major policy changes
that are not connected with criminal justice
research. Researchers cannot be expected to
control actions by criminal justice officials that
may benefit some people at the expense of others.
However, it is reasonable to expect researchers
involved in planning an evaluation study to
anticipate the possibility of such things as dis-
placement and bring them to the attention of
program staff.

Withholding of Desirable Treatments Cer-
tain kinds of research designs in criminal justice
can lead to different kinds of ethical questions.


safeguards are used. In 1974, the National
Research Act was signed into law after a few
highly publicized examples of unethical prac-
tices in medical and social science research.
A few years later, what has become known as
The Belmont Report prescribed a brief but com-
prehensive set of ethical principles for protect-
ing human subjects (National Commission for
the Protection of Human Subjects of Biomedi-
cal and Behavioral Research 1979). In only six
pages, three principles were presented:

1. Respect for persons: Individuals must be al-
lowed to make their own decisions about
participation in research, and those with
limited capacity to make such decisions
should have special protection.

2. Beneficence: Research should do no harm to
participants, and seek to produce benefits.

3. Justice: The benefits and burdens of participating
in research should be distributed
fairly.

Copious federal regulations have stemmed
from these three principles. But in most cases,
the research community has adopted two gen-
eral mechanisms for promoting ethical research
practices: codes of professional ethics and insti-
tutional review boards.

Codes of Professional Ethics
If the professionals who design and conduct
research projects can fail to recognize ethical
problems, how can such problems be avoided?
One approach is for researchers to consult
one of the codes of ethics produced by pro-
fessional associations. Formal codes of con-
duct describe what is considered acceptable
and unacceptable professional behavior. The
American Psychological Association (2002)
code of ethics is quite detailed, reflecting the
different professional roles of psychologists in
research, clinical treatment, and educational
contexts.

Many of the ethical questions criminal
justice researchers are likely to encounter are
addressed in the ethics code of the Ameri-
can Sociological Association (1997). Paul

This certainly seems to be a worthwhile goal,
but what about researchers who learn about
possible child maltreatment in the course of a
survey? In most states, such requirements ap-
ply only to health professionals and teachers.
But in eight states, anyone who suspects a case of
child maltreatment must report it to designated
authorities.

Notice how this is consistent with one ethi-
cal principle—protection of human subjects by
reporting possible victims—but at odds with
another principle: confidentiality. A Bureau of
Justice Statistics report on human subjects pro-
tection suggests that researchers warn subjects
at the beginning of an interview that any infor-
mation disclosed about child abuse must be
reported to authorities (Sieber 2001). But that
threatens researchers’ ability to learn about
child abuse. Another approach, adopted by Li-
anne Woodward and David Fergusson (2000),
is to interview subjects age 18 and older, asking
about experiences of abuse victimization when
they were children. This is an imperfect solu-
tion, but it illustrates the trade-offs between
our interest in protecting research subjects and
our interest in studying the phenomenon of
child abuse. For more examples and guidance
in this vexing research/ethical area, see Seth Ka-
lichman’s (2000) book, published by the Ameri-
can Psychological Association.

Research in criminal justice, especially ap-
plied research, can pose a variety of ethical di-
lemmas, only some of which we have mentioned
here. See the “Additional Readings” at the end
of this chapter for more information.

Promoting Compliance with
Ethical Principles
Codes of ethics and institutional review boards are
two main ways of promoting compliance with ethi-
cal principles.

No matter how sensitive they might be to the
rights of individuals and possible ways subjects
might be harmed, researchers are not always
the best judges of whether or not adequate

38 Part One An Introduction to Criminal Justice Inquiry

Links to a number of ethics codes for so-
cial science associations are listed, including
those listed above and the British Society of
Criminology.

What can we make of the inability of the
largest professional association of criminolo-
gists to agree on a code of ethics? The wide vari-
ety of approaches to doing research in this area
probably has something to do with it. Crimi-
nologists also encounter a range of ethical is-
sues and have diverging views on how those is-
sues should be addressed. Finally, we have seen
examples of the special problems criminolo-
gists face in balancing ethics and research. Not
all of these problems have easy solutions that
can be embodied in a code.

Even when they exist, professional codes of
ethics for social scientists cannot be expected
to prevent unethical practices in criminal jus-
tice research any more than the American Bar
Association’s Code of Professional Responsi-
bility eliminates breaches of ethics by lawyers.
For this reason, and in reaction to some contro-
versial medical and social science research, the
U.S. Department of Health and Human Ser-
vices (HHS) has established regulations protect-
ing human research subjects. These regulations
do not apply to all social science or criminal
justice research. It is, however, worthwhile to
understand some of their general provisions.
Material in the following section is based on
the Code of Federal Regulations, Title 45,
Part 46. Those regulations are themselves
rooted in The Belmont Report.

Institutional Review Boards
Government agencies and nongovernment or-
ganizations (including universities) that con-
duct research involving human subjects must
establish review committees, known as institu-
tional review boards (IRB). These IRBs have two
general purposes. First, board members make
judgments about the overall risks to human
subjects and whether these risks are acceptable,
given the expected benefits from actually doing
the research. Second, they determine whether

Reynolds (1979, 442–449) has created a composite
code for the use of human subjects in research,
drawing on 24 codes of ethics published
by national associations of social scientists.
The National Academy of Sciences publishes
a very useful booklet on a variety of ethical is-
sues, including the problem of fraud and other
forms of scientific misconduct (Committee on
Science, Engineering, and Public Policy 1995).

The two national associations represent-
ing criminology and criminal justice research-
ers have one code of ethics between them. The
Academy of Criminal Justice Sciences (ACJS)
based its code of ethics on that developed by
the American Sociological Association. ACJS
members are bound by a very general code that
reflects the diversity of its membership: “Most
of the ethical standards are written broadly, to
provide applications in varied roles and var-
ied contexts. The Ethical Standards are not
exhaustive— conduct that is not included in
the Ethical Standards is not necessarily ethi-
cal or unethical” (Academy of Criminal Justice
Sciences 2000, 1).

After years of inaction, a committee of the
American Society of Criminology (ASC) pro-
posed a draft code of ethics in 1998, which drew
extensively on the code for sociology. But no
ethics code had been adopted as of July 2007,
and the ASC withdrew its draft code from cir-
culation in 1999. In personal correspondence
with Maxfield in 2003, a prominent ASC officer
expressed doubt that any sort of code would
be approved soon; eventually, this person felt, a
very brief statement of general principles might
be approved. In the meantime, the ASC website
includes the following statement on a page ti-
tled “Code of Ethics”:

The American Society of Criminology has
not formally adopted a code of ethics. We
would suggest that persons interested in
this general topic examine the various codes
of ethics adopted by other professional as-
sociations. (www.asc41.com/ethicspg.html;
accessed January 11, 2008)

about the purpose of the research—human
development— one component of which is be-
ing a victim of child abuse, which the subjects
were not told.

Another potential problem with obtaining
informed consent is ensuring that subjects
have the capacity to understand the descriptions
of risks, benefits, procedures, and
so forth. Researchers may have to provide oral
descriptions to participants who are unable to
read. For subjects who do not speak English, re-
searchers should be prepared to describe proce-
dures in their native language. And if research-
ers use specialized terms or language common
in criminal justice research, participants may
not understand the meaning and thus be un-
able to grant informed consent. Consider this
statement: “The purpose of this study is to
determine whether less restrictive sanctions
such as restitution produce heightened sensi-
tivity to social responsibility among persistent
juvenile offenders and a decline in long-term
recidivism.” Can you think of a better way to
describe this study to delinquent 14-year-olds?
Figure 2.1 presents a good example of an in-
formed consent statement that was used in a
study of juvenile burglars. Notice how the state-
ment describes research procedures clearly and
unambiguously tells subjects that participation
is voluntary.

Other guidelines for obtaining informed
consent include explicitly telling people that
their participation is voluntary and assuring
them of confidentiality. However, it is more
important to understand how informed con-
sent addresses key ethical issues in conducting
criminal justice research. First, it ensures that
participation is voluntary. Second, by informing
subjects of procedures, risks, and benefits,
researchers are empowering them to resolve the
fundamental ethical dilemma of whether the
possible benefits of the research offset the
possible risks of participation.

Special Populations Federal regulations
on human subjects include special provisions

the procedures to be used include adequate
safeguards regarding the safety, confidentiality,
and general welfare of human subjects.

Under HHS regulations, virtually all research
that uses human subjects in any way, includ-
ing simply asking people questions, is subject
to IRB review. The few exceptions potentially
include research conducted for educational
purposes and studies that collect anonymous
information only. However, even those studies
may be subject to review if they use certain spe-
cial populations (discussed later) or procedures
that might conceivably harm participants. In
other words, it’s safe to assume that most re-
search is subject to IRB review if original data
will be collected from individuals whose identities
will be known. If you think about the various
ways subjects might be harmed and the difficulty
of conducting anonymous studies, you
can understand why this is the case.

Federal regulations and IRB guidelines ad-
dress other potential ethical issues in social
research. Foremost among these is the typical
IRB requirement for dealing with the ethical
principle of voluntary participation.

Informed Consent The norm of voluntary
participation is usually satisfied through informed
consent: informing subjects about
research procedures and then obtaining their
consent to participate. Although this may seem
like a simple requirement, obtaining informed
consent can present several practical difficulties.
It requires that subjects understand the
purpose of the research, possible risks and side
effects, possible benefits to subjects, and the
procedures that will be used.

If you accept that deception may sometimes
be necessary, you will realize how the require-
ment to inform subjects about research pro-
cedures can present something of a dilemma.
Researchers usually address this problem by
telling subjects at least part of the truth or of-
fering a slightly revised version of why the re-
search is being conducted. In Widom’s study of
child abuse, subjects were partially informed

ethical principles and satisfied the concerns of
their university’s IRB.

Prisoners are treated as a special popula-
tion for somewhat different reasons. Because
of their ready accessibility for experiments and
interviews, prisoners have frequently been used
in biomedical experiments that produced seri-
ous harm (Mitford 1973). Recognizing this,
HHS regulations specify that prisoner sub-
jects may not be exposed to risks that would
be considered excessive for nonprison subjects.
Furthermore, undue influence or coercion cannot
be used in recruiting prisoner subjects. Informed
consent statements presented to prospective
subjects must indicate that a decision
not to participate in a study will have no influence
on work assignments, privileges, or parole
decisions. To help ensure that these ethical

for certain types of subjects, and two of these
are particularly important in criminal justice
research—juveniles and prisoners. Juveniles, of
course, are treated differently from adults in
most aspects of the law. Their status as a special
population of human subjects reflects the legal
status of juveniles, as well as their capacity to
grant informed consent. In most studies that
involve juveniles, consent must be obtained
both from parents or guardians and from the
juvenile subjects themselves.

In some studies, however, such as those that
focus on abused children, it is obviously not de-
sirable to obtain parental consent. Decker and
Van Winkle faced this problem in their study
of St. Louis gang members. See the box “Ethics
and Juvenile Gang Members” for a discussion
of how they reconciled the conflict between two

You and your parents or guardian are invited to participate in a research study of the monitoring program
that you were assigned to by the Juvenile Court. The purpose of this research is to study the program and
your reactions to it. In order to do this a member of the research team will need to interview you and your
parents/guardians when you complete the monitoring program. These interviews will take about 15 minutes
and will focus on your experiences with the court and monitoring program, the things you do, things that have
happened to you, and what you think. In addition, we will record from the court records information about the
case for which you were placed in the monitoring program, prior cases, and other information that is put in the
records after you are released from the monitoring program.

Anything you or your parents or guardian tell us will be strictly confidential. This means that only the re-
searchers will have your answers. They will not under any conditions (except at your request) be given to the
court, the police, probation officers, your parents, or your child!

Your participation in this research is voluntary. If you don’t want to take part, you don’t have to! If you de-
cide to participate, you can change your mind at any time. Whether you participate or not will have no effect
on the program, probation, or your relationship with the court.

The research is being directed by Dr. Terry Baumer and Dr. Robert Mendelsohn from the Indiana Univer-
sity School of Public and Environmental Affairs here in Indianapolis. If you ever have any questions about the
research or comments about the monitoring program that you think we should know about, please call one of
us at 274-0531.

Consent Statement

We agree to participate in this study of the Marion County Juvenile Monitoring Program. We have read the
above statement and understand what will be required and that all information will be confidential. We also
understand that we can withdraw from the study at any time without penalty.

Juvenile Date: ________________

Parent /Guardian

Parent /Guardian

Researcher

Figure 2.1 Informed Consent Statement for Evaluation of Marion County Juvenile Monitoring Program

regulations actually create problems by setting
constraints on their freedom and professional
judgments in conducting research. Recall that
potential conflict between the rights of researchers
to discover new knowledge and the rights of
subjects to be free from unnecessary harm is a
fundamental ethical dilemma. It is at least in-
convenient to have outsiders review a research
proposal. Or a researcher may feel insulted by
the implication that the potential benefits of
research will not outweigh the potential harm
or inconvenience to human subjects.

Many university IRBs have become ex-
tremely cautious in reviewing research propos-
als. See Christopher Shea’s (2000) discussion
for examples of problems resulting from this.
Professional associations and research-oriented
federal agencies have tried to offer guidance
on what is and is not subject to various levels
of IRB approval. Joan Sieber (2001) prepared
an analysis of human subjects issues associated
with large surveys for the Bureau of Justice Sta-
tistics. Always alert for possible restrictions on
academic freedom, the American Association of
University Professors (2001) published a useful
summary of how IRBs have come to regulate
social science research.

issues are recognized, if an IRB reviews a proj-
ect in which prisoners will be subjects, at least
one member of that IRB must be either a prisoner
or someone specifically designated to
represent the interests of prisoners. Figure 2.2
presents a checklist required by the Rutgers
University IRB for proposed research involving
prisoners.

Regarding the last item in Figure 2.2, ran-
dom selection is generally recognized as an eth-
ical procedure for selecting subjects or decid-
ing which subjects will receive an experimental
treatment. HHS regulations emphasize this in
describing special provisions for using prison
subjects: “Unless the principal investigator provides
to the [IRB] justification in writing for
following some other procedures, control sub-
jects must be selected randomly from the group
of available prisoners who meet the characteris-
tics needed for that particular research project”
(45 CFR 46.305[4]).
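
The random selection described in this regulation can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name select_control_subjects, the study IDs, the pool size of 40, the number of controls, and the seed are all hypothetical and do not come from the regulations or from any actual study. The idea is simply that every eligible subject has the same chance of being chosen, and that a recorded seed lets a reviewer reproduce the draw.

```python
import random


def select_control_subjects(eligible_ids, n_controls, seed=None):
    """Draw n_controls subjects at random, without replacement.

    eligible_ids: identifiers of available subjects who meet the
    characteristics needed for the project (hypothetical study IDs).
    seed: optional value recorded so the same draw can be repeated.
    """
    pool = list(eligible_ids)
    if n_controls > len(pool):
        raise ValueError("cannot select more controls than eligible subjects")
    rng = random.Random(seed)  # separate generator; global state is untouched
    # sample() gives every eligible subject an equal chance of selection
    return sorted(rng.sample(pool, n_controls))


# Hypothetical pool of 40 eligible subjects, identified only by study ID.
eligible = [f"S{i:03d}" for i in range(1, 41)]
controls = select_control_subjects(eligible, n_controls=10, seed=2008)
print(controls)
```

Because the draw depends only on the pool and the seed, neither researchers nor prison staff choose who ends up in the control group, which is the point of requiring random selection.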

Institutional Review Board
Requirements and Researcher Rights
Federal regulations contain many more provi-
sions for IRBs and other protections for human
subjects. Some researchers believe that such

Figure 2.2 Excerpts from Checklist for Research Involving Prisoners

• Does the research entail any possible advantages accrued to the prisoner through their participation in the
research that impairs their ability to weigh the risk/benefits of the participation in the limited choice environment
that exists in a prison? This comparison is to be made with respect to the general living conditions, medical
care, amenities and earning opportunities which exist in a prison.

• Are the risks of the research commensurate with those that would be accepted by non-prisoner participants?
Provide Rationale below.

• Is the information presented to the prisoners in the Consent Form or Oral consent script done so in a language
understandable by the participants?

• Does the Consent Form explicitly state to the subject that “Do not tell us any information about past or
future crimes that are unknown to the authorities as we cannot guarantee confidentiality of that information.
Additionally, I [the researcher] must report to the authorities information you tell me about harming yourself or
other people, or any plans you have to escape.”

• Does adequate assurance exist that parole boards will not take into account participation in the research when
determining parole and that the prisoners were clearly informed of this prior to participation in the research?

• Describe what specific steps were taken to ensure that the Informed Consent Form includes information specific
to the prisoner subject population. This is not necessary if the research is limited to data analysis.

• Is the selection of prisoner research participants fair and equitable and immune from arbitrary intervention
by prisoner authorities? If not, has the PI provided sufficient justification for the implementation of alternative
procedures?

Source: Adapted from Rutgers University Office of Research and Sponsored Programs document, available at: http://orsprutgers.edu/Humans/default.php#general

Virtually all colleges and universities have
IRBs. Consult the Rutgers University website
(http://orsp.rutgers.edu/; accessed May 6, 2008)
for an example, or visit the IRB website at your
institution.

Ethical Controversies
Examples illustrate how ethics is a problem in jus-
tice research.

By way of illustrating the importance of ethics
principles, together with problems that may be
encountered in applying those principles, we
now describe a research project that provoked
widespread ethical controversy and discussion.
This is followed by further examples of ethical
questions for discussion.

The Stanford Prison Experiment
Few people would disagree that prisons are
dehumanizing. Inmates forfeit freedom, of
course, but their incarceration also results in a
loss of privacy and individual identity. Violence

There is some merit in such concerns; how-
ever, we should not lose sight of the reasons
IRB requirements and other regulations were
created. Researchers are not always the best
judges of the potential for their work to harm
individuals. In designing and conducting crim-
inal justice research, they may become excited
about learning how to better prevent crime or
more effectively treat cocaine addiction. That
excitement and commitment to scientific advancement
may lead researchers to overlook
possible harms to individual rights or well-be-
ing. You may recognize this as another way of
asking whether the ends justify the means. Be-
cause researchers are not always disinterested
parties in answering such questions, IRBs are
established to provide outside judgments. Also
recognize that IRBs can be sources of expert ad-
vice on how to resolve ethical dilemmas. Decker
and Van Winkle shared their university’s concern
about balancing confidentiality against
the need to obtain informed consent from juve-
nile subjects; together, they were able to fashion
a workable compromise.

ETHICS AND JUVENILE
GANG MEMBERS

Scott Decker and Barrik Van Winkle
faced a range of ethical issues in their study of gang
members. Many of these should be obvious given
what has been said so far in this chapter. Violence
was common among subjects and presented a real
risk to researchers. Decker and Van Winkle (1996,
252) reported that 11 of the 99 members of the
original sample had been killed since the project
began in 1990. There was also the obvious need
to assure confidentiality to subjects.

Their project was supported by a federal
agency and administered through a university, so
Decker and Van Winkle had to comply with fed-
eral human subjects guidelines as administered
by the university institutional review board (IRB).
And because many of the subjects were juveniles,
they had to address federal regulations concern-

ing that special population. Foremost among
these was the normal requirement that informed
consent for juveniles include parental notification
and approval.

You may immediately recognize the conflicting
ethical principles at work here, together with the
potential for conflict. The promise of confidentiality
to gang members is one such principle that was
essential for the researchers to obtain candid reports
of violence and other law-breaking behavior.
But the need for confidentiality conflicted with initial
IRB requirements to obtain parental consent
for their children to participate in the research:

This would have violated our commitment
to maintain the confidentiality of
each subject, not to mention the ethical
and practical difficulties of finding and informing
each parent. We told the Human
Subjects Committee that we would not,

was constructed in the basement of a psychol-
ogy department building: three 6-foot by 9-foot
“cells” furnished with only a cot, a prison “yard”
in a corridor, and a 2-foot by 7-foot “solitary
confinement cell.” Twenty-one subjects were
selected from 75 volunteers after screening to
eliminate those with physical or psychological
problems. Offered $15 per day for their partici-
pation, the 21 subjects were randomly assigned
to be either guards or prisoners.

All subjects signed contracts that included
instructions about prisoner and guard roles
for the planned two-week experiment. “Prisoners”
were told that they would be confined and
under surveillance throughout the experiment,
and their civil rights would be suspended; they
were, however, guaranteed that they would not
be physically abused.

“Guards” were given minimal instructions,
most notably that physical aggression or physi-
cal punishment of “prisoners” was prohibited.
Together with a “warden,” however, they were
generally free to develop prison rules and pro-
cedures. The researchers planned to study how

is among the realities of prison life that people
point to as evidence of the failure of prisons to
rehabilitate inmates.

Although the problems of prisons have
many sources, psychologists Craig Haney,
Curtis Banks, and Philip Zimbardo (1973) were
interested in two general explanations. The first
was the dispositional hypothesis—prisons are
brutal and dehumanizing because of the types
of people who run them and are incarcerated in
them. Inmates have demonstrated their disre-
spect for legal order and their willingness to use
deceit and violence; persons who work as prison
guards may be disproportionately authoritar-
ian and sadistic. The second was the situational
hypothesis—the prison environment itself cre-
ates brutal, dehumanizing conditions indepen-
dent of the kinds of people who live and work
in the institutions.

Haney and associates set out to test the situ-
ational hypothesis by creating a functional
prison simulation in which healthy, psychologi-
cally normal male college students were assigned
to roles as prisoners and guards. The “prison”

in effect, tell parents that their child was
being interviewed because they were an
active gang member, knowledge that the
parents may not have had. (Decker and
Van Winkle 1996, 52)

You might think deception would be a
possibility—informing parents that their child was
selected for a youth development study, for exam-
ple. This would not, however, solve the logistical
difficulty of locating parents or guardians, some
of whom had lost contact with their children.
Furthermore, it was likely that even if parents or
guardians could be located, suspicions about the
research project and the reasons their children
were selected would prevent many parents from
granting consent. Loss of juvenile subjects in this
way would compromise the norm of generality as
we have described it in this chapter and elsewhere.

Finally, waiving the requirement for parental

consent would have undermined the legal princi-
ple that the interests of juveniles must be protected
by a supervising adult. Remember that researchers
are not always the best judges of whether sufficient
precautions have been taken to protect subjects.
Here is how Decker and Van Winkle (1996,
52) resolved the issue with their IRB:

We reached a compromise in which we
found an advocate for each juvenile mem-
ber of our sample; this person—a univer-
sity employee—was responsible for making
sure that the subject understood (1) their
rights to refuse or quit the interview at any
time without penalty and (2) the confidential
nature of the project. All subjects
signed the consent form.

Source: Adapted from Decker and Van Winkle
(1996).

Subjects in each group accepted their roles
all too readily. Prisoners and guards could in-
teract with each other in friendly ways because
guards had the power to make prison rules. But
interactions turned out to be overwhelmingly
hostile and negative. Guards became aggressive,
and prisoners became passive. When the experi-
ment ended prematurely, prisoners were happy
about their early “parole,” but guards were dis-
appointed that the study would not continue.

Haney and colleagues justify the prison sim-
ulation study in part by claiming that the dis-
positional/situational hypotheses could not be
evaluated using other research designs. Clearly,
the researchers were sensitive to ethical issues.
They obtained subjects’ consent to the experi-
ment through signed contracts. Prisoners who
showed signs of acute distress were released
early. The entire study was terminated after less
than half of the planned two weeks had elapsed
when its unexpectedly harsh impact on subjects
became evident. Finally, researchers conducted
group therapy debriefing sessions with prisoners
and guards and maintained follow-up contacts
for a year to ensure that subjects’ negative
experiences were temporary.

Two related features of this experiment raise
ethical questions, however. First, subjects were
not fully informed of the procedures. Although
we have seen that deception, including something
less than full disclosure, can often be justified,
in this case deception was partially due
to the researchers’ uncertainty about how the
prison simulation would unfold. This relates to
the second and more important ethical prob-
lem: guards were granted the power to make up
and modify rules as the study progressed, and
their behavior became increasingly authori-
tarian. Comments by guards illustrate their
reactions as the experiment unfolded (Haney,
Banks, and Zimbardo 1973, 88):

“They [the prisoners] didn’t see it as an experiment.
It was real and they were fighting
to keep their identity. But we were always
there to show them just who was boss.”

both guards and prisoners reacted to their roles,
but guards were led to believe that the purpose
of the experiment was to study prisoners.

If you had been a prisoner in this experi-
ment, you would have experienced something
like the following after signing your contract:
First, you would have been arrested without
notice at your home by a real police officer,
perhaps with neighbors looking on. After be-
ing searched and taken to the police station in
handcuffs, you would have been booked, fingerprinted,
and placed in a police detention facility.
Next, you would have been blindfolded and
driven to “prison,” where you would have been
stripped, sprayed with a delousing solution,
and left to stand naked for a period of time in
the “prison yard.” Eventually, you would have
been issued a prison uniform (a loose overshirt
stamped with your ID number), fitted with an
ankle chain, led to your cell, and ordered to re-
main silent. Your prison term would then have
been totally controlled by the guards.

Wearing mirrored sunglasses, khaki uni-
forms, and badges and carrying nightsticks,
guards supervised prisoner work assignments
and held lineups three times per day. Although
lineups initially lasted only a few minutes, guards
later extended them to several hours. Prison-
ers were fed bland meals and accompanied by
guards on three authorized toilet visits per day.

The behavior of all subjects in the prison
yard and other open areas was videotaped; au-
diotapes were made continuously while prison-
ers were in their cells. Researchers administered
brief questionnaires throughout the experi-
ment to assess emotional changes in prisoners
and guards. About four weeks after the experi-
ment concluded, researchers conducted inter-
views with all subjects to assess their reactions.

Haney and associates (1973, 88) had planned
to run the prison experiment for two weeks,
but they halted the study after six days because
subjects displayed “unexpectedly intense reac-
tions.” Five prisoners had to be released even
before that time because they showed signs of
acute depression or anxiety.

chapters in this book and whenever you plan a
research project.

To further sensitize you to the ethical com-
ponent of criminal justice and other social re-
search, we’ve prepared brief descriptions of
real and hypothetical research situations. Can
you see the ethical issue in each? How do you
feel about it? Are the procedures described ul-
timately acceptable or unacceptable? It would
be very useful to discuss these examples with
other students in your class.

1. In a federally funded study of a probation
program, a researcher discovers that one
participant was involved in a murder while
on probation. Public disclosure of this inci-
dent might threaten the program that the
researcher believes, from all evidence, is beneficial.
Judging the murder to be an anomaly,
the researcher does not disclose it to federal
sponsors or describe it in published reports.

2. As part of a course on domestic violence, a
professor requires students to telephone
a domestic violence hotline, pretend to be
a victim, and request help. Students then
write up a description of the assistance of-
fered by hotline staff and turn it in to the
professor.

3. Studying aggression in bars and nightclubs,
a researcher records observations of a savage
fight in which three people are seriously
injured. Ignoring pleas for help from
one of the victims, the researcher retreats
to a restroom to write up notes from these
observations.

4. In a study of state police, researchers learn
that officers have been instructed by superiors
to “not sign anything.” Fearing that asking
officers to sign informed consent statements
will sharply reduce participation,
researchers seek some other way to satisfy
their university IRB. What should they do?

5. In the example mentioned in the section
“Staff Misbehavior” (page xx), the researchers
disclosed to public officials that police
were not making visits to probationers as

“During the inspection, I went to cell 2
to mess up a bed which the prisoner had
made and he grabbed me, screaming that
he had just made it. . . . He grabbed my
throat, and although he was laughing, I
was pretty scared. I lashed out with my
stick and hit him in the chin (although
not very hard), and when I freed myself I
became angry.”
“Acting authoritatively can be fun.
Power can be a great pleasure.”

How do you feel about this experiment? On
the one hand, it provided valuable insights into
how otherwise normal people react in a simu-
lated prison environment. Subjects appeared
to suffer no long-term harm, in part because
of precautions taken by researchers. Paul Reyn-
olds (1979, 139) found a certain irony in the
short-term discomforts endured by the college
student subjects: “There is evidence that the
major burdens were borne by individuals from
advantaged social categories and that the major
benefactors would be individuals from less ad-
vantaged social categories [actual prisoners], an
uneven distribution of costs and benefits that
many nevertheless consider equitable.” On the
other hand, researchers did not anticipate how
much and how quickly subjects would accept
their roles. In discussing their findings, Haney
and associates (1973, 90) note: “Our results
are . . . congruent with those of Milgram who
most convincingly demonstrated the proposi-
tion that evil acts are not necessarily the deeds
of evil men, but may be attributable to the op-
eration of powerful social forces.” This quote il-
lustrates the fundamental dilemma—balancing
the right to conduct research against the rights
of subjects. Is it ethical for researchers to create
powerful social forces that lead to evil acts?

Discussion Examples
Research ethics is an important and ambiguous
topic. The difficulty of resolving ethical issues
cannot be an excuse for ignoring them. You
need to keep ethics in mind as you read other


the document carefully. How would the code
apply to the Stanford prison simulation?

2. Discuss the general trade-offs between the requirements
of sound scientific research methods
and the need to protect human subjects.
Where do tensions exist? Cite illustrations of
tensions from two or more examples of ethical
issues presented in this chapter.

3. Review the box “Ethics and Juvenile Gang Mem-
bers” on page 42. Although it is not shown in
the box, Decker and Van Winkle developed an
informed consent form for their subjects. Keep-
ing in mind the various ethical principles dis-
cussed in this chapter, try your hand at prepar-
ing an informed consent statement that Decker
and Van Winkle might have used.

✪ Additional Readings
Als-Nielsen, Bodil, Wendong Chen, Christian
Gluud, and Lise L. Kjaergard, “Association of
Funding and Conclusions in Randomized Drug
Trials: A Reflection of Treatment Effect or Adverse
Events?” Journal of the American Medical Association
290(7, August 2003): 921–927. A brief,
interesting, and nontechnical analysis of research
on the effects of new drugs. The authors
find that research sponsored by drug companies
is much more likely to find that drugs are
effective, compared to research sponsored by
nonprofit organizations. What do you make of
that?

American Association of University Professors, Re-
search on Human Subjects: Academic Freedom and
the Institutional Review Board (2006), report from
the Committee on Academic Freedom and Ten-
ure (Washington, DC: American Association
of University Professors, 2006; www.aaup.org/
AAUP/comm/rep/A/humansubs.htm; accessed
May 5, 2008). Largely in response to concern
about overly restrictive institutional review
boards, the AAUP convened a series of meetings
with representatives from major social science
professional groups; this report summarizes
the discussion. The most interesting sections
address the expansion of human subjects’ pro-
tections in social science and even historical
research.

Committee on Science, Engineering, and Public
Policy, On Being a Scientist: Responsible Conduct
in Research, 2nd ed. (Washington, DC: National
Academy Press, 1995). This monograph covers

called for in the program intervention. Pub-
lished reports describe the problem as “ir-
regularities in program delivery.”

✪ Main Points
• In addition to technical and scientific considerations,
criminal justice research projects are
shaped by ethical considerations.

• What’s ethically “right” and “wrong” in research
is ultimately a matter of what people agree is
right and wrong.

• Researchers tend not to be the best judges of
whether their own work adequately addresses
ethical issues.

• Most ethical questions involve weighing the
possible benefits of research against the potential
for harm to research subjects.

• Criminal justice research may generate special
ethical questions, including the potential for le-
gal liability and physical harm.

• Scientists agree that participation in research
should, in general, be voluntary. This norm,
however, can conflict with the scientific need
for generalizability.

• Most scientists agree that research should not
harm subjects unless they willingly and know-
ingly accept the risks of harm.

• Anonymity and confidentiality are two ways to
protect the privacy of research subjects.

• Compliance with ethical principles is promoted
by professional associations and by regulations
issued by the Department of Health and Hu-
man Services (HHS).

• HHS regulations include special provisions
for two types of subjects of particular interest
to many criminal justice researchers: prisoners
and juveniles.

• Institutional review boards (IRBs) play an im-
portant role in ensuring that the rights and
interests of human subjects are protected. But
some social science researchers believe that
IRBs are becoming too restrictive.

✪ Key Terms
anonymity, p. 32
confidentiality, p. 32

✪ Review Questions and Exercises
1. Obtain a copy of the Academy of Criminal Justice
Sciences’ (2000) code of ethics at this website:
www.acjs.org (accessed May 6, 2008). Read



describes the dangers and depressing realities
of field research in a crack house. Should a field
researcher intervene when witnessing a gang
rape in a crack house? Read this selection for
Inciardi’s answer.

Kalichman, Seth C., Mandated Reporting of Sus-
pected Child Abuse: Ethics, Law, and Policy, 2nd
ed. (Washington, DC: American Psychological
Association, 2000). This report presents an ex-
tensive discussion of legal and ethical issues in
conducting research on child abuse.

a range of issues, including research fraud and
other misconduct by researchers. Although
many of the issues are specific to the natural
sciences, criminologists will find much valuable
material.

Inciardi, James A., “Some Considerations on the
Methods, Dangers, and Ethics of Crack-House
Research,” Appendix A in James A. Inciardi, Dor-
othy Lockwood, and Anne E. Pettieger, Women
and Crack Cocaine (New York: Macmillan, 1993),
pp. 147–157. In this thoughtful essay, Inciardi


Part Two

Structuring Criminal
Justice Inquiry

Posing questions properly is often more difficult
than answering them. Indeed, a properly
phrased question often seems to answer
itself. We sometimes discover the answer to
a question in the very process of clarifying
the question for someone else.

At its base, scientific research is a process
for achieving generalized understanding
through observation. Part Three will describe
some of the specific methods of observation
for criminal justice research. But first,
Part Two deals with the posing of proper
questions, the structuring of inquiry.

Chapter 3 addresses some of the funda-
mental issues that must be considered in
planning a research project. It examines
questions of causation, the units of analysis
in a research project, the important role of
time, and the kinds of things we must con-
sider in proposing to do research projects.

Chapter 4 deals with the specification
of what it is we want to study (a process
known as conceptualization) and the measurement
of the concepts we specify. We’ll
look at some of the terms that we use casu-
ally in everyday life, and we’ll see how es-
sential it is to be clear about what we really
mean by such terms when we do research.
Once we are clear on what we mean when
we use certain terms, we are in a position to
create measurements of what those terms
refer to. The process of devising steps or
operations for measuring what we want to
study is known as operationalization. By
way of illustrating this process, Chapter 4
includes an extended discussion of different
approaches to measuring crime.

Chapter 5 concentrates on the general
design of a criminal justice research project.
A criminal justice research design specifies
a strategy for finding out something, for
structuring a research project. Chapter 5 describes
commonly used strategies for experimental
and quasi-experimental research.
Each is adapted in some way from the classical
scientific experiment.


Chapter 3

General Issues in Research Design
Here we’ll examine some fundamental principles about conducting empiri-
cal research: (1) causation and (2) variations on who or what is to be studied
and when and how to do the studies. We’ll also take a broad overview of the
research process.

Introduction
Causation in the Social Sciences
Criteria for Causality
Necessary and Sufficient Causes
Validity and Causal Inference
Statistical Conclusion Validity
Internal Validity
External Validity
Construct Validity
Validity and Causal Inference Summarized
Does Drug Use Cause Crime?
CAUSATION AND DECLINING CRIME IN NEW YORK CITY
Introducing Scientific Realism
Units of Analysis
Individuals
Groups
Organizations
Social Artifacts
The Ecological Fallacy
Units of Analysis in Review
UNITS OF ANALYSIS IN THE NATIONAL YOUTH GANG SURVEY
The Time Dimension
Cross-Sectional Studies


Introduction
Causation, units, and time are key elements in
planning a research study.

Science is an enterprise dedicated to “finding
out.” No matter what we want to find out,
though, there are likely to be a great many ways
of going about it. Topics examined in this chapter
address how to plan scientific inquiry: how
to design a strategy for finding out something.
Often criminal justice researchers want to find
out something that involves questions of cause
and effect. They may want to find out more
about things that make crime more likely to occur
or about policies that they hope will reduce
crime in some way.

In practice, all aspects of research design are
interrelated. They are separated here and in sub-
sequent chapters so that we can explore partic-
ular topics in detail. We start with a discussion
of causation in social science, the foundation of
explanatory research. We then examine units of
analysis—the what or whom to study. Deciding
on units of analysis is an important part of all
research, partly because people sometimes inap-
propriately use data measuring one type of unit
to say something about a different type of unit.

Next, we consider alternative ways of handling
time in criminal justice research. It is
sometimes appropriate to examine a static
cross section of social life, but other studies
follow social processes over time. In this regard,
researchers must consider the time order
of events and processes in making statements
about cause.

We then provide a brief overview of the
whole research process. This serves two pur-
poses: (1) it provides a map to the remainder of
this book, and (2) it conveys a sense of how re-
searchers go about designing a study.

The chapter concludes with guidelines for
preparing a research proposal. Often the actual
conduct of research needs to be preceded by a
detailed plan of our intentions—perhaps to ob-
tain funding for a major project or to get an in-
structor’s approval for a class assignment. We’ll
see that preparing a research proposal offers an
excellent way to ensure that we have considered
all aspects of our research in advance.

Causation in the
Social Sciences
Causation is the focus of explanatory research.

Cause and effect are implicit in much of what
we have examined so far. One of the chief goals
of social scientific researchers is to explain why

Longitudinal Studies
Approximating Longitudinal Studies
The Time Dimension Summarized
How to Design a Research Project
The Research Process
Getting Started
Conceptualization
Choice of Research Method
Operationalization
Population and Sampling
Observations
Analysis
Application
Research Design in Review
The Research Proposal
Elements of a Research Proposal


scribed by William Shadish, Thomas Cook, and
Donald Campbell (2002). The first requirement
in a causal relationship between two variables is
that the cause precede the effect in time. It makes
no sense to imagine something being caused by
something else that happened later on. A bul-
let leaving the muzzle of a gun does not cause
the gunpowder to explode; it works the other
way around. As simple and obvious as this crite-
rion may seem, criminal justice research suffers
many problems in this regard. Often the time or-
der connecting two variables is simply unclear.
Which comes first: drug use or crime? And even
when the time order seems clear, exceptions may
be found. For example, we normally assume
that obtaining a master’s degree in management
is a cause of more rapid advancement in a state
department of corrections. Yet corrections ex-
ecutives might pursue graduate education after
they have been promoted and recognize that ad-
vanced training in management skills will help
them do their job better.

The second requirement in a causal relation-
ship is that the two variables be empirically cor-
related with each other—they must occur to-
gether. It makes no sense to say that exploding
gunpowder causes a bullet to leave the muzzle
of a gun if, in observed reality, a bullet does not
come out after the gunpowder explodes.

Again, criminal justice research has difficulties
with this requirement. In the probabilistic
world of nomothetic models of explanation,
at least, we encounter few perfect correlations.
Most judges sentence repeat drug dealers to
prison, but some don’t. We are forced to ask,
therefore, how strong the empirical relation-
ship must be for that relationship to be consid-
ered causal.

The third requirement for a causal relation-
ship is that the observed empirical correlation
between two variables cannot be explained away
as being due to the influence of some third variable
that causes both of them. For example,
we may observe that drug markets are often
found near bus stops, but this does not mean
that bus stops encourage drug markets. A third
variable is at work here: groups of people natu-

things are the way they are. Typically we do that
by specifying the causes for the way things are:
some things are caused by other things.

Much of our discussion in this section de-
scribes issues of causation and validity for so-
cial science in general. Recall from Chapter 1
that criminal justice research and theory are
most strongly rooted in the social sciences. Furthermore,
social scientific research methods are
adapted from those used in the physical sciences.
Many important and difficult questions
about causality and validity occupy researchers
in criminal justice, but our basic approach re-
quires stepping back a bit to consider the larger
picture of how we can or cannot assert that
some cause produces some effect.

At the outset, it’s important to keep in mind
that cause in social science is inherently prob-
abilistic, a point we introduced in Chapter 1.
We say, for example, that certain factors make
delinquency more or less likely. Thus, victims
of childhood abuse or neglect are more likely
to report alcohol abuse as adults (Schuck and
Widom 2001). Recidivism is less likely among
offenders who receive more careful assessment
and classification at institutional intake (Cullen
and Gendreau 2000).

Criteria for Causality
We begin our consideration of cause by examining
what criteria must be satisfied before we
can infer that something causes something else.
Joseph Maxwell (2005, 106–107) writes that
criteria for assessing an idiographic explana-
tion are (1) how credible and believable it is and
(2) whether alternative explanations (“rival hy-
potheses”) were seriously considered and found
wanting. The first criterion relates to logic as
one of the foundations of science. We demand
that our explanations make sense, even if the
logic is sometimes complex. The second crite-
rion reminds us of Sherlock Holmes’s dictum
that when all other possibilities have been elim-
inated the remaining explanation, however im-
probable, must be the truth.

Regarding nomothetic explanation, we examine
three specific criteria for causality, as de-


with are probabilistic and partial—we are able
to partly explain cause and effect in some per-
centage of cases we observe.

Validity and Causal Inference
Scientists assess the truth of statements about cause
by considering threats to validity.

Paying careful attention to cause-and-effect
relationships is crucial in criminal justice re-
search. Cause and effect are also key elements of
applied studies, in which a researcher may be in-
terested in whether, for example, a new manda-
tory sentencing law actually causes an increase
in the prison population.

When we are concerned with whether we
are correct in inferring that a cause produced
an effect, we are concerned with the validity
of causal inference. In the words of Shadish,
Cook, and Campbell (2002, 34), validity is “the
approximate truth of an inference. . . . When
we say something is valid, we make a judgment
about the extent to which relevant evidence sup-
ports that inference as being true or correct.”
They emphasize that approximate is an impor-
tant word because one can never be absolutely
certain about cause.

Our next concern is a number of different
validity threats in causal inference—reasons
we might be incorrect in stating that some
cause produced some effect. As Maxwell (2005,
106) puts it, “A key concept for validity is thus
the validity threat: a way you might be wrong”
(emphasis in original). Here we will summarize
the threats to four general categories of valid-
ity: statistical conclusion validity, internal va-
lidity, construct validity, and external validity.
Chapter 5 discusses each type in more detail,
linking the issue of validity to different ways of
designing research.

Statistical Conclusion Validity
Statistical conclusion validity refers to our
ability to determine whether a change in the
suspected cause is statistically associated with a
change in the suspected effect. This corresponds

rally congregate near bus stops, and street drug
markets are often found where people naturally
congregate.
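
The third criterion can be made concrete with a small simulation. The sketch below is purely illustrative and uses invented data: it assumes a hypothetical variable `foot_traffic` that drives both `bus_stops` and `drug_markets`, with neither of the latter affecting the other. The variable names, the unit-variance noise, and the sample size are all assumptions chosen for the example, not measurements from any study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear influence of zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical data: foot traffic influences both other variables;
# bus stops and drug markets have no direct effect on each other.
rng = random.Random(0)
n_places = 5000
foot_traffic = [rng.gauss(0, 1) for _ in range(n_places)]
bus_stops = [t + rng.gauss(0, 1) for t in foot_traffic]
drug_markets = [t + rng.gauss(0, 1) for t in foot_traffic]

raw = pearson(bus_stops, drug_markets)
controlled = partial_corr(bus_stops, drug_markets, foot_traffic)
print(f"raw correlation: {raw:.2f}")
print(f"controlling for foot traffic: {controlled:.2f}")
```

Under these assumptions the raw correlation should come out clearly positive (about 0.5 in expectation), while the correlation that controls for foot traffic should be close to zero. That pattern is what "explained away by a third variable" means in practice.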

To sum up, most social researchers consider
two variables to be causally related— one causes
the other—if (1) the cause precedes the effect
in time, (2) there is an empirical correlation
between them, and (3) the relationship is not
found to result from the effects of some third
variable on each of the two initially observed.
Any relationship that satisfies all these criteria
is causal, and these are the only criteria that
need to be satisfied.

Necessary and Sufficient Causes
Recognizing that virtually all causal relation-
ships in criminal justice are probabilistic is
central to understanding other points about
cause. Within the probabilistic model, it is useful
to distinguish two types of causes: necessary
and sufficient causes. A necessary cause is
a condition that, by and large, must be present
for the effect to follow. For example, it is neces-
sary for someone to be charged with a criminal
offense to be convicted, but being charged is
not enough; you must plead guilty or be found
guilty by the court. Figure 3.1 illustrates that
relationship.

A sufficient cause, in contrast, is a condition
that more or less guarantees the effect in question.
Pleading guilty to a criminal charge is a
sufficient cause for being convicted, although
you could be convicted through a trial as well.
Figure 3.2 illustrates this state of affairs.
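
The two definitions can be expressed as simple checks over a set of cases. The sketch below is a strict, all-or-nothing version of ideas the text describes as holding "by and large" or "more or less"; the four defendants and the field names (`charged`, `pleaded_guilty`, `convicted`) are hypothetical and exist only to mirror the charging and pleading examples above.

```python
def is_necessary(cases, cause, effect):
    """True if the effect never occurs without the cause being present."""
    return all(case[cause] for case in cases if case[effect])


def is_sufficient(cases, cause, effect):
    """True if every case with the cause also shows the effect."""
    return all(case[effect] for case in cases if case[cause])


# Hypothetical defendants, invented for illustration.
cases = [
    {"charged": True, "pleaded_guilty": True, "convicted": True},
    {"charged": True, "pleaded_guilty": False, "convicted": True},   # found guilty at trial
    {"charged": True, "pleaded_guilty": False, "convicted": False},  # acquitted
    {"charged": False, "pleaded_guilty": False, "convicted": False},
]

print(is_necessary(cases, "charged", "convicted"))          # True
print(is_sufficient(cases, "charged", "convicted"))         # False
print(is_sufficient(cases, "pleaded_guilty", "convicted"))  # True
print(is_necessary(cases, "pleaded_guilty", "convicted"))   # False
```

Being charged is necessary but not sufficient in these cases, and pleading guilty is sufficient but not necessary, which matches the two relationships described above.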

The discovery of one cause that is both
necessary and sufficient is the most satisfying
outcome in explanatory research. If we are
studying juvenile delinquency, we want to discover
a single condition that (1) has to be present
for delinquency to develop and (2) always
results in delinquency. Then we will surely feel
that we know precisely what causes juvenile delinquency.
Unfortunately, we seldom discover
causes that are both necessary and sufficient,
nor, in practice, are causes 100 percent necessary
or 100 percent sufficient. Most causal relationships
that criminal justice researchers work


Basing conclusions on a small number of
cases is a common threat to statistical conclu-
sion validity. Suppose a researcher studies 10
drug users and 10 nonusers and compares the
numbers of times these subjects are arrested
for other crimes over a six-month period. The
researcher might fi nd that the 10 users were ar-
rested an average of three times, whereas non-
users averaged two arrests in six months. There
is a difference in arrest rates, but is it a significant
difference? Statistically, the answer is no
because so few drug users were included in the
study. Researchers cannot have much confi-

with one of the first questions asked by researchers:
are two variables related to each other? If we
suspect that using illegal drugs causes people to
commit crimes, one of the first things we will be
interested in is the common variation between
drug use and crime. If drug users and nonus-
ers commit equal rates of crime and if about
the same proportions of criminals and non-
criminals use drugs, there will be no statistical
relationship between measures of drug use and
criminal offending. That seems to be the end of
our investigation of the causal relationship be-
tween drugs and crime.

Figure 3.1 Necessary Cause (outcomes: Convicted, Not Convicted; conditions: Charged, Not Charged)

Figure 3.2 Sufficient Cause (outcomes: Convicted, Not Convicted; conditions: Plead Guilty, Plead Innocent)


case, a third variable—prior convictions—may
explain some or all of the observed tendency
of prison sentences to be associated with re-
cidivism. Prior convictions are associated with
both sentence—prison or probation—and sub-
sequent convictions.
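
A short calculation shows how prior record alone could produce this pattern. The counts below are hypothetical, invented so that the sentence itself has no effect on rearrest within either prior-record group; they are not drawn from Farrington and colleagues or any other study. The sketch assumes only that offenders without priors mostly receive probation and that offenders with priors mostly receive prison, as the text describes.

```python
# Hypothetical counts: (sentence, prior record) -> (offenders, rearrested).
# Within each prior-record group the rearrest rate is the same
# for probation and prison.
counts = {
    ("probation", "no prior"): (800, 160),
    ("probation", "prior"): (200, 120),
    ("prison", "no prior"): (200, 40),
    ("prison", "prior"): (800, 480),
}


def rearrest_rate(sentence, prior=None):
    """Rearrest rate for a sentence type, optionally within one prior-record group."""
    total = rearrested = 0
    for (s, p), (n, r) in counts.items():
        if s == sentence and (prior is None or p == prior):
            total += n
            rearrested += r
    return rearrested / total


for sentence in ("probation", "prison"):
    print(f"{sentence} overall: {rearrest_rate(sentence):.2f}")
    for prior in ("no prior", "prior"):
        print(f"  {prior}: {rearrest_rate(sentence, prior):.2f}")
```

With these invented numbers, prison appears to nearly double the overall rearrest rate (0.52 versus 0.28), yet within each prior-record group the rates are identical (0.20 and 0.60). Comparing like with like removes the apparent effect of the sentence, which is the kind of check an internal validity threat calls for.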

External Validity
Are findings about the impact of mandatory arrest
for family violence in Minneapolis similar to
findings in Milwaukee? Can community crime
prevention organizations successfully combat
drug use throughout a city, or do they work best
in areas with only minor drug problems? Elec-
tronic monitoring may be suitable as an alter-
native sentence for convicted offenders, but can
it work as an alternative to jail for defendants
awaiting trial? Such questions are examples of
issues in external validity: do research findings
about cause and effect apply equally to different
cities, neighborhoods, and populations?

In a general sense, external validity is concerned
with whether research findings from one
study can be reproduced in another study, often
under different conditions. Because crime problems
and criminal justice responses can vary so
much from city to city or from state to state, researchers
and public officials are often especially
interested in external validity. For example, a
Kansas City evaluation found sharp reductions
in gun-related crimes in hot spots that had been
targeted for focused police patrols (Sherman and
Rogan 1995). Because these results were promising,
similar projects were launched in two other
cities, Indianapolis (McGarrell et al. 2001) and
Pittsburgh (Cohen and Ludwig 2003). In both
cases, researchers found that police actions targeting
hot spots for gun violence reduced gun-related
crime and increased seizures of illegal
firearms. Having similar findings in Indianapolis
and Pittsburgh enhanced the external validity
of original results from Kansas City.

Construct Validity
This type of validity is concerned with how
well an observed relationship between variables

dence in statements about cause if their findings
are based on a small number of cases.
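
The small-sample problem can be sketched with a permutation test, one of several ways to ask whether a difference in means could plausibly arise by chance. Everything here is an assumption made for illustration: the arrest counts are invented so that the group means are 3 and 2 as in the example above, the function name `permutation_p_value` is hypothetical, and the larger sample is simply the same invented values repeated 20 times.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        # Reassign subjects to groups at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical six-month arrest counts (means of 3 and 2).
users = [0, 1, 1, 2, 3, 3, 4, 4, 5, 7]
nonusers = [0, 0, 1, 1, 2, 2, 2, 3, 4, 5]

p_small = permutation_p_value(users, nonusers)
p_large = permutation_p_value(users * 20, nonusers * 20)
print(f"10 per group:  p = {p_small:.3f}")
print(f"200 per group: p = {p_large:.4f}")
```

With 10 subjects per group, random reshuffling produces a one-arrest gap often enough that the difference would not count as significant at the conventional 0.05 level. With 200 per group and the same averages, such a gap almost never arises by chance. The size of the difference is unchanged; only the amount of evidence behind it differs.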

Threats to statistical conclusion validity
might also have the opposite effect of suggest-
ing that covariation is present when, in fact,
there is no cause-and-effect relationship. The
reasons for this are again somewhat technical
and require a basic understanding of statisti-
cal inference. But the basic principle is based on
chance variation—sometimes what appears to
be a relationship simply occurs by chance.

Internal Validity
Internal validity threats challenge causal
statements that are based on an observed rela-
tionship. An observed association between two
variables has internal validity if the relation-
ship is, in fact, causal and not due to the effects
of one or more other variables. Whereas statis-
tical conclusion threats are most often due to
random error, internal validity problems result
from nonrandom or systematic error. Threats
to the internal validity of a proposed causal re-
lationship between two indicators usually arise
from the effects of one or more other variables.
Notice how this validity threat relates to the
third requirement for establishing a causal re-
lationship: eliminating other possible explana-
tions for the observed relationship.

If we observe that convicted drug users sen-
tenced to probation are rearrested less often
than drug users sentenced to prison, we might
be tempted to infer that prison sentences cause
recidivism. Although being in prison might
have some impact on whether someone com-
mits more crimes in the future, in this case it
is important to look for other causes of recidi-
vism. One likely candidate is prior criminal re-
cord. Convicted drug users without prior crimi-
nal records are more likely to be sentenced to
probation, whereas persons with previous con-
victions more often receive prison terms. Re-
search on criminal careers has found that the
probability of reoffending increases with the
number of prior arrests and convictions (Far-
rington, Jolliffe, Hawkins, et al. 2003). In this


further that withdrawing preventive patrol, as
was done in the reactive beats, reduces the vis-
ibility of police. But by how much? Larson ex-
plored this question in detail and suggested
that two other features of police operations
during the Kansas City experiment partially
compensated for the absence of preventive pa-
trol and produced a visible police presence.

First, the different types of experimental
beats were adjacent to one another; one reac-
tive beat shared borders with three control and
three proactive beats. This enhanced the vis-
ibility of police in reactive beats in two ways:
(1) police in adjoining proactive and control
beats sometimes drove around the perimeter
of reactive beats, and (2) police often drove
through reactive beats on their way to some
other part of the city.

Second, many Kansas City police officers
were skeptical about the experiment and feared
that withdrawing preventive patrol in reactive
beats would create problems. Partly as a result,
police who responded to calls for service in the
reactive areas more frequently used lights and
sirens when driving to the location of com-
plaints. A related action was that police units
not assigned to the calls for service nevertheless
drove into the reactive beats to provide backup
service.

Each of these actions produced a visible po-
lice presence in the reactive beats. People who
lived in these areas were unaware of the experi-
ment and, as you might expect, did not know
whether a police car happened to be present be-
cause it was on routine patrol, was on its way
to some other part of the city, or was respond-
ing to a call for assistance. And, of course, the
use of lights and sirens makes police cars much
more visible.

Larson’s point was that the construct of po-
lice visibility is only partly represented by rou-
tine preventive patrol. A visible police presence
was produced in Kansas City through other
means. Therefore the researchers’ conclusion
that routine preventive patrol does not cause
a reduction in crime or an increase in arrests
suffers from threats to construct validity. Con-

represents the underlying causal process of in-
terest. Construct validity refers to generaliz-
ing from what we observe and measure to the
real-world things in which we are interested.
The concept of construct validity is thus closely
related to issues in measurement, as we will see
in Chapter 4.

To illustrate construct validity, let’s consider
the supervision of police officers, specifically,
whether close supervision causes police officers
to write more traffic tickets. We might define
close supervision in this way: a police sergeant
drives his own marked police car in such a way
as to always keep a patrol car in view.

This certainly qualifies as close supervision,
but you may recognize a couple of problems.
First, two marked patrol cars present a highly
visible presence to motorists, who might drive
more prudently and thus reduce the opportunities
for patrol officers to write traffic tickets.
Second, and more central to the issue of construct
validity, this represents a narrow definition
of the construct close supervision. Patrol officers
may be closely supervised in other ways
that cause them to write more traffic tickets.
For example, sergeants might closely supervise
their officers by reviewing their ticket production
at the end of each shift. Supervising subordinates
by keeping them in view is only one way
of exercising control over their behavior. It may
be appropriate for factory workers, but it is not
practical for police, representing a very limited
version of the construct supervision.

The well-known Kansas City Preventive Pa-
trol Experiment, discussed in Chapter 1, pro-
vides another example of construct validity
problems. Recall that the experiment sought
to determine whether routine preventive patrol
caused reductions in crime and fear of crime and
increases in arrests. Richard Larson (1975) discussed
several difficulties with the experiment’s
design. One important problem relates to the
visibility of police presence, a central concept in
preventive patrol. It is safe to assume that the
ability of routine patrol to prevent crime and
enhance feelings of safety depends crucially on
the visibility of police. It makes sense to assume


Does Drug Use Cause Crime?
As a way of illustrating issues of validity and
causal inference, we will consider the relationship
between drug use and crime. Drug addiction
is thought to cause people who are desperate
for a fix and unable to secure legitimate
income to commit crimes to support their habit.

Discussing the validity of causal statements
about drug use and crime requires carefully
specifying two key concepts— drug use and
crime—and considering the different ways these
concepts might be related. Jan Chaiken and
Marcia Chaiken (1990) provide unusually care-
ful and well-reasoned insights that will guide
our consideration of links between drugs and
crime.

First is the question of temporal order: which
comes first, drug use or crime? Research summarized
by Chaiken and Chaiken provides no
conclusive answer. In an earlier study of prison
inmates, Chaiken and Chaiken (1982) found
that 12 percent of their adult subjects com-
mitted crimes after using drugs for at least two
years, whereas 15 percent committed predatory
crimes two or more years before using drugs.
Studies of juveniles revealed similar findings:
“About 50 percent of delinquent youngsters are
delinquent before they start using drugs; about
50 percent start concurrently or after” (Chai-
ken and Chaiken 1990, 235).

Many studies have found that some drug
users commit crimes and that some criminals
use drugs, but Chaiken and Chaiken (1990,
234) conclude that “drug use and crime par-
ticipation are weakly related as contemporane-
ous products of factors generally antithetical
to traditional United States lifestyles.” Stated
somewhat differently, drug use and crime (as
well as delinquency) are each deviant activities
produced by other underlying causes. A statis-
tical association between drug use and crime
clearly exists. But the presence of other factors
indicates that the relationship is not directly
causal, thus bringing into question the inter-
nal validity of causal statements about drug use
and crime.
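The idea that drug use and crime can be statistically associated without one causing the other can be sketched with invented numbers. In the hypothetical counts below, an underlying factor (labeled `risk`, an assumption made purely for illustration and not a variable from Chaiken and Chaiken) raises the rates of both behaviors. Pooled together, drug users appear far more likely to offend, yet within each risk group the offending rate is the same for users and nonusers.

```python
# Hypothetical counts, invented for illustration only.
# Each row: (risk group, uses drugs?, number of people, number who offend)
rows = [
    ("high", True, 80, 40),
    ("high", False, 20, 10),
    ("low", True, 20, 2),
    ("low", False, 80, 8),
]


def offending_rate(records):
    """Share of the people in the given rows who offend."""
    people = sum(n for _, _, n, _ in records)
    offenders = sum(k for _, _, _, k in records)
    return offenders / people


def rate_by_use(records, uses_drugs):
    """Offending rate among users (True) or nonusers (False)."""
    return offending_rate([r for r in records if r[1] == uses_drugs])


# Pooled comparison: drug users look much more crime-prone.
pooled_users = rate_by_use(rows, True)
pooled_nonusers = rate_by_use(rows, False)

# Comparison within each risk group: the difference disappears.
within = {}
for group in ("high", "low"):
    subset = [r for r in rows if r[0] == group]
    within[group] = (rate_by_use(subset, True), rate_by_use(subset, False))

print(f"pooled: users {pooled_users:.2f}, nonusers {pooled_nonusers:.2f}")
for group, (users, nonusers) in within.items():
    print(f"{group} risk: users {users:.2f}, nonusers {nonusers:.2f}")
```

Real data are never this tidy, but the sketch shows why an observed association alone cannot settle the question of internal validity.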

struct validity is a frequent problem in applied
studies, in which researchers may oversimplify
complex policies and policy goals.

Validity and Causal Inference
Summarized
The four types of validity threats can be
grouped into two categories: bias and gener-
alizability. Internal and statistical conclusion
validity threats are related to systematic and
nonsystematic bias, respectively. Problems with
statistical procedures produce nonsystematic
bias, whereas an alternative explanation for an
observed relationship is an example of system-
atic bias. In either case, bias calls into question
the inference that some cause produced some
effect. Failing to consider the more general
cause-and-effect constructs that operate in an
observed cause-and-effect relationship results
in research findings that cannot be generalized
to real-world behaviors and conditions. And a
cause-and-effect relationship observed in one
setting or at one time may not operate in the
same way in a different setting or at a different
time.

Shadish, Cook, and Campbell (2002, 39)
summarized their discussion of these four va-
lidity threats by linking them to the types of
questions that researchers ask in trying to es-
tablish cause and effect. Test your understand-
ing by writing the name of each validity threat
after the appropriate question.

1. How large and reliable is the covariation between
the presumed cause and effect? _______________

2. Is the covariation causal, or would the same
covariation have been obtained without the
treatment? _______________

3. What general constructs are involved in the
persons, settings, treatments, and observations
used in the experiment? _______________

4. How generalizable is the locally embedded
causal relationship over varied persons,
treatments, observations, and settings? _______________

58 Part Two Structuring Criminal Justice Inquiry

To assess the construct validity of research
on drugs and crime, let’s think for a moment
about different patterns of each behavior,
rather than assume that drug use and crime are
uniform behaviors. Many adolescents in the
United States experiment with drugs; just as
many, especially males, commit delinquent
acts or petty crimes. A large number of adults
may be occasional users of illegal drugs as well.
Many different patterns of drug use, delinquency,
and adult criminality have been found
in research in other countries. Pudney (2002)
and Bennett and Holloway (2005) report on
varying patterns of use in England and Wales.
Because there is no simple way to describe either
construct, searching for a single cause-and-effect
relationship misrepresents a complex
causal process.

Problems with the external validity of research
on drugs and crime are similar to those
revolving around construct validity. The relationship
between occasional marijuana use and
delinquency among teenagers is different from
that between occasional cocaine use and adult

CAUSATION AND
DECLINING CRIME
IN NEW YORK CITY

Did changes in police strategy and tactics in New
York City cause a decline in crime? That question
is central to what became quite a spirited debate,
a debate that centers our attention on causa-
tion. In fact, former Police Commissioner William
Bratton (1999, 17) used language strikingly similar
to what you will find in this chapter to argue
that crime dropped in New York because of po-
lice action:

As a basic tenet of epistemology . . . we
cannot conclude that a causal relationship
exists between two variables unless . . .
three necessary conditions occur: one vari-
able must precede the other in time, an
empirically measured relationship must be
demonstrated between the variables, and
the relationship must not be better ex-
plained by any third intervening variable.
Although contemporary criminology’s ex-
planations for the decline in New York City
meet the first two conditions, they don’t
explain it better than a third intervening
variable.

With these words, Bratton challenged researchers
to propose some empirical measure of a variable
that better accounted for the crime reduction.
A number of researchers have advanced alternative
explanations. Let’s now consider what some
of those variables might be.

Changing Drug Markets
After falling from 1980 through 1984, homicide
rates in larger U.S. cities rose sharply through the
early 1990s. This corresponded with the emer-
gence of crack cocaine, a low-cost drug sold by
loosely organized gangs that settled business dis-
putes with guns instead of lawyers. The decline
in crack use in the mid-1990s corresponded with
the beginning of decreasing homicide rates. Al-
fred Blumstein and Richard Rosenfeld (1998)
point to changes in crack markets as a plausible
explanation—gun homicides and crack markets
both increased and decreased together. Other
researchers claim changes in crack markets had
some effect on violence (Johnson, Golub, and
Dunlap 2000; Karmen 2000).

Regression
One threat to internal validity is regression to
the mean. This refers to a phenomenon whereby
social indicators move up and down over time,
and abnormally high or low values are eventually
followed by a return (regression) to more normal
levels. Jeffrey Fagan and associates (Fagan, Zim-
ring, and Kim 1998) present evidence to suggest
that rates of gun homicide in New York were rela-
tively stable from 1969 through the mid-1980s,
when they began to increase sharply. Around
1991, rates began to decline, returning to ap-
proximate previous levels.
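Regression to the mean can be illustrated with a short simulation. The sketch below assumes a stable underlying rate with independent random variation from year to year; the rate, the amount of variation, and the number of simulated cities are all hypothetical values chosen for illustration, not estimates from New York or any other city. Nothing changes between the two simulated years, yet the cities that looked worst in the first year look noticeably better in the second.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

STABLE_RATE = 10.0  # hypothetical long-run homicide rate per 100,000
NOISE_SD = 2.0      # hypothetical year-to-year random variation
N_CITIES = 1000     # hypothetical number of simulated cities

# Two years of rates for each simulated city; no policy changes in between.
year1 = [rng.gauss(STABLE_RATE, NOISE_SD) for _ in range(N_CITIES)]
year2 = [rng.gauss(STABLE_RATE, NOISE_SD) for _ in range(N_CITIES)]

# Select the 100 cities with the highest rates in year 1.
worst = sorted(range(N_CITIES), key=lambda i: year1[i], reverse=True)[:100]

worst_year1 = mean(year1[i] for i in worst)
worst_year2 = mean(year2[i] for i in worst)

print(f"worst cities, year 1: {worst_year1:.1f}")
print(f"same cities, year 2:  {worst_year2:.1f}")
```

In this sketch the selected cities fall back toward the stable rate without any intervention, which is why a decline that follows an unusual peak is weak evidence, taken alone, that a new policy caused it.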


crime; each relationship, in turn, varies from
that between heroin addiction and persistent
criminal behavior among adults.

The issue of external validity comes into focus
when we shift from basic research that seeks
to uncover fundamental causal relationships to
criminal justice policy. Chaiken and Chaiken
argue that any uniform policy to reduce the use
of all drugs among all population groups will
have little effect on serious crime.

Basic and applied research on the relationships
among drug use and crime readily illustrates
threats to the validity of causal inference.
It is often difficult to find a relationship
because there is so much variation in drug use
and crime participation (statistical conclusion
validity threat). A large number of studies have
demonstrated that, when statistical relationships
are found, both drug use and crime can
be attributed to other, often multiple, causes
(internal validity threat). Different patterns
among different population groups mean there
are no readily identifiable cause-and-effect constructs
(construct validity). Because of these
Homicides Declined Everywhere
Fagan and associates (1998) point out that al-
though New York’s decline was considerable it was
not unprecedented. Many large cities saw sharp re-
ductions in homicide during the same period, and
a few cities had even greater declines. Blumstein
and Rosenfeld (1998) cite similar data, pointing
out that sharp reductions in homicide rates oc-
curred in cities where no major changes in policing
were evident. This suggests that declining crime
in New York was simply part of a national trend.

Incapacitation
The decline of homicide rates in the 1990s fol-
lowed more than a decade of growth in incar-
ceration rates in state and federal correctional
facilities. According to the incapacitation argu-
ment, rates of homicide and other violent crimes
declined because growing numbers of violent
criminals were locked up. Some researchers
have presented evidence to challenge that claim
(Rosenfeld 2000; Spelman 2000).

Economic Opportunity
The early 1990s marked the beginning of a sus-
tained period of economic growth in the United
States. With greater job opportunities, the argument
goes, crime should decline. A number of authors have cited
this example of how change in one of the “root
causes” of crime might have caused changes in
rates of homicide and other serious crimes (for
example, Karmen 2000; Silverman 1999), but
none have offered any evidence to support it.

Demographic Change
Because violent and other offenses tend to be
committed more by younger people, especially
young males, a decline in the number of members
in those demographic groups may be responsible
for New York’s reduced crime rate. Fagan and as-
sociates (1998) present data that show relatively
stable numbers of 15- to 19-year-old males in
New York, so it seems unlikely that demographic
factors account for changes in crime rates. Addi-
tional analysis of demographic change has been
conducted by Rosenfeld (2000) and James Alan
Fox (2000).

Continuation of a Trend
George Kelling and William Bratton (1998) claim
that crime declined in New York immediately fol-
lowing the implementation of major changes in
policing. But other analysts argue that the be-
ginning of the shift preceded Bratton’s changes.
Notice that this brings into question another
of the three criteria for inferring cause by claiming
that the effect—declining crime—occurred before
the cause—changes in policing. Even more pos-
sible explanations have been offered, but those
presented here are the most plausible and are
most often cited by researchers. Andrew Karmen
(2000, 263) offers a good summary of different
explanations for New York’s crime drop.


differences, policies developed to counter drug
use among the population as a whole cannot be
expected to have much of an impact on serious
crime (external validity).

None of the above is to say that there is no
cause-and-effect relationship between drug
use and crime. However, Chaiken and Chaiken
have clearly shown that there is no simple
causal connection. For more on how questions
of cause and effect emerge, see the box “Causation
and Declining Crime in New York City.”

Introducing Scientific Realism
In our final consideration of cause and effect
in this chapter, we revisit the distinction between
idiographic and nomothetic ways of explanation.
Doing research to find what causes
what reflects nomothetic concerns more often
than not. We wish to find causal explanations
that apply generally to situations beyond those
we actually study in our research. At the same
time, researchers and public officials are often
interested in understanding specific causal
mechanisms in more narrowly defined situations,
what we have described as the idiographic
mode of explanation.

Scientific realism bridges idiographic and
nomothetic approaches to explanation by seeking
to understand how causal mechanisms operate
in specific contexts. Traditional approaches
to finding cause and effect usually try to isolate
causal mechanisms from other possible influences,
something you should now recognize
as trying to control threats to internal validity.
The scientific realist approach views these
other possible influences as contexts in which
causal mechanisms operate. Rather than try
to exclude or otherwise control possible outside
influences, scientific realism studies how
such influences are involved in cause-and-effect
relationships.

For example, earlier in this chapter, we
noted that electronic monitoring as a condition
of probation might apply to some populations
but not others. We framed this as a question
of external validity in the traditional way
of considering nomothetic causation. A scientific
realist approach would consider the causal
mechanism underlying electronic monitoring
to be effective in some contexts but not in others.
As another example, we reviewed at some
length the cause-and-effect conundrum surrounding
drug use and crime. That review was
framed by traditional nomothetic research to
establish cause and effect. A scientific realism
approach to the question would recognize that
drug use and crime co-occur in some contexts
but not in others.

We say that scientific realism bridges idiographic
and nomothetic modes of explanation
because it exhibits elements of both. Because it
focuses our attention on very specific questions,
scientific realism seems idiographic: “Will redesigning
the Interstate 78 exit in Newark, New
Jersey, cause a reduction in the number of suburban
residents seeking to buy heroin in this
neighborhood?” But this approach is compatible
with more general questions of causation:
“Can the design of streets and intersections be
modified to make it more difficult for street
drug markets to operate?” Changing an expressway
exit ramp to reduce drug sales in Newark
is a specific example of cause and effect that is
rooted in the more general causal relationship
between traffic patterns and drug markets. Research
by Nicholas Zanin and colleagues (2004)
addresses both the idiographic explanation in
Newark and the potential for broader applications
elsewhere.

These illustrations of the scientific realist
approach to cause and effect are examples of
research for the purpose of application, a topic
treated at length by British researchers Ray
Pawson and Nick Tilley (1997). Application is a
type of explanatory research, as we indicated in
Chapter 1. In later chapters, we call on scientific
realism as a strategy for designing explanatory
research (Chapter 5) and conducting evaluations
(Chapter 10).

Sorting out causes and effects is one of the
most difficult challenges of explanatory research.
Our attention now turns to two other
important considerations that emerge in re-


Individuals
Any variety of individuals may be the units of
analysis in criminal justice research. This point
is more important than it may initially seem.
The norm of generalized understanding in
social science should suggest that scientific
findings are most valuable when they apply
to all kinds of people. In practice, however, re-
searchers seldom study all kinds of people. At
the very least, studies are typically limited to
people who live in a single country, although
some comparative studies stretch across na-
tional boundaries.

As the units of analysis, individuals may
be considered in the context of their member-
ship in different groups. Examples of groups
whose members may be units of analysis at the
individual level are police, victims, defendants
in criminal court, correctional inmates, gang
members, and active burglars. Note that each
of these terms implies some population of in-
dividual persons. Descriptive studies having
individuals as their units of analysis typically
aim to describe the population that comprises
those individuals.

Groups
Social groups may also be the units of analy-
sis for criminal justice research. This is not
the same as studying the individuals within a
group. If we study the members of a juvenile
gang to learn about teenagers who join gangs,
the individual (teen gang member) is the unit
of analysis. But if we study all the juvenile
gangs in a city to learn the differences between
big gangs and small ones, between gangs selling
drugs and gangs stealing cars, and so forth, the
unit of analysis is the social group (gang).

Police beats or patrol districts might be the
units of analysis in a study. A police beat can be
described in terms of the total number of peo-
ple who live within its boundaries, total street
mileage, annual crime reports, and whether the
beat includes a special facility such as a park
or high school. We can then determine, for ex-
ample, whether beats that include a park report

search for explanation and other purposes:
units of analysis and the time dimension.

Units of Analysis
To avoid mistaken inferences, researchers must
carefully specify the people or phenomena that will
be studied.

In criminal justice research, there is a great deal
of variation in what or who is studied—what
are technically called units of analysis. Individ-
ual people are often units of analysis. Research-
ers may make observations describing certain
characteristics of offenders or crime victims,
such as age, gender, or race. The descriptions
of many individuals are then combined to pro-
vide a picture of the population that comprises
those individuals.

For example, we may note the age and gen-
der of persons convicted of drunk driving in
Fort Lauderdale over a certain period. Aggre-
gating these observations, we might charac-
terize drunk-driving offenders as 72 percent
men and 28 percent women, with an average
age of 26.4 years. This is a descriptive analysis
of convicted drunk drivers in Fort Lauderdale.
Although the description applies to the group
of drunk drivers as a whole, it is based on the
characteristics of individual people convicted
of drunk driving.
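The step from individual observations to a description of the group can be sketched in a few lines of Python. The records below are invented (they are not the Fort Lauderdale figures), and the field names are assumptions made for illustration. Each record describes one person, the unit of analysis, and the summary describes the population those people make up.

```python
from statistics import mean

# Hypothetical individual-level records: one dict per convicted driver.
offenders = [
    {"gender": "male", "age": 22},
    {"gender": "male", "age": 31},
    {"gender": "female", "age": 27},
    {"gender": "male", "age": 19},
    {"gender": "female", "age": 35},
    {"gender": "male", "age": 24},
    {"gender": "male", "age": 28},
    {"gender": "male", "age": 30},
]


def describe(records):
    """Aggregate individual characteristics into a group description."""
    n = len(records)
    men = sum(1 for r in records if r["gender"] == "male")
    return {
        "n": n,
        "percent_men": 100 * men / n,
        "percent_women": 100 * (n - men) / n,
        "average_age": mean(r["age"] for r in records),
    }


summary = describe(offenders)
print(summary)
```

The summary values describe the group, but every number in them is computed from characteristics of individual people.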

Units of analysis in a study are typically also
the units of observation. Thus, to study what
steps people take to protect their homes from
burglary, we might observe individual house-
hold residents, perhaps through interviews.
Sometimes, however, we observe units of analy-
sis indirectly. We might ask individuals about
crime prevention measures for the purpose of
describing households. We might want to find
out whether homes with double-cylinder dead-
bolt locks are burglarized less often than homes
with less substantial protection. In this case,
our units of analysis are households, but the
units of observation are individual household
members who are asked to describe burglaries
and home protection to interviewers.
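The difference between units of observation and units of analysis can be made concrete with a small sketch. The interview records and field names below are hypothetical. For simplicity, the sketch also assumes that members of the same household give the same answers, which real survey data would not guarantee. People are interviewed, but the comparison is made between households.

```python
# Hypothetical interview records: one row per person interviewed.
interviews = [
    {"household": 1, "deadbolt": True, "burglarized": False},
    {"household": 1, "deadbolt": True, "burglarized": False},
    {"household": 2, "deadbolt": False, "burglarized": True},
    {"household": 3, "deadbolt": False, "burglarized": False},
    {"household": 3, "deadbolt": False, "burglarized": False},
    {"household": 4, "deadbolt": True, "burglarized": True},
    {"household": 5, "deadbolt": False, "burglarized": True},
]

# Collapse people (units of observation) into households (units of analysis).
# Assumes members of one household report the same facts.
households = {}
for row in interviews:
    households[row["household"]] = (row["deadbolt"], row["burglarized"])


def burglary_rate(has_deadbolt):
    """Share of households burglarized, by deadbolt status."""
    group = [b for d, b in households.values() if d == has_deadbolt]
    return sum(group) / len(group)


print(len(interviews), "people interviewed,", len(households), "households")
print("burglarized, with deadbolt:   ", burglary_rate(True))
print("burglarized, without deadbolt:", burglary_rate(False))
```

Counting the seven interviews instead of the five households would give extra weight to households with more than one respondent, which is one reason to settle on the unit of analysis before computing anything.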


more assaults than beats without such facilities
or whether auto thefts are more common in
beats with more street mileage. Here the individual
police beat is the unit of analysis.

Organizations
Formal political or social organizations may
also be the units of analysis in criminal justice
research. An example is correctional facilities,
which implies, of course, a population of
all correctional facilities. Individual facilities
might be characterized in terms of their number
of employees, status as state or federal
prisons, security classification, percentage of
inmates who are from racial or ethnic minority
groups, types of offenses for which inmates
are sentenced to each facility, average length
of sentence served, and so forth. We might determine
whether federal prisons house a larger
or smaller percentage of offenders sentenced
for white-collar crimes than do state prisons.
Other examples of formal organizations suitable
as units of analysis are police departments,
courtrooms, probation offices, drug treatment
facilities, and victim services agencies.

When social groups or formal organizations
are the units of analysis, their characteristics are
often derived from the characteristics of their
individual members. Thus a correctional facility
might be described in terms of the inmates it
houses: gender distribution, average sentence
length, ethnicity, and so on. In a descriptive
study, we might be interested in the percentage
of institutions housing only females. Or, in an
explanatory study, we might determine whether
institutions housing both males and females
report, on the average, fewer or more assaults
by inmates on staff compared with male-only
institutions. In each example, the correctional
facility is the unit of analysis. In contrast, if we
ask whether male or female inmates are more
often involved in assaults on staff, then the individual
inmate is the unit of analysis.

Some studies involve descriptions or explanations
of more than one unit of analysis. Consider
an evaluation of community policing programs
in selected neighborhoods of a large city.
In such an evaluation, we might be interested
in how citizens feel about the program (individuals),
whether arrests increased in neighborhoods
with the new program compared with
those without it (groups), and whether the police
department’s budget increased more than
the budget in a similar city (organizations).
In such cases, it is imperative that researchers
anticipate what conclusions they wish to draw
with regard to what units of analysis.

Social Artifacts
Yet another potential unit of analysis may be referred
to as social artifacts, or the products of social
behavior. One class of social artifacts is stories
about crime in newspapers and magazines
or on television. A newspaper story might be
characterized by its length, placement on front
or interior pages, size of headlines, and presence
of photographs. A researcher could analyze
whether television news features or newspaper
reports provide the most details about a
new police program to increase drug arrests.

Social interactions are also examples of social
artifacts suitable for criminal justice research.
Police crime reports are an example. We
might analyze assault reports to find how many
involved three or more people, whether assaults
involved strangers or people with some prior
acquaintance, or whether they more often occurred
in public or private locations.

At first, crime reports may not seem to be social
artifacts, but consider for a moment what
they represent. When a crime is reported to the
police, officers usually record what happened
from descriptions by victims or witnesses. For
instance, an assault victim may describe how
he suffered an unprovoked attack while innocently
enjoying a cold beer after work. However,
witnesses to the incident might claim that the
“victim” started the fight by insulting the “offender.”
The responding police officer must interpret
who is telling the truth in trying to sort
out the circumstances of a violent social interaction.
The officer’s report becomes a social


of analysis, but we wish to draw conclusions
about individual people.

The same problem will arise if we discover
that incarceration rates are higher in states that
have a large proportion of elderly residents. We
will not know whether older people are actually
imprisoned more often. Or, if we find higher
suicide rates in cities with large nonwhite pop-
ulations, we cannot be sure whether more non-
whites than whites committed suicide.
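The incarceration example can be made concrete with a small invented data set. All figures below are hypothetical and chosen only to show how the pattern can arise; the state labels and age categories are assumptions for illustration. At the state level, a larger share of elderly residents goes with a higher incarceration rate. At the individual level, elderly residents are imprisoned at a much lower rate than younger residents.

```python
# Hypothetical state data, invented for illustration only.
# Each row: (state, age group, residents, residents in prison)
rows = [
    ("A", "elderly", 1000, 1),
    ("A", "younger", 9000, 36),
    ("B", "elderly", 2000, 2),
    ("B", "younger", 8000, 48),
    ("C", "elderly", 3000, 3),
    ("C", "younger", 7000, 70),
]


def rate_per_1000(records):
    """Prisoners per 1,000 residents across the given rows."""
    residents = sum(r[2] for r in records)
    prisoners = sum(r[3] for r in records)
    return 1000 * prisoners / residents


# Aggregate view: the state is the unit of analysis.
elderly_share = {}
state_rates = {}
for state in ("A", "B", "C"):
    in_state = [r for r in rows if r[0] == state]
    elderly = sum(r[2] for r in in_state if r[1] == "elderly")
    elderly_share[state] = 100 * elderly / sum(r[2] for r in in_state)
    state_rates[state] = rate_per_1000(in_state)
    print(f"state {state}: {elderly_share[state]:.0f}% elderly, "
          f"{state_rates[state]:.1f} prisoners per 1,000")

# Individual view: the person is the unit of analysis.
elderly_rate = rate_per_1000([r for r in rows if r[1] == "elderly"])
younger_rate = rate_per_1000([r for r in rows if r[1] == "younger"])
print(f"elderly: {elderly_rate:.1f} per 1,000; "
      f"younger: {younger_rate:.1f} per 1,000")
```

Concluding from the state-level pattern that older people are imprisoned more often would be an ecological fallacy; in these invented data the opposite is true.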

Don’t let these warnings against ecological
fallacy lead you to commit what is called an
individualistic fallacy. Some students approaching
criminal justice research for the first time
have trouble reconciling general patterns of at-
titudes and actions with individual exceptions
they know of. If you read a newspaper story
about a Utah resident visiting New York who
is murdered on a subway platform, the fact
remains that most visitors to New York and
most subway riders are not at risk of murder.
Similarly, mass media stories and popular films
about drug problems in U.S. cities frequently
focus on drug use and dealing among African
Americans. But that does not mean that most
African Americans are drug users or that drugs
are not a problem among whites.

The individualistic fallacy can be especially
troublesome for beginning students of crimi-
nal justice. Newspapers, local television news,
and television police dramas often present un-
usual or highly dramatized versions of crime
problems and criminal justice policy. These
messages may distort the way many people ini-
tially approach research problems in criminal
justice.

Units of Analysis in Review
The purpose of this section has been to specify
what is sometimes a confusing topic, in part be-
cause criminal justice researchers use a variety of
different units of analysis. Although individual
people are often the units of analysis, that is not
always the case. Many research questions can
more appropriately be answered through the ex-
amination of other units of analysis.

artifact that represents one among the popula-
tion of all assaults.

Records of different types of social interac-
tions are common units of analysis in crimi-
nal justice research. Criminal history records,
meetings of community anticrime groups, pre-
sentence investigations, and interactions be-
tween police and citizens are examples. Notice
that each example requires information about
individuals but that social interactions between
people are the units of analysis.

The Ecological Fallacy
We now briefly consider one category of problems
commonly encountered with respect to
units of analysis. The ecological fallacy refers
to the danger of making assertions about indi-
viduals as the unit of analysis based on the ex-
amination of groups or other aggregations.

As an example, suppose we are interested
in learning about robbery in different police
precincts of a large city. Let’s assume that we
have information on how many robberies were
committed in 2004 in each police precinct of
Chicago. Assume also that we have census data
describing some of the characteristics of those
precincts. Our analysis of such data might
show that a large number of robberies in 2004
occurred in the downtown precinct and that
the average family income of persons who live
in downtown Chicago (the Loop) was substan-
tially higher than in other precincts in the city.
We might be tempted to conclude that high-in-
come downtown residents are more likely to be
robbed than are people who live in other parts
of the city—that robbers select richer victims.
In reaching such a conclusion, we run the risk
of committing an ecological fallacy, because
lower-income people who did not live in the
downtown area were also being robbed there in
2004. Victims might be commuters to jobs in
the Loop, people visiting downtown theaters or
restaurants, passengers on subway or elevated
train platforms, or homeless persons who are
not counted by the census. Our problem is
that we examined police precincts as our unit


The concept of units of analysis may seem
more complicated than it needs to be. Understanding
the logic of units of analysis is more
important than memorizing a list of the units.
It is irrelevant what we call a given unit of
analysis: a group, a formal organization, or a
social artifact. It is essential, however, that we
be able to identify our unit of analysis. We must
decide whether we are studying assaults or assault
victims, police departments or police officers,
courtrooms or judges, and prisons or
prison inmates. Without keeping this point
in mind, we run the risk of making assertions
about one unit of analysis based on the examination
of another. The box titled “Units of
Analysis in the National Youth Gang Survey”
offers examples of using inappropriate units
of analysis. It also illustrates that lack of clarity
about units of analysis in criminal justice
results in part from difficulties in directly measuring
the concepts we want to study.

To test your grasp of the concept of units
of analysis, here are some examples of real research
topics. See if you can determine the unit
of analysis in each. (The answers are given later
in the chapter, on page 78.)

1. “Taking into account preexisting traffic fatality
trends and several other relevant factors,
the implementation of the emergency
cellular telephone program resulted in a
substantial and permanent reduction in the
monthly percentage of alcohol-related fatal
crashes.” (D’Alessio, Stolzenberg, and Terry
1999, 463–464)

2. “The survey robbery rate was highest in
Canada and the Netherlands, and lowest
in Scotland. . . . In 1999 the survey robbery

UNITS OF ANALYSIS IN
THE NATIONAL YOUTH
GANG SURVEY

In 1997, the third annual National Youth Gang
Survey was completed for the federal Office
of Juvenile Justice and Delinquency Prevention
(OJJDP). This survey reflects keen interest in developing
better information about the scope of
youth gangs and their activities in different types
of communities around the country. As important
and useful as this effort is, the National Youth
Gang Survey—especially reports of its results—
illustrates how some ambiguities can emerge with
respect to units of analysis.

A variety of methods, often creative, are used
to gather information from or about active offenders.
Partly this is because it is difficult to systematically
identify offenders for research. Studying
youth gangs presents more than the usual share of
problems with units of analysis. Are we interested
in gangs (groups), gang members (individuals), or
offenses (social artifacts) committed by gangs?

Following methods developed in earlier years,
the 1997 National Youth Gang Survey was based
on a sample of law enforcement agencies. The
sample was designed to represent different types
of communities: rural areas, suburban counties,
small cities, and large cities. Questionnaires were
mailed to the police chief for municipalities and
to the sheriff for counties (National Youth Gang
Center 1999, 3). Questions asked respondents
to report on gangs and gang activity in their ju-
risdiction—municipality for police departments,
and unincorporated service area for sheriffs’
departments.

Here are examples of the types of questions
included in the survey:

1. How many youth gangs were active in your
jurisdiction?

2. How many active youth gang members were
in your jurisdiction?

3. In your jurisdiction, what percentage of street
sales of drugs were made by youth gang
members? (followed by list: powder cocaine,
crack cocaine, marijuana, heroin, metham-
phetamine, other)

4. Does your agency have the following? (list of
special youth gang units)

Notice the different units of analysis em-
bedded in these questions. Seven are stated or
implied:

1. Gangs: item 1
2. Gang members: items 2, 3


prised the study’s cross-sectional units.
From January 1986 through July 1989, a
period of 43 months, data were collected
monthly from each substation for a total of
344 cases.” (Kessler 1999, 346)

The Time Dimension
Because time order is a requirement for causal in-
ferences, the time dimension of research requires
careful planning.

We saw earlier in this chapter how the time se-
quence of events and situations is a critical ele-
ment in determining causation. In general, ob-
servations may be made more or less at one time
point or they may be deliberately stretched over a
longer period. Observations made at more than
one time point can look forward or backward.

rate was lowest in the United States.” (Far-
rington et al. 2004, xii)

3. “On average, probationers were 31 years old,
African American, male, and convicted of
drug or property offenses. Most lived with
family, and although they were not mar-
ried, many were in exclusive relationships
(44 percent) and had children (47 percent).”
(MacKenzie, Browning, Skroban, and Smith
1999, 433)

4. “Seventy-five percent (n = 158) of the cases
were disposed at district courts, and 3 per-
cent (n = 6) remained pending. One percent
of the control and 4 percent of the experi-
mental cases were referred to drug treat-
ment court.” (Taxman and Elis 1999, 42)

5. “The department’s eight Field Operations
Command substations (encompassing 20
police districts and 100 patrol beats) com-

3. Jurisdiction (city or part of county area):
items 1, 2, 3

4. Street sales of drugs: item 3
5. Drug types: item 3
6. Agency: item 4
7. Special unit: item 4

Now consider some quotes from a summary re-
port on the 1997 survey (National Youth Gang
Center 1999). Which ones do or do not reason-
ably reflect the actual units of analysis from the
survey?

■ “Fifty-one percent of survey respondents indi-
cated that they had active youth gangs in their
jurisdictions in 1997.” (page 7)

■ “Thirty-eight percent of jurisdictions in the
Northeast, and 26 percent of jurisdictions in
the Middle Atlantic regions reported active
youth gangs in 1997.” (extracted from Table
3, page 10)

■ “Results of the 1997 survey revealed that there
were an estimated 30,533 youth gangs and
815,986 gang members active in the United
States in 1997.” (page 13)

■ “The percentage of street sales of crack co-
caine, heroin, and methamphetamine con-
ducted by youth gang members varied sub-
stantially by region. . . . Crack cocaine sales

involving youth gang members were most
prevalent in the Midwest (38 percent), heroin
sales were most prevalent in the Northeast (15
percent), and methamphetamine sales were
most prevalent in the West (21 percent).”
(page 27)

■ “The majority (66 percent) of respondents in-
dicated that they had some type of specialized
unit to address the gang problem.” (page 33)

The youth gang survey report includes a number
of statements and tables that inaccurately de-
scribe units of analysis. You probably detected
examples of this in some of the statements shown
here. Other statements accurately reflect units of
analysis measured in the survey.

If you read the 1997 survey report and keep
in mind our discussion of units of analysis, you
will find more misleading statements and tables.
This will enhance your understanding of units of
analysis.

Source: Information drawn from National Youth
Gang Center (1999).

66 Part Two Structuring Criminal Justice Inquiry

extended period. An example is a researcher who
observes the activities of a neighborhood anti-
crime organization from the time of its incep-
tion until its demise. Analysis of newspaper sto-
ries about crime or numbers of prison inmates
over time are other examples. In the latter in-
stances, it is irrelevant whether the researcher’s
observations are made over the course of the
actual events under study or at one time—for
example, examining a year’s worth of newspa-
pers in the library or 10 years of annual reports
on correctional populations.

Three types of longitudinal studies are com-
mon in criminal justice research: trend, co-
hort, and panel studies. Trend studies look at
changes within some general population over
time. An example is a comparison of Uniform
Crime Report (UCR; described in Chapter 1) fig-
ures over time, showing an increase in reported
crime from 1960 through 1993 and then a de-
cline through 2001. Or a researcher might want
to know whether changes in sentences for cer-
tain offenses were followed by increases in the
number of people imprisoned in state institu-
tions. In this case, a trend study might examine
annual figures for prison population over time,
comparing totals for the years before and after
new sentencing laws took effect.

Cohort studies examine more specific pop-
ulations (cohorts) as they change over time.
Typically a cohort is an age group, such as those
people born during the 1980s, but it can also be
based on some other time grouping. Cohorts
are often defined as a group of people who
enter or leave an institution at the same time,
such as persons entering a drug treatment cen-
ter during July, offenders released from custody
in 2002, or high school seniors in March 2003.

In what is probably the best-known cohort
study, Marvin Wolfgang and associates (Wolf-
gang, Figlio, and Sellin 1972) studied all males
born in 1945 who lived in the city of Philadel-
phia from their 10th birthday through age 18
or older. The researchers examined records
from police agencies and public schools to de-
termine how many boys in the cohort had been

Cross-Sectional Studies
Many criminal justice research projects are
designed to study a phenomenon by taking
a cross section of it at one time and analyzing
that cross section carefully. Exploratory and
descriptive studies are often cross-sectional.
A single U.S. census, for instance, is a study
aimed at describing the U.S. population at a
given time. A single wave of the National Crime
Victimization Survey (NCVS) is a descriptive
cross-sectional study that estimates how many
people have been victims of crime in a given
time.

A cross-sectional exploratory study might
be conducted by a police department in the
form of a survey that examines what residents
believe to be the sources of crime problems in
their neighborhood. In all likelihood, the study
will ask about crime problems in a single time
frame, with the findings used to help the de-
partment explore various methods of introduc-
ing community policing.

Cross-sectional studies for explanatory or
evaluation purposes have an inherent prob-
lem. Typically their aim is to understand causal
processes that occur over time, but their con-
clusions are based on observations made at
only one time. For example, a survey might
ask respondents whether their home has been
burglarized and whether they have any special
locks on their doors, hoping to explain whether
special locks prevent burglary. Because the
questions about burglary victimization and
door locks are asked at only one time, it is not
possible to determine whether burglary victims
installed locks after a burglary or whether spe-
cial locks were already in place but did not pre-
vent the crime. Some of the ways we can deal
with the difficult problem of determining time
order will be discussed in the section on ap-
proximating longitudinal studies.

Longitudinal Studies
Research projects known as longitudinal stud-
ies are designed to permit observations over an


study of gun ownership and violence by Swiss
researcher Martin Killias (1993). Killias com-
pared rates of gun ownership as reported in an
international crime survey to rates of homicide
and suicide committed with guns. He was in-
terested in the possible effects of gun availabil-
ity on violence: do nations with higher rates of
gun ownership also have higher rates of gun
violence?

Killias reasoned that inferring causation
from a cross-sectional comparison of gun own-
ership and homicides committed with guns
would be ambiguous. Gun homicide rates could
be high in countries with high gun-ownership
rates because the availability of guns was higher.
Or people in countries with high gun-ownership
rates could have bought guns to protect them-
selves, in response to rates of homicide. Cross-
sectional analysis would not make it possible to
sort out the time order of gun ownership and
gun homicides.

But does that reasoning hold for gun sui-
cides? Killias argued that the time order in a re-
lationship between gun ownership and gun sui-
cides is less ambiguous. It makes much more
sense that suicides involving guns are at least
partly a result of gun availability. But it is not
reasonable to assume that people might buy
guns in response to high gun-suicide rates.

Logical inferences may also be made when-
ever the time order of variables is clear. If we dis-
cover in a cross-sectional study of high school
students that males are more likely than females
to smoke marijuana, we can conclude that gen-
der affects the propensity to use marijuana,
not the other way around. Thus, even though
our observations are made at only one time, we
are justified in drawing conclusions about pro-
cesses that take place across time.

Retrospective Studies. Research that asks
people to recall their pasts, called retrospective
research, is a common way of approximating
observations over time. In a study of recidi-
vism, for example, we might select a group of
prison inmates and analyze their history of

charged with delinquency or arrested, how old
they were when first arrested, and what differ-
ences there were in school performance between
delinquents and nondelinquents.

Panel studies are similar to trend and co-
hort studies except that observations are made
on the same set of people on two or more oc-
casions. The NCVS is a good example of a de-
scriptive panel study. A member of each house-
hold selected for inclusion in the survey is
interviewed seven times at six-month intervals.
The NCVS serves many purposes, but it was de-
veloped initially to estimate how many people
were victims of various types of crimes each
year. It is designed as a panel study so that per-
sons can be asked about crimes that occurred
in the previous six months, and two waves of
panel data are combined to estimate the na-
tionwide frequency of victimization over a one-
year period.

Among longitudinal studies, panel studies
face a special problem: panel attrition. Some
of the respondents studied in the first wave of
a study may not participate in later waves. The
danger is that those who drop out of the study
may not be typical and may thereby distort the
results of the study. Suppose we are interested
in evaluating the success of a new drug treat-
ment program by conducting weekly drug
tests on a panel of participants for a period of
10 months. Regardless of how successful the
program appears to be after 10 months, if a
substantial number of people drop out of our
study, we can expect that treatment was less ef-
fective in keeping them off drugs.
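
The distortion caused by panel attrition can be sketched with a few lines of Python. Every count below is invented for illustration and does not come from any study. The sketch also assumes we somehow know what happened to the people who dropped out, which a real panel study usually cannot observe.

```python
# Hypothetical panel: every number here is invented for illustration.
# Each participant is (completed_study, drug_free_at_10_months).
panel = (
    [(True, True)] * 45      # stayed in the study and stayed off drugs
    + [(True, False)] * 15   # stayed in the study but relapsed
    + [(False, True)] * 5    # dropped out, yet stayed off drugs
    + [(False, False)] * 35  # dropped out and relapsed
)

def success_rate(rows):
    """Share of participants in rows who are drug free, as a percentage."""
    return 100 * sum(1 for _, drug_free in rows if drug_free) / len(rows)

# Only people who completed the study appear in the final wave of data.
completers = [row for row in panel if row[0]]

observed = success_rate(completers)  # what the final wave would show
actual = success_rate(panel)         # what happened to the whole panel

print(f"Observed among completers: {observed:.0f}%")
print(f"Actual for the full panel: {actual:.0f}%")
```

With these made-up counts, the program looks more successful among the 60 completers than it was for all 100 people who started, which is the kind of distortion attrition can produce.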

Approximating Longitudinal Studies
It may be possible to draw conclusions about
processes that take place over time even when
only cross-sectional data are available. It is
worth noting some of the ways to do that.

Logical Inferences. Cross-sectional data
sometimes imply processes that occur over
time on the basis of simple logic. Consider a


child victims have a mother or father who was
abused as a child. It seems safe to conclude that
your hypothesis about the intergenerational
transmission of violence is strongly supported,
because 90 percent (18 out of 20) of abuse or
neglect victims brought before your court come
from families with a history of child abuse.

Think for a moment about how you ap-
proached the question of whether child abuse
breeds child abuse. You began with abuse vic-
tims and retrospectively established that many
of their parents had been abused. However, this
is different from the question of how many vic-
tims of childhood abuse later abuse their own
children. That question requires a prospective
approach, in which you begin with childhood
victims and then determine how many of them
later abuse their own children.

To clarify this point, let’s shift from the
hypothetical study to actual research that il-
lustrates the difference between prospective
and retrospective approaches to the same ques-
tion. Rosemary Hunter and Nancy Kilstrom
(1979) conducted a study of 255 infants and
their parents. The researchers began by select-
ing families of premature infants in a newborn
intensive care unit. Interviews with the parents
of 255 infants revealed that either the mother
or the father in 49 of the families had been the
victim of abuse or neglect; 206 families revealed
no history of abuse. In a prospective follow-up
study, Hunter and Kilstrom found that within
one year 10 of the 255 infants had been abused.
Nine of those 10 infant victims were from
the 49 families with a history of abuse, and
1 abused infant was from the 206 families with
no background of abuse.

Figure 3.3 illustrates these prospective re-
sults graphically. Infants in 18 percent (9 out of
49) of families with a history of abuse showed
signs of abuse within one year of birth, whereas
less than 1 percent of infants born to parents
with no history of abuse were abused within
one year. Although that is a sizable difference,
notice that the 18 percent figure for continuity

delinquency or crime. Or suppose we are inter-
ested in whether college students convicted of
drunk driving are more likely to have parents
with drinking problems than college students
with no drunk-driving record. Such a study is
retrospective because it focuses on the histories
of college students who have or have not been
convicted of drunk driving.

The danger in this technique is evident.
Sometimes people have faulty memories; some-
times they lie. Retrospective recall is one way of
approximating observations across time, but it
must be used with caution. Retrospective stud-
ies that analyze records of past arrests or con-
victions suffer from different problems: records
may be unavailable, incomplete, or inaccurate.

A more fundamental issue in retrospective
research hinges on how subjects are selected
and how subject selection affects the kinds of
questions such studies can address.

Imagine that you are a juvenile court judge
and you’re troubled by what appears to be a
large number of child abuse cases in your court.
Talking with a juvenile caseworker, you won-
der whether the parents of these children were
abused or neglected during their own child-
hood. Together, you formulate a hypothesis
about the intergenerational transmission of
violence: victims of childhood abuse later abuse
their own children. How might you go about
investigating that hypothesis?

Given your position as a judge who regu-
larly sees abuse victims, you will probably con-
sider a retrospective approach that examines
the backgrounds of families appearing in your
court. Let’s say you and the caseworker plan to
investigate the family backgrounds of 20 abuse
victims who appear in your court during the
next three months. The caseworker consults
with a clinical psychologist from the local uni-
versity and obtains copies of a questionnaire,
or protocol, that has been used by researchers
to study the families of child abuse victims.
After interviewing the families of 20 victims,
the caseworker reports to you that 18 of the 20


10 abused infants at time 2 and then checked
their family backgrounds. Figure 3.4 illustrates
this retrospective approach. A large majority of
the 10 infant victims (90 percent) had parents
with a history of abuse.

of abuse is very similar to the 19 percent rate
of abuse discovered in the histories of all 255
families.

Now consider what Hunter and Kilstrom
would have found if they had begun with the

Figure 3.4 Retrospective Approach to a Subject
[Diagram: starting from the 10 infant victims at Time 2 and looking back to their parents at Time 1, 9 (90%) had a parent who was a victim of abuse and 1 (10%) did not.]
Source: Adapted from Hunter and Kilstrom (1979), as suggested by Widom (1989b).

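Figure 3.3 Prospective Approach to a Subject
[Diagram: at Time 1, parents in 49 families (19%) were victims of abuse and parents in 206 families (81%) were not. At Time 2, 9 infants (18%) from the 49 families and 1 infant (0.5%) from the 206 families were victims.]
Source: Adapted from Hunter and Kilstrom (1979), as suggested by Widom (1989b).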


hood victims of abuse or neglect later abuse
their own children. A retrospective study can be
used, however, to compare whether childhood
victims are more likely than nonvictims to have
a history of abuse in their family background.

The Time Dimension Summarized
Joel Devine and James Wright (1993, 19) offer
a clever metaphor that distinguishes longitu-
dinal studies from cross-sectional ones. Think
of a cross-sectional study as a snapshot, a trend
study as a slide show, and a panel study as a
motion picture. A cross-sectional study, like
a snapshot, produces an image at one point
in time. This can provide useful information
about crime—burglary, for example—at a single
time, perhaps in a single place. A trend study is
akin to a slide show—a series of snapshots in
sequence over time. By viewing a slide show, we
can tell how some indicator (change in bur-
glary rates, for example) varies over time. But a trend study
is usually based on aggregate information.
It can tell us something about aggregations
of burglary over time, but not, for instance,
whether the same people are committing bur-
glaries at an increasing or decreasing rate or
whether there are more or fewer burglars with a
relatively constant rate of crime commission. A
panel study, like a motion picture, can capture
moving images of the same individuals and
give us information about individual rates of
offending over time.
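
The snapshot, slide show, and motion picture can be sketched with a small set of records. The three people, the years, and the burglary counts below are all hypothetical, chosen so that the aggregate stays flat while individuals change.

```python
# Hypothetical records: burglaries committed per person per year.
offenses = {
    "person_a": {2001: 4, 2002: 2, 2003: 0},
    "person_b": {2001: 0, 2002: 2, 2003: 4},
    "person_c": {2001: 2, 2002: 2, 2003: 2},
}
years = [2001, 2002, 2003]

# Cross-sectional study (snapshot): one point in time.
snapshot_2002 = sum(counts[2002] for counts in offenses.values())

# Trend study (slide show): aggregate totals at each time point.
trend = {year: sum(counts[year] for counts in offenses.values()) for year in years}

# Panel study (motion picture): the same individuals followed over time.
panel_change = {name: counts[2003] - counts[2001] for name, counts in offenses.items()}

print(snapshot_2002)
print(trend)
print(panel_change)
```

In this made-up example the trend totals are identical in every year, yet the panel view shows one person offending less, one offending more, and one holding steady. Only the panel view can separate those cases.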

How to Design a Research Project
Designing research requires planning several stages,
but the stages do not always occur in the same
sequence.

We’ve now seen some of the options available
to criminal justice researchers in designing
projects, but what if you were to undertake
research? Where would you start? Then where
would you go? How would you begin planning
your research?

You probably realize by now that the pro-
spective and retrospective approaches ad-
dress fundamentally different questions, even
though the questions may appear similar on
the surface:

Prospective: What percentage of abuse
victims later abuse their children?
(18 percent; Figure 3.3)

Retrospective: What percentage of abuse
victims have parents who were abused?
(90 percent; Figure 3.4)
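
The two percentages come from the same four counts reported for the Hunter and Kilstrom study; only the denominator changes. The short sketch below works through the arithmetic. The variable and function names are ours, introduced only for illustration.

```python
# Counts reported in the text for Hunter and Kilstrom (1979).
families_with_history = 49      # a parent was abused or neglected as a child
families_without_history = 206  # no reported history of abuse
abused_infants_with_history = 9
abused_infants_without_history = 1

def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

total_abused = abused_infants_with_history + abused_infants_without_history

# Prospective: start with parents at time 1, look forward to infants at time 2.
prospective_history = percent(abused_infants_with_history, families_with_history)
prospective_no_history = percent(abused_infants_without_history, families_without_history)

# Retrospective: start with abused infants at time 2, look back at their parents.
retrospective = percent(abused_infants_with_history, total_abused)

print(prospective_history, prospective_no_history, retrospective)
```

The numerator (9 infants) is the same in both views. Dividing by the 49 families with a history of abuse answers the prospective question; dividing by the 10 abused infants answers the retrospective one.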

In a study of how child abuse and neglect af-
fect drug use, Cathy Spatz Widom and associ-
ates (Widom, Weiler, and Cotler 1999) present
a similar contrast of prospective and retrospec-
tive analysis. Looking backward, 75 percent of
subjects with a drug abuse diagnosis in semi-
clinical interviews were victims of childhood
abuse or neglect. Looking forward, 35 percent
of childhood victims and 34 percent of nonvic-
tims had a drug abuse diagnosis.

More generally, Robert Sampson and John
Laub (1993, 14) comment on how retrospective
and prospective views yield different interpreta-
tions about patterns of criminal offending over
time:

Looking back over the careers of adult
criminals exaggerates the prevalence of
stability. Looking forward from youth re-
veals the successes and failures, including
adolescent delinquents who go on to be
normal functioning adults. (emphasis in
original). This is the paradox noted [by Lee
Robins] earlier: adult criminality seems to
be always preceded by childhood miscon-
duct, but most conduct-disordered chil-
dren do not become antisocial or criminal
adults.

Notice how the time dimension is linked to
how research questions are framed. A retrospec-
tive approach is limited in its ability to reveal
how causal processes unfold over time. A retro-
spective approach is therefore not well suited
to answer questions such as how many child-


the theory may produce new ideas and create
new interests. Or your understanding of some
theory may encourage you to consider new
policies.

To make this discussion more concrete, let’s
take a specific research example. Suppose you
are concerned about the problem of crime on
your campus and you have a special interest in
learning more about how other students view
the issue and what they think should be done
about it. Going a step further, let’s say you have
the impression that students are especially con-
cerned about violent crimes such as assault and
robbery and that many students feel the uni-
versity should be doing more to prevent violent
crime. The source of this idea might be your
own interest after being a student for a couple
of years. You might develop the idea while read-
ing about theories of crime in a course you are
taking. Perhaps you recently read stories about
a crime wave on campus. Or maybe some com-
bination of things makes you want to learn
more about campus crime.

Considering the research purposes dis-
cussed earlier in this chapter, your research will
be mainly exploratory. You probably have de-
scriptive and explanatory interests as well: How
much of a problem is violent crime on campus?
Are students especially concerned about crime
in certain areas? Why are some students more
worried about crime than others? What do stu-
dents think would be effective changes to re-
duce campus crime problems?

At this point, you should begin to think
about units of analysis and the time dimension.
Your interest in violent crime might suggest a
study of crimes reported to campus police in
recent years. In this case, the units of analysis
will be social artifacts (crime reports) in a lon-
gitudinal study (crime reports in recent years).
Or, after thinking a bit more, you may be inter-
ested in current student attitudes and opinions
about violent crime. Here the units of analy-
sis will be individuals (college students), and
a cross-sectional study will suit your purposes
nicely.

Every project has a starting point, but it is
important to think through later stages even at
the beginning. Figure 3.5 presents a schematic
view of the social scientific research process.
We present this view reluctantly because it may
suggest more of a cookbook approach to re-
search than is the case in practice. Nonetheless,
it’s important to have an overview of the whole
process before we launch into the details of
particular components of research. This figure
presents another and more detailed picture of
the scientific process discussed in Chapter 1.

The Research Process
At the top of the diagram in Figure 3.5 are in-
terests, ideas, theories, and new programs—the
possible beginning points for a line of research.
The letters (A, B, X, Y, and so forth) represent
variables or concepts such as deterrence or child
abuse. Thus you might have a general interest
in finding out why the threat of punishment de-
ters some but not all people from committing
crimes, or you might want to investigate how
burglars select their targets. Alternatively, your
inquiry might begin with a specific idea about
the way things are. You might have the idea that
aggressive arrest policies deter drug use, for ex-
ample. Question marks in the diagram indicate
that you aren’t sure things are the way you sus-
pect they are. We have represented a theory as
a complex set of relationships among several
variables (A, B, E, and F).

The research process might also begin with
an idea for a new program. Imagine that you
are the director of a probation services depart-
ment and you want to introduce weekly drug
tests for people on probation. Because you
have taken a course on criminal justice research
methods, you decide to design an evaluation
of the new program before trying it out. The
research process begins with your idea for the
new drug-testing program.

Notice the movement back and forth among
these several possible beginnings. An initial
interest may lead to the formulation of an
idea, which may be fit into a larger theory, and


Figure 3.5 The Research Process
[Diagram. Possible starting points: INTEREST (? Y), IDEA (X Y?), THEORY (A, B, E, F), and NEW PROGRAM (drug tests, probation violations). These lead to the following stages:]

CONCEPTUALIZATION: Specify the meaning of the concepts and variables to be studied
OPERATIONALIZATION: How will we actually measure the variables under study?
CHOICE OF RESEARCH METHOD: Experiments; Survey research; Field research; Content analysis; Existing data research; Comparative research; Evaluation research
POPULATION AND SAMPLING: Whom do we want to be able to draw conclusions about? Who will be observed for that purpose?
OBSERVATIONS: Collecting data for analysis and interpretation
DATA PROCESSING: Transforming the data collected into a form appropriate to manipulation and analysis
ANALYSIS: Analyzing data and drawing conclusions
APPLICATION: Reporting results and assessing their implications


ing the planned report will help you make bet-
ter decisions about research design.

Conceptualization
We often talk casually about criminal justice
concepts such as deterrence, recidivism, crime
prevention, community policing, and child
abuse, but it’s necessary to specify what we
mean by these concepts to do research on them.
Chapter 4 will examine this process of concep-
tualization in depth. For now, let’s see what it
might involve in our hypothetical example.

If you are going to study student concerns
about violent crime, you must first specify what
you mean by concern about violent crime. This
ambiguous phrase can mean different things
to different people. Campus police officers are
concerned about violent crime because that is
part of their job. On the one hand, students
might be concerned about crime in much the
same way they are concerned about other so-
cial problems, such as homelessness, animal
rights, and the global economy. They recognize
these issues as problems society must deal with,
but they don’t feel that the issues affect them
directly; we could specify this concept as general
concern about violent crime. On the other hand,
students may feel that the threat of violent
crime does affect them directly, and they ex-
press some fear about the possibility of being a
victim; let’s call this fear for personal safety.

Obviously, you need to specify what you
mean by the term in your research, but this
doesn’t necessarily mean you have to settle for a
single definition. In fact, you might want to de-
fine the concept of concern about violent crime
in more than one way and see how students feel
about each.

Of course, you need to specify all the con-
cepts you wish to study. If you want to study
the possible effect of concern about crime on
student behavior, you’ll have to decide whether
you want to limit your focus to specific precau-
tionary behavior such as keeping doors locked
or general behavior such as going to classes,
parties, and football games.

Getting Started
To begin pursuing your interest in student con-
cerns about violent crime, you undoubtedly will
want to read something about the issue. You
might begin by finding out what research has
been done on fear of crime and on the sorts of
crime that concern people most. Newspaper
stories should provide information on the vio-
lent crimes that occurred recently on campus.
Appendix A on the website for this book will
give you some assistance in using your college li-
brary. In addition, you will probably want to talk
to people, such as other students or campus po-
lice officers. These activities will prepare you to
handle the various research design decisions we
are about to examine. As you review the research
literature, you should make note of the designs
used by other researchers, asking whether the
same designs will meet your research objective.

What is your objective, by the way? It’s im-
portant that you are clear about that before you
design your study. Do you plan to write a pa-
per based on your research to satisfy a course
requirement or as an honors thesis? Is your
purpose to gain information that will support
an argument for more police protection or bet-
ter lighting on campus? Do you want to write
an article for the campus newspaper or an aca-
demic journal?

Usually, your objective for undertaking re-
search can be expressed in a report. Appendix C
on the website for this book will help you with
the organization of research reports, and we
recommend that you make an outline of such
a report as the first step in the design of any
project. You should be clear about the kinds
of statements you will want to make when the
research is complete. Here are two examples of
such statements: “x percentage of State U stu-
dents believe that sexual assault is a big problem
on campus,” and “Female students living off
campus are more likely than females living in
dorms to feel that emergency phones should be
installed near buildings where evening classes
are held.” Although your final report may not
look much like your initial image of it, outlin-


You might operationalize fear for personal
safety with the question “How safe do you feel
alone on the campus after dark?” This could be
followed by boxes indicating the possible an-
swers “Safe” and “Unsafe.” Student attitudes
about ways of improving campus safety could
be operationalized with the item “Listed below
are different actions that might be taken to re-
duce violent crime on campus. Beside each de-
scription, indicate whether you favor or oppose
the actions described.” This could be followed
by several different actions, with “Favor” and
“Oppose” boxes beside each.
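
As a rough sketch of what operationalization produces, the personal-safety item above can be written as a small data structure and its answers counted. The question wording and answer choices come from the text; the five answers and the helper name tabulate are invented for illustration.

```python
from collections import Counter

# The item is worded as in the text; the answers below are invented.
safety_item = {
    "question": "How safe do you feel alone on the campus after dark?",
    "choices": ["Safe", "Unsafe"],
}
hypothetical_answers = ["Safe", "Unsafe", "Unsafe", "Safe", "Unsafe"]

def tabulate(item, answers):
    """Count answers per choice, rejecting anything not offered on the form."""
    for answer in answers:
        if answer not in item["choices"]:
            raise ValueError(f"unexpected answer: {answer!r}")
    counts = Counter(answers)
    return {choice: counts[choice] for choice in item["choices"]}

safety_counts = tabulate(safety_item, hypothetical_answers)
percent_unsafe = 100 * safety_counts["Unsafe"] / len(hypothetical_answers)

print(safety_counts)
print(f"{percent_unsafe:.0f}% answered Unsafe")
```

Fixing the answer choices in advance is what makes the concept measurable: every respondent is sorted into one of the categories the researcher defined, and the counts can then be compared across groups of students.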

Population and Sampling
In addition to refining concepts and measure-
ments, decisions must be made about whom
or what to study. The population for a study is
that group (usually of people) about whom we
want to be able to draw conclusions. We are al-
most never able to study all the members of the
population that interests us, however. In vir-
tually every case, we must sample subjects for
study. Chapter 6 describes methods for selecting
samples that adequately reflect the whole pop-
ulation that interests us. Notice in Figure 3.5
that decisions about population and sampling
are related to decisions about the research
method to be used.

In the study of concern about violent crime,
the relevant population is the student popu-
lation of your college. As you’ll discover in
Chapter 6, however, selecting a sample requires
you to get more specific than that. Will you in-
clude part-time as well as full-time students?
Only degree candidates or everyone? Students
who live on campus, off campus, or both?
There are many such questions, and each must
be answered in terms of your research purpose.
If your purpose is to study concern about sex-
ual assault, you might consider limiting your
population to female students. If hate crimes
are of special interest, you will want to be sure
that your study population includes minorities
and others who are thought to be particularly
targeted by hate crimes.

Choice of Research Method
A variety of methods are available to the
criminal justice researcher. Each method has
strengths and weaknesses, and certain concepts
are more appropriately studied by some meth-
ods than by others.

A survey is the most appropriate method for
studying both general concern and fear for per-
sonal safety. You might interview students di-
rectly or ask them to fill out a questionnaire. As
we’ll see in Chapter 7, surveys are especially well
suited to the study of individuals’ attitudes and
opinions. Thus if you wish to examine whether
students who are afraid of crime are more likely
to believe that campus lighting should be im-
proved than students who are not afraid, a sur-
vey is a good method.

Other methods described in Part Three may
be appropriate. Through content analysis (dis-
cussed in Chapter 9), you might examine letters
to the editor in your campus newspaper and
analyze what the writers believe should be done
to improve campus safety. Field research (see
Chapter 8), in which you observe whether stu-
dents tend to avoid dark areas of the campus,
will help you understand student behavior in
avoiding certain areas of the campus at night.
Or you might study official complaints made to
police and college administrators about crime
problems on campus. As you read Part Three,
you’ll see ways other research methods might be
used to study this topic. Usually the best study
design is one that uses more than one research
method, taking advantage of their different
strengths.

Operationalization
Having specified the concepts to be studied and
chosen the research method, you now must de-
velop specific measurement procedures. Opera-
tionalization, discussed in Chapter 4, refers to
the concrete steps, or operations, used to mea-
sure specific concepts.

If you decide to use a survey to study con-
cern about violent crime, your operationaliza-
tion will take the form of questionnaire items.


Application
The final stage of the research process involves
using the research you’ve conducted and the
conclusions you’ve reached. To start, you will
probably want to communicate your findings
so that others will know what you’ve learned.
It may be appropriate to prepare—and even
publish—a written report. Perhaps you will
make oral presentations in class or at a profes-
sional meeting. Or you might create a web page
that presents your results. Other students will
be interested in hearing what you have learned
about their concerns about violent crime on
campus.

Your study might also be used to actually do
something about campus safety. If you find that
a large proportion of students you interviewed
believe that a parking lot near the library is
poorly lighted, university administrators could
add more lights or campus police might patrol
the area more frequently. Crime prevention
programs might be launched in dormitories if
residents are more afraid of violent crime than
students who live in other types of housing.
Students in a Rutgers University class on crime
prevention focused on car thefts and break-ins
surrounding the campus in Newark, New Jer-
sey. Their semester project presented specific
recommendations on how university and city
officials could reduce the problem.

Finally, you should consider what your re-
search suggests with regard to further research
on your subject. What mistakes should be cor-
rected in future studies? What avenues, opened
up slightly in your study, should be pursued in
later investigations?

Research Design in Review
In designing a research project, you will find it
useful to begin by assessing three things: (1) your
interests, (2) your abilities, and (3) the resources
available to you. Each of these considerations
will suggest a number of possible studies.

What are you interested in understanding?
Surely you have several questions about crime
and possible policy responses. Why do some

Observations
Having decided what to study, among whom,
and by what method, you are ready to make
observations—to collect empirical data. The
chapters of Part Three, which describe various
research methods, discuss the different obser-
vation methods appropriate to each.

For a survey of concern about violent crime,
you might prepare an electronic questionnaire
and e-mail it to a sample selected from the stu-
dent body or you could have a team of inter-
viewers conduct the survey over the telephone.
The relative advantages and disadvantages of
these and other possibilities are discussed in
Chapter 7.

Analysis
Finally, we manipulate the collected data for
the purpose of drawing conclusions that reflect
on the interests, ideas, and theories that initi-
ated the inquiry. Chapter 11 describes a few of
the many options available to you in analyz-
ing data. Notice in Figure 3.5 that the results
of your analyses feed back into your initial
interests, ideas, and theories. In practice, this
feedback may initiate another cycle of inquiry.
In the study of student concern about violent
crime, the analysis phase will have both de-
scriptive and explanatory purposes. You might
begin by calculating the percentage of students
who feel afraid to use specific parking facilities
after dark and the percentage who favor or op-
pose each of the different things that might be
done to improve campus safety. Together, these
percentages will provide a good picture of stu-
dent opinion on the issue.

Moving beyond simple description, you
might examine the opinions of different subsets
of the student body: men versus women; fresh-
men, sophomores, juniors, seniors, and gradu-
ate students; and students who live in dorms
versus off-campus apartments. You might then
conduct some explanatory analysis to make the
point that students who are enrolled in classes
that meet in the evening hours are most in fa-
vor of improved campus lighting.
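
The descriptive percentages and subgroup comparisons just described might be computed along the following lines. This is only a sketch: the eight survey records, the field names, and the helper functions `percent` and `percent_by_group` are hypothetical and do not come from any actual study.

```python
# Hypothetical survey records; all fields and values are invented for illustration.
responses = [
    {"sex": "female", "evening_class": True,  "afraid_after_dark": True,  "favors_lighting": True},
    {"sex": "female", "evening_class": False, "afraid_after_dark": True,  "favors_lighting": True},
    {"sex": "female", "evening_class": True,  "afraid_after_dark": False, "favors_lighting": True},
    {"sex": "female", "evening_class": False, "afraid_after_dark": False, "favors_lighting": False},
    {"sex": "male",   "evening_class": True,  "afraid_after_dark": True,  "favors_lighting": True},
    {"sex": "male",   "evening_class": False, "afraid_after_dark": False, "favors_lighting": False},
    {"sex": "male",   "evening_class": True,  "afraid_after_dark": False, "favors_lighting": True},
    {"sex": "male",   "evening_class": False, "afraid_after_dark": False, "favors_lighting": False},
]

def percent(records, key):
    """Percentage of records in which `key` is True (0.0 for an empty list)."""
    if not records:
        return 0.0
    return 100.0 * sum(1 for r in records if r[key]) / len(records)

def percent_by_group(records, group_key, key):
    """Percentage of True answers on `key` within each value of `group_key`."""
    groups = {}
    for r in records:
        groups.setdefault(r[group_key], []).append(r)
    return {group: percent(members, key) for group, members in groups.items()}

# Description: overall picture of student opinion.
print(percent(responses, "afraid_after_dark"))
print(percent(responses, "favors_lighting"))

# Comparing subsets: evening-class students versus everyone else.
print(percent_by_group(responses, "evening_class", "favors_lighting"))
```

A difference between subgroups in such a table is only a starting point for explanation; it shows that the two variables are related in these invented data, not why.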


physical sciences, and it is just as important in
criminal justice research.

The Research Proposal
Research proposals describe planned activities and
include a budget and time line.

If you undertake a research project—an assign-
ment for this course, perhaps, or even a major
study funded by the government or a research
foundation—you will probably have to provide
a research proposal describing what you intend
to accomplish and how. We’ll conclude this
chapter with a discussion of how you might
prepare such a proposal.

Elements of a Research Proposal
Some funding agencies have specific require-
ments for a proposal’s elements, structure, or
both. For example, in its research solicitation
announcements for the 2007 fiscal year, the Na-
tional Institute of Justice (NIJ) describes what
should be included in research proposals on
such topics as terrorism and elder abuse
(www.ojp.usdoj.gov/nij/funding/; accessed May 12,
2008). Your instructor may have certain re-
quirements for a research proposal you are to
prepare in this course. Here are some basic el-
ements that should be included in almost any
research proposal.

Problem or Objective What exactly do you
want to study? Why is it worth studying? Does
the proposed study contribute to our general
understanding of crime or policy responses to
crime? Does it have practical significance? If
your proposal describes an evaluation study,
then the problem, objective, or research ques-
tions may already be specified for you. For ex-
ample, in its request for research on elder abuse
issued in 2006, the NIJ required that proposals
address certain specific items in describing the
impact of proposed research:

1. Potential for significant advances in sci-
entific or technical understanding of the
problem

juvenile gangs sell drugs whereas others steal
cars? Why do particular neighborhoods near
campus seem to have higher rates of burglary?
Do sentencing policies discriminate against mi-
norities? Do cities with gun control laws have
lower murder rates? Is burglary more common
in areas near pawnshops? Are sentences for rape
more severe in some states than in others? Are
mandatory jail sentences more effective than
license suspension in reducing repeat drunk-
driving offenses? Think for a while about the
kinds of questions that interest and concern you.

Once you have a few questions you are inter-
ested in answering, think about the kind of in-
formation you will need to answer them. What
research units of analysis will provide the most
relevant information: gangs, burglary victims,
drunk drivers, households, community groups,
police departments, cities, or states? This ques-
tion should be inseparable from the question of
research topics. Then ask which aspects of the
units of analysis will provide the information
you need to answer your research question.

Your next consideration is how to obtain
that information. Are the relevant data likely
to be already available somewhere (say, in a gov-
ernment publication), or will you have to col-
lect them yourself? If you think you will have
to collect them, how will you do that? Will it
be necessary to observe juvenile gangs, inter-
view a large number of burglary victims, or at-
tend meetings of community crime prevention
groups? Or will you have to design an experi-
ment to study sentences for drunk driving?

As you answer these questions, you are well
into the process of research design. Once you
have a general idea of what you want to study
and how, carefully review previous research in
journals, books, and government reports to see
how other researchers have addressed the topic
and what they have learned about it. Your re-
view of the literature may lead you to revise your
research design; perhaps you will decide to use
another researcher’s method or even replicate
an earlier study. The independent replication of
research projects is a standard procedure in the


you should include a copy in an appendix to
your proposal.

Data Collection Methods How will you ac-
tually collect the data for your study? Will you
observe behavior directly or conduct a survey?
Will you undertake field research, or will you
focus on the reanalysis of data already collected
by others? Criminal justice research often in-
cludes more than one such method.

Analysis Briefly describe the kind of analysis
you plan to conduct. Spell out the purpose and
logic of your analysis. Are you interested in pre-
cise description? Do you intend to explain why
things are the way they are? Will you analyze
the impact of a new program? What possible
explanatory variables will your analysis con-
sider, and how will you know whether you’ve
explained the program impact adequately?

References Be sure to include a list of all ma-
terials you consulted and cited in your proposal.
Formats for citations vary. Your instructor may
specify certain formats, or refer you to specific
style manuals for guidelines on how to cite
books, articles, and web-based resources.

Schedule It is often appropriate to provide
a schedule for the various stages of research.
Even if you don’t do this for the proposal, do
it for yourself. If you don’t have a time line for
accomplishing the stages of research and keep-
ing track of how you’re doing, you may end up
in trouble.

Budget If you are asking someone to give
you money to pay the costs of your research,
you will need to provide a budget that speci-
fies where the money will go. Large, expensive
projects include budgetary categories such as
personnel, equipment, supplies, and expenses
(such as travel, copying, and printing). Even for
a more modest project you will pay for yourself,
it’s a good idea to spend some time anticipat-
ing any expenses involved: office supplies, pho-
tocopying, computer disks, telephone calls,
transportation, and so on.

2. Potential for significant advances in the
field

3. Relevance for improving the policy and
practice of criminal justice and related agen-
cies and improving public safety, security,
and the quality of life (National Institute of
Justice 2006, 8)

Literature Review What have others said
about this topic? What theories address it, and
what do they say? What research has been done?
Are the findings consistent, or do past studies
disagree? Are there flaws in the body of existing
research that you feel you can remedy?

Research Questions What specific questions
will your research try to answer? Given what
others have found, as stated in your literature
review, what new information do you expect to
find? It’s useful to view research questions as a
more specific version of the problem or objec-
tive described earlier. Then, of course, your spe-
cific questions should be framed in the context
of what other research has found.

Subjects for Study Whom or what will you
study in order to collect data? Identify the
subjects in general terms, and then specifically
identify who (or what) is available for study
and how you will reach them. Is it appropriate
to select a sample? If so, how will you do that?
If there is any possibility that your research will
have an impact on those you study, how will
you ensure that they are not harmed by the
research? Finally, if you will be interacting di-
rectly with human subjects, you will probably
have to include a consent form (as described in
Chapter 2) in an appendix to your proposal.

Measurement What are the key variables in
your study? How will you define and measure
them? Do your definitions and measurement
methods duplicate (that’s okay, incidentally) or
differ from those of previous research on this
topic? If you have already developed your mea-
surement device (such as a questionnaire) or if
you are using something developed by others,


but it may also be a group, organization, or so-
cial artifact.

• Researchers sometimes confuse units of analy-
sis, resulting in the ecological fallacy or the in-
dividualistic fallacy.

• Cross-sectional studies are those based on ob-
servations made at one time. Although such
studies are limited by this characteristic, infer-
ences can often be made about processes that
occur over time.

• Longitudinal studies are those in which obser-
vations are made at many times. Such obser-
vations may be made of samples drawn from
general populations (trend studies), samples
drawn from more specific subpopulations (co-
hort studies), or the same sample of people each
time (panel studies).

• Retrospective studies can sometimes approxi-
mate longitudinal studies, but retrospective ap-
proaches must be used with care.

• The research process is flexible, involving differ-
ent steps that are best considered together. The
process usually begins with some general inter-
est or idea.

• A research proposal provides an overview of
why a study will be undertaken and how it will
be conducted. It is a useful device for planning
and is required in some circumstances.

✪ Key Terms

As you can see, if you are interested in con-
ducting a criminal justice research project, it is
a good idea to prepare a research proposal for
your own purposes, even if you aren’t required
to do so by your instructor or a funding agency.
If you are going to invest your time and energy
in such a project, you should do what you can
to ensure a return on that investment.

✪ Answers to the Units-of-Analysis
Exercise

1. Social artifacts (alcohol-related fatal crashes)
2. Groups (countries)
3. Individuals (probationers)
4. Social artifacts (court cases)
5. Organizations (police substations)

✪ Main Points
• Explanatory scientific research centers on the
notion of cause and effect.

• Most explanatory social research uses a proba-
bilistic model of causation. X may be said to
cause Y if it is seen to have some influence on Y.

• X is a necessary cause of Y if Y cannot happen
without X having happened. X is a sufficient
cause of Y if Y always happens when X happens.

• Three basic requirements determine a causal
relationship in scientific research: (1) the inde-
pendent variable must occur before the depen-
dent variable, (2) the independent and depen-
dent variables must be empirically related to
each other, and (3) the observed relationship
cannot be explained away as the effect of an-
other variable.

• When scientists consider whether causal state-
ments are true or false, they are concerned with
the validity of causal inference.

• Four classes of threats to validity correspond to
the types of questions researchers ask in trying
to establish cause and effect. Threats to statis-
tical conclusion validity and internal validity
arise from bias. Construct and external validity
threats may limit our ability to generalize from
an observed relationship.

• A scientific realist approach to examining
mechanisms in context bridges idiographic and
nomothetic approaches to causation.

• Units of analysis are the people or things whose
characteristics researchers observe, describe,
and explain. The unit of analysis in criminal
justice research is often the individual person,

cohort study, p. 66
conceptualization, p. 73
construct validity, p. 56
cross-sectional study, p. 66
ecological fallacy, p. 63
external validity, p. 55
internal validity, p. 55
longitudinal study, p. 66
operationalization, p. 74
panel study, p. 67
probabilistic, p. 52
prospective, p. 68
retrospective research, p. 67
scientific realism, p. 60
statistical conclusion validity, p. 53
trend study, p. 66
units of analysis, p. 61
validity, p. 53
validity threats, p. 53

✪ Review Questions and Exercises
1. Discuss one of the following statements in

terms of what you have learned about the cri-


search on Crime,” Criminology 25 (1987), pp.
581–614. Two other highly respected crimi-
nologists point to some of the shortcomings of
longitudinal studies.

Maxwell, Joseph A., Qualitative Research Design: An
Interactive Approach, 2nd ed. (Thousand Oaks,
CA: Sage, 2005). Despite the word qualitative
in the title, this book offers excellent advice in
progressing from general interests or thoughts
to more specific plans for actual research. Each
chapter concludes with exercises that incremen-
tally help readers develop research plans.

Pawson, Ray, and Nick Tilley, Realistic Evaluation
(Thousand Oaks, CA: Sage, 1997). The authors
propose an alternative way of thinking about
cause, in the context of what they call “scientific
realism.” Although they criticize traditional so-
cial science approaches to inferring cause, Paw-
son and Tilley supplement the classic insights
of Cook and Campbell.

Sampson, Robert J., and John H. Laub, Crime in the
Making: Pathways and Turning Points Through Life
(Cambridge, MA: Harvard University Press,
1993). John H. Laub and Robert J. Sampson,
Shared Beginnings, Divergent Lives: Delinquent Boys
to Age 70 (Cambridge, MA: Harvard University
Press, 2003). The highly acclaimed research de-
scribed in these two volumes illustrates the lon-
gitudinal approach to explanatory research, be-
ginning with juveniles and following their lives
through age 70. Sampson and Laub are also
attentive to possible validity threats to their
findings.

Shadish, William R., Thomas D. Cook, and Donald
T. Campbell, Experimental and Quasi-Experimental
Designs for Generalized Causal Inference (Boston:
Houghton Mifflin, 2002). A recent update to a
classic, this book is close to a definitive discus-
sion of cause, validity threats, experiments, and
generalizing from research. The authors move
far beyond the earlier edition, but somehow the
book is more accessible. See especially Chapters
1 through 3 and Chapter 11.

teria of causation and threats to the validity of
causal inference. What cause-and-effect rela-
tionships are implied? What are some alterna-
tive explanations?

a. Guns don’t kill people; people kill people.
b. Capital punishment prevents murder.
c. Marijuana is a gateway drug that leads to
the use of other drugs.
2. Several times, we have discussed the relation-
ship between drug use and crime. Describe the
conditions for each of the following that would
lead us to conclude that drug use is:

a. A necessary cause
b. A sufficient cause
c. A necessary and sufficient cause
3. In describing different approaches to the time
dimension, criminologist Lawrence Sherman
(1995) claimed that cross-sectional studies can
show differences and that longitudinal studies
can show change. How does this statement re-
late to the three criteria for inferring causation?

4. William Julius Wilson (1996, 167) cites the fol-
lowing example of why it’s important to think
carefully about units and time. Imagine a 13-
bed hospital, in which 12 beds are occupied by
the same 12 people for one year. The other hos-
pital bed is occupied by 52 people, each staying
one week. At any given time, 92 percent of beds
are occupied by long-term patients (12 out of
13), but over the entire year, 81 percent of pa-
tients are short-term patients (52 out of 64).
Discuss the implications of a similar example,
using jail cells instead of hospital beds.
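
The arithmetic behind the hospital example in exercise 4 can be checked with a short Python sketch. The bed counts and the 52 one-week stays come from the example itself; the variable names are our own.

```python
# Figures from the example: 13 beds, 12 held by the same people all year,
# and 1 bed that turns over every week.
long_term_beds = 12
short_term_beds = 1
weeks_per_year = 52

# Counting beds at any one moment (a point-in-time view).
total_beds = long_term_beds + short_term_beds
share_of_beds_long_term = long_term_beds / total_beds

# Counting patients over the whole year (a full-year view).
long_term_patients = long_term_beds                      # one person per bed, all year
short_term_patients = short_term_beds * weeks_per_year   # a new person each week
total_patients = long_term_patients + short_term_patients
share_of_patients_short_term = short_term_patients / total_patients

print(f"{share_of_beds_long_term:.0%} of beds hold long-term patients")
print(f"{share_of_patients_short_term:.0%} of patients are short-term")
```

The two printed figures match the 92 percent and 81 percent cited in the exercise. Both describe the same hospital; they differ only because one counts beds at a moment in time and the other counts people over a year.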

✪ Additional Readings
Farrington, David P., Lloyd E. Ohlin, and James Q.
Wilson, Understanding and Controlling Crime: To-
ward a New Research Strategy (New York: Springer-
Verlag, 1986). Three highly respected criminol-
ogists describe the advantages of longitudinal
studies and policy experiments for criminal jus-
tice research. The book also presents a research
agenda for studying the causes of crime and the
effectiveness of policy responses.

Gottfredson, Michael R., and Travis Hirschi, “The
Methodological Adequacy of Longitudinal Re-


Chapter 4

Concepts, Operationalization,
and Measurement
It’s essential to specify exactly what we mean (and don’t mean) by the terms
we use. This is the first step in the measurement process, and we’ll cover it
in depth.

Introduction
Conceptions and Concepts
Conceptualization
Indicators and Dimensions
WHAT IS RECIDIVISM?
Creating Conceptual Order
Operationalization Choices
Measurement as Scoring
JAIL STAY
Exhaustive and Exclusive Measurement
Levels of Measurement
Implications of Levels of Measurement
Criteria for Measurement Quality
Reliability
Validity
Measuring Crime
General Issues in Measuring Crime
UNITS OF ANALYSIS AND MEASURING CRIME
Measures Based on Crimes Known to Police

Introduction
Because measurement is difficult and imprecise, re-
searchers try to describe the measurement process
explicitly.

This chapter describes the progression from
having a vague idea about what we want to
study to being able to recognize it and measure
it in the real world. We begin with the general
issue of conceptualization, which sets up a
foundation for our examination of operation-
alization and measurement. We then turn to
different approaches to assessing measurement
quality. The chapter concludes with an overview
of strategies for combining individual measures
into more complex indicators.

As you read this chapter, keep in mind a cen-
tral theme: communication. Ultimately, crimi-
nal justice and social scientific research seek to
communicate findings to an audience, such as
professors, classmates, journal readers, or co-
workers in a probation services agency. Moving
from vague ideas and interests to a completed
research report, as we described in Chapter 3,
involves communication at every step—from
general ideas to more precise definitions of
critical terms. With more precise definitions,
we can begin to develop measures to apply in
the real world.

Conceptions and Concepts
Clarifying abstract mental images is an essential
first step in measurement.

If you hear the word recidivism, what image
comes to mind? You might think of someone
who has served time for burglary and who
breaks into a house soon after being released
from prison. Or, in contrast to that rather
specific image, you might have a more general
image of a habitual criminal. Someone who
works in a criminal justice agency might have
a different mental image. Police officers might
think of a specific individual they have arrested
repeatedly for a variety of offenses, and a judge
might think of a defendant who has three prior
convictions for theft.

Ultimately, recidivism is simply a term we use
in communication—a word representing a col-
lection of related phenomena that we have ei-
ther observed or heard about somewhere. It’s as
though we have file drawers in our minds con-
taining thousands of sheets of paper, and each
sheet has a label in the upper right-hand cor-
ner. One sheet of paper in your file drawer has
the term recidivism on it, and the person who
sits next to you in class has one, too.

The technical name for those mental images,
those sheets of paper in our file drawers, is con-
ception. Each sheet of paper is a conception—
a subjective thought about things that we en-
counter in daily life. But those mental images

Victim Surveys
Surveys of Offending
Measuring Crime Summary
Composite Measures
Typologies
An Index of Disorder
Measurement Summary


in pursuit of self-interest—is abstract. Crime is
the symbol, or label, they have assigned to this
concept.

Let’s discuss a specific example. What is
your conception of serious crime? What mental
images come to mind? Most people agree that
airplane hijacking, rape, bank robbery, and
murder are serious crimes. What about a physi-
cal assault that results in a concussion and fa-
cial injuries? Many of us would classify it as a
serious crime but not if the incident took place
in a boxing ring. Is burglary a serious crime? It
doesn’t rank up there with drive-by shooting,
but we would probably agree that it is more se-
rious than shoplifting. What about drug use or
drug dealing?

Our mental images of serious crime may
vary depending on our backgrounds and expe-
riences. If your home has ever been burglarized,
you might be more inclined than someone who
has not suffered that experience to rate it as a
serious crime. If you have been both burglar-
ized and robbed at gunpoint, you would prob-
ably think the burglary was less serious than
the robbery. There is much disagreement over
the seriousness of drug use. Younger people,
whether or not they have used drugs, may be
less inclined to view drug use as a serious crime,
whereas police and other public officials might
rank drug use as very serious. California and
Oregon are among states that have legalized the
use of marijuana for medical purposes. How-
ever, as of 2006 the U.S. Department of Justice
views all marijuana use as a crime, challenging
state laws and raiding San Francisco medical
marijuana dispensaries (Murphy 2005).

Serious crime is an abstraction, a label we
use to represent a concept. However, we must
be careful to distinguish the label we use for a
concept from the reality that the concept repre-
sents. There are real robberies, and robbery is a
serious crime, but the concept of crime serious-
ness is not real.

The concept of serious crime, then, is a con-
struct created from your conception of it, our

cannot be communicated directly. There is no
way we can directly reveal what’s written on our
mental images. Therefore we use the terms writ-
ten in the upper right-hand corners as a way of
communicating about our conceptions and
the things we observe that are related to those
conceptions.

For example, the word crime represents our
conception about certain kinds of behavior.
But individuals have different conceptions; they
may think of different kinds of behavior when
they hear the word crime. Police officers in most
states would include possession of marijuana
among their conceptions of crime, whereas
members of the advocacy group National Or-
ganization for the Reform of Marijuana Laws
(NORML) would not. Recent burglary victims
might recall their own experiences in their
conceptions of crime, whereas more fortunate
neighbors might think about the murder story
in yesterday’s newspaper.

Because conceptions are subjective and can-
not be communicated directly, we use the words
and symbols of language as a way of communi-
cating about our conceptions and the things we
observe that are related to those conceptions.

Concepts are the words or symbols in lan-
guage that we use to represent these mental im-
ages. We use concepts to communicate with one
another, to share our mental images. Although
a common language enables us to communi-
cate, it is important to recognize that the words
and phrases we use represent abstractions.
Concepts are abstract because they are indepen-
dent of the labels we assign to them. Crime as a
concept is abstract, meaning that in the English
language this label represents mental images
of illegal acts. Of course, actual crimes are real
events, and our mental images of crime may be
based on real events (or the stuff of TV drama).
However, when we talk about crime, without
being more specific, we are talking about an
abstraction. Thus, for example, the concept of
crime proposed by Michael Gottfredson and
Travis Hirschi (1990, 15)—using force or fraud


the theft of unattended personal property such
as bicycles are examples of nonviolent crimes.
Assault, rape, robbery, and murder are violent
crimes.
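
One way to see how a working definition removes ambiguity is to write it as an explicit rule. The Python sketch below encodes the distinction drawn above: a crime counts as violent if the offender uses force or threatens force against a victim. The `Incident` record and its field names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class Incident:
    # Hypothetical record layout, invented for this illustration.
    label: str
    victim_contact: bool      # did offender and victim come into direct contact?
    force_used: bool
    force_threatened: bool

def is_violent(incident: Incident) -> bool:
    """Working definition from the text: force or threat of force against a victim."""
    return incident.force_used or incident.force_threatened

incidents = [
    Incident("pickpocketing", victim_contact=True, force_used=False, force_threatened=False),
    Incident("robbery", victim_contact=True, force_used=False, force_threatened=True),
    Incident("burglary", victim_contact=False, force_used=False, force_threatened=False),
]

for inc in incidents:
    kind = "violent" if is_violent(inc) else "nonviolent"
    print(f"{inc.label}: {kind}")
```

Notice that `victim_contact` plays no part in the rule. Under this working agreement, contact alone does not make a crime violent, which is why pickpocketing falls on the nonviolent side.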

Indicators and Dimensions
The end product of the conceptualization pro-
cess is the specification of a set of indicators of
what we have in mind, indicating the presence
or absence of the concept we are studying. To il-
lustrate this process, let’s discuss the more gen-
eral concept of crime seriousness. This concept
is more general than serious crime because it
implies that some crimes are more serious than
others.

One good indicator of crime seriousness is
harm to the crime victim. Physical injury is an
example of harm, and physical injury is cer-
tainly more likely to result from violent crime
than from nonviolent crime. What about other
kinds of harm? Burglary victims suffer eco-
nomic harm from property loss and perhaps
damage to their homes. Is the loss of $800 in
a burglary an indicator of more serious crime
than a $10 loss in a robbery in which the victim
was not injured? Victims of both violent crime
and nonviolent crime may suffer psychological
harm. Or people might feel a sense of personal
violation after discovering that their home has
been burglarized. Other types of victim harm
can be combined into groups and subgroups
as well.

The technical term for such groupings is
dimension—some specifiable aspect of a con-
cept. Thus we might speak of the “victim harm
dimension” of crime seriousness. This dimen-
sion could include indicators of physical injury,
economic loss, or psychological consequences.
And we can easily think of other indicators and
dimensions related to the general concept of
crime seriousness. If we consider the theft of
$20 from a poor person to be more serious than
the theft of $2,000 from a wealthy oil company
chief executive officer, victim wealth might
be another dimension. Also consider a victim

conception of it, and the conceptions of all
those who have ever used the term. The concept
of serious crime cannot be observed directly or
indirectly. We can, however, meaningfully dis-
cuss the concept, observe examples of serious
crime, and measure it indirectly.

Conceptualization
Day-to-day communication is made possible
through general but often vague and unspo-
ken agreements about the use of terms. Usually
other people do not understand exactly what
we wish to communicate, but they get the gen-
eral drift of our meaning. Although we may not
fully agree about the meaning of the term seri-
ous crime, it’s safe to assume that the crime of
bank robbery is more serious than the crime of
bicycle theft. A wide range of misunderstand-
ings is the price we pay for our imprecision, but
somehow we muddle through. Science, how-
ever, aims at more than muddling, and it can-
not operate in a context of such imprecision.

Conceptualization is the process by which
we specify precisely what we mean when we
use particular terms. Suppose we want to find
out whether violent crime is more serious than
nonviolent crime. Most of us would probably
assume that is true, but it might be interest-
ing to find out whether it’s really so. Notice
that we can’t meaningfully study the issue, let
alone agree on the answer, without some pre-
cise working agreements about the meanings of
the terms we are using. They are working agree-
ments in the sense that they allow us to work
on the question.

We begin by clearly differentiating violent
and nonviolent crime. In violent crimes, an
offender uses force or threats of force against
a victim. Nonviolent crimes either do not in-
volve any direct contact between a victim and
an offender or involve contact but no force. For
example, pickpockets have direct contact with
their victims but use no force. In contrast, rob-
bery involves at least the threat to use force on
victims. Burglary, auto theft, shoplifting, and

84 Part Two Structuring Criminal Justice Inquiry

crime is more serious than nonviolent crime in
all cases.

Creating Conceptual Order
The design and execution of criminal justice re-
search requires that we clear away the confu-
sion over concepts and reality. To this end,
logicians and scientists have found it useful
to distinguish three kinds of definitions: real,
conceptual, and operational. With respect to
the first of these, Carl G. Hempel (1952, 6) has
cautioned:

A “real” definition, according to traditional
logic, is not a stipulation determining the
meaning of some expression but a state-
ment of the “essential nature” or the “essen-
tial attributes” of some entity. The notion

identity dimension. Killing a burglar in self-
defense would not be as serious as threatening
to kill the president of the United States.

It is possible to subdivide the concept of
crime seriousness into several dimensions. Spec-
ifying dimensions and identifying the various
indicators for each of those dimensions are both
parts of conceptualization.

Specifying the different dimensions of a
concept often paves the way for a better under-
standing of what we are studying. We might
observe that fistfights among high school students
result in thousands of injuries per year
but that the annual costs of auto theft cause di-
rect economic harm to hundreds of insurance
companies and millions of auto insurance poli-
cyholders. Recognizing the many dimensions
of crime seriousness, we cannot say that violent

WHAT IS RECIDIVISM?
Tony Fabelo

The Senate Criminal Justice Committee will be
studying the record of the corrections system
and the use of recidivism rates as a measure of
performance for the system. The first task for the
committee should be to clearly define recidivism,
understand how it is measured, and determine
the implications of adopting recidivism rates as
measures of performance.

Defining Recidivism
Recidivism is the recurrence of criminal behavior.
The rate of recidivism refers to the proportion of
a specific group of offenders (for example, those
released on parole) who engage in criminal be-
havior within a given period of time. Indicators of
criminal behavior are rearrests, reconvictions, or
reincarcerations.

Each of these indicators depends on contact
with criminal justice officials and will therefore
underestimate the recurrence of criminal behavior.
However, criminal behavior that is unreported
and not otherwise known to officials in
justice agencies is difficult to measure in a consistent
and economically feasible fashion.

In 1991, the Criminal Justice Policy Council
recommended to the Texas legislature and state
criminal justice agencies that recidivism be mea-
sured in the following way:

Recidivism rates should be calculated by
counting the number of prison releases or
number of offenders placed under commu-
nity supervision who are reincarcerated for
a technical violation or new offense within
a uniform period of at-risk street time.
The at-risk street time can be one, two,
or three years, but it must be uniform for
the group being tracked so that results are
not distorted by uneven at-risk periods.
Reincarceration should be measured
using data from the “rap sheets” collected
by the Texas Department of Public Safety
in their Computerized Criminal History
system. A centralized source of informa-
tion reduces reporting errors.
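
The uniform at-risk rule can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function name recidivism_rate, the record layout, and the four release records are hypothetical and are not drawn from Texas data. The sketch counts a person only if he or she has been on the street for the full at-risk period, so that everyone in the group is observed for the same length of time.

```python
from datetime import date, timedelta

def recidivism_rate(releases, at_risk_days, as_of):
    """Share of releasees reincarcerated within a uniform at-risk window.

    releases: list of (release_date, reincarceration_date or None).
    People released too recently to be observed for the whole window
    are left out, so uneven at-risk periods cannot distort the rate.
    """
    window = timedelta(days=at_risk_days)
    eligible = [(rel, back) for rel, back in releases if rel + window <= as_of]
    if not eligible:
        return None
    failures = sum(
        1 for rel, back in eligible
        if back is not None and back - rel <= window
    )
    return failures / len(eligible)

# Hypothetical records for illustration only (not real data)
cohort = [
    (date(1991, 1, 15), date(1992, 3, 1)),    # returned after about 13.5 months
    (date(1991, 6, 1), None),                 # never returned
    (date(1991, 9, 10), date(1991, 12, 20)),  # returned within a year
    (date(1991, 11, 5), date(1994, 12, 1)),   # returned, but after three years
]

one_year = recidivism_rate(cohort, 365, date(1995, 1, 1))
three_year = recidivism_rate(cohort, 3 * 365, date(1995, 1, 1))
print(one_year, three_year)  # 0.25 0.5
```

The same four people yield different rates for one-year and three-year windows, which is why the period must be stated and held constant for the group being tracked.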

Systemwide Recidivism Rates
Recidivism rates can be reported for all offend-
ers in the system—for all offenders released from

Chapter 4 Concepts, Operationalization, and Measurement 85

pational status, money in the bank, property,
lifestyle, and so forth.

The specification of conceptual definitions
does two important things. First, it serves as a
specific working definition we present so that
readers will understand exactly what we mean
by a concept. Second, it focuses our observational
strategy. Notice that a conceptual definition
does not directly produce observations;
rather, it channels our efforts to develop actual
measures.

As a next step, we must specify exactly what
we will observe, how we will do it, and what
interpretations we will place on various possible
observations. These further specifications
make up the operational definition of the
concept—a definition that spells out precisely
how the concept will be measured. Strictly

of essential nature, however, is so vague as
to render this characterization useless for
the purposes of rigorous inquiry.

A real or essential nature definition is inherently
subjective. The specification of concepts
in scientific inquiry depends instead on conceptual
and operational definitions. A conceptual
definition is a working definition specifically
assigned to a term. In the midst of disagreement
and confusion over what a term really
means, the scientist specifies a working definition
for the purposes of the inquiry. Wishing to
examine socioeconomic status (SES), we may
simply specify that we are going to treat it as a
combination of income and educational attainment.
With that definitional decision, we rule
out many other possible aspects of SES: occu-

prison or for all offenders placed on probation.
This I call systemwide recidivism rates. Approxi-
mately 48 percent of offenders released from
prison on parole or mandatory supervision, or re-
leased from county jails on parole, in 1991 were
reincarcerated by 1994 for a new offense or a pa-
role violation.

For offenders released from prison in 1991
the reincarceration recidivism rates three years
after release from prison by offense of conviction
are listed below:

Burglary   56%     Assault          44%
Robbery    54%     Homicide         40%
Theft      52%     Sexual assault   39%
Drugs      43%     Sex offense      34%

For the same group, the reincarceration recidi-
vism rate three years after release by age group is
listed below:

17–25         56%
26–30         52%
31–35         48%
36–40         46%
41 or older   35%

The Meaning of Systemwide
Recidivism Rates
The systemwide recidivism rate of prison releases
should not be used to measure the performance
of institutional programs. There are many socio-
economic factors that can affect systemwide re-
cidivism rates.

For example, the systemwide recidivism rate of
offenders released from prison in 1995 declined
because of changes in the characteristics of the
population released from prison. Offenders are
receiving and serving longer sentences, which will
raise the average age at release. Therefore per-
formance in terms of systemwide recidivism will
improve but not necessarily because of improve-
ments in the delivery of services within the prison
system.

On the other hand, the systemwide recidivism
rate of felons released from state jail facilities
should be expected to be relatively high, because
state jail felons are property and drug offenders
who tend to have high recidivism rates.


To test your understanding of these measure-
ment steps, return to the beginning of the
chapter, where we asked you what image comes
to mind in connection with the word recidivism.
Recall your own mental image, and compare it
with Tony Fabelo’s discussion in the box titled
“What Is Recidivism?”

Operationalization Choices
Describing how to obtain empirical measures begins
with operationalization.

Recall from Chapter 3 that the research process
is not usually a set of steps that proceed in order
from first to last. This is especially true of
operationalization, the process of developing
operational definitions. Although we begin by
conceptualizing what we wish to study, once
we start to consider operationalization, we may
revise our conceptual definition. Developing
an operational definition also moves us closer
to measurement, which requires that we think
about selecting a data collection method as well.
In other words, operationalization does not
proceed according to a systematic checklist.

To illustrate this fluid process, let’s return to
the issue of crime seriousness. Suppose we want
to conduct a descriptive study that shows which
crimes are more serious and which crimes are
less serious.

One obvious dimension of crime seriousness
is the penalties that are assigned to different
crimes by law. Let’s begin with this conceptualization.
Our conceptual definition of crime
seriousness is therefore the level of punishment
that a state criminal code authorizes for different
crimes. Notice that this definition has
the distinct advantage of being unambiguous,
which leads us to an operational definition
something like this:

Consult the Texas Criminal Code. (1) Those
crimes that may be punished by death will
be judged most serious. (2) Next will be
crimes that may be punished by a prison
sentence of more than one year. (3) The

speaking, an operational definition is a description
of the operations undertaken in measuring
a concept.

Pursuing the definition of SES, we might
decide to ask the people we are studying three
questions:

1. What was your total household income during
   the past 12 months?
2. How many persons are in your household?
3. What is the highest level of school you have
   completed?

Next, we need to specify a system for catego-
rizing the answers people give us. For income, we
might use the categories “under $25,000” and
“$25,000–$35,000.” Educational attainment
might be similarly grouped into categories,
and we might simply count the number of peo-
ple in each household. Finally, we need to spec-
ify a way to combine each person’s responses
to these three questions to create a measure
of SES.
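
To see what such a specification might look like, consider the following Python sketch. It is a minimal illustration, not a recommended SES index: only the first two income labels come from the discussion above, while the top income category, the education categories, the point values, and the household-size adjustment are all assumptions introduced here.

```python
# Hypothetical coding scheme; only the first two income labels come from the text.
INCOME_POINTS = {"under $25,000": 0, "$25,000-$35,000": 1, "over $35,000": 2}
EDUCATION_POINTS = {"less than high school": 0, "high school": 1, "college": 2}

def income_category(household_income):
    """Assign a household income to one and only one category."""
    if household_income < 25_000:
        return "under $25,000"
    if household_income <= 35_000:
        return "$25,000-$35,000"
    return "over $35,000"  # assumed top category so the scheme is exhaustive

def ses_score(household_income, household_size, education):
    """Combine the three answers into a single score (assumed rule)."""
    score = (INCOME_POINTS[income_category(household_income)]
             + EDUCATION_POINTS[education])
    if household_size >= 5:  # assumed adjustment: same income, more people
        score -= 1
    return max(score, 0)

print(ses_score(30_000, 2, "college"))      # 1 + 2 = 3
print(ses_score(20_000, 6, "high school"))  # 0 + 1 - 1 = 0
```

Someone else could defend different categories or a different combination rule. The point is that once the rule is written down, anyone can tell exactly how each score was produced.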

The end result is a working and workable
definition of SES. Others might disagree with
our conceptualization and operationalization,
but the definition has one essential scientific
virtue: it is absolutely specific and unambiguous.
Even if someone disagrees with our definition,
that person will have a good idea of how
to interpret our research results because what
we mean by SES—reflected in our analyses and
conclusions—is clear.

Here is a diagram showing the progression
of measurement steps from our vague sense of
what a term means to specific measurements in
a scientific study:

Conceptualization
        ↓
Conceptual definition
        ↓
Operational definition
        ↓
Measurements in the real world


resent how much time you will have to spend
studying. The American Bar Association rates
nominees to the U.S. Supreme Court as qualified,
highly qualified, or not qualified. You
might rank last night’s date on the proverbial
scale of 1 to 10, reflecting whatever conceptual
properties are important to you.

Measurement as Scoring
Another way to think of measurement is in
terms of scoring. Your instructor scores ex-
ams by counting the right answers and assign-
ing some point value to each answer. Referees
keep score at basketball games by counting the
number of one-point free throws and two- and
three-point field goals for each team. Judges
or juries score persons charged with crime by
pronouncing “guilty” or “not guilty.” City mur-
der rates are scored by counting the number of
murder victims and dividing by the number of
city residents.

Many people consider measurement to be
the most important and difficult phase of
criminal justice research. It is difficult, in part,
because so many basic concepts in criminal justice
are not easy to define as specifically as we
would like. Without being able to settle on a
conceptual definition, we find operationalizing
and measuring things challenging. This is illustrated
by the box titled “Jail Stay.”

Operationalization is not only challenging;
different operationalization choices can also
produce different results. In the box titled
“What Is Recidivism?” Tony Fabelo argues that the at-risk
period for comparing recidivism for different
groups of offenders should be uniform. It’s pos-
sible to examine one-, two-, or three-year rates,
but comparisons should use standard at-risk
periods. Varying the at-risk period produces, as
we might expect, differences in recidivism rates.
Evaluating a Texas program that provided drug
abuse treatment, Michael Eisenberg (1999, 8)
reports rates for different at-risk periods:

                   1-Year   2-Year   3-Year
All participants    14%      37%      42%

least serious crimes are those with jail sentences
of less than a year, fines, or both.

The operations undertaken to measure crime
seriousness are specific. Our data collection
strategy is also clear: go to the library, make a list
of crimes described in the Texas Code, and clas-
sify each crime into one of the three groups.
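
The three-group rule is specific enough to write down as a procedure. The Python sketch below is a hedged illustration: the function name seriousness_group and the four penalty entries are hypothetical and are not taken from the Texas Code. Writing the rule out also exposes a gap in it. A maximum sentence of exactly one year is neither "more than one year" nor "less than a year," so the sketch assigns that case to the third group by assumption.

```python
def seriousness_group(death_eligible, max_confinement_months):
    """Classify a crime by its maximum authorized penalty.

    1 = may be punished by death
    2 = may be punished by more than one year in prison
    3 = jail of less than a year, fines, or both
    """
    if death_eligible:
        return 1
    if max_confinement_months > 12:
        return 2
    # Assumption: exactly 12 months is not covered by the stated rule,
    # so it is placed here to keep the categories exhaustive.
    return 3

# Hypothetical penalty entries for illustration; not taken from the Texas Code
penalties = {
    "offense A": (True, 0),     # confinement is ignored when death is authorized
    "offense B": (False, 240),
    "offense C": (False, 6),
    "offense D": (False, 0),    # fine only
}

groups = {name: seriousness_group(*p) for name, p in penalties.items()}
print(groups)
```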

Note that we have produced rather narrow
conceptual and operational definitions of crime
seriousness. We might presume that penalties
in the Texas Code take into account additional
dimensions such as victim harm, offender mo-
tivation, and other circumstances of individual
crimes. However, the three groups of crimes in-
clude very different types of incidents and so do
not tell us much about crime seriousness.

An alternative conceptualization of crime se-
riousness might center on what people think of
as serious crime. In this view, crime seriousness
is based on people’s beliefs, which may reflect
their perceptions of harm to victims, offender
motivation, or other dimensions. Conceptual-
izing crime seriousness in this way suggests a
different approach to operationalization: you
will present descriptions of various crimes to
other students in your class and ask them to in-
dicate how serious they believe the crimes are.
If crime seriousness is operationalized in this
way, a questionnaire is the most appropriate
data collection method.

Operationalization involves describing how
actual measurements will be made. The next
step, of course, is making the measurements.
Royce Singleton and associates (Singleton,
Straits, and Straits 2005, 100) define measurement
as “the process of assigning numbers or
labels to units of analysis in order to represent
conceptual properties. This process should be
quite familiar to the reader even if the definition
is not.”

Think of some examples of the process.
Your instructor assigns number or letter grades
to exams and papers to represent your mastery
of course material. You count the number of
pages in this week’s history assignment to rep-


ties such as employed part-time, employed full-
time, and retired.

Every variable should have two important
qualities. First, the attributes composing it
should be exhaustive. If the variable is to have
any utility in research, researchers must be able
to classify every observation in terms of one of
the attributes composing the variable. We will
run into trouble if we conceptualize the vari-
able sentence in terms of the attributes prison
and fine. After all, some convicted persons are
assigned to probation, some have a portion of
their prison sentence suspended, and others
may receive a mix of prison term, probation,
suspended sentence, or perhaps community
service. Notice that we could make the list of
attributes exhaustive by adding other and combi-
nation. Whatever approach we take, we must be
able to classify every observation.

At the same time, attributes composing a
variable must be mutually exclusive. Research-
ers must be able to classify every observation

Note that the difference between one- and two-
year rates is much larger than that between two-
and three-year rates. Operationalizing “recidi-
vism” as a one-year failure rate would be much
less accurate than operationalizing the concept
as a two-year rate, because recidivism rates seem
to stabilize at the two-year point.
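
The arithmetic behind this observation is simple enough to check directly. The short Python sketch below uses only the three percentages reported for all participants; the variable names are ours.

```python
# Percent reincarcerated at each at-risk period (all participants)
rates = {1: 14, 2: 37, 3: 42}

first_to_second = rates[2] - rates[1]   # change from one-year to two-year rate
second_to_third = rates[3] - rates[2]   # change from two-year to three-year rate

print(first_to_second, second_to_third)  # 23 5
```

The rate rises by 23 percentage points between the first and second year but by only 5 points between the second and third.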

Exhaustive and
Exclusive Measurement
Briefly revisiting terms introduced in Chapter 1,
an attribute is a characteristic or quality of some-
thing. Female is an example, as are old and stu-
dent. Variables, in contrast, are logical sets of
attributes. Thus, gender is a variable composed
of the attributes female and male. The conceptu-
alization and operationalization processes can
be seen as the specification of variables and the
attributes composing them. Thus, employment
status is a variable that has the attributes em-
ployed and unemployed, or the list of attributes
could be expanded to include other possibili-

JAIL STAY

Recall from Chapter 1 that two of the
general purposes of research are description
and explanation. The distinction between
them has important implications for the
process of definition and measurement. If you
have formed the opinion that description is a
simpler task than explanation, you may be surprised
to learn that definitions can be more problematic
for descriptive research than for explanatory
research. To illustrate this, we present an
example based on an attempt by one of the au-
thors to describe what he thought was a simple
concept.

In the course of an evaluation project, Maxfield
wished to learn the average number of days
people stayed in the Marion County (Indiana)
jail. This concept was labeled jail stay. People can
be in the county jail for three reasons: (1) They
are serving a sentence of one year or less. (2) They
are awaiting trial. (3) They are being held tempo-
rarily while awaiting transfer to another county

or state or to prison. The third category includes
people who have been sentenced to prison and
are waiting for space to open up, or those who
have been arrested and are wanted for some rea-
son in another jurisdiction.

Maxfield vaguely knew these things but did
not recognize how they complicated the task of
defining and ultimately measuring jail stay. So
the original question—“What is the average jail
stay?”—was revised to “What is the average jail
stay for persons serving sentences and for persons
awaiting trial?”

Just as people can be in jail for different rea-
sons, an individual can be in jail for more than
one reason. Let’s consider a hypothetical jail resi-
dent we’ll call Allan. He was convicted of burglary
in July 2002 and sentenced to a year in jail. All but
30 days of his sentence were suspended, meaning
that he was freed but could be required to serve
the remaining 11 months if he got into trouble
again. It did not take long. Two months after be-
ing released, Allan was arrested for robbery and
returned to jail.


tiveness and mutual exclusiveness are nomi-
nal measures. Examples are gender, race, city
of residence, college major, Social Security
number, and marital status. The attributes
composing each of these variables—male and
female for the variable gender—are distinct from
one another and pretty much cover the con-
ventional possibilities among people. Nomi-
nal measures merely offer names or labels for
characteristics.

Imagine a group of people being character-
ized in terms of a nominal variable and physi-
cally grouped by the appropriate attributes.
Suppose we are at a convention attended by
hundreds of police chiefs. At a social func-
tion, we ask them to stand together in groups
according to the states in which they live: all
those from Vermont in one group, those from
California in another, and so forth. The vari-
able is state of residence; the attributes are live in
Vermont, live in California, and so on. All the peo-
ple standing in a given group have at least one

in terms of one and only one attribute. Thus,
for example, we need to define prison and fine in
such a way that nobody can possess both attributes
at the same time. That means we must be
able to handle the variables for a person whose
sentence includes both a prison term and a fine.
In this case, attributes could be defined more
precisely by specifying prison only, fine only, and
both prison and fine.
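
A coding rule that is both exhaustive and mutually exclusive can be sketched in a few lines of Python. This is an illustration only: the function name code_sentence is hypothetical, and "other" is a catchall for sentences such as probation or community service. Every combination of inputs returns exactly one attribute, which is what the two requirements demand.

```python
def code_sentence(prison, fine):
    """Return one and only one attribute for the variable 'sentence'."""
    if prison and fine:
        return "both prison and fine"
    if prison:
        return "prison only"
    if fine:
        return "fine only"
    return "other"  # probation, suspended sentence, community service, and so on

# Every possible observation receives exactly one label
labels = [code_sentence(p, f) for p in (True, False) for f in (True, False)]
print(labels)
```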

Levels of Measurement
Attributes composing any variable must be mu-
tually exclusive and exhaustive. Attributes may
be related in other ways as well. Of particular
interest is that variables may represent different
levels of measurement: nominal, ordinal, in-
terval, and ratio. Levels of measurement tell us
what sorts of information we can gain from the
scores assigned to the values of a variable.

Nominal Measures Variables whose attri-
butes have only the characteristics of exhaus-

Now it gets complicated. A judge imposes the
remaining 11 months of Allan’s suspended sen-
tence. Allan is denied bail and must wait for his
trial in jail. It is soon learned that Allan is wanted
by police in Illinois for passing bad checks. Many
people would be delighted to send Allan to Illinois;
they tell officials in that state they can have
him, pending resolution of the situation in Mar-
ion County.

Allan is now in jail for three reasons: (1) serving
his sentence for the original burglary, (2) await-
ing trial on a robbery charge, and (3) waiting for
transfer to Illinois.

Is this one jail stay or three? In a sense, it is
one jail stay because one person, Allan, is occu-
pying a jail cell. But let’s say Allan’s trial on the
robbery charge is delayed until after he completes
his sentence for the burglary. He stays in jail and
begins a new jail stay. When he comes up for trial,
the prosecutor asks to waive the robbery charges
against Allan in hopes of exporting him to the
neighboring state, and a new jail stay begins as
Allan awaits his free trip to Illinois.

You may recognize this as a problem with
units of analysis. Is the unit the person who stays
in jail? Or are the separate reasons Allan is in
jail—which are social artifacts—the units of analysis?
After some thought, Maxfield decided that
the social artifact was the more appropriate unit
because he was interested in whether jail cells are
more often occupied by people serving sentences
or people awaiting trial. But that produced a
new question of how to deal with people like
Allan. Do we double-count the overlap in Allan’s
jail stays, so that he accounts for two jail stays
while serving his suspended sentence for burglary
and waiting for the robbery trial? This seemed to
make sense, but then Allan’s two jail stays would
count the same as two other people with one
jail stay each. In other words, Allan would appear
to occupy two jail beds at the same time. This
was neither true nor helpful in describing how
long people stay in jail for different reasons.


school group and the college group, or else the
rank order is incorrect.

Interval Measures When the actual distance
that separates the attributes composing some
variables does have meaning, the variables are
interval measures. The logical distance be-
tween attributes can then be expressed in mean-
ingful standard intervals.

Interval measures commonly used in social
scientific research are constructed measures
such as standardized intelligence tests. The in-
terval that separates IQ scores of 100 and 110
is the same as the interval that separates scores
of 110 and 120 by virtue of the distribution of
the observed scores of the many thousands of
people who have taken the test over the years.
Criminal justice researchers often combine in-
dividual nominal and ordinal measures to pro-
duce a composite interval measure.

Ratio Measures Most of the social scientific
variables that meet the minimum requirements
for interval measures also meet the require-
ments for ratio measures. In ratio measures,
the attributes that compose a variable, besides
having all the structural characteristics men-
tioned previously, are based on a true zero
point. Examples from criminal justice research
are age, dollar value of property loss from bur-
glary, number of prior arrests, blood alcohol
content, and length of incarceration.

Returning to the example of various ways to
classify police chiefs, we might ask the chiefs to
group themselves according to years of experi-
ence in their present position. All those new to
their job would stand together, as would those
with one year of experience, those with two
years on the job, and so forth. The facts that
members of each group share the same years of
experience and that each group has a different
shared length of time satisfy the minimum re-
quirements for a nominal measure. Arranging
the several groups in a line from those with the
least to those with the most experience meets
the additional requirements for an ordinal mea-
sure and permits us to determine whether one

thing in common; the people in any one group
differ from the people in all other groups in
that same regard. Where the individual groups
are formed, how close they are to one another,
and how they are arranged in the room is irrele-
vant. All that matters is that all the members of
a given group share the same state of residence
and that each group has a different shared state
of residence.

Ordinal Measures Variables whose attri-
butes may be logically rank ordered are ordinal
measures. The different attributes represent
relatively more or less of the variable. Examples
of variables that can be ordered in some way are
opinion of police, occupational status, crime
seriousness, and fear of crime.

Let’s pursue the earlier example of grouping
police chiefs at a social gathering and imagine
that we ask all those who have graduated from
college to stand in one group, all those with a
high school diploma (but who were not also
college graduates) to stand in another group,
and all those who have not graduated from high
school to stand in a third group. This manner
of grouping people satisfies the requirements
for exhaustiveness and mutual exclusiveness. In
addition, however, we might logically arrange
the three groups in terms of their amount of
formal education (the shared attribute). We
might arrange the three groups in a row, rang-
ing from most to least formal education. This
arrangement provides a physical representation
of an ordinal measure. If we know which groups
two individuals are in, we can determine that
one has more, less, or the same formal educa-
tion as the other.

Note that in this example it is irrelevant
how close or far apart the educational groups
are from one another. They might stand 5 feet
apart or 500 feet apart; the college and high
school groups could be 5 feet apart, and the
less-than-high-school group might be 500 feet
farther down the line. These physical distances
have no meaning. The high school group, how-
ever, should be between the less-than-high-


The fourth column shows the ranking for
each of the 17 crimes in the table; the most seri-
ous crime, murder, is ranked 1, followed by rape
with injury, and so on. The rankings express
only the order of seriousness, however, because
the difference between murder (ranked 1) and
rape (ranked 2) is smaller than the distance be-
tween rape and robbery with injury (ranked 3).
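
A brief Python sketch makes the contrast between ranks and scores visible. It uses the seriousness scores and ranks for the three most serious crimes in Table 4.1; the variable names are ours.

```python
# Seriousness scores and ranks for the top three crimes in Table 4.1
scores = {"murder": 35.7, "rape and injury": 30.0, "robbery and injury": 21.0}
ranks = {"murder": 1, "rape and injury": 2, "robbery and injury": 3}

# Distances between adjacent crimes on the interval-level score
score_gap_1_2 = round(scores["murder"] - scores["rape and injury"], 1)
score_gap_2_3 = round(scores["rape and injury"] - scores["robbery and injury"], 1)

# Distances between the same crimes on the ordinal-level rank
rank_gap_1_2 = ranks["rape and injury"] - ranks["murder"]
rank_gap_2_3 = ranks["robbery and injury"] - ranks["rape and injury"]

print(score_gap_1_2, score_gap_2_3)  # 5.7 9.0
print(rank_gap_1_2, rank_gap_2_3)    # 1 1
```

The ranks are one step apart in both cases, so they conceal the fact that the score gaps (5.7 and 9.0) differ.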

Finally, the crime descriptions presented to
respondents indicated the value of property loss
for each offense. This is a ratio measure with a
true zero point, so that 10 burglaries with a loss
of $1,000 each have the same property value as
one arson offense with a loss of $10,000.

Specific analytic techniques require variables
that meet certain minimum levels of measurement.
For example, we could compute the
average property loss from the crimes listed in
Table 4.1 by adding up the individual numbers
in the fifth column and dividing by the number
of crimes listed (17). However, we would not
be able to compute the average victim type be-
cause that is a nominal variable. In that case, we
could report the modal—the most common—
victim type, which is society in Table 4.1.
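
Both computations can be carried out on the Table 4.1 data with Python's standard statistics module. The sketch below is illustrative; the list name table_4_1 is ours, and the values are copied from the table. A mean is computed for property loss, a ratio measure, and a mode for victim type, a nominal measure.

```python
from statistics import mean, mode

# (crime, victim type, value of property loss in dollars) from Table 4.1
table_4_1 = [
    ("Accepting a bribe", "Society", 0),
    ("Arson", "Business", 10_000),
    ("Auto theft", "Home", 12_000),
    ("Burglary", "Business", 100_000),
    ("Burglary", "Home", 1_000),
    ("Buying stolen property", "Society", 0),
    ("Heroin sales", "Society", 0),
    ("Heroin use", "Society", 0),
    ("Murder", "Person", 0),
    ("Obstructing justice", "Society", 0),
    ("Public intoxication", "Society", 0),
    ("Rape and injury", "Person", 0),
    ("Robbery and injury", "Person", 1_000),
    ("Robbery attempt", "Person", 0),
    ("Robbery, no injury", "Person", 1_000),
    ("Shoplifting", "Business", 10),
    ("Trespassing", "Home", 0),
]

losses = [loss for _, _, loss in table_4_1]
victims = [victim for _, victim, _ in table_4_1]

average_loss = mean(losses)    # meaningful: property loss is a ratio measure
modal_victim = mode(victims)   # the only "average" suited to a nominal measure

print(round(average_loss, 2))  # 7353.53
print(modal_victim)            # Society
```

Asking for the mean of the victim column would fail, because labels such as "Society" and "Home" cannot be added or divided.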

Researchers may treat some variables as rep-
resenting different levels of measurement. Ra-
tio measures are the highest level, followed by
interval, ordinal, and nominal. A variable that
represents a given level of measurement—say,
ratio—may also be treated as representing a
lower level of measurement—say, ordinal. For
example, age is a ratio measure. If we wish to
examine only the relationship between age and
some ordinal-level variable, such as delinquency
involvement (high, medium, or low), we might
choose to treat age as an ordinal-level variable
as well. We might characterize the subjects of
our study as being young, middle age, or old,
specifying the age range for each of those group-
ings. Finally, age might be used as a nominal-
level variable for certain research purposes.
Thus people might be grouped as baby boom-
ers if they were born between 1945 and 1955.
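
The recoding described here can be sketched in Python. The cut points for young, middle age, and old are assumptions chosen for illustration, since no age ranges are given above; the birth-year range for baby boomers is the one just stated.

```python
def age_group(age):
    """Ratio to ordinal: collapse age in years into ranked groups.

    The cut points (30 and 60) are assumed for illustration.
    """
    if age < 30:
        return "young"
    if age < 60:
        return "middle age"
    return "old"

def is_baby_boomer(birth_year):
    """Nominal use of age: in the group or not, using the range in the text."""
    return 1945 <= birth_year <= 1955

print(age_group(22), age_group(45), age_group(71))
print(is_baby_boomer(1950), is_baby_boomer(1960))
```

Once ages have been stored only as "young," "middle age," or "old," the exact ages cannot be recovered, so the recoding should be done at the analysis stage rather than at data collection.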

The analytic uses planned for a given
variable, then, should determine the level of

person is more experienced, is less experienced,
or has the same level of experience as another.
If we arrange the groups so that there is the
same distance between each pair of adjacent
groups, we satisfy the additional requirements
of an interval measure and can say how much
more experience one chief has than another. Fi-
nally, because one of the attributes included—
experience—has a true zero point (police chiefs
just appointed to their job), the phalanx of hap-
less convention goers also meets the require-
ments for a ratio measure, permitting us to
say that one person is twice as experienced as
another.

Implications of
Levels of Measurement
To review this discussion and to illustrate why
level of measurement may make a difference,
consider Table 4.1. It presents information
on crime seriousness adapted from a survey
of crime severity conducted for the Bureau of
Justice Statistics (Wolfgang, Figlio, Tracy, and
Singer 1985). The survey presented brief de-
scriptions of more than 200 different crimes to
a sample of 60,000 people. Respondents were
asked to assign a score to each crime based on
how serious they thought the crime was com-
pared with bicycle theft (scored at 10).

The first column in Table 4.1 lists some of
the crimes described. The second column shows
a nominal measure that identifies the victim in
the crime: home, person, business, or society.
Type of victim is an attribute of each crime. The
third column lists seriousness scores computed
from survey results, ranging from 0.6 for tres-
passing to 35.7 for murder. These seriousness
scores are interval measures because the dis-
tance between, for example, auto theft (at 8.0)
and accepting a bribe (at 9.0) is the same as
that between accepting a bribe (at 9.0) and ob-
structing justice (at 10.0). Seriousness scores
are not ratio measures; there is no absolute zero
point, and three instances of obstructing jus-
tice (at 10.0) do not equal one rape with injury
(at 30.0).


that compose a variable. Saying that a woman
is 43 years old is more precise than that she is
in her forties. Describing a felony sentence as
18 months is more precise than more than one
year.

As a general rule, precise measurements are
superior to imprecise ones, as common sense
would suggest. Precision is not always neces-
sary or desirable, however. If knowing that a
felony sentence is more than one year is sufficient
for your research purpose, then any additional
effort invested in learning the precise
sentence would be wasted. The operationaliza-
tion of concepts, then, must be guided partly
by an understanding of the degree of precision
required. If your needs are not clear, be more
precise rather than less.

But don’t confuse precision with accuracy.
Describing someone as “born in Stowe, Ver-
mont” is more precise than “born in New Eng-
land,” but suppose the person in question was
actually born in Boston? The less precise de-

measurement to be sought, with the realization
that some variables are inherently limited to a
certain level. If a variable is to be used in a variety
of ways that require different levels of measure-
ment, the study should be designed to achieve
the highest level possible. Although ratio mea-
sures such as number of arrests can later be re-
duced to ordinal or nominal ones, it is not pos-
sible to convert a nominal or ordinal measure to
a ratio one. More generally, you cannot convert
a lower-level measure to a higher-level one. That
is a one-way street worth remembering.
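
The one-way street can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the arrest counts are invented, and the cut points and function names (`to_ordinal`, `to_nominal`) are hypothetical choices, not part of any standard coding scheme.

```python
# Hypothetical arrest counts (a ratio measure) for five people.
arrests = [0, 1, 3, 7, 0]

def to_ordinal(count):
    """Collapse a count into ordered categories (illustrative cut points)."""
    if count == 0:
        return "none"
    if count <= 2:
        return "few"
    return "many"

def to_nominal(count):
    """Collapse further into an unordered attribute: ever arrested or not."""
    return "arrested" if count > 0 else "never arrested"

ordinal = [to_ordinal(a) for a in arrests]
nominal = [to_nominal(a) for a in arrests]
print(ordinal)
print(nominal)

# Going back is impossible: 3 arrests and 7 arrests both became "many",
# so the original counts cannot be recovered from the ordinal category.
```

Once the counts are collapsed, nothing in `ordinal` or `nominal` distinguishes a person with three arrests from a person with seven, which is why the higher level should be collected when there is any doubt.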

Criteria for
Measurement Quality
The key standards for measurement quality are
reliability and validity.

Measurements can be made with varying degrees
of precision, which refers to the fineness
of the distinctions made between the attributes

Table 4.1 Crime Seriousness and Levels of Measurement

                                   Seriousness          Value of
Crime                   Victim     Score        Rank    Property Loss
Accepting a bribe       Society     9.0           9     0
Arson                   Business   12.7           6     $10,000
Auto theft              Home        8.0          10     $12,000
Burglary                Business   15.5           5     $100,000
Burglary                Home        9.6           8     $1,000
Buying stolen property  Society     5.0          12     0
Heroin sales            Society    20.6           4     0
Heroin use              Society     6.5          11     0
Murder                  Person     35.7           1     0
Obstructing justice     Society    10.0           7     0
Public intoxication     Society     0.8          15     0
Rape and injury         Person     30.0           2     0
Robbery and injury      Person     21.0           3     $1,000
Robbery attempt         Person      3.3          13     0
Robbery, no injury      Person      8.0          10     $1,000
Shoplifting             Business    2.2          14     $10
Trespassing             Home        0.6          16     0

Source: Adapted from Wolfgang, Figlio, Tracy, and Singer (1985).
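
The relationship between the score and rank columns in Table 4.1 can be checked with a short Python sketch. It assumes the rank column is a "dense" ranking (tied scores share a rank and the next rank is not skipped), which is what the printed values suggest; the variable names are illustrative only.

```python
# Seriousness scores copied from Table 4.1 (crime label, score).
table = [
    ("Accepting a bribe", 9.0), ("Arson", 12.7), ("Auto theft", 8.0),
    ("Burglary (business)", 15.5), ("Burglary (home)", 9.6),
    ("Buying stolen property", 5.0), ("Heroin sales", 20.6),
    ("Heroin use", 6.5), ("Murder", 35.7), ("Obstructing justice", 10.0),
    ("Public intoxication", 0.8), ("Rape and injury", 30.0),
    ("Robbery and injury", 21.0), ("Robbery attempt", 3.3),
    ("Robbery, no injury", 8.0), ("Shoplifting", 2.2), ("Trespassing", 0.6),
]

# Interval score -> ordinal rank: order the distinct scores, highest first.
distinct = sorted({score for _, score in table}, reverse=True)
rank_of = {score: position + 1 for position, score in enumerate(distinct)}
ranks = {crime: rank_of[score] for crime, score in table}

print(ranks["Murder"], ranks["Auto theft"], ranks["Trespassing"])

# The rank keeps the ordering but discards distances: ranks 1 and 2 are
# 5.7 score points apart, while ranks 15 and 16 are only 0.2 apart.
```

The conversion runs in one direction only. Given the ranks alone, there is no way to rebuild the scores.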

Chapter 4 Concepts, Operationalization, and Measurement 93

search. For example, forensic DNA evidence is
increasingly being used in violent crime cases. A
National Research Council (1996) study found
a variety of errors in laboratory procedures,
including sample mishandling, evidence con-
tamination, and analyst bias. These are mea-
surement reliability problems that can lead to
unwarranted exclusion of evidence or to the
conviction of innocent people. Irregularities in
DNA tests by a Texas crime lab led to the exon-
eration of at least one previously convicted de-
fendant and prompted reviews of hundreds of
additional cases (McVicker and Khanna 2003).

Reliability problems crop up in many forms.
Reliability is a concern every time a single ob-
server is the source of data because we have no
way to guard against that observer’s subjectiv-
ity. We can’t tell for sure how much of what’s re-
ported represents true variation and how much
is due to the observer’s unique perceptions.

Reliability can also be an issue when more
than one observer makes measurements. Sur-
vey researchers have long known that different
interviewers get different answers from respon-
dents as a result of their own attitudes and de-
meanor. Or we may want to classify a few hun-
dred community anticrime groups into a set
of categories created by the National Institute
of Justice. A police officer and a neighborhood
activist are unlikely to classify all those groups
into the same categories; such inconsistency
would be an example of reliability problems.

How do we create reliable measures? Be-
cause the problem of reliability is a basic one in
criminal justice measurement, researchers have
developed a number of techniques for dealing
with it.

The Test–Retest Method Sometimes it is
appropriate to make the same measurement
more than once. If there is no reason to expect
the information to change, we should expect
the same response every time. If answers vary,
however, then the measurement method is, to
the extent of that variation, unreliable. Here’s
an illustration.

scription, in this instance, is more accurate; it’s a
better reflection of the real world. This is a point
worth keeping in mind. Many criminal justice
measures are imprecise, so reporting approxi-
mate values is often preferable.

Precision and accuracy are obviously impor-
tant qualities in research measurement, and they
probably need no further explanation. When
criminal justice researchers construct and eval-
uate measurements, they pay special attention
to two technical considerations: reliability and
validity.

Reliability
Fundamentally, reliability is a matter of
whether a particular measurement technique,
applied repeatedly to the same thing, will yield
the same result each time. In other words, measurement
reliability is roughly the same as measurement
consistency or stability. Imagine a police
officer standing on the street, guessing the
speed of cars that pass by and issuing speeding
tickets based on that judgment. If you received
a ticket from this officer and went to court to
contest it, you would almost certainly win your
case. The judge would no doubt reject this way
of measuring speed, regardless of the police
officer’s experience. The reliability or consistency
of this method of measuring vehicle speed is
questionable at best. If the same police officer
used a radar speed detector, however, it is
doubtful that you would be able to beat the
ticket. The radar device is judged a much more
reliable way of measuring speed.

Reliability, though, does not ensure accuracy
any more than precision does. The speedometer
in your car may be a reliable instrument for mea-
suring speed, but it is common for speedom-
eters to be off by a few miles per hour, especially
at higher speeds. If your speedometer shows
55 miles per hour when you are actually travel-
ing at 60, it gives you a consistent but inaccu-
rate reading that might attract the attention of
police officers with more accurate radar guns.

Measurement reliability is often a problem
with indicators used in criminal justice re-


supervisor call a subsample of the respondents
on the telephone and verify selected informa-
tion. West and Farrington (1977, 173) checked
interrater reliability in their study of London
youths and found few significant differences in
results obtained from different interviewers.

Comparing measurements from different
raters works in other situations as well. Michael
Geerken (1994) presents an important discus-
sion of reliability problems that researchers are
likely to encounter in measuring prior arrests
through police rap sheets. Duplicate entries, the
use of aliases, and the need to transform official
crime categories into a smaller number of catego-
ries for analysis are among the problems Geerken
cites. One way to increase consistency in translating
official records into research measures,
a process often referred to as coding, is to have
more than one person code a sample of records
and then compare the consistency of coding de-
cisions made by each person. This approach was
used by Michael Maxfield and Cathy Spatz Widom
(1996) in their analysis of adult arrests of
child abuse victims.

In general, whenever researchers are con-
cerned that measures obtained through coding
may not be classified reliably, they should have
each record independently coded by different people. A
great deal of disagreement among coders would
most likely be due to ambiguity in operational
definitions.
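
One way to put a number on coder disagreement is sketched below. The text does not name a particular statistic; this example uses simple percent agreement and Cohen's kappa (a common chance-corrected agreement measure) as a swapped-in technique. The two coders' category assignments are invented for illustration.

```python
from collections import Counter

def percent_agreement(a, b):
    """Share of records that both coders placed in the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance alone.

    Undefined (division by zero) if chance agreement is exactly 1.
    """
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two people to the same eight arrest records.
coder_1 = ["violent", "property", "drug", "property",
           "violent", "drug", "property", "other"]
coder_2 = ["violent", "property", "drug", "violent",
           "violent", "drug", "property", "property"]

print(percent_agreement(coder_1, coder_2))
print(round(cohens_kappa(coder_1, coder_2), 3))
```

A low value on either statistic would point back to the operational definitions: the records where the coders disagree show which category boundaries are ambiguous.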

The reliability of measurements is a funda-
mental issue in criminal justice research, and
we’ll return to it in the chapters to come. For
now, however, we hasten to point out that even
total reliability doesn’t ensure that our mea-
sures actually measure what we think they mea-
sure. That brings us to the issue of validity.

Validity
In conventional usage, the term validity means
that an empirical measure adequately reflects
the meaning of the concept under consider-
ation. Put another way, measurement validity
involves whether you are really measuring what
you say you are measuring. Recall that an oper-

In their classic research on delinquency in
England, Donald West and David Farrington
(1977) interviewed a sample of 411 males from
a working-class area of London at age 16 and
again at age 18. The subjects were asked to de-
scribe a variety of aspects of their lives, including
educational and work history, leisure pursuits,
drinking and smoking habits, delinquent activ-
ities, and experience with police and courts.

West and Farrington assessed reliability in
several ways. One was to compare responses
from the interview at age 18 with those from
the interview at age 16. For example, in each
interview, the youths were asked at what age
they left school. In most cases, there were few
discrepancies in stated age from one interview
to the next, which led the authors to conclude,
“There was therefore no systematic tendency
for youths either to increase or lessen their
claimed period of school attendance as they
grew older, as might have occurred if they had
wanted either to exaggerate or to underplay
their educational attainments” (1977, 76–77).
If West and Farrington had found less consis-
tency in answers to this and other items, they
would have had good reason to doubt the
truthfulness of responses to more sensitive
questions. The test–retest method suggested to
the authors that memory lapses were the most
common source of minor differences.

Although this method can be a useful reliabil-
ity check, it is limited in some respects. Faulty
memory may produce inconsistent responses if
there is a lengthy gap between the initial inter-
view and the retest. A different problem can arise
in trying to use the test–retest method to check
the reliability of attitude or opinion measures.
If the test–retest interval is short, then answers
given in the second interview may be affected by
earlier responses if subjects try to be consistent.

Interrater Reliability It is also possible for
measurement unreliability to be generated by re-
search workers—for example, interviewers and
coders. To guard against interviewer unreliabil-
ity, it is common practice in surveys to have a


as valid; this is sometimes referred to as con-
vergent validity. The validity of College Board
exams, for example, is shown in their ability to
predict the success of students in college.

Timothy Heeren and associates (Heeren,
Smith, Morelock, and Hingson 1985) offer a
good example of criterion-related validity in
their efforts to validate a measure of alcohol-
related auto fatalities. Of course, conducting a
blood alcohol laboratory test on everyone killed
in auto accidents would be a valid measure. Not
all states regularly do this, however, so Heeren
and colleagues tested the validity of an alterna-
tive measure: single-vehicle fatal accidents in-
volving male drivers occurring between 8:00 p.m.
and 3:00 a.m. The validity of this measure was
shown by comparing it with the blood alcohol
test results for all drivers killed in states that
reliably conducted such tests in fatal accidents.
Because the two measures agreed closely, Heeren
and associates claimed that the proxy, or sur-
rogate, measure would be valid in other states.
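
The logic of checking a proxy against a criterion can be sketched with a correlation. This is not a reconstruction of the analysis by Heeren and associates, whose exact procedure is not described here; the state-level figures below are invented, and Pearson's r is simply one reasonable way to express "the two measures agreed closely."

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical shares of fatal crashes in five states that routinely test.
# Criterion: share in which the blood alcohol test was positive.
criterion = [0.42, 0.51, 0.38, 0.47, 0.55]
# Proxy: share that were single-vehicle, male-driver, nighttime crashes.
proxy = [0.40, 0.49, 0.37, 0.48, 0.53]

r = pearson_r(criterion, proxy)
print(round(r, 2))
```

A correlation close to 1 in the states that do test is the evidence offered for using the proxy where no test results exist. It remains an inference, because the criterion is unavailable in exactly the places where the proxy is needed.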

Another approach to criterion-related valid-
ity is to show that our measure of a concept is
different from measures of similar but distinct
concepts. This is called discriminant validity,
meaning that measures can discriminate be-
tween different concepts.

Sometimes it is difficult to find behavioral
criteria that can be used to validate measures
as directly as described here. In those instances,
however, we can often approximate such crite-
ria by considering how the variable in question
ought, theoretically, to relate to other variables.

Construct Validity Construct validity is
based on the logical relationships among vari-
ables. Let’s suppose that we are interested in
studying fear of crime—its sources and conse-
quences. As part of our research, we develop a
measure of fear of crime, and we want to assess
its validity.

In addition to our measure, we will also de-
velop certain theoretical expectations about the
way the variable fear of crime relates to other vari-
ables. For instance, it’s reasonable to conclude

ational definition specifies the operations you
will perform to measure a concept. Does your
operational definition accurately reflect the
concept you are interested in? If the answer is
yes, you have a valid measure. A radar gun is a
valid measure of vehicle speed, but a wind ve-
locity indicator is not because it measures total
wind speed, not vehicle speed with respect to
the ground.

Although methods for assessing reliability
are relatively straightforward, it is more difficult
to demonstrate that individual measures
are valid. Because concepts are not real, but ab-
stract, we cannot directly demonstrate that mea-
sures, which are real, are actually measuring an
abstract concept. Nevertheless, researchers have
some ways of dealing with the issue of validity.

Face Validity First, there’s something called
face validity. Particular empirical measures may
or may not jibe with our common agreements
and our individual mental images about a par-
ticular concept. We might debate the adequacy
of measuring satisfaction with police services
by counting the number of citizen complaints
registered by the mayor’s office, but we’d surely
agree that the number of citizen complaints
has something to do with levels of satisfaction.
If someone suggested that we measure satisfaction
with police by finding out whether people
like to watch police dramas on TV, we would
probably agree that the measure has no face va-
lidity; it simply does not make sense.

Second, there are many concrete agreements
among researchers about how to measure cer-
tain basic concepts. The Census Bureau, for
example, has created operational definitions
of such concepts as family, household, and em-
ployment status that seem to have a workable
validity in most studies using those concepts.

Criterion-Related Validity A more formal
way to assess validity is to compare a mea-
sure with some external criterion, known as
criterion-related validity. A measure can be
validated by showing that it predicts scores
on another measure that is generally accepted


measure delinquency and criminality. But how
valid are survey questions that ask people how
many crimes they have committed?

The approach used by West and Farrington
(and by others) is to ask people, for example,
how many times they have committed robbery
and how many times they have been arrested
for that crime. Those who admit to having been
arrested for robbery are asked when and where
the arrest occurred. Self-reports can then be
validated by checking police arrest records. This
works two ways: (1) it is possible to validate in-
dividual reports of being arrested for robbery,
and (2) researchers can check police records for
all persons interviewed to see if there are any
records of robbery arrests that subjects do not
disclose to interviewers.
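
The two-way check can be expressed with set operations. The respondent IDs and both sets below are hypothetical; the sketch only shows how the two comparisons differ.

```python
# Hypothetical respondent IDs.
self_reported_arrest = {"p01", "p02", "p05"}   # said they were arrested
police_record_arrest = {"p01", "p05", "p07"}   # have an arrest on record

# (1) Validate individual reports of being arrested.
confirmed = self_reported_arrest & police_record_arrest
unverified_claims = self_reported_arrest - police_record_arrest

# (2) Look for recorded arrests that subjects did not disclose.
undisclosed = police_record_arrest - self_reported_arrest

print(sorted(confirmed), sorted(unverified_claims), sorted(undisclosed))
```

Neither source is treated as automatically correct here. An unverified claim may reflect a gap in the records, and an undisclosed arrest may reflect concealment or forgetting.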

Figure 4.1 illustrates the difference between
validity and reliability. Think of measurement
as analogous to hitting the bull’s-eye on a tar-
get. A reliable measure produces a tight pattern,
regardless of where it hits, because reliability is
a function of consistency. Validity, in contrast,
relates to the arrangement of shots around the
bull’s-eye. The failure of reliability in the figure
can be seen as a random error; the failure of va-
lidity is a systematic error. Notice that neither
an unreliable nor an invalid measure is likely to
be very useful.
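
The distinction between random and systematic error can be simulated. The sketch below is an illustration under stated assumptions: a true speed of 60 miles per hour, a 5-mile-per-hour bias borrowed from the speedometer example earlier in the chapter, and noise levels chosen arbitrarily. The function name `measure` is hypothetical.

```python
import random
from statistics import mean, pstdev

def measure(true_value, bias, noise, n, rng):
    """Simulate n readings: true value + systematic bias + random noise."""
    return [true_value + bias + rng.gauss(0, noise) for _ in range(n)]

rng = random.Random(0)   # fixed seed so the run is repeatable
true_speed = 60.0

reliable_not_valid = measure(true_speed, bias=-5.0, noise=0.5, n=1000, rng=rng)
valid_not_reliable = measure(true_speed, bias=0.0, noise=8.0, n=1000, rng=rng)
valid_and_reliable = measure(true_speed, bias=0.0, noise=0.5, n=1000, rng=rng)

for name, readings in [("reliable, not valid", reliable_not_valid),
                       ("valid, not reliable", valid_not_reliable),
                       ("valid and reliable", valid_and_reliable)]:
    # Average error reflects systematic error; spread reflects random error.
    print(name, round(mean(readings) - true_speed, 2), round(pstdev(readings), 2))
```

The first instrument produces a tight cluster centered in the wrong place, the second is centered correctly but scattered, and only the third is both tight and on target.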

that people who are afraid of crime are less likely
to leave their homes at night for entertainment
than people who are not afraid of crime. If our
measure of fear of crime relates to how often
people go out at night in the expected fashion,
that constitutes evidence of our measure’s con-
struct validity. However, if people who are afraid
of crime are just as likely to go out at night as
people who are not afraid, that challenges the
validity of our measure. This and related points
about measures of fear are nicely illustrated by
Jason Ditton and Stephen Farrall in their analy-
sis of data from England (2007).

Tests of construct validity, then, can offer a
weight of evidence that our measure either does
or doesn’t tap the quality we want it to measure,
without providing definitive proof.
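
A construct validity check of the fear-of-crime measure could look like the following sketch. The survey responses are invented, and `share_going_out` is a hypothetical helper. The only point is the comparison between the two groups.

```python
# Hypothetical survey rows: (afraid of crime?, goes out at night?)
responses = [
    (True, False), (True, False), (True, True), (True, False),
    (False, True), (False, True), (False, False), (False, True),
]

def share_going_out(rows, afraid):
    """Share of the chosen group (afraid or not) who go out at night."""
    group = [goes_out for is_afraid, goes_out in rows if is_afraid == afraid]
    return sum(group) / len(group)

afraid_share = share_going_out(responses, afraid=True)
unafraid_share = share_going_out(responses, afraid=False)
print(afraid_share, unafraid_share)

# The theoretical expectation is afraid_share < unafraid_share.
# Finding it adds weight to the measure's validity; it does not prove it.
```

If the two shares were about equal, that would count against the measure, although it could also mean the theoretical expectation itself was wrong.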

Multiple Measures Another approach to val-
idation of an individual measure is to compare
it with alternative measures of the same con-
cept. The use of multiple measures is similar to
establishing criterion validity. However, the use
of multiple measures does not necessarily as-
sume that the criterion measure is always more
accurate. For example, many crimes never result
in an arrest, so arrests are not good measures of
how many crimes are committed by individu-
als. Self-report surveys have often been used to

Figure 4.1 Analogy to Validity and Reliability (three target panels: reliable but not valid; valid but not reliable; valid and reliable)


self-interest, a term that has engaged philoso-
phers and social scientists for centuries.

James Q. Wilson and Richard Herrnstein
(1985, 22) propose a different definition that
should get us started: “A crime is any act com-
mitted in violation of a law that prohibits it
and authorizes punishment for its commis-
sion.” Although other criminologists (such as
Gottfredson and Hirschi) might not agree with
this conceptual definition, it has the advantage
of being reasonably specific. We could be even
more specific by consulting a state or federal
code and listing the types of acts for which the
law provides punishment.

Our list would be very long. In fact, one of
the principal difficulties we encounter when
we try to measure crime is that many different
types of behaviors and actions are included in
our conceptualization of crime as an act com-
mitted in violation of a law that prohibits it and au-
thorizes punishment for its commission, but we may
be interested in only a small subset of things
included under such a broad definition. Different
measures tend to focus on different types of
crime, primarily because not all crimes can be
measured the same way with any degree of reli-
ability or validity. Therefore one important step
in selecting a measure is deciding what crimes
will be included.

What Units of Analysis? Recall that units of
analysis are the specific entities researchers collect
information about. Chapter 3 considered
individuals, groups, social artifacts, and other
units of analysis. Deciding how to measure
crime requires that we once again think about
these units.

Crimes involve four elements that are often
easier to recognize in the abstract than they are
to actually measure: offender, victim, offense,
and incident. The most basic of these elements
is the offender. Without an offender, there’s no
crime, so a crime must, at a minimum, involve
an offender. The offender is therefore one pos-
sible unit of analysis. We might decide to study

Measuring Crime
Different approaches to measuring crime illustrate
basic principles in conceptualization and measure-
ment.

By way of illustrating basic principles in mea-
surement, we now focus more narrowly on dif-
ferent ways of measuring crime. Crime is a fun-
damental dependent variable in criminal justice
and criminology. Explanatory studies frequently
seek to learn what causes crime, whereas applied
studies often focus on what actions might be
effective in reducing crime. Descriptive and ex-
ploratory studies may simply wish to count how
much crime there is in a specific area, a question
of obvious concern to criminal justice officials
as well as researchers.

Crime can also be an independent variable—
for example, in a study of how crime affects fear
or other attitudes or of whether people who live
in high-crime areas are more likely than others
to favor long prison sentences for drug dealers.
Sometimes crime can be both an independent
and a dependent variable, as in a study about
the relationship between drug use and other
offenses.

General Issues in Measuring Crime
At the outset, we must consider two general
questions that influence whatever approach we
might take to measuring crime: (1) How will we
conceptualize crime? (2) What units of analysis
should be used?

Conceptualization Let’s begin by proposing
a conceptual definition of crime, one that
will enable us to decide what specific types of
crime we’ll measure. Recall a definition from
Michael Gottfredson and Travis Hirschi (1990,
15), mentioned earlier: “acts of force or fraud
undertaken in pursuit of self-interest.” This is
an interesting definition, but it is better suited
to an extended discussion of theories of crime
than to our purposes in this chapter. For exam-
ple, we would have to clarify what was meant by


“one or more offenses committed by the same
offender, or group of offenders acting in concert,
at the same time and place” (Federal Bureau of In-
vestigation 2000, 17; emphasis in original).

Think about the difference between offense
and incident for a moment. A single incident
can include multiple offenses, but it’s not possi-
ble to have one offense and multiple incidents.

To illustrate the different units of analysis—
offenders, victims, offenses, and incidents—
consider the examples in the box titled “Units
of Analysis and Measuring Crime.” These ex-
amples help distinguish units from each other
and illustrate the links among different units.
Notice that we have said nothing about aggre-
gate units of analysis, a topic we examined in
Chapter 3. We have considered only individual
units, even though measures of crime are often
based on aggregate units of analysis—neighbor-
hoods, cities, counties, states, and so on.

We cover units at some length because they
play a critical, and often overlooked, role in developing
operational definitions, to which our
attention now turns.

Measures Based on
Crimes Known to Police
The most widely used measures of crime are
based on police records and are commonly

burglars, auto thieves, bank robbers, child mo-
lesters, drug dealers, or people who have com-
mitted many different types of offenses.

Crimes also require some sort of victim, the
second possible unit of analysis. We could study
victims of burglary, auto theft, bank robbery, or
assault. Notice that this list of victims includes
different types of units: households or busi-
nesses for burglary, car owners for auto theft,
banks for bank robbery, and individuals for
assault. Some of these units are organizations
(banks, businesses), some are individual people,
some are abstractions (households), and some
are ambiguous (individuals or organizations
can own automobiles).

What about so-called victimless crimes like
drug use, bookmaking, or prostitution? In a
legal sense, victimless crimes do not exist be-
cause crimes are acts that injure society, organi-
zations, or individuals. But studying crimes in
which only society is the victim—prostitution,
for example—presents special challenges, and
specialized techniques have been developed to
measure certain types of victimless crimes.

The final two elements of crimes, offense
and incident, are closely intertwined and so
will be discussed together. An offense is defined
as an individual act of burglary, auto theft, bank
robbery, and so on. The FBI defines incident as

UNITS OF
ANALYSIS AND
MEASURING
CRIME

Figuring out the different units of analysis in
counting crimes can be difficult and confusing
at first. Much of the problem comes from the
possibility of what database designers call one-
to-many and many-to-many relationships. The
same incident can have multiple offenses, offend-
ers, and victims or just one of each. Fortunately,
thinking through some examples usually clarifies
the matter. Our two examples are adapted from
an FBI publication (2000, 18).

Example 1
Two males entered a bar. The bartender was
forced at gunpoint to hand over all money from
the cash register. The offenders also took money
and jewelry from three customers. One of the offenders
used his handgun to beat one of the customers,
thereby causing serious injury. Both offenders
fled on foot.

One incident
  One robbery offense
    Two offenders
    Four victims (bar owner, three patrons)
  One aggravated assault offense
    Two offenders
    One victim


cle theft (Federal Bureau of Investigation 2007).
Other offenses, referred to as Part II crimes,
are counted only if a person has been arrested
and charged with a crime. The UCR therefore
does not include such offenses as shoplifting,
drug sale or use, fraud, prostitution, simple as-
sault, vandalism, receiving stolen property, and
all other nontraffic offenses unless someone
is arrested. This means that a large number of
crimes reported to police are not measured in
the UCR.

Another source of measurement error in the
UCR is produced by the hierarchy rule used by
police agencies and the FBI to classify crimes.
Under the hierarchy rule, if multiple crimes are
committed in a single incident, only the most
serious is counted in the UCR. For example, if
a burglar breaks into a home, rapes one of the
occupants, and flees in the homeowner’s car,
at least three crimes are committed—burglary,
rape, and vehicle theft. Under the FBI hierar-
chy rule, however, only the most serious crime,
rape, is counted in the UCR, even though the
offender could be charged with all three of-
fenses. In the examples described in the box
“Units of Analysis and Measuring Crime,” the
UCR would count one offense in each incident:
a single robbery in the first example and rape in
the second.
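
The hierarchy rule amounts to a small selection procedure, sketched below. The ordering in `HIERARCHY` follows the order in which the Part I offenses are listed above and is treated here as an assumption for illustration. Arson, which the UCR handles separately, is left out, and the function name `ucr_count` is hypothetical.

```python
# Part I offenses, most serious first (ordering assumed from the list above).
HIERARCHY = [
    "murder", "rape", "robbery", "aggravated assault",
    "burglary", "larceny-theft", "motor vehicle theft",
]

def ucr_count(incident_offenses):
    """Return the single offense counted for one incident, or None.

    Only the highest-ranked Part I offense is kept; everything else
    that happened in the incident drops out of the summary count.
    """
    part_one = [o for o in incident_offenses if o in HIERARCHY]
    if not part_one:
        return None
    return min(part_one, key=HIERARCHY.index)

# The break-in described above: three crimes, one counted.
print(ucr_count(["burglary", "rape", "motor vehicle theft"]))
# Example 1 from the box: robbery plus aggravated assault.
print(ucr_count(["robbery", "aggravated assault"]))
```

Every offense that `ucr_count` discards is a crime known to police that never appears in the summary totals, which is the measurement error the hierarchy rule introduces.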

referred to as crimes known to police. This phrase
is at the core of police-based operational definitions
and has important implications for
understanding what police records do and do
not measure. The most obvious implication
is that crimes not known to police cannot be
measured by consulting police records. Other
features of measures based on police records
can best be understood by considering specific
examples.

Uniform Crime Reports Police measures of
crime form the basis for the FBI’s Uniform
Crime Reports (UCR), a data series that has
been collected since 1930 and has been widely
used by criminal justice researchers. But certain
characteristics and procedures related to the
UCR affect its suitability as a measure of crime.
Most of our comments highlight shortcomings
in this regard, but keep in mind that the UCR
is and will continue to be a very useful measure
for researchers and public officials.

The UCR does not even try to count all crimes
reported to police. What are referred to as Part I
offenses are counted if these offenses are re-
ported to police (and recorded by police). Part I
offenses include murder and non-negligent
manslaughter, forcible rape, robbery, aggravated
assault, burglary, larceny-theft, and motor vehi-

Even though only one offender actually assaulted
the bar patron, the other offender would be
charged with assisting in the offense because he
prevented others from coming to the aid of the
assault victim.

Example 2
Two males entered a bar. The bartender was
forced at gunpoint to hand over all the money
from the cash register. The offenders also took
money and jewelry from two customers. One of
the offenders, in searching for more people to
rob, found a customer in a back room and raped
her there, outside the view of the other offender.
When the rapist returned, both offenders fled on
foot.

This example includes two incidents because
the rape occurred in a different place and the of-
fenders were not acting in concert. And because
they were not acting in concert in the same place,
only one offender was associated with the rape
incident.

Incident 1
  One robbery offense
    Two offenders
    Three victims (bar owner, two patrons)
Incident 2
  One rape offense
    One offender
    One victim
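
The one-to-many structure in the two examples can be written out as nested records. This is a sketch only: the class names, field names, and person labels are hypothetical, and it assumes the customer raped in Example 2 was not one of the two customers who were robbed.

```python
from dataclasses import dataclass, field

@dataclass
class Offense:
    kind: str
    offenders: list
    victims: list

@dataclass
class Incident:
    label: str
    offenses: list = field(default_factory=list)

example_1 = [
    Incident("incident 1", [
        Offense("robbery", ["offender A", "offender B"],
                ["bar owner", "patron 1", "patron 2", "patron 3"]),
        Offense("aggravated assault", ["offender A", "offender B"],
                ["patron 1"]),
    ]),
]
example_2 = [
    Incident("incident 1", [
        Offense("robbery", ["offender A", "offender B"],
                ["bar owner", "patron 1", "patron 2"]),
    ]),
    Incident("incident 2", [
        Offense("rape", ["offender A"], ["patron 3"]),
    ]),
]

def counts(incidents):
    """Count each unit of analysis; people are counted once each."""
    offenses = [o for i in incidents for o in i.offenses]
    offenders = {p for o in offenses for p in o.offenders}
    victims = {v for o in offenses for v in o.victims}
    return {"incidents": len(incidents), "offenses": len(offenses),
            "offenders": len(offenders), "victims": len(victims)}

print(counts(example_1))
print(counts(example_2))
```

The same night at the bar yields a different number depending on which unit is counted, so a measure of crime has to state its unit before any total is meaningful.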


duct descriptive and explanatory studies of
individual events. For example, it’s possible to
compare the relationship between victim and
offender for male victims and female victims or
to compare the types of weapons used in kill-
ings by strangers and killings by nonstrangers.
Such analyses are not possible if we are study-
ing homicide using UCR summary data.

Crime measures based on incidents as units
of analysis therefore have several advantages
over summary measures. It’s important to keep
in mind, however, that SHR data still represent
crimes known to police and recorded by police.

The National Incident-Based Reporting
System The most recent development in
police-based measures at the national level is
the ongoing effort by the FBI and the Bureau
of Justice Statistics (BJS) to convert the UCR
to a National Incident-Based Reporting Sys-
tem (NIBRS, pronounced “ny-bers”). Planning
for replacement of the UCR began in the mid-
1980s, but because NIBRS represents major
changes, law enforcement agencies have shifted
only gradually to the new system.

Briefly, NIBRS is a Very Big Deal. For example,
let’s consider NIBRS and the UCR crime
measures for a single state, Idaho. Nationwide,
about 17,000 law enforcement agencies report
UCR summary data each year; that’s 17,000
annual observations, one for each reporting
agency. In 2004, 106 agencies in Idaho reported
UCR data, so Idaho submitted a maximum of
106 observations for 2004. Under NIBRS, Idaho
reported over 95,000 incidents in 2004 (Idaho
State Police 2005). In other words, rather than
reporting 106 summary crime counts for eight
UCR Part I offenses, Idaho reported detailed in-
formation on 95,522 individual incidents. And
this is Idaho, which ranked 39th among the
states in year-2000 resident population!

In addition, NIBRS guidelines call for gath-
ering more detailed information about a much
broader array of offenses. Whereas the UCR re-
ports information about seven Part I offenses
(plus arson), NIBRS is designed to collect de-

Before we move on to other approaches to
measuring crime, consider another important
way units of analysis figure into UCR data. The
UCR system produces what is referred to as a
summary-based measure of crime. This means
that UCR data include summary, or total,
crime counts from reporting agencies: cities
or counties. UCR summary data therefore rep-
resent groups as units of analysis. Crime reports
are available for cities or counties, and these
may be aggregated upward to measure crime for
states or regions of the United States. But UCR
data available from the FBI cannot represent in-
dividual crimes, offenders, or victims as units.

Recall that it is possible to aggregate units
of analysis to higher levels, but it is not possible
to disaggregate grouped data to the individual
level. Because UCR data are aggregates, they
cannot be used in descriptive or explanatory
studies that focus on individual crimes, offend-
ers, or victims. UCR data are therefore restricted
to the analysis of such units as cities, counties,
states, or regions.
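
The asymmetry between aggregating and disaggregating can be shown with a few invented incident records. The city names, field names, and values below are hypothetical.

```python
from collections import Counter

# Hypothetical incident-level records.
incidents = [
    {"city": "Springfield", "offense": "burglary", "victim_age": 34},
    {"city": "Springfield", "offense": "burglary", "victim_age": 61},
    {"city": "Springfield", "offense": "robbery", "victim_age": 19},
    {"city": "Riverton", "offense": "burglary", "victim_age": 45},
]

# Aggregating upward is easy: count incidents per city and offense ...
summary = Counter((r["city"], r["offense"]) for r in incidents)
# ... and again per city.
by_city = Counter(r["city"] for r in incidents)

print(summary[("Springfield", "burglary")], by_city["Springfield"])

# Going the other way is not possible. The summary says Springfield had
# two burglaries, but the victims' ages are no longer anywhere in it.
```

A researcher who receives only `summary` can compare cities but cannot ask anything about individual victims, offenders, or incidents.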

Incident-Based Police Records The U.S. De-
partment of Justice sponsors two series of crime
measures that are based on incidents as units
of analysis. The first of these incident-based
measures, Supplementary Homicide Reports
(SHR), was begun in 1961 and is part of the
UCR program, as implied by supplementary.

Local law enforcement agencies submit
detailed information about individual homi-
cide incidents under the SHR program. This
includes information about victims and, if
known, offenders (age, gender, race); the re-
lationship between victim and offender; the
weapon used; the location of the incident; and
the circumstances surrounding the killing. No-
tice how the SHR relates to our discussion of
units of analysis. Incidents are the basic unit
and can include one or more victims and of-
fenders; because the series is restricted to homi-
cides, offense is held constant.

Because the SHR is an incident-based sys-
tem, investigators can use SHR data to con-


a larger number of law enforcement agencies. In
fact, many agencies have developed their own
incident-based records systems independent
of NIBRS, largely because of major advances in
computing technology (Maxfield 1999). Furthermore,
researchers are beginning to analyze

tailed information on 46 Group A offenses.
Table 4.2 shows what kinds of information are
collected for offenses, victims, and offenders
under NIBRS. Table 4.3 shows NIBRS crime
data for Idaho in 2004. Compare the top part
of the table, reporting crime counts for UCR
index offenses, to the bottom part. Additional
NIBRS Group A offenses more than double the
number of crimes “known to police” in Idaho
(43,611 UCR Part I, plus 51,911 additional
Group A). Simple assault and vandalism are
by far the most common of these additional
offenses, but drug violations accounted for al-
most 13,000 offenses in 2004 (drug violations
plus drug equipment violations).
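
The totals quoted above can be checked directly against Table 4.3. The following Python sketch is only an illustration: the counts are copied from the table, while the variable names are our own and do not come from any UCR or NIBRS software.

```python
# Counts copied from Table 4.3 (Crime in Idaho, 2004).
# Variable names are illustrative assumptions, not part of any official system.
ucr_part_one = {
    "Murder, non-negligent manslaughter": 35,
    "Rape": 577,
    "Robbery": 247,
    "Aggravated assault": 2594,
    "Burglary": 7700,
    "Larceny": 29442,
    "Motor vehicle theft": 2696,
    "Arson": 320,
}

additional_group_a = {
    "Simple assault": 14192,
    "Intimidation": 1766,
    "Bribery": 7,
    "Counterfeit/forgery": 1982,
    "Destruction of property": 14516,
    "Drug violations": 6667,
    "Drug equipment violations": 6329,
    "Embezzlement": 295,
    "Extortion/blackmail": 12,
    "Fraud": 2426,
    "Gambling": 9,
    "Kidnapping/abduction": 236,
    "Pornography/obscene material": 33,
    "Prostitution": 10,
    "Forcible sex offenses": 1200,
    "Nonforcible sex offenses": 215,
    "Stolen property": 631,
    "Weapons violations": 1385,
}

part_one_total = sum(ucr_part_one.values())
additional_total = sum(additional_group_a.values())
group_a_total = part_one_total + additional_total

# Drug violations plus drug equipment violations ("almost 13,000").
drug_related = (
    additional_group_a["Drug violations"]
    + additional_group_a["Drug equipment violations"]
)

# How many times larger the Group A count is than the Part I count.
ratio = group_a_total / part_one_total

print("UCR Part I subtotal:", part_one_total)
print("Additional Group A subtotal:", additional_total)
print("Group A total:", group_a_total)
print("Drug-related offenses:", drug_related)
print("Group A total / Part I subtotal:", round(ratio, 2))
```

The ratio comes out a little above 2, which is what "more than double" means here: counting the additional Group A offenses roughly doubles the number of crimes known to police.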

Collecting detailed information on each in-
cident for each offense, victim, and offender,
and doing so for a large number of offense
types, represents the most significant changes in
NIBRS compared with the UCR. Dropping the
hierarchy rule is also a major change, but that is
a consequence of incident-based reporting.

In the future, incident-based police records
will become more readily available and will cover

Table 4.2 Selected Information in National
Incident-Based Reporting System Records

Administrative Segment
  Incident date and time
  Reporting agency ID
  Other ID numbers

Offense Segment
  Offense type
  Attempted or completed
  Offender drug/alcohol use
  Location type
  Weapon use

Victim Segment
  Victim ID number
  Offense type
  Victim age, gender, race
  Resident of jurisdiction?
  Type of injury
  Relationship to offender
  Victim type:
    Individual person
    Business
    Government
    Society/public

Offender Segment
  Offender ID number
  Offender age, gender, race

Source: Adapted from Federal Bureau of Investigation (2000, 6–8, 90).

Table 4.3 Crime in Idaho, 2004

UCR Part I Offenses

Murder, non-negligent manslaughter 35

Rape 577

Robbery 247

Aggravated assault 2,594

Burglary 7,700

Larceny 29,442

Motor vehicle theft 2,696

Arson 320

Subtotal 43,611

Additional NIBRS Group A Offenses

Simple assault 14,192

Intimidation 1,766

Bribery 7

Counterfeit/forgery 1,982

Destruction of property 14,516

Drug violations 6,667

Drug equipment violations 6,329

Embezzlement 295

Extortion/blackmail 12

Fraud 2,426

Gambling 9

Kidnapping/abduction 236

Pornography/obscene material 33

Prostitution 10

Forcible sex offenses 1,200

Nonforcible sex offenses 215

Stolen property 631

Weapons violations 1,385

Subtotal 51,911

Group A Total 95,522

Source: Adapted from Idaho Department of Law Enforcement,
“Crime in Idaho, 2004,” www.isp.state.id.us/identification/ucr/2004/crime_in_Idaho_2004.html;
accessed May 13, 2008.



ing household members. Samples of banks, gas
stations, retail stores, business establishments,
or stockbrokers would be needed to measure
those crimes. In much the same fashion, crimes
directed at homeless victims cannot be counted
by surveys of households like the NCVS.

What about victimless crimes? For example,
think about how you would respond to a Cen-
sus Bureau interviewer who asked whether you
had been the victim of a drug sale. If you have
bought illegal drugs, you might think of your-
self as a customer rather than as a victim. Or
if you lived near a park where drug sales were
common, you might think of yourself as a vic-
tim even though you did not participate in a
drug transaction. The point is that victim sur-
veys are not good measures of victimless crimes
because the respondents can’t easily be con-
ceived as victims.

Measuring certain forms of delinquency
through victim surveys presents similar prob-
lems. Status offenses such as truancy and
curfew violations do not have identifiable victims
who can be included in samples based on
households. Homicide and manslaughter are
other crimes that are not well measured by vic-
tim surveys, for obvious reasons.

Since its inception, the NCVS has served
as a measure to monitor the volume of crime,
including crimes not reported to police. In a
regular series of publications, the BJS reports
annual victimization data together with analy-
sis of victimization for special topics such as
carjackings (Klaus 2004), intimate partner vio-
lence (Rand and Rennison 2005), and contacts
between individuals and the police (Durose,
Schmitt, and Langan 2005). In addition, the
NCVS is a valuable tool for researchers who
take advantage of detailed information about
individual victimizations to examine such top-
ics as victimization in public schools (Dinkes,
Cataldi, Lin-Kelly, et al. 2007), identity theft
(Baum 2006), victimization at work (Duhart
2001), and why domestic violence victimiza-
tions may or may not be reported to police (Fel-
son, Messner, Hoskin, et al. 2002).

NIBRS data, something that is certain to prompt
other researchers to do the same. For examples,
see studies of child abuse (Finkelhor and Orm-
rod 2004; Snyder 2000), hate crimes (Nolan,
Akiyama, and Berhanu 2002), and domestic vio-
lence (Vazquez, Stohr, and Perkiss 2005).

Victim Surveys
Conducting a victim survey that asks people
whether they have been the victim of a crime is
an alternative approach to operationalization.
In principle, measuring crime through surveys
has several advantages. Surveys can obtain in-
formation on crimes that were not reported to
police. Asking people about victimizations can
also measure incidents that police may not have
officially recorded as crimes. Finally, asking
people about crimes that may have happened
to them provides data on victims and offenders
(individuals) and on the incidents themselves
(social artifacts). Like an incident-based report-
ing system, a survey can therefore provide more
disaggregated units of analysis.

The National Crime Victimization Survey
Since 1972, the U.S. Census Bureau has con-
ducted the NCVS. The NCVS is based on a na-
tionally representative sample of households
and uses uniform procedures to select and in-
terview respondents, which enhances the reli-
ability of crime measures. Because individual
people living in households are interviewed, the
NCVS can be used in studies in which individu-
als or households are the unit of analysis. And
the NCVS uses a panel design, interviewing
respondents from the same household seven
times at six-month intervals.

The NCVS cannot measure all crimes, how-
ever, in part because of the procedures used to
select victims. Because the survey is based on a
sample of households, it cannot count crimes
in which businesses or commercial establish-
ments are the victims. Bank robberies, gas sta-
tion holdups, shoplifting, embezzlement, and
securities fraud are examples of crimes that
cannot be systematically counted by interview-


Surveys of Offending
Just as survey techniques can measure crime by
asking people to describe their experiences as
victims, self-report surveys ask people about
crimes they may have committed. We might
initially be skeptical of this technique: how
truthful are people when asked about crimes
they may have committed? Many people do not
wish to disclose illegal behavior to interviewers
even if they are assured of confidentiality. Others
might deliberately lie to interviewers and
exaggerate the number of offenses they have
committed. Our concern would be justified, although
researchers have devised various methods
to enhance the validity and reliability of self-report
data; we will examine these in Chapter 7.

In any event, self-report surveys are the best
method for operationalizing certain crimes
that are poorly measured by other techniques.
Thinking about the other methods we have
discussed (crimes known to police and victimization
surveys) suggests several examples.
Crimes such as prostitution and drug abuse
are excluded from victimization surveys and
underestimated by police records of people ar-
rested for these offenses. Public order crimes
and delinquency are other examples. A third
class of offenses that might be better counted
by self-report surveys is crimes that are rarely
reported to or observed by police—shoplifting
and drunk driving are examples.

Think of it this way: As we saw earlier, all
crimes require an offender. Not all crimes have
clearly identifiable victims who can be interviewed,
however, and not all crimes are readily
observed by police, victims, or witnesses. If we
can’t observe the offense and can’t interview a
victim, what’s the next logical step?

There are no nationwide efforts to systemat-
ically collect self-report measures on a variety of
offenses, as is the case with the UCR and NCVS.
Instead, periodic surveys yield information either
on specific types of crime or on crimes
committed by a specific target population.
We will briefly consider two ongoing self-report
surveys here.

Community Victimization Surveys Follow-
ing the initial development of victim survey
methods in the late 1960s, the Census Bureau
completed a series of city-level surveys. These
were discontinued for a variety of reasons, but
researchers and officials in the BJS occasionally
conducted city-level victim surveys in specific
communities. In 1998, the BJS and the Office of
Community Oriented Policing Services (COPS)
launched pilot surveys in 12 large and medium-
sized cities (Smith, Steadman, Minton, and
Townsend 1999).

The city-level initiative underscores one
of the chief advantages of measuring crime
through victim surveys: obtaining counts of
incidents not reported to police. In large part,
city-level surveys were promoted by BJS and
COPS to enable local law enforcement agencies
to better understand the scope of crime, reported
and unreported, in their communities.
Notice also the title of the first report: “Criminal
Victimization and Perceptions of Community
Safety in 12 Cities, 1998.” We emphasize
perceptions to illustrate that city-level surveys
can be valuable tools for implementing community
policing, a key component of the 1994
Crime Bill that provided billions of dollars to
hire new police officers nationwide. It is significant
that the Department of Justice recognized
the potential value of survey measures of crime
and perceptions of community safety to develop
and evaluate community policing.

The initial BJS/COPS effort was a pilot test
of new methods for conducting city-level sur-
veys. These bureaus jointly developed a guide-
book and software so that local law enforce-
ment agencies and other groups can conduct
their own community surveys (Weisel 1999).
These tools also promise to be useful for re-
searchers who wish to study local patterns
of crime and individual responses. Although
the community survey initiative lapsed after
George W. Bush became president, the BJS con-
tinues to update software and make it available
on its website (www.ojp.usdoj.gov/bjs/abstract/cvs.htm;
accessed May 13, 2008).


includes several samples of high school stu-
dents and other groups, totaling about 49,500
respondents in 2004 (Johnston et al. 2005).

Each spring between 120 and 140 high
schools are sampled within particular geo-
graphic areas. In larger high schools, samples
of up to 350 seniors are selected; in smaller
schools, all seniors may participate. Students
fill out computer scan sheets containing batteries
of questions that include self-reported use
of alcohol, tobacco, and illegal drugs. In most
cases, students record their answers in class-
rooms during normal school hours.

The core sample of the MTF—surveys of
high school seniors—thus provides a cross sec-
tion for measuring annual drug use and other
illegal acts. Now recall our discussion of the
time dimension in Chapter 3. Each year, both
the MTF and the NSDUH survey drug use for a
cross section of high school seniors and adults
in households, thus providing a snapshot of an-
nual rates of self-reported drug use. Examining
annual results from the MTF and the NSDUH
over time provides a time series, or trend study,
that enables researchers and policy makers to
detect changes in drug use among high school
seniors and adults. Finally, a series of follow-up
samples of MTF respondents constitute a series
of panel studies whereby changes in drug use
among individual respondents can be studied
over time. Thomas Mieczkowski (1996) pres-
ents an excellent discussion of these two sur-
veys and compares self-reported drug use from
each series over time.

Measuring Crime Summary
Table 4.4 summarizes the strengths and weak-
nesses of different measures of crime. The UCR
and SHR provide the best counts for murder
and crimes in which the victim is a business or
a commercial establishment. Crimes against
persons or households that are not reported to
police are best counted by the NCVS. Usually
these are less serious crimes, many of them UCR
Part II incidents that are counted only if a sus-

National Survey on Drug Use and
Health Like the NCVS, the National Survey
on Drug Use and Health (NSDUH) is based
on a national sample of households. Currently
sponsored by the Substance Abuse and Mental
Health Services Administration in the U.S. De-
partment of Health and Human Services, the
NSDUH samples households and household
residents ages 12 and older. In the 2004 sample,
about 68,000 individuals responded to ques-
tions regarding their use of illegal drugs, alco-
hol, and tobacco (Substance Abuse and Mental
Health Services Administration 2005). Because
it has been conducted for more than three de-
cades, the NSDUH provides information on
trends and changes in drug use among respon-
dents. The 2004 survey was designed to obtain
statistically reliable samples from the eight
largest states in addition to the overall national
sample.

Think for a moment about what sorts of
questions we would ask to learn about people’s
experience in using illegal drugs. Among other
things, we would probably want to distinguish
someone who tried marijuana once from daily
users. The drug use survey does this by includ-
ing questions to distinguish lifetime use (ever
used) of different drugs from current use (used
within the past month). You may or may not
agree that use in the past month represents cur-
rent use, but it is the standard used in regular
reports on NSDUH results. That’s the operational
definition of current use.
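
To see how an operational definition turns into a concrete coding rule, consider the following Python sketch. It is a hypothetical illustration, not the NSDUH's actual coding procedure: the function name, the input (days since a respondent last used a drug), and the choice to treat "the past month" as 30 days are all our assumptions.

```python
from typing import Optional

# Assumption for illustration: "used within the past month" is treated as 30 days.
CURRENT_USE_WINDOW_DAYS = 30


def classify_use(days_since_last_use: Optional[int]) -> dict:
    """Code one respondent's answer into lifetime and current use.

    days_since_last_use is None when the respondent reports never
    having used the drug.
    """
    lifetime_use = days_since_last_use is not None
    current_use = lifetime_use and days_since_last_use <= CURRENT_USE_WINDOW_DAYS
    return {"lifetime_use": lifetime_use, "current_use": current_use}


# Someone who tried marijuana once more than a year ago: lifetime use only.
print(classify_use(400))
# Someone who used last week: both lifetime and current use.
print(classify_use(7))
# Someone who never used: neither.
print(classify_use(None))
```

Changing the window constant changes who counts as a current user, which is exactly why the operational definition has to be stated explicitly.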

Monitoring the Future Our second exam-
ple is different in two respects: (1) it targets a
specific population, and (2) it asks sampled respondents
a broader variety of questions.

Since 1975, the National Institute on Drug
Abuse has sponsored an annual survey of high
school seniors, Monitoring the Future: A Continu-
ing Study of the Lifestyles and Values of Youth, or the
MTF for short. As its long title implies, the MTF
survey is intended to monitor the behaviors,
attitudes, and values of young people. The MTF


ever measure of crime best suits their research
purpose.

Composite Measures
Combining individual measures often produces
more valid and reliable indicators.

Sometimes it is possible to construct a single
measure that captures the variable of interest.
For example, asking auto owners whether their
car has been stolen in the previous six months
is a straightforward way to measure auto-theft
victimization. But other variables may be better
measured by more than one indicator. To begin
with a simple and well-known example, the FBI
crime index was a composite measure of crime
that combined police reports for seven differ-
ent offenses into one indicator.

Composite measures are frequently used
in criminal justice research for three reasons.
First, despite carefully designing studies to

pect is arrested. Recent changes in NCVS pro-
cedures have improved counts of sexual assault
and other violent victimizations. Compared
with the UCR, NIBRS potentially provides much
greater detail for a broader range of offenses. NI-
BRS complements the NCVS by including disag-
gregated incident-based reports for state and lo-
cal areas and by recording detailed information
on crimes against children younger than age 12.

Self-report surveys are best at measuring
crimes that do not have readily identifiable victims
and that are less often observed by or reported
to police. The two self-report surveys
listed in Table 4.4 sample different populations
and use different interview procedures.

Don’t forget that all crime measures are selective,
so it’s important to understand the selection
process. Despite their various flaws, the
measures of crime available to you can serve
many research purposes. Researchers are best
advised to be critical and careful users of what-

Table 4.4 Measuring Crime Summary

Known to police

UCR
  Units: Aggregate (reporting agency)
  Target population: All law enforcement agencies; 98% reporting
  Crime coverage: Limited number reported and recorded crimes
  Best count for: Commercial and business victims

SHR
  Units: Incident
  Target population: All law enforcement agencies; 98% reporting
  Crime coverage: Homicides only
  Best count for: Homicides

NIBRS
  Units: Incident
  Target population: All law enforcement agencies; limited reporting
  Crime coverage: Extensive
  Best count for: Details on local incidents; victims under age 12

Surveys

NCVS
  Units: Victimization, individuals and households
  Target population: Individuals in households
  Crime coverage: Household and personal crimes
  Best count for: Household and personal crimes not reported to police

NSDUH
  Units: Individual respondent, offender
  Target population: Individuals in households
  Crime coverage: Drug use
  Best count for: Drug use by adults in households

MTF
  Units: Individual respondent, offender
  Target population: High school seniors; follow-up on sample
  Crime coverage: Substance use, delinquency, offending
  Best count for: Drug use by high school seniors


very nearly maintaining the specific details of
all the individual indicators.

Typologies
Researchers combine variables in different ways
to produce different composite measures. The
simplest of these is a typology, sometimes
called a taxonomy. Typologies are produced by
the intersection of two or more variables to cre-
ate a set of categories or types. We may, for ex-
ample, wish to classify people according to the
range of their experience in criminal court. As-
sume we have asked a sample of people whether
they have ever served as a juror and whether
they have ever testified as a witness in criminal
court. Table 4.5 shows how the yes and no
responses to these two questions can be com-
bined into a typology of experience in court.

Typologies can be more complex, combining
scores on three or more measures or combining
scores on two measures that take many
different values. For an example of a complex
typology, consider research by Rolf Loeber and
associates (Loeber, Stouthamer-Loeber, von
Kammen, and Farrington 1991) on patterns of
delinquency over time. The researchers used a
longitudinal design in which a sample of boys
was selected from Pittsburgh public schools
and interviewed many times. Some questions
asked about their involvement in delinquency
and criminal offending. This approach made it
possible to distinguish boys who reported dif-
ferent types of offending at different times.

Loeber and associates first classified delinquent
and criminal acts into the following ordinal
seriousness categories (1991, 44):

None: No self-reported delinquency
Minor: Theft of items worth less than $5, vandalism, fare evasion
Moderate: Theft more than $5, gang fighting, carrying weapons
Serious: Car theft, breaking and entering, forced sex, selling drugs

provide valid and reliable measurements of vari-
ables, the researcher is often unable to develop
single indicators of complex concepts. That
is especially true with regard to attitudes and
opinions that are measured through surveys.
For example, measuring fear of crime through
a question that asks about feelings of safety on
neighborhood streets measures some dimen-
sions of fear but certainly not all of them. This
leads us to question the validity of using that
single question to measure fear of crime.

Second, we may wish to use a rather refined
ordinal measure of a variable, arranging cases in
several ordinal categories from very low to very
high according to a variable such as degree of
parental supervision. A single data item might
not have enough categories to provide the de-
sired range of variation, but an index or scale
formed from several items would.

Finally, indexes and scales are efficient devices
for data analysis. If a single data item gives only
a rough indication of a given variable, consid-
ering several data items may give us a more
comprehensive and more accurate indication.
For example, the results of a single drug test
would give us some indication of drug use by
a probationer. Examining results from several
drug tests would give us a better indication, but
the manipulation of several data items simul-
taneously can be very complicated. In contrast,
composite measures are efficient data reduction
devices. Several indicators may be summarized
in a single numerical score, even while perhaps

Table 4.5 Typology of Court Experience

                            Serve on Jury?
                            No      Yes
Testify as Witness?   No    A       B
                      Yes   C       D

Typology
A: No experience with court
B: Experience as juror only
C: Experience as witness only
D: Experience as juror and witness
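
The cross-classification in Table 4.5 can be expressed as a simple lookup. The Python sketch below is a minimal illustration under our own assumptions: the function name and the five sample respondents are hypothetical, and only the four categories and their labels come from the table.

```python
from collections import Counter

# Labels taken from Table 4.5.
LABELS = {
    "A": "No experience with court",
    "B": "Experience as juror only",
    "C": "Experience as witness only",
    "D": "Experience as juror and witness",
}


def court_experience_type(served_on_jury: bool, testified_as_witness: bool) -> str:
    """Return the Table 4.5 category for one respondent's two yes/no answers."""
    typology = {
        (False, False): "A",
        (True, False): "B",
        (False, True): "C",
        (True, True): "D",
    }
    return typology[(served_on_jury, testified_as_witness)]


# Hypothetical answers: (served on jury?, testified as witness?)
respondents = [
    (False, False),
    (True, False),
    (False, True),
    (True, True),
    (True, False),
]

counts = Counter(court_experience_type(jury, witness) for jury, witness in respondents)
for category in sorted(counts):
    print(category, LABELS[category], counts[category])
```

Two yes/no variables intersect to give exactly four types, so every respondent falls into one and only one category.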


low-up) with four categories each are reduced
to a single variable with six categories. Fur-
thermore, the two measures of delinquency are
themselves composite measures, produced by
summarizing self-reports of a large number of
individual offenses. Finally, notice also how this
efficiency is reflected in the clear meaning of the
new composite measure. This dynamic typology
summarizes information about time, offending,
and offense seriousness in a single measure.

An Index of Disorder
“What is disorder, and what isn’t?” asks Wes-
ley Skogan (1990a) in his book on the links
between crime, fear, and social problems such
as public drinking, drug use, litter, prostitu-
tion, panhandling, dilapidated buildings, and
groups of boisterous youths. In an influential
article titled “Broken Windows,” James Q. Wil-
son and George Kelling (1982) describe dis-
order as a sign of crime that may contribute
independently to fear and crime itself. The ar-
gument goes something like this: Disorder is
a symbol of urban decay that people associate
with crime. Signs of disorder can produce two
related problems. First, disorder may contrib-
ute to fear of crime, as urban residents believe
that physical decay and undesirables are sym-
bols of crime. Second, potential offenders may
interpret evidence of disorder as a signal that
informal social control mechanisms in a neigh-
borhood have broken down and that the area is
fair game for mayhem and predation.

We all have some sort of mental image (con-
ception) of disorder, but, to paraphrase Sko-
gan’s question: how do we measure it? Let’s
begin by distinguishing two conceptions of dis-
order. First, we can focus on the physical pres-
ence of disorder—whether litter, public drink-
ing, public drug use, and the like are evident
in an urban neighborhood. We might measure
the physical presence of disorder through a se-
ries of systematic observations. This is the ap-
proach used by Robert Sampson and Stephen
Raudenbush (1999) in their study of links

Next, to measure changes in delinquency
over time, the researchers compared reports of
delinquency from the first screening interview
with reports from follow-up interviews. These
two measures, delinquency at time 1 and delinquency
at time 2, formed the typology, which
they referred to as a “dynamic classification of
offenders” (1991, 44). Table 4.6 summarizes this
typology.

The first category in the table, nondelinquent,
includes those boys who reported committing
no offenses at both the screening and
follow-up interviews. Starters reported no of-
fenses at screening and then minor, moderate,
or serious delinquency at follow-up, whereas de-
sistors were just the opposite. Those who com-
mitted the same types of offenses at both times
were labeled stable; deescalators reported com-
mitting less serious offenses at follow-up; and
escalators moved on to more serious offenses.

Notice the efficiency of this typology. Two
variables (delinquency at screening and fol-

Table 4.6 Typology of Change in Juvenile Offending

                     Juvenile Offending
Typology             Screening (Time 1)   Follow-Up (Time 2)
A. Nondelinquent     0                    0
B. Starter           0                    1, 2, or 3
C. Desistor          1, 2, or 3           0
D. Stable            1                    1
D. Stable            2                    2
D. Stable            3                    3
E. Deescalator       3                    2
E. Deescalator       2 or 3               1
F. Escalator         1                    2 or 3
F. Escalator         2                    3

Juvenile Offending Typology
0: None
1: Minor
2: Moderate
3: Serious

Source: Adapted from Loeber and associates (1991, 43–46).
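
The rules in Table 4.6 can be written as a short classification function. The Python sketch below is our own illustration of the table's logic, not code used by Loeber and associates; the function name and the 0 to 3 integer coding of seriousness follow the table's key, and everything else is an assumption.

```python
from itertools import product

# Seriousness codes from the key to Table 4.6.
SERIOUSNESS = {0: "None", 1: "Minor", 2: "Moderate", 3: "Serious"}


def classify_change(time1: int, time2: int) -> str:
    """Return the dynamic typology category for screening and follow-up scores."""
    if time1 not in SERIOUSNESS or time2 not in SERIOUSNESS:
        raise ValueError("seriousness codes must be 0, 1, 2, or 3")
    if time1 == 0 and time2 == 0:
        return "Nondelinquent"
    if time1 == 0:
        return "Starter"        # no offenses at screening, some at follow-up
    if time2 == 0:
        return "Desistor"       # some offenses at screening, none at follow-up
    if time1 == time2:
        return "Stable"         # same seriousness level at both times
    return "Deescalator" if time2 < time1 else "Escalator"


# Every possible pair of screening and follow-up scores.
all_pairs = list(product(SERIOUSNESS, repeat=2))
categories = {classify_change(t1, t2) for t1, t2 in all_pairs}

print(len(all_pairs), "score combinations reduce to", len(categories), "categories")
print(classify_change(3, 1))
print(classify_change(1, 2))
```

Enumerating the pairs makes the data reduction concrete: two four-category variables can combine in sixteen ways, and the typology summarizes them in six types.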


measure different types of disorder and ap-
pear to have reasonable face validity. However,
examining the relationship between each indi-
vidual item and respondents’ fear of crime or
experience as a crime victim would be unwieldy
at best. So Skogan created two indexes, one
for social disorder and one for physical dis-
order, by adding up the scores for each item
and dividing by the number of items in each
group. Figure 4.2 shows a hypothetical sample
questionnaire for these nine items, together
with the scores that would be produced for
each index.

This example illustrates how several related

between disorder and crime in Chicago. Unfor-
tunately, these authors observed very few ex-
amples of disorder and altogether ignored the
question of whether such behaviors were per-
ceived as problematic by residents of Chicago
neighborhoods.

That brings us to the second conception,
one focusing on the perception of disorder.
Thus some people might view public drinking
as disorderly, whereas others (New Orleans resi-
dents, for example) consider public drinking
to be perfectly acceptable. Questionnaires and
survey methods are the best suited for measur-
ing perceived disorder.

Skogan used questions about nine different
examples of disorder and classified them into
two groups representing what he calls social
and physical disorder (Skogan 1990a, 51, 191).
Questions corresponding to each of these ex-
amples of disorder asked respondents to rate
each as a big problem (scored 2), some problem
(scored 1), or almost no problem (scored 0) in
their neighborhood. Together, these nine items

Introduction:

Now I’m going to read you a list of crime-related problems that may be found in some parts
of the city. For each one, please tell me how much of a problem it is in your neighborhood. Is
it a big problem, some problem, or almost no problem?

                                               Big       Some      No
                                               problem   problem   problem
(S) Groups of people loitering                 [2]        1         0
(S) People using or selling drugs               2        [1]        0
(P) Abandoned buildings                         2         1        [0]
(S) Vandalism                                   2        [1]        0
(P) Garbage and litter on street               [2]        1         0
(S) Gangs and gang activity                     2         1        [0]
(S) People drinking in public                  [2]        1         0
(P) Junk in vacant lots                         2         1        [0]
(S) People making rude or insulting remarks     2        [1]        0

Bracketed values mark the responses selected.

(S) Social = 2 + 1 + 1 + 0 + 2 + 1 = 7
Index score = 7/6 = 1.17

(P) Physical = 0 + 2 + 0 = 2
Index score = 2/3 = 0.67

Figure 4.2 Index of Disorder

Social Disorder
  Groups of loiterers
  Drug use and sales
  Vandalism
  Gang activity
  Public drinking
  Street harassment

Physical Disorder
  Abandoned buildings
  Garbage and litter
  Junk in vacant lots
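
The index scores in Figure 4.2 are simply the mean item score within each group. The Python sketch below reproduces that arithmetic for the hypothetical respondent in the figure; the data layout and the function name are our own assumptions, and this is an illustration rather than Skogan's actual procedure.

```python
from statistics import mean

# (group, item, score) for the hypothetical respondent in Figure 4.2.
# "S" marks social disorder items, "P" marks physical disorder items.
# Scores: 2 = big problem, 1 = some problem, 0 = almost no problem.
responses = [
    ("S", "Groups of people loitering", 2),
    ("S", "People using or selling drugs", 1),
    ("P", "Abandoned buildings", 0),
    ("S", "Vandalism", 1),
    ("P", "Garbage and litter on street", 2),
    ("S", "Gangs and gang activity", 0),
    ("S", "People drinking in public", 2),
    ("P", "Junk in vacant lots", 0),
    ("S", "People making rude or insulting remarks", 1),
]


def disorder_index(item_responses, group):
    """Average the item scores for one group: sum of scores / number of items."""
    scores = [score for item_group, _, score in item_responses if item_group == group]
    return mean(scores)


social_index = disorder_index(responses, "S")
physical_index = disorder_index(responses, "P")

print(f"Social disorder index: {social_index:.2f}")
print(f"Physical disorder index: {physical_index:.2f}")
```

Because each index divides by the number of items in its group, the social index (7/6, about 1.17 when rounded to two decimals) and the physical index (2/3, about 0.67) sit on the same 0 to 2 scale even though one is built from six items and the other from three.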


• Higher levels of measurements specify catego-
ries that have ranked order or more complex
numerical properties.

• A given variable can sometimes be measured at
different levels of measurement. The most ap-
propriate level of measurement used depends
on the purpose of the measurement.

• Precision refers to the exactness of the measure
used in an observation or description of an
attribute.

• Reliability and validity are criteria for measure-
ment quality. Valid measures are truly indica-
tors of underlying concepts. A reliable measure
is consistent.

• Crime is a fundamental concept in criminal
justice research. Different approaches to mea-
suring crime illustrate general principles of
conceptualization, operationalization, and
measurement. We have different measures of
crime because each measure has its strengths
and weaknesses.

• Different measures of crime are based on differ-
ent units of analysis. Uniform Crime Reports
are summary measures that report totals for in-
dividual agencies. Other measures use offend-
ers, victims, incidents, or offenses as the units
of analysis.

• Crimes known to police have been the most
widely used measures. UCR data are available
for most of the 20th century; more detailed in-
formation about homicides was added to the
UCR in 1961. Most recently, the FBI has devel-
oped an incident-based reporting system that is
gradually being adopted.

• Surveys of victims reveal information about
crimes that are not reported to police. The
NCVS includes detailed information about per-
sonal and household incidents, but does not
count crimes against businesses or individual
victims under age 12.

• Self-report surveys were developed to measure
crimes with unclear victims that are less often
detected by police. Two surveys estimate drug
use among high school seniors and adults.

• The creation of specific, reliable measures often
seems to diminish the richness of meaning our
general concepts have. A good solution is to use
multiple measures, each of which taps different
aspects of the concept.

• Composite measures, formed by combining
two or more variables, are often more valid
measures of complex criminal justice concepts.

variables can be combined to produce an index
that has three desirable properties. First, a com-
posite index is a more valid measure of disorder
than is a single question. Second, computing
and averaging across all items in a category cre-
ate more variation in the index than we could
obtain in any single item. Finally, two indexes
are more parsimonious than nine individual
variables; data analysis and interpretation can
be more efficient.

Measurement Summary
We have covered substantial ground in this
chapter but still have introduced only the im-
portant and often complex issue of measure-
ment in criminal justice research. More than
a step in the research process, measurement
involves continuous thinking about what con-
ceptual properties we wish to study, how we
will operationalize those properties, and how
we will develop measures that are reliable and
valid. Often some type of composite measure
better represents underlying concepts and thus
enhances validity.

Subsequent chapters will pursue issues of
measurement further. Part Three of this book
will describe data collection—how we go about
making actual measurements. And the next
chapter will focus on different approaches to
measuring crime.

✪ Main Points
• Concepts are mental images we use as summary
devices for bringing together observations and
experiences that seem to have something in
common.

• Our concepts do not exist in the real world, so
they can’t be measured directly.

• In operationalization, we specify concrete em-
pirical procedures that will result in measure-
ments of variables.

• Operationalization begins in study design and
continues throughout the research project, in-
cluding the analysis of data.

• Categories in a measure must be mutually ex-
clusive and exhaustive.


of trying to measure a particular dimension of
crime: motive. Other examples are hate crimes,
terrorism, and drug-related crimes. Specify
conceptual and operational definitions for at
least one of these types. Find one newspaper
story and one research report that present an
example.

✪ Additional Readings
Best, Joel, Damned Lies and Statistics: Untangling
Numbers from the Media, Politicians, and Activists
(Berkeley: University of California Press, 2001).
Despite the title, much of this entertaining
and informative book describes problems with
measurement. For example, page 45 tells us:
“Measuring involves deciding how to go about
counting.” Best emphasizes how ambiguity in
measures of social problems make it easy for
advocates to exaggerate the frequency of such
problems. Mass media often report and per-
petuate errorful measures. What results, Best
informs us, are mutant statistics.

Bureau of Justice Statistics, Performance Measures for
the Criminal Justice System (Washington, DC: U.S.
Department of Justice, Office of Justice Programs,
Bureau of Justice Statistics, 1993). This
collection of essays by prominent criminal jus-
tice researchers focuses on developing measures
for evaluation uses. The discussion of general
measurement issues as encountered in differ-
ent types of justice agencies is uncommonly
thoughtful. You will find this a provocative
discussion of how to measure important con-
structs in corrections, trial courts, and policing.
See especially the general essays by John DiIulio
and James Q. Wilson.

Gaes, Gerald G., Scott D. Camp, Julianne B. Nelson,
and William G. Saylor. Measuring Prison Perfor-
mance: Government Privatization and Accountabil-
ity. (Walnut Creek, CA: AltaMira Press, 2004).
This book stemmed partly from the BJS report,
in an effort to expand how to measure the vari-
ous dimensions of prisons. Another stated goal
of the authors is to devise a system for compar-
ing the performance of public and private cor-
rectional facilities. This is an excellent resource
for anyone interested in corrections.

Hough, Mike, and Mike Maxfield (eds.) Surveying
Crime in the 21st Century: Crime Prevention Stud-
ies, vol. 22. (Monsey, NY: Criminal Justice Press,
2007). This collection of essays was produced to

✪ Review Questions and Exercises
1. Review the box titled “What Is Recidivism?” on
page 84. From that discussion, write conceptual
and operational definitions for recidivism. Summarize
how Fabelo proposes to measure the
concept. Finally, discuss possible reliability and
validity issues associated with Fabelo’s proposed
measure.

2. We all have some sort of mental image of the
pace of life. In a fascinating book titled A Geog-
raphy of Time, Robert Levine (1997) operational-
ized the pace of life in cities around the world
with a composite measure of the following:
a. How long it took a single pedestrian to
walk 60 feet on an uncrowded sidewalk
b. What percentage of public clocks displayed
the correct time
c. How long it took to purchase the equivalent
of a first-class postage stamp with the
equivalent of a $5 bill

Discuss possible reliability and validity issues
with these indicators of the pace of life. Be sure
to specify a conceptual definition for pace of life.

3. Los Angeles police consider a murder to be
gang-related if either the victim or the offender
is known to be a gang member, whereas Chi-
cago police record a murder as gang-related
only if the killing is directly related to gang activities
(Spergel 1990). Describe how these different
operational definitions illustrate general
points about measuring crime discussed in this
chapter.

4. Measuring gang-related crime is an example

✪ Key Terms
concept, p. 82
conception, p. 81
conceptual definition, p. 85
conceptualization, p. 83
construct validity, p. 95
criterion-related validity, p. 95
dimension, p. 83
face validity, p. 95
incident-based measure, p. 100
interval measures, p. 90
nominal measures, p. 89
operational definition, p. 85
ordinal measures, p. 90
ratio measures, p. 90
reliability, p. 93
self-report survey, p. 103
summary-based measure, p. 100
typology, p. 106
validity, p. 94
victim survey, p. 102


no one knows how to measure other dimen-
sions of police performance. This document
presents papers and discussions from a series of
meetings in which police, researchers, reporters,
and others discussed what matters in policing
and how to measure it.

Moore, Mark H., and Anthony Braga. The “Bottom
Line” of Policing: What Citizens Should Value (and
Measure) in Police Performance. (Washington,
DC: Police Executive Research Forum, 2003).
This is a spin-off from Langworthy’s anthology.
Though somewhat long-winded, the authors
offer an exceptionally thoughtful discussion
of measuring different dimensions of police
performance.

commemorate the 25th anniversary of the Brit-
ish Crime Survey. Contributors describe what
they have learned from crime surveys in many
research areas. In the concluding essay, the edi-
tors (with Pat Mayhew) suggest how crime sur-
veys should be revised.

Langworthy, Robert (ed.), Measuring What Matters:
Proceedings from the Policing Research Institute
Meetings (Washington, DC: U.S. Department of
Justice, Office of Justice Programs, National
Institute of Justice, 1999). With the spread of
community policing, researchers and officials
alike have struggled with the question of how to
measure police performance. Most people agree
that simply counting crimes is not enough, but


Chapter 5

Experimental and
Quasi-Experimental Designs
We’ll learn about the experimental approach to social scientific research. We’ll
consider a wide variety of experimental and other designs available to crimi-
nal justice researchers.

Introduction
The Classical Experiment
Independent and Dependent Variables
Pretesting and Posttesting
Experimental and Control Groups
Double-Blind Experiments
Selecting Subjects
Randomization
Experiments and Causal Inference
Experiments and Threats to Validity
Threats to Internal Validity
Ruling Out Threats to Internal Validity
Generalizability and Threats to Validity
Variations in the Classical Experimental Design
Quasi-Experimental Designs
Nonequivalent-Groups Designs
Cohort Designs
Time-Series Designs


frequently, in an average week, they consume alcohol
for the specific purpose of getting drunk.
Next, we might show these subjects a video
depicting the various physiological effects of
chronic drinking and binge drinking. Finally—
say, one month later—we might again ask the
subjects about their use of alcohol in the pre-
vious week to determine whether watching the
video reduced alcohol use.

You might typically think of experiments as
being conducted in laboratories under carefully
controlled conditions. Although this may be
true in the natural sciences, few social scientific
experiments take place in laboratory settings.
The most notable exception to this occurs in
the discipline of psychology, in which labora-
tory experiments are common. Criminal justice
experiments are almost always conducted in
field settings outside the laboratory.

The Classical Experiment
Variables, time order, measures, and groups are
the central features of the classical experiment.

Like much of the vocabulary of research, the
word experiment has acquired both a general
and a specialized meaning. So far, we have referred
to the general meaning, defined by David
Farrington, Lloyd Ohlin, and James Q. Wilson
(1986, 65) as “a systematic attempt to test a
causal hypothesis about the effect of variations
in one factor (the independent variable) on another
(the dependent variable). . . . The defining
feature of an experiment lies in the control of
the independent variable by the experimenter.”
In a narrower sense, the term experiment refers
to a specific way of structuring research, usually
called the classical experiment. In this section,

Introduction
Experimentation is an approach to research best
suited for explanation and evaluation.

Research design in the most general sense involves
devising a strategy for finding out something.
We’ll first discuss the experiment as
a mode of scientific observation in criminal
justice research. At base, experiments involve
(1) taking action and (2) observing the consequences
of that action. Social scientific researchers
typically select a group of subjects, do
something to them, and observe the effect of
what was done.

It is worth noting at the outset that experiments
are often used in nonscientific human
inquiry as well. We experiment copiously in our
attempts to develop a more generalized under-
standing about the world we live in. We learn
many skills through experimentation: riding a
bicycle, driving a car, swimming, and so forth.
Students discover how much studying is re-
quired for academic success through experi-
mentation. Professors learn how much prepara-
tion is required for successful lectures through
experimentation.

Experimentation is especially appropriate
for hypothesis testing and evaluation. Suppose
we are interested in studying alcohol abuse
among college students and in discovering
ways to reduce it. We might hypothesize that
acquiring an understanding about the health
consequences of binge drinking and long-term
alcohol use will have the effect of reducing
alcohol abuse. We can test this hypothesis ex-
perimentally. To begin, we might ask a group of
experimental subjects how much beer, wine, or
spirits they drank on the previous day and how

Variations in Time-Series Designs
Variable-Oriented Research and Scientific Realism
Experimental and Quasi-Experimental Designs Summarized


It is essential that both independent and
dependent variables be operationally defined
for the purposes of experimentation. Such operational
definitions might involve a variety of
observation methods. Responses to a questionnaire,
for example, might be the basis for defining
self-reported alcohol use on the previous
day. Alternatively, alcohol use by subjects could
be measured with a Breathalyzer® or other
blood alcohol test.

Pretesting and Posttesting
In the simplest experimental design, subjects are
measured on a dependent variable (pretested),
exposed to a stimulus that represents an inde-
pendent variable, and then remeasured on the
dependent variable (posttested). Differences
noted between the first and second measurements
on the dependent variable are then attributed
to the influence of the independent
variable.

In our example of alcohol use, we might
begin by pretesting the extent of alcohol use
among our experimental subjects. Using a ques-
tionnaire, we measure the extent of alcohol use
reported by each individual and the average
level of alcohol use for the whole group. After
showing subjects the video on the effects of al-
cohol, we administer the same questionnaire
again. Responses given in this posttest permit
us to measure the subsequent extent of alcohol
use by each subject and the average level of alco-
hol use of the group as a whole. If we discover a
lower level of alcohol use on the second admin-
istration of the questionnaire, we might con-
clude that the video indeed reduced the use of
alcohol among the subjects. In the experimen-
tal examination of behaviors such as alcohol
use, we face a special practical problem relat-
ing to validity. As you can imagine, the subjects
might respond differently to the questionnaires
the second time, even if their level of drinking
remained unchanged. During the first administration
of the questionnaire, the subjects might
have been unaware of its purpose. By the time of
the second measurement, however, they might

we examine the requirements and components
of the classical experiment. Later in the chapter
we will consider designs that can be used when
some of the requirements for classical experi-
ments cannot be met.

The most conventional type of experiment
in the natural and the social sciences involves
three major pairs of components: (1) indepen-
dent and dependent variables, (2) pretesting
and posttesting, and (3) experimental and con-
trol groups. We will now consider each of those
components and the way they are put together
in the execution of an experiment.

Independent and
Dependent Variables
Essentially, an experiment examines the effect
of an independent variable on a dependent
variable. Typically the independent variable
takes the form of an experimental stimulus
that is either present or absent—that is, having
two attributes. In the example concerning alco-
hol abuse, how often subjects used alcohol is
the dependent variable and exposure to a video
about alcohol’s effects is the independent vari-
able. The researcher’s hypothesis suggests that
levels of alcohol use depend, in part, on under-
standing its physiological and health effects.
The purpose of the experiment is to test the va-
lidity of this hypothesis.

The independent and dependent variables
appropriate to experimentation are nearly lim-
itless. It should be noted, moreover, that a given
variable might serve as an independent variable
in one experiment and as a dependent variable
in another. Alcohol use is the dependent vari-
able in our example, but it might be the inde-
pendent variable in an experiment that exam-
ines the effects of alcohol abuse on academic
performance.

In the terms of our discussion of cause and
effect in Chapter 3, the independent variable is
the cause and the dependent variable is the ef-
fect. Thus we might say that watching the video
causes a change in alcohol use or that reduced
alcohol use is an effect of watching the video.


test of alcohol use to both groups. Figure 5.1
illustrates this basic experimental design.

Using a control group allows the researcher
to control for the effects of the experiment it-
self. If participation in the experiment leads
the subjects to report less alcohol use, that
should occur in both the experimental and the
control groups. If, on the one hand, the overall
level of drinking exhibited by the control group
decreases between the pretest and posttest as
much as for the experimental group, then the
apparent reduction in alcohol use must be a
function of some external factor, not a function
of watching the video specifically. In this
situation, we can conclude that the video did
not cause any change in alcohol use.

If, on the other hand, drinking decreases
only in the experimental group, then we can
be more confident in saying that the reduction
is a consequence of exposure to the video (be-
cause that’s the only difference between the two
groups). Or, alternatively, if drinking decreases
more in the experimental group than in the
control group, then that too is grounds for as-
suming that watching the video reduced alco-
hol use.
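
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the drinks-per-week figures are hypothetical, invented for this sketch, and a real analysis would also ask whether the difference is larger than chance alone would produce.

```python
from statistics import mean

def average_change(pretest, posttest):
    """Mean posttest score minus mean pretest score for one group."""
    return mean(posttest) - mean(pretest)

def stimulus_effect(exp_pre, exp_post, ctl_pre, ctl_post):
    """Change in the experimental group beyond the change in the control group."""
    return average_change(exp_pre, exp_post) - average_change(ctl_pre, ctl_post)

# Hypothetical self-reported drinks per week, invented for illustration only.
exp_pre, exp_post = [10, 8, 12, 6], [6, 5, 9, 4]
ctl_pre, ctl_post = [9, 11, 7, 9], [8, 10, 6, 8]

# Both groups report less drinking at the posttest, but the experimental
# group's decline is larger; the extra decline is what we attribute to the video.
effect = stimulus_effect(exp_pre, exp_post, ctl_pre, ctl_post)
print(effect)
```

If the control group's decline matched that of the experimental group, the computed effect would be zero, which corresponds to the case in which some external factor, not the video, accounts for the change.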

The need for control groups in experimenta-
tion has been most evident in medical research.

have figured out the purpose of the experiment,
become sensitized to the questions about
drinking, and changed their answers. Thus the
video might seem to have reduced alcohol abuse
although, in fact, it did not.

This is an example of a more general prob-
lem that plagues many forms of criminal justice
research: The very act of studying something
may change it. Techniques for dealing with this
problem in the context of experimentation are
covered throughout the chapter.

Experimental and Control Groups
The traditional way to offset the effects of the
experiment itself is to use a control group. Social
scientific experiments seldom involve only
the observation of an experimental group,
to which a stimulus has been administered.
Researchers also observe a control group, to
which the experimental stimulus has not been
administered.

In our example of alcohol abuse, two groups
of subjects are examined. To begin, each group
is administered a questionnaire designed to
measure their alcohol use in general and binge
drinking in particular. Then only one of the
groups—the experimental group—is shown the
video. Later, the researcher administers a post-

Figure 5.1 Basic Experimental Design
[Figure: flowchart. Experimental group: measure dependent variable, administer experimental stimulus (video), remeasure dependent variable. Control group: measure dependent variable, remeasure dependent variable. Comparison labels: “Compare: Same?” and “Compare: Different?”]


only participate in group discussions; the con-
trol group would do neither. With this kind of
design, we could determine the impact of each
stimulus separately, as well as their combined
effect.
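
One way to picture this four-group design is as every combination of two yes-or-no stimuli. The sketch below is an illustration only: the posttest means are hypothetical numbers invented for the example, and comparing each group with the control group is just the simplest of several possible ways to summarize such a design.

```python
from itertools import product

# Each condition is (sees_video, joins_discussion).
# (False, False) is the control group; the other three are experimental groups.
conditions = list(product([True, False], repeat=2))

# Hypothetical posttest means (drinks per week), invented for illustration only.
posttest_mean = {
    (True, True): 4.0,    # video and group discussion
    (True, False): 6.0,   # video only
    (False, True): 7.0,   # group discussion only
    (False, False): 9.0,  # control: neither stimulus
}

control = posttest_mean[(False, False)]
video_only_effect = posttest_mean[(True, False)] - control
discussion_only_effect = posttest_mean[(False, True)] - control
combined_effect = posttest_mean[(True, True)] - control

print(video_only_effect, discussion_only_effect, combined_effect)
```

In these invented figures the combined effect happens to equal the sum of the two separate effects; with real data the stimuli might reinforce or offset each other, which is something this kind of design allows us to see.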

Double-Blind Experiments
As we saw with medical experimentation, pa-
tients sometimes improve when they think they
are receiving a new drug; thus it is often nec-
essary to administer a placebo to a control
group.

Sometimes experimenters have this same
tendency to prejudge results. In medical re-
search, the experimenters may be more likely
to “observe” improvements among patients
who receive the experimental drug than among
those receiving the placebo. That would be
most likely, perhaps, for the researcher who de-
veloped the drug. A double-blind experiment
eliminates this possibility because neither the
subjects nor the experimenters know which is
the experimental group and which is the con-
trol. In medical experiments, those research-
ers who are responsible for administering the
drug and for noting improvements are not told
which subjects receive the drug. Thus both re-
searchers and subjects are blind with respect
to who is receiving the experimental drug and
who is getting the placebo. Another researcher
knows which subjects are in which group, but
that person is not responsible for administer-
ing the experiment.

Selecting Subjects
Before beginning an experiment, we must
make two basic decisions about who will par-
ticipate. First, we must decide on the target
population—the group to which the results of
our experiment will apply. If our experiment is
designed to determine, for example, whether
restitution is more effective than probation in
reducing recidivism, our target population is
some group of persons convicted of crimes. In
our hypothetical experiment about the effects
of watching a video on the health consequences

Time and again, patients who participated in
medical experiments appeared to improve, but
it was unclear how much of the improvement
came from the experimental treatment and
how much from the experiment. Now, in test-
ing the effects of new drugs, medical researchers
frequently administer a placebo (for example,
sugar pills) to a control group. Thus the con-
trol group patients believe that they, like mem-
bers of the experimental group, are receiving an
experimental drug—and they often improve. If
the new drug is effective, however, those who
receive that drug will improve more than those
who receive the placebo.

In criminal justice experiments, control
groups are important as a guard against the
effects of not only the experiments themselves
but also events that may occur outside the
laboratory during the course of experiments.
Suppose the alcohol use experiment was being
conducted on your campus and at that time a
popular athlete was hospitalized for acute al-
cohol poisoning after he and a chum drank a
bottle of rum. This event might shock the ex-
perimental subjects and thereby decrease their
reported drinking. Because such an effect
should happen about equally for members of
the control and experimental groups, lower lev-
els of reported alcohol use in the experimental
group than in the control group would again
demonstrate the impact of the experimental
stimulus: watching the video that describes the
health effects of alcohol abuse.

Sometimes an experimental design requires
more than one experimental or control group.
In the case of the alcohol video, we might also
want to examine the impact of participating in
group discussions about why college students
drink alcohol, with the intent of demonstrating
that peer pressure may promote drinking by
people who would otherwise abstain. We might
design our experiment around three experimen-
tal groups and one control group. One experi-
mental group would see the video and partici-
pate in the group discussions, another would
only see the video, and still another would


ization in criminal justice research to labora-
tory controls in the natural sciences:

The control of extraneous variables by ran-
domization is similar to the control of ex-
traneous variables in the physical sciences
by holding physical conditions (e.g., tem-
perature, pressure) constant. Randomiza-
tion insures that the average unit in [the]
treatment group is approximately equiva-
lent to the average unit in another [group]
before the treatment is applied.

You’ve surely heard the expression, “All other
things being equal.” Randomization makes
it possible to assume that all other things are
equal.

Experiments and
Causal Inference
Experiments potentially control for many threats
to the validity of causal inference, but researchers
must remain aware of these threats.

The central features of the classical experiment
are independent and dependent variables, pre-
testing and posttesting, and experimental and
control groups created through random assign-
ment. Think of these features as building blocks
of a research design to demonstrate a cause-
and-effect relationship. This point will become
clearer by comparing the criteria for causality,
discussed in Chapter 3, to the features of the
classical experiment, as shown in Figure 5.2.

The experimental design ensures that the
cause precedes the effect in time by taking post-
test measurements of the dependent variable
after introducing the experimental stimulus.
The second criterion for causation—an em-
pirical correlation between the cause-and-effect
variables—is determined by comparing the
pretest (in which the experimental stimulus is
not present) to the posttest for the experimen-
tal group (after the experimental stimulus is
administered). A change in pretest to posttest
measures demonstrates correlation.

of alcohol abuse, the target population might
be college students.

Second, we must decide how particular mem-
bers of the target population will be selected for
the experiment. In most cases, the methods used
to select subjects must meet the scientific norm
of generalizability; it should be possible to gen-
eralize from the sample of subjects studied to
the population those subjects represent.

Aside from the question of generalizability,
the cardinal rule of subject selection and exper-
imentation is the comparability of the experi-
mental and control groups. Ideally, the control
group represents what the experimental group
would have been like if it had not been exposed
to the experimental stimulus. It is essential,
therefore, that the experimental and control
groups be as similar as possible.

Randomization
Having recruited, by whatever means, a group
of subjects, we randomly assign those subjects
to either the experimental or the control group.
This might be accomplished by numbering all
the subjects serially and selecting numbers by
means of a random-number table. Or we might
assign the odd-numbered subjects to the exper-
imental group and the even-numbered subjects
to the control group.

Randomization is a central feature of the
classical experiment. The most important char-
acteristic of randomization is that it produces
experimental and control groups that are sta-
tistically equivalent. Put another way, random-
ization reduces possible sources of systematic
bias in assigning subjects to groups. The basic
principle is simple: if subjects are assigned to
experimental and control groups through a
random process such as flipping a coin, the assignment
process is said to be unbiased and the
resultant groups are equivalent.
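
A minimal sketch of random assignment in Python follows. It assumes subjects are identified by serial numbers, as suggested above, and uses a software random-number generator in place of a random-number table; the function name randomly_assign and the seed value are hypothetical choices made for this illustration.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into experimental and control groups."""
    rng = random.Random(seed)   # a fixed seed makes the assignment reproducible
    shuffled = list(subjects)   # copy, so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Number 200 recruited subjects serially, then assign them by chance alone.
subjects = list(range(1, 201))
experimental, control = randomly_assign(subjects, seed=42)

print(len(experimental), len(control))
```

Because chance alone decides who lands in which group, nothing about the subjects themselves (or about the researcher's preferences) can systematically influence the assignment.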

Although the rationale underlying this prin-
ciple is a bit complex, understanding how ran-
domization produces equivalent groups is a key
point. Farrington and associates (Farrington,
Ohlin, and Wilson 1986, 66) compare random-


from experimental results may not accurately
reflect what went on in the experiment itself.
Put differently, conclusions about cause
and effect may be biased in some systematic
way. Shadish, Cook, and Campbell (2002, 54–60)
pointed to several sources of the problem.
As you read about these different threats to in-
ternal validity, keep in mind that each is an ex-
ample of a simple point: possible ways research-
ers might be wrong in inferring causation.

History Historical events may occur during
the course of the experiment that confound the
experimental results. The hospitalization of a
popular athlete for acute alcohol poisoning dur-
ing an experiment on reducing alcohol use is an
example.

Maturation People are continually grow-
ing and changing, whether in an experiment
or not, and those changes affect the results of
the experiment. In a long-term experiment,
the fact that the subjects grow older may have
an effect. In shorter experiments, they may be-
come tired, sleepy, bored, or hungry, or change
in other ways that affect their behavior in the
experiment. A long-term study of alcohol abuse
might reveal a decline in binge drinking as the
subjects mature.

History and maturation are similar in that
they represent a correlation between cause
and effect that is due to something other
than the independent variable. They’re differ-
ent in that history represents something that’s
outside the experiment altogether, whereas

The final requirement is to show that the
observed correlation between cause and effect is
not due to the infl uence of a third variable. The
classical experiment makes it possible to satisfy
this criterion for cause in two ways. First, the
posttest measures for the experimental group
(stimulus present) are compared with those for
the control group (stimulus not present). If the
observed correlation between the stimulus and
the dependent variable is due to some other fac-
tor, then the two posttest scores will be similar.
Second, random assignment produces experi-
mental and control groups that are equivalent
and will not differ on some other variable that
could account for the empirical correlation be-
tween cause and effect.

Experiments and Threats to Validity
The classical experiment is designed to sat-
isfy the three requirements for demonstrating
cause-and-effect relationships. But what about
threats to the validity of causal inference, dis-
cussed in Chapter 3? In this section, we consider
certain threats in more detail and describe how
the classical experiment reduces many of them.
Our discussion draws mostly on the book by
William Shadish, Thomas Cook, and Donald
Campbell (2002). We present these threats in a
slightly different order, beginning with threats
to internal validity.

Threats to Internal Validity
The problem of threats to internal validity
refers to the possibility that conclusions drawn

Figure 5.2 Another Look at the Classical Experiment
[Figure: timeline diagram. Experimental group: pretest, stimulus, posttest. Control group: pretest, posttest. Comparisons are marked between measurements.]


questionnaires about prejudice, for example,
that is a testing problem. However, if different
questionnaires about prejudice are used in pre-
test and posttest measurements, instrumenta-
tion is a potential threat.

Statistical Regression Sometimes it’s appro-
priate to conduct experiments on subjects who
start out with extreme scores on the dependent
variable. Many sentencing policies, for example,
target chronic offenders. Commonly referred
to as regression to the mean, this threat to validity
can emerge whenever researchers are interested
in extreme cases. As a simple example, statisti-
cians often point out that extremely tall people
as a group are likely to have children shorter
than themselves, and extremely short people as
a group are likely to have children taller than
themselves. The danger, then, is that changes
occurring by virtue of subjects starting out in
extreme positions will be attributed erroneously
to the effects of the experimental stimulus.
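
Regression to the mean can be seen in a small simulation. The sketch below rests on assumptions made purely for illustration: each subject has a stable underlying level, every measurement adds independent random noise, and no stimulus of any kind is applied between the two measurements. All numbers are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(0)

# Each subject has a stable "true" level plus independent noise at each measurement.
true_level = [rng.gauss(50, 10) for _ in range(10_000)]
pretest = [t + rng.gauss(0, 10) for t in true_level]
posttest = [t + rng.gauss(0, 10) for t in true_level]

# Select the subjects with the most extreme (top 10 percent) pretest scores.
cutoff = sorted(pretest)[int(0.9 * len(pretest))]
extreme = [i for i, score in enumerate(pretest) if score >= cutoff]

extreme_pre = mean(pretest[i] for i in extreme)
extreme_post = mean(posttest[i] for i in extreme)

# With no treatment at all, the extreme group's average still moves back
# toward the overall average on the second measurement.
print(round(extreme_pre, 1), round(extreme_post, 1))
```

A researcher who had applied a stimulus to the extreme group between the two measurements could easily mistake this drop for an effect of the stimulus.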

Statistical regression can also be at work in
aggregate analysis of changes in crime rates.
For example, some researchers initially viewed
declines in crime rates throughout U.S. cities in
the 1990s as a return to more normal levels of
crime after abnormally high rates in the 1980s.

Selection Biases Randomization eliminates
the potential for systematic bias in selecting
subjects, but subjects may be chosen in other
ways that threaten validity. Volunteers are of-
ten solicited for experiments conducted on col-
lege campuses. Students who volunteer for an
experiment may not be typical of students as a
whole, however. Volunteers may be more inter-
ested in the subject of the experiment and more
likely to respond to a stimulus.

A common type of selection bias in applied
criminal justice studies results from the natural
caution of public officials. Let’s say you are a
bail commissioner in a large city, and the mayor
wants to try a new program to increase the
number of arrested persons who are released on
bail. The mayor asks you to decide what kinds
of defendants should be eligible for release and
informs you that staff from the city’s criminal

maturation refers to change within the subjects
themselves.

Testing Often the process of testing and retesting
influences people’s behavior and thereby
confounds the experimental results. Suppose
we administer a questionnaire to a group as a
way of measuring their alcohol use. Then we
administer an experimental stimulus and re-
measure their alcohol use. By the time we con-
duct the posttest, the subjects may have gotten
more sensitive to the issue of alcohol use and
so provide different answers. In fact, they may
believe we are trying to determine whether they
drink too much. Because excessive drinking is
frowned on by university authorities, our sub-
jects will be on their best behavior and give an-
swers that they think we want or that will make
them look good.

Instrumentation Thus far we haven’t said
much about the process of measurement in
pretesting and posttesting, and it’s appropri-
ate to keep in mind the problems of concep-
tualization and operationalization discussed
in Chapter 4. If we use different measures of
the dependent variable (say, different ques-
tionnaires about alcohol use), how can we be
sure that they are comparable? Perhaps alco-
hol use seems to have decreased simply because
the pretest measure was more sensitive than the
posttest measure.

Or if the measurements are being made by
the experimenters, their procedures may change
over the course of the experiment. You prob-
ably recognize this as a problem with reliability.
Instrumentation is always a potential problem
in criminal justice research that uses secondary
sources of information such as police records
about crime or court records about probation
violations. There may be changes in how probation
violations are defined or changes in the
record-keeping practices of police departments.

In general, testing refers to changes in how
subjects respond to measurement, whereas in-
strumentation is concerned with changes in
the measurement process itself. If police officers
respond differently to pretest and posttest


fending exhibited this threat to validity by rely-
ing on single interviews with subjects who were
asked how they viewed alternative punishments
and whether they had committed any crimes.

Ruling Out Threats to
Internal Validity
The classical experiment, coupled with proper
subject selection and assignment, can potentially
handle each of these threats to internal validity.
Let’s look again at the classical experiment, pre-
sented graphically in Figure 5.2.

Pursuing the example of the educational
video as an attempt to reduce alcohol abuse,
if we use the experimental design shown in
Figure 5.2, we should expect two findings. For
the experimental group, the frequency of drinking
measured in their posttest should be less
than in their pretest. In addition, when the two
posttests are compared, the experimental group
should have less drinking than the control
group.

This design guards against the problem of
history because anything occurring outside the
experiment that might affect the experimen-
tal group should also affect the control group.
There should still be a difference in the two
posttest results. The same comparison guards
against problems of maturation as long as the
subjects have been randomly assigned to the two
groups. Testing and instrumentation should
not be problems because both the experimental
and the control groups are subject to the same
tests and experimenter effects. If the subjects
have been assigned to the two groups randomly,
statistical regression should affect both equally,
even if people with extreme scores on drinking
(or whatever the dependent variable is) are be-
ing studied. Selection bias is ruled out by the
random assignment of subjects.

Experimental mortality can be more compli-
cated to handle because dropout rates may be
different between the experimental and control
groups. The experimental treatment itself may
increase mortality in the group exposed to the
video. As a result, the group of experimental

justice services agency will be evaluating the
program. In establishing eligibility criteria, you
will probably try to select defendants who will
not be arrested again while on bail and defen-
dants who will most likely show up for sched-
uled court appearances. In other words, you
will try to select participants who are least
likely to fail. This common and understandable
caution is sometimes referred to as creaming—
skimming the best risks off the top. Creaming
is a threat to validity because the low-risk per-
sons selected for release, although most likely
to succeed, do not represent the jail population
as a whole.

Experimental Mortality Experimental sub-
jects often drop out of an experiment before
it is completed, and that can affect statistical
comparisons and conclusions. This is termed
experimental mortality, also known as attrition.
In the classical experiment involving an experi-
mental and a control group, each with a pretest
and a posttest, suppose that the heavy drinkers
in the experimental group are so turned off by
the video on the health effects of binge drink-
ing that they tell the experimenter to forget it
and leave. Those subjects who stick around for
the posttest were less heavy drinkers to start
with, and the group results will thus reflect a
substantial “decrease” in alcohol use.
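The arithmetic behind this spurious "decrease" can be sketched in a few lines of Python. The scores, the cutoff of 12 drinks per week, and the assumption that the video has no effect at all are hypothetical values chosen for illustration; they do not come from any study.

```python
# Hypothetical pretest scores: drinks per week for ten experimental subjects.
pretest = [2, 3, 4, 5, 6, 8, 12, 15, 18, 20]

# Assumption: the video has NO effect, so each posttest score equals the pretest score.
posttest = list(pretest)

# Assumption: subjects reporting 12 or more drinks per week leave before the posttest.
HEAVY_DRINKER_CUTOFF = 12
stayers = [i for i, score in enumerate(pretest) if score < HEAVY_DRINKER_CUTOFF]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


pretest_mean_all = mean(pretest)                               # everyone took the pretest
pretest_mean_stayers = mean([pretest[i] for i in stayers])     # pretest, stayers only
posttest_mean_stayers = mean([posttest[i] for i in stayers])   # only stayers took the posttest

# Comparing all pretests with the surviving posttests mixes two different groups.
naive_change = posttest_mean_stayers - pretest_mean_all

# Comparing the same people before and after shows no change at all.
true_change = posttest_mean_stayers - pretest_mean_stayers

print(f"naive change: {naive_change:.2f} drinks per week")
print(f"change among stayers: {true_change:.2f} drinks per week")
```

The naive comparison shows a drop of several drinks per week even though no subject changed behavior; the whole difference comes from who dropped out.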

Mortality may also be a problem in experi-
ments that take place over a long period (people
may move away) or in experiments that require
a substantial commitment of effort or time by
subjects; they may become bored with the study
or simply decide it’s not worth the effort.

Ambiguous Causal Time Order In criminal
justice research, there may be ambiguity about
the time order of the experimental stimulus
and the dependent variable. Whenever this oc-
curs, the research conclusion that the stimulus
caused the dependent variable can be challenged
with the explanation that the “dependent” vari-
able actually caused changes in the stimulus.
Many early studies of the relationship between
different types of punishments and rates of of-

Chapter 5 Experimental and Quasi-Experimental Designs 121

Generalizability and
Threats to Validity
Potential threats to internal validity are only
some of the complications faced by experiment-
ers. They also have the problem of generalizing
from experimental findings to the real world.
Even if the results of an experiment are an
accurate gauge of what happened during
that experiment, do they really tell us anything
about life in the wilds of society? With our ex-
amination of cause and effect in Chapter 3 in
mind, we consider two dimensions of gen-
eralizability: construct validity and external
validity.

Threats to Construct Validity In the lan-
guage of experimentation, construct validity is
the correspondence between the empirical test
of a hypothesis and the underlying causal pro-
cess that the experiment is intended to repre-
sent. Construct validity is thus concerned with
generalizing from our observations in an ex-
periment to causal processes in the real world.
In our hypothetical example, the educational
video is how we operationalize the construct
of understanding the health effects of alcohol
abuse. Our questionnaire represents the depen-
dent construct of actual alcohol use.

Are these reasonable ways to represent the
underlying causal process in which under-
standing the effects of alcohol use causes peo-
ple to reduce excessive or abusive drinking? It’s
a reasonable representation but also one that is
certainly incomplete. People develop an under-
standing of the health effects of alcohol use in
many ways. Watching an educational video is
one way; having personal experience, talking to
friends and parents, taking other courses, and
reading books and articles are others. Our video
may do a good job of representing the health
effects of alcohol use, but it is an incomplete
representation of that construct. Alternatively,
the video may be poorly produced, too tech-
nical, or incomplete. Then the experimental
stimulus may not adequately represent the construct
we are interested in: educating students

subjects that received the posttest will differ
from the group that received the pretest. In our
example of the alcohol video, it would probably
not be possible to handle this problem by ad-
ministering a placebo, for instance. In general,
however, the potential for mortality can be re-
duced by shortening the time between pretest
and posttest, by emphasizing to subjects the
importance of completing the posttest, or per-
haps by offering cash payments for participat-
ing in all phases of the experiment.

The remaining problems of internal invalid-
ity can be avoided through the careful admin-
istration of a controlled experimental design.
We emphasize careful administration. Random
assignment, pretest and posttest measures, and
use of control and experimental groups do not
automatically rule out threats to validity. This
caution is especially true in field studies and
evaluation research, in which subjects partici-
pate in natural settings and uncontrolled varia-
tion in the experimental stimulus may be pres-
ent. Control over experimental conditions is
the hallmark of this approach, but conditions
in field settings are usually more difficult to
control.

For example, Richard Berk and associates
(2003) randomly assigned several thousand in-
mates entering California prisons to an experi-
mental or traditional (control) procedure for
classifying inmate risk. That was a straightforward
intervention that was easily administered
and unlikely to vary much; classification
also took place over a short period of time.
In contrast, Denise Gottfredson and colleagues
(2006) conducted a classical experiment to
assess the effects of drug courts in reducing
recidivism. Drug courts involve a range of interventions
that are more difficult to standardize.
The treatment, drug-court participation,
can take place over a long period of time.
Researchers examined how long individuals
stayed in drug-court treatment, but acknowledged
that the quality of treatment was more
difficult to control and could have varied quite
a lot.


carefully controlled conditions of the experi-
ment might have had something to do with the
video’s effectiveness.

In contrast, criminal justice field experiments
are conducted in more natural settings.
Real probation officers in different local jurisdictions
deliver intensive supervision to real
probationers. This is not to say that external
validity is never a problem in field experiments.
But one of the advantages of field experiments
in criminal justice is that, because they take
place under real-world conditions, results are
more likely to be valid in other real-world settings
as well.

You may have detected a fundamental conflict
between internal and external validity.
Threats to internal validity are reduced by conducting
experiments under carefully controlled
conditions. But such conditions do not reflect
real-world settings, and this restricts our ability
to generalize results. Field experiments generally
have greater external validity, but their internal
validity may suffer because such studies
are more difficult to monitor than those taking
place in more controlled settings. John Eck
describes this trade-off as a diabolical dilemma
(Eck 2002, 104).

Shadish, Cook, and Campbell (2002, 98–101)
offered some useful advice for resolving the
potential for conflict between internal and external
validity. Explanatory studies that test
cause-and-effect theories should place greater
emphasis on internal validity, whereas applied
studies should be more concerned with exter-
nal validity. This is not a hard and fast rule be-
cause internal validity must be established be-
fore external validity becomes an issue. That is,
applied researchers must have confidence in the
internal validity of their cause-and-effect rela-
tionships before they ask whether similar rela-
tionships would be found in other settings.

Threats to Statistical Conclusion Validity
The basic principle of statistical conclusion va-
lidity is simple. Virtually all experimental re-
search in criminal justice is based on samples of

about the health effects of alcohol use. There
may also be problems with our measure of the
dependent variable: questionnaire items on
self-reported alcohol use.

By this time, you should recognize a similar-
ity between construct validity and some of the
measurement issues discussed in Chapter 4.
Almost any empirical example or measure of a
construct is incomplete. Part of construct valid-
ity involves how completely an empirical mea-
sure can represent a construct or how well we
can generalize from a measure to a construct.

A related issue in construct validity is
whether a given level of treatment is sufficient.
Perhaps showing a single video to a group of
subjects would have little effect on alcohol use,
but administering a series of videos over several
weeks would have a greater impact. We could
test this experimentally by having more than
one experimental group and varying the num-
ber of videos seen by different groups.

Threats to External Validity Will an experi-
mental study, conducted with the kind of con-
trol we have emphasized here, produce results
that would also be found in more natural set-
tings? Can an intensive probation program
shown to be successful in Minneapolis achieve
similar results in Miami? External validity rep-
resents a slightly different form of generaliz-
ability, one in which the question is whether
results from experiments in one setting (time
and place) will be obtained in other settings or
whether a treatment found to be effective for
one population will have similar effects on a
different group.

Threats to external validity are greater for ex-
periments conducted under carefully controlled
conditions. If the alcohol education experiment
reveals that drinking decreased among students
in the experimental group, then we can be confident
that viewing the video led to reduced
alcohol use among our experimental subjects.
But will the video have the same effect on high
school students or adults if it is broadcast on
television? We cannot be certain because the


control groups, (2) the number and variation of
experimental stimuli, (3) the number of pretest
and posttest measurements, and (4) the proce-
dures used to select subjects and assign them
to groups. By way of illustrating these building
blocks and the ways they are used to produce
different designs, we adopt the widely used sys-
tem of notation introduced by Campbell and
Stanley (1966). Figure 5.3 presents this nota-
tion and shows how it is used to represent the
classical experiment and examples of variations
on this design.

In Figure 5.3, the letter O represents obser-
vations or measurements, and X represents an
experimental stimulus or treatment. Different
time points are displayed as t, with a subscript
number to represent time order. Thus for the
classical experiment shown in Figure 5.3, O
at t1 is the pretest, O at t3 is the posttest, and
the experimental stimulus, X, is administered
to the experimental group at t2, between the
pretest and posttest. Measures are taken for
the control group at times t1 and t3, but the

subjects that represent a target population.
Larger samples of subjects, up to a point, are
more representative of the target population
than are smaller samples. Statistical conclusion
validity most often becomes an issue when
findings are based on small samples of cases.
Because experiments can be costly and time
consuming, they are frequently conducted with
relatively small numbers of subjects. In such
cases, only large differences between experimen-
tal and control groups on posttest measures
can be detected with any degree of confidence.

In practice, this means that finding cause-and-effect
relationships through experiments
depends on two related factors: (1) the number
of subjects and (2) the magnitude of posttest
differences between the experimental and con-
trol groups. Experiments with large numbers of
cases may be able to reliably detect small differ-
ences, but experiments with smaller numbers
can detect only large differences.
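The link between the number of subjects and the size of difference that can be detected can be illustrated with a standard normal-approximation power calculation. This is a minimal sketch, not a formula taken from this chapter: it assumes two equal-sized groups, a posttest measure with a common standard deviation, a two-sided test at the .05 level, and 80 percent power. The function name and its defaults are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_difference(n_per_group, sd=1.0, alpha=0.05, power=0.80):
    """Smallest difference in posttest means that a two-group comparison
    is likely to detect, using the normal approximation.

    Assumes equal group sizes and a common standard deviation `sd`.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_power = standard_normal.inv_cdf(power)          # value for the desired power
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_group)


# Compare a small experiment with a large one (differences in standard-deviation units).
for n in (20, 100, 500):
    print(n, round(minimum_detectable_difference(n), 2))
```

Under these assumptions, an experiment with 20 subjects per group can reliably detect only a difference of about 0.89 standard deviations, whereas 500 subjects per group brings the detectable difference down to about 0.18.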

Threats to statistical conclusion validity can
be magnified by other difficulties in field experiments.
Unreliable measurement is one such
problem that is often encountered in criminal
justice research. More generally, Weisburd and
associates (1993) concluded, after reviewing a
large number of criminal justice experiments,
that failure to maintain control over experi-
mental conditions reduces statistical conclu-
sion validity even for studies with large num-
bers of subjects.

Variations in the Classical
Experimental Design
The basic experimental design is adapted to meet
different research applications.

We now turn to a more systematic considera-
tion of variations on the classical experiment
that can be produced by manipulating the
building blocks of experiments.

Slightly restating our earlier remarks, four
basic building blocks are present in experimen-
tal designs: (1) the number of experimental and

Classical Experiment
Experimental group          O     X     O
Control group               O           O
                            t1    t2    t3

Posttest Only
Experimental group          X     O
Control group                     O
                            t1    t2

Factorial
Experimental treatment 1    O     X1    O
Experimental treatment 2    O     X2    O
Control                     O           O
                            t1    t2    t3

Time ⎯⎯⎯⎯⎯⎯⎯⎯→

O = observation or measurement
X = experimental stimulus
t = time point

Figure 5.3 Variations in the Experimental Design


gle treatment, and one control group. This de-
sign is useful for comparing the effects of dif-
ferent interventions or different amounts of a
single treatment. In evaluating a probation pro-
gram, we might wish to compare how different
levels of contact between probation officers and
probation clients affect recidivism. In this case,
subjects in one experimental group might receive
weekly contact (X1), the other experimental
group be seen by probation officers twice
each week (X2), and control-group subjects have
normal contact (say, monthly) with probation
officers. Because more contact is more expensive
than less contact, we would be interested in
seeing how much difference in recidivism was
produced by monthly, weekly, and twice-weekly
contacts.

Thus an experimental design may have more
than one group receiving different versions or
levels of experimental treatment. We can also
vary the number of measurements made on
dependent variables. No hard and fast rules ex-
ist for using these building blocks to design a
given experiment. A useful rule of thumb, how-
ever, is to keep the design as simple as possible
to control for potential threats to validity. The
specific design for any particular study depends
on the research purpose, available resources,
and unavoidable constraints in designing and
actually carrying out the experiment.

One very common constraint is how sub-
jects or units of analysis are selected and as-
signed to experimental or control groups. This
building block brings us to the subject of quasi-
experimental designs.

Quasi-Experimental Designs
When randomization is not possible, researchers can
use different types of quasi-experimental designs.

By now, the value of random assignment in con-
trolling threats to validity should be apparent.
However, it is often impossible to randomly
select subjects for experimental and control
groups and satisfy other requirements. Most

experimental stimulus is not administered to
the control group.

Now consider the design labeled “Posttest
Only.” As implied by its name, no pretest mea-
sures are made on either the experimental or the
control group. Thinking for a moment about
the threats to internal validity, we can imagine
situations in which a posttest-only design is ap-
propriate. Testing and retesting might especially
influence subjects’ behavior if measurements
are made by administering a questionnaire,
with subjects’ responses to the posttest poten-
tially affected by their experience in the pretest.
A posttest-only design can reduce the possibility
of testing being a threat to validity by eliminat-
ing the pretest.

Without a pretest, it is obviously not pos-
sible to detect change in measures of the depen-
dent variable, but we can still test the effects of
the experimental stimulus by comparing post-
test measures for the experimental group with
posttest measures for the control group. For ex-
ample, if we are concerned about the possibility
of sensitizing subjects in a study of an alcohol
education video, we might eliminate the pretest
and examine the posttest differences between
the experimental and control groups. Random-
ization is the key to the posttest-only design. If
subjects are randomly assigned to experimental
and control groups, we expect them to be equiv-
alent. Any posttest differences between the two
groups on the dependent variable can then be
attributed to the influence of the video.

In general, posttest-only designs are appro-
priate when researchers suspect that the process
of measurement may bias subjects’ responses
to a questionnaire or other instrument. This
is more likely when only a short time elapses
between pretest and posttest measurements.
The number of observations made on subjects
is a design building block that can be varied as
needed. We emphasize here that random assign-
ment is essential in a posttest-only design.
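Random assignment itself is mechanically simple. The sketch below shows one way to do it, assuming subjects are identified by ID strings and that the two groups should be as equal in size as possible; the function name, the IDs, and the `seed` parameter are hypothetical and are included only so the example can be repeated.

```python
import random


def randomly_assign(subject_ids, seed=None):
    """Shuffle the subjects and split them into experimental and control groups.

    Every subject has the same chance of landing in either group, which is
    what justifies treating the groups as equivalent before the stimulus.
    """
    rng = random.Random(seed)      # a private generator; seed only for repeatability
    shuffled = list(subject_ids)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


subjects = [f"S{number:02d}" for number in range(1, 21)]  # hypothetical subject IDs
experimental, control = randomly_assign(subjects, seed=2022)
print("experimental:", sorted(experimental))
print("control:     ", sorted(control))
```

Because group membership depends only on the shuffle, no characteristic of the subjects can systematically influence which group they join.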

Figure 5.3 also shows a factorial design,
which has two experimental groups that receive
different treatments, or different levels of a sin-


approaches to matching and the creative use of
experimental design building blocks. Examples
include studies of child abuse (Widom 1989a),
obscene phone calls (Clarke 1997a), and video
cameras for crime prevention (Gill and Spriggs
2005). Figure 5.4 shows a diagram of each de-
sign using the X, O, and t notation. The solid
line that separates treatment and comparison
groups in the figure signifies that subjects have
been placed in groups through some nonran-
dom procedure.

often, there may be practical or administrative
obstacles. There may also be legal or ethical rea-
sons randomization cannot be used in criminal
justice experiments.

When randomization is not possible, the
next-best choice is often a quasi-experiment.
The prefix quasi-, meaning “to a certain degree,”
is significant: a quasi-experiment is, to a certain
degree, an experiment. In most cases, quasi-
experiments do not randomly assign subjects
and therefore may suffer from the internal va-
lidity threats that are so well controlled in true
experiments. Without random assignment,
the other building blocks of experimental de-
sign must be used creatively to reduce validity
threats. We group quasi-experimental designs
into two categories: (1) nonequivalent-groups
designs and (2) time-series designs. Each can be
represented with the same O, X, and t notation
used to depict experimental designs.

Nonequivalent-Groups Designs
The name for this family of designs is also
meaningful. The main strength of random as-
signment is that it allows us to assume equiva-
lence in experimental and control groups. When
it is not possible to create groups through ran-
domization, we must use some other procedure,
one that is not random. If we construct groups
through a nonrandom procedure, however, we
cannot assume that the groups are equivalent—
hence the label nonequivalent-groups design.

Whenever experimental and control groups
are not equivalent, we should select subjects
in a way that makes the two groups as compa-
rable as possible. Often the best way to achieve
comparability is through a matching process in
which subjects in the experimental group are
matched with subjects in a comparison group.
The term comparison group is commonly used,
rather than control group, to highlight the non-
equivalence of groups in quasi-experimental de-
signs. A comparison group does, however, serve
the same function as a control group.

Some examples of research that use non-
equivalent-groups designs illustrate various

Widom (1989a)
Treatment group       X     O
----------------------------------
Comparison group            O
                      t1    t2

X = official record of child abuse
O = counts of juvenile or adult arrest

Clarke (1997a)
Treatment group       O     X     O
----------------------------------
Comparison group      O           O
                      t1    t2    t3

X = caller identification and call tracing
O = customer complaints of obscene calls

Gill and Spriggs (2005)
Target area 1         O     X1    O
----------------------------------
Comparison area 1     O           O

Target area 2         O     X2    O
----------------------------------
Comparison area 2     O           O

...

Target area 13        O     X13   O
----------------------------------
Comparison area 13    O           O
                      t1    t2    t3

Xi = CCTV installation in area i
O = police crime data, survey data on fear of crime

Figure 5.4 Quasi-Experimental Design Examples


Deterring Obscene Phone Calls In 1988, the
New Jersey Bell telephone company introduced
caller identification (ID) and instant call tracing
in a small number of telephone exchange
areas. Now ubiquitous in mobile phones, caller
ID was a new technology in 1988. Instant call
tracing allows the recipient of an obscene or
threatening call to automatically initiate a pro-
cedure to trace the source of the call.

Ronald Clarke (1997a) studied the effects
of these new technologies in deterring obscene
phone calls. Clarke expected that obscene calls
would decrease in areas where the new services
were available. To test this, he compared records
of formal customer complaints about annoying
calls in the New Jersey areas that had the new
services to formal complaints in other New Jer-
sey areas where caller ID and call tracing were
not available. One year later, the number of for-
mal complaints had dropped sharply in areas
serviced by the new technology; no decline was
found in other New Jersey Bell areas.

In this study, telephone service areas with
new services were the treatment group, and ar-
eas without the services were the comparison
group. Clarke’s matching criterion was a simple
one: telephone service by New Jersey Bell, as-
suming the volume of obscene phone calls was
relatively constant within a single phone service
area. Of course, matching on telephone ser-
vice area cannot eliminate the possibility that
the volume of obscene phone calls varies from
one part of New Jersey to another, but Clarke’s
choice of a comparison group was straightfor-
ward and certainly more plausible than com-
paring New Jersey to, say, New Mexico.

Clarke’s study is a good example of a natural
field experiment. The experimental stimulus,
caller ID and call tracing, was not specifically
introduced by Clarke, but he was able to obtain
measures for the dependent variable before
and after the experimental stimulus was intro-
duced. This design made it possible for Clarke
to infer with reasonable confidence that caller
ID and call tracing reduced the number of for-
mal complaints about obscene phone calls.

Child Abuse and Later Arrest Cathy Spatz
Widom studied the long-term effects of child
abuse—whether abused children are more likely
to be charged with delinquent or adult criminal
offenses than children who were not abused.
Child abuse was the experimental stimulus,
and the number of subsequent arrests was the
dependent variable.

Of course, it is not possible to assign chil-
dren randomly to groups in which some
are abused and others are not. Widom’s de-
sign called for selecting a sample of children
who, according to court records, had been
abused. She then matched each abused subject
with a comparison subject of the same gender,
race, age, and approximate socioeconomic
status (SES) who had not been abused. The
assumption with these matching criteria was
that age at the time of abuse, gender, race, and
SES differences might confound any observed
relationship between abuse and subsequent
arrests.

You may be wondering how a researcher se-
lects important variables to use in matching ex-
perimental and comparison subjects. We cannot
provide a definitive answer to that question, any
more than we can specify what particular vari-
ables should be used in a given experiment. The
answer ultimately depends on the nature and
purpose of the experiment. As a general rule,
however, the two groups should be comparable
in terms of variables that are likely to be related
to the dependent variable under study. Widom
matched on gender, race, and SES because these
variables are correlated with juvenile and adult
arrest rates. Age at the time of reported abuse
was also an important variable because children
abused at a younger age had a longer “at-risk”
period for delinquent arrests.
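Individual matching of this kind can be sketched as a simple search through a pool of candidate comparison subjects. The example below is an illustration only and is not Widom's actual procedure: the `Subject` fields, the use of birth year as a stand-in for age, the one-year tolerance, and all of the records are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    subject_id: str
    gender: str
    race: str
    birth_year: int  # stand-in for age (assumption)
    ses: str         # coarse socioeconomic band, e.g. "low" or "middle"


def match_individually(treated, pool, max_year_gap=1):
    """Pair each treated subject with one unused comparison subject.

    A candidate qualifies if gender, race, and SES band are identical and
    the birth years differ by no more than `max_year_gap`.
    """
    available = list(pool)
    pairs = {}
    for t in treated:
        for c in available:
            same_traits = (c.gender, c.race, c.ses) == (t.gender, t.race, t.ses)
            if same_traits and abs(c.birth_year - t.birth_year) <= max_year_gap:
                pairs[t.subject_id] = c.subject_id
                available.remove(c)  # each comparison subject is used once
                break
        else:
            pairs[t.subject_id] = None  # no acceptable match in the pool
    return pairs


treated = [
    Subject("A1", "F", "white", 1968, "low"),
    Subject("A2", "M", "black", 1970, "low"),
    Subject("A3", "M", "white", 1969, "middle"),
]
pool = [
    Subject("C1", "M", "black", 1971, "low"),
    Subject("C2", "F", "white", 1968, "low"),
    Subject("C3", "F", "white", 1969, "low"),
]
pairs = match_individually(treated, pool)
print(pairs)
```

Subjects left without a match (here, A3) show the practical cost of matching on several variables at once: the stricter the criteria, the more treated subjects may have to be dropped.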

Widom produced experimental and com-
parison groups matching individual subjects. It
is also possible to construct experimental and
comparison groups through aggregate match-
ing, in which the average characteristics of each
group are comparable. This is illustrated in our
next example.


ing), and because CCTV was carefully tailored
to each site. Instead, the researchers created
two types of comparison areas. First, compari-
son areas “were selected by similarity on socio-
demographic and geographical characteristics
and crime problems.” The second type of comparison
was “buffer zones,” defined as an area
in a one-mile radius from the edge of the target
area where CCTV cameras were installed; buffer
zones were defined only for CCTV areas.

The rationale for comparison areas is clear.
If CCTV is effective in reducing crime, we
should expect declines in target areas, but not
in comparison areas. Alternatively, if post-
treatment measures of crime went down in both
treatment and comparison areas, we might
expect greater declines in the CCTV sites. But
what about buffer areas? After defining buffer
areas, researchers then subdivided them into
concentric rings around a target area, shown as
T in Figure 5.5. The stated purpose was to as-
sess any movement of crime around the target
area. If CCTV was effective in reducing crime,
any reduction should be greatest in the target
area; the size of the reduction should decline
moving outward from the target area.

Short-term results found some reduction
of some types of crime in some CCTV areas. In
other treatment areas, some crimes increased
more than in comparison areas. In particular,
Gill and Spriggs found that public order of-
fenses such as drunkenness tended to increase
more in CCTV target areas. Overall, significant
drops in crime were found in just 2 of 13 target
areas. Fear and related attitudes declined in all
target and comparison areas, but the authors
believed this was largely due to declining crime
in all areas.

This example illustrates why nonequivalent
comparison groups are important. Because
crime declined in most areas and fear declined
in all, a simple comparison of pre- and post-
intervention measures would have been mis-
leading. That strategy would have suggested
that CCTV was responsible for reduced crime
and fear. Only by adding the comparison and

Cameras and Crime Prevention U.S. resi-
dents have probably become accustomed to
seeing closed-circuit television (CCTV) cameras
in stores and at ATMs, but this technology is
less used in public spaces such as streets and
parking lots. With an estimated 4 million cam-
eras deployed, CCTV is widely used as a crime
prevention and surveillance tool in the United
Kingdom (McCahill and Norris 2003). CCTV
enabled the London Metropolitan Police to
quickly identify suspects in the Underground
bombing attacks that took place in 2005. Cameras
are increasingly used to monitor traffic,
and even record license plates of cars running
traffic lights. But does CCTV have any effect in
reducing crime?

Martin Gill and associates (Gill and Spriggs
2005; Gill, Spriggs, Argomaniz, et al. 2005)
conducted an evaluation of 13 CCTV projects
installed in a variety of residential and commer-
cial settings in England. These were a mix of
smaller and large-scale CCTV projects involv-
ing multiple cameras. One area on the outskirts
of London included more than 500 cameras
installed to reduce thefts of and from vehicles
in parking facilities. Five projects in London
and other urban areas placed 10 to 15 cameras
in low-income housing areas, seeking to reduce
burglary and robbery. Researchers examined two
types of dependent variables before and after
cameras were installed: crimes reported to
police and fear of crime. Fear was measured
through surveys of people living in residential
areas, and samples of people on local streets for
commercial areas and parking facilities.

Measuring police data and fear of crime be-
fore and after cameras were installed made it
possible for Gill and associates to satisfy two
criteria for cause—time order and covariation
between the independent variable (CCTV) and
dependent variables. However, they were not
able to randomly assign some areas to receive
the CCTV intervention, while other areas did
not. This was because the intervention was
planned for only a small number of locations
of each type (residential, commercial, park-


Now think of a cohort that is exposed to
some experimental stimulus. The May proba-
tion cohort might be required to complete
100 hours of community service in addition to
meeting other conditions of probation. If we are
interested in whether probationers who receive
community service sentences are charged with
fewer probation violations, we can compare the
performance of the May cohort with that of the
April cohort, or the June cohort, or some other
cohort not sentenced to community service.

Cohorts that do not receive community
service sentences serve as comparison groups.
The groups are not equivalent because they
were not created by random assignment. But if
we assume that a comparison cohort does not
systematically differ from a treatment cohort
on important variables, we can use this design
to determine whether community service sen-
tences reduce probation violations.

That last assumption is very important, but
it may not be viable. Perhaps a criminal court
docket is organized to schedule certain types
of cases at the same time, so a May cohort
would be systematically different from a June
cohort. But if the assumption of comparability
can be met, cohorts may be used to construct
nonequivalent comparison and experimental
groups by taking advantage of the natural flow
of cases through an institutional process.

Time-Series Designs
Time-series designs are common examples
of longitudinal studies in criminal justice re-
search. As the name implies, a time-series de-
sign involves examining a series of observations
on some variable over time. A simple example
is examining trends in arrests for drunk driving
over time to see whether the number of arrests
is increasing, decreasing, or staying constant. A
police executive might be interested in keeping
track of arrests for drunk driving, or for other
offenses, as a way of monitoring the performance
of patrol officers. Or state corrections
officials might want to study trends in prison
admissions as a way of predicting the future
need for correctional facilities.

buffer areas to their research were Gill and
Spriggs able to learn that CCTV was probably
not the cause of declines, since similar patterns
were found in many areas where CCTV systems
were not installed.

Together, these three studies illustrate dif-
ferent approaches to research design when it
is not possible to randomly assign subjects
to treatment and control groups. Lacking
random assignment, researchers must use cre-
ative procedures for selecting subjects, con-
structing treatment and comparison groups,
measuring dependent variables, and exercising
other controls to reduce possible threats to
validity.

Cohort Designs
Chapter 3 mentioned cohort studies as ex-
amples of longitudinal designs. We can also
view cohort studies as a type of nonequivalent-
groups design. Recall from Chapter 3 that a cohort
may be defined as a group of subjects who
enter or leave an institution at the same time.
For example, a class of police officers who graduate
from a training academy at the same time
could be considered a cohort. Or we might view
all persons who were sentenced to probation in
May as a cohort.

Figure 5.5 Buffer Zones in CCTV Quasi-
Experiment
Source: Adapted from Gill and Spriggs (2005, 40).


Chapter 5 Experimental and Quasi-Experimental Designs 129

nize this as an example of history as a validity
threat to the inference that the new checkpoint
program caused a change in auto accidents.
The general decline in pattern 1 may be due
to reduced drunk driving that has nothing to
do with sobriety checkpoints. Pattern 2 illus-
trates what is referred to as seasonality in a time
series—a regular pattern of change over time.
In our example, the data might reflect seasonal
variation in alcohol-related accidents that oc-
curs around holidays or maybe on football
weekends near a college campus.

Patterns 3 and 4 lend more support to the
inference that sobriety checkpoints caused a
decline in alcohol-related accidents, but the two
patterns are different in a subtle way. In pattern 3,
accidents decline more sharply from a general
downward trend immediately after the check-
point program was introduced, whereas pattern
4 displays a sharper decline sometime after the
new program was established. Which pattern
provides stronger support for the inference?

In framing your answer, recall what we have
said about construct validity. Think about the
underlying causal process these two patterns
represent, or consider possible mechanisms
that might be at work. Pattern 3 suggests that
the program was immediately effective and
supports what we might call an incapacitation
mechanism: roadside checkpoints enabled po-
lice to identify and arrest drunk drivers, thereby
getting them off the road and reducing acci-
dents. Pattern 4 suggests a deterrent mecha-
nism: as drivers learned about the checkpoints,
they less often drove after drinking, and acci-
dents eventually declined. Either explanation
is possible given the evidence presented. This
illustrates an important limitation of interrupted
time-series designs: they operationalize complex
causal constructs in simple ways. Our interpretation
depends in large part on how we
understand this causal process.

The classic study by Richard McCleary and as-
sociates (McCleary, Nienstedt, and Erven 1982)
illustrates the need to think carefully about
how well time-series results reflect underlying
causal patterns. McCleary and colleagues

An interrupted time series is a special type
of time-series design that can be used in cause-
and-effect studies. A series of observations is
compared before and after an intervention is
introduced. For example, a researcher might
want to know whether roadside sobriety check-
points cause a decrease in fatal automobile ac-
cidents. Trends in accidents could be compared
before and after the roadside checkpoints are
established.
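A minimal sketch of the before-and-after comparison follows. The weekly counts are hypothetical, and `before_after_change` is an illustrative name. Comparing simple means is the crudest possible reading of an interrupted time series, and it ignores trends and seasonality.

```python
from statistics import mean

def before_after_change(series, intervention_index):
    """Mean of observations after the intervention minus the mean before it."""
    before = series[:intervention_index]
    after = series[intervention_index:]
    if not before or not after:
        raise ValueError("need observations on both sides of the intervention")
    return mean(after) - mean(before)

# Hypothetical weekly accident counts; checkpoints begin with the fifth week.
weekly_accidents = [50, 52, 49, 51, 40, 38, 41, 39]
change = before_after_change(weekly_accidents, intervention_index=4)
print(change)  # negative values indicate fewer accidents afterward
```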

Interrupted time-series designs can be very
useful in criminal justice research, especially in
applied studies. They do have some limitations,
however, just like other ways of structuring re-
search. Shadish, Cook, and Campbell (2002) de-
scribed the strengths and limitations of differ-
ent approaches to time-series designs. We will
introduce these approaches with a hypothetical
example and then describe some specific criminal
justice applications.

Continuing with the example of sobriety
checkpoints, Figure 5.6 presents four possible
patterns of alcohol-related automobile acci-
dents. The vertical line in each pattern shows
the time when the roadside checkpoint pro-
gram is introduced. Which of these patterns
indicates that the new program caused a reduc-
tion in car accidents?

If the time-series results looked like pattern 1
in Figure 5.6, we might think initially that the
checkpoints caused a reduction in alcohol-
related accidents, but there seems to be a gen-
eral downward trend in accidents that contin-
ues after the intervention. It’s safer to conclude
that the decline would have continued even
without the roadside checkpoints.
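One way to check for a preexisting trend is to fit a line to the pre-intervention observations, project it forward, and see whether later observations depart from the projection. The sketch below assumes a linear trend and uses invented data; `fit_line` and `departure_from_trend` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def departure_from_trend(series, intervention_index):
    """Observed minus projected values for the post-intervention weeks."""
    weeks = list(range(1, len(series) + 1))
    slope, intercept = fit_line(weeks[:intervention_index],
                                series[:intervention_index])
    return [y - (intercept + slope * w)
            for w, y in zip(weeks[intervention_index:],
                            series[intervention_index:])]

# Hypothetical data: a steady decline that simply continues...
continuing_decline = [56, 52, 48, 44, 40, 36, 32, 28]
# ...and a series that falls below its earlier trend after week 4.
sharper_decline = [56, 52, 48, 44, 30, 26, 22, 18]

print(departure_from_trend(continuing_decline, 4))  # departures near zero
print(departure_from_trend(sharper_decline, 4))     # departures well below zero
```

Departures near zero suggest the decline would have continued anyway. Consistently negative departures are more in keeping with an intervention effect.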

Pattern 2 shows that an increasing trend in
auto accidents has been reversed after the inter-
vention, but this appears to be due to a regular
pattern in which accidents have been bouncing
up and down. The intervention was introduced
at the peak of an upward trend, and the later
decline may be an artifact of the underlying
pattern rather than of the new program.

Patterns 1 and 2 exhibit some outside trend,
rather than an intervention, that may account
for a pattern observed over time. We may recog-


glaries, after investigating incidents over a pe-
riod of time and making arrests. But it is highly
unlikely that changing investigative procedures
would have an immediate impact. This dis-
crepancy prompted McCleary and associates to
look more closely at the policy change and led
to their conclusion that the apparent decline

reported a sharp decline in burglaries imme-
diately after a special burglary investigation
unit was established in a large city. This finding
was at odds with their understanding of how
police investigations could reasonably be ex-
pected to reduce burglary. A special unit might
eventually be able to reduce the number of bur-

[Figure 5.6, Patterns 1 and 2: plots of fatal accidents (vertical axis, 0 to 60) by week (horizontal axis, 1 to 8), each marking the introduction of sobriety checkpoints]

Figure 5.6 Four Patterns of Change in Fatal Automobile Accidents (Hypothetical Data)


what appeared to be a reduction in burglary.
Instrumentation can be a particular problem
in time-series designs for two reasons. First,
observations are usually made over a relatively
long time period, which increases the likeli-
hood of changes in measurement instruments.

in burglaries was produced by changes in
record-keeping practices. No evidence existed of
any decline in the actual number of burglaries.

This example illustrates our discussion of in-
strumentation earlier in this chapter. Changes
in the way police counted burglaries produced

[Figure 5.6 (continued), Patterns 3 and 4: plots of fatal accidents (vertical axis, 0 to 60) by week (horizontal axis, 1 to 8), each marking the introduction of sobriety checkpoints]


of police problem solving in Chicago. They compared
changes in crime for police beats where
specific problems were identified and addressed
with changes in comparison beats where no
crime-specific interventions were developed.
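The logic of adding a comparison series can be sketched as a difference in before-and-after changes: the change in the treatment series minus the change in the comparison series. The Ohio and Michigan figures below are hypothetical, and the function names are illustrative only.

```python
from statistics import mean

def pre_post_change(series, intervention_index):
    """Mean after the intervention point minus the mean before it."""
    return mean(series[intervention_index:]) - mean(series[:intervention_index])

def difference_in_changes(treatment, comparison, intervention_index):
    """Change in the treatment series net of change in the comparison series."""
    return (pre_post_change(treatment, intervention_index)
            - pre_post_change(comparison, intervention_index))

# Hypothetical weekly counts of alcohol-related accidents.
ohio = [50, 52, 49, 51, 40, 38, 41, 39]      # checkpoints begin in week 5
michigan = [48, 50, 47, 49, 46, 48, 45, 47]  # no checkpoints

net_change = difference_in_changes(ohio, michigan, intervention_index=4)
print(net_change)  # decline in Ohio beyond the decline seen in Michigan
```

Because the two series are not equivalent, a net change of this kind is suggestive rather than conclusive.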

A single-series design may be modified by
introducing and then removing the interven-
tion, as shown in the third part of Figure 5.7.
We might test sobriety checkpoints by setting
them up every weekend for a month and then
not setting them up for the next few months. If
the checkpoints caused a reduction in alcohol-
related accidents, we might expect an increase
after they were removed. Or the effects of week-
end checkpoints might persist even after we re-
moved them.

Because different states or cities sometimes
introduce new drunk-driving programs at dif-
ferent times, we might be able to use what
Shadish, Cook, and Campbell (2002, 192) called
a “time-series design with switching replica-
tions.” The bottom of Figure 5.7 illustrates this
design. For example, assume that Ohio begins
using checkpoints in May 1998 and Michigan

Second, time-series designs often use measures
that are produced by an organization such as a
police department, criminal court, probation
office, or corrections department. There may be
changes or irregularities in the way data are col-
lected by these agencies that are not readily ap-
parent to researchers and that are, in any case,
not subject to their control.

Variations in Time-Series Designs
If we view the basic interrupted time-series de-
sign as an adaptation of basic design building
blocks, we can consider how modifications can
help control for many validity problems. The
simplest time-series design studies one group,
the treatment group, over time. Rather than
making one pretest and one posttest observa-
tion, the interrupted time-series design makes
a longer series of observations before and after
introducing an experimental treatment.

What if we considered the other building
blocks of experimental design? Figure 5.7 pres-
ents the basic design and some variations using
the familiar O, X, and t notation. In the basic de-
sign, shown at the top of Figure 5.7, many pre-
test and posttest observations are made on a sin-
gle group that receives some type of treatment.

We could strengthen this design by adding
a comparison series of observations on a group
that does not receive the treatment. If, for ex-
ample, roadside sobriety checkpoints were in-
troduced all over the state of Ohio but were not
used at all in Michigan, then we could compare
auto accidents in Ohio (the treatment series)
with auto accidents in Michigan (the compari-
son series). If checkpoints caused a reduction
in alcohol-related accidents, we would expect
to see a decline in Ohio following the inter-
vention, but there should be no change or a
lesser decline in Michigan over the same time
period. The second part of Figure 5.7 shows
this design—an interrupted time series with a
nonequivalent comparison group. The two se-
ries are not equivalent because we did not ran-
domly assign drivers to Ohio or Michigan. So
Young Kim and Wesley Skogan (2003) present
a good example of this design in their analysis

Simple Interrupted Time Series
O O O O X O O O O
t1 t2 t3 t4 t5 t6 t7 t8

Interrupted Time Series with
Nonequivalent Comparison Group

O O O O X O O O O

O O O O O O O O
t1 t2 t3 t4 t5 t6 t7 t8

Interrupted Time Series
with Removed Treatment

O O X O O O –X O O O
t1 t2 t3 t4 t5 t6 t7 t8

Interrupted Time Series
with Switching Replications

O O O X O O O O O

O O O O O X O O O
t1 t2 t3 t4 t5 t6 t7 t8

Figure 5.7 Interrupted Time-Series Designs


rectional facilities. Using a variable-oriented
approach, we might visit one or a few facilities
to conduct in-depth interviews with staff, ob-
serve the condition of facilities, and gather in-
formation from institutional records. Here, we
are collecting information on a wide range of
variables from a small number of institutions.

The case-study design is an example of vari-
able-oriented research. Here, the researcher’s at-
tention centers on an in-depth examination of
one or a few cases on many dimensions. Robert
Yin (2003) points out that the terms case and
case study are used broadly. Cases can be individ-
ual people, neighborhoods, correctional facili-
ties, courtrooms, or other aggregations.

Robert Yin cautions that the case study de-
sign is often misunderstood as representing
“qualitative” research or participant observa-
tion study. Instead, Yin advises that the case
study is a design strategy and that the labels
qualitative and quantitative are not useful ways
to distinguish design strategies. Case studies
might appear qualitative because they focus
on one or a small number of units. But many
case studies employ sophisticated statistical
techniques to examine many variables for those
units. An example illustrates how misleading it
can be to associate case studies with qualitative
research.

In what has come to be known as the “Bos-
ton Gun Project,” Anthony Braga and associ-
ates (Braga, Kennedy, Waring, and Piehl 2001)
studied violence by youth gangs in Boston
neighborhoods. Theirs was an applied explana-
tory study. They worked with local offi cials to
better understand gang violence, develop ways
to reduce it, and eventually assess the effects of
their interventions. Neither a classical experi-
ment nor a nonequivalent-groups design was
possible. Researchers sought to understand and
reduce violence by all gangs in the city. Their
research centered on gangs, not individuals,
though some interventions targeted particular
gang members.

Researchers collected a large amount of
information about gangs and gang violence
from several sources. Researchers used network

introduces them in July of the same year. A
switching-replications design could strengthen
our conclusion that checkpoints reduce acci-
dents if we saw that a decline in Ohio began in
June and a similar pattern was found in Michi-
gan beginning in August. The fact that similar
changes occurred in the dependent variable in
different states at different times, correspond-
ing to when the program was introduced, would
add to our confidence in stating that sobriety
checkpoints reduced auto accidents.
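The switching-replications logic can be sketched by asking when each series first drops noticeably below its own baseline, and whether that timing tracks each state's start date. The monthly counts, the two-month baseline, and the threshold of 5 are all assumptions made for illustration.

```python
def first_drop(series, baseline_periods, threshold):
    """Index of the first observation at least `threshold` below the baseline mean."""
    baseline = sum(series[:baseline_periods]) / baseline_periods
    for i, value in enumerate(series):
        if i >= baseline_periods and baseline - value >= threshold:
            return i
    return None  # no drop of that size was observed

months = ["Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct"]
# Hypothetical monthly counts of alcohol-related accidents.
ohio = [50, 51, 49, 40, 39, 41, 40, 38]      # checkpoints begin in May
michigan = [47, 48, 46, 47, 48, 38, 37, 39]  # checkpoints begin in July

ohio_drop = first_drop(ohio, baseline_periods=2, threshold=5)
michigan_drop = first_drop(michigan, baseline_periods=2, threshold=5)
print(months[ohio_drop], months[michigan_drop])
```

In these invented data each decline begins one month after that state's program starts, which is the pattern the design looks for.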

Variable-Oriented Research and
Scientific Realism
Another way to think about a time-series design
is as a study of one or a few cases with many
observations. If we design a time-series study of
roadside checkpoints in Ohio, we will be exam-
ining one case (Ohio) with many observations
of auto accidents. Or a design that compares
Ohio and Michigan will examine many observa-
tions for two cases. Thinking once again about
design building blocks, notice how we have
slightly restated one of those building blocks.
Instead of considering the number of experi-
mental and control groups, our attention centers
on the number of subjects or cases in our study.
In Figure 5.7, the first and third time-series designs
have one case each, while the second and
fourth designs examine two cases each.

Classical experiments and quasi-experiments
with large numbers of subjects are examples of
what Charles Ragin (2000) terms case-oriented
research, in which many cases are examined to
understand a small number of variables. Time-
series designs and case studies are examples of
variable-oriented research, in which a large
number of variables are studied for a small
number of cases or subjects. Suppose we wish
to study inmate-on-inmate assaults in correc-
tional facilities. With a case-oriented approach,
we might send a questionnaire to a sample of
500 correctional facilities, asking facility staff
to provide information about assaults, facil-
ity design, inmate characteristics, and housing
conditions. Here, we are gathering information
on a few variables from a large number of cor-


violence operated as a different mechanism;
the “levers” pulled in Boston did not work else-
where. Braga and associates emphasize that the
problem-solving process is exportable to other
settings but that the interventions used in Bos-
ton are not (2001, 220).

How do case studies address threats to validity?
In the most general sense, case studies attempt
to isolate causal mechanisms from possible
confounding influences by studying very
precisely defined subjects. Donald Campbell
(2003, ix–x) likened this to laboratory experiments
in the natural sciences, in which researchers
try to isolate causal variables from outside
influences. Case-study research takes place in
natural field settings, not in laboratories. But
the logic of trying to isolate causal mechanisms
by focusing on one or a few cases is a direct de-
scendant of the rationale for experimental iso-
lation in laboratories.

Figure 5.8 summarizes advice from Yin
(2003, 33–39) on how to judge the quality of
case-study designs in language that should
now be familiar. Construct validity is estab-
lished through multiple sources of evidence,
the establishment of chains of causation that
connect independent and dependent variables,
and what are termed member checks—asking
key informants to review tentative conclu-
sions about causation. Examples of techniques
for strengthening internal validity are theory-
based pattern matching and time-series analysis.
The first criterion follows Shadish, Cook,
and Campbell, calling on researchers to make
specific theory-based predictions about what
pattern of results will support hypothesized
causal relationships. Alternative explanations,
also termed rival hypotheses, are less persuasive
when specific predictions of results are actually
obtained. For example, Braga and associates
(2001) predicted that gun killings among
male Boston residents under age 25 would de-
cline following implementation of the package
of interventions in the Boston gun strategy. Al-
though other explanations are possible for the
sharp observed declines, the specific focus of

analysis to examine relationships between
gangs in different neighborhoods and conflicts
over turf within neighborhoods. Police records
of homicides, assaults, and shootings were
studied. Based on extensive data on a small
number of gangs, researchers collaborated with
public officials, neighborhood organizations,
and a coalition of religious leaders—the “faith
community.” A variety of interventions were
devised, but most were crafted from a detailed
understanding of the specific nature of gangs
and gang violence as they existed in Boston
neighborhoods. David Kennedy (1998) sum-
marizes these using the label “pulling levers,”
signifying that key gang members were vulner-
able to intensive monitoring via probation or
parole. The package of strategies was markedly
successful: Youth homicides were reduced from
about 35 to 40 each year in the 20 years preced-
ing the program to about 15 per year in the fi rst
5 post-intervention years (Braga 2002, 70).
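As a rough check on the size of that decline, the figures quoted above can be turned into a percentage reduction. The sketch uses only the approximate annual counts given in the text (35 to 40 before, about 15 after); `percent_reduction` is an illustrative helper name.

```python
def percent_reduction(before, after):
    """Percentage decline from `before` to `after`."""
    return 100 * (before - after) / before

low = percent_reduction(35, 15)   # using the lower pre-program figure
high = percent_reduction(40, 15)  # using the higher pre-program figure
print(f"roughly {low:.0f}% to {high:.0f}% fewer youth homicides per year")
```

Because the underlying counts are themselves approximate, the result should be read as a rough range rather than a precise estimate.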

The Boston research is also a good example
of the scientific realist approach of Ray Pawson
and Nick Tilley (1997). Researchers examined
a small number of subjects—gangs and gang
members—in a single city and in the context
of specific neighborhoods where gangs were active.
Extensive data were gathered on the mechanisms
of gang violence. Interventions were tailored
to those mechanisms in their context.

Braga and associates (2001) emphasize that
the success of the Boston efforts was due to the
process by which researchers, public officials, and
community members collaboratively studied
gang violence and then developed appropriate
policy actions based on their analyses. Other ju-
risdictions mistakenly tried to reproduce Bos-
ton’s interventions, with limited or no success,
failing to recognize that the interventions were
developed specifically for Boston. In case-study
language, researchers examined many variables
for one site and based policy decisions on that
analysis. In the words of scientific realism, researchers
studied the gang violence mechanism
in the Boston context. In other contexts
(Baltimore or Minneapolis, for example), gang


are conducted in the manner of experiments,
using design building blocks in different ways.

Experimental and Quasi-
Experimental Designs
Summarized
Understanding the building blocks of research de-
sign and adapting them accordingly works better
than trying to apply the same design to all research
questions.

By now it should be clear that there are no simple
formulas or recipes for designing an experimental
or quasi-experimental study. Researchers
have an almost infinite variety of ways of
varying the number and composition of groups
of subjects, selecting subjects, determining how
many observations to make, and deciding what
types of experimental stimuli to introduce or
study.

Variations on experimental and quasi-
experimental designs are constructed for basic
and applied explanatory studies. As we stated
early in this chapter, experiments are best
suited to topics that involve well-defined concepts
and propositions. Experiments and quasi-
experiments also require that researchers be
able to exercise, or at least approximate, some
degree of control over an experimental stimu-
lus. Finally, these designs depend on the ability
to unambiguously establish the time order of
experimental treatments and observations on
the dependent variable. Often it is not possible
to achieve the necessary degree of control.

In designing research projects, researchers
should be alert to opportunities for using ex-
perimental designs. Researchers should also be
aware of how quasi-experimental designs can
be developed when randomization is not pos-
sible. Experiments and quasi-experiments lend
themselves to a logical rigor that is often much
more difficult to achieve in other modes of observation.
The building blocks of research design
can be used in creative ways to address a
variety of criminal justice research questions.

the researchers’ interventions and the concomi-
tant results undermine the credibility of rival
hypotheses. Having many measures of variables
over time strengthens internal validity if ob-
servations support our predicted expectations
about cause. We saw earlier how nonequivalent
time-series comparisons and switching replications
can enhance findings. This is also consistent
with pattern matching: we make specific
statements about what patterns of results we
expect in our observations over time.

Finally, a single case study is vulnerable to ex-
ternal validity threats because it is rooted in the
context of a specific site. Conducting multiple
case studies in different sites illustrates the principle
of replication. By replicating research findings,
we accumulate evidence. We may also find
that causal relationships are different in different
settings, as did researchers who tried to
transplant specific interventions from the Boston
Gun Project. Although such findings can
undermine the generalizability of causality, they
also help us understand how causal mechanisms
can operate differently in different settings.

Time-series designs and case studies are ex-
amples of variable-oriented research. A case
study with many observations over time can
be an example of a time-series design. Adding
one or more other cases offers opportunities to
create nonequivalent comparisons. Time-series
designs, case studies, and nonequivalent com-
parisons are quasi-experimental designs—they

Case Study Approach

Construct Validity   Multiple sources of evidence
                     Establish chain of causation
                     Member checks
Internal Validity    Pattern-matching
                     Time-series analysis
External Validity    Replicate through multiple case studies

Figure 5.8 Case Studies and Validity
Source: Adapted from Yin (2003, 34).


Careful attention to design issues, and to how
design elements can reduce validity threats, is
essential to the research process.

✪ Main Points
• Experiments are an excellent vehicle for the
controlled testing of causal processes. Experiments
may also be appropriate for evaluation studies.

• The classical experiment tests the effect of an
experimental stimulus on some dependent variable
through the pretesting and posttesting of
experimental and control groups.

• It is less important that a group of experimental
subjects be representative of some larger
population than that experimental and control
groups be similar to each other.

• Randomization is the best way to achieve
comparability in the experimental and control
groups.

• The classical experiment with random assignment
of subjects guards against most of the
threats to internal invalidity.

• Because experiments often take place under
controlled conditions, results may not be
generalizable to real-world constructs. Or findings
from an experiment in one setting may not
apply to other settings.

• The classical experiment may be modified
to suit specific research purposes by changing
the number of experimental and control
groups, the number and types of experimental
stimuli, and the number of pretest or posttest
measurements.

• Quasi-experiments may be conducted when it is
not possible or desirable to use an experimental
design.

• Nonequivalent-groups and time-series designs
are two general types of quasi-experiments.

• Time-series designs and case studies are examples
of variable-oriented research, in which a
large number of variables are examined for one
or a few cases.

• Both experiments and quasi-experiments may
be customized by using design building blocks
to suit particular research purposes.

• Not all research purposes and questions are
amenable to experimental or quasi-experimental
designs because researchers may not be able
to exercise the required degree of control.

✪ Key Terms
case study, p. 133
case-oriented research, p. 133
classical experiment, p. 113
control group, p. 115
dependent variable, p. 114
experimental group, p. 115
generalizability, p. 121
independent variable, p. 114
quasi-experiment, p. 125
randomization, p. 117
variable-oriented research, p. 133

✪ Review Questions and Exercises
1. If you do not remember participating in
D.A.R.E. (Drug Abuse Resistance Education),
you have probably heard or read something
about it. Describe an experimental design to
test the causal hypothesis that D.A.R.E. reduces
drug use. Is your experimental design feasible?
Why or why not?

2. Experiments are often conducted in public
health research where a distinction is made
between an efficacy experiment and an effectiveness
experiment. Efficacy experiments focus
on whether a new health program works under
ideal conditions; effectiveness experiments
test the program under typical conditions that
health professionals encounter in their day-to-
day work. Discuss how efficacy experiments
and effectiveness experiments reflect concerns
about internal validity threats on the one hand
and generalizability on the other.

3. Crime hot spots are areas where crime reports,
calls for police service, or other measures of
crime are especially common. Police in departments
with a good analytic capability routinely
identify hot spots and launch special tactics
to reduce crime in these areas. What kinds of
validity threats should researchers be especially
attentive to in studying the effects of police
interventions on hot spots?

✪ Additional Readings
Campbell, Donald T., and Julian Stanley, Experimental
and Quasi-Experimental Designs for Research
(Chicago: Rand McNally, 1966). This short
book provides an excellent analysis of the logic
and methods of experimentation in social research
and is widely cited as the classic discussion
of validity threats.

Kim, So Young, and Wesley G. Skogan, “Statistical
Analysis of Time Series Data on Problem
Solving,” Community Policing Working Paper
#27 (Center for Policy Research, Northwestern
University, 2003; www.northwestern.edu/ipr/
publications/policing.html; accessed May 21,
2008). Kim and Skogan present a number of
time-series studies to examine the effects of
problem solving by Chicago police. This is a
good example of switching replications time-
series designs by researchers at the university
where Campbell and Cook did their pioneering
work on quasi-experimental designs.

Pawson, Ray, and Nick Tilley, Realistic Evaluation
(Thousand Oaks, CA: Sage, 1997). We mentioned
this book in Chapter 3. Pawson and
Tilley argue that experiments and quasi-experiments
focus too narrowly on threats to internal
validity. Instead, they propose a different view
of causation and different approaches to assessing
cause.

Shadish, William R., Thomas D. Cook, and Donald
T. Campbell, Experimental and Quasi-Experimental
Designs for Generalized Causal Inference (Boston:
Houghton Mifflin, 2002). An update of the
definitive guide to quasi-experimentation, this
book focuses on basic principles of research
design. In addition to numerous pointers on
designing research, the authors stress that designing
out validity threats is much preferred to
trying to control them through later statistical
analysis.

Weisburd, David, Cynthia M. Lum, and Anthony
Petrosino, “Does Research Design Affect Study
Outcomes in Criminal Justice?” The Annals
578(2001): 50–70. The authors make the intriguing
claim that stronger experimental designs
are more likely to find no causal relationships,
whereas quasi-experimental designs more
often find relationships. Read this article carefully
(whether or not you complete the exercise
described above), and decide whether you agree
with the authors’ conclusions.

Yin, Robert K., Case Study Research: Design and Methods,
3rd ed. (Thousand Oaks, CA: Sage, 2003). Many
people incorrectly associate case studies with
qualitative research. Yin describes a variety of
case-study designs as quasi-experiments. In doing
so, he is consistent with how Shadish, Cook,
and Campbell (2002) describe case studies.


Having covered the basics of structuring
research, from general issues to research de-
sign, let’s dive into the various observational
techniques available for criminal justice re-
search.

Chapter 6 examines how social scientists
go about selecting people or things for observation.
Our discussion of sampling addresses
the fundamental scientific issue of
generalizability. As we’ll see, it is possible for
us to select a few people or things for obser-
vation and then apply what we observe to a
much larger group of people or things than
we actually observed. It is possible, for ex-
ample, to ask a thousand people how they
feel about “three strikes and you’re out”
laws and then accurately predict how tens of
millions of people feel about it.
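A small simulation can illustrate why a sample of a thousand can stand in for millions. The sketch assumes a very large population in which a fixed share (set here, arbitrarily, to 60 percent) favors the laws, and it treats each respondent as an independent random draw. The seed and the function name `sample_proportion` are illustrative choices.

```python
import random

def sample_proportion(true_share, sample_size, rng):
    """Share of a simple random sample that holds the opinion.

    Each draw from a very large population is treated as an independent
    event with probability `true_share` (an assumption of this sketch).
    """
    hits = sum(1 for _ in range(sample_size) if rng.random() < true_share)
    return hits / sample_size

rng = random.Random(2022)  # fixed seed so the run is repeatable
true_share = 0.60          # hypothetical population value
estimate = sample_proportion(true_share, sample_size=1000, rng=rng)
print(f"true share: {true_share:.2f}  sample estimate: {estimate:.3f}")
```

Rerunning with different seeds shows estimates clustering closely around the population value, which is the intuition behind the sampling distributions discussed in Chapter 6.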

Chapter 7 describes survey research and
other techniques for collecting data by ask-
ing people questions. We’ll cover different
ways of asking questions and discuss the

various uses of surveys and related tech-
niques in criminal justice research.

Chapter 8, on field research, examines
what is perhaps the most natural form of
data collection: the direct observation of
phenomena in natural settings. As we will
see, observations can be highly structured
and systematic (such as counting pedestrians
who walk by a specified point) or less
structured and more flexible.

Chapter 9 discusses ways to take advan-
tage of some of the data available all around
us. Researchers often examine data collected
by criminal justice and other public agen-
cies. Content analysis is a method of collect-
ing data through carefully specifying and
counting communications such as news
stories, court opinions, or even recorded
visual images. Criminal justice researchers
may also conduct secondary analysis of data
collected by others.

Part Three

Modes of Observation


Chapter 6

Sampling
Sampling makes it possible to select a few hundred or thousand people for study
and discover things that apply to many more people who are not studied.

Introduction
The Logic of Probability Sampling
Conscious and Unconscious Sampling Bias
Representativeness and Probability of Selection
Probability Theory and Sampling Distribution
The Sampling Distribution of 10 Cases
From Sampling Distribution to Parameter Estimate
Estimating Sampling Error
Confidence Levels and Confidence Intervals
Probability Theory and Sampling Distribution Summed Up
Populations and Sampling Frames
Types of Sampling Designs
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Disproportionate Stratified Sampling
Multistage Cluster Sampling
Multistage Cluster Sampling with Stratification
Illustration: Two National Crime Surveys
The National Crime Victimization Survey
The British Crime Survey
Probability Sampling in Review
Nonprobability Sampling
Purposive Sampling
Quota Sampling
Reliance on Available Subjects
Snowball Sampling
Nonprobability Sampling in Review

Introduction
How we collect representative data is fundamental
to criminal justice research.

Much of the value of research depends on how
data are collected. A critical part of criminal jus-
tice research is deciding what will be observed
and what won’t. If you want to study drug us-
ers, for example, which drug users should you
study? This chapter discusses the logic and
fundamental principles of sampling, then de-
scribes different general approaches for select-
ing subjects or other units.

Sampling is the process of selecting obser-
vations. Sampling is ordinarily used to select
observations for one of two related reasons.
First, it is often not possible to collect informa-
tion from all persons or other units we wish
to study. We may wish to know what propor-
tion of all persons arrested in U.S. cities have
recently used drugs, but collecting all that data
would be virtually impossible. Thus, we have to
look at a sample of observations.

The second reason for sampling is that it
is often not necessary to collect data from all
persons or other units. Probability sampling
techniques enable us to make relatively few
observations and then generalize from those
observations to a much wider population. If
we are interested in what proportion of high
school students have used marijuana, collecting
data from a probability sample of a few thou-
sand students will serve just as well as trying
to study every high school student in the
country.

Although probability sampling is central to
criminal justice research, it cannot be used in
many situations of interest. A variety of non-
probability sampling techniques are available
in such cases. Nonprobability sampling has its
own logic and can provide useful samples for
criminal justice inquiry. In this chapter, we examine
both the advantages and the shortcomings
of such methods, and we discuss where
they fit in the larger picture of sampling and
collecting data. Keep in mind one important
goal of all sampling: to reduce, or at least understand,
potential biases that may be at work
in selecting subjects.

The Logic of Probability
Sampling
Probability sampling helps researchers generalize
from observed cases to unobserved ones.

In selecting a group of subjects for study, social
science researchers often use some type of sam-
pling. Sampling in general refers to selecting
part of a population. In selecting samples, we
want to do two related things. First, we select
samples to represent some larger population
of people or other things. If we are interested
in attitudes about a community correctional
facility, we might draw a sample of neighbor-
hood residents, ask them some questions, and
use their responses to represent the attitudes
of all neighborhood residents. Or, in studying
cases in a criminal court, we may not be able
to examine all cases, so we select a sample to
represent that population of all cases processed
through some court.

Second, we may want to generalize from a
sample to an unobserved population the sample
is intended to represent. If we interview a sample
of community residents, we may want to generalize
our findings to all community residents,
those we interviewed and those we did not. We
might similarly expect that our sample of crimi-
nal court cases can be generalized to the popula-
tion of all criminal court cases.

A special type of sampling that enables us
to generalize to a larger population is known as
probability sampling, a method of selection in
which each member of a population has a known
chance or probability of being selected. Know-
ing the probability that any individual member
of a population could be selected makes it pos-
sible for us to make predictions that our sample
accurately represents the larger population.

If all members of a population are identical
in all respects (demographic characteristics,
attitudes, experiences, behaviors, and so on),
there is no need for careful sampling procedures.
Any sample will be sufficient. In this extreme
case of homogeneity, in fact, a single case
will be sufficient as a sample to study characteristics
of the whole population.

In reality, of course, the human beings who
make up any real population are heterogeneous,
varying in many ways. Figure 6.1 offers a simplified
illustration of a heterogeneous population:
the 100 members of this small population
differ by gender and race. We’ll use this hypo-
thetical micropopulation to illustrate various
aspects of sampling.

A sample of individuals from a population,
if it is to provide useful descriptions of the
total population, must contain essentially the
same variations that exist in the population.
This is not as simple as it might seem. Let’s
look at some of the possible biases in selection
or ways researchers might go astray. Then
we will see how probability sampling provides
an efficient method for selecting a sample that
should adequately reflect variations that exist
in the population.

44 white women
44 white men
6 African American women
6 African American men

Figure 6.1 A Population of 100 People

Conscious and Unconscious
Sampling Bias
At first glance, it may seem as if sampling is a
rather straightforward matter. To select a sample
of 100 lawyers, a researcher might simply
go to a courthouse and interview the first 100
lawyers who walk through the door. This kind
of sampling method is often used by untrained
researchers, but it is subject to serious biases. In
connection with sampling, bias simply means
that those selected are not “typical” or “represen-
tative” of the larger populations they have been
chosen from. This kind of bias is virtually inevi-
table when a researcher picks subjects casually.

Figure 6.2 illustrates what can happen when
we simply select people who are convenient for
study. Although women make up only 50 percent
of our micropopulation, those closest to
the researcher (people in the upper right-hand
corner of Figure 6.2) happen to be 70 percent
women. Although the population is 12 percent
African American, none were selected into this
sample of people who happened to be conve-
niently situated near the researcher.

Moving beyond the risks inherent in simply
studying people who are convenient, we need
to consider other potential problems as well. To
begin, our own personal leanings or biases may
affect the sample selected in this manner; hence,
the sample will not truly represent the popula-
tion of lawyers. Suppose a researcher is a little
intimidated by lawyers who look particularly
prosperous, believing that they might ridicule
his research effort. He might consciously or
unconsciously avoid interviewing them. Or he
might believe that the attitudes of “establish-
ment” lawyers are irrelevant to his research pur-
poses and avoid interviewing them.

Even if the researcher seeks to interview a
“balanced” group of lawyers, he won’t know the
exact proportions of different types of lawyers
who make up such a balance and won’t always
be able to identify the different types merely by
watching them walk by.

Figure 6.2 A Sample of Convenience: Easy, but Not Representative

The researcher might make a conscious ef-
fort to interview, say, every 10th lawyer who
enters the courthouse, but he still cannot be
sure of a representative sample because differ-
ent types of lawyers visit the courthouse with
different frequencies, and some never go to the
courthouse at all. Thus, the resulting sample
will overrepresent lawyers who visit the court-
house more often.

Similarly, “call-in polls,” in which radio
stations ask people to call specified telephone
numbers to register their opinions, cannot be
trusted to represent the general population. At
the very least, not everyone in the population is
even aware of the poll. Those who are aware of
it have some things in common simply because
they listen to the same radio station. As mar-
ket researchers understand very well, a classical
music station has a different audience than a
hard rock station. Adding even more bias to the
sample, those who are motivated to take part in
the poll are probably different from others who
are not so motivated.

A similar problem affects polls linked to weblogs
or mass e-mail. Blogs tend to be selective;
people regularly visit blogs that present views on
personal and political issues they endorse (Hewitt
2005). As a result, the population of people
who respond to weblog polls can only represent
the population of people who regularly visit in-
dividual blogs. As a general principle, the more
self-selection is involved, the more bias will be
introduced into the sample.

The possibilities for inadvertent sampling
bias are endless and not always obvious. Fortu-
nately, some techniques can help us avoid bias.

Representativeness and
Probability of Selection
Although the term representativeness has no
precise, scientific meaning, it carries a commonsense
meaning that makes it useful in the
discussion of sampling. As we’ll use the term
here, a sample is representative of the popula-
tion from which it is selected if the aggregate
characteristics of the sample closely approxi-
mate those same aggregate characteristics in
the population. If the population, for example,
contains 50 percent women, a representative
sample will also contain “close to” 50 percent
women. Later in this chapter, we’ll discuss
“how close” in detail. Notice that samples need
not be representative in all respects; representa-
tiveness is limited to those characteristics that
are relevant to the substantive interests of the
study.

A basic principle of probability sampling is
that a sample will be representative of the pop-
ulation from which it is selected if all members
of the population have an equal chance of be-
ing selected in the sample. Samples that have
this quality are often labeled equal probabil-
ity of selection method (EPSEM) samples.
This principle forms the basis of probability
sampling.
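
The equal-chance idea behind EPSEM can be checked with a small simulation. The sketch below is only an illustration: it assumes a population of 100 numbered members (like the micropopulation in Figure 6.1), an arbitrary sample size of 10, and a fixed seed. If selection is truly equal-probability, every member should turn up in about 10 percent of the samples.

```python
import random
from collections import Counter

POPULATION_SIZE = 100   # assumed: members numbered 0..99
SAMPLE_SIZE = 10        # assumed sample size, chosen only for illustration
DRAWS = 20_000          # number of repeated samples in the simulation

rng = random.Random(6)  # fixed seed so the run can be repeated
times_selected = Counter()
for _ in range(DRAWS):
    # sample() selects without replacement; each member is equally likely
    times_selected.update(rng.sample(range(POPULATION_SIZE), SAMPLE_SIZE))

expected = SAMPLE_SIZE / POPULATION_SIZE   # 0.10 for every member
observed = [times_selected[m] / DRAWS for m in range(POPULATION_SIZE)]
print(f"expected inclusion probability: {expected:.2f}")
print(f"observed range: {min(observed):.3f} to {max(observed):.3f}")
```

The observed frequencies will not be exactly equal, which mirrors the point made in the text: equal chances of selection do not guarantee a perfectly representative sample in any single draw.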

Even carefully selected EPSEM samples are
seldom, if ever, perfectly representative of the
populations from which they are drawn. Never-
theless, probability sampling offers two special
advantages. First, probability samples, though
never perfectly representative, are typically more
representative than other types of samples
because they avoid the biases discussed in the
preceding section. In practice, there is a greater
likelihood that a probability sample will be
representative of the population from which
it is drawn than that a nonprobability sample
will be.

Second, and more importantly, probability
sampling permits us to estimate the accuracy or
representativeness of the sample. Conceivably,
a researcher might wholly by chance select a
sample that closely represents the larger popu-
lation. The odds are against doing so, however,
and we cannot estimate the likelihood that a
haphazard sample will achieve representativeness.
The probability sample can provide an
accurate estimate of success or failure, because
probability samples enable us to draw on probability
theory.

Probability Theory and
Sampling Distribution
Probability theory permits inferences about how
sampled data are distributed around the value
found in a larger population.

With a basic understanding of the logic of
probability sampling in hand, we can examine
how probability sampling works in practice.
We will then be able to devise specific sampling
techniques and assess the results of those techniques.
To do so, we first need to understand
four important concepts.

A sample element is that unit about which
information is collected and that provides
the basis of analysis. Typically, in survey re-
search, elements are people or certain types of
people. However, other kinds of units can be
the elements for criminal justice research—
correctional facilities, police beats, or court
cases, for example. Elements and units of analy-
sis are often the same in a given study, although
the former refers to sample selection and the
latter to data analysis.

A population is the theoretically specified
grouping of study elements. Whereas the vague
term delinquents might describe the target for a
study, a more precise description of the population
includes the definition of the element
delinquents (for example, a person charged with
a delinquent offense) and the time referent for
the study (charged with a delinquent offense
in the previous six months). Translating the
abstract adult drug addicts into a workable population
requires specifying the age that defines
adult and the level of drug use that constitutes
an addict. Specifying college student includes a
consideration of full- and part-time students,
degree and nondegree candidates, undergraduate
and graduate students, and so on.

A population parameter is the value for a
given variable in a population. The average income
of all families in a city and the age distribution
of the city’s population are parameters.
An important portion of criminal justice re-
search involves estimating population param-
eters on the basis of sample observations.

The summary description of a given vari-
able in the sample is called a sample statistic.
Sample statistics are used to make estimates
of population parameters. Thus, the average
income computed from a sample and the age
distribution of that sample are statistics, and
those statistics are used to estimate income and
age parameters in a population.

The ultimate purpose of sampling is to se-
lect a set of elements from a population in such
a way that descriptions of those elements (sam-
ple statistics) accurately portray the param-
eters of the total population from which the
elements are selected. Probability sampling
enhances the likelihood of accomplishing this
aim and also provides methods for estimating
the degree of probable success.

The key to this process is random selection.
In random selection, each element has an equal
chance of being selected independent of any
other event in the selection process. Flipping a
coin is the most frequently cited example: the
“selection” of a head or a tail is independent of
previous selections of heads or tails.

There are two reasons for using random se-
lection methods. First, this procedure serves as
a check on conscious or unconscious bias on
the part of the researcher. The researcher who
selects cases on an intuitive basis might choose
cases that will support his or her research ex-
pectations or hypotheses. Random selection
erases this danger. Second, and more impor-
tantly, with random selection we can draw on
probability theory, which allows us to estimate
population parameters and to estimate how ac-
curate our statistics are likely to be.

The Sampling Distribution
of 10 Cases
Suppose there are 10 people in a group, and
each has a certain amount of money in his or her
pocket. To simplify, let’s assume that one person
has no money, another has $1, another has
$2, and so forth up to the person who has $9.
Figure 6.3 illustrates this population of 10
people.

Our task is to determine the average amount
of money one person has: specifically, the mean
number of dollars. If you simply add up the
money shown in Figure 6.3, the total is $45, so
the mean is $4.50 (45 ÷ 10). Our purpose in the
rest of this example is to estimate that mean
without actually observing all 10 individuals.
We’ll do that by selecting random samples from
the population and using the means of those
samples to estimate the mean for the whole
population.

To start, suppose we select—at random—a
sample of only 1 person from the 10. Depend-
ing on which person we select, we will estimate
the group’s mean as anywhere from $0 to $9.
Figure 6.4 shows a display of those 10 possible
samples. The 10 dots shown on the graph rep-
resent the 10 “sample” means we will get as esti-
mates of the population. The range of the dots
on the graph is the sampling distribution,
defined as the range of sample statistics we will
obtain if we select many samples. Figure 6.4
shows how all of our possible samples of 1 are
distributed. Obviously, it is not a good idea to
select a sample of only 1 because we stand a
good chance of missing the true mean of $4.50
by quite a bit.

What if we take samples of 2 each? As you can
see from Figure 6.5, increasing the sample size
improves our estimations. Once again, each dot
represents a possible sample. There are 45 pos-
sible samples of two elements: $0/$1, $0/$2, . . . ,
$7/$8, $8/$9. Moreover, some of these samples
produce the same means. For example, $0/$6,
$1/$5, and $2/$4 all produce means of $3. In
Figure 6.5, the three dots shown above the $3
mean represent those 3 samples.

Notice that the means we get from the 45
samples are not evenly distributed. Rather, they
are somewhat clustered around the true value
of $4.50. Only 2 samples deviate by as much
as $4 from the true value ($0/$1 and $8/$9),
whereas 5 of the samples give the true estimate
of $4.50, and another 8 samples miss the mark
by only $.50 (plus or minus).
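
The counts in this paragraph can be reproduced by listing every possible sample of 2. The following Python sketch is an illustration only; it assumes the population is simply the dollar amounts 0 through 9, and the variable names are invented for the example.

```python
from itertools import combinations
from collections import Counter

population = list(range(10))                    # dollars held by each of the 10 people
true_mean = sum(population) / len(population)   # 4.5

samples = list(combinations(population, 2))     # every possible pair, order ignored
means = [sum(pair) / 2 for pair in samples]
counts = Counter(means)

print("number of samples:", len(samples))
print("samples with mean $3.00:", counts[3.0])
print("samples with mean $4.50:", counts[4.5])
print("samples off by $0.50:", counts[4.0] + counts[5.0])
print("samples off by $4.00:", counts[0.5] + counts[8.5])
```

Run as written, the tallies agree with the text: 45 samples in all, 3 with a mean of $3, 5 that hit $4.50 exactly, 8 that miss by $.50, and 2 that miss by $4.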

Figure 6.3 A Population of 10 People with $0 to $9

Figure 6.4 The Sampling Distribution of Samples of 1
(horizontal axis: estimate of mean, $0 to $9; vertical axis: number of samples, total = 10; true mean = $4.50)

Figure 6.5 The Sampling Distribution of Samples of 2
(horizontal axis: estimate of mean, $0 to $9; vertical axis: number of samples, total = 45; true mean = $4.50)

Figure 6.6 The Sampling Distribution of Samples of 3, 4, 5, and 6
(panels: A. Samples of 3, total = 120; B. Samples of 4, total = 210; C. Samples of 5, total = 252; D. Samples of 6, total = 210; each panel plots number of samples against estimate of mean, $0 to $9, with true mean = $4.50)

Now suppose we select even larger samples.
What will that do to our estimates of the mean?
Figure 6.6 presents the sampling distributions
of samples of 3, 4, 5, and 6. The progression of
the sampling distributions is clear. Every increase
in sample size improves the distribution
of estimates of the mean in two related ways.
First, in the distribution for samples of 5, for
example, no sample means are at the extreme
ends of the distribution. Why? Because it is not
possible to select five elements from our population
and obtain an average of less than $2 or
greater than $7. The second way sampling distributions
improve with larger samples is that
sample means cluster more and more around
the true population mean of $4.50. Figure 6.6
clearly shows this tendency.
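
Both improvements can be seen by enumerating every possible sample at each size. The sketch below is a rough illustration, not a reproduction of Figure 6.6: the helper name `summarize` and the "within $1" cutoff are assumptions chosen for the example.

```python
from itertools import combinations

POPULATION = range(10)     # the ten people holding $0 through $9
TRUE_MEAN = 4.5

def summarize(size):
    """Return (number of samples, lowest mean, highest mean, share within $1)."""
    means = [sum(s) / size for s in combinations(POPULATION, size)]
    near = sum(1 for m in means if abs(m - TRUE_MEAN) <= 1)
    return len(means), min(means), max(means), near / len(means)

for size in range(1, 7):
    count, low, high, share_near = summarize(size)
    print(f"n={size}: {count:3d} samples, means ${low:.2f} to ${high:.2f}, "
          f"{share_near:.0%} within $1 of the true mean")
```

For samples of 5, the listing shows 252 possible samples whose means run only from $2 to $7, which is the narrowing of the extremes described above.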

From Sampling Distribution to
Parameter Estimate
Let’s turn now to a more realistic sampling sit-
uation and see how the notion of sampling dis-
tribution applies. Assume that we wish to study
the population of Placid Coast, California, to
assess the levels of approval or disapproval of
a proposed law to ban possession of handguns
within the city limits.

Our target population is all adult residents.
In order to draw an actual sample, we need some
sort of list of elements in our population; such
a list is called a sampling frame. Assume our
sampling frame is a voter registration list of, say,
20,000 registered voters in Placid Coast. The elements
are the individual registered voters.

The variable under consideration is attitudes
toward the proposed law: approve or disap-
prove. Measured in this way, attitude toward
the law is a binomial variable; it can have only
two values. We’ll select a random sample of, say,
100 persons for the purpose of estimating the
population parameter for approval of the pro-
posed law.

Figure 6.7 presents all the possible values of
this parameter in the population—from 0 per-
cent approval to 100 percent approval. The mid-
point of the line—50 percent—represents half
the voters approving of the handgun ban and
the other half disapproving.

To choose our sample, we assign each
person on the voter registration list a num-
ber and use a computer program to generate
100 random numbers. Then we interview the
100 people whose numbers have been selected
and ask for their attitudes toward the hand-
gun ban: whether they approve or disapprove.
Suppose this operation gives us 48 people who
approve of the law and 52 who disapprove. We
present this statistic by placing a dot at the
point representing 48 percent, as shown in
Figure 6.8.
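
The selection step described here might look like the following in Python. This is a minimal sketch under stated assumptions: voters are numbered 1 through 20,000, the seed is arbitrary, and the 48/52 split is simply the hypothetical interview result from the text, not something the code produces.

```python
import random

FRAME_SIZE = 20_000     # registered voters on the sampling frame
SAMPLE_SIZE = 100

rng = random.Random(48)             # arbitrary seed, for repeatability only
# Voters are numbered 1..20,000; sample() draws without replacement,
# so no voter can be selected twice.
selected_ids = sorted(rng.sample(range(1, FRAME_SIZE + 1), SAMPLE_SIZE))

# Hypothetical tally after interviewing the selected voters.
approve, disapprove = 48, 52
percent_approve = 100 * approve / (approve + disapprove)
print("first few selected voter numbers:", selected_ids[:5])
print(f"sample statistic: {percent_approve:.0f} percent approve")
```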

Figure 6.7 The Range of Possible Sample Study Results
(horizontal axis: percentage of voters approving of the proposed law, 0 to 100)

Figure 6.8 Results Produced by Three Hypothetical Samples
(same axis; Sample 1 at 48%, Sample 2 at 51%, Sample 3 at 52%)

Now suppose we select another sample of
100 people in exactly the same fashion and
measure their approval or disapproval of the
proposed law. Perhaps 51 people in the second
sample approve of the law. We place another
dot in the appropriate place on the line in
Figure 6.8. Repeating this process once more,
we may discover that 52 people in the third
sample approve of the handgun ban; we add a
third dot to Figure 6.8.

Figure 6.8 now presents the three different
sample statistics that represent the percentages
of people in each of the three random samples
who approved of the proposed law. Each of the
random samples, then, gives us an estimate of
the percentage of people in the total popula-
tion of registered voters who approve of the
handgun law. Unfortunately, we now have three
separate estimates.

To rescue ourselves from this dilemma, let’s
draw more and more samples of 100 registered
voters each, question each of the samples con-
cerning their approval or disapproval, and plot
the new sample statistics on our summary
graph. In drawing many such samples, we dis-
cover that some of the new samples provide
duplicate estimates, as in our earlier illustra-
tion with 10 cases. Figure 6.9 shows the sam-
pling distribution of hundreds of samples. This
is often referred to as a normal or bell-shaped
curve.

Notice that by increasing the number of
samples selected and interviewed we have also
increased the range of estimates provided by
the sampling operation. In one sense, we have
increased our dilemma in attempting to find
the parameter in the population. Fortunately,
probability theory provides certain important
rules about the sampling distribution shown in
Figure 6.9.
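
Drawing hundreds of samples is impractical in the field but easy to simulate. The sketch below is illustrative only. It assumes the true approval rate is 50 percent (something a real researcher would not know), treats each simulated respondent as an independent draw rather than sampling from a list of 20,000, and uses an arbitrary seed.

```python
import random
from collections import Counter

TRUE_P = 0.5        # assumed population parameter (unknown in real research)
SAMPLE_SIZE = 100
NUM_SAMPLES = 2_000

rng = random.Random(9)

def sample_percent_approving():
    """Percent approving in one simulated random sample of 100 voters."""
    approvals = sum(rng.random() < TRUE_P for _ in range(SAMPLE_SIZE))
    return 100 * approvals / SAMPLE_SIZE

estimates = [sample_percent_approving() for _ in range(NUM_SAMPLES)]
histogram = Counter(estimates)

# Crude text histogram: one '*' per 10 samples at each estimate.
for pct in sorted(histogram):
    print(f"{pct:5.0f}% {'*' * (histogram[pct] // 10)}")
```

The printed histogram piles up near 50 percent and thins out toward the tails, the same bell shape the text describes.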

Estimating Sampling Error
Probability theory can help resolve our di-
lemma with some basic statistical concepts.
First, if many independent random samples are
selected from a population, then the sample
statistics provided by those samples will be dis-
tributed around the population parameter in a
known way. Thus, although Figure 6.9 shows
a wide range of estimates, more of them are in
the vicinity of 50 percent than elsewhere in the
graph. Probability theory tells us, then, that the
true value is in the vicinity of 50 percent.

Second, probability theory gives us a formula
for estimating how closely the sample statistics
are clustered around the true value:

$$ s = \sqrt{\frac{p \times q}{n}} $$

where s is the standard error (defined as a measure
of sampling error), n is the number of cases
in each sample, and p and q are the population
parameters for the binomial. If 60 percent of registered
voters approve of the ban on handguns
and 40 percent disapprove, then p and q are 60
percent and 40 percent, or .6 and .4, respectively.

Figure 6.9 The Sampling Distribution
(horizontal axis: percentage of voters approving of the proposed law, 0 to 100; vertical axis: number of samples, 0 to 80)

To see how probability theory makes it pos-
sible for us to estimate sampling error, suppose
that in reality 50 percent of the people approve
of the proposed law and 50 percent disapprove.
These are the population parameters we are try-
ing to estimate with our samples. Recall that we
have been selecting samples of 100 cases each.
When these numbers are plugged into the formula,
we get:

$$ s = \sqrt{\frac{.5 \times .5}{100}} = .05 $$

The standard error equals .05, or 5 percent.
In probability theory, the standard error is
a valuable piece of information because it indicates
how closely the sample estimates will
be distributed around the population param-
eter. A larger standard error indicates sample
estimates are widely dispersed, while a smaller
standard error means that estimates are more
clustered around a population parameter.
Probability theory tells us that approximately
34 percent (.3413) of the sample estimates will
fall within one standard error increment above
the population parameter, and another 34 per-
cent will fall within one standard error incre-
ment below the parameter. In our example, the
standard error increment is 5 percent, so we
know that 34 percent of our samples will give
estimates of approval between 50 percent (the
parameter) and 55 percent (one standard error
above); another 34 percent of the samples will
give estimates between 50 and 45 percent (one
standard error below the parameter). Taken to-
gether, then, we know that roughly two-thirds
(68 percent) of the samples will give estimates
between 45 and 55 percent, which is within
5 percent of the parameter.
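
The arithmetic in this passage can be checked in a few lines of Python. The function name `standard_error` is an assumption for the example; `NormalDist` comes from Python's standard `statistics` module and is used here only to confirm the .3413 figure for the normal curve.

```python
from math import sqrt
from statistics import NormalDist

def standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return sqrt(p * q / n)

se = standard_error(0.5, 100)            # .05, or 5 percent
one_side = NormalDist().cdf(1) - 0.5     # area between the mean and one SE above it
print(f"standard error: {se:.2f}")
print(f"share of estimates within one SE on one side: {one_side:.4f}")
print(f"about {2 * one_side:.0%} fall between {0.5 - se:.0%} and {0.5 + se:.0%}")
```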

The standard error is also a function of the
sample size (an inverse function). This means
that as the sample size increases, the standard
error decreases. And as the sample size increases,
the several samples will be clustered nearer to
the true value. Figure 6.6 illustrates this clus-
tering. Another rule of thumb is evident in the
formula for the standard error: Because of the
square root operation, the standard error is re-
duced by half if the sample size is quadrupled.
In our example, samples of 100 produce a stan-
dard error of 5 percent; to reduce the standard
error to 2.5 percent, we would have to increase
the sample size to 400.
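
The quadrupling rule follows directly from the standard error formula. As a short worked step, writing s(n) for the standard error at sample size n (a notation introduced here only for convenience):

```latex
s(n) = \sqrt{\frac{p \times q}{n}}
\qquad\Longrightarrow\qquad
s(4n) = \sqrt{\frac{p \times q}{4n}}
      = \frac{1}{2}\sqrt{\frac{p \times q}{n}}
      = \frac{1}{2}\, s(n)
```

With p = q = .5, this gives s(100) = .05 and s(400) = .025, matching the figures in the example.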

All of this information is provided by estab-
lished probability theory in reference to the se-
lection of large numbers of random samples. If
the population parameter is known and many
random samples are selected, probability theory
allows us to predict how many of the samples
will fall within specified intervals from the
parameter.

Of course, this discussion illustrates only
the logic of probability sampling. It does not de-
scribe the way research is actually conducted.
Usually, we do not know the parameter; we con-
duct a sample survey precisely because we want
to estimate that value. Moreover, we don’t actu-
ally select large numbers of samples; we select
only one sample. What probability theory does
is provide the basis for making inferences about
the typical research situation. Knowing what it
would be like to select thousands of samples
allows us to make assumptions about the one
sample we do select and study.

Confidence Levels and
Confidence Intervals
Probability theory specifies that 68 percent of
that fictitious large number of samples will
produce estimates that fall within one standard
error of the parameter. As researchers, we can
turn the logic around and infer that any single
random sample has a 68 percent chance of falling
within that range. In this regard, we speak
of confidence levels: we are 68 percent confident
that our sample estimate is within one
standard error of the parameter. Or we may
say that we are 95 percent confident that the
sample statistic is within two standard errors of
the parameter, and so forth. Quite reasonably,
our confidence level increases as the margin for
error is extended. We are virtually positive
(99.7 percent) that our statistic is within three
standard errors of the true value.

Although we may be confident (at some
level) of being within a certain range of the pa-
rameter, we seldom know what the parameter is.
To resolve this dilemma, we substitute our sam-
ple estimate for the parameter in the formula;
lacking the true value, we substitute the best
available guess.

The result of these inferences and estima-
tions is that we are able to estimate a popula-
tion parameter and also the expected degree of
error on the basis of one sample drawn from a
population. We begin with this question: what
percentage of the registered voters in Placid
Coast approve of the proposed handgun ban?
We select a random sample of 100 registered
voters and interview them. We might then re-
port that our best estimate is that 50 percent
of registered voters approve of the gun ban and
that we are 95 percent confident that between
40 and 60 percent (plus or minus two standard
errors) approve. The range from 40 to 60 percent
is called the confidence interval. At the
68 percent confidence level, the confidence
interval is 45 to 55 percent.
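
The two intervals in the handgun-ban example can be reproduced with a small sketch. It assumes the standard error formula for a proportion and substitutes the sample estimate for the unknown parameter, as described above. The function name and its arguments are illustrative.

```python
import math

def confidence_interval(p_hat: float, n: int, num_se: float) -> tuple[float, float]:
    """Interval of +/- num_se standard errors around the sample estimate p_hat."""
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return (p_hat - num_se * se, p_hat + num_se * se)

# Sample of 100 voters, 50 percent approve.
low95, high95 = confidence_interval(0.5, 100, 2)  # roughly 95 percent confidence
low68, high68 = confidence_interval(0.5, 100, 1)  # roughly 68 percent confidence

print(f"95%: {low95:.2f} to {high95:.2f}")
print(f"68%: {low68:.2f} to {high68:.2f}")
```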

The logic of confidence levels and confidence
intervals also provides the basis for determining
the appropriate sample size for a study.
Once we decide on the sampling error we can
tolerate, we can calculate the number of cases
needed in our sample.
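
As a rough sketch of that calculation, the standard error formula can be solved for n. The version below assumes a binomial variable, a tolerable error expressed as a proportion, and a confidence level expressed as a number of standard errors. None of these names come from the text, and real studies would also adjust for design and nonresponse.

```python
import math

def required_sample_size(p: float, tolerable_error: float, num_se: float = 2.0) -> int:
    """Smallest n for which num_se standard errors fit within tolerable_error.

    Derived from tolerable_error = num_se * sqrt(p * q / n).
    """
    q = 1.0 - p
    n = p * q * (num_se / tolerable_error) ** 2
    # Round first to guard against floating-point noise before taking the ceiling.
    return math.ceil(round(n, 9))

n_10_points = required_sample_size(0.5, 0.10)  # plus or minus 10 points at ~95 percent
n_5_points = required_sample_size(0.5, 0.05)   # plus or minus 5 points at ~95 percent

print(n_10_points, n_5_points)
```

Using p = 0.5 is the conservative choice, because an even split produces the largest standard error for a given n.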

Probability Theory and Sampling
Distribution Summed Up
This, then, is the basic logic of probability sam-
pling. Random selection permits the researcher
to link findings from a sample to the body of
probability theory so as to estimate the accuracy
of those findings. All statements of accuracy in
sampling must specify both a confidence level
and a confidence interval. The researcher must
report that he or she is x percent confident that
the population parameter is between two specific
values.

In this example, we have demonstrated the
logic of sampling error using a binomial vari-
able—a variable analyzed in percentages. A dif-
ferent statistical procedure would be required
to calculate the standard error for a mean, but
the overall logic is the same.

Notice that nowhere in this discussion did
we consider the size of the population being
studied. This is because the population size
is almost always irrelevant. A sample of 2,000
respondents drawn properly to represent resi-
dents of Vermont will be no more accurate than
a sample of 2,000 drawn properly to represent
residents in the United States, even though
the Vermont sample would be a substantially
larger proportion of that small state’s residents
than would the same number chosen to repre-
sent the nation’s residents. The reason for this
counterintuitive fact is that the equations for
calculating sampling error assume that the
populations being sampled are infinitely large,
so all samples would equal zero percent of the
whole.
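
To see how little population size matters, the sketch below compares the infinite-population standard error with one adjusted by the finite population correction, the square root of (N - n) / (N - 1). The correction is a standard result but is not part of this chapter's discussion, and the two population figures are round numbers assumed purely for illustration.

```python
import math

def se_infinite(p: float, n: int) -> float:
    """Standard error of a proportion, assuming an infinitely large population."""
    return math.sqrt(p * (1.0 - p) / n)

def se_finite(p: float, n: int, population_size: int) -> float:
    """Same standard error multiplied by the finite population correction."""
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return se_infinite(p, n) * fpc

N_VERMONT = 620_000      # assumed round figure, for illustration only
N_US = 300_000_000       # assumed round figure, for illustration only
n = 2000

se_inf = se_infinite(0.5, n)
se_vt = se_finite(0.5, n, N_VERMONT)
se_us = se_finite(0.5, n, N_US)

print(f"infinite: {se_inf:.5f}  Vermont: {se_vt:.5f}  U.S.: {se_us:.5f}")
```

Under these assumptions all three values agree to about four decimal places, so the two samples are, for practical purposes, equally precise.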

Two cautions are in order before we con-
clude this discussion of the basic logic of prob-
ability sampling. First, the survey uses of prob-
ability theory as discussed here are technically
not wholly justified. The theory of sampling
distribution makes assumptions that almost
never apply in survey conditions. The exact proportion
of samples contained within specified
increments of standard errors mathematically
assumes an infinitely large population, an infinite
number of samples, and sampling with
replacement; that is, every sampling unit selected
is “thrown back into the pot” and could
be selected again. Second, our discussion has
greatly oversimplified the inferential jump
from the distribution of several samples to the
probable characteristics of one sample.

We offer these cautions to provide perspective
on the uses of probability theory in
sampling. Researchers in criminal justice and
other social sciences often appear to overesti-
mate the precision of estimates produced by
the use of probability theory. Variations in sam-
pling techniques and nonsampling factors may
further reduce the legitimacy of such estimates.
For example, those selected in a sample who fail
or refuse to participate further detract from the
representativeness of the sample.

Nevertheless, the calculations discussed in
this section can be extremely valuable to you
in understanding and evaluating your data.
Although the calculations do not provide as
precise estimates as some researchers might as-
sume, they can be quite valid for practical pur-
poses. They are unquestionably more valid than
less rigorously derived estimates based on less-
rigorous sampling methods. Most important,
being familiar with the basic logic underlying
the calculations can help you react sensibly
both to your own data and to those reported by
others.

Populations and
Sampling Frames
The correspondence between a target population
and sampling frames affects the generalizability of
samples.

Although as researchers and as consumers of
research we need to understand the theoretical
foundations of sampling, it is no less important
to appreciate the less-than-perfect conditions
that exist in the field. One aspect of field conditions
that requires a compromise in terms of
theoretical conditions and assumptions is the
relationship between populations and sam-
pling frames.

A sampling frame is the list or quasi-list of
elements from which a probability sample is se-
lected. We say quasi-list because, even though an
actual list might not exist, we can draw samples
as if there were a list. Properly drawn samples
provide information appropriate for describing
the population of elements that compose
the sampling frame and nothing more. This point
is important in view of the common tendency
for researchers to select samples from a particular
sampling frame and then make assertions
about a population that is similar, but not
identical, to the study population defined by
the sampling frame.

For example, if we want to study the attitudes
of corrections administrators toward determinate
sentencing policies, we might select a
sample by consulting the membership roster of
the American Correctional Association. In this
case, the membership roster is our sampling
frame, and corrections administrators are the
population we wish to describe. However, un-
less all corrections administrators are members
of the American Correctional Association and
all members are listed in the roster, it would be
incorrect to generalize results to all corrections
administrators.

Studies of organizations are often the sim-
plest from a sampling standpoint because or-
ganizations typically have membership lists. In
such cases, the list of members may be an ac-
ceptable sampling frame. If a random sample is
selected from a membership list, then the data
collected from that sample may be taken as rep-
resentative of all members—if all members are
included in the list. It is, however, imperative
that researchers learn how complete or incom-
plete such lists might be and limit their gener-
alizations to listed sample elements rather than
to an entire population.

Other lists of individuals may be especially
relevant to the research needs of a particu-
lar study. Lists of licensed drivers, automobile
owners, welfare recipients, taxpayers, holders
of weapons permits, and licensed professionals
are just a few examples. Although it may be difficult
to gain access to some of these lists, they
provide excellent sampling frames for special-
ized research purposes.

Telephone directories are frequently used
for “quick and dirty” public opinion polls.
Undeniably, they are easy and inexpensive to
use, and that is no doubt the reason for their

154 Part Three Modes of Observation

popularity. Still, they have several limitations.
A given directory will not include new sub-
scribers or those who have requested unlisted
numbers. Sampling is further complicated by
the inclusion of nonresidential listings in di-
rectories. Moreover, telephone directories are
sometimes taken to be a listing of a city’s popu-
lation, which is simply not the case. Lower-in-
come people are less likely to have telephones,
and higher-income people may have more than
one line. A growing number of households are
served only by wireless phone service and so are
not listed in directories. A recent national study
reported that 7 percent of households had only
wireless phones and 2 percent had no telephone
service (Blumberg, Luke, and Cynamon 2006).
Telephone companies may not publish list-
ings for temporary residents such as students.
And persons who live in institutions or group
quarters (dormitories, nursing homes, rooming
houses, and the like) are not listed in
phone directories.

Street directories and tax maps are often
used for easy samples of households, but they
also may suffer from incompleteness and pos-
sible bias. For example, in strictly zoned urban
regions, illegal housing units are unlikely to appear
on official records. As a result, such units
have no chance for selection and sample findings
will not be representative of those units,
which are often substandard and overcrowded.

In a more general sense, it’s worth viewing
sampling frames as operational definitions of
a study population. Just as operational definitions
of variables describe how abstract concepts
will be measured, sampling frames serve
as a real-world version of an abstract study
population. For example, we may want to study
how criminologists deal with ethical issues in
their research. We don’t know how many criminologists
exist out there, but we can develop
a general idea about the population of criminologists.
We could also operationalize the concept
by using the membership directory for the
American Society of Criminology; that list is
our operational definition of criminologist.

Types of Sampling Designs
Different types of sampling designs can be used alone
or in combination for different research purposes.

The illustrations we have considered so far have
been based on simple random sampling. How-
ever, researchers have a number of options in
choosing their sampling method, each with its
own advantages and disadvantages.

Simple Random Sampling
Simple random sampling forms the basis of
probability theory and the statistical tools we
use to estimate population parameters, standard
error, and confidence intervals. More accurately,
such statistics assume unbiased sampling, and
simple random sampling is the foundation of
unbiased sampling.

Once a sampling frame has been established
in keeping with the guidelines we presented, to
use simple random sampling, the researcher as-
signs a single number to each element in the
list, not skipping any number in the process. A
table of random numbers, or a computer pro-
gram for generating them, is then used to select
elements for the sample.

If the sampling frame is a computerized da-
tabase or some other form of electronic data,
a simple random sample can be selected by
computer. In effect, the computer program
numbers the elements in the sampling frame,
generates its own series of random numbers,
and prints out the list of elements selected.
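
A minimal sketch of that computerized procedure follows. It assumes the sampling frame is already held as a Python list. The frame contents, the function name, and the seed are hypothetical, and `random.sample` draws without replacement.

```python
import random

def simple_random_sample(frame: list, n: int, seed=None) -> list:
    """Select n distinct elements from the frame, each equally likely."""
    rng = random.Random(seed)  # seeding makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical frame of 1,000 numbered elements.
frame = [f"element-{i}" for i in range(1, 1001)]
sample = simple_random_sample(frame, 50, seed=42)

print(sample[:5])
```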

Systematic Sampling
Simple random sampling is seldom used in prac-
tice, primarily because it is not usually the most
efficient method, and it can be tedious if done
manually. It typically requires a list of elements.
And when such a list is available, researchers
usually use systematic sampling rather than
simple random sampling.

In systematic sampling, the researcher
chooses every kth element in the list for inclusion in
the sample. If a list contains 10,000 elements
and we want a sample of 1,000, we select every
10th element for our sample. To ensure against
any possible human bias, we should select the
first element at random. Thus, to systematically
select 1,000 from a list of 10,000 elements, we
begin by selecting a random number between 1
and 10. The element having that number, plus
every 10th element following it, is included in
the sample. This method technically is referred
to as a systematic sample with a random start.
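
The procedure just described can be sketched in a few lines. This version assumes the list length is an exact multiple of the desired sample size, as in the 10,000 and 1,000 example. The function name is illustrative.

```python
import random

def systematic_sample(frame: list, n: int, seed=None) -> list:
    """Systematic sample with a random start.

    Assumes len(frame) is an exact multiple of n.
    """
    interval = len(frame) // n            # sampling interval, here 10
    rng = random.Random(seed)
    start = rng.randint(1, interval)      # random start between 1 and interval, inclusive
    # Take the start element and every interval-th element after it.
    return frame[start - 1::interval][:n]

frame = list(range(1, 10001))             # elements numbered 1 to 10,000
sample = systematic_sample(frame, 1000, seed=1)

print(sample[:5])
```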

In practice, systematic sampling is virtually
identical to simple random sampling. If the list
of elements is indeed randomized before sam-
pling, one might argue that a systematic sample
drawn from that list is, in fact, a simple random
sample.

Systematic sampling has one danger. A pe-
riodic arrangement of elements in the list can
make systematic sampling unwise; this arrange-
ment is usually called periodicity. If the list of
elements is arranged in a cyclical pattern that
coincides with the sampling interval, a biased
sample may be drawn. Suppose we select a sam-
ple of apartments in an apartment building. If
the sample is drawn from a list of apartments
arranged in numerical order (for example, 101,
102, 103, 104, 201, 202, and so on), there is a
danger of the sampling interval coinciding with
the number of apartments on a floor or some
multiple of it. Then the samples might include
only northwest-corner apartments or only
apartments near the elevator. If these types of
apartments have some other particular charac-
teristic in common (for example, higher rent),
the sample will be biased. The same potential
danger would apply in a systematic sample of
houses in a subdivision arranged with the same
number of houses on a block.
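
A small sketch makes the periodicity problem concrete. It assumes a hypothetical ten-story building with four apartments per floor, numbered as in the example above, and a sampling interval of four. The start is fixed here rather than drawn at random so the result is easy to follow.

```python
# Hypothetical building: floors 1-10, four apartments per floor (101-104, 201-204, ...).
apartments = [floor * 100 + unit for floor in range(1, 11) for unit in range(1, 5)]

interval = 4   # coincides with the number of apartments on a floor
start = 2      # fixed for illustration; normally a random number from 1 to interval

sample = apartments[start - 1::interval]

# Position of each sampled apartment on its floor (the last two digits).
positions = {apartment % 100 for apartment in sample}

print(sample)
print(positions)
```

Every sampled apartment occupies the same position on its floor, so any characteristic tied to that position would bias the sample.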

In considering a systematic sample from a
list, then, we need to carefully examine the na-
ture of that list. If the elements are arranged
in any particular order, we have to figure out
whether that order will bias the sample to be
selected and take steps to counteract any pos-
sible bias.

In summary, systematic sampling is usually
superior to simple random sampling, in terms
of convenience if nothing else. Problems in the
ordering of elements in the sampling frame can
usually be remedied quite easily.

Stratified Sampling
We have discussed two methods of selecting
a sample from a list: random and systematic.
Stratification is not an alternative to these
methods, but it represents a possible modification
in their use. Simple random sampling
and systematic sampling both ensure a degree
of representativeness and permit an estimate
of the sampling error present. Stratified sampling
is a method for obtaining a greater degree
of representativeness, decreasing the probable
sampling error. To understand why that is the
case, we must return briefly to the basic theory
of sampling distribution.

Recall that sampling error is reduced by two
factors in the sample design: (1) a large sample
produces a smaller sampling error than a small
sample does, and (2) a homogeneous popula-
tion produces samples with smaller sampling
errors than a heterogeneous population does. If
99 percent of the population agrees with a cer-
tain statement, it is extremely unlikely that any
probability sample will greatly misrepresent the
extent of agreement. If the population is split
50–50 on the statement, then the sampling er-
ror will be much greater.
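
The effect of homogeneity can be illustrated with the standard error formula for a proportion. The sketch assumes a sample of 100 in both cases; the sample size is chosen only for comparison.

```python
import math

def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p * q / n)."""
    return math.sqrt(p * (1.0 - p) / n)

n = 100                                      # assumed sample size for both cases
se_homogeneous = standard_error(0.99, n)     # 99 percent agree
se_split = standard_error(0.50, n)           # population split 50-50

print(f"99-1 split: {se_homogeneous:.4f}  50-50 split: {se_split:.4f}")
```

With the same sample size, the evenly split population yields a standard error roughly five times as large.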

Stratified sampling is based on this second
factor in sampling theory. Rather than selecting
our sample from the total population at
large, we select appropriate numbers of elements
from homogeneous subsets of that population.
To get a stratified sample of university
students, for example, we first organize our
population by college class and then draw appropriate
numbers of freshmen, sophomores,
juniors, and seniors. In a nonstratified sample,
representation by class is subject to the same
sampling error as other variables. In a sample
stratified by college class, the sampling error on
that variable is reduced to zero.
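
One way to sketch proportionate stratified sampling is to group the frame by stratum and draw from each group in proportion to its size. The enrollment figures, names, and seed below are hypothetical. With other figures, rounding could make the total differ slightly from the requested sample size.

```python
import random
from collections import Counter, defaultdict

def proportionate_stratified_sample(frame, stratum_of, n, seed=None):
    """Draw from each stratum in proportion to its share of the frame."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(frame))  # this stratum's share of n
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical enrollment by college class.
enrollment = [("freshman", 400), ("sophomore", 300), ("junior", 200), ("senior", 100)]
students = [(i, cls) for cls, count in enrollment for i in range(count)]

sample = proportionate_stratified_sample(students, lambda s: s[1], 100, seed=3)
class_counts = Counter(cls for _, cls in sample)

print(class_counts)
```

Whatever the seed, the class counts are fixed by design, which is the sense in which sampling error on the stratification variable is zero.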

Even more complex stratification methods
are possible. In addition to stratifying by class,
we might also stratify by gender, grade point
average, and so forth. In this fashion, we could
ensure that our sample contains the proper
numbers of freshman men with a 4.0 average,
freshman women with a 4.0 average, and so
forth.

The ultimate function of stratification, then,
is to organize the population into homogeneous
subsets (with heterogeneity between
subsets) and to select the appropriate number
of elements from each. To the extent that the
subsets are homogeneous on the stratification
variables, they may also be homogeneous on
other variables. Because age is usually related to
college class, a sample stratified by class will be
more representative in terms of age as well.

The choice of stratification variables typically
depends on what variables are available
and what variables might help reduce sampling
error for a particular study. Gender can often
be determined in a list of names. Many local
government sources of information on housing
units are arranged geographically. Age, race,
education, occupation, and other variables are
often included on lists of persons who have had
contact with criminal justice officials.

In selecting stratification variables, however,
we should be concerned primarily with those
that are presumably related to the variables we
want to represent accurately. Because gender is
related to many variables and is often available
for stratification, it is frequently used. Age and
race are related to many variables of interest in
criminal justice research. Income is also related
to many variables, but it is often not available
for stratification. Geographic location within a
city, state, or nation is related to many things.
Within a city, stratification by geographic location
usually increases representativeness in social
class and ethnicity.

Stratified sampling ensures the proper representation
of the stratification variables to
enhance representation of other variables related
to them. Taken as a whole, then, a stratified
sample is likely to be more representative
on a number of variables than a simple random
sample is.

Disproportionate
Stratified Sampling
Another use of stratification is to purposively
produce samples that are not representative
of a population on some variable, referred to
as disproportionate stratified sampling. Because
the purpose of sampling, as we have been
discussing, is to represent a larger population,
you may wonder why anyone would want to
intentionally produce a sample that was not
representative.

To understand the logic of disproportionate
stratification, consider again the role of population
homogeneity in determining sample size. If
members of a population vary widely on some
variable of interest, then larger samples must
be drawn to offset the larger
sampling error in that population. Similarly,
if only a small number of people in a population
exhibit some attribute or characteristic of
interest, then a large sample must be drawn to
produce adequate numbers of elements that
exhibit the uncommon condition. Disproportionate
stratification is a way of obtaining sufficient
numbers of these rare cases by selecting
a number disproportionate to their representation
in the population.

The best example of disproportionate sam-
pling in criminal justice is a national crime
survey in which one goal is to obtain some min-
imum number of crime victims in a sample. Be-
cause crime victimization for certain offenses—
such as robbery or aggravated assault—is rela-
tively rare on a national scale, persons who live
in large urban areas, where serious crime is more
common, are disproportionately sampled.

The British Crime Survey (BCS) is a nationwide
survey of people aged 16 and over in
England and Wales. Over its first 20 years (since
1982) the BCS selectively oversampled people
or areas to yield larger numbers of designated
subjects than would result from proportionate
random samples of the population of England
and Wales. The BCS conducted in 2000 included
special questions to better understand
contacts between ethnic minorities and police,
and ethnic minorities were disproportionately
oversampled to produce a large enough number
of ethnic minority respondents for later
analysis (Kershaw, Budd, Kinshott, et al. 2000).

Multistage Cluster Sampling
The preceding sections have described reason-
ably simple procedures for sampling from lists
of elements. Unfortunately, however, many in-
teresting research problems require the selec-
tion of samples from populations that cannot
easily be listed for sampling purposes—that is,
sampling frames are not readily available. Examples
are the population of a city, state, or nation
and all police officers in the United States.
In such cases, the sample design must be much
more complex. Such a design typically involves
the initial sampling of groups of elements—
clusters—followed by the selection of elements
within each of the selected clusters. This proce-
dure yields multistage cluster samples.

Cluster sampling may be used when it is ei-
ther impossible or impractical to compile an ex-
haustive list of the elements that compose the
target population, such as all law enforcement
officers in the United States. Often, however,
population elements are already grouped into
subpopulations, and a list of those subpopula-
tions either exists or can be created.

Population elements, or aggregations of
those elements, are referred to as sampling
units. In the simplest forms of sampling, ele-
ments and units are the same thing—usually
people. But in cases in which a listing of ele-
ments is not available, we can often use some
other unit that includes a grouping of elements.

Because U.S. law enforcement officers are
employed by individual cities, counties, or
states, it is possible to create lists of those political
units. For cluster sampling, then, we
could sample the list of cities, counties, and
states in some manner as discussed previously
(for example, a systematic sample stratified by
population). Next, we could obtain lists of law
enforcement officers from agencies in each of
the selected jurisdictions. We could then sample
each of the lists to provide samples of police
officers for study.

Another typical situation concerns sam-
pling among population areas such as a city.
Although there is no single list of a city’s popu-
lation, citizens reside on discrete city blocks or
census blocks. It is possible, therefore, to select
a sample of blocks initially, create a list of per-
sons who live on each of the selected blocks, and
then sample persons from that list. In this case,
blocks are treated as the primary sampling unit.

In a more complex design, we might sample
blocks, list the households on each selected
block, sample the households, list the persons
who reside in each household, and, finally, sample
persons within each selected household.
This multistage sample design will lead to the
ultimate selection of a sample of individuals
without requiring the initial listing of all indi-
viduals in the city’s population.

Multistage cluster sampling, then, involves
the repetition of two basic steps: listing and
sampling. The list of primary sampling units
(city blocks) is compiled and perhaps stratified
for sampling. Next, a sample of those units is
selected. The list of secondary sampling units is
then sampled, and the process continues.

Cluster sampling is highly recommended for
its efficiency, but the price of that efficiency is a
less accurate sample. A simple random sample
drawn from a population list is subject to a sin-
gle sampling error, but a two-stage cluster sam-
ple is subject to two sampling errors. First, the
initial sample of clusters represents the popula-
tion of clusters only within a range of sampling
error. Second, the sample of elements selected
within a given cluster represents all the elements
in that cluster only within a range of sampling
error. Thus, for example, we run a certain risk of
selecting a sample of disproportionately wealthy
city blocks, plus a sample of disproportionately
wealthy households within those blocks. The
best solution to this problem involves the num-
ber of clusters selected initially and the number
of elements selected within each cluster.

Recall that sampling error is reduced by two
factors: (1) an increase in the sample size and
(2) increased homogeneity of the elements be-
ing sampled. These factors operate at each level
of a multistage sample design. A sample of
clusters will best represent all clusters if a large
number are selected and if all clusters are very
much alike. A sample of elements will best rep-
resent all elements in a given cluster if a large
number are selected from the cluster and if all
the elements in the cluster are very much alike.

A good general guideline for cluster design
is to maximize the number of clusters selected
while decreasing the number of elements
within each cluster. But this scientific guideline
must be balanced against an administrative
constraint. The efficiency of cluster sampling is
based on the ability to minimize the list of population
elements. By initially selecting clusters,
we need only list the elements that make up the
selected clusters, not all elements in the entire
population. Increasing the number of clusters,
however, reduces this efficiency in cluster sampling.
A small number of clusters may be listed
more quickly and more cheaply than a large
number. Remember that all the elements in a
selected cluster must be listed, even if only a few
are to be chosen in the sample.

The final sample design will reflect these
two constraints. In effect, we will probably se-
lect as many clusters as we can afford. So as not
to leave this issue too open-ended, here is a rule
of thumb: population researchers convention-
ally aim for the selection of 5 households per
census block. If a total of 2,000 households are
to be interviewed, researchers select 400 blocks
and interview 5 households on each. Figure 6.10
presents a graphic overview of this process.
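
The rule of thumb can be sketched as a two-stage selection: sample blocks first, then list and sample households only within the selected blocks. The city below, with 1,000 blocks of 30 households each, is entirely hypothetical, as are the function name and the seed.

```python
import random

def two_stage_cluster_sample(blocks: dict, n_blocks: int, per_block: int, seed=None) -> list:
    """Stage one: sample blocks. Stage two: sample households within each selected block."""
    rng = random.Random(seed)
    selected_blocks = rng.sample(sorted(blocks), n_blocks)
    sample = []
    for block in selected_blocks:
        households = blocks[block]  # listing is needed only for selected blocks
        take = min(per_block, len(households))
        sample.extend(rng.sample(households, take))
    return sample

# Hypothetical city: 1,000 census blocks with 30 households each.
blocks = {b: [f"block{b}-household{h}" for h in range(1, 31)] for b in range(1, 1001)}

sample = two_stage_cluster_sample(blocks, n_blocks=400, per_block=5, seed=7)
blocks_represented = {household.split("-")[0] for household in sample}

print(len(sample), len(blocks_represented))
```

Only 400 of the 1,000 blocks ever need a household listing, which is where the efficiency of the design comes from.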

As we turn to more detailed procedures in
cluster sampling, keep in mind that this method
almost inevitably involves a loss of accuracy.
First, as noted earlier, a multistage sample design
is subject to a sampling error at each stage.
Because the sample size is necessarily smaller at
each stage than the total sample size, the sam-
pling error at each stage will be greater than
would be the case for a single-stage random
sample of elements. Second, sampling error
is estimated on the basis of observed variance
among the sample elements. When those ele-
ments are drawn from relatively homogeneous
clusters, the estimated sampling error will be
too optimistic and so must be corrected in light
of the cluster sample design.

Multistage Cluster Sampling with
Stratification
Thus far we have looked at cluster sampling as
though a simple random sample were selected at
each stage of the design. In fact, we can use stratification
techniques to refine and improve the
sample being selected. The basic options available
are essentially the same as those possible
in single-stage sampling from a list. In selecting
a national sample of law enforcement officers,
we might initially stratify our list of agencies by
type (state, county, municipal), geographic region,
size, and rural or urban location.

Once the primary sampling units (law enforcement
agencies) have been grouped according
to the relevant, available stratification
variables, either simple random or systematic
sampling techniques can be used to select the
sample. We might select a specified number of
units from each group or stratum, or we might
arrange the stratified clusters in a continuous
list and systematically sample that list.

To the extent that clusters are combined
into homogeneous strata, the sampling error at
this stage will be reduced. The primary goal of
stratification, as before, is homogeneity.

In principle, stratification can take place
at each level of sampling. The elements listed
within a selected cluster might be stratified
before the next stage of sampling. Typically,
however, that is not done because we strive for
relative homogeneity within clusters. If clusters
are sufficiently similar, it is not necessary to
stratify again.

Figure 6.10 Multistage Cluster Sampling

Stage One: Identify blocks and select a sample. (Selected blocks are shaded.)
[Street map: 1st St. through 5th St., crossed by Parsley Ave., Sage Ave.,
Rosemary Ave., Thyme Ave., Robinson Ave., Boxer Ave., and Bridge Ave.]

Stage Two: Go to each selected block and list all households in order.
(Example of one listed block.)

Stage Three: For each list, select a sample of households. (In this example,
every sixth household has been selected starting with #5, which was selected
at random.) Selected households are marked with an asterisk.

 1. 491 Rosemary Ave.
 2. 487 Rosemary Ave.
 3. 473 Rosemary Ave.
 4. 455 Rosemary Ave.
 5. 437 Rosemary Ave. *
 6. 423 Rosemary Ave.
 7. 411 Rosemary Ave.
 8. 403 Rosemary Ave.
 9. 1101 4th St.
10. 1123 4th St.
11. 1137 4th St. *
12. 1157 4th St.
13. 1169 4th St.
14. 1187 4th St.
15. 402 Thyme Ave.
16. 408 Thyme Ave.
17. 424 Thyme Ave. *
18. 446 Thyme Ave.
19. 458 Thyme Ave.
20. 480 Thyme Ave.
21. 498 Thyme Ave.
22. 1186 5th St.
23. 1174 5th St. *
24. 1160 5th St.
25. 1140 5th St.
26. 1122 5th St.
27. 1118 5th St.
28. 1116 5th St.
29. 1104 5th St. *
30. 1102 5th St.

Illustration: Two National
Crime Surveys
Two national crime surveys show different ways of
designing samples to achieve desired results.

Our discussion of sampling designs suggests
that researchers can combine many different
techniques of sampling and their various com-
ponents in different ways to suit various needs.
In fact, the different components of sampling
can be tailored to specific purposes in much the
same way that research design principles can be
modified to suit various needs. Because sample
frames suitable for simple random sampling
are often unavailable, researchers use multistage
cluster sampling to move from aggregate
sample units to actual sample elements. We can
add stratification to ensure that samples are
representative of important variables. And we
can design samples to produce elements that
are proportionate or disproportionate to the
population.

Two national crime surveys illustrate how
these various building blocks may be com-
bined in complex ways: (1) the National Crime
Victimization Survey (NCVS), conducted by
the Census Bureau, and (2) the British Crime
Survey (BCS). Each is a multistage cluster sam-
ple, but the two surveys use different strategies
for sampling to produce sufficient numbers of
respondents in different categories. Our sum-
mary description is adapted from BJS (2006)
for the NCVS and from Sian Nicholas and asso-
ciates (2007) for the BCS. Essays in the volume
edited by Hough and Maxfield (2007) trace the
history and development of the BCS.

The National Crime
Victimization Survey
Although various parts of the NCVS have been
modified since the surveys were begun in 1972,
the basic sampling strategies have remained relatively
constant. The most significant changes
have been fluctuations in sample size and a
shift to telephone interviewing, with samples of
telephone number listings eventually leading to
households. In 2006, the NCVS changed interviewing
procedures and revised samples somewhat
to account for population movement
from central cities (Rand and Catalano 2007).

The NCVS seeks to represent the nationwide
population of persons aged 12 and over who
are living in households. We noted in Chapter
6 that the phrase “living in households” is significant;
this is especially true in our current
discussion of sampling. NCVS procedures are
not designed to sample homeless persons or
people who live in institutional settings such
as military group housing, temporary hous-
ing, or correctional facilities. Also, because the
sample targets persons who live in households,
it cannot provide estimates of crimes in which
a commercial establishment or business is the
victim.

Because there is no national list of house-
holds in the United States, multistage cluster
sampling must be used to proceed from larger
units to households and their residents. The
national sampling frame used in the first stage
defines primary sampling units (PSUs) as large
metropolitan areas, nonmetropolitan counties,
or groups of contiguous counties (to represent
rural areas).

The largest 93 PSUs are specified as
self-representing and are automatically included
in the first stage of sampling. The remaining
PSUs are stratified by size, population density,
reported crimes, and other variables. An addi-
tional 110 non-self-representing PSUs are then
selected with a probability proportionate to the
population of the PSU. Thus, if one stratum
includes Bugtussle, Texas (population 7,000),
Punkinseed, Indiana (5,000), and Rancid, Mis-
souri (3,000), the probability that each PSU will
be selected is 7 in 15 for Bugtussle, 5 in 15 for
Punkinseed, and 3 in 15 for Rancid.
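
The arithmetic behind these odds is simply each PSU's share of the stratum population. The short Python sketch below reproduces it for a single draw from the stratum; the function name `pps_probabilities` is our own illustrative label and is not part of any NCVS software.

```python
def pps_probabilities(populations):
    """Return each unit's chance of selection, proportionate to its size.

    Assumes a single draw from the stratum, as in the example in the text.
    """
    total = sum(populations.values())
    return {name: size / total for name, size in populations.items()}

# The three hypothetical PSUs from the text, with their populations.
stratum = {"Bugtussle": 7000, "Punkinseed": 5000, "Rancid": 3000}
probabilities = pps_probabilities(stratum)

total_thousands = sum(stratum.values()) // 1000
for name, size in stratum.items():
    # Express each probability as "k in 15", matching the wording in the text.
    print(f"{name}: {size // 1000} in {total_thousands} "
          f"(probability {probabilities[name]:.3f})")
```

Because the probabilities are shares of one total, they always sum to 1 within a stratum.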

The second stage of sampling involves
designating four different sampling frames within
each PSU. Each of these frames is used to
select different types of subsequent units. First,
the housing unit frame lists addresses of hous-
ing units from census records. Second, a group
quarters frame lists group quarters such as dor-
mitories and rooming houses from census re-
cords. Third, a building permit frame lists newly
constructed housing units from local govern-
ment sources. Finally, an area frame lists census
blocks (physical geographic units), from which
independent address lists are generated and
sampled. Notice that these four frames are nec-
essary because comprehensive, up-to-date lists
of residential addresses are not available in this
country.

For the 2005 NCVS, these procedures yielded
a sample of approximately 39,000 housing units.
Completed interviews were obtained from about
67,000 individuals living in households. The
sample design for the NCVS is an excellent illus-
tration of the relationship between sample size
and variation in the target population. Because
serious crime is a relatively rare event when aver-
aged across the entire U.S. population, very large
samples must be drawn. And because no single
list of the target population exists, samples are
drawn in several stages.

For further information, consult the NCVS
documentation maintained by the Bureau
of Justice Statistics (www.ojp.usdoj.gov/bjs/cvictgen.htm;
accessed May 8, 2008). Also see the “National Crime
Victimization Survey Resource Guide,” maintained at the
National Archive of Criminal Justice Data
(www.icpsr.umich.edu/NACJD/NCVS; accessed May 8, 2008).

The British Crime Survey
We have seen that NCVS sampling procedures
begin with demographic units and work down
to selection of housing units. BCS sampling
is simplified by the existence of a national list
of something close to addresses. The Postcode
Address File (PAF) lists postal delivery points
nationwide and is further subdivided to distin-
guish “small users,” those addresses receiving
fewer than 50 items per day. Even though 50 pieces
of mail might still seem like quite a bit, this
classification makes it easier to distinguish
household addresses from commercial ones.

Postcode sectors, roughly corresponding to
U.S. five-digit zip codes, are easily defined
clusters of addresses from the PAF. Samples of
addresses are then selected from within these
sectors. In most cases, 32 addresses are selected
from within the postcode.

In addition, BCS researchers devised
“booster samples” to increase the number of
respondents who were ethnic minorities or
aged 16 to 24. Victimization experiences of eth-
nic minorities were of special interest to police
and other public officials. Young people were
oversampled to complete a special question-
naire of self-report behavior items.

The ethnic minority booster was
accomplished by first selecting respondents using
formal sampling procedures. Interviewers then
sought information about four housing units
adjacent to the selected unit in an effort to de-
termine if any residents were nonwhite. If ad-
jacent units housed minority families, one was
selected to be interviewed for the ethnic minor-
ity booster sample. This is an example of what
Steven Thompson (1997) calls “adaptive sam-
pling.” Probability samples are selected, and
then those respondents are used to identify
other individuals who meet some criterion. In-
creasing the number of respondents aged 16 to
24 was simpler—interviewers sought additional
respondents in that age group within sampled
households.

One final sampling dimension reflects the
regional organization of police in England and
Wales into 43 police areas. The BCS was
further stratified to produce 600 to 700 interviews
in each police area to support analysis within
those areas.

Apart from the young-person booster, once
individual households were selected one person
age 16 or over was randomly chosen to pro-
vide information for all household members.
Sampling procedures initially produced about
54,700 addresses for the year 2004 BCS. About
8 percent of these were eliminated because
they were vacant, had been demolished, or
contained a business, not a private household.
Of the remaining 50,000 addresses, interviews
were completed with 37,213 individuals for a
response rate of about 74 percent.
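
The response rate can be checked from the figures just given. The following Python sketch is only a back-of-the-envelope calculation; the variable names are ours, and the 8 percent figure is the approximate share reported above.

```python
addresses_issued = 54_700       # addresses initially sampled (from the text)
ineligible_share = 0.08         # about 8 percent vacant, demolished, or business
completed_interviews = 37_213   # completed interviews (from the text)

# Remove ineligible addresses before computing the response rate.
eligible_addresses = addresses_issued * (1 - ineligible_share)
response_rate = completed_interviews / eligible_addresses

print(f"Eligible addresses: about {eligible_addresses:,.0f}")
print(f"Response rate: about {response_rate:.0%}")
```

The eligible count works out to a little over 50,000, and dividing completed interviews by that base gives a rate of roughly 74 percent, consistent with the figure reported.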

Although sampling designs for both the
BCS and the NCVS are more complex than we
have represented in this discussion, the impor-
tant point is how multistage cluster sampling
is used in each. Notice two principal differences
between the samples. First, the NCVS uses
proportionate sampling to select a large number
of respondents who may then represent the rel-
atively rare attribute of victimization. The BCS
samples a disproportionate number of minority
and young residents, who are more likely to be
victims of crime. Second, sampling procedures
for the BCS are somewhat simpler than those
for the NCVS, largely due to the existence of a
suitable sampling frame at the national level.
Stratification and later-stage sampling are
conducted to more efficiently represent each police
area and to oversample minority respondents.

Probability Sampling in Review
Depending on the field situation, probability
sampling can be very simple or extremely com-
plex, time consuming, and expensive. Whatever
the situation, however, it is usually the pre-
ferred method for selecting study elements. It’s
worth restating the two main reasons for this.

First, probability sampling avoids conscious
or unconscious biases in element selection on
the part of the researcher. If all elements in the
population have an equal (or unequal and sub-
sequently weighted) chance of selection, there
is an excellent chance that the sample so se-
lected will closely represent the population of
all elements.

Second, probability sampling permits esti-
mates of sampling error. Although no probabil-
ity sample will be perfectly representative in all
respects, controlled selection methods permit
the researcher to estimate the degree of expected
error.
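
As a reminder of what estimating sampling error looks like in practice, here is a minimal Python sketch of the standard error and an approximate 95 percent confidence interval for a sample proportion. It assumes simple random sampling and uses hypothetical numbers (400 respondents, half of whom report some attribute); complex multistage designs such as the NCVS require adjustments that this sketch does not attempt.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical survey result: 50 percent of 400 respondents report an attribute.
p, n = 0.5, 400
se = proportion_standard_error(p, n)

# Approximate 95 percent confidence interval: estimate plus or minus 1.96 SE.
low, high = p - 1.96 * se, p + 1.96 * se
print(f"standard error = {se:.3f}")
print(f"95% confidence interval = {low:.3f} to {high:.3f}")
```

No comparable calculation is available for a nonprobability sample, because the chance of selecting any given element is unknown.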

Despite these advantages, it is sometimes im-
possible to use standard probability sampling
methods. Sometimes, it isn’t even appropriate
to do so. In those cases, researchers turn to non-
probability sampling.

Nonprobability Sampling
In many research applications, nonprobability sam-
ples are necessary or advantageous.

You can no doubt envision situations in which
it would be either impossible or unfeasible to
select the kinds of probability samples we have
described. Suppose we want to study auto
thieves. There is no list of all auto thieves, nor
are we likely to be able to create anything other
than a partial and highly selective list. More-
over, probability sampling is sometimes inap-
propriate even if it is possible. In many such sit-
uations, nonprobability sampling procedures
are called for. Recall that probability samples
are defined as those in which the probability
that any given sampling element will be se-
lected is known. Conversely, in nonprobability
sampling, the likelihood that any given element
will be selected is not known.

We’ll examine four types of nonprobability
samples in this section: (1) purposive or judg-
mental sampling, (2) quota sampling, (3) the
reliance on available subjects, and (4) snowball
sampling.

Purposive Sampling
Occasionally, it may be appropriate to select a
sample on the basis of our own knowledge of
the population, its elements, and the nature of
our research aims—in short, based on our judg-
ment and the purpose of the study. Such a sam-
ple is called a purposive sample.

We may wish to study a small subset of a
larger population in which many members of
the subset are easily identified, but the
enumeration of all of them would be nearly impossible.
For example, we might want to study members
of community crime prevention groups; many
members are easily visible, but it is not feasible
to define and sample all members of
community crime prevention organizations. In
studying a sample of the most visible members,
however, we may collect data sufficient for our
purposes.

Criminal justice research often compares
practices in different jurisdictions, such as cit-
ies or states. In such cases, study elements may
be selected because they exhibit some particu-
lar attribute. For instance, Cassia Spohn and
Julie Horney (1991) were interested in how
differences among states in rape shield laws
affected the use of evidence in sexual assault
cases. Strong rape shield laws restricted the use
of evidence or testimony about a rape victim’s
sexual behavior, whereas weak laws routinely
permitted such testimony. Spohn and Horney
selected a purposive sample of six states for
analysis based on the strength of their rape
shield laws. Similarly, Michael Leiber and Jayne
Stairs (1999) were interested in how economic
inequality combined with race to affect sentenc-
ing practices in Iowa juvenile courts. After con-
trolling for economic status, they found that
African American defendants received more
restrictive sentences than white defendants.
Leiber and Stairs selected three jurisdictions
purposively to obtain sample elements with ad-
equate racial diversity in the state of Iowa. The
researchers then selected more than 5,000 juve-
nile cases processed in those three courts.

Researchers may also use purposive or judg-
mental sampling to represent patterns of com-
plex variation. In their study of closed-circuit
television (CCTV) systems, Martin Gill and
Angela Spriggs (2005) describe how sites were
sampled to reflect variation in type of area
(residential, commercial, city center, large parking
facilities). Some individual CCTV projects were
selected because of certain specific features—
they were installed in a high-crime area, or the
CCTV setup was notably expensive. One ele-
ment of this study involved interviews to assess
changes in fear of crime following CCTV instal-
lation. Spriggs and associates (2005) sampled
passers-by on city center streets. They first
selected purposive samples of areas and spread
their interviews across four day/time periods.
This was done to reflect variation in the types
of people encountered on different streets at
different times. Sampling strategies were thus
adapted because of expected heterogeneity that
would have been difficult to capture with
random selection.

Pretesting a questionnaire is another situa-
tion in which purposive sampling is common.
If we plan to study people’s attitudes about
court-ordered restitution for crime victims,
we might want to test the questionnaire on a
sample of crime victims. Instead of selecting a
probability sample of the general population,
we might select some number of known crime
victims, perhaps from court records.

Quota Sampling
Like probability sampling, quota sampling ad-
dresses the issue of representativeness, although
the two methods approach the issue quite dif-
ferently. Obtaining a quota sample begins with
a matrix or table describing the characteristics
of the target population we wish to represent.
To do this, we need to know, for example, what
proportion of the population is male or female
and what proportions fall into various age cat-
egories, education levels, ethnic groups, and so
forth. In establishing a national quota sample,
we need to know what proportion of the na-
tional population is, say, urban, eastern, male,
under 25, white, working-class, and all the com-
binations of these attributes.

Once we have created such a matrix and
assigned a relative proportion to each cell in
the matrix, we can collect data from people
who have all the characteristics of a given cell.
We then assign all the persons in a given cell a
weight appropriate to their portion of the total
population. When all the sample elements are
weighted in this way, the overall data should
provide a reasonable representation of the total
population.
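
The weighting step can be sketched in a few lines of Python. Everything in this example is hypothetical: the two-by-two quota matrix, the population shares, and the interview counts are invented for illustration, and `cell_weights` is our own helper name.

```python
# Hypothetical quota matrix: share of the target population in each cell.
population_share = {
    ("male", "under 25"): 0.10,
    ("male", "25 and over"): 0.40,
    ("female", "under 25"): 0.10,
    ("female", "25 and over"): 0.40,
}

# Hypothetical number of interviews completed in each cell.
interviews = {
    ("male", "under 25"): 50,
    ("male", "25 and over"): 50,
    ("female", "under 25"): 50,
    ("female", "25 and over"): 50,
}

def cell_weights(population_share, interviews):
    """Weight each cell by its population share divided by its sample share."""
    n = sum(interviews.values())
    return {cell: population_share[cell] / (interviews[cell] / n)
            for cell in interviews}

weights = cell_weights(population_share, interviews)
for cell, weight in weights.items():
    print(cell, round(weight, 2))
```

Here the under-25 cells are overrepresented in the sample, so their respondents receive weights below 1, while respondents in the larger cells receive weights above 1. The weighted cell totals then match the population proportions.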

Although quota sampling may resemble
probability sampling, it has two inherent
problems. First, the quota frame (the proportions
that different cells represent) must be
accurate, and it is often difficult to get up-to-date
information for this purpose. A quota sample
of auto thieves or teenage vandals would
obviously suffer from this difficulty. Second, biases
may exist in the selection of sample elements
within a given cell—even though its proportion
of the population is accurately estimated. An
interviewer instructed to interview five persons
who meet a given complex set of characteristics
may still avoid people who live at the top of
seven-story walk-ups, have particularly
run-down homes, or own vicious dogs.

Quota and purposive sampling may be
combined to produce samples that are intuitively,
if not statistically, representative. For example,
Kate Painter and David Farrington (1998)
designed a survey to study marital and partner
violence. They wanted to represent several
variables: marital status, age, an occupational
measure of social status, and each of 10 standard
regions in the United Kingdom. A probability
sample was rejected because the authors wished
to get adequate numbers of respondents in each
of several categories, and some of the
categories were thought to be relatively uncommon.
Instead, the authors selected quota samples of
100 women in each of 10 regions and sought
equal numbers of respondents in each of five
occupational status categories.

Reliance on Available Subjects
Relying on available subjects—that is,
stopping people at a street corner or some other
location—is sometimes misleadingly called
“convenience sampling.” University researchers
frequently conduct surveys among the students
enrolled in large lecture classes. The ease and
economy of such a method explain its
popularity; however, it seldom produces data of any
general value. It may be useful to pretest a
questionnaire, but it should not be used for a study
purportedly describing students as a whole.

Reliance on available subjects can be an
appropriate sampling method in some situations.
It is generally best justified if the researcher
wants to study the characteristics of people
who are passing the sampling point at some
specified time. For example, in her study of
street lighting as a crime prevention strategy,
Painter (1996) interviewed samples of
pedestrians as they walked through specified areas
of London just before and six weeks after
improvements were made in lighting conditions.
Painter clearly understood the scope and limits
of this sampling technique. Her findings are
described as applying to people who actually use
area streets after dark, while recognizing that
this population may be quite different from
the population of area residents. Interviewing
a sample of available evening pedestrians is an
appropriate sampling technique for
generalizing to the population of evening pedestrians,
and the population of pedestrians will not be
the same as the population of residents.

In a more general sense, samples like
Painter’s select elements of a process—the process that
generates evening pedestrians—rather than
elements of a population. If we can safely assume
that no systematic pattern generates elements
of a process, then a sample of available elements
as they happen to pass by can be considered to
be representative. If you are interested in
studying crimes reported to police, a sample of, say,
every seventh crime report over a two-month
period will be representative of the general
population of crime reports over that two-month
period.

Sometimes nonprobability and probability
sampling techniques can be combined. For
example, most attempts to sample homeless or
street people rely on available subjects found
in shelters, parks, or other locations. Semaan
and associates (2002) suggest that once areas
are found where homeless people congregate,
individuals there can be enumerated and then
sampled. Here’s a semi-hypothetical example.

In recent years, Maxfield has observed that
many people who appear to be homeless
congregate at the corner of 9th Avenue and 41st
Street in Manhattan. An efficient strategy for
interviewing samples of homeless people would
be a time-space sample where, for example, each
hour individuals would be counted and some
fraction sampled. Let’s say we wished to
interview 30 people and spread those interviews over
a six-hour period; we would try to interview five
people per hour. So each hour we would count
the number of people within some specific area
(say, 20 at 1:00 p.m.), then divide that number
by five to obtain a sampling interval (4 in this
case). Recalling our earlier discussion of
systematic probability sampling, we would then select
a random starting point to identify the first
person to interview, then select the fourth
person after that, and so on. This approach would
yield an unbiased sample that represented the
population of street people on one Manhattan
corner over a six-hour period.

As it happens, 41st Street and 9th Avenue in
Manhattan is the rear entrance to the Port
Authority bus terminal. Marcus Felson and
associates (1996) described efforts to reduce crime
and disorder in the Port Authority terminal, a
place they claim is the world’s busiest bus
station. Among the most important objectives were
to reduce perceptions of crime problems and to
improve how travelers felt about the Port
Authority terminal. These are research questions
appropriate to some sort of survey. Because
more than 170,000 passengers pass through the
bus station on an average spring day, obtaining
a sufficiently large sample of users presents no
difficulty. The problem was how to select a
sample. Felson and associates point out that
stopping passengers on their way to or from a bus
was out of the question. Most passengers are
commuters whose journey to and from work is
timed to the minute, with none to spare for
interviewers’ questions. Here’s how Felson and
associates describe the solution and the sampling
strategy it embodied (1996, 90–91):

Response rates would have been low if the
Port Authority had tried to interview
rushing customers or to hand out
questionnaires to be returned later. Their solution
was ingenious. The Port Authority drew a
sample of outgoing buses . . . and placed
representatives aboard. After the bus had
departed, he or she would hand out a
questionnaire to be completed during the
trip . . . [and] collect these questionnaires
as each customer arrived at the
destination. This procedure produced a very high
response rate and high completion rate for
each item.

Snowball Sampling
Another type of nonprobability sampling that
closely resembles the available-subjects
approach is called snowball sampling. Commonly
used in field observation studies or specialized
interviewing, snowball sampling begins by
identifying a single subject or small number of
subjects and then asking the subject(s) to
identify others like him or her who might be willing
to participate in a study.

Criminal justice research on active
criminals or deviants frequently uses snowball
sampling techniques. The researcher often makes
an initial contact by consulting criminal justice
agency records to identify, say, someone
convicted of auto theft and placed on probation.
That person is interviewed and asked to
suggest other auto thieves whom researchers could
contact. Stephen Baron and Timothy
Hartnagel (1998) studied violence among homeless
youths in Edmonton, Canada, identifying their
sample through snowball techniques. Similarly,
snowball sampling is often used to study drug
users and dealers. Leon Pettiway (1995)
describes crack cocaine markets in Philadelphia
through the eyes of his snowball sample. Bruce
Jacobs and Jody Miller (1998) accumulated a
sample of 25 female crack dealers in St. Louis
to study specific techniques to avoid arrest.

Contacting an initial subject or informant
who will then refer the researcher to other
subjects can be especially difficult in studies of
active offenders. As in most aspects of criminal
justice research, the various approaches to
initiating contacts for snowball sampling have
advantages and disadvantages. Beginning with
subjects who have a previous arrest or
conviction is usually the easiest method for
researchers, but it suffers from potential bias by
depending on offenders who are known to police
or other officials (McCall 1978).

Because snowball samples are used most
commonly in field research, we’ll return to this
method of selecting subjects in Chapter 10 on
field methods and observation. In the
meantime, recent studies by researchers at the
University of Missouri–St. Louis offer good
examples of snowball samples of offenders that are
not dependent on contacts with criminal
justice officials. Beginning with a street-savvy
ex-offender, these researchers identified samples
of burglars (Wright and Decker 1994),
members of youth gangs (Decker and Van Winkle
1996), and armed robbers (Wright and Decker
1997). It’s especially difficult to identify active
offenders as research subjects, but these
examples illustrate notably clever uses of snowball
sampling techniques.

Nonprobability Sampling in Review
Snowball samples are essentially variations on
purposive samples (we want to sample
juvenile gang members) and on samples of
available subjects (sample elements identify other
sample elements that are available to us). Each
of these is a nonprobability sampling
technique. And, like other types of nonprobability
samples, snowball samples are most
appropriate when it is impossible to determine the
probability that any given element will be selected in
a sample. Furthermore, snowball sampling and
related techniques may be necessary when the
target population is difficult to locate or even
identify. Selecting pedestrians who happen to
pass by, for example, is not an efficient way to
select a sample of prostitutes or juvenile gang
members. In contrast, approaching a pedestrian
is an appropriate sampling method for
studying pedestrians, whereas drawing a probability
sample of urban residents to identify people
who walk in specific areas of the city would be
costly and inefficient.

Like other elements of criminal justice
research, sampling plans must be adapted to
specific research applications. When it’s
important to make estimates of the accuracy of our
samples, and when suitable sampling frames
are possible, we use probability sampling
techniques. When no reasonable sampling frame
is available, and we cannot draw a probability
sample, we cannot make estimates about
sample accuracy. Fortunately, in such situations,
we can make use of a variety of approaches for
drawing nonprobability samples.

✪ Main Points
• The logic of probability sampling forms the
foundation for representing large populations
with small subsets of those populations.

• The chief criterion of a sample’s quality is the
degree to which it is representative—the extent
to which the characteristics of the sample are
the same as those of the population from which
it was selected.

• The most carefully selected sample is almost
never a perfect representation of the
population from which it was selected. Some degree of
sampling error always exists.

• Probability sampling methods provide one
excellent way of selecting samples that will be
quite representative. They make it possible to
estimate the amount of sampling error that
should be expected in a given sample.

• The chief principle of probability sampling is
that every member of the total population must
have some known nonzero probability of being
selected in the sample.

• Our ability to estimate population parameters
with sample statistics is rooted in the sampling
distribution and probability theory. If we draw
a large number of samples of a given size,
sample statistics will cluster around the true
population parameter. As sample size increases, the
cluster becomes tighter.

• A variety of sampling designs can be used and
combined to suit different populations and
research purposes. Each type of sampling has its
own advantages and disadvantages.

• Simple random sampling is logically the most
fundamental technique in probability sampling
although it is seldom used in practice.

• Systematic sampling involves using a sampling
frame to select units that appear at some
specified interval—for example, every 8th, or 15th,
or 1,023rd unit. This method is functionally
equivalent to simple random sampling.

• Stratification improves the representativeness
of a sample by reducing the sampling error.

• Disproportionate stratified sampling is
especially useful when we want to select adequate
numbers of certain types of subjects who are
relatively rare in the population we are studying.

• Multistage cluster sampling is frequently used
when there is no list of all the members of a
population.

• The NCVS and the BCS are national crime
surveys based on multistage cluster samples.
Sampling methods for each survey illustrate
different approaches to representing relatively rare
events.

• Nonprobability sampling methods are less
statistically representative and less reliable than
probability sampling methods. However, they
are often easier and cheaper to use.

• Purposive sampling is used when researchers
wish to select specific elements of a population.
This may be because the elements are believed
to be representative of extreme cases or because
they represent the range of variation expected
in a population.

• In quota sampling, researchers begin with a
detailed description of the characteristics of the
total population and then select sample
members in a way that includes the different
composite profiles that exist in the population.

• In cases in which it’s not possible to draw
nonprobability samples through other means,
researchers often rely on available subjects.
Professors sometimes do this—students in their
classes are available subjects.

• Snowball samples accumulate subjects through
chains of referrals and are most commonly used
in field research.

✪ Key Terms
binomial variable, p. 149
cluster sample, p. 157
confidence interval, p. 152
confidence level, p. 151
disproportionate stratified sampling, p. 156
equal probability of selection method (EPSEM), p. 144
nonprobability sample, p. 162
population, p. 145
population parameter, p. 145
probability sample, p. 142
purposive sample, p. 162
quota sample, p. 163
sample element, p. 145
sample statistic, p. 145
sampling distribution, p. 146
sampling frame, p. 149
sampling units, p. 157
simple random sample, p. 154
snowball sampling, p. 165
standard error, p. 150
stratification, p. 155
systematic sampling, p. 154

✪ Review Questions and Exercises
1. Discuss possible study populations, elements,
sampling units, and sampling frames for
drawing a sample to represent the populations
listed here. You may wish to limit your
discussion to populations in a specific state or other
jurisdiction.

a. Municipal police officers
b. Felony court judges
c. Auto thieves
d. Licensed automobile drivers
e. State police superintendents
f. Persons incarcerated in county jails

2. What steps would be involved in selecting a
multistage cluster sample of undergraduate
students taking criminal justice research
methods courses in U.S. colleges and universities?

3. Briefly discuss some potential problems in
drawing a sample of visitors to a popular website.

✪ Additional Readings
Kish, Leslie, Survey Sampling (New York: Wiley,
1965). Unquestionably the definitive work on
sampling in social research. Kish’s coverage
ranges from the simplest matters to the most
complex and mathematical. He is both highly
theoretical and downright practical. Easily
readable and difficult passages intermingle as Kish
dissects everything you could want or need to
know about each aspect of sampling.

Patton, Michael Quinn, Qualitative Research and
Evaluation Methods, 3rd ed. (Thousand Oaks,
CA: Sage, 2001). Though its focus is evaluation,
this book presents one of the best discussions
of nonprobability sampling available. Patton
covers a wide range of variations on purposive
sampling.

Semaan, Salaam, Jennifer Lauby, and Jon
Liebman, “Street and Network Sampling in
Evaluation Studies of HIV Risk-Reduction
Interventions,” AIDS Reviews 4(2002): 213–223.
Many techniques used in public health research
can cross over nicely for criminal justice
studies. This is a good example of creative sampling
techniques for finding hard-to-find people.

Weisel, Deborah, Conducting Community Surveys:
A Practical Guide for Law Enforcement Agencies
(Washington, DC: U.S. Department of Justice,
Office of Justice Programs, Bureau of Justice
Statistics, 1999). This short handbook offers
good, basic advice on drawing samples for
community-level victimization surveys. The
information on estimating sample size is especially
good.

Chapter 7

Survey Research and Other
Ways of Asking Questions
We’ll examine how mail, interview, and telephone surveys can be used in
criminal justice research. We’ll also consider other ways of collecting data by
asking people questions.

Introduction
Topics Appropriate to Survey Research
Counting Crime
Self-Reports
Perceptions and Attitudes
Targeted Victim Surveys
Other Evaluation Uses
Guidelines for Asking Questions
Open-Ended and Closed-Ended Questions
Questions and Statements
Make Items Clear
Short Items Are Best
Avoid Negative Items
Biased Items and Terms
Designing Self-Report Items
Questionnaire Construction
General Questionnaire Format
Contingency Questions
Matrix Questions
Ordering Items in a Questionnaire

170 Part Three Modes of Observation

Introduction
Asking people questions is the most common data
collection method in social science.

A little-known survey was attempted among
French workers in 1880. A German political so-
ciologist mailed some 25,000 questionnaires to
workers to determine the extent of their exploi-
tation by employers. The rather lengthy ques-
tionnaire included items such as these:

Does your employer or his representa-
tive resort to trickery in order to defraud
you of a part of your earnings? If you are
paid piece rates, is the quality of the article
made a pretext for fraudulent deductions
from your wages?

The survey researcher in this case was not George
Gallup but Karl Marx (1880, 208). Although
25,000 questionnaires were mailed out, there is
no record of any being returned. And you need
not know much about survey methods to recog-
nize the loaded questions posed by Marx.

Survey research is perhaps the most fre-
quently used mode of observation in sociology
and political science, and surveys are often used
in criminal justice research as well. You have no
doubt been a respondent in a survey, and you
may have conducted surveys yourself.

We begin this chapter by discussing the crim-
inal justice topics that are most appropriate for
survey methods. Next, we cover the basic princi-
ples of how to ask people questions for research
purposes, including some of the details of ques-
tionnaire construction. We describe the three
basic ways of administering questionnaires—
self-administration, face-to-face interviews,
and telephone interviews—and summarize the
strengths and weaknesses of each method. Af-
ter discussing more specialized interviewing

  DON’T START FROM SCRATCH!

Self-Administered Questionnaires
  Mail Distribution and Return
  Warning Mailings and Cover Letters
  Follow-Up Mailings
  Acceptable Response Rates
  Computer-Based Self-Administration

In-Person Interview Surveys
  The Role of the Interviewer
  Coordination and Control
  Computer-Assisted In-Person Interviews

Telephone Surveys
  Computer-Assisted Telephone Interviewing

Comparison of the Three Methods

Strengths and Weaknesses of Survey Research

Other Ways of Asking Questions
  Specialized Interviewing
  Focus Groups

Should You Do It Yourself?

Chapter 7 Survey Research and Other Ways of Asking Questions 171

of data collected by police. Of course, survey
measures have their own shortcomings. Most
of these difficulties, such as recall error and
reluctance to discuss victimization with in-
terviewers, are inherent in survey methods.
Nevertheless, victim surveys have become im-
portant sources of data about the volume
of crime in the United States and in other
countries.

Self-Reports
Surveys that ask people about crimes they may
have committed were also discussed in Chapter
4. For research that seeks to explore or explain
why people commit criminal, delinquent, or de-
viant acts, asking questions is the best method
available.

Within the general category of self-report
surveys, two different applications are dis-
tinguished by their target population and
sampling methods. Studies of offenders select
samples of respondents known to have com-
mitted crimes, often prisoners. Typically the fo-
cus is on the frequency of offending—how many
crimes of various types are committed by active
offenders over a period of time.

The other type of self-report survey focuses
on the prevalence of offending—how many peo-
ple commit crimes, in contrast to how many
crimes are committed by a target population
of offenders. Such surveys typically use samples
that represent a broader population, such as
U.S. households, adult males, or high school seniors.
The Monitoring the Future survey, briefly
described in Chapter 4, is a self-report survey
that centers on measuring the prevalence of of-
fending among high school seniors.
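
The distinction between prevalence and frequency can be made concrete with a small calculation. The sketch below uses invented self-report counts (the data and the function names are assumptions for illustration, not figures from any survey) to show that the two measures answer different questions about the same responses.

```python
# Hypothetical self-report data: offenses each respondent reports for the
# past year. The numbers are invented for illustration only.
reported_offenses = [0, 0, 3, 0, 1, 0, 12, 0, 0, 4]


def prevalence(counts):
    """Share of respondents who report at least one offense."""
    offenders = [c for c in counts if c > 0]
    return len(offenders) / len(counts)


def frequency(counts):
    """Average number of offenses among those who report any offending."""
    offenders = [c for c in counts if c > 0]
    if not offenders:
        return 0.0
    return sum(offenders) / len(offenders)


print(f"Prevalence: {prevalence(reported_offenses):.0%}")
print(f"Frequency among active offenders: {frequency(reported_offenses):.1f}")
```

Prevalence is computed over everyone in the sample, which is why it calls for a sample representing a broad population. Frequency is computed only over those who report offending, which is why studies of offenders can estimate it from samples of known offenders.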

General-population surveys and surveys
of offenders tend to present different types of
difficulties in connection with the validity and
reliability of self-reports. Recall error and the
reporting of fabricated offenses may be problems
in a survey of high-rate offenders (Roberts,
Mulvey, Horney, et al. 2005), whereas respon-
dents in general-population self-report surveys
may be reluctant to disclose illegal behavior.
When we discuss questionnaire construction

techniques, such as focus groups, we conclude
the chapter with some advice on the benefits
and pitfalls of conducting your own surveys.

Topics Appropriate to
Survey Research
Surveys have a wide variety of uses in basic and ap-
plied criminal justice research.

Surveys may be used for descriptive, explana-
tory, exploratory, and applied research. They
are best suited for studies that have individual
people as the units of analysis. They are often
used for other units of analysis as well, such
as households or organizations. Even in these
cases, however, one or more individual people
act as respondents or informants.

For example, researchers sometimes use vic-
timization incidents as units of analysis in ex-
amining data from crime surveys. The fact that
some people may be victimized more than once
and others not at all means that victimization
incidents are not the same units as individuals.
However, a survey questionnaire must still be
administered to people who provide informa-
tion about victimization incidents. In a similar
fashion, the National Jail Census, conducted
every five or so years by the Census Bureau, collects
information about local detention facilities.
Jails are the units of analysis, but information
about each jail is provided by individuals.
Quite a lot of research on police practices was
conducted following passage of the 1994 Crime
Bill, and in most cases, law enforcement agen-
cies were the units of analysis; individual peo-
ple, however, provided information for the sur-
veys of police departments.
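
One way to see the difference between respondents and units of analysis is to build two data files from the same interviews. The following is a minimal sketch with invented records; the respondent labels and offense names are hypothetical.

```python
# Hypothetical survey records: one entry per respondent, listing the
# victimization incidents that person reported. Data are invented.
respondents = {
    "R1": [],
    "R2": ["burglary", "theft"],
    "R3": ["theft", "theft", "assault"],
    "R4": [],
}

# Person-level file: one row per respondent.
person_rows = [
    {"respondent": rid, "n_incidents": len(incidents)}
    for rid, incidents in respondents.items()
]

# Incident-level file: one row per incident, still linked to the
# respondent who supplied the information.
incident_rows = [
    {"respondent": rid, "offense": offense}
    for rid, incidents in respondents.items()
    for offense in incidents
]

print(len(person_rows), "people;", len(incident_rows), "incidents")
```

Respondents with no incidents appear in the person-level file but contribute nothing to the incident-level file, while repeat victims contribute several rows. The two files therefore describe different units even though the same people answered the questions.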

We now consider some broad categories of
research applications in which survey methods
are especially appropriate.

Counting Crime
We touched on this use of surveys in Chapter 4.
Asking people about victimizations is a measure
of crime that adjusts for some of the limitations


propriate for evaluating any policy that may in-
crease crime reporting as a side effect.

Consider also that large-scale surveys such
as the NCVS cannot be used to evaluate local
crime prevention programs. This is because
the NCVS is designed to represent the national
population of persons who live in households.
Although NCVS data for the 11 largest states
can be analyzed separately (Lauritsen and
Schaum 2005), the NCVS is not representa-
tive of any particular local jurisdiction. It is not
possible to identify the specific location of victimizations
from NCVS data.

The community victim surveys designed
by the BJS and the COPS office help with each
of these needs. Local surveys can be launched
specifically to evaluate local crime prevention
efforts. Or innovative programs can be timed
to correspond to regular cycles of local surveys.
In each case, the BJS-COPS guide (Weisel 1999)
presents advice on drawing samples to repre-
sent local jurisdictions.

Another type of targeted victim survey is one
that focuses on particular types of incidents
that might target more narrowly defined population
segments. A good example is the National
Violence Against Women Survey, a joint
effort of the National Institute of Justice and
the Centers for Disease Control and Prevention
(Tjaden and Thoennes
2000). Screening questions presented explicit
descriptions of sexual and other violence with
the specific purpose of providing better information
about these incidents that have proved
difficult to measure through general-purpose
crime surveys.

Other Evaluation Uses
Other types of surveys may be appropriate for
applied studies. A good illustration of this is a
continuing series of neighborhood surveys
to evaluate community policing in Chicago.
Here’s an example of how the researchers link
their information needs to surveys (Chicago
Community Policing Evaluation Consortium
2004, 2):

later in this chapter, we will present examples
and suggestions for creating self-report items.

Perceptions and Attitudes
Another application of surveys in criminal
justice is to learn how people feel about crime
and criminal justice policy. Public views about
sentencing policies, gun control, police per-
formance, and drug abuse are often solicited
in opinion polls. Begun in 1972, the General
Social Survey is an ongoing survey of social in-
dicators in the United States. Questions about
fear of crime and other perceptions of crime
problems are regularly included. Since the
1970s, a growing number of explanatory stud-
ies have been conducted on public perceptions
of crime and crime problems. A large body of
research on fear of crime has grown, in part,
from the realization that fear and its behav-
ioral consequences are much more widespread
among the population than is actual criminal
victimization (Ditton and Farrall 2007).

Targeted Victim Surveys
Victim surveys that target individual cities or
neighborhoods are important tools for evalu-
ating policy innovations. Many criminal jus-
tice programs seek to prevent or reduce crime
in a specific area, but crimes reported to police
cannot be used to evaluate many types of
programs.

To see why this is so, consider a hypothetical
community policing program that encourages
neighborhood residents to report all suspected
crimes to the police. Results from the National
Crime Victimization Survey (NCVS) have con-
sistently shown that many minor incidents are
not reported because victims believe that po-
lice will not want to be bothered. But if a new
program stresses that police actually want to
be bothered, the proportion of crimes reported
may increase, resulting in what appears to be an
increase in crime.
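
A short calculation illustrates the problem. The sketch below assumes a fixed number of actual incidents and two invented reporting rates; none of these figures come from the NCVS or any evaluation.

```python
def reported_crimes(actual_crimes, reporting_rate):
    """Number of incidents that reach police records."""
    return actual_crimes * reporting_rate


# Hypothetical figures: actual crime does not change, only reporting does.
actual = 1000                              # incidents per year
before = reported_crimes(actual, 0.40)     # assumed rate before the program
after = reported_crimes(actual, 0.55)      # assumed rate after the program

change = (after - before) / before
print(f"Police records: {before:.0f} -> {after:.0f} ({change:+.1%})")
```

Police records would show a sizable rise even though the underlying volume of crime is constant in this example. A victim survey measures incidents whether or not they were reported, so it is not affected by the shift in reporting.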

The solution is to conduct targeted victim
surveys before and after introducing a policy
change. Such victim surveys are especially ap-


police in your city today?” and be provided with
a space to write in the answer (or be asked to
report it orally to an interviewer). The other op-
tion is closed-ended questions, in which the
respondent is asked to select an answer from
among a list provided by the researcher.

Closed-ended questions are especially useful
because they provide more uniform responses
and are more easily processed. They often can be
transferred directly into a data file. Open-ended
responses, in contrast, must be coded before
they can be processed for analysis. This coding
process often requires that the researcher in-
terpret the meaning of responses, which opens
up the possibility of misunderstanding and re-
searcher bias. Also, some respondents may give
answers that are essentially irrelevant to the re-
searcher’s intent.

The chief shortcoming of closed-ended
questions lies in the researcher’s structuring of
responses. When the relevant answers to a given
question are relatively clear, there should be no
problem. In some cases, however, the research-
er’s list of responses may fail to include some
important answers. When we ask about “the
most important crime problem facing the po-
lice in your city today,” for example, our check-
list might omit certain crime problems that re-
spondents consider important.

In constructing closed-ended questions, we
are best guided by two of the requirements for
operationalizing variables stated in Chapter 4.
First, the response categories provided should
be exhaustive: they should include all the pos-
sible responses that might be expected. Often,
researchers ensure this by adding a category
labeled something like “Other (Please specify:
______________).” Second, the answer catego-
ries must be mutually exclusive: the respondent
should not feel compelled to select more than
one. In some cases, researchers solicit multiple
answers, but doing so can create difficulties in
subsequent data processing and analysis. To
ensure that categories are mutually exclusive,
we should carefully consider each combination
of categories, asking whether a person could

Because it is a participatory program, CAPS
[Chicago’s Alternative Policing Strategy]
depends on the effectiveness of campaigns
to bring it to the public’s attention and on
the success of efforts to get the public in-
volved in beat meetings and other district
projects. The surveys enable us to track
the public’s awareness and involvement in
community policing in Chicago.

In general, surveys can be used to evaluate
policy that seeks to change attitudes, beliefs,
or perceptions. Consider a program designed
to promote victim and witness cooperation
in criminal court by reducing case-processing
time. At first, we might consider direct measures
of case-processing time as indicators of
program success. If the program goal is to in-
crease cooperation, however, a survey that asks
how victims and witnesses perceive case-pro-
cessing time will be more appropriate.

Guidelines for Asking
Questions
How questions are asked is the single most impor-
tant feature of survey research.

A defining feature of survey methods is that
research concepts are operationalized by ask-
ing people questions. Several general guidelines
can assist in framing and asking questions that
serve as excellent operationalizations of vari-
ables. It is important also to be aware of pitfalls
that can result in useless and even misleading
information. We’ll begin with some of the op-
tions available for creating questionnaires.

Open-Ended and
Closed-Ended Questions
In asking questions, researchers have two basic
options, and each can accommodate certain
variations. The first is open-ended questions,
in which the respondent is asked to provide
his or her own answers. For example, the re-
spondent may be asked, “What do you feel is
the most important crime problem facing the


cide?” Questionnaire items should be precise
so that the respondent knows exactly what the
researcher wants an answer to.

Frequently, researchers ask respondents for a
single answer to a combination question. Such
double-barreled questions seem to occur most
often when the researcher has personally identified
with a complex question. For example,
the researcher might ask respondents to agree
or disagree with the statement “The Depart-
ment of Corrections should stop releasing in-
mates for weekend furloughs and concentrate
on rehabilitating criminals.” Although many
people will unequivocally agree with the state-
ment and others will unequivocally disagree,
still others will be unable to answer. Some
might want to terminate the furlough program
and punish—not rehabilitate—prisoners. Oth-
ers might want to expand rehabilitation efforts
while maintaining weekend furloughs; they can
neither agree nor disagree without misleading
the researcher.

Short Items Are Best
In the interest of being unambiguous and pre-
cise and pointing to the relevance of an issue, re-
searchers often create long, complicated items.
That should be avoided. In the case of ques-
tionnaires respondents complete themselves,
they are often unwilling to study an item to un-
derstand it. The respondent should be able to
read an item quickly, understand its intent, and
select or provide an answer without difficulty.
In general, it’s safe to assume that respondents
will read items quickly and give quick answers;
therefore short, clear items that will not be mis-
interpreted under those conditions are best.
Questions read to respondents in person or
over the phone should be similarly brief.

Avoid Negative Items
A negation in a questionnaire item paves the
way for easy misinterpretation. Asked to agree
or disagree with the statement “Drugs such
as marijuana should not be legalized,” many

reasonably choose more than one answer. In
addition, it is useful to add an instruction that
respondents should select the one best answer.
However, this is still not a satisfactory substi-
tute for a carefully constructed set of responses.
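
For items with numeric answer categories, the two requirements can be checked mechanically. The sketch below is one possible approach, assuming a hypothetical count item whose categories are stored as ranges; the category labels and function names are illustrative only.

```python
# Hypothetical closed-ended categories for a count item, as (label, low, high).
# high=None means the category is open-ended at the top.
CATEGORIES = [
    ("Never", 0, 0),
    ("Once", 1, 1),
    ("2 to 5 times", 2, 5),
    ("6 to 10 times", 6, 10),
    ("11 to 20 times", 11, 20),
    ("More than 20 times", 21, None),
]


def matching_labels(value, categories):
    """Return the labels of every category that contains the value."""
    return [
        label
        for label, low, high in categories
        if value >= low and (high is None or value <= high)
    ]


def check_categories(categories, max_value=100):
    """Find values with no category (gaps) or more than one (overlaps)."""
    gaps, overlaps = [], []
    for value in range(max_value + 1):
        labels = matching_labels(value, categories)
        if not labels:
            gaps.append(value)
        elif len(labels) > 1:
            overlaps.append(value)
    return gaps, overlaps


gaps, overlaps = check_categories(CATEGORIES)
print("Exhaustive:", not gaps)
print("Mutually exclusive:", not overlaps)
```

A set such as "1 to 5" and "5 to 10" would fail both checks: a respondent with no occurrences has no category, and a respondent with exactly five has two. Categories that are not numeric ranges cannot be checked this way and still require the researcher's judgment.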

Questions and Statements
The term questionnaire suggests a collection
of questions, but a typical questionnaire prob-
ably has as many statements as questions. This
is because researchers often are interested in
determining the extent to which respondents
hold a particular attitude or perspective. Re-
searchers try to summarize the attitude in a
fairly brief statement; then they present that
statement and ask respondents whether they
agree or disagree with it. Rensis Likert formal-
ized this procedure through the creation of the
Likert scale, a format in which respondents are
asked whether they strongly agree, agree, dis-
agree, or strongly disagree, or perhaps strongly
approve, approve, and so forth.
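
When Likert responses are entered into a data file, each answer category is usually given a numeric code. The sketch below shows one way to do that; the particular codes and their direction are an assumed convention, not a rule stated in this chapter.

```python
# Assumed numeric codes; the direction and range are a convention, not a rule.
LIKERT_CODES = {
    "strongly agree": 4,
    "agree": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def code_response(answer):
    """Convert a Likert answer to its numeric code, or None if unrecognized."""
    return LIKERT_CODES.get(answer.strip().lower())


answers = ["Agree", "strongly disagree", " Strongly Agree ", "no opinion"]
print([code_response(a) for a in answers])
```

Answers outside the listed categories are returned as None rather than forced into a code, so they can be reviewed separately.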

Both questions and statements may be used
profitably. Using both in a questionnaire adds
flexibility in the design of items and can make
the questionnaire more interesting as well.

Make Items Clear
It should go without saying that questionnaire
items must be clear and unambiguous, but the
broad proliferation of unclear and ambiguous
questions in surveys makes the point worth
stressing here. Researchers commonly become
so deeply involved in the topic that opinions
and perspectives that are clear to them will not
be at all clear to respondents, many of whom
have given little or no thought to the topic. Or
researchers may have only a superficial understanding
of the topic and so may fail to specify
the intent of a question sufficiently. The question
“What do you think about the governor’s
decision concerning prison furloughs?” may
evoke in the respondent some counter-ques-
tions: “Which governor’s decision?” “What are
prison furloughs?” “What did the governor de-


found that the way programs were identified
had an impact on the amount of public sup-
port they received. Here are some comparisons:

More Support                       Less Support
“Assistance to the poor”           “Welfare”
“Halting the rising crime rate”    “Law enforcement”
“Dealing with drug addiction”      “Drug rehabilitation”

In 1986, for example, 63 percent of respondents
said too little money was being spent on “assis-
tance to the poor,” while in a matched survey
that year, only 23 percent said we were spend-
ing too little on “welfare.”

The main guidance we offer for avoiding bias
is that researchers imagine how they would feel
giving each of the answers they offer to respon-
dents. If they would feel embarrassed, perverted,
inhumane, stupid, irresponsible, or anything
like that, then they should give some serious
thought to whether others will be willing to
give those answers. Researchers must carefully
examine the purpose of their inquiry and con-
struct items that will be most useful to it.

We also need to be generally wary of what re-
searchers call the social desirability of questions
and answers. Whenever we ask people for information,
they answer through a filter of what
will make them look good. That is especially
true if they are being interviewed in a face-to-
face situation.

Designing Self-Report Items
Social desirability is one of the problems that
plague self-report crime questions in general
population surveys. Adhering to the ethical
principles of confidentiality and anonymity, as
well as convincing respondents that we are do-
ing so, is one way of getting more truthful re-
sponses to self-report items. Other techniques
can help us avoid or reduce problems with self-
report items.

respondents will overlook the word not and an-
swer on that basis. Thus some will agree with
the statement when they are in favor of legal-
izing marijuana and others will agree when
they oppose it. And we may never know which
is which.

Biased Items and Terms
Recall from the earlier discussion of conceptu-
alization and operationalization that there are
no ultimately true meanings for any of the con-
cepts we typically study in social science. This
same general principle applies to the responses
we get from persons in a survey.

The meaning of a given response to a ques-
tion depends in large part on the wording of
the question. That is true of every question
and answer. Some questions seem to encour-
age particular responses more than other ques-
tions. Questions that encourage respondents
to answer in a particular way are biased. Most
researchers recognize the likely effect of a ques-
tion such as “Do you support the president’s
initiatives to promote the safety and security
of all Americans?” and no reputable researcher
would use such an item. The biasing effect of
items and terms is far subtler than this example
suggests, however.

The mere identification of an attitude or position
with a prestigious (or unpopular) person
or agency can bias responses. For example, an
item that starts with “Do you agree or disagree
with the recent Supreme Court decision that …”
might have this effect. We are not suggesting
that such wording will necessarily produce
consensus or even a majority in support of the
position identified with the prestigious person
or agency. Rather, support will likely be greater
than what would have been obtained without
such identification.

Sometimes, the impact of different forms
of question wording is relatively subtle. For ex-
ample, Kenneth Rasinski (1989) analyzed the
results of several General Social Survey studies
of attitudes toward government spending. He


lustrate how thoughtful wording and intro-
ductions can be incorporated into sensitive
questions.

Self-report surveys of known offenders
encounter different problems. Incarcerated
persons may be reluctant to admit committing
crimes because of the legal consequences.
High-rate offenders may have difficulty distinguishing
among a number of different crimes
or remembering even approximate dates. Sort-
ing out dates and details of individual crimes
among high-rate offenders requires different
strategies.

One technique that is useful in surveys of
active offenders is to interview subjects several
times at regular intervals. For example, Lisa Ma-
her (1997) interviewed her sample of heroin- or
cocaine-addicted women repeatedly (sometimes
daily) over the course of three years. Each sub-
ject was asked about her background, intimate
relationships with men, income-generating ac-
tivities, and drug-use habits. Having regular in-
terviews helped respondents recall offending.

One method, used in earlier versions of
the BCS, is to introduce a group of self-report
items with a disclaimer and to sanitize the
presentation of offenses. The self-report section
of the 1984 BCS began with this introduction:

There are lots of things which are actually
crimes, but which are done by lots of peo-
ple, and which many people do not think
of as crimes. On this card [printed card
handed to respondents] are a list of eight
of them. For each one can you tell me how
many people you think do it—most people,
a lot of people, or no one.

Respondents then read a card, shown in Fig-
ure 7.1, that presented descriptions of various
offenses. Interviewers first asked respondents
how many people they thought ever did X,
where X corresponded to the letter for an of-
fense shown in Figure 7.1. Next, respondents
were asked whether they had ever done X. Inter-
viewers then moved on down the list of letters
for each offense on the card.

This procedure incorporates three tech-
niques to guard against the socially desirable
response of not admitting to having commit-
ted a crime. First, the disclaimer seeks to reas-
sure respondents that “many people” do not
really think of various acts as crimes. Second,
respondents are asked how many people they
think commit each offense before being asked
whether they have done so themselves. This
takes advantage of a common human justification
for engaging in certain kinds of behavior:
other people do it. Third, asking whether
they “have ever done X” is less confrontational
than asking whether they “have ever cheated
on an expense account.” Again, the foibles of
human behavior are at work here, in much the
same way that people use euphemisms such as
restroom for “toilet” and sleep together for “have
sexual intercourse.” It is, of course, not realis-
tic to expect that such ploys will reassure all
respondents. Furthermore, disclaimers about
serious offenses such as rape or bank robbery
would be ludicrous. But such techniques il-

Figure 7.1 Showcard for Self-Report Items,
1984 British Crime Survey
Source: Adapted from the 1984 British Crime Survey (NOP
Market Research Limited 1985).

A. Taken office supplies from work (such as statio-
nery, envelopes, and pens) when not sup-
posed to.

B. Taken things other than office supplies from work
(such as tools, money, or other goods) when not
supposed to.

C. Fiddled expenses [fiddled is the Queen’s English
equivalent of fudged].

D. Deliberately traveled [on a train] without a ticket or
paid too low a fare.

E. Failed to declare something at customs on which
duty was payable.

F. Cheated on tax.

G. Used cannabis (hashish, marijuana, ganga,
grass).

H. Regularly driven a car when they know they have
drunk enough to be well above the legal limit.


tions. An improperly laid-out questionnaire
can cause respondents to miss questions, con-
fuse them about the nature of the data desired,
and, in the extreme, lead them to throw the
questionnaire away.

As a general rule, the questionnaire should be
uncluttered. Inexperienced researchers tend to
fear that their questionnaire will look too long,
so they squeeze several questions onto a single
line, abbreviate questions, and try to use as few
pages as possible. Such efforts are ill-advised
and even counterproductive. Putting more than
one question on a line will cause some respon-
dents to miss the second question altogether.
Some respondents will misinterpret abbreviated
questions. And, more generally, respondents
who have spent considerable time on the first
page of what seemed a short questionnaire will
be more demoralized than respondents who
quickly completed the first several pages of what
initially seemed a long form. Moreover, the lat-
ter will have made fewer errors and will not have
been forced to reread confusing, abbreviated
questions. Nor will they have been forced to
write a long answer in a tiny space.

Contingency Questions
Quite often in questionnaires, certain ques-
tions are clearly relevant to only some of the
respondents and irrelevant to others. A victim
survey, for example, presents batteries of ques-
tions about victimization incidents that are
meaningful only to crime victims.

Frequently, this situation—realizing that the
topic is relevant only to some respondents—
arises when we wish to ask a series of ques-
tions about a certain topic. We may want to
ask whether respondents belong to a particular
organization and, if so, how often they attend
meetings, whether they have held office in the
organization, and so forth. Or we might want
to ask whether respondents have heard any-
thing about a certain policy proposal, such as
opening a youth shelter in the neighborhood,
and then investigate the attitudes of those who
have heard of it.

Other research asks offenders to complete
“crime calendars” on which they make records
of weekly or monthly offenses committed. Jen-
nifer Roberts and associates (Roberts, Mulvey,
Horney, et al. 2005) found that more frequent
interviews were necessary for use by high-rate
offenders, and that crime calendars were best
suited for tracking more serious offenses.

Obtaining valid and reliable results from
self-report items is challenging, but self-report
survey techniques are important tools for ad-
dressing certain types of criminal justice re-
search questions. Because of this, researchers
are constantly striving to improve self-report
items. See the collection of essays by Kennet
and Gfroerer (2005) for a detailed discussion of
issues involved in measuring self-reported drug
use through the National Household Survey
on Drug Abuse. A National Research Council
report (2001) discusses self-report survey mea-
sures more generally.

Computer technology has made it possible
to significantly improve self-reported items.
David Matz (2007) describes advances in self-
report items from recent surveys that supple-
ment the British Crime Survey. We present ex-
amples later in the chapter when we focus on
different modes of survey administration.

Questionnaire Construction
After settling on question content, researchers must
consider the format and organization of all items in
a questionnaire.

Because questionnaires are the fundamental in-
struments of survey research, we now turn our
attention to some of the established techniques
for constructing them. The following sections
are best considered as a continuation of our
theoretical discussions in Chapter 4 of concep-
tualization and measurement.

General Questionnaire Format
The format of a questionnaire is just as im-
portant as the nature and wording of the ques-


tingency questions is long enough to extend
over several pages. Victim surveys typically in-
clude many contingency questions. Figure 7.3
presents a few questions from the NCVS ques-
tionnaire used in 2004. All respondents are
asked a series of screening questions to reveal
possible victimizations. Persons who answer
yes to any of the screening questions then com-
plete a crime incident report that presents a
large number of items designed to measure de-
tails of the victimization incident.

As Figure 7.3 shows, the crime incident
report itself also contains contingency ques-
tions. You might notice that even this brief ad-
aptation from the NCVS screening and crime
incident report questionnaires is rather com-
plex. NCVS questionnaires are administered
primarily through computer-assisted telephone
interviews in which the flow of contingency
questions is more or less automated. It would
be difficult to construct a self-administered
victimization questionnaire with such compli-
cated contingency questions.
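
To suggest how the flow of contingency questions might be automated, the sketch below encodes the skip instructions printed beside items 20a, 20b, and 56 in Figure 7.3 as a lookup table. It is an illustration only, not the software used for the NCVS; the table layout and function names are assumptions.

```python
# Skip pattern adapted from the items shown in Figure 7.3.
# Keys are item numbers; each answer maps to the next item to ask.
SKIP_RULES = {
    "20a": {"Yes": "20b", "No": "56"},
    "20b": {
        "Respondent only": "21",
        "Respondent and other household member(s)": "21",
        "Only other household member(s)": "59",
    },
    "56": {"Yes": "57", "No": "88"},
}


def next_item(item, answer):
    """Return the next item number, or None if the figure shows no rule."""
    return SKIP_RULES.get(item, {}).get(answer)


def route(start, answers):
    """Follow the skip rules from a starting item using recorded answers."""
    path = [start]
    current = start
    while current in answers:
        following = next_item(current, answers[current])
        if following is None:
            break
        path.append(following)
        current = following
    return path


print(route("20a", {"20a": "No", "56": "No"}))
```

Because the routing is stored as data, the interviewer never has to find the next item on the page, and a respondent is never asked an item that the earlier answers make irrelevant.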

Matrix Questions
Often researchers want to ask several questions
that have the same set of answer categories.
This happens whenever the Likert response
categories are used. Then it is often possible
to construct a matrix of items and answers, as
illustrated in Figure 7.4.

This format has three advantages. First,
it uses space efficiently. Second, respondents

The subsequent questions in series such as
these are called contingency questions; whether
they are to be asked and answered is contingent
on the response to the first question in the
series. The proper use of contingency questions
can make it easier for respondents to complete
the questionnaire because they do not have to
answer questions that are irrelevant to them.

Contingency questions can be presented in
several formats on printed questionnaires. The
one shown in Figure 7.2 is probably the clear-
est and most effective. Note that the questions
shown in the figure could have been dealt with
in a single question: “How many times, if any,
have you smoked marijuana?” The response
categories then would be: “Never,” “Once,”
“2 to 5 times,” and so forth. This single ques-
tion would apply to all respondents, and each
would find an appropriate answer category.
Such a question, however, might put pressure
on some respondents to report having smoked
marijuana, because the main question asks how
many times they have smoked it. The contin-
gency question format illustrated in Figure 7.2
reduces the subtle pressure on respondents to
report having smoked marijuana. This discus-
sion shows how seemingly theoretical issues of
validity and reliability are involved in so mun-
dane a matter as how to format questions on a
piece of paper.

Used properly, even complex sets of contin-
gency questions can be constructed without
confusing respondents. Sometimes a set of con-

23. Have you ever smoked marijuana?
    [ ] Yes
    [ ] No

        If yes: About how many times have
        you smoked marijuana?
        [ ] Once
        [ ] 2 to 5 times
        [ ] 6 to 10 times
        [ ] 11 to 20 times
        [ ] More than 20 times

Figure 7.2 Contingency Question Format


Screening Question:

36a. I’m going to read you some examples that will give you an idea of the kinds of crimes this study covers. As I go
through them, tell me if any of these happened to you in the last 6 months, that is, since [date].

Was something belonging to YOU stolen, such as-

(a) Things that you carry, like luggage, a wallet, purse, briefcase, book-
(b) Clothing, jewelry or calculator-
(c) Bicycle or sports equipment-
(d) Things in your home—like a TV, stereo, or tools-
(e) Things outside your home, such as a garden hose or lawn furniture-
(f) Things belonging to children in the household-
(g) Things from a vehicle, such as a package, groceries, camera, or tapes-
OR
(h) Did anyone ATTEMPT to steal anything belonging to you?

Crime Incident Report:

20a. Were you or any other member of this household present when this incident occurred?

___ Yes [ask item 20b]

___ No [skip to 56, page 8]

20b. Which household members were present?

___ Respondent only [ask item 21]

___ Respondent and other household member(s) [ask item 21]

___ Only other household member(s) [skip to 59, page 8]

21. Did you personally see an offender?

___ Yes

___ No

. . . . . . . . .

56. Do you know or have you learned anything about the offender(s)—for instance, whether there was one or more
than one offender involved, whether it was someone young or old, or male or female?

___ Yes [ask 57]

___ No [skip to 88, page 11]

Figure 7.3 NCVS Screening Questions and Crime Incident Report
Source: Adapted from National Crime Victimization Survey, NCVS-1 Basic Screen Questionnaire, 9/16/2004 version,
www.ojp.usdoj.gov/bjs/pub/pdf/ncvs104, accessed May 9, 2008; National Crime Victimization Survey, NCVS-2 Crime Incident
Report, 9/16/2004 version, www.ojp.usdoj.gov/bjs/pub/pdf/ncvs204, accessed May 9, 2008.
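
The skip instructions in Figure 7.3 behave like a small routing table: each answer determines which item is asked next. The following is a minimal sketch of that idea, not part of the NCVS itself. The names `ROUTES` and `next_item` are hypothetical, and only the items and skip targets printed in the figure are encoded.

```python
# Hypothetical routing table for the contingency items shown in Figure 7.3.
# Keys are item labels; each value maps an answer to the next item to ask.
ROUTES = {
    "20a": {"Yes": "20b", "No": "56"},
    "20b": {
        "Respondent only": "21",
        "Respondent and other household member(s)": "21",
        "Only other household member(s)": "59",
    },
    "56": {"Yes": "57", "No": "88"},
}


def next_item(item, answer):
    """Return the label of the next item to ask after a given answer."""
    try:
        return ROUTES[item][answer]
    except KeyError:
        # Items or answers not shown in the figure have no rule here.
        raise ValueError(f"no routing rule for item {item!r}, answer {answer!r}")


# A respondent who was not present during the incident skips ahead to item 56.
print(next_item("20a", "No"))
```

Keeping the routing rules in one table, rather than scattering them through the questionnaire logic, makes it easier to check that every answer leads somewhere and that no respondent is asked an irrelevant item.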

Figure 7.4 Matrix Question Format

17. Beside each of the statements presented below, please indicate whether you Strongly Agree (SA), Agree (A),
Disagree (D), Strongly Disagree (SD), or are Undecided (U).

SA A D SD U

a. What this country needs is more law and order [ ] [ ] [ ] [ ] [ ]

b. Police in America should not carry guns [ ] [ ] [ ] [ ] [ ]

c. Repeat drug dealers should receive life sentences [ ] [ ] [ ] [ ] [ ]


180 Part Three Modes of Observation

ones. If several questions ask about the dangers
of illegal drug use and then a question (open-
ended) asks respondents to volunteer what they
believe to be the most serious crime problems
in U.S. cities, drug use will receive more men-
tions than would otherwise be the case. In this
situation, it is preferable to ask the open-ended
question first.

If respondents are asked to rate the over-
all effectiveness of corrections policy, they will
answer subsequent questions about specific
aspects of correctional institutions in a way
that is consistent with their initial assessment.
The converse is true as well: if respondents are
first asked specific questions about prisons and
other correctional facilities, their subsequent
overall assessment will be influenced by the earlier
question.

The best solution is sensitivity to the prob-
lem. Although we cannot avoid the effect of
question order, we should attempt to estimate
what that effect will be. Then we will be able to
interpret results in a meaningful fashion. If the
order of questions seems an especially impor-
tant issue in a given study, we could construct
several versions of the questionnaire that con-
tain the different possible orderings of ques-
tions. We could then determine the effects of
ordering. At the very least, different versions of
the questionnaire should be pretested.
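
Constructing several versions of a questionnaire can be sketched in a few lines. The example below is only an illustration under stated assumptions: the block labels are invented, the helper names `questionnaire_versions` and `assign_version` are hypothetical, and rotating versions across respondents is one possible assignment rule (random assignment is another).

```python
from itertools import permutations


def questionnaire_versions(blocks):
    """Return every possible ordering of the given question blocks."""
    return [list(order) for order in permutations(blocks)]


def assign_version(respondent_index, versions):
    """Rotate through the versions so each is used about equally often."""
    return versions[respondent_index % len(versions)]


# Invented block labels, echoing the drug-use example in the text.
blocks = ["drug-use items", "open-ended crime problems item"]
versions = questionnaire_versions(blocks)

for i in range(4):
    print(i, assign_version(i, versions))
```

Because the number of orderings grows very quickly with the number of blocks, this approach is practical only when a few blocks are reordered. Comparing answers across versions then gives an estimate of the ordering effect.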

The desired ordering of questions differs
somewhat between self-administered question-
naires and interviews. In the former, it is usually
best to begin the questionnaire with the most
interesting questions. Potential respondents
who glance casually at the first few questions
should want to answer them. Perhaps the ques-
tions involve opinions that they are aching to
express. At the same time, however, the initial
questions should be neither threatening nor
sensitive. It might be a bad idea to begin with
questions about sexual behavior or drug use.
Requests for demographic data (age, gender,
and the like) should generally be placed at the
end of a self-administered questionnaire. Plac-
ing these questions at the beginning, as many

probably find it easier to complete a set of questions
presented in this fashion. Third, this format
may increase the comparability of responses
given to different questions for the respondent,
as well as for the researcher. Because respon-
dents can quickly review their answers to earlier
items in the set, they might choose between, say,
“strongly agree” and “agree” on a given state-
ment by comparing their strength of agreement
with their earlier responses in the set.

Some dangers are inherent in using this format,
as well. Its advantages may promote structuring
an item so that the responses fit into the
matrix format when a different, more idiosyn-
cratic, set of responses might be more appropri-
ate. Also, the matrix question format can gen-
erate a response set among some respondents.
This means that respondents may develop a
pattern of, say, agreeing with all the statements,
without really thinking about what the state-
ments mean. That is especially likely if the set
of statements begins with several that indicate a
particular orientation (for example, a conserva-
tive political perspective) and then offers only a
few subsequent ones that represent the opposite
orientation. Respondents might assume that all
the statements represent the same orientation
and, reading quickly, misread some of them,
thereby giving the wrong answers. This problem
can be reduced somewhat by alternating state-
ments that represent different orientations and
by making all statements short and clear.
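
One simple screen for a possible response set is to flag respondents who gave the identical answer to every item in a matrix. The sketch below assumes answers are stored as a list of codes such as "SA" or "D", with `None` for skipped items; the function name `is_straight_lined` is hypothetical. A flagged row is only a candidate for closer review, since a respondent may sincerely agree with every statement.

```python
def is_straight_lined(answers):
    """Return True when every answered matrix item has the same response."""
    given = [a for a in answers if a is not None]  # ignore skipped items
    # At least two answers are needed before a pattern can be identified.
    return len(given) > 1 and len(set(given)) == 1


# Invented responses to the three statements in Figure 7.4.
print(is_straight_lined(["A", "A", "A"]))    # identical answers throughout
print(is_straight_lined(["SA", "D", "A"]))   # varied answers
```

If the matrix alternates statements that represent different orientations, as recommended above, identical answers across the whole set become harder to explain as sincere opinion.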

A more difficult problem is when responses
are generated through respondent boredom or
fatigue. This can be avoided by keeping matrix
questions and the entire questionnaire as short
as possible. Later in this chapter, in the section
on comparing different methods of question-
naire administration, we will describe a useful
technique for avoiding response sets generated
by respondent fatigue.

Ordering Items in a Questionnaire
The order in which questions are asked can also
affect the answers given. The content of one
question can affect the answers given to later


Finally, it’s common for less experienced re-
searchers to assume that questionnaires must
be newly constructed for each application. In
contrast, it’s almost always possible—and usu-
ally preferable—to use an existing question-
naire as a point of departure. See the box “Don’t
Start from Scratch!” for more on this.

Self-Administered Questionnaires
Self-administered questionnaires are generally the
least expensive and easiest to complete.

Although the mail survey is the typical method
used in self-administered studies, several other
methods are also possible. In some cases, it may
be appropriate to administer the questionnaire
to a group of respondents gathered at the same

inexperienced researchers are tempted to do,
might make the questionnaire appear overly in-
trusive, so the person who receives it may not
want to complete it.

Just the opposite is generally true for in-
person interview and telephone surveys. When
the potential respondent’s door first opens, the
interviewer must begin to establish rapport
quickly. After a short introduction to the study,
the interviewer can best begin by enumerating
the members of the household, obtaining de-
mographic data about each. Such questions are
easily answered and generally nonthreatening.
Once the initial rapport has been established,
the interviewer can move into more sensitive
areas. An interview that begins with the ques-
tion “Do you ever worry about strangers ap-
pearing at your doorstep?” will probably end
rather quickly.

DON’T START FROM SCRATCH!

It’s always easier to modify an existing question-
naire for a particular research application than it
is to start from scratch. It’s also difficult to imagine
asking questions that nobody has asked before.
Here are examples of websites that present
complete questionnaires or batteries of question-
naire items.
■ Bureau of Justice Statistics (BJS). In addition
to administering the NCVS, the BJS collects
information from a variety of justice organizations.
Copies of recent questionnaires for all
BJS-sponsored surveys are available:
www.ojp.usdoj.gov/bjs/quest.htm

■ California Healthy Kids Survey. This set of
questionnaires is useful for assessing behavior
routines. Most include items on alcohol, tobacco,
and other drug use; fighting; and other
behaviors of potential interest for school-based
interventions. English and Spanish versions
are available for elementary, middle,
and high school:
www.wested.org/hks/

■ Centers for Disease Control and Prevention
(CDC). Various centers within the CDC regularly
collect a variety of health-related data
through questionnaires and other data collection
systems. Copies of instruments are
available:
www.cdc.gov/nchs/express.htm

■ The Measurement Group. This website provides
links to questionnaires designed for use
in public health studies, but many of these include
items of potential interest to treatment-related
initiatives:
www.themeasurementgroup.com/evalbttn.htm

■ University of Surrey Question Bank. Maintained
by a university in England, the Question
Bank includes links to complete questionnaires
for a wide variety of surveys conducted
in the United Kingdom and other countries.
You can find a master list of surveys or browse
questionnaires by topic. An excellent resource:
http://qb.soc.surrey.ac.uk/

Source: Adapted from Maxfield (2001). All websites
accessed May 9, 2008.



recall your reasons for not returning it—and
keep those in mind any time you plan to send
questionnaires to others.

One big reason people do not return ques-
tionnaires is that it seems like too much trou-
ble. To overcome this problem, researchers have
developed ways to make the return of ques-
tionnaires easier. One method involves a self-
mailing questionnaire that requires no return
envelope. The questionnaire is designed so that
when it is folded in a particular fashion, the re-
turn address appears on the outside. That way,
the respondent doesn’t have to worry about
losing the envelope.

Warning Mailings and Cover Letters
Warning mailings are used to verify who lives
at sampled addresses, and to increase response
rates. Warning mailings work like this: After
researchers generate a sample, they send a post-
card to each selected respondent, with the nota-
tion “Address correction requested” printed on
the postcard. If the addressee has moved and left
a forwarding address, the questionnaire is sent
to the new address. In cases in which someone
has moved and not left a forwarding address, or
more than a year has elapsed and the post office
no longer has information about a new address,
the postcard is returned marked something
like “Addressee unknown.” Selected persons
who still reside at the original listed address are
warned in suitable language to expect a ques-
tionnaire in the mail. In such cases, postcards
should briefly describe the purpose of the survey
for which the respondent has been selected.

Warning letters can be more effective than
postcards in increasing response rates, and
they can also serve the purpose of cleaning ad-
dresses. Letters printed on letterhead stationery
can present a longer description of the survey’s
purpose and a more reasoned explanation of
why it is important for everyone to respond.

Cover letters accompanying the question-
naire offer a similar opportunity to increase
response rates. Two features of cover letters
warrant some attention. First, the content of

place at the same time, such as police officers
at roll call or prison inmates at some specially
arranged assembly. Or probationers might
complete a questionnaire when they report
for a meeting with their probation supervi-
sor. The Monitoring the Future survey (see
Chapter 4) has high school seniors complete
self-administered questionnaires in class.

Some experimentation has been conducted
on the home delivery of questionnaires. A re-
search worker delivers the questionnaire to the
home of sample respondents and explains the
study. Then the questionnaire is left for the re-
spondent to complete, and the researcher picks
it up later.

Home delivery and the mail can be used in
combination as well. Questionnaires can be
mailed to families, and then research workers
may visit the homes to pick up the question-
naires and check them for completeness. In the
opposite approach, survey packets are hand-
delivered by research workers with a request
that the respondents mail the completed questionnaires
to the research office. In general,
when a research worker delivers the question-
naire, picks it up, or both, the completion rate
is higher than for straightforward mail surveys.

More recently, the Internet has made it
possible to have respondents complete self-
administered questionnaires online. Before dis-
cussing web-based questionnaires, let us turn
our attention to the fundamentals of mail sur-
veys, which might still be used for people with-
out Internet access.

Mail Distribution and Return
The basic method for collecting data through
the mail is transmittal of a questionnaire ac-
companied by a letter of explanation and a self-
addressed, stamped envelope for returning the
questionnaire. You have probably received a
few. As a respondent, you are expected to com-
plete the questionnaire, put it in the envelope,
and mail it back. If, by any chance, you have received
such a questionnaire and failed to return
it, it would be a valuable exercise for you to


do so at all. Properly timed follow-up mailings
provide additional stimuli to respond.

The effects of follow-up mailings may be
seen by monitoring the number of question-
naires received over time. Initial mailings will
be followed by a rise in and subsequent subsid-
ing of returns, and follow-up mailings will spur
a resurgence of returns. In practice, three mail-
ings (an original and two follow-ups) are most
effective.
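
Monitoring returns over time requires little more than a tally of the dates on which questionnaires arrive. The following is a minimal sketch, assuming each returned questionnaire is logged with its date of receipt; the function names and the dates are invented for illustration.

```python
from collections import Counter
from datetime import date


def daily_returns(return_dates):
    """Count questionnaires received on each date, in date order."""
    return sorted(Counter(return_dates).items())


def cumulative_returns(return_dates):
    """Running total of questionnaires received, by date."""
    running = 0
    totals = []
    for day, count in daily_returns(return_dates):
        running += count
        totals.append((day, running))
    return totals


# Invented receipt log: two returns on May 1 and one on May 3.
log = [date(2008, 5, 1), date(2008, 5, 1), date(2008, 5, 3)]
for day, total in cumulative_returns(log):
    print(day.isoformat(), total)
```

Plotting the daily counts against the mailing dates shows the rise and subsiding of returns described above, and suggests when a follow-up mailing is due.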

Acceptable Response Rates
A question frequently asked about mail sur-
veys concerns the percentage return rate that
should be achieved. Note that the body of in-
ferential statistics used in connection with sur-
vey analysis assumes that all members of the
initial sample complete and return their ques-
tionnaires. Because this almost never happens,
response bias becomes a concern. Researchers
must test (and hope for) the possibility that re-
spondents look essentially like a random sam-
ple of the initial sample and thus a somewhat
smaller random sample of the total population.
For example, if the gender of all people in the
sample is known, a researcher can compare the
percentages of males and females indicated on
returned questionnaires with the percentages
for the entire sample.
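
The gender comparison just described can be sketched as follows. This is an illustration only: the function names are hypothetical and the counts are invented, not drawn from any survey. It reports, for each category, the percentage-point gap between respondents and the full sample.

```python
def percent_by_category(values):
    """Return the percentage of cases falling in each category."""
    if not values:
        raise ValueError("need at least one case")
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return {key: 100 * n / len(values) for key, n in counts.items()}


def percentage_point_gaps(sample, respondents):
    """Respondent percentage minus sample percentage, per category."""
    sample_pct = percent_by_category(sample)
    respondent_pct = percent_by_category(respondents)
    return {key: respondent_pct.get(key, 0.0) - pct
            for key, pct in sample_pct.items()}


# Invented data: 100 people sampled, 50 questionnaires returned.
sample = ["male"] * 50 + ["female"] * 50
respondents = ["male"] * 20 + ["female"] * 30
print(percentage_point_gaps(sample, respondents))
```

Large gaps suggest that respondents differ from the initial sample on that characteristic. Small gaps are reassuring but do not rule out response bias on characteristics that were not compared.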

Nevertheless, overall response rate is one
guide to the representativeness of the sample
respondents. If the response rate is high, there
is less chance of significant response bias than
if the rate is low. As a rule of thumb, a response
rate of at least 50 percent is adequate for analysis
and reporting. A response rate of at least 60 per-
cent is good, and a response rate of 70 percent
is very good. Bear in mind that these are only
rough guides; they have no statistical basis, and
a demonstrated lack of response bias is far more
important than a high response rate. Response
rates tend to be higher for surveys that target
a narrowly defined population, whereas general
population surveys yield lower response rates.
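
The rule of thumb above can be written out as a short calculation. In this sketch the function names are hypothetical, and treating 70 percent or more as "very good" is our reading of the guideline rather than something the rule states exactly. As the text stresses, these labels have no statistical basis.

```python
def response_rate(returned, sample_size):
    """Percentage of the initial sample that returned a questionnaire."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 100 * returned / sample_size


def rate_label(rate):
    """Apply the rough 50/60/70 percent guideline to a response rate."""
    if rate >= 70:
        return "very good"
    if rate >= 60:
        return "good"
    if rate >= 50:
        return "adequate"
    return "below the rule-of-thumb minimum"


# Invented example: 300 of 500 questionnaires returned.
rate = response_rate(300, 500)
print(rate, rate_label(rate))
```

A label from this guideline should always be read alongside a check for response bias, which matters more than the rate itself.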

Don Dillman (2006) has undertaken an ex-
tensive review of the various techniques survey

the letter is obviously important. The message
should communicate why a survey is being
conducted, how and why the respondent was
selected, and why it is important for the re-
spondent to complete the questionnaire. In line
with our discussion of the protection of human
subjects in Chapter 2, the cover letter should
also assure respondents that their answers will
be confidential.

Second, the cover letter should identify the
institutional affiliation or sponsorship of the
survey. The two alternatives are (1) an institution
that the respondent respects or can identify
with, or (2) a neutral but impressive-sounding
affiliation. For example, if we are conducting a
mail survey of police chiefs, printing our cover
letter on International Association of Chiefs of
Police (IACP) stationery and having the letter
signed by an official in the IACP might increase
the response rate. Of course, we cannot adopt
such a procedure unless the survey is endorsed
by the IACP.

By the same token, it is important to avoid
controversial affiliations or those inappropriate
for the target population. The National Organization
for the Reform of Marijuana Laws, for
instance, is not suitable for most target populations.
A university affiliation is appropriate
in many cases, unless the university is on bad
terms with the target population.

Follow-Up Mailings
Follow-up mailings may be administered in a
number of ways. In the simplest, nonrespon-
dents are sent a letter of additional encourage-
ment to participate. A better method, however,
is to send a new copy of the survey questionnaire
with the follow-up letter. If potential respon-
dents have not returned their questionnaires
after two or three weeks, the questionnaires
probably have been lost or misplaced.

The methodological literature on follow-up
mailings strongly suggests that they are effec-
tive in increasing return rates in mail surveys.
In general, the longer a potential respondent
delays replying, the less likely he or she is to


The advantages of this method are obvious.
Responses are automatically recorded in computer
files, saving time and money. Web-page
design tools make it possible to create attractive
questionnaires that include contingency ques-
tions, matrixes, and other complex tools for
presenting items to respondents. Dillman, long
recognized for his total design approach to con-
ducting mail surveys, has written a comprehen-
sive guide to conducting mail, web-based, and
other self-administered surveys (Dillman 2006).

All electronic versions of self-administered
questionnaires face a couple of problems. The
first concerns representativeness: will the people
who can be surveyed online be representative
of meaningful populations, such as all U.S.
adults, all registered voters, or all residents of
particular urban neighborhoods? This criti-
cism has also been raised with regard to sur-
veys via fax and, in the mid-20th century, with
regard to telephone surveys. Put in terms that
should be familiar from the previous chapter,
how closely do available sampling frames for
electronic surveys match possible target popu-
lations? If, for example, our target population
is university students, how can we obtain a list
of e-mail addresses or other identifiers that will
enable us to survey a representative sample? It’s
easy to think of other target populations of in-
terest to criminal justice researchers that might
be difficult to reach via e-mail or web-based
questionnaires.

The second problem is an unfortunate con-
sequence of the rapid growth of e-mail and re-
lated technologies. Just as junk mail clutters our
physical mailboxes with all sorts of advertising,
“spam” and other kinds of unwanted messages
pop up all too often in our virtual mailboxes.
The proliferation of junk e-mail has led to the
development of anti-spam filters that screen
out unwanted correspondence. Unfortunately,
such programs can also screen out unfamiliar
but well-meaning mail such as e-mail question-
naires. Similar problems with telemarketing
have made it increasingly difficult to conduct
surveys by telephone.

researchers use to increase return rates on mail
surveys, and he evaluates the impact of each.
More importantly, Dillman stresses the necessity
of paying attention to all aspects of the study—
what he calls the “total design method”—rather
than one or two special gimmicks.

Computer-Based Self-Administration
Advances in computer and telecommunica-
tions technology over the past several decades
have produced additional options for distrib-
uting and collecting self-administered ques-
tionnaires. Jeffrey Walker (1994) describes vari-
ations on conducting surveys by fax machine.
Questionnaires are faxed to respondents, who
are asked to fax their answers back. As the In-
ternet and Web have permeated work and lei-
sure activities, different types of computer-as-
sisted self-administered surveys have become
more common.

David Shannon and associates (Shannon,
Johnson, Searcy, and Lott 2002) describe three
general types of electronic surveys. The first is
a disk-based survey. Respondents load a ques-
tionnaire from a disk or CD into their own
computer, key in responses to survey items, and
then either mail the disk back to researchers or
transmit the information electronically. As the
earliest form of electronic survey, the disk-based
survey is a relic of stand-alone personal com-
puters. Disk-based surveys are now virtually ob-
solete as personal computers are routinely con-
nected to the Web in one way or another.

The second type, e-mail surveys, has a few
variations. Researchers can include a few simple
questions in an e-mail message and ask respon-
dents to reply by e-mail. More elaborate ver-
sions can embed complex formatted question-
naires in e-mail messages. Respondents might
be asked to open an attached file that contains
a questionnaire, or they might be directed to
another web page that contains a formatted
questionnaire. That brings us to the third type
of electronic survey described by Shannon and
associates—a questionnaire posted on a web
page.


tions have replaced letters, check-writing, and
other correspondence, self-administered sur-
veys will increasingly be conducted on web-
based computers. At the end of this chapter, we
list a small sample of resources for conducting
web-based surveys.

In-Person Interview Surveys
Face-to-face interviews are best for complex ques-
tionnaires and other specialized needs.

The in-person interview is an alternative
method of collecting survey data. Rather than
asking respondents to read questionnaires and
enter their own answers, researchers send inter-
viewers to ask the questions orally and record
respondents’ answers. Most interview surveys
require more than one interviewer, although a
researcher might undertake a small-scale inter-
view survey alone.

The Role of the Interviewer
Not surprisingly, in-person interview surveys
typically attain higher response rates than mail
surveys. Respondents seem more reluctant to
turn down an interviewer who is standing on
their doorstep than to throw away a mail ques-
tionnaire. A properly designed and executed
interview survey ought to achieve a completion
rate of at least 80 to 85 percent.

The presence of an interviewer generally de-
creases the number of “don’t know” and “no an-
swer” responses. If minimizing such responses
is important to the study, the interviewer can
be instructed to probe for answers (“If you had
to pick one of the answers, which do you think
would come closest to your feelings?”).

The interviewer can also help respondents
with confusing questionnaire items. If the re-
spondent clearly misunderstands the intent of
a question, the interviewer can clarify matters
and thereby obtain a relevant response. Such
clarifications must be strictly controlled, however,
through formal specifications. Finally,
the interviewer can observe as well as ask ques-
tions. For example, the interviewer can make

In yet another example of technology ad-
vances being accompanied by new threats, the
spread of computer viruses has made people
cautious about opening e-mail or attachments
from unfamiliar sources. This problem, and
the electronic version of junk mail, can be ad-
dressed in a manner similar to warning mail-
ings for printed questionnaires. Before sending
an e-mail with an embedded or linked question-
naire, researchers can distribute e-mail mes-
sages from trusted sources that warn recipients
to expect to receive a questionnaire, and urge
them to complete it.

We should keep one basic principle in mind
when considering whether a self-administered
questionnaire can be distributed electronically:
web-based surveys depend on access to the Web,
which, of course, implies having a computer.
The use of computers and the Web continues
to increase rapidly. Although access to this
technology still is unequally distributed across
socioeconomic classes, web-based surveys can
be readily conducted for many target popula-
tions of interest to criminal justice researchers.

In their recommendations for rethinking
crime surveys, Maxfield and associates argue
that web-based surveys are well-suited for learn-
ing more about victims of computer-facilitated
fraud. Since only people with Internet access
are possible victims, an Internet-based sample
is ideal (Maxfield, Hough, and Mayhew 2007).
Mike Sutton (2007) describes other examples
of nontraditional crimes where Internet sam-
ples of computer users are appropriate.

Recalling our discussion in Chapter 6, the
correspondence between a sampling frame and
target population is a crucial feature of sam-
pling. Most justice professionals and criminal
justice organizations routinely use the Web
and e-mail. Lower-cost but generalizable victim
surveys can use web-based samples of univer-
sity students to distribute questionnaires. Or
printed warning letters can be mailed, inviting
respondents to complete either traditional or
e-mail self-administered questionnaires. Just as
e-mail, electronic bill-paying, and other transac-


neighborhood crime problems, the respondent
might simply reply, “Pretty bad.” The inter-
viewer could obtain an elaboration on this re-
sponse through a variety of probes. Sometimes,
the best probe is silence; if the interviewer sits
quietly with pencil poised, the respondent will
probably fill the pause with additional comments.
Appropriate verbal probes are “How is
that?” and “In what ways?” Perhaps the most
generally useful probe is “Anything else?”

In every case, however, it is imperative that
the probe be completely neutral. The probe
must not in any way affect the nature of the
subsequent response. If we anticipate that a
given question may require probing for appro-
priate responses, we should write one or more
useful probes next to the item in the question-
naire. This practice has two important advan-
tages. First, it allows for more time to devise the
best, most neutral probes. Second, it ensures
that all interviewers will use the same probes as
needed. Thus even if the probe is not perfectly
neutral, the same stimulus is presented to all
respondents. This is the same logical guideline
as for question wording. Although a question
should not be loaded or biased, it is essential
that every respondent be presented with the
same question, even if a biased one.

Coordination and Control
Whenever more than one interviewer will ad-
minister a survey, it is essential that the efforts
be carefully coordinated and controlled. Two
ways to ensure this control are by (1) training
interviewers and (2) supervising them after they
begin work.

Whether the researchers will be administer-
ing a survey themselves or paying a professional
firm to do it for them, they should be attentive
to the importance of training interviewers. The
interviewers usually should know what the
study is all about. Even though the interview-
ers may be involved only in the data collection
phase of the project, they should understand
what will be done with the information they
gather and what purpose will be served.

observations about the quality of the dwelling,
the presence of various possessions, the respon-
dent’s ability to speak English, the respondent’s
general reactions to the study, and so forth.

Survey research is, of necessity, based on an
unrealistic stimulus–response theory of cogni-
tion and behavior. That is, it is based on the as-
sumption that a questionnaire item will mean
the same thing to every respondent, and every
given response must mean the same thing when
given by different respondents. Although this is
an impossible goal, survey questions are drafted
to approximate the ideal as closely as possible.
The interviewer also plays a role in this ideal sit-
uation. The interviewer’s presence should not
affect a respondent’s perception of a question or
the answer given. The interviewer, then, should
be a neutral medium through which questions
and answers are transmitted. If this goal is met,
different interviewers will obtain the same re-
sponses from a given respondent, an example of
reliability in measurement (see Chapter 4).

Familiarity with the Questionnaire The in-
terviewer must be able to read the question-
naire items to respondents without stumbling
over words and phrases. A good model for in-
terviewers is the actor reading lines in a play or
film. The interviewer must read the questions
as though they are part of a natural conversa-
tion, but that “conversation” must precisely
follow the language set down in the question.

By the same token, the interviewer must be
familiar with the specifications for administering
the questionnaire. Inevitably, some questions
will not exactly fit a given respondent’s situation,
and the interviewer must determine how
those questions should be interpreted in that
situation. The specifications provided to the interviewer
should include adequate guidelines in
such cases, but the interviewer must know the
organization and content of the specifications
well enough to refer to them efficiently.

Probing for Responses Probes are frequently
required to elicit responses to open-ended
questions. For example, to a question about


Computer-Assisted Interviewing in the BCS
Early waves of the BCS, a face-to-face inter-
view survey, asked respondents to complete a
self-administered questionnaire about drug
use, printed as a small booklet that was prominently
marked “Confidential.” Beginning with
the 1994 survey, respondents answered self-
report questions on laptop computers. The BCS
includes two related versions of CAI. In comput-
er-assisted personal interviewing (CAPI), inter-
viewers read questions from computer screens,
instead of printed questionnaires, and then key
in respondents’ answers. For self-report items,
interviewers hand the computers to subjects,
who then key in the responses themselves. This
approach is known as computer-assisted self-
interviewing (CASI). In addition, CASI as used
in the BCS is supplemented with audio instruc-
tions—respondents listen to interview prompts
on headphones connected to the computer.
After subjects key in their responses to self-
report items, the answers are scrambled so
the interviewer cannot access them. Notice
how this feature of CASI enhances the researcher’s
ethical obligation to keep responses
confidential.

Malcolm Ramsay and Andrew Percy (1996)
report that CASI had at least two benefits. First,
respondents seemed to sense a greater degree of
confidentiality when they responded to questions
on a computer screen as opposed to questions
on a written form. Second, the laptop
computers were something of a novelty that
stimulated respondents’ interest; this was espe-
cially true for younger respondents.

Examining results from the BCS reveals that
CASI techniques produced higher estimates of
illegal drug use than those revealed in previous
surveys. Table 7.1 compares self-reported drug
use from the 1998 BCS (Ramsay and Partridge
1999) with results from the 1992 BCS (Mott
and Mirrlees-Black 1995), in which respon-
dents answered questions in printed booklets.
We present results for only three drugs here,
together with tabulations about the use of
any drug. For each drug, the survey measured

There may be some exceptions to this, how-
ever. In their follow-up study of child abuse
victims and controls, Cathy Spatz Widom and
associates (Widom, Weiler, and Cotler 1999)
did not inform the professional interviewers
who gathered data that their interest was in the
long-term effects of child abuse. This safeguard
was used to avoid even the slightest chance
that interviewers’ knowledge of the study focus
would affect how they conducted interviews.

Obviously, training should ensure that in-
terviewers understand the questionnaire. In-
terviewers should also understand procedures
to select respondents from among household
members. And interviewers should recognize
circumstances in which substitute sample ele-
ments may be used in place of addresses that
no longer exist, families who have moved, or
persons who simply refuse to be interviewed.

Training should include practice sessions in
which interviewers administer the questionnaire
to one another. The final stage of the training
should involve some real interviews conducted
under conditions like those in the survey.

While interviews are being conducted, it is a
good idea to review questionnaires as they are
completed. This may reveal questions or groups
of questions that respondents do not under-
stand. By reviewing completed questionnaires
it is also possible to determine if interviewers
are completing items accurately.

Computer-Assisted
In-Person Interviews
Just as e-mail and web-based surveys apply
new technology to the gathering of survey data
through self-administration, laptop and hand-
held computers are increasingly being used to
conduct in-person interviews. Different forms
of computer-assisted interviewing (CAI) of-
fer major advantages in the collection of survey
data. At the same time, CAI has certain disad-
vantages that must be considered. We’ll begin
by describing an example of how this technol-
ogy was adopted in the BCS, one of the earliest
uses of CAI in a general-purpose crime survey.

188 Part Three Modes of Observation

• Questionnaires for self-interviewing can be
programmed in different languages, readily
switching to the language appropriate for a
particular respondent.

• Audio-supplemented CASI produces a stan-
dardized interview, avoiding any bias that
might emerge from interviewer effects.

• Audio supplements, in different languages,
facilitate self-interviews of respondents who
cannot read.

At the same time, CAI has certain disad-
vantages that preclude its use in many survey
applications:

• Although computers become more of a bar-
gain each day, doing a large-scale in-person
interview survey requires providing comput-
ers for each interviewer. Costs can quickly
add up. CAI also requires specialized soft-
ware to format and present on-screen
questionnaires.

• Although CAI reduces costs in data process-
ing, it requires more up-front investment
in programming questionnaires, skip se-
quences, and the like.

lifetime use (“Ever used?”) and use in the past
12 months.

Notice that rates of self-reported use were
substantially higher in 1998 than in 1992,
with the exception of “semeron” use, reported
by very few respondents in 1992 and none in
1998. If you’ve never heard of semeron, you’re
not alone. It’s a fictitious drug, included in the
list of real drugs to detect untruthful or exag-
gerated responses. If someone confessed to
using semeron, his or her responses to other
self-reported items would be suspect. Notice
in Table 7.1 that CASI use in 1998 reduced the
number of respondents who admitted using a
drug that doesn’t exist.
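
The logic of the fictitious-drug check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flag_suspect`, the data layout, and the sample records are hypothetical and are not taken from the BCS.

```python
# Hypothetical sketch: flag respondents who report using a fictitious drug.
# "semeron" is the made-up drug described in the text; everything else
# (names, data layout, sample records) is assumed for illustration.
FICTITIOUS_DRUGS = {"semeron"}

def flag_suspect(responses):
    """Return IDs of respondents who report using any fictitious drug."""
    suspect = []
    for respondent_id, drugs_ever_used in responses.items():
        reported = {drug.lower() for drug in drugs_ever_used}
        if FICTITIOUS_DRUGS & reported:
            suspect.append(respondent_id)
    return suspect

# Invented records, not survey data.
sample = {
    "r1": ["cannabis"],
    "r2": ["amphetamines", "Semeron"],
    "r3": [],
}
print(flag_suspect(sample))
```

A researcher might treat flagged respondents' other self-report answers with caution rather than discard them outright.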

CASI has also been used in the BCS since
1996 to measure domestic violence commit-
ted by partners and ex-partners of males and
females aged 16 to 59. Catriona Mirrlees-Black
(1999) reports that CASI techniques reveal
higher estimates of domestic violence victim-
ization among both females and males.

Advantages and Disadvantages Different
types of CAI offer a number of advantages for
in-person interviews. The BCS and other sur-
veys that include self-report items indicate that
CAI is more productive in that self-reports of
drug use and other offending tend to be higher.
In 1999, the National Household Survey on
Drug Abuse shifted completely to CAI (Wright,
Barker, Gfroerer, and Piper 2002). Other advan-
tages include the following:

• Responses can be quickly keyed in and automatically
reformatted into data files for
analysis.

• Complex sequences of contingency questions
can be automated. Instead of printing
many examples of “If answer is yes, go to
question 43a; if no … ,” computer-based
questionnaires automatically jump to the
next appropriate question contingent on re-
sponses to earlier ones.

• CAI offers a way to break up the monotony
of a long interview by shifting from verbal
interviewer prompts to self-interviewing,
with or without an audio supplement.

Table 7.1 Self-Reported Drug Use, 1992 and 1998 British Crime Survey

Percentage of Respondents Ages 16–29 Who Report Use

                                   1992    1998
Marijuana or cannabis
  Ever used?                         24      42
  Used in previous 12 months?        12      23
Amphetamines
  Ever used?                          9      20
  Used in previous 12 months?         4       8
Semeron
  Ever used?                        0.3     0.0
  Used in previous 12 months?       0.1     0.0
Any drug
  Ever used?                         28      49
  Used in previous 12 months?        14      25

Source: 1992 data adapted from Mott and Mirrlees-Black
(1995, 41–42); 1998 data adapted from Ramsay and
Partridge (1999, 68–71).

respondents may be more honest in giving so-
cially disapproved answers if they don’t have to
look the questioner in the eye. Similarly, it may
be possible to probe into more sensitive areas,
although that is not necessarily the case. People
are, to some extent, more suspicious when they
can’t see the person asking them questions—
perhaps a consequence of telemarketing and
salespeople conducting bogus surveys before
making sales pitches.

Telephone surveys can give a researcher
greater control over data collection if several in-
terviewers are engaged in the project. If all the
interviewers are calling from the research office,
they can get clarification from the supervisor
whenever problems occur, as they inevitably
do. Alone in the field, an interviewer may have
to wing it between weekly visits with the inter-
viewing supervisor.

A related advantage is rooted in the grow-
ing diversity of U.S. cities. Because many major
cities have growing immigrant populations,
interviews may need to be conducted in differ-
ent languages. Telephone interviews are usually
conducted from a central site, so that one or
more multilingual interviewers can be quickly
summoned if an English-speaking interviewer
makes contact with, say, a Spanish-speaking re-
spondent. In-person interview surveys present
much more difficult logistical problems in handling
multiple languages. And mail surveys require
printing and distributing questionnaires
in different languages.

Telephone interviewing has its problems,
however. Telephone surveys are limited by definition
to people who own telephones. Years
ago, this method produced a substantial social
class bias by excluding poor people. Over time,
however, the telephone has become a standard
fixture in almost all American homes. The U.S.
Census Bureau estimates that 95.5 percent of
all households now have telephones, so the ear-
lier class bias has been substantially reduced
(U.S. Bureau of the Census 2006, Table 1117).
The NCVS, traditionally an in-person interview,
has increased its use of telephone interviews as
part of the crime survey’s redesign.

• Automated skip sequences for contingency
questions are great, but if something goes
wrong with the programmed questionnaire,
all sorts of subsequent problems are pos-
sible. As Emma Forster and Alison McCleery
(1999) point out, such question-routing
mistakes might mean whole portions of a
questionnaire are skipped. Whereas occa-
sional random errors are possible with pen-
and-paper interviews, large-scale systematic
error can happen with CAI technology.

• Proofreading printed questionnaires is
straightforward, but it can be difficult to audit
a computerized questionnaire. Doing so
might require special technical skills.

• It can be difficult to print and archive a complex
questionnaire used in CAI. This was a
problem with early applications of CAI tech-
nology, but improvements in software are
helping to solve it.

• Batteries of laptops run down, and comput-
ers and software are more vulnerable to mal-
functions and random weirdness than are
stacks of printed questionnaires.

In sum, CAI can be costly and requires some
specialized skills. As a result, these and related
technologies are best suited for use by profes-
sional survey researchers or research centers
that regularly conduct large-scale in-person in-
terviews. We will return to this issue in the con-
cluding section of this chapter.

Telephone Surveys
Telephone surveys are fast and relatively low cost.

Telephone surveys have many advantages that
make them a popular method. Probably the
greatest advantages involve money and time. In
a face-to-face household interview, a researcher
may drive several miles to a respondent’s home,
find no one there, return to the research office,
and drive back the next day—possibly finding
no one there again.

Interviewing by telephone, researchers can
dress any way they please, and it will have no
effect on the answers respondents give. And

should plan on making five calls to reach two
households.
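
The arithmetic behind this rule of thumb can be made explicit. If roughly 60 percent of generated numbers are ineligible, about 40 percent are eligible, so reaching two households takes about 2 / 0.40 = 5 dials. The short Python sketch below generalizes that calculation; the function name `dials_needed` is an assumption introduced here, and the sketch counts only eligible numbers reached, ignoring refusals and unanswered calls.

```python
def dials_needed(target_households, ineligible_pct):
    """Expected RDD dials needed to reach `target_households` eligible numbers.

    `ineligible_pct` is a whole-number percentage (60 for Weisel's rule of
    thumb). Integer arithmetic avoids floating-point rounding; the result
    is rounded up to a whole number of dials.
    """
    if not 0 <= ineligible_pct < 100:
        raise ValueError("ineligible_pct must be at least 0 and below 100")
    eligible_pct = 100 - ineligible_pct
    # Ceiling division: smallest number of dials whose eligible share
    # covers the target.
    return -(-target_households * 100 // eligible_pct)

print(dials_needed(2, 60))    # the five-calls-for-two-households rule
print(dials_needed(400, 60))  # a hypothetical target of 400 households
```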

Another developing problem is the increas-
ing number of households that have mobile
phone service only. Stephen Blumberg (Blum-
berg, Luke, and Cynamon 2006) and associates
report that about 7 percent of households in a
United States national sample have only mobile
phones. Jan van Dijk (2007) describes how this
is especially troublesome in some European
countries where mobile-only households are
more common.

The most difficult challenge with telephone
surveys involves the explosion of telemarket-
ing. The volume of junk phone calls rivals that
of junk mail, and salespeople often begin their
pitch by describing a “survey.” The viability of
legitimate surveys is now hampered by the pro-
liferation of bogus surveys, which are actually
sales campaigns disguised as research. As if that
weren’t bad enough, telemarketing has become
so annoying that many people simply hang up
whenever they hear a strange voice announcing
some institutional affiliation.

Following action in several states, the Federal
Trade Commission approved new regulations
restricting telemarketing by enabling individu-
als to place their phone numbers on a national
do-not-call registry (Telemarketing Sales Rule
2003). The rule went into effect in October 2003
after withstanding vigorous legal challenges by
the telemarketing industry. The impact of this
regulation remains to be seen. On the one hand,
it has the potential to make life easier for legiti-
mate telephone survey researchers. On the other
hand, since conducting research is identified as
a legitimate activity under the Telemarketing
Sales Rule, it may induce more unscrupulous
firms to disguise their sales calls as surveys.

Computer-Assisted
Telephone Interviewing
Much of the growth in telemarketing has been
fueled by advances in computer and telecom-
munications technology. Beginning in the
1980s, much of the same technology came to

At the same time, phone surveys are much
less suitable for individuals not living in house-
holds. Homeless people are obvious examples;
those who live in institutions are also difficult
to reach out and touch via telephone. Patricia
Tjaden and Nancy Thoennes (2000) cite single-
adult households and people living in rural or
inner-city areas as targets least likely to have
telephone coverage.

A related sampling problem involves un-
listed numbers. If the survey sample is selected
from the pages of a local telephone directory,
it totally omits all those people who have re-
quested that their numbers not be published.
Similarly, those who recently moved and tran-
sient residents are not well represented in pub-
lished telephone directories. This potential bias
has been eliminated through random-digit
dialing (RDD), a technique that has advanced
telephone sampling substantially.

RDD samples use computer algorithms to
generate lists of random telephone numbers—
usually the last four digits. This procedure gets
around the sampling problem of unlisted tele-
phone numbers but may substitute an admin-
istrative problem. Randomly generating phone
numbers produces numbers that are not in op-
eration or that serve a business establishment
or pay phone. In most cases, businesses and pay
phones are not included in the target popula-
tion; dialing these numbers and learning that
they’re out of scope will take time away from
producing completed interviews with the tar-
get population.
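
As a rough illustration of the procedure just described, the Python sketch below generates telephone numbers by randomizing the last four digits within a fixed area code and exchange. The function name `rdd_numbers`, the "555-010" prefix, and the seed are placeholders assumed for this example; production RDD designs are more elaborate than this.

```python
import random

def rdd_numbers(area_code, exchange, count, seed=None):
    """Generate `count` distinct numbers by randomizing the last four digits."""
    rng = random.Random(seed)
    # Sample without replacement so that no number is dialed twice.
    suffixes = rng.sample(range(10_000), count)
    return [f"{area_code}-{exchange}-{suffix:04d}" for suffix in suffixes]

# Placeholder prefix; a real sample would draw on working exchanges
# in the target area.
for number in rdd_numbers("555", "010", 5, seed=1):
    print(number)
```

Nothing in this sketch knows whether a generated number is in service or belongs to a household, which is exactly the administrative problem noted above.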

Weisel (1999) offers an excellent description
of RDD samples for use in community crime
surveys. Among the increasingly important is-
sues in RDD samples is the growing number
of telephone numbers in use for cell phones,
pagers, and home access to the Internet. This
means that the rate of ineligible telephone
numbers generated through RDD is increas-
ing. Weisel’s rule of thumb is that ineligible
numbers account for an estimated 60 percent
of phone numbers produced by typical RDD
procedures. Thus, if RDD is used, researchers

rapidly gets to the next appropriate question.
This can be especially handy in a victim survey,
in which affirmative answers to screening questions
automatically bring up detailed questions
about each crime incident.
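
The two accuracy checks described for CATI, accepting only valid response codes and automating skip sequences, can be sketched together in Python. This is a minimal illustration, not how any particular CATI package works: the item names, the routing table, and the choice to raise an error (rather than beep and prompt again) are all assumptions.

```python
# Hypothetical three-item screener. Each item lists its valid codes
# (None means open-ended) and a rule giving the next item to ask.
QUESTIONNAIRE = {
    "gender": {
        "valid": {"f", "m"},
        "next": lambda answer: "burglary",
    },
    "burglary": {
        "valid": {"yes", "no"},
        # Contingency question: details are asked only after a "yes".
        "next": lambda answer: "burglary_detail" if answer == "yes" else "end",
    },
    "burglary_detail": {
        "valid": None,
        "next": lambda answer: "end",
    },
}

def run_interview(keyed_answers):
    """Validate each keyed answer and follow the skip rules to the end."""
    item, record = "gender", {}
    answers = iter(keyed_answers)
    while item != "end":
        spec = QUESTIONNAIRE[item]
        answer = next(answers)
        if spec["valid"] is not None and answer not in spec["valid"]:
            # A real system would refuse to proceed and ask again.
            raise ValueError(f"{item}: invalid response {answer!r}")
        record[item] = answer
        item = spec["next"](answer)
    return record

print(run_interview(["f", "no"]))
print(run_interview(["m", "yes", "window broken"]))
```

Because the routing lives in one table, a single mistake in a rule would misroute every interview, which is the systematic-error risk raised in the discussion of CAI disadvantages.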

Comparison of
the Three Methods
Cost, speed, and question content are issues to con-
sider in selecting a survey method.

We’ve now examined three ways of collecting
survey data: self-administered questionnaires,
in-person interviews, and telephone surveys. We
have also considered some recent advances in
each mode of administration. Although we’ve
touched on some of the relative advantages and
disadvantages of each, let’s take a minute to
compare them more directly.

Self-administered questionnaires are gen-
erally cheaper to use than interview surveys.
Moreover, for self-administered e-mail or web-
based surveys, it costs no more to conduct a
national survey than a local one. Obviously, the
cost difference between a local and a national
in-person interview survey is considerable. Telephone
surveys are somewhere in between. Although
national surveys can inflate costs somewhat
through long-distance telephone charges,
flat-rate long-distance service can be negotiated
or Internet-based phone service might be used.
Mail surveys typically require a small staff. One
person can conduct a reasonable mail survey,
although it is important not to underestimate
the work involved.

Up to a point, cost and speed are related.
In-person interview surveys can be completed
very quickly if a large pool of interviewers
is available and funding is adequate to pay
them. In contrast, if a small number of people
are conducting a larger number of face-to-face
interviews, costs are generally lower, but the
survey takes much longer to complete. Tele-
phone surveys that use CATI technology are the
fastest.

be widely used in telephone surveys, referred
to as computer-assisted telephone interviewing
(CATI). Perhaps you’ve occasionally marveled
at news stories that report the results of a na-
tionwide opinion poll the day after some major
speech or event. The speed of CATI technology,
coupled with RDD, makes these instant poll re-
ports possible.

Interviewers wearing telephone headsets sit
at computer workstations. Computer programs
dial phone numbers, which can be either gener-
ated through RDD or extracted from a database
of phone numbers compiled from a source. As
soon as phone contact is made, the computer
screen displays an introduction (“Hello, my
name is . . . calling from the Survey Research
Center at Ivory Tower University”) and the first
question to be asked, often a query about the
number of residents who live in the household.
As interviewers key in answers to each question,
the computer program displays a new screen
that presents the next question, until the end
of the interview is reached.

CATI systems offer several advantages over
procedures in which an interviewer works
through a printed interview schedule. Speed is
one obvious plus. Forms on computer screens
can be filled in more quickly than can paper
forms. Typing answers to open-ended questions
is much faster than writing them by hand. And
CATI software immediately formats responses
into a data file as they are keyed in, which eliminates
the step of manually transferring answers
from paper to computer.

Accuracy is also enhanced by CATI systems
in several ways. First, CATI programs can be de-
signed to accept only valid responses to a given
questionnaire item. For example, if valid responses
to respondent “gender” are f for female
and m for male, the computer will accept only
those two letters, emitting a disagreeable noise
and refusing to proceed if something else is
keyed in. Second, the software can be pro-
grammed to automate contingency questions
and skip sequences, thus ensuring that the in-
terviewer skips over inappropriate items and

hood, the dwelling unit, and so forth. They may
also note characteristics of the respondents
or the quality of their interaction with the
respondents—whether the respondent had difficulty
communicating, was hostile, seemed to
be lying, and so on. Finally, when the safety of
interviewers is an issue, a mail or phone survey
may be the best option.

Ultimately, researchers must weigh all these
advantages and disadvantages of the three
methods against research needs and available
resources.

Strengths and Weaknesses
of Survey Research
Surveys tend to be high on reliability and generaliz-
ability, but validity can often be a weak point.

Like other modes of collecting data in crimi-
nal justice research, surveys have strengths and
weaknesses. It is important to consider these in
deciding whether the survey format is appropriate
for a specific research purpose.

Surveys are particularly useful in describ-
ing the characteristics of a large population.
The NCVS has become an important tool for
researchers and public officials because of its
ability to describe levels of crime. A carefully se-
lected probability sample, in combination with
a standardized questionnaire, allows researchers
to make refined descriptive statements
about a neighborhood, a city, a nation, or some
other large population.

Standardized questionnaires have an im-
portant advantage in regard to measurement.
Earlier chapters discussed the ambiguous na-
ture of concepts: they ultimately have no real
meanings. One person’s view about, say, crime
seriousness or punishment severity is quite dif-
ferent from another’s. Although we must be
able to define concepts in ways that are most
relevant to research goals, it’s not always easy
to apply the same definitions uniformly to all
subjects. Nevertheless, the survey researcher
is bound to the requirement of having to ask
exactly the same questions of all subjects and

Self-administered surveys may be more ap-
propriate to use with especially sensitive issues
if the surveys offer complete anonymity. Re-
spondents are sometimes reluctant to report
controversial or deviant attitudes or behav-
iors in interviews, but they may be willing to
respond to an anonymous self-administered
questionnaire. However, the successful use of
computers for self-reported items in the BCS
and the National Household Survey on Drug
Use and Health indicates that interacting with
a machine can promote more candid responses.
This is supported by experimental research
comparing different modes of questionnaire
administration (Tourangeau and Smith 1996).

Interview surveys have many advantages, too.
For example, in-person or telephone surveys
are more appropriate when respondent literacy
may be a problem. Interview surveys also result
in fewer incomplete questionnaires. Respon-
dents may skip questions in a self-administered
questionnaire, but interviewers are trained not
to do so. CAI offers a further check on this in
telephone and in-person surveys.

Although self-administered questionnaires
may be more effective in dealing with sensitive
issues, interview surveys are definitely more
effective in dealing with complicated ones. In-
terviewers can explain complex questions to re-
spondents and use visual aids that are not pos-
sible in mail or phone surveys.

In-person interviews, especially with com-
puter technology, can also help reduce response
sets. Respondents (like students?) eventually
become bored listening to a lengthy series of
similar types of questions. It’s easier to main-
tain individuals’ interest by changing the kind
of stimulation they are exposed to. A mix of
questions verbalized by a person, presented on
a computer screen, and heard privately through
earphones is more interesting for respondents
and reduces fatigue.

Interviewers who question respondents face
to face are also able to make important observa-
tions aside from responses to questions asked
in the interview. In a household interview, they
may summarize characteristics of the neighbor-

tic violence is measured in the context of
a crime survey, and some women may not
see what happened to them as “crime,” or
be reluctant to do so. Also, there is little
time to approach the topic “gently.” A spe-
cially designed questionnaire with care-
fully selected interviewers may well have
the edge here.

In recent years, both the NCVS and the BCS
have been revised to produce better measures of
domestic and intimate violence. Estimates of
domestic violence increased in the 1996 wave of
the BCS, and researchers think that the increase
reflects a greater willingness by respondents to
discuss domestic violence with interviewers
(Mirrlees-Black, Mayhew, and Percy 1996). The
use of CASI in the BCS has produced higher
estimates of victimization prevalence among
women, as well as the first measurable rates of
domestic violence victimization for males (Mir-
rlees-Black 1999).

Survey research is generally weaker on valid-
ity and stronger on reliability. In comparison
with field research, for instance, the artificiality
of the survey format puts a strain on validity.
As an illustration, most researchers agree that
fear of crime is not well measured by the stan-
dard question “How safe do you feel, or would
you feel, out alone in your neighborhood at
night?” Survey responses to that question are,
at best, approximate indicators of what we have
in mind when we conceptualize fear of crime.

Reliability is a different matter. By present-
ing all subjects with a standardized stimu-
lus, survey research goes a long way toward
eliminating unreliability in observations made
by the researcher.

However, even this statement is subject to
qualification. Critics of survey methods argue
that questionnaires for standard crime surveys
and many specialized studies embody a narrow,
legalistic conception of crime that cannot reflect
the perceptions and experiences of minorities
and women. Survey questions typically are
based on male views and do not adequately tap
victimization or fear of crime among women

having to impute the same intent to all respon-
dents giving a particular response.

At the same time, survey research has its weaknesses.
First, the requirement for standardization
might mean that we are trying to fit round
pegs into square holes. Standardized question-
naire items often represent the least common
denominator in assessing people’s attitudes,
orientations, circumstances, and experiences. By
designing questions that are at least minimally
appropriate to all respondents, we may miss
what is most appropriate to many respondents.
In this sense, surveys often appear superficial in
their coverage of complex topics.

Using surveys to study crime and criminal
justice policy presents special challenges. The
target population frequently includes lower-
income, transient persons who are difficult to
contact through customary sampling methods.
For example, homeless persons are excluded
from any survey that samples households, but
people who live on the street no doubt figure
prominently as victims and offenders. Maxfield
(1999) describes how new data from the
National Incident-Based Reporting System
suggest that a number of “non-household-
associated” persons are systematically under-
counted by sampling procedures used in the
NCVS. Crime surveys such as the NCVS and
the BCS have been deficient in getting information
about crimes of violence when the victim
and offender have a prior relationship. This is
particularly true for domestic violence.

Underreporting of domestic violence ap-
pears to be due, in part, to the very general
nature of large-scale crime surveys. Catriona
Mirrlees-Black (1995, 8) of the British Home Office
summarizes the trade-offs of using survey
techniques to learn about domestic violence:

Measuring domestic violence is difficult
territory. The advantage of the BCS is that
it is based on a large nationally representa-
tive sample, has a relatively high response
rate, and collects information on enough
incidents to provide reliable details of their
nature. One disadvantage is that domes-

Another approach is to study one or two (or
some other small number) correctional institu-
tions intensively. We might interview a psychol-
ogist in each institution and present questions
about various approaches to drug treatment
therapy. In all likelihood, we will use, not a
highly structured questionnaire, but rather a
list of questions or topics we wish to discuss
with each subject. And we will treat the inter-
view as more of a directed conversation than a
formal interview. Of course, we cannot general-
ize from interviews with one or two prison psy-
chologists to any larger population. However,
we will gain an understanding (and probably a
more detailed one) of how staff psychologists
in specific institutions feel about different drug
treatment programs.

Specialized interviewing asks questions of
a small number of subjects, typically using an
interview schedule that is much less structured
than that in sample surveys. Michael Quinn
Patton (2001) distinguishes two variations of
specialized interviews. The less structured al-
ternative is to prepare a general interview guide
that includes the issues, topics, or questions the
researcher wishes to cover. Issues and items are
not presented to respondents in any standard-
ized order. The interview guide is more like a
checklist than an interview schedule, ensuring
that planned topics are addressed at some point
in the interview. The standardized open-ended
interview, in contrast, is more structured, using
specific questions arranged in a particular
order. The researcher presents each respondent
with the same questions in the same sequence
(subject to any contingency questions). The
questions are open-ended, but their format and
presentation are standardized.

To underscore the flexibility of specialized
interviewing, Patton describes how the two ap-
proaches can be used in combination (2001,
347):

A conversational strategy can be used
within an interview guide approach, or you
can combine a guide approach. . . . This

(Straus 1999; Tjaden and Thoennes 2000).
Concern that survey questions might mean dif-
ferent things to different respondents raises im-
portant questions about reliability and about
the generalizability of survey results across sub-
groups of a population.

As with all methods of observation, a full
awareness of the inherent or probable weak-
nesses of survey research may partially resolve
them. Ultimately, though, we are on the safest
ground when we can use several different re-
search methods to study a given topic.

Other Ways of
Asking Questions
Specialized interviews and focus groups are alterna-
tive ways of gathering question-based data.

Sample surveys are perhaps the best-known
application of asking questions as a data-
gathering strategy for criminal justice research.
Often, however, more specialized interviewing
techniques are appropriate.

Specialized Interviewing
No precise definition of the term survey enables
us to distinguish a survey from other types of
interview situations. As a rule of thumb, a sam-
ple survey (even one that uses nonprobability
sampling methods) is an interview-based tech-
nique for generalizing to a larger population
using a standardized questionnaire. In contrast,
specialized interviewing focuses on the views
and opinions of only those individuals who are
interviewed.

Let’s say we are interested in how mental
health professionals view different drug treat-
ment programs for prison inmates. One ap-
proach is to conduct a sample survey of psy-
chologists who work in state correctional
facilities in which each sampled psychologist
completes a structured questionnaire concern-
ing drug treatment programs. This approach
will enable us to generalize to the population of
state prison psychologists.

cus groups have proved to be more suitable for
many market research applications. In recent
years, focus groups have commonly been used
as substitutes for surveys in criminal justice
and other social scientific research.

In a focus group, 8 to 15 people are brought
together in a room to engage in a guided group
discussion of some topic. Although focus
groups cannot be used to make statistical esti-
mates about a population, members are never-
theless selected to represent a target population.
Richard Krueger and Mary Anne Casey (2000)
describe focus groups, their applications, and
their advantages and disadvantages in detail.

For example, the location of community
correctional facilities such as work-release cen-
ters and halfway houses often prompts a classic
“Not in my backyard!” (NIMBY) response from
people who live in neighborhoods where pro-
posed facilities will be built. Recognizing this,
a mayor who wants to find a suitable site without
annoying neighborhood residents (voters)
is well advised to convene a focus group that
includes people who live in areas near possible
facility locations. A focus group can test the
“market acceptability” of a work-release center,
which might include the best way to package
and sell the product. Such an exercise might
reveal that an appeal to altruism (“We all have
to make sacrifices in the fight against crime”)
is much less effective in gaining support than
an alternative sales pitch that stresses potential
economic benefi ts (“This new facility will pro-
vide jobs for neighborhood residents”).

Generalizations from focus groups to target
populations cannot be precise; however, a study
by V. M. Ward and associates (Ward, Bertrand,
and Brown 1991) found that focus group and
survey results can be quite consistent under
certain conditions. They conclude that focus
groups are most useful in two cases: (1) when
precise generalization to a larger population
is not necessary, and (2) when focus group
participants and the larger population they
are intended to represent are relatively homo-
geneous. So, for example, a focus group is not

combined strategy offers the interviewer
flexibility in probing and in determining
when it is appropriate to explore certain
subjects in greater depth, or even to pose
questions about new areas of inquiry that
were not originally anticipated in the in-
terview instrument’s development.

Open-ended questions are ordinarily used
because they capture rich detail better. The pri-
mary disadvantage of open-ended questions—
having to categorize responses—is not a prob-
lem in specialized interviewing because of the
small number of subjects and because research-
ers are more interested in describing than in
generalizing.

Specialized interviewing can be incorpo-
rated into any research project as a supplemen-
tary source of information. If, for example,
we are interested in the effects of determinate
sentencing on prison populations, we can ana-
lyze data from the Census of State Adult Cor-
rectional Facilities, conducted by the BJS. We
might also interview a small number of correc-
tions administrators, perhaps asking them to
react to our data analysis. Evaluation studies
and other applied research projects frequently
use specialized interviewing techniques, alone
or in combination with other sources of data.

Focus Groups
Like sample surveys, focus group techniques
were refined by market research firms in the
years following World War II. As the name implies,
market research explores questions about
the potential for sales of consumer products.
Because a firm may spend millions of dollars
developing, advertising, and distributing some
new item, market research is an important tool
to test consumer reactions before large sums of
money are invested in a product.

Surveys have two disadvantages in market
research. First, a nationwide or large-scale prob-
ability survey can be expensive. Second, it may
be difficult to present advertising messages or
other product images in a survey format. Fo-

196 Part Three Modes of Observation

select participants from a specific target population
that relates to our research questions. If
we’re interested in how residents of a specific
neighborhood will feel about opening a work-
release center, we should select group partici-
pants who live in the target neighborhood or
one very much like it.

Should You Do It Yourself?
Anyone can do a mail or simple telephone survey,
but many times it’s better to use professional survey
researchers.

The final issue we address in this chapter is
who should conduct surveys. Drawing a sample,
constructing a questionnaire, and either
conducting interviews or distributing self-administered
instruments are not especially
difficult. Equipped with the basic principles we
have discussed so far in this book, you could
complete a modest in-person or telephone
survey yourself. Mail and web-based surveys
of large numbers of subjects are entirely pos-
sible, especially with present-day computer
capabilities.

At the same time, the various tasks involved
in completing a survey require a lot of work
and attention to detail. We have presented
many tips for constructing questionnaires, but
our guidelines barely scratch the surface. Many
books describe survey techniques in more detail,
and a growing number focus specifically on
telephone or mail techniques (see “Additional
Readings” at the end of this chapter). In many
respects, however, designing and executing a
survey of even modest size can be challenging.

Consider the start-up costs involved in in-
person or telephone interview surveys of any
size. Finding, training, and paying interview-
ers are time consuming, potentially costly, and
require some degree of expertise. The price of
computer equipment continues downward, but
a CATI setup or supply of laptops and associ-
ated software for interviewers still represents
a substantial investment that cannot easily be
justified for a single survey.

appropriate to predict how all city residents will
react to a ban on handgun ownership. But a fo-
cus group of registered handgun owners could
help evaluate a proposed city campaign to buy
back handguns.

Focus groups may also be used in combina-
tion with survey research in one of two ways.
First, a focus group can be valuable in ques-
tionnaire development. When researchers are
uncertain how to present items to respondents,
a focus group discussion about the topic can
generate possible item formats. For instance,
James Nolan and Yoshio Akiyama (1999) stud-
ied police routines for making records of hate
crimes. In a general sense, they knew what con-
cepts they wanted to measure but were unsure
how to operationalize them. They convened five
focus groups in different cities, including police
administrators, mid-level managers, patrol
officers, and civilian employees, to learn about
different perspectives on hate-crime recording.
Analyzing focus group results, Nolan and Aki-
yama prepared a self-administered question-
naire that was sent to a large number of indi-
viduals in four police departments.

Second, after a survey has been completed
and preliminary results tabulated, focus groups
may be used to guide the interpretation of
some results. After a citywide survey in which
we find, for example, that recent immigrants
from Southeast Asian countries are least sup-
portive of community policing, we might con-
duct a focus group of Asian residents to delve
more deeply into their concerns.

Focus groups are flexible and can be adapted
to many uses in basic and applied research.
Keep in mind, however, two key elements
expressed in the name of this data collection
technique. Focus means that researchers present
specific questions or issues for directed discussion.
Having a free-for-all discussion about
hate crime, for example, would not have yielded
much useful insight for Nolan and Akiyama to
develop a survey instrument. Group calls our at-
tention to potential participants in the focused
discussions. Like market researchers, we should


The alternative to doing it yourself is to contract
with a professional survey research firm
or a company that routinely conducts surveys.
Most universities have a survey research center
or institute, often affiliated with a sociology or
political science department. Such institutes
are usually available to conduct surveys for
government organizations as well as university
researchers, and they can often do so very economically.
Private research firms are another
possibility. Most have the capability to conduct
all types of surveys as well as focus groups.

Using a professional survey firm or institute
has several advantages. Chapter 6 described the
basic principles of sampling, but actually draw-
ing a probability sample can be complex. Even
the BJS-COPS do-it-yourself guide book for
community crime surveys (Weisel 1999) coun-
sels police departments and others to consult
with experts in drawing RDD samples. Professional
firms regularly use sampling frames that
can represent city, state, and national samples
or whatever combination is appropriate.

We have emphasized the importance of
measurement throughout this book. Research-
ers should develop conceptual and operational
definitions and be attentive to all phases of the
measurement process. However, constructing a
questionnaire requires attention to details that
may not always be obvious to researchers. Survey
firms are experienced in preparing standard
demographic items, batteries of matrix ques-
tions, and complex contingency questions with
appropriate skip sequences.
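
To make the idea of a contingency question with a skip sequence concrete, here is a minimal Python sketch. The question wording, the question ids, and the helper questions_asked are all hypothetical and invented for illustration; they are not drawn from any actual survey instrument or survey software.

```python
# Hypothetical three-item questionnaire. Each item names the next item to ask,
# and the "next" rule may depend on the answer just given (the skip sequence).
QUESTIONS = {
    "q1": {
        "text": "In the past 12 months, was your home burglarized?",
        # Contingency: only victims are asked the follow-up item q2.
        "next": lambda answer: "q2" if answer == "yes" else "q3",
    },
    "q2": {
        "text": "Did you report the burglary to the police?",
        "next": lambda answer: "q3",
    },
    "q3": {
        "text": "How safe do you feel walking alone at night?",
        "next": lambda answer: None,  # end of questionnaire
    },
}


def questions_asked(answers, start="q1"):
    """Return the ids of the questions a respondent would actually see."""
    path = []
    current = start
    while current is not None:
        path.append(current)
        current = QUESTIONS[current]["next"](answers.get(current))
    return path


# A non-victim skips q2; a victim sees all three items.
non_victim_path = questions_asked({"q1": "no", "q3": "very safe"})
victim_path = questions_asked({"q1": "yes", "q2": "yes", "q3": "somewhat safe"})
```

Even in this tiny case, every branch has to lead somewhere sensible, which is one reason complex instruments benefit from experienced hands.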

Although it is often best for researchers
to discuss specific concepts and even to draft
questions, professional firms offer the considerable
benefit of experience in pulling it all
together. This is not to say that a researcher
should simply propose some ideas for ques-
tions and then leave the details to the pros.
Working together with a survey institute or
market research firm to propose questionnaire
items, review draft instruments, evaluate pretests,
and make final modifications is usually
the best approach.

If interview surveys are beyond a researcher’s
means, he or she might fall back on a mail or
web-based survey. Few capital costs are involved;
most expenses are in consumables such as enve-
lopes, stamps, and stationery. One or two per-
sons can orchestrate a mail survey reasonably
well at minimal expense. Perhaps a consultant
could be hired to design a web-based survey at
modest cost. But consider two issues.

First, the business of completing a survey
involves a great deal of tedious work. In mail
surveys, questionnaires and cover letters must
be printed, folded or stuffed into envelopes,
stamped, and delivered (finally!) to the post office.
None of this is much fun. The enjoyment
starts when completed questionnaires start to
trickle in. It’s rewarding to begin the actual em-
pirical research, and the excitement may get the
researcher past the next stretch of tedium: go-
ing from paper questionnaires to actual data. So
it’s possible for one individual to do a mail sur-
vey, but the researcher must be prepared for lots
of work; even then, it will be more work than
expected. Web-based surveys have their own
trade-offs. Economizing on up-front program-
ming costs entails the risk of being swamped by
unusable electronic questionnaires.

The second issue is more difficult to deal
with and is often overlooked by researchers.
We have examined at some length the advan-
tages and disadvantages of the three methods
of questionnaire administration. Some meth-
ods are more or less appropriate than others
for different kinds of research questions. If a
telephone or an in-person interview survey is
best for the particular research needs, conduct-
ing a mail or web-based survey would be a com-
promise, perhaps an unacceptable one. But the
researcher’s excitement at actually beginning
the research may lead him or her to overlook or
minimize problems with doing a mail survey on
the cheap, in much the same way that research-
ers often are not in a position to recognize ethi-
cal problems with their own work. Doing a mail
survey because it’s all you can afford does not
necessarily make the mail survey worth doing.


lations, but surveys have many other uses in
criminal justice research.

• Surveys are the method of choice for obtain-
ing self-reported offending data. Continuing
efforts to improve self-report surveys include
using confidential computer-assisted personal
interviews.

• Questions may be open-ended or closed-ended.
Each technique for formulating questions has
advantages and disadvantages.

• Short items in a questionnaire are usually bet-
ter than long ones.

• Bias in questionnaire items encourages respon-
dents to answer in a particular way or to sup-
port a particular point of view. It should be
avoided.

• Questionnaires are administered in three basic
ways: self-administered questionnaires, face-to-
face interviews, and telephone interviews. Each
mode of administration can be varied in a num-
ber of ways.

• Computers can be used to enhance each type of
survey. Computer-assisted surveys have many
advantages, but they often require special skills
and equipment.

• It is generally advisable to plan follow-up mail-
ings for self-administered questionnaires, send-
ing new questionnaires to respondents who fail
to respond to the initial appeal.

• The essential characteristic of interviewers is
that they be neutral; their presence in the data
collection process must not have any effect on
the responses given to questionnaire items.

• Surveys conducted over the telephone are fast
and flexible.

• Each method of survey administration has a va-
riety of advantages and disadvantages.

• Survey research has the weaknesses of being
somewhat artificial and potentially superficial.
It is difficult to gain a full sense of social processes
in their natural settings through the use
of surveys.

• Specialized interviews with a small number of
people and focus groups are additional ways of
collecting data by asking questions.

• Although the particular tasks required to complete
a survey are not especially difficult, researchers
must carefully consider whether to
conduct surveys themselves or contract with a
professional organization.

Perhaps the chief benefit of contracting for a
survey is that survey research centers and other
professional organizations have the latest spe-
cialized equipment, software, and know-how to
take advantage of advances in all forms of CAI.
Furthermore, such companies can more readily
handle such administrative details as training
interviewers, arranging travel for in-person sur-
veys, coordinating mail surveys, and providing
general supervision. This frees researchers from
much of the tedium of survey research, enabling
them to focus on more substantive issues.

Researchers must ultimately decide whether
to conduct a survey themselves or contract with
a professional firm. And the decision is best
made after carefully considering the pros and
cons of each approach. Too often, university fac-
ulty assume that students can get the job done
while overlooking the important issues of how
to maintain quality control and whether a sur-
vey is a worthwhile investment of students’ time.
Similarly, criminal justice practitioners may be-
lieve that agency staff can handle a mail survey
or conduct phone interviews from the office.
Again, compromises in the quality of results,
together with the opportunity costs of divert-
ing staff from other tasks, must be considered.
The do-it-yourself strategy may seem cheaper in
the short run, but it often becomes a false econ-
omy when attention turns to data analysis and
interpretation.

We’ll close this section with an apocryphal
story about a consultant’s business card; the
card reads, “Fast! Low cost! High quality! Pick
any two.” It’s best to make an informed choice
that best suits your needs.

✪ Main Points
• Survey research, a popular social research

method, involves the administration of ques-
tionnaires to a sample of respondents selected
from some population.

• Survey research is especially appropriate for de-
scriptive or exploratory studies of large popu-


overview of survey methods, this textbook cov-
ers many aspects of survey techniques that are
omitted here.

Dillman, Don A., Mail and Internet Surveys: The Tai-
lored Design Method 2007 Update, 2nd ed. (New
York: Wiley, 2006). This update of a classic refer-
ence on self-administered surveys includes a va-
riety of web-based techniques. Dillman makes
many good suggestions for improving response
rates.

General Accounting Office, Using Structured Interviewing
Techniques (Washington, DC: General
Accounting Office, 1991). This is another useful
handbook in the GAO series on evaluation
methods. In contrast to Patton (below), the
GAO emphasizes getting comparable information
from respondents through structured interviews.
This is a very useful step-by-step guide.

Krueger, Richard A., and Mary Anne Casey, Focus
Groups: A Practical Guide for Applied Research,
3rd ed. (Thousand Oaks, CA: Sage, 2000). A
clear and comprehensive introduction to focus
groups, this book really lives up to its title, de-
scribing basic principles of focus groups and
giving numerous practical tips.

Patton, Michael Quinn, Qualitative Research and
Evaluation Methods, 3rd ed. (Thousand Oaks,
CA: Sage, 2001). This is a thorough discussion
of specialized interviewing. Patton’s advice will
also be useful in constructing questionnaires
for surveys in general.

Weisel, Deborah, Conducting Community Surveys:
A Practical Guide for Law Enforcement Agencies
(Washington, DC: U.S. Department of Justice,
Office of Justice Programs, Bureau of Justice
Statistics, and Office of Community Oriented
Police Services, 1999). Another practical guide,
this brief publication is prepared for use by
public officials, not researchers. As such, it’s a
very good description of the nuts and bolts of
doing telephone surveys.

✪ Key Terms

✪ Review Questions and Exercises
1. Find a questionnaire on the Internet. Bring the

questionnaire to class and critique it. Critique
other aspects of the survey design as well.

2. For each of the open-ended questions listed,
construct a closed-ended question that could
be used in a questionnaire.

a. What was your family’s total income last
year?

b. How do you feel about shock incarceration
or “boot camp” programs?

c. How do people in your neighborhood feel
about the police?

d. What do you feel is the biggest problem fac-
ing this community?

e. How do you protect your home from
burglary?

3. Prepare a brief questionnaire to study percep-
tions of crime near your college or university.
Include questions asking respondents to de-
scribe a nearby area where they either are afraid
to go after dark or think crime is a problem.
Then use your questionnaire to interview at
least 10 students.

4. A recent evaluation of a federal program to
support community policing included sending
questionnaires to a sample of about 1,200 police
chiefs. Each questionnaire included a number
of items asking about specific features of community
policing and whether they were being
used in the department. Almost all the police
chiefs had someone else complete the question-
naire. What’s the unit of analysis in this survey?
What problems might result from having an in-
dividual complete such a questionnaire?

✪ Additional Readings
Babbie, Earl, Survey Research Methods, 2nd ed. (Bel-

mont, CA: Wadsworth, 1990). A comprehensive

closed-ended questions, p. 173
computer-assisted interviewing, p. 187
focus group, p. 195
open-ended questions, p. 173
questionnaire, p. 174


Chapter 8

Field Research
The techniques described in this chapter focus on observing life in its natural
habitat: going where the action is and watching. We’ll consider how to prepare
for the field, how to observe, how to make records of what is observed,
and how to recognize the relative strengths and weaknesses of field research.

Introduction 201

Topics Appropriate to
Field Research 202

The Various Roles of
the Observer 203

Asking Questions 205

Gaining Access to Subjects 207

Gaining Access to Formal
Organizations 207

Gaining Access to Subcultures 210

Selecting Cases for Observation 210

Purposive Sampling in
Field Research 212

Recording Observations 214

Cameras and Voice Recorders 214

Field Notes 215

Structured Observations 216

Linking Field Observations
and Other Data 217

Illustrations of Field Research 219

Field Research on Speeding and
Traffic Enforcement 219

CONDUCTING A

SAFETY AUDIT 220

Bars and Violence 222

Strengths and Weaknesses of
Field Research 224

Validity 224

Reliability 225

Generalizability 226

Chapter 8 Field Research 201

Introduction
Field research is often associated with qualitative
techniques, though many other applications are
possible.

We turn now to what may seem like the most
obvious method of making observations: field
research. If researchers want to know about
something, why not simply go where it’s hap-
pening and watch it happen?

Field research encompasses two different
methods of obtaining data: (1) making direct
observation and (2) asking questions. This
chapter concentrates primarily on observation,
although we briefly describe techniques for specialized
interviewing in field studies.

Most of the observation methods discussed
in this book are designed to produce data appro-
priate for quantitative analysis. Surveys provide
data to calculate things like the percentage of
crime victims in a population or the mean value
of property lost in burglaries. Field research
may yield qualitative data (observations not easily
reduced to numbers) in addition to quantitative
data. For example, a field researcher who
is studying burglars may note how many times
subjects have been arrested (quantitative), as
well as whether individual burglars tend to se-
lect certain types of targets (qualitative).
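
The two survey statistics mentioned above, the percentage of crime victims and the mean value of property lost in burglaries, can be sketched in a few lines of Python. The records below are invented for illustration, and the field names victim and burglary_loss are assumptions rather than items from any real survey.

```python
from statistics import mean

# Hypothetical survey records; every value here is made up for illustration.
respondents = [
    {"victim": True, "burglary_loss": 400.0},
    {"victim": False, "burglary_loss": None},
    {"victim": True, "burglary_loss": 1200.0},
    {"victim": False, "burglary_loss": None},
    {"victim": False, "burglary_loss": None},
]


def percent_victims(records):
    """Percentage of respondents who reported being a crime victim."""
    return 100.0 * sum(r["victim"] for r in records) / len(records)


def mean_loss(records):
    """Mean dollar value lost, computed only over respondents reporting a loss."""
    losses = [r["burglary_loss"] for r in records if r["burglary_loss"] is not None]
    return mean(losses)


victim_pct = percent_victims(respondents)  # 2 of 5 respondents
average_loss = mean_loss(respondents)      # mean of 400 and 1200
```

A qualitative observation, such as which types of targets a burglar prefers, has no equally direct numeric summary, which is the contrast the passage draws.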

Qualitative field research is often a theory- or
hypothesis-generating activity, as well. In many
types of field studies, researchers do not have
precisely defined hypotheses to be tested. Field
observation may be used to make sense out of
an ongoing process that cannot be predicted
in advance. This process involves making ini-
tial observations, developing tentative general
conclusions that suggest further observations,
making those observations, revising the prior
conclusions, and so forth.

For example, Ross Homel and associates
(Homel, Tomsen, and Thommeny 1992) conducted
a field study of violence in bars in Sydney,
Australia, and found that certain situations
tended to trigger violent incidents. Subsequent
studies tested a series of hypotheses about the

links between certain situations and violence
(Homel and Clark 1994), and how interior de-
sign was related to aggression in dance clubs
(Macintyre and Homel 1996). Later research
by James Roberts (2002) expanded these findings
by examining management and serving
practices in New Jersey bars and clubs. Barney
Glaser and Anselm Strauss (1967) refer to this
process as “grounded theory.” Rather than fol-
lowing the deductive approach to theory build-
ing described in Chapter 2, grounded theory is
based on (or grounded in) experience, usually
through observations made in the field.

Field studies in criminal justice may also
produce quantitative data that can be used to
test hypotheses or evaluate policy innovations.
Typically, qualitative exploratory observations
help define the nature of some crime problem
and suggest possible policy responses. Follow-
ing the policy response, further observations
are made to assess the policy’s impact. For example,
the situational crime prevention approach
proposes five steps to analyze specific
crime problems. The first and last of those steps
illustrate the dual uses of observation for problem
definition and hypothesis testing (Clarke
1997b, 5). The first step is to collect data about
the nature and dimensions of the specific crime
problem. The last step is to monitor results and
disseminate experience.

By now, especially if you have experience as a
criminal justice professional, you may be thinking
that field research is not much different
from what police officers and many other people
do every day: make observations in the field
and ask people questions. Police may also col-
lect data about particular crime problems, take
action, and monitor results. So what’s new here?

Compared with criminal justice profession-
als, researchers tend to be more concerned with
making generalizations and then using systematic
field research techniques to support those
generalizations. Consider the different goals
and approaches used by two people who might
observe shoplifters: a retail store security guard
and a criminal justice researcher. The security


For example, Clifford Shearing and Phillip
Stenning (1992, 251) describe how Disney World
employs subtle but pervasive mechanisms of in-
formal social control that are largely invisible to
millions of theme park visitors. It is difficult to
imagine any technique other than direct obser-
vation that could produce these insights:

Control strategies are embedded in both
environmental features and structural rela-
tions. In both cases control structures and
activities have other functions which are
highlighted so that the control function is
overshadowed. For example, virtually every
pool, fountain, and flower garden serves
both as an aesthetic object and to direct
visitors away from, or towards, particu-
lar locations. Similarly, every Disney em-
ployee, while visibly and primarily engaged
in other functions, is also engaged in the
maintenance of order.

Many of the different uses of field observation
in criminal justice research are nicely
summarized by George McCall. Comparing
the three principal ways of collecting data—
observing, asking questions, and consulting
written records—McCall (1978, 8–9) states that
observation is most appropriate for obtaining
information about physical or social settings,
behaviors, and events.

Field research is especially appropriate for
topics that can best be understood within their
natural settings. Surveys may be able to measure
behaviors and attitudes in somewhat artificial
settings, but not all behavior is best measured
this way. For example, field research is a
superior method for studying how street-level
drug dealers interpret behavioral and situa-
tional cues to distinguish potential customers,
normal street traffic, and undercover police officers.
It would be difficult to study these skills
through a survey.

Field research on actual crimes involves
obtaining information about events. McCall

guard wishes to capture a thief and prevent the
loss of shop merchandise. Toward those ends,
he or she adapts surveillance techniques to the
behavior of a particular suspected shoplifter.
The researcher’s interests are different; perhaps
she or he estimates the frequency of shoplifting,
describes characteristics of shoplifters, or evaluates
some specific measure to prevent shoplifting.
In all likelihood, researchers use more standardized
methods of observation aimed toward
a generalized understanding.

This chapter examines field research methods
in some detail, providing a logical overview
and suggesting some specific skills and
techniques that make scientific field research
more useful than the casual observation we
all engage in. As we cover the various applications
and techniques of field research, it’s useful
to recall the distinction we made, way back
in Chapter 1, between ordinary human inquiry
and social scientific research. Field methods illustrate
how the common techniques of observation
that we all use in ordinary inquiry can be
deployed in systematic ways.

Topics Appropriate to
Field Research
When conditions or behavior must be studied in
natural settings, field research is usually the best
approach.

One of the key strengths of field research is
the comprehensive perspective it gives the researcher.
This aspect of field research enhances
its validity. By going directly to the phenome-
non under study and observing it as completely
as possible, we can develop a deeper and fuller
understanding of it. This mode of observa-
tion, then, is especially (though not exclusively)
appropriate to research topics that appear to
defy simple quantification. The field researcher
may recognize nuances of attitude, behavior,
and setting that escape researchers using other
methods.


items about victimization, perceptions of crime
problems and lighting quality, and reports
about routine nighttime behavior in areas af-
fected by the lighting.

Although the pretest and posttest survey
items could have been used to assess changes
in attitudes and behavior associated with improved
lighting, field observations provided
better measures of behavior. Painter conducted
systematic counts of pedestrians in areas both
before and after street lighting was enhanced.
Observations like these are better measures of
such behavior than are survey items because
people often have difficulty recalling actions
such as how often they walk through some area
after dark.
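
A before-and-after comparison of systematic counts like the one described above reduces to simple arithmetic. The Python sketch below uses made-up nightly pedestrian counts; the numbers and the helper percent_change are hypothetical and are not taken from Painter's study.

```python
def percent_change(before_counts, after_counts):
    """Percent change in total observed pedestrians between two periods."""
    total_before = sum(before_counts)
    total_after = sum(after_counts)
    if total_before == 0:
        raise ValueError("no pedestrians observed in the 'before' period")
    return 100.0 * (total_after - total_before) / total_before


# Invented counts for three observation nights in each period.
before = [40, 50, 60]  # before lighting improvements
after = [60, 70, 80]   # after lighting improvements
change = percent_change(before, after)  # (210 - 150) / 150, as a percentage
```

A real comparison would also need matching observation times and locations in both periods, so that any difference is not just an artifact of when or where observers counted.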

The Various Roles
of the Observer
Field observer roles range from full participation to
fully detached observation.

The term field research is broader and more inclusive
than the common term participant observation.
Field researchers need not always participate
in what they are studying, although they
usually will study it directly at the scene of the
action. As Catherine Marshall and Gretchen
Rossman (1995, 60) point out:

The researcher may plan a role that entails
varying degrees of “participantness”—that
is, the degree of actual participation in
daily life. At one extreme is the full partici-
pant, who goes about ordinary life in a role
or set of roles constructed in the setting. At
the other extreme is the complete observer,
who engages not at all in social interac-
tion and may even shun involvement in
the world being studied. And, of course, all
possible complementary mixes along the
continuum are available to the researcher.

The full participant, in this sense, may be a gen-
uine participant in what he or she is studying

(1978) points out that observational studies of
vice—such as prostitution and drug use—are
much more common than observational stud-
ies of other crimes, largely because these behav-
iors depend at least in part on being visible and
attracting customers. One notable exception is
research on shoplifting. A classic study by Terry
Baumer and Dennis Rosenbaum (1982) had
two goals: (1) to estimate the incidence of shop-
lifting in a large department store and (2) to
assess the effectiveness of different store secu-
rity measures. Each objective required devising
some measure of shoplifting, which Baumer
and Rosenbaum obtained through direct ob-
servation. Samples of persons were followed by
research staff from the time they entered the
store until they left. Observers, posing as fellow
shoppers, watched for any theft by the person
they had been assigned to follow.
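
To illustrate how following a sample of shoppers can yield an incidence estimate, here is a minimal Python sketch. The counts are invented and are not Baumer and Rosenbaum's results. The normal-approximation confidence interval is an assumption added for illustration, and it presumes the followed shoppers were a random sample of everyone entering the store.

```python
import math


def incidence_estimate(n_followed, n_observed_theft, z=1.96):
    """Estimate the proportion of shoppers who steal, with an approximate 95% interval.

    Uses the normal approximation to the binomial, which is reasonable only
    when the sample is fairly large and the proportion is not close to 0 or 1.
    """
    p = n_observed_theft / n_followed
    standard_error = math.sqrt(p * (1 - p) / n_followed)
    lower = max(0.0, p - z * standard_error)
    upper = min(1.0, p + z * standard_error)
    return p, lower, upper


# Hypothetical figures: 500 shoppers followed, 40 observed taking merchandise.
proportion, low, high = incidence_estimate(500, 40)
```

The interval widens as the number of shoppers followed shrinks, which is one reason observational studies of this kind need substantial staff time.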

Many aspects of physical settings are prob-
ably best studied through direct observation.
The prevalence and patterns of gang graffiti in
public places could not be reliably measured
through surveys, unless the goal was to measure
perceptions of graffiti. The work of Oscar
Newman (1972, 1996), Ray Jeffery (1977), and
Patricia and Paul Brantingham (Brantingham
and Brantingham 1991) on the relationship be-
tween crime and environmental design depends
crucially on field observation of settings. If opportunities
for crime vary by physical setting,
then observation of the physical characteristics
of a setting is required.

An evaluation of street lighting as a crime
prevention tool in two areas of London illus-
trates how observation can be used to mea-
sure both physical settings and behavior. Kate
Painter (1996) was interested in the relation-
ships between street lighting, certain crime rates
(measured by victim surveys), fear of crime, and
nighttime mobility. Improvements in street
lighting were made in selected streets; surveys
of pedestrians and households in the affected
areas were conducted before and after the light-
ing improvements. Survey questions included


workers are not normally compatible with col-
lecting data for research.

Because of these considerations (ethical,
scientific, practical, and safety), field researchers
most often choose a different role. The researcher
taking the role participant-as-observer
participates with the group under study but
makes it clear that he or she is also undertak-
ing research. If someone has been convicted of
some offense and been placed on probation, for
example, that might present an opportunity to
launch a study of probation officers.

McCall (1978) suggests that field researchers
who study active offenders may comfortably
occupy positions around the periphery
of criminal activity. Acting as a participant in
certain types of leisure activities, such as fre-
quenting selected bars or dance clubs, may be
appropriate roles. This approach was used by
Dina Perrone in her research on drug use in
New York dance clubs (2006; also mentioned
in Chapter 2). Furthermore, McCall describes
how making one’s role as a researcher known
to criminals and becoming known as a “right
square” is more acceptable to subjects than an
unsuccessful attempt to masquerade as a col-
league. There are dangers in this role also, how-
ever. The people being studied may shift their
attention to the research project, and the pro-
cess being observed may no longer be typical.
Conversely, a researcher may come to identify
too much with the interests and viewpoints of
the participants. This is referred to as going na-
tive and results in loss of the detachment neces-
sary for social science.

The observer-as-participant identifies himself
or herself as a researcher and interacts with
the participants in the course of their routine
activities but makes no pretense of actually be-
ing a participant. Many observational studies
of police patrol are examples of this approach.
Researchers typically accompany police officers
on patrol, observing routine activities and in-
teractions between police and citizens. Spend-
ing several hours in the company of a police

(for example, a participant in a demonstration
against capital punishment), or at least pretend
to be a genuine participant. In any event,
if you are acting as a full participant, you let
people see you only as a participant, not as a
researcher.

That raises an ethical question: is it ethical
to deceive the people we are studying in the
hope that they will confide in us as they would
not confide in an identified researcher? Do the
interests of science (the scientific values of the
research) offset any ethical concerns?

Related to this ethical consideration is a scientific
one. No researcher deceives his or her
subjects solely for the purpose of deception.
Rather, it is done in the belief that the data will
be more valid and reliable, that the subjects will
be more natural and honest if they do not know
the researcher is doing a research project. If the
people being studied know they are being stud-
ied, they might reject the researcher or modify
their speech and behavior to appear more re-
spectable than they otherwise would. In either
case, the process being observed might radically
change.

On the other side of the coin, if we assume
the role of complete participant, we may affect
what we are studying. To play the role of partic-
ipant, we must participate, yet our participation
may affect the social process we are studying.
Additional problems may emerge in any partic-
ipant observation study of active criminals. Le-
gal and physical risks, mentioned in Chapter 2,
present obstacles to the complete participant
in field research among active offenders or
delinquents.

Finally, complete participation in field studies
of criminal justice institutions is seldom
possible. Although it is common for police officers
to become criminal justice researchers,
practical constraints on the official duties of
police present major obstacles to simultaneously
acting as researcher and police officer.
Similarly, the responsibilities of judges, prosecutors,
probation officers, and corrections


their experience. In making a decision, researchers
must be guided by both methodological and
ethical considerations. Because these often conflict,
deciding on the appropriate role may be
difficult. Often, researchers find that their role
limits the scope of their study.

Asking Questions
Field researchers frequently supplement observa-
tions by interviewing subjects.

Field research often involves going where the
action is and simply watching and listening.
Researchers can learn a lot merely by being at-
tentive to what’s going on. Field research can
also involve more active inquiry. Sometimes it’s
appropriate to ask people questions and record
their answers.

We examined interviewing during our dis-
cussion on survey research in Chapter 7. Field
research interviews are usually much less struc-
tured than survey interviews. At one extreme,
an unstructured interview is essentially a con-
versation in which the interviewer establishes
a general direction for the conversation and
pursues specific topics raised by the respondent.
Ideally, the respondent does most of the
talking. Michael Quinn Patton (2001) refers to
this type as an “informal conversational inter-
view,” which is especially well suited to in-depth
probing.

Unstructured interviews are most appro-
priate when researchers have little knowledge
about a topic and when it’s reasonable for them
to have a casual conversation with a subject.
This is a good strategy for interviewing active
criminals. Unstructured interviews are also ap-
propriate when researchers and subjects are
together for an extended time, such as a re-
searcher accompanying police on patrol.

In other field research situations, interviews
will be somewhat more structured. The conversational
approach may be difficult to use with
officials in criminal justice or other agencies,

officer also affords opportunities for unstructured
interviewing.

The complete observer, at the other extreme,
observes a location or process without becom-
ing a part of it in any way. The subjects of study
might not even realize they are being studied
because of the researcher’s unobtrusiveness. An
individual making observations while sitting
in a courtroom is an example. Although the
complete observer is less likely to affect what is
being studied and less likely to go native than
the complete participant, he or she may also be
less able to develop a full appreciation of what
is being studied. A courtroom observer, for ex-
ample, witnesses only the public acts that take
place in the courtroom, not private conferences
between judges and attorneys.

McCall (1978, 45) points out an interesting
and often unnoticed trade-off between the role
observers adopt and their ability to learn from
what they see. If their role is covert (complete
participation) or detached (complete observa-
tion), they are less able to ask questions to clar-
ify what they observe. As complete participants,
they take pains to conceal their observations
and must exercise care in querying subjects.
Similarly, complete observation means that it is
generally not possible to interact with the per-
sons or things being observed.

Researchers have to think carefully about
the trade-off. If it is most important that sub-
jects not be affected by their role as observer,
then complete participation or observation is
preferred. If being able to ask questions about
what they observe is important, then some role
that combines participation and observation is
better.

More generally, the appropriate role for ob-
servers hinges on what they want to learn and
how their inquiry is affected by opportunities
and constraints. Different situations require
different roles for researchers. Unfortunately,
there are no clear guidelines for making this
choice; field researchers rely on their understanding
of the situation, their judgment, and


equipment by police or even the ways they go
about doing traffic enforcement. Part of the research
involved semistructured field interviews
with commanders, supervisors, and troopers.
Figure 8.1 shows the interview guide Maxfield
and Andresen used with supervisors. This guide
was just that—a guide. Some subjects were
friendly and wanted to talk, so the interview
became more of a conversation that eventually
yielded answers to the queries in the guide. Others
were wary, probably because the agency had
been subject to criticism over racial profiling
and other discriminatory practices. Interviews
with such persons were brief, and they followed
the guide very closely.

At its best, a field research interview is much
like normal conversation. Because of this, it
is essential to keep reminding ourselves that
we are not having a normal conversation. In

who will respond best (at least initially) to a
specific set of open-ended questions. This is
because it is usually necessary to arrange appointments
to conduct field research interviews
with judges, prosecutors, bail commissioners,
and other officials. Having arranged such an
appointment, it would be awkward to initiate
a casual conversation in hopes of eliciting the
desired information.

At the same time, one of the special strengths
of field research is its flexibility in the field. Even
during structured interviews with public officials,
the answers evoked by initial questions
should shape subsequent ones.

For example, Michael Maxfield and Carsten
Andresen (2002) studied the use by the New
Jersey State Police of video recording equipment
to document traffic stops. Very little
published research existed on the use of video

1. What do you feel are the main uses of the mobile
video system (MVS) for supervisors? [probe con-
tingent on responses]

2. Please describe how you use tapes from the MVS
system to periodically review officer performance.
[probes]

• Compare tapes with incident reports.

• Review tapes from certain types of incidents.
[probe: Please describe types and why these
types.]

• Keep an eye on individual officers. [probe:
Please describe what might prompt you to se-
lect individual officers. I do not want you to name
individuals. Instead, can you describe the rea-
sons why you might want to keep an eye on
specific individuals?]

3. For routine review, how do you decide which tapes to
select for review?

• Does it vary by shift?

• Do you try to review a certain number of
incidents?

• Or do you scan through tapes sort of at random?

4. Think back to when the system first became
operational in cars at this station. What were some
questions, concerns, or problems that officers might
have had at the beginning?

5. How do officers under your command feel about the
system now?

6. What, if any, technical problems seem to come up
regularly? Occasionally?

7. Have you encountered any problems, or do you
have any concerns about the MVS? [probe: opera-
tional issues; tape custody issues; other]

8. Please describe any specific ways you think the
system can or should be improved.

9. If you want to find the tape for an individual officer
on a specific day, and for a specific incident, would
you say that’s pretty easy, not difficult, or somewhat
difficult? Do you have any suggestions for changing
the way tapes are filed and controlled?

10. I would like to view some sample tapes. Can
we go to the tape cabinet and find a tape for [select
date from list]?

Figure 8.1 Interview Guide for Field Study of State Police
Source: Adapted from Maxfield and Andresen (2002).


riety of informal organizational cultures. Crim-
inal courts are highly structured organizations
in which a presiding judge may oversee court
assignments and case scheduling for judges
and many support personnel. At the same time,
courts are chaotic organizations in which three
constellations of professionals (prosecutors,
defense attorneys, and judges) episodically interact
to process large numbers of cases.

Continuing with the example of your re-
search on community corrections, the best
strategy in gaining access to virtually any other
formal criminal justice organization is to use a
four-step procedure: sponsor, letter, phone call,
and meeting. Our discussion of these steps assumes
that you will begin your field research by
interviewing the agency executive director and
gaining that person’s approval for subsequent
interviews and observations.

Sponsor The first step is to find a sponsor: a
person who is personally known to and respected
by the executive director. Ideally, a sponsor will
be able to advise you on a person to contact, that
person’s formal position in the organization,
and her or his informal status, including her or
his relationships with other key officials. Such
advice can be important in helping you initiate
contact with the right person while avoiding
people who have bad reputations.

For example, you may initially think that
a particular judge who is often mentioned
in newspaper stories about community corrections
would be a useful source of information.
However, your sponsor might advise you
that the judge is not held in high regard
by prosecutors and community corrections
staff. Your association with this judge would
generate suspicion on the part of other officials
whom you might eventually want to contact
and may frustrate your attempts to obtain
information.

Finding the right sponsor is often the most
important step in gaining access. It may require
a couple of extra steps, because you might first
need to ask a professor whether she or he knows

normal conversations, each of us wants to come
across as an interesting, worthwhile person. Of-
ten we don’t really hear each other because we’re
too busy thinking of what we’ll say next. As an
interviewer, the desire to appear interesting is
counterproductive to the task. We need to make
the other person seem interesting by being in-
terested ourselves.

Gaining Access to Subjects
Arranging access to subjects in formal organizations
or subcultures begins with an initial contact.

Suppose you decide to undertake field research
on a community corrections agency in a large
city. Let’s assume that you do not know a great
deal about the agency and that you will identify
yourself as a researcher to staff and other peo-
ple you encounter. Your research interests are
primarily descriptive: you want to observe the
routine operations of the agency in the office
and elsewhere. In addition, you want to inter-
view agency staff and persons who are serving
community corrections sentences. This section
will discuss some of the ways you might prepare
before you conduct your interviews and direct
observations.

As usual, you are well advised to begin with
a search of the relevant literature, filling in your
knowledge of the subject and learning what
others have said about it.

Gaining Access to
Formal Organizations
Any research on a criminal justice institution,
or on persons who work either in or under
the supervision of an institution, normally re-
quires a formal request and approval. One of
the first steps in preparing for the field, then, is
to arrange access to the community corrections
agency.

Obtaining initial approval can be confusing
and frustrating. Many criminal justice agencies
in large cities have a complex organization, combining
a formal hierarchy with a bewildering va-


introduction, brief statement of your research
purpose, and action request. See Figure 8.2 for
an example. The introduction begins by naming
your sponsor, thus immediately establishing
your mutual acquaintance. This is a key part of
the process; if you do not name a sponsor, or if
you name the wrong sponsor, you might get no
further.

Next, you describe your research purpose
succinctly. This is not the place to give a de-
tailed description as you would in a proposal.

someone. You could then contact that person
(with the sponsorship of your professor) and
ask for further assistance. For purposes of il-
lustration, we will assume that your professor
is knowledgeable, well connected, and happy
to act as your sponsor. Your professor confi rms
your view that it is best to begin with the execu-
tive director of community corrections.

Letter Next, write a letter to the executive
director. Your letter should have three parts:

Jane Adams
Executive Director
Chaos County Community Corrections
Anxiety Falls, Colorado 1 May 2009

Dear Ms. Adams:

My colleague, Professor Marcus Nelson, suggested I contact you for assistance in my research on community
corrections. I will be conducting a study of community corrections programs and wish to include the Chaos
County agency in my research.

Briefly, I am interested in learning more about the different types of sentences that judges impose in jurisdic-
tions with community corrections agencies. As you know, Colorado’s community corrections statute grants
considerable discretion to individual counties in arranging locally administered corrections programs. Because
of this, it is generally believed that a wide variety of corrections programs and sentences have been developed
throughout the state. My research seeks to learn more about these programs as a first step toward developing
recommendations that may guide the development of programs in other states. I also wish to learn more about
the routine administration of a community corrections program such as yours.

I would like to meet with you to discuss what programs Chaos County has developed, including current pro-
grams and those that were considered but not implemented. In addition, any information about different types
of community corrections sentences that Chaos County judges impose would be very useful. Finally, I would
appreciate your suggestions on further sources of information about community corrections programs in Chaos
County and other areas.

I will call your office at about 10:00 a.m. on Monday, May 8, to arrange a meeting. If that time will not be conve-
nient, or if you have any questions about my research, please contact me at the number below.

Thanks in advance for your help.

Sincerely,

Alfred Nobel
Research Assistant
Institute for Advanced Studies
(201) 555-1212

Figure 8.2 Sample Letter of Introduction


with the executive director—and established
your legitimacy by naming a sponsor.

Meeting The final step is meeting with or
interviewing the contact person. Because you
have used the letter–phone call–meeting pro-
cedure, the contact person may have already
taken preliminary steps to help you. For ex-
ample, because the letter in Figure 8.2 indicates
that you wish to interview the executive direc-
tor about different types of community correc-
tions sentences, she may have assembled some
procedures manuals or reports in preparation
for your meeting.

This procedure generally works well in gaining
initial access to public officials or other
people who work in formal organizations. Once
initial access is gained, it is up to the researcher
to use interviewing skills and other techniques
to elicit the desired information. This is not as
difficult as it might seem to novice (or apprentice)
researchers for a couple of reasons.

First, most people are at least a bit flattered
that their work, ideas, and knowledge are of in-
terest to a researcher. And researchers can take
advantage of this with the right words of en-
couragement. Second, criminal justice profes-
sionals are often happy to talk about their work
with a knowledgeable outsider. Police, probation
officers, and corrections workers usually
encounter only their colleagues and their clients
on the job. Interactions with colleagues
become routine and suffused with office politics,
and interactions with clients are common
sources of stress. Talking with an interested
and knowledgeable researcher is often seen as a
pleasant diversion.

By the same token, Richard Wright and
Scott Decker (1994) report that most members
of their sample of active burglars were both
happy and flattered to discuss the craft of burglary.
Because they were engaged in illegal activities,
burglars had to be more circumspect
about sharing their experiences, unlike the way

If possible, keep the description to one or two
paragraphs, as in Figure 8.2. If a longer descrip-
tion is necessary to explain what you will be
doing, you should still include only a brief de-
scription in your introductory letter, referring
the reader to a separate attachment in which
you summarize your research.

The action request describes what immedi-
ate role you are asking the contact person to
play in your research. You may simply be requesting
an interview, or you may want the person
to help you gain access to other officials.
Notice how the sample in Figure 8.2 mentions
both an interview and “suggestions on further
sources of information about community cor-
rections.” In any case, you will usually want to
arrange to meet or at least talk with the contact
person. That leads to the third step.

Phone Call You probably already know that
it can be difficult to arrange meetings with
public officials (and often professors) or even
to reach people by telephone. You can simplify
this task by concluding your letter with a proposal
for this step: arranging a phone call. The
example in Figure 8.2 specifies a date and approximate
time when you will call. To be safe,
specify a date about a week from the date of
your letter. Notice also the request that the ex-
ecutive director call you if some other time will
be more convenient.

When you make the call, the executive di-
rector will have some idea of who you are and
what you want. She will also have had the op-
portunity to contact your sponsor if she wants
to verify any information in your introductory
letter.

The actual phone call should go smoothly.
Even if you are not able to talk with the execu-
tive director personally, you will probably be
able to talk to an assistant and make an ap-
pointment for a meeting (the next step). Again,
this will be possible because your letter de-
scribed what you eventually want—a meeting


criminals hang out. Wright and Decker rejected
that strategy as a time-consuming and uncertain
way to find burglars, in part because they
were not sure where burglars hung out. In con-
trast, Bruce Jacobs (1999) initiated contact with
street-level drug dealers by hanging around
and being noticed in locations known for crack
availability. Consider how this tactic might
make sense for finding drug dealers, whose illegal
work requires customers. In contrast, the
offense of burglary is more secretive, and it’s
more difficult to imagine how one would find
an area known for the presence of burglars.

Whatever techniques are used to identify
subjects among subcultures, it is generally not
possible to produce a probability sample, and so
the sample cannot be assumed to represent some
larger population within specified confidence
intervals. It is also important to think about po-
tential selection biases in whatever procedures
are used to recruit subjects. Although we can’t
make probability statements about samples of
active offenders, such samples may be represen-
tative of a subculture target population.

Selecting Cases for Observation
This brings us to the more general question
of how to select cases for observation in field
research. The techniques used by Wright and
Decker, as well as by many other researchers
who have studied active criminals, combine the
use of informants and what is called snowball
sampling. As we mentioned in Chapter 6, with
snowball sampling, initial research subjects (or
informants) identify other persons who might
also become subjects, who in turn suggest more
potential subjects, and so on. In this way, a
group of subjects is accumulated through a se-
ries of referrals.
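
The referral process just described can be sketched in a few lines of code. This is a minimal illustration and not a procedure taken from any study cited here: the function name snowball_sample, the referrals dictionary, and the subject labels are all hypothetical. It also assumes that every subject's referrals are known in advance, whereas in real field research they emerge one interview at a time.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size=None):
    """Accumulate subjects by following referrals outward from the seeds.

    referrals: dict mapping each person to the people he or she refers
               (hypothetical data supplied by the caller).
    seeds:     the initial informants or subjects.
    max_size:  optional cap on the number of people sampled.
    """
    sampled = []
    seen = set(seeds)      # everyone already nominated at least once
    queue = deque(seeds)   # people still waiting to be contacted
    while queue:
        person = queue.popleft()
        sampled.append(person)
        if max_size is not None and len(sampled) >= max_size:
            break
        for referred in referrals.get(person, []):
            if referred not in seen:   # a second nomination adds nothing new
                seen.add(referred)
                queue.append(referred)
    return sampled


# Hypothetical referral data: both A and B nominate D.
referrals = {
    "informant": ["A", "B"],
    "A": ["C", "D"],
    "B": ["D"],
    "D": ["E"],
}
sample = snowball_sample(referrals, ["informant"])
print(sample)
```

In this sketch, D is nominated by both A and B but enters the sample only once, just as a subject mentioned by more than one source is still a single case.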

Wright and Decker’s (1994) study provides a
good example. The ex-offender contacted a few
active burglars and a few streetwise noncrimi-
nals, who referred researchers to additional
subjects, and so on. This process is illustrated
in Figure 8.3, which shows the chain of referrals

many people talk about events at work. As a
result, burglars enjoyed the chance to describe
their work to interested researchers who both
promised confidentiality and treated them “as
having expert knowledge normally unavailable
to outsiders” (1994, 26).

Gaining Access to Subcultures
Research by Wright and Decker illustrates
how gaining access to subcultures in criminal
justice—such as active criminals, deviants, juve-
nile gangs, and inmates—requires tactics that
differ in some respects from those used to meet
with public officials. Letters, phone calls, and
formal meetings are usually not appropriate for
initiating field research among active offenders.
However, the basic principle of using a sponsor
to gain initial access operates in much the same
way, although the word informant is normally
used to refer to someone who helps make con-
tact with subcultures.

Informants may be people whose job in-
volves working with criminals—such as police,
juvenile caseworkers, probation officers, attorneys,
and counselors at drug clinics. Lawyers
who specialize in criminal defense work can be
especially useful sources of information about
potential subjects. Frances Gant and Peter Gra-
bosky (2001) contacted a private investigator
to help locate car thieves and people working
in auto-related businesses who were reputed to
deal in stolen parts.

Wright and Decker (1994) were fortunate to
encounter a former offender who was well con-
nected with active criminals. Playing the role
of sponsor, the ex-offender helped researchers
in two related ways. First, he referred them to
other people, who in turn found active burglars
willing to participate in the study. Second, he
was well known and respected among burglars,
and his sponsorship of the researchers made it
possible for them to study a group of naturally
suspicious subjects.

A different approach for gaining access to
subcultures is to hang around places where


and 029. Notice also that some subjects were
themselves “nominated” by more than one
source. In the middle of the bottom row in Fig-
ure 8.3, for example, subject 064 was mentioned
by subjects 060 and 061.

There are, of course, other ways of selecting
subjects for observation. Chapter 6 discussed
the more conventional techniques involved in
probability sampling and the accompanying
logic. Although the general principles of representativeness
should be remembered in field
research, controlled sampling techniques are
often not possible.

that accumulated a snowball sample of 105
individuals.

Starting at the top of Figure 8.3, the ex-
offender put researchers in contact with two
subjects directly (001 and 003), a small-time
criminal, three streetwise noncriminals, a crack
addict, a youth worker, and someone serving
probation. Continuing downward, the small-
time criminal was especially helpful, identifying
12 subjects who participated in the study (005,
006, 008, 009, 010, 021, 022, 023, 025, 026, 030,
032). Notice how the snowball effect continues,
with subject 026 identifying subjects 028

Figure 8.3 Snowball Sample Referral Chart
Source: Reprinted from Richard T. Wright and Scott H. Decker, Burglars on the Job: Streetlife and Residential Break-Ins (Boston:
Northeastern University Press, 1994), p. 19. Reprinted by permission of Northeastern University Press.

[Figure 8.3 chart: a referral tree with the EX-OFFENDER at the top.
Labeled contacts in the chart include three streetwise noncriminals,
a probationer, a youth worker, a “crack” addict, a small-time criminal,
a heroin addict/retired small-time criminal, a low-level “fence,” and
a retired high-level “fence.” Numbered nodes 001 through 105 are the
subjects. Key: markers distinguish females, whites, and juveniles
(under 18, marked *).]


who had not been caught. After accumulating
their sample, the researchers were in a position
to test this assumption by examining arrest re-
cords for their subjects. Only about one-fourth
of the active burglars had ever been convicted
of burglary; an additional one-third had been
arrested for burglary but not convicted. More
than 40 percent had no burglary arrests, and
8 percent had never been arrested for any of-
fense (1994, 12).

Putting all this together, Wright and Decker
concluded that about three-fourths of their
subjects would not have been eligible for inclu-
sion if the researchers had based their sample
on persons convicted of burglary. Thus little
overlap exists between the population of ac-
tive burglars and the population of convicted
burglars.
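
The arithmetic behind the three-fourths figure can be checked directly. The short sketch below uses only the rounded shares reported above; the variable names are illustrative, and because the inputs are approximations, the results are approximate as well.

```python
from fractions import Fraction

# Rounded shares reported in the text ("about one-fourth", "one-third").
convicted = Fraction(1, 4)
arrested_not_convicted = Fraction(1, 3)

# Subjects that a sample of convicted burglars would have missed.
not_convicted = 1 - convicted

# Subjects with no burglary arrest at all, implied by the two shares above.
no_burglary_arrest = 1 - convicted - arrested_not_convicted

print(not_convicted)        # 3/4
print(no_burglary_arrest)   # 5/12
```

The implied share with no burglary arrest, 5/12, is a little under 42 percent, which agrees with the statement that more than 40 percent had no burglary arrests.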

Purposive Sampling in
Field Research
Sampling in field research tends to be more
complicated than in other kinds of research. In
many types of field studies, researchers attempt
to observe everything within their field of study;
thus, in a sense, they do not sample at all. In
reality, of course, it is impossible to observe everything.
To the extent that field researchers observe
only a portion of what happens, what they
do observe is a de facto sample of all the possible
observations that might have been made. We
can seldom select a controlled sample of such
observations. But we can keep in mind the gen-
eral principles of representativeness and inter-
pret our observations accordingly.

The ability to systematically sample cases for
observation depends on the degree of structure
and predictability of the phenomenon being ob-
served. This is more of a general guideline than a
hard-and-fast rule. The actions of youth gangs,
burglars, and auto thieves are less structured
and predictable than those of police officers. It
is possible to select a probability sample of police
officers for observation because the behavior
of police officers in a given city is structured
and predictable in many dimensions. Because

As an illustration, consider the potential selection
biases involved in a field study of deviants.
Let’s say we want to study a small number
of drug dealers. We have a friend who works in
the probation department of a large city and is
willing to introduce us to people convicted of
drug dealing and sentenced to probation. What
selection problems might result from studying
subjects identified in this way? How might our
subjects not be representative of the general
population of drug dealers? If we work our way
backward from the chain of events that begins
with a crime and ends with a criminal sentence,
the answers should become clear.

First, drug dealers sentenced to probation
may be first-time offenders or persons convicted
of dealing small amounts of “softer” drugs.
Repeat offenders and kingpin cocaine dealers
will not be in this group. Second, it is possible
that people initially charged with drug dealing
were convicted of simple possession through
a plea bargain; because of our focus on people
convicted of dealing, our selection procedure
will miss this group as well. Finally, by select-
ing dealers who have been arrested and con-
victed, we may be gaining access only to those
less skilled dealers who got caught. More skilled
or experienced dealers may be less likely to be
arrested in the first place; they may be different
in important ways from the dealers we wish
to study. Also, if dealers in street drug markets
are more likely to be arrested than dealers who
work through social networks of friends and
acquaintances, a sample based on arrested deal-
ers could be biased in more subtle ways.
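
The last point, about street dealers and network dealers, can be made concrete with a small calculation. The sketch below is purely illustrative: the population shares and arrest probabilities are invented for the example and are not estimates from any study, and the function name arrested_sample_share is hypothetical.

```python
def arrested_sample_share(population_share, arrest_prob):
    """Return each group's expected share among arrested dealers."""
    arrested = {
        group: population_share[group] * arrest_prob[group]
        for group in population_share
    }
    total = sum(arrested.values())
    return {group: count / total for group, count in arrested.items()}


# Hypothetical inputs: the two kinds of dealers are equally common, but
# street dealers are assumed to be three times as likely to be arrested.
population_share = {"street": 0.5, "network": 0.5}
arrest_prob = {"street": 0.30, "network": 0.10}

shares = arrested_sample_share(population_share, arrest_prob)
for group, share in shares.items():
    print(group, round(share, 2))
```

Under these assumed numbers, street dealers are half of all dealers but about three-quarters of arrested dealers, so a sample drawn from arrest or probation records would overstate street dealing even if every arrested dealer agreed to take part.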

To see why this raises an important issue
in selecting cases for field research, let’s return
again to the sample of burglars studied by
Wright and Decker. Notice that their snowball
sample began with an ex-offender and that they
sought out active burglars. An alternative ap-
proach would be to select a probability or other
sample of convicted burglars, perhaps in prison
or on probation. But Wright and Decker re-
jected this strategy for sampling because of the
possibility that they would overlook burglars


wide streets and narrow ones, busy streets and
quiet ones, or samples from different times of
day. In a study of pedestrian traffic, we might
also observe people in different types of urban
neighborhoods— comparing residential and
commercial areas, for example.

Table 8.1 summarizes different sampling di-
mensions that might be considered in planning
field research. The behavior of people, together
with the characteristics of people and places,
can vary by population group, location, time,
and weather. We have already touched on the
first two in this chapter; now we will briefly discuss
how sampling plans might consider time
and weather dimensions.

People tend to engage in more out-of-door
activities in fair weather than in wet or snowy
conditions. In northern cities, people are out-
side more when the weather is warm. Any study
of outdoor activity should therefore consider
the potential effects of variation in the weather.
For example, in Painter’s study of pedestrian
traffic before and after improvements in street
lighting, it was important to consider weather
conditions during the times observations were
made.

Behavior also varies by time, presented as
micro and macro dimensions in Table 8.1. City
streets in a central business district are busiest
during working hours, whereas more people
are in residential areas at other times. And, of
course, people do different things on weekends
than during the work week. Seasonal variation,
the macro time dimension, may also be impor-
tant in criminal justice research. Daylight lasts
longer in summer months, which affects the
amount of time people spend outdoors. Shop-
ping peaks from Thanksgiving to Christmas,
increasing the number of shoppers, who along
with their automobiles may become targets for
thieves.
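
As a rough illustration of how the time dimensions might enter a sampling plan, the sketch below lists every combination of location, time of day, day type, and season that observers would need to cover. The function name and the category labels are hypothetical and chosen only for illustration; a real plan would use categories suited to the study.

```python
from itertools import product

def observation_cells(locations, times_of_day, day_types, seasons):
    """List every combination of place and time that the plan should cover."""
    return [
        {"location": loc, "time_of_day": tod, "day_type": day, "season": season}
        for loc, tod, day, season in product(locations, times_of_day, day_types, seasons)
    ]

# Hypothetical categories chosen only for illustration.
cells = observation_cells(
    locations=["residential street", "commercial street"],
    times_of_day=["daytime", "evening"],
    day_types=["weekday", "weekend"],
    seasons=["summer", "winter"],
)
print(len(cells))  # 2 * 2 * 2 * 2 combinations
```

Listing the cells this way makes it easy to see which combinations have not yet been observed. Weather cannot be scheduled in advance, so it would more plausibly be recorded at each session than planned as a cell.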

In practice, controlled probability sampling
is seldom used in field research. Different types
of purposive samples are much more common.
Patton (2001, 230; emphasis in original) de-
scribes a broad range of approaches to purposive

the population of active criminals is unknown,
it is not possible to select a probability sample
for observation.

This example should call to mind our discus-
sion of sampling frames in Chapter 6. A roster
of police officers and their assignments to patrol
sectors and shifts could serve as a sampling
frame for selecting subjects to observe. No such
roster of gang members, burglars, and auto
thieves is available. Criminal history records
could serve as a sampling frame for selecting
persons with previous arrests or convictions,
subject to the problems of selectivity we have
mentioned.

Now consider the case in which a sampling
frame is less important than the regularity of
a process. The regular, predictable passage of
people on city sidewalks makes it possible to
systematically select a sample of cases for ob-
servation. There is no sampling frame of pe-
destrians, but studies such as Painter’s research
(1996) on the effects of street lighting can depend
on the reliable flow of passersby who may
be observed.
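
The idea of systematically selecting cases from a regular flow can be sketched in a few lines. This is a minimal illustration, not the procedure used in any study cited here: the function name, the sampling interval, and the starting position are all assumptions.

```python
from itertools import islice

def systematic_sample(passersby, interval, start=0):
    """Select every `interval`-th passerby, beginning at position `start`."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if not 0 <= start < interval:
        raise ValueError("start must fall within the first interval")
    return list(islice(passersby, start, None, interval))

# Twenty passersby numbered in order of arrival; observe every fifth one,
# beginning with the third person to pass (index 2).
selected = systematic_sample(range(1, 21), interval=5, start=2)
print(selected)  # [3, 8, 13, 18]
```

No list of pedestrians is needed in advance; the observer only has to count people as they pass.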

In an observational study such as Painter’s,
we might also make observations at a number
of different locations on different streets. We
could pick the sample of locations through
standard probability methods, or more likely,
we could use a rough quota system, observing

Table 8.1 Sampling Dimensions in Field Research

Sampling Dimension   Variation in

Population           Behavior and characteristics

Space                Behavior
                     Physical features of locations

Time, micro          Behavior by time of day, day of week
                     Lighting by time of day
                     Business, store, entertainment activities
                     by time of day, day of week

Time, macro          Behavior by season, holiday
                     Entertainment by season, holiday

Weather              Behavior by weather


types of automated and remote measurement,
such as videotapes, devices that count automobile
traffic, or computer tabulations of mass
transit users. In between is a host of methods
that have many potential applications in crimi-
nal justice research.

Of course, the methods selected for record-
ing observations are directly related to issues of
measurement— especially how key concepts are
operationalized. Thinking back to our discus-
sion of measurement in Chapter 4, you should
recognize why this is so. If we are interested in
policies to increase nighttime pedestrian traffic
in some city, we might want to know why peo-
ple do or do not go out at night and how many
people stroll around different neighborhoods.
Interviews—perhaps in connection with a
survey— can determine people’s reasons for go-
ing out or not, whereas video recordings of pass-
ersby can provide simple counts. By the same
token, a traffic-counting device can produce
information about the number of automobiles
that pass a particular point on the road, but it
cannot measure what the blood alcohol levels
of drivers are, whether riders are wearing seat
belts, or how fast a vehicle is traveling.

Cameras and Voice Recorders
Video cameras may be used in public places to
record relatively simple phenomena, such as
the passage of people or automobiles, or more
complex social processes. For several years,
London police have monitored traffic conditions
at dozens of key intersections using video
cameras mounted on building rooftops. In fact,
the 2007 Road Atlas for Britain includes the
locations of stationary video cameras on its
maps. Since 2003, video cameras have monitored
all traffic entering central London as part
of an effort to reduce traffic. The license plates
of vehicles that do not register paying a toll are
recorded, and violation notices sent to owners.
Ronald Clarke (1996) studied speeding in Illi-
nois, drawing on observations automatically
recorded by cameras placed at several locations
throughout the state.

sampling and offers a useful comparison of
probability and purposive samples:

The logic and power of probability sam-
pling derive from statistical probability
theory. A random and statistically representative
sample permits confident generalization
from a sample to a larger population.
. . . The logic and power of purposeful
sampling lies in selecting information rich
cases for study in depth.

Nonetheless, if researchers understand the principles
and logic of more formal sampling methods,
they are likely to produce more effective
purposive sampling in field research.

Recording Observations
Many different methods are available for collecting
and recording field observations.

Just as there is great variety in the types of field
studies we might conduct, we have many options
for making records of field observations.
In conducting field interviews, researchers almost
certainly write notes of some kind, but
they might also tape-record interviews. Videotaping
may be useful in field interviews to capture
visual images of dress and body language.
Photographs or videotapes can be used to
make records of visual images such as a block
of apartment buildings before and after some
physical design change or to serve as a pretest
for an experimental neighborhood cleanup
campaign. This technique was used by Robert
Sampson and Stephen Raudenbush (1999) in
connection with probability samples of city
blocks in Chicago. Videotapes were made of
sampled blocks, and the recordings were then
viewed to assess physical and social conditions
in those areas.

We can think of a continuum of methods for
recording observations. At one extreme is traditional
field observation and note taking with
pen and paper, such as we might use in field interviews.
The opposite extreme includes various


observations as written notes, perhaps in a
field journal. Field notes should include both
empirical observations and interpretations of
them. They should record what we “know” we
have observed and what we “think” we have observed.
It is important, however, that these different
kinds of notes be identified for what they
are. For example, we might note that person
X approached and handed something to per-
son Y—a known drug dealer—that we think
this was a drug transaction, and that we think
person X was a new customer.

Every student is familiar with the process of
taking notes. Good note taking in field research
requires more careful and deliberate attention
and involves some specific skills. Three guidelines
are particularly important.

First, don’t trust your memory any more
than you have to; it’s untrustworthy. Even if
you pride yourself on having a photographic
memory, it’s a good idea to take notes, either
during the observation or as soon afterward
as possible. If you are taking notes during the
observation, do it unobtrusively because people
are likely to behave differently if they see you
writing down everything they say or do.

Second, it’s usually a good idea to take notes
in stages. In the first stage, you may need to
take sketchy notes (words and phrases) to keep
abreast of what’s happening. Then remove
yourself and rewrite your notes in more detail.
If you do this soon after the events you’ve ob-
served, the sketchy notes will help you recall
most of the details. The longer you delay, the
less likely you are to recall things accurately
and fully. James Roberts (2002), in his study of
aggression in New Jersey nightclubs, was reluc-
tant to take any notes while inside clubs, so he
retired to his car to make sketchy notes about
observations, then wrote them up in more de-
tail later.

Third, you will inevitably wonder how much
you should record. Is it really worth the effort
to write out all the details you can recall right
after the observation session? The basic answer
is yes. In field research, you can’t really be sure

Still photographs may be appropriate to record
some types of observations, such as graffiti
or litter. Photos have the added benefit
of preserving visual images that can later be
viewed and coded by more than one person,
thus facilitating interrater reliability checks. If
we are interested in studying pedestrian traffic
on city streets, we might gather data about
what types of people we see and how many
there are. As the number and complexity of our
observations increase, it becomes more difficult
to reliably record how many males and females
we see, how many adults and juveniles, and so
on. Taking photographs of sampled areas will
enable us to be more confident in our measurements
and will also make it possible for another
person to check on our interpretation of
the photographs.

This approach was used by James Lange and
associates (Lange, Johnson, and Voas 2005) in
their study of speeding on the New Jersey Turn-
pike. The researchers deployed radar devices
and digital cameras to measure the speed of
vehicles and to take photos of drivers. Equip-
ment was housed in an unmarked van parked
at sample locations on the turnpike. The race
of drivers was later coded by teams of research-
ers who studied the digital images. Agreement
by at least two of three coders was required to
accept the photos for further analysis.
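
The two-of-three agreement rule can be expressed as a short sketch. The function name and the placeholder category labels ("A", "B", "C") are assumptions made for illustration; the source reports only that agreement by at least two of three coders was required.

```python
from collections import Counter

def code_by_agreement(codes, min_agree=2):
    """Return the category assigned by at least `min_agree` coders, else None."""
    if not codes:
        return None
    category, votes = Counter(codes).most_common(1)[0]
    return category if votes >= min_agree else None

# Placeholder labels stand in for whatever categories the coders use.
print(code_by_agreement(["A", "A", "B"]))  # A
print(code_by_agreement(["A", "B", "C"]))  # None
```

A photo for which the function returns None would be set aside rather than analyzed.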

In addition to their use in interviews, audio-
tape recorders are useful for dictating observa-
tions. For example, a researcher interested in
patterns of activity on urban streets can dictate
observations while riding through selected ar-
eas in an automobile. It is possible to dictate
observations in an unstructured manner, de-
scribing each street scene as it unfolds. Or a
tape recorder can be used more like an audio
checklist, with observers noting specified items
seen in preselected areas.

Field Notes
Even tape recorders and cameras cannot cap-
ture all the relevant aspects of social processes.
Most field researchers make some records of


Because structured field observation forms
often resemble survey questionnaires, the use
of such forms has the benefit of enabling researchers
to generate numeric measures of
conditions observed in the field. The Bureau
of Justice Assistance (1993, 43) has produced a
handbook containing guidelines for conducting
structured field observations, called environmental
surveys. The name is significant
because observers record information about
the conditions of a specified environment:

[Environmental] surveys seek to assess, as
systematically and objectively as possible,
the overall physical environment of an
area. That physical environment comprises
the buildings, parks, streets, transporta-
tion facilities, and overall landscaping of
an area as well as the functions and condi-
tions of those entities.

Environmental surveys have come to be an
important component of problem-oriented
policing and situational crime prevention.
Figure 8.4 is adapted from an environmental
survey form used by the Philadelphia Police De-
partment in drug enforcement initiatives. Envi-
ronmental surveys are conducted to plan police
strategy in drug enforcement in small areas and
to assess changes in conditions following tar-
geted enforcement. Notice that the form can be
used to record both information about physical
conditions (street width, traffic volume, streetlights)
and counts of people and their activities.

Like interview surveys, environmental sur-
veys require that observers be carefully trained
in what to observe and how to interpret it. For
example, the instructions that accompany the
environmental survey excerpted in Figure 8.4
include guidance on coding abandoned
automobiles:

Count as abandoned if it appears non-
drivable (i.e., has shattered windows, dis-
mantled body parts, missing tires, missing
license plates). Consider it abandoned if
it appears that it has not been driven for

what’s important and unimportant until you’ve
had a chance to review and analyze a great vol-
ume of information, so you should record even
things that don’t seem important at the time.
They may turn out to be significant after all.
In addition, the act of recording the details of
something unimportant may jog your memory
on something that is important.

Structured Observations
Field notes may be recorded on highly struc-
tured forms in which observers mark items in
much the same way a survey interviewer marks
a closed-ended questionnaire. For example,
Steve Mastrofski and associates (1998, 11) de-
scribe how police performance can be recorded
on field observation questionnaires:

Unlike ethnographic research, which relies
heavily on the field researcher to make
choices about what to observe and how to
interpret it, the observer using [structured
observation] is guided by . . . instruments
designed by presumably experienced and
knowledgeable police researchers.

Training for such efforts is extensive and time
consuming. But Mastrofski and associates com-
pared structured observation to closed-ended
questions on a questionnaire. If researchers can
anticipate in advance that observers will en-
counter a limited number of situations in the
field, those situations can be recorded on structured
observation forms. And, like closed-ended
survey questions, structured observations have
higher reliability.

In a long-term study, Ralph Taylor (1999)
developed forms to code a range of physical
characteristics in a sample of Baltimore neigh-
borhoods. Observers recorded information
on closed-ended items about housing layout,
street length and width, traffic volume, type of
nonresidential land use, graffiti, persons hanging
out, and so forth. Observations were completed
in the same neighborhoods in 1981 and
1994.


Linking Field Observations and
Other Data
Although criminal justice research may use
field methods or sample surveys exclusively, a
given project will often collect data from several
sources. This is consistent with general advice

some time and that it is not going to be for
some time to come.

Other instructions provide details on how to
count drivable lanes, what sorts of activities
constitute “playing” and “working,” how to es-
timate the ages of people observed, and so on.
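
To show how a structured form yields numeric measures, here is a minimal sketch of one block observation stored as a record. The field names and category labels loosely follow the items in Figure 8.4, but the class, its methods, and the validation checks are illustrative assumptions, not part of the Bureau of Justice Assistance instrument.

```python
from dataclasses import dataclass, field

# Category lists loosely based on the form's items; names are assumptions.
TRAFFIC_LEVELS = ("very light", "light", "moderate", "heavy", "very heavy")
SEXES = ("male", "female")
AGE_GROUPS = ("young", "teens", "adult", "seniors")
ACTIVITIES = ("hanging out", "playing", "working", "walking", "other")

@dataclass
class BlockObservation:
    street_name: str
    drivable_lanes: int
    parking_lanes: int
    median_present: bool
    traffic_volume: str
    street_lights: int
    broken_street_lights: int
    abandoned_autos: int
    people: dict = field(default_factory=dict)

    def __post_init__(self):
        # Closed-ended items accept only the listed categories.
        if self.traffic_volume not in TRAFFIC_LEVELS:
            raise ValueError(f"unknown traffic volume: {self.traffic_volume}")
        if self.broken_street_lights > self.street_lights:
            raise ValueError("broken lights cannot exceed total lights")

    def tally(self, sex, age_group, activity, count=1):
        """Add `count` people to one cell of the people-by-activity grid."""
        if sex not in SEXES or age_group not in AGE_GROUPS or activity not in ACTIVITIES:
            raise ValueError("unknown sex, age group, or activity")
        key = (sex, age_group, activity)
        self.people[key] = self.people.get(key, 0) + count

    def total_people(self):
        return sum(self.people.values())

# Hypothetical observation of a single block.
obs = BlockObservation("Example St", 2, 1, False, "light", 6, 1, 0)
obs.tally("male", "teens", "hanging out", 3)
obs.tally("female", "adult", "walking", 2)
print(obs.total_people())  # 5
```

Because every item is closed-ended, records from different observers and different dates can be compared directly.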

Figure 8.4 Example of Environmental Survey
Source: Adapted from Bureau of Justice Assistance (1993, Appendix B).

Date:______________ Day of week:______________ Time:______________

Observer:________________________________________________________

Street name:______________________________________________________

Cross streets:_____________________________________________________

1. Street width

Number of drivable lanes ____

Number of parking lanes ____

Median present? (yes = 1, no = 2) ____

2. Volume of traffic flow: (check one)

a. very light ____

b. light ____

c. moderate ____

d. heavy ____

e. very heavy ____

3. Number of street lights ____

4. Number of broken street lights ____

5. Number of abandoned automobiles ____

6. List all the people on the block and their activities:

Males Hanging out Playing Working Walking Other

Young (up to age 12) ____ ____ ____ ____ ____

Teens (13–19) ____ ____ ____ ____ ____

Adult (20–60) ____ ____ ____ ____ ____

Seniors (61+) ____ ____ ____ ____ ____

Females

Young (up to age 12) ____ ____ ____ ____ ____

Teens (13–19) ____ ____ ____ ____ ____

Adult (20–60) ____ ____ ____ ____ ____

Seniors (61+) ____ ____ ____ ____ ____


tailed notes and completing structured observa-
tion forms. One section of the form, shown in
Figure 8.5, instructed observers to make notes
of specific types of neighborhood problems that
were discussed at the meeting (Bennis, Skogan,
and Steiner 2003). Here is an excerpt from the
narrative notes that supplemented this section
of the form:

They . . . had very serious concerns in regard
to a dilapidated building in their block that
was being used for drug sales. The drug sell-
er’s people were also squatting in the base-
ment of the building. The main concern
was that the four adults who were squat-
ting also had three children under the age
of four with them. (Chicago Community
Policing Evaluation Consortium 2003, 37)

about using appropriate measures and data
collection methods. Simply saying “I am go-
ing to conduct an observational study of youth
gangs” restricts the research focus at the outset
to the kinds of information that can be gath-
ered through observation. Such a study may be
useful and interesting, but a researcher is better
advised to consider what data collection meth-
ods are necessary in any particular study.

A long-term research project on community
policing in Chicago draws on data from surveys,
field observation, and police records (Chicago
Community Policing Evaluation Consortium
2003). As just one example, researchers
studied what sorts of activities and discussions
emerged at community meetings in 130 of the
city’s 270 police beats that covered residential
areas. Observers attended meetings, making de-

5. Location code (circle one)

1. Police station 7. Hospital

2. Park building 8. Public housing facility

3. Library 9. Private facility

4. Church 10. Restaurant

5. Bank 11. Other not-for-profit

6. Other government

Count the house 30 minutes after the meeting. Exclude police in street
clothes . . . and others that you can identify as non-residents.

8. ______ Total number residents attending

Note Problems Discussed

1. Drugs (include possibles)

____ ____ Street sales or use
big small Building used for drugs
Drug-related violence

9. Physical decay

____ ____ Abandoned buildings
big small Run-down buildings
Abandoned cars
Graffiti and vandalism
Illegal dumping

Figure 8.5 Excerpts from Chicago Beat Meeting Observation Form
Source: Adapted from Bennis, Skogan, and Steiner (2003, Appendix 1).


Field Research on Speeding and
Traffic Enforcement
Field research has been an important element in
studies of racial profiling for two reasons. First,
field research has provided measures of driver
behavior that are not dependent on police records.
As we have seen in earlier chapters, it is
important to compare police records of stops
to some other source of information. Second,
field research has provided insights into traffic
enforcement, an area of policing not much
studied by researchers. Field research has also
covered the wide range of applications from
highly structured counting to less structured
field observation and interviews.

Field Measures of Speed Studies of racial
profiling in three states used highly structured
techniques to measure the speed of vehicles.
The most sophisticated equipment was used by
Lange and associates in New Jersey. Here’s how
the authors described their setup:

The digital photographs were captured by
a TC-2000 camera system, integrated with
an AutoPatrol PR-100 radar system, pro-
vided by Transcore, Inc. The equipment,
other than two large strobe lights, was
mounted inside an unmarked van, parked
behind preexisting guide rails along the
turnpike. The camera and radar sensor
pointed out of the van’s back window toward
oncoming traffic. The two strobe
lights were mounted on tripods behind the
van and directed toward oncoming traffic.
Transcore’s employees operated the equip-
ment. (2005, 202)

Equipment was programmed to photograph
every vehicle exceeding the speed limit by 15
or more miles per hour. Operators also photo-
graphed and timed samples of 25 to 50 other
vehicles per hour. Elsewhere we have described
the other element of observation— coding the
appearance of driver race from photographs.
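
The selection rule described above can be sketched as follows: photograph every vehicle at or above the 15-mile-per-hour margin, and draw a sample from the rest. The function names, the 65 mph limit, and the random draw are assumptions made for illustration; the source does not say how the other vehicles were chosen.

```python
import random

def should_photograph(speed_mph, limit_mph, margin_mph=15):
    """True when a vehicle exceeds the limit by `margin_mph` or more."""
    return speed_mph - limit_mph >= margin_mph

def sample_other_vehicles(speeds, limit_mph, k, rng=None):
    """Draw up to `k` vehicles from those below the photograph threshold."""
    rng = rng or random.Random()
    others = [s for s in speeds if not should_photograph(s, limit_mph)]
    return rng.sample(others, min(k, len(others)))

# Hypothetical speeds recorded in one hour on a road with an assumed 65 mph limit.
speeds = [62, 70, 81, 66, 79, 90, 68, 73]
speeders = [s for s in speeds if should_photograph(s, 65)]
others = sample_other_vehicles(speeds, 65, k=3, rng=random.Random(0))
print(speeders)  # [81, 90]
```

Keeping the two groups separate matters for analysis: the first group is a complete count of high-speed vehicles, while the second is only a sample of everyone else.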

Pennsylvania researchers also used radar
to measure the speed of vehicles in selected

In addition, observers distributed questionnaires
to community residents and police officers
attending each meeting. Items asked how
often people attended beat meetings, what
sorts of other civic activities they pursued, and
whether they thought various other issues were
problems in their neighborhood. As an example,
the combination of field observation data
and survey questionnaires enabled researchers
to assess the degree of general social activism
among those who attended beat meetings.

Field research can also be conducted after a
survey. For example, a survey intended to mea-
sure fear of crime and related concepts might
ask respondents to specify any area near their
residence that they believe is particularly dangerous.
Follow-up field visits to the named
areas could then be conducted, during which
observers record information about physical
characteristics, land use, numbers of people
present, and so forth.

The box titled “Conducting a Safety Audit”
describes how structured field observations
were combined with a focus group discussion
to assess the scope of environmental design
changes in Toronto, Canada.

The flexibility of field methods is one reason
observation and field interviews can readily be
incorporated into many research projects. And
field observation often provides a much richer
understanding of a phenomenon that is imperfectly
revealed through a survey questionnaire.

Illustrations of
Field Research
Examples illustrate different applications of field research
to study speeding, traffic enforcement, and
violence in bars.

Before concluding this chapter on field research,
let’s examine some illustrations of the
method in action. These descriptions will provide
a clearer sense of how researchers use field
observations and interviews in criminal justice
research.


characteristics. That was an important research
task, however. Engel et al. (2004) describe training
and field procedures in detail. Their simple
field observation form is included in an appendix
to their report (2004, 312).

William Smith and associates (Smith,
Tomaskovic-Devey, Zingraff, et al. 2003) tried
but rejected stationary observation as a tech-
nique for recording speed and observing driv-
ers. They cited the high speed of passing vehi-
cles and glare from windows as problems they
encountered. Instead, a research team used
mobile observation techniques— observing

locations throughout the state. Their proce-
dures were less automated, relying on teams
of two observers in a car parked on the side of
sampled roadways. Undergraduate students at
Pennsylvania State University served as observ-
ers. They were trained by Pennsylvania state po-
lice in the use of radar equipment, completing
the same classroom training that was required
of troopers. Additional training for observers
was conducted on samples of roadways by the
project director. State police were trained to
operate radar equipment, but not to combine
it with systematic field observation of driver

CONDUCTING A SAFETY AUDIT

Gisela Bichler-Robertson
California State University at San Bernardino

A safety audit involves a careful inventory of specific
environmental and situational factors that
may contribute to feelings of discomfort, fear of
victimization, or crime itself. The goal of a safety
audit is to devise recommendations that will improve
a specific area by reducing fear and crime.

Safety audits combine features of focus
groups and structured field observations. To begin,
the researcher assembles a small group of
individuals (10 or fewer) considered to be vul-
nerable. Examples include: senior citizens, physi-
cally challenged individuals, young women who
travel alone, students, youth, and parents with
young children. Assembling diverse groups helps
to identify a greater variety of environmental and
situational factors for the particular area.

After explaining safety audit procedures, an
audit leader then takes the group on a tour of the
audit site. Since perceptions differ by time of day,
at least two audits are conducted for each site—
one during daylight and one after dark.

When touring audit sites, individuals do not
speak to one another. The audit leader instructs
group members to imagine that they are walking
through the area alone. Each person is equipped
with a structured form for documenting their ob-

servations and perceptions. Forms vary, depend-
ing on the group and site. In general, however,
safety audit participants are instructed to docu-
ment the following items:
1. Before walking through the area, briefly describe
the type of space you are reviewing
(e.g., a parking deck, park, shopping district).
Record the number of entrances, general volume
of users, design of structures, materials
used in design, and type of lighting.

2. Complete the following while walking through
the area.

General feelings of safety:
■ Identify the places in which you feel unsafe and uncomfortable.
■ What is it about each place that makes you feel this way?
■ Identify the places in which you feel safe.
■ What is it about each place that makes you feel this way?

General visibility:
■ Can you see very far in front of you?
■ Can you see behind you?
■ Are there any structures or vegetation that restrict your sightlines?
■ How dense are the trees/bushes?
■ Are there any hiding spots or entrapment zones?
■ Is the lighting adequate? Can you see the face of someone 15 meters in front of you?
■ Are the paths/hallways open or are they very narrow?
■ Are there any sharp corners (90° angles)?


Observing New Jersey State Police Other
research in New Jersey used less structured field
observation techniques. This was because the
research purpose was less structured: learning
about the general nature of traffic enforcement
on New Jersey highways. In a series of studies,
researchers from Rutgers University (Maxfield
and Kelling 2005; Maxfield and Andresen 2002;
Andresen 2005) were interested in the mechanics
of making traffic stops and what kinds of
things troopers considered in deciding which
vehicles to stop. Researchers have long ac-
companied municipal police on patrol, and a

drivers and timing cars that passed them. Ra-
dar was also considered and rejected because it
was feared vehicles having radar detectors, said
to be common in North Carolina, would slow
down when nearing the research vehicle. Worse,
Smith and associates report that truck drivers
quickly broadcast word of detected radar, thus
eroding the planned unobtrusive measure.

As you can see, the observational component
of research in these three states varied quite a
lot. Reading the detailed reports from each
study offers valuable insights into the kinds of
things field researchers must consider.

Perceived control over the space:
■ Could you see danger approaching with enough time to choose an alternative route?
■ Are you visible to others on the street or in other buildings?
■ Can you see any evidence of a security system?

Presence of others:
■ Does the area seem to be deserted?
■ Are there many women around?
■ Are you alone in the presence of men?
■ What do the other people seem to be doing?
■ Are there any undesirables: vagrants (homeless or beggars), drunks, etc.?
■ Do you see people you know?
■ Are there any police or security officers present?

General safety:
■ Do you have access to a phone or other way of summoning help?
■ What is your general perception of criminal behavior?
■ Are there any places where you feel you could be attacked or confronted in an uncomfortable way?

Past experience in this space:
■ Have you been harassed in this space?
■ Have you heard of anyone who had a bad experience in this place (any legends or real experiences)?
■ Is it likely that you may be harassed here (e.g., drunk young men coming out of the pub)?
■ Have you noticed any social incivilities (minor deviant behavior, i.e., public drinking, vandalism, roughhousing, or skateboarding)?
■ Is there much in the way of physical incivilities (broken windows, litter, broken bottles, vandalism)?

Following the site visit, the group finds a secure
setting for a focused discussion of the various
elements they identified. Harvesting observations
about good and bad spaces helps to develop recommendations
for physical improvement. Group
members may also share perceptions and ideas
about personal safety. This process should begin
with a brainstorming discussion and finish with
identifying the key issues of concern and most
reasonable recommendations for addressing
those issues.

This method of structured observation has
proven to be invaluable. Much of the public space
in Toronto, including university campuses, public
parks, transportation centers, and garages, has
been improved through such endeavors.

Source: Adapted from materials developed by the
Metro Action Committee on Public Violence
Against Women and Children (METRAC) (Toronto,
Canada: METRAC, 1987). Used by permission.


able nor needed. A fascinating series of studies
of violence in Australian bars by Ross Homel
and associates provides an example (Homel,
Tomsen, and Thommeny 1992; Homel and
Clark 1994).

Homel and associates set out to learn how
various situational factors related to public
drinking might promote or inhibit violence in
Australian bars and nightclubs. Think for a mo-
ment about how you might approach their re-
search question: “whether alcohol consumption
itself contributes in some way to the likelihood
of violence, or whether aspects of the drinkers
or of the drinking settings are the critical fac-
tors” (1992, 681). Examining police records
might reveal that assault reports are more likely
to come from bars than from, say, public libraries.
Or a survey might find that self-reported
bar patrons were more likely to have witnessed
or participated in violence than respondents
who did not frequent bars or nightclubs. But
neither of these approaches will yield measures
of the setting or situational factors that might
provoke violence. Field research can produce
direct observation of barroom settings and is
well suited to addressing the question framed
by Homel and associates.

Researchers began by selecting 4 high-risk
and 2 low-risk sites on the basis of Sydney po-
lice records and preliminary scouting visits.
These 6 sites were visited five or more times,
and an additional 16 sites were visited once in
the course of scouting.

Visits to bars were conducted by pairs of ob-
servers who stayed two to six hours at each site.
Their role is best described as complete partici-
pant because they were permitted one alcoholic
drink per hour and otherwise played the role of
bar patron. Observers made no notes while on-
site. As soon as possible after leaving a bar, they
wrote up separate narrative accounts. Later, at
group meetings of observers and research staff,
the narrative accounts were discussed and any
discrepancies resolved. Narratives were later
coded by research staff to identify categories of
situations, people, and activities that seemed to
be associated with the likelihood of violence.

number of studies have documented their efforts.
But, as Andresen points out, only a handful of
studies have examined traffic enforcement, and
even fewer considered state police.

To study video recording cameras in state
police cars, Maxfield and Andresen (2002) rode
with state police and watched the equipment in
use. They learned that sound quality of record-
ings was often poor, for a variety of reasons as-
sociated with microphones and wireless trans-
mittal. It was initially hoped that video records
might make it possible to classify the race of
drivers, but after watching in-car video monitors
the researchers confirmed that poor image quality
undermined the potential reliability of that
approach. The Rutgers researchers
expected that troopers would be on their best
behavior. But they did witness actions by troop-
ers to avoid recording sound and/or images on
a few occasions. Even though people behave
differently when accompanied by researchers,
it’s not uncommon for police to let their guard
down a little.

Andresen accompanied troopers on 57 pa-
trols overall, conducting unstructured inter-
views during the several hours he spent with
individual troopers. He adopted the common
practice of using an interview guide, a list of
simple questions he planned to ask in the field.
He took extensive notes while riding, and re-
peatedly told troopers they could examine his
notes. Andresen observed more than 150 traffic
stops, writing field notes to document who
was involved, reasons for the stop, what actions
troopers took, and post-stop comments from
troopers. He reports that most troopers seemed
to enjoy describing their work. And, as you
might imagine, troopers’ commentary about
traffic enforcement was very interesting.

Bars and Violence
Researchers in the first example conducted systematic
observations for specific purposes and
produced quantitative estimates of speeding
traffic stops. Field research is commonly used
in more qualitative studies as well, in which
precise quantitative estimates are neither avail-


working-class males. However, these personal
characteristics were deemed less important
than the flow of people in and out of bars. Violent
incidents were often triggered when groups
of males entered a club and encountered other
groups of males they did not know.

Physical features mattered little unless they
contributed to crowding or other adverse char-
acteristics of the social atmosphere. Chief
among social features associated with violence
were discomfort and boredom. A crowded, un-
comfortable bar with no entertainment spelled
trouble.

Drinking patterns made a difference; vio-
lent incidents were most common when bar
patrons were very drunk. More importantly,
certain management practices seemed to pro-
duce more drunk patrons. Fewer customers
were drunk in bars that had either a restaurant
or a snack table. Bars with high cover charges
and cheap drinks produced a high density of
drunk patrons and violence. The economics of
this situation are clear: if you must pay to enter
a bar that serves cheap drinks, you’ll get more
for your money by drinking a lot.

The final ingredient found to contribute
to violence was aggressive bouncers. “Many
bouncers seem poorly trained, obsessed with
their own machismo (relating badly to groups
of male strangers), and some of them appear
to regard their employment as a license to as-
sault people” (1992, 688). Rather than reducing
violence by rejecting unruly patrons, bouncers
sometimes escalated violence by starting fights.

Field observation was necessary to identify
what situations produce violence in bars. No
other way of collecting data could have yielded
the rich and detailed information that enabled
Homel and associates to diagnose the complex
relationships that produce violence in bars:

Violent incidents in public drinking loca-
tions do not occur simply because of the
presence of young or rough patrons or be-
cause of rock bands, or any other single
variable. Violent occasions are charac-
terized by subtle interactions of several

These eventually included physical and social
atmosphere, drinking patterns, characteristics
of patrons, and characteristics of staff.

The researchers began their study by assum-
ing that some thing or things distinguished
bars in which violence was common from those
in which it was less common. After beginning
their fieldwork, however, Homel and associates
(1992, 684) realized that circumstances and sit-
uations were the more important factors:

During field research it soon became apparent
that the violent premises are for
most of the time not violent. Violent oc-
casions in these places seemed to have
characteristics that clearly marked them
out from nonviolent times. . . . This unexpectedly
helped us refine our ideas about
the relevant situational variables, and to
some extent reduced the importance of
comparisons with the premises selected as
controls.

In other words, the research question was
partly restated. What began as a study to deter-
mine why some bars in Sydney were violent was
revised to determine what situations seemed to
contribute to violence.

This illustrates one of the strengths of field
research—the ability to make adjustments
while in the field. You may recognize this as an
example of inductive reasoning. Learning that
even violent clubs were peaceful most of the
time, Homel and associates were able to focus
observers’ attention on looking for more specific
features of the bar environment and staff
and patron characteristics. Such adjustments
on the fly would be difficult, if not impossible,
if you were doing a survey.

Altogether, field observers made 55 visits
to 23 sites, for a total of about 300 hours of
field observation. During these visits, observers
witnessed 32 incidents of physical violence.
Examining detailed field notes, researchers attributed
violent incidents to a variety of interrelated
factors.

With respect to patrons, violence was most
likely to break out in bars frequented by young,


estimates for a large population of behaviors
beyond those actually observed. However, because
it is difficult to know the total population
of given phenomena—shoppers or drivers, for
example—precise probability samples cannot
normally be drawn. In designing a quantitative
field study or assessing the representativeness of
some other study, researchers must think care-
fully about the density and predictability of what
will be observed. Then they must decide whether
sampling procedures are likely to tap represen-
tative instances of cases they will observe.

More generally, the advantages and disadvantages
of different types of field studies can
be considered in terms of their validity, reli-
ability, and generalizability. As we have seen,
validity and reliability are both qualities of
measurements. Validity concerns whether mea-
surements actually measure what they are sup-
posed to, not something else. Reliability is a
matter of dependability: if researchers make the
same measurement again and again, will they
get the same result? Note that some examples
we described in this chapter included special
steps to improve reliability. Finally, generalizability
refers to whether specific research findings
apply to people, places, and things not actually
observed. Let’s see how field research stacks up
in these respects.

Validity
Recall our discussion in Chapter 7 of some
of the limitations of using survey methods to
study domestic violence. An alternative is a
field study in which the researcher interacts
at length with victims of domestic violence.
The relative strengths of each approach are
nicely illustrated in a pair of articles that ex-
amine domestic violence in England. Chapter 7
quoted from Catriona Mirrlees-Black’s (1995)
article on domestic violence as measured in
the British Crime Survey. John Hood-Williams
and Tracey Bush (1995) provide a different
perspective through their study published
in the same issue of the Home Office Research
Bulletin.

variables. Chief among these are groups
of male strangers, low comfort, high boredom,
high drunkenness, as well as aggressive
and unreasonable bouncers and floor
staff (1992, 688).

Strengths and Weaknesses
of Field Research
Validity is usually a strength of field research,
but reliability and generalizability are sometimes
weaknesses.

As we have seen, field research is especially effective
for studying the subtle nuances of behavior
and for examining processes over time. For
these reasons, the chief strength of this method
is the depth of understanding it permits.

Flexibility is another advantage of field research.
Researchers can modify their research
design at any time. Moreover, they are always
prepared to engage in qualitative field research
if the occasion arises, whereas launching a survey
requires considerable advance work.

Field research can be relatively inexpensive.
Other research methods may require costly
equipment or a large research staff, but field
research often can be undertaken by one re-
searcher with a notebook and a pen. This is not
to say that field research is never expensive. The
studies of speeding and race profiling required
many trained observers. Expensive recording
equipment may be needed, or the researcher
may wish to travel to Australia to replicate the
study by Homel and associates.

Field research has its weaknesses, too. First,
qualitative studies seldom yield precise descrip-
tive statements about a large population. Ob-
serving casual discussions among corrections
officers in a cafeteria, for example, does not
yield trustworthy estimates about prison condi-
tions. Nevertheless, it could provide important
insights into some of the problems facing staff
and inmates in a specific institution.

Second, field observation can produce systematic
counts of behaviors and reasonable


As the writing proceeded, we read various
parts of the manuscript to selected mem-
bers of our sample. This allowed us to
check our interpretations against those of
insiders and to enlist their help in reformu-
lating passages they regarded as misleading
or inaccurate. . . . The result of using this
procedure, we believe, is a book that faith-
fully conveys the offender’s perspective on
the process of committing residential bur-
glaries. (1994, 33–34)

This approach is possible only if subjects are
aware of the researcher’s role as a researcher. In
that case, having informants review draft field
notes or interview transcripts can be an excel-
lent strategy for improving validity.

Reliability
Qualitative field research can have a potential
problem with reliability. Suppose you charac-
terize your best friend’s political orientations
based on everything you know about him or
her. There’s certainly no question that your
assessment of that person’s politics is at least
somewhat idiosyncratic. The measurement you
arrive at will appear to have considerable valid-
ity. We can’t be sure, however, that someone
else will characterize your friend’s politics the
same way you do, even with the same amount
of observation.

Field research measurements, even in-depth
ones, are also often very personal. If, for example,
you wished to conduct a field study of
bars and honky-tonks near your campus, you
might judge levels of disorder on a Friday night
to be low or moderate. In contrast, older adults
might observe the same levels of noise and
commotion and rate levels of disorder as intol-
erably high. How we interpret the phenomena
we observe depends very much on our own ex-
periences and preferences.

The reliability of quantitative field studies
can be enhanced by careful attention to the de-
tails of observation. Environmental surveys in
particular can promote reliable observations

Tracey Bush lived in a London public hous-
ing project (termed housing estate in England)
for about five years. This enabled her to study
domestic violence in a natural setting: “The
views of men and women on the estate about
relationships and domestic violence have been
gathered through the researcher’s network of
friends, neighbours, acquaintances, and con-
tacts” (Hood-Williams and Bush 1995, 11).
Through long and patient fieldwork, Bush
learned that women sometimes normalize low
levels of violence, seeing it as an unfortunate
but unavoidable consequence of their relation-
ship with a male partner. When violence esca-
lates, victims may blame themselves. Victims
may also remain in an abusive relationship in
hopes that things will get better:

She reported that she wanted the compan-
ionship and respect that she had received
at the beginning of the relationship. It was
the earlier, nonviolent man, whom she had
met and fallen in love with, that she wanted
back. (1995, 13)

Mirrlees-Black (1995) notes that measuring domestic
violence is “difficult territory” in part
because women may not recognize assault by a
partner as a crime. Field research such as that
by Hood-Williams and Bush offers an example
of this phenomenon and helps us understand
why it exists.

In field research, validity often refers to
whether the intended meaning of the things
observed or people interviewed has been ac-
curately captured. In the case of interviews, Jo-
seph Maxwell (2005) suggests getting feedback
on the measures from the people being studied.
For example, Wright and Decker (1994) con-
ducted lengthy semistructured interviews with
their sample of burglars. The researchers recog-
nized that their limited understanding of the
social context of burglary may have produced
some errors in interpreting what they learned
from subjects. To guard against this, Wright
and Decker had some of their subjects review
what they thought they had learned:


several forms. One experience involved learn-
ing about radar speed enforcement on a 50-mile
segment of the New Jersey Turnpike. Maxfield
accompanied troopers on a thorough tour of
this segment, identifying where radar units were
routinely stationed (termed fishing holes by troopers).
In his fieldwork, he also examined physical
characteristics of the roadway; patterns of in-
and out-of-state travel; and areas where entrance
ramps, slight upward grades, and other features
affected vehicle speed. Finally, he gained exten-
sive information on priorities and patterns
in speed enforcement—learning what affects
troopers’ decisions to stop certain vehicles.

As a result, Maxfield has detailed knowledge
about that 50-mile segment of the New Jersey
Turnpike. How generalizable is that knowledge?
In one sense, learning about fishing holes
in very specific terms can help identify such
sites on other roads. And learning how slight
upward grades can slow traffic in one situation
may help us understand traffic on other upward
grades. But a detailed, idiosyncratic under-
standing of 50 miles of highway is just that—id-
iosyncratic. Knowing all there is to know about
a straight, largely level stretch of limited-access
toll road with few exits is not generalizable to
other roadways—winding roads in mountain-
ous areas with many exits, for example.

At the same time, some field studies are less
rooted in the local context of the subject under
study. Wright and Decker (1994) studied bur-
glars in St. Louis, and it’s certainly reasonable to
wonder whether their findings apply to residential
burglars in St. Petersburg, Florida. The actions
and routines of burglars might be affected
by local police strategies, differences in the age
or style of dwelling units, or even the type and
amount of vegetation screening buildings from
the street. However, Wright and Decker draw
general conclusions about how burglars search
for targets, what features of dwellings signal
vulnerability, how opportunistic knowledge
can trigger an offense, and what strategies ex-
ist for fencing stolen goods. It’s likely that their
findings about the technology and incentives

by including detailed instructions on how to
classify what is observed. Reliability can be
strengthened by reviewing the products of field
observations. Homel and associates sought to
increase the reliability of observers’ narrative
descriptions by having group discussions about
discrepancies in reports by different observers.
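
One way to put a number on how closely two observers agree, before any discussion of discrepancies, is to compare their codes item by item. The sketch below is only an illustration and is not the procedure Homel and associates used, which relied on group discussion. It assumes two hypothetical observers have each rated the same six visits as low, moderate, or high disorder, and it computes simple percent agreement along with Cohen's kappa, a commonly used statistic that adjusts agreement for chance. The ratings and function names are invented for the example.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of items on which two observers assigned the same code."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both observers must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = counts_a.keys() | counts_b.keys()
    # Chance agreement: probability both observers pick the same category
    # if each coded independently at his or her own base rates.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical disorder ratings for six bar visits (not real data).
observer_a = ["low", "low", "high", "moderate", "high", "low"]
observer_b = ["low", "moderate", "high", "moderate", "high", "low"]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohens_kappa(observer_a, observer_b)
print(f"Percent agreement: {agreement:.2f}")
print(f"Cohen's kappa: {kappa:.2f}")
```

In this invented example the observers disagree on one visit out of six. Kappa comes out lower than raw agreement because some matches would be expected by chance alone.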

In a more general sense, reliability will in-
crease as the degree of interpretation required
in making actual observations decreases. Par-
ticipant observation or unstructured interviews
may require a considerable degree of interpreta-
tion on the part of the observer, and most of us
draw on our own experiences and backgrounds
in interpreting what we observe. At another ex-
treme, electronic devices and machines can pro-
duce very reliable counts of persons who enter a
store or of cars that pass some particular point.
Somewhere in the middle are field-workers who
observe motorists or pedestrians and tabulate a
specific behavior.

Generalizability
One of the chief goals of social science is gen-
eralization. We study particular situations and
events to learn about life in general. Generalizability
can be a problem for qualitative field research.
It crops up in two forms.

First, the personal nature of the observations
and measurements made by the researcher can
produce results that will not necessarily be rep-
licated by another independent researcher. If
the observation depends in part on the individ-
ual observers, it is more valuable as a source of
particular insight than as a general truth. You
may recognize the similarity between this and
the more general issue of reliability.

Second, because field researchers get a full
and in-depth view of their subject matter, they
can reach an unusually comprehensive under-
standing. By its very comprehensiveness, how-
ever, this understanding is less generalizable
than results based on rigorous sampling and
standardized measurements.

For example, Maxfield’s observational research
with the New Jersey State Police took


• If field observations will be made on a phenomenon
that occurs with some degree of regularity,
purposive sampling techniques can be used
to select cases for observation.

• Alternatives for recording field observations
range from video, audio, and other equipment
to unstructured field notes. In between are observations
recorded on structured forms; environmental
surveys are examples.

• Field notes should be planned in advance to the
greatest extent possible. However, note taking
should be flexible enough to make records of
unexpected observations.

• Compared with surveys, field research measurements
generally have more validity but less reliability,
and field research results cannot be
generalized as safely as those based on rigorous
sampling and standardized questionnaires.

✪ Key Terms
environmental survey, p. 216
snowball sampling, p. 210

✪ Review Questions and Exercises
1. Think of some group or activity you participate
in or are familiar with. In two or three
paragraphs, describe how an outsider might ef-
fectively go about studying that group or activ-
ity. What should he or she read, what contacts
should be made, and so on?

2. Review the box titled “Conducting a Safety Au-
dit” by Gisela Bichler-Robertson on page 220.
Try conducting a safety audit on your campus
or in an area near your campus.

3. Many police departments encourage citizen
ride-alongs as a component of community policing.
If this is the case for a police or sheriff’s
department near you, take advantage of this
excellent opportunity to test your observation
and unstructured interviewing skills.

✪ Additional Readings
Bureau of Justice Assistance, A Police Guide to Surveying
Citizens and Their Environment (Washington,
DC: U.S. Department of Justice, Office of
Justice Programs, Bureau of Justice Assistance,
1993). Intended for use in community policing
initiatives, this publication is a useful source
of ideas about conducting structured observa-
tions. Appendixes include detailed examples

that affect St. Louis burglars apply generally to
residential burglars in other cities.

In reviewing reports of field research projects,
it’s important to determine where and to
what extent the researcher is generalizing beyond
her or his specific observations to other
settings. Such generalizations may be in order,
but it is necessary to judge that. Nothing in this
research method guarantees it.

As we’ve seen, field research is a potentially
powerful tool for criminal justice research, one
that provides a useful balance to surveys.

✪ Main Points
• Field research is a data collection method that
involves the direct observation of phenomena
in their natural settings.

• Field observation is usually the preferred data
collection method for obtaining information
about physical or social settings, behavior, and
events.

• Field research in criminal justice may pro-
duce either qualitative or quantitative data.
Grounded theory is typically built from qualitative
field observations. Observations that can
be quantified may produce measures for hypothesis
testing.

• Observations made through field research can
often be integrated with data collected from
other sources. In this way, field observations can
help researchers interpret other data.

• Asking questions through a form of specialized
interviewing is often integrated with field
observation.

• Field researchers may or may not identify themselves
as researchers to the people they are observing.
Being identified as a researcher may
have some effect on what is observed.

• Preparing for the field involves negotiating or
arranging access to subjects. Specific strategies
depend on whether formal organizations,
subcultures, or something in between are being
studied.

• Controlled probability sampling techniques are
not usually possible in field research.

• Snowball sampling is a method for acquiring an
ever-increasing number of sample observations.
One participant is asked to recommend others
for interviewing, and each of these other partic-
ipants is asked for more recommendations.


Patton, Michael Quinn, Qualitative Research and
Evaluation Methods, 3rd ed. (Thousand Oaks, CA:
Sage, 2001). We mentioned this book in Chap-
ter 7 as a good source of guidance on question-
naire construction. Patton also offers in-depth
information on observation techniques, along
with tips on conducting unstructured and semistructured
field interviews. In addition, Patton
describes a variety of purposive sampling techniques
for qualitative interviewing and field
research.

Smith, Steven K., and Carolyn C. DeFrances, Assess-
ing Measurement Techniques for Identifying Race,
Ethnicity, and Gender: Observation-Based Data Col-
lection in Airports and at Immigration Checkpoints
(Washington, DC: Bureau of Justice Statistics,
2003). Racial profiling and the September 11,
2001, attacks on New York and Washington
prompted researchers and public officials to
consider observational studies of drivers and
others. This report by the Bureau of Justice Sta-
tistics describes experiments to test observation
methods.

of environmental surveys. You can also down-
load this publication in text form (no drawings)
from the Web (www.ncjrs.org/txtfiles/polc.txt;
accessed May 12, 2008).

Felson, Marcus, Crime and Everyday Life, 3rd ed.
(Thousand Oaks, CA: Sage, 2002). We men-
tioned this book in Chapter 2 as an example of
criminal justice theory. Many of Felson’s expla-
nations of how everyday life is linked to crime
describe physical features of cities and other
land use patterns. This entertaining book suggests
many opportunities for conducting field
research.

Miller, Joel, Profiling Populations Available for Stops
and Searches (London: Home Office Policing and
Reducing Crime Unit, 2000). Race-biased po-
licing has been a concern in England for many
years. This report presents a thorough descrip-
tion of observation to produce baseline mea-
sures of populations eligible to be stopped by
police. Similar efforts have been underway in
many U.S. states and cities
(www.homeoffice.gov.uk/rds/policerspubs1.html; accessed May
12, 2008).



Chapter 9

Agency Records, Content Analysis,
and Secondary Data
We’ll examine three sources of existing data: agency records, content analysis,
and data collected by other researchers. Data from these sources have many
applications in criminal justice research.

Introduction
Topics Appropriate for Agency Records and Content Analysis
Types of Agency Records
Published Statistics
Nonpublic Agency Records
New Data Collected by Agency Staff
IMPROVING POLICE RECORDS OF DOMESTIC VIOLENCE
Reliability and Validity
Sources of Reliability and Validity Problems
HOW MANY PAROLE VIOLATORS WERE THERE LAST MONTH?
Content Analysis
Coding in Content Analysis
Illustrations of Content Analysis
Secondary Analysis
Sources of Secondary Data
Advantages and Disadvantages of Secondary Data


Introduction
Agency records, secondary data, and content analy-
sis do not require direct interaction with research
subjects.

Except for the complete observer role in field
research, the modes of observation discussed
so far require the researcher to intrude to some
degree into whatever he or she is studying. This
is most obvious with survey research. Even the
field researcher, as we’ve seen, can change things
in the process of studying them.

Other ways of collecting data do not involve
intrusion by observers. In this chapter, we’ll
consider three different approaches to using
information that has been collected by others,
often as a routine practice.

First, a great deal of criminal justice research
uses data collected by state and local agencies
such as police, criminal courts, probation offices,
juvenile authorities, and corrections departments.
Government agencies gather a vast
amount of crime and criminal justice data,
probably rivaled only by efforts to produce
economic and public health indicators. We re-
fer to such information as “data from agency
records.” In this chapter, we will describe dif-
ferent types of such data that are available for
criminal justice research, together with the
promise and potential pitfalls of using infor-
mation from agency records.

Second, in content analysis, researchers ex-
amine a class of social artifacts—written docu-
ments or other types of messages. Suppose you
want to contrast the importance of criminal
justice policy and health care policy for Ameri-
cans in 1992 and 2004. One option is to ex-
amine public-opinion polls from these years.
Another method is to analyze articles from
newspapers published in each year. The latter is
an example of content analysis: the analysis of
communications.
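
The newspaper comparison just described can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a full coding scheme: the term lists in `TOPIC_TERMS`, the function names, and the five sample headlines are all hypothetical and invented for the example. Each article is coded for whether it mentions any term tied to a topic, and the codes are tallied by year.

```python
import re
from collections import Counter

# Hypothetical coding scheme: terms assumed to signal each policy topic.
TOPIC_TERMS = {
    "criminal_justice": ["crime", "police", "prison", "sentencing"],
    "health_care": ["health care", "insurance", "medicare", "hospital"],
}


def code_article(text):
    """Return the set of topics whose terms appear in the article text."""
    lowered = text.lower()
    topics = set()
    for topic, terms in TOPIC_TERMS.items():
        if any(re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms):
            topics.add(topic)
    return topics


def tally_by_year(articles):
    """Count, for each year, how many articles were coded into each topic."""
    counts = {}
    for year, text in articles:
        counts.setdefault(year, Counter()).update(code_article(text))
    return counts


# Invented headlines standing in for sampled newspaper articles.
sample_articles = [
    (1992, "City police report a rise in crime downtown."),
    (1992, "Hospital costs strain family budgets."),
    (2004, "Debate over health care and insurance coverage continues."),
    (2004, "Lawmakers revisit sentencing rules as prison rolls grow."),
    (2004, "Medicare drug benefit enrollment opens."),
]

counts = tally_by_year(sample_articles)
for year in sorted(counts):
    print(year, dict(counts[year]))
```

Counting each article once per topic, rather than counting every keyword hit, keeps a single long article from dominating the tally. A real study would also need a defensible sample of articles and a tested list of terms.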

Finally, information collected by others is
frequently used in criminal justice research,
which in this case involves secondary analysis
of existing data. Investigators who conduct
research funded by federal agencies such as the
National Institute of Justice are usually obliged
to release their data for public use. Thus if you
were interested in developing sentence reform
proposals for your state, you might analyze data
collected by Nancy Merritt, Terry Fain, and Susan
Turner in the study of Oregon’s efforts to
increase sentence length for certain types of offenders
(Merritt, Fain, and Turner 2006).

Keep in mind that most data we might ob-
tain from agency records or research projects
conducted by others are secondary data. Some-
one else gathered the original data, usually for
purposes that differ from ours.

Topics Appropriate for
Agency Records and
Content Analysis
Agency records support a wide variety of research
applications.

Data from agency records or archives may have
originally been gathered in any number of ways,
from sample surveys to direct observation. Be-
cause of this, such data may, in principle, be
appropriate for just about any criminal justice
research topic.

Published statistics and agency records are
most commonly used in descriptive or explor-
atory studies. This is consistent with the fact
that many of the criminal justice data pub-
lished by government agencies are intended to
describe something. For instance, the Bureau
of Justice Statistics (BJS) publishes annual
figures on prison populations. If we are interested
in describing differences in prison populations
between states or changes in prison
populations from 1990 through 2005, a good
place to begin is with the annual data published
by BJS. Similarly, published figures on
crimes reported to police, criminal victimiza-
tion, felony court caseloads, drug use by high
school seniors, and a host of other measures are
available over time—for 25 years or longer in
many cases.


Agency records may also be used in explana-
tory studies. Nancy Sinauer and colleagues
(1999) examined medical examiner records
for more than 1,000 female homicide victims
in Nort