New College of Florida
Natural Sciences Division
5800 Bay Shore Road, Sarasota, Florida 34243-2109
941-487-4370 | Fax: 941-487-4396

March 8, 2017

Chief Judge Charles Williams
Twelfth Judicial Circuit Court of Florida

Dear Chief Judge Williams,

I write to report on the methodology used in the recent Sarasota Herald Tribune series entitled "Bias on the Bench." The series claims to have analyzed more than 84 million criminal records in reaching the conclusion that many judges in the Florida Criminal Justice System sentence black defendants to significantly longer periods of incarceration than white defendants for the same crime. While an analysis of bias in the Florida Criminal Justice System is long overdue, I have concluded that the methodology employed by the Herald Tribune is deeply flawed and that their conclusions cannot be adopted with confidence. What the Herald Tribune has done, however, has the potential to inform future work intended to identify and remove bias. Accordingly, I offer more detailed comments below.

The Herald Tribune built a database of records obtained from the Florida Offender Based Tracking System (OBTS) and the Florida Department of Corrections (DOC) spanning the years 2004-2016. In making pronouncements regarding bias on the part of individual judges, they restricted attention to third degree felonies, for which they produced rough statistical aggregates. My criticisms of their approach are primarily of two kinds: shortcomings involving transparency and shortcomings involving control.

Because the charge of bias is leveled at individuals and is serious, it is essential that the methods by which conclusions are reached be transparent. As a minimum standard, the Herald Tribune should have provided

- Access to the raw data used to come to the conclusion;
- The software and code used to clean and analyze the data;
- A discussion of how the results of the analysis are being interpreted;
- A measure of confidence in their conclusions.

Because the Herald Tribune did not provide this information, it is impossible to know precisely what they did, let alone check the validity and reproducibility of their results. Starting at the top, without access to the data it is impossible to assess the quality of the data, and the quality of the data limits the confidence one might have in any subsequent conclusion. The data required extensive cleaning, a process that involves making decisions as to what part of the data is retained and interpreted as part of a judge's sentencing record. Without a detailed description of how this was done, it is not possible to provide any measure of confidence in what constitutes the record, and without knowing the record it is not possible to assess in what sense it is biased. It appears that the Herald Tribune relied on some type of averaging, but without precise information about what constitutes a record, it is impossible to compute an average, let alone determine whether the particular average computed is a good representation of the behavior it purportedly describes. Each bullet point thus represents a serious problem that must be addressed in order for one to independently make sense of and assess the Herald Tribune's claims.

In addition to problems of transparency, there are a number of problems involving control, a technical term that refers to the extent to which the measurements (sentence length) can be attributed to a variable of interest (an individual judge's bias).
My most serious objections to the Herald Tribune's methodology from the point of view of control are

- Failure to control for confounding by plea deals;
- Failure to control for time-dependent guidelines and sentencing norms;
- Failure to control for strong coupling of charges in the computation of aggregates.

Each point above describes a well-known feature of the criminal justice system. For example, it is well known that upwards of 95% of all sentences involve no trial, and that in the overwhelming majority of such cases a plea deal has been reached independent of the judge in whose court the plea is formally accepted. Not controlling for this variable and subsequently ascribing bias to the judge in their sentencing is, at face value, a very bad idea: at best it misconstrues how the criminal justice system works; at worst it lays the blame for possible bias on the part of other agents in the criminal justice system at the feet of the judge who happened to be the last stop in the criminal justice system before incarceration. In both cases, the net outcome may well be a failure to recognize where reform would do the most good. The other two bullet points similarly undermine confidence in conclusions: in the case of time-dependent guidelines and sentencing norms, it is well known that the war on drugs took, and continues to take, a disproportionate toll on different ethnic groups with different intensities over the course of the last twenty years. Comparing records that terminate in 2004 with records that terminate much later therefore introduces a confounder that undermines confidence.

These concerns are not just theoretical, nor are they exhaustive: I have examined the OBTS data for the Twelfth Circuit and witnessed instances where each of the above problems (and a number of others) surfaced (e.g., the data includes missing fields and incorrect entries, the data required extensive cleaning, the structure of the distribution of felony charges by degree and race is completely different when accounting for plea deals, for cases involving multiple charges versus cases involving a single charge, and for judges whose records terminate in 2004 compared to judges active from 2004-2014, and small-sample problems arise when seeking to determine the same or sufficiently similar crimes). These observations undermine confidence in the Herald Tribune's conclusions, and, in particular, the charge of individual bias is unsupported.

While the claims in the "Bias on the Bench" series are unsupportable, it is clear that the Herald Tribune has done a valuable service in calling attention to the existence of bias in the criminal justice system and in putting together a database intended to address the problem. Hopefully, the latter could be used as a springboard to provide a more comprehensive, methodologically sound investigation of the Florida Criminal Justice System with an eye toward improving the status quo.

Sincerely,

Patrick McDonald
Director of Data Science
Professor of Mathematics


Preliminary Bias Report

C. Dowdy, C. Edelson, C. Leonard and P. McDonald
February 17, 2017

Introduction

Our criminal justice system is sustained by the belief that all citizens are to be afforded fair and equal treatment under the law. It is no secret, however, that our prison population is skewed by race: people of color are over-represented as a proportion of those incarcerated.
Investigating why this is the status quo and how the status quo can be addressed is an important and difficult problem, one deserving careful attention.

Recently, the Sarasota Herald Tribune ran a four-part series addressing the issue of bias in the Florida Criminal Justice System. Their series, entitled "Bias On The Bench", lays the blame for the status quo squarely at the feet of Florida's judges.1 The series purports to be data-driven, the result of a year of work involving millions of cases. It names names: the series ascribes biased sentencing behavior to specific judges and in so doing impugns their credibility and the credibility of the system they represent. It does not provide the raw data from which the conclusions are drawn, it does not provide the algorithms or software used to clean and process the data, and it does not provide a clear discussion of how the processed data is being interpreted. Because the charge of bias is serious, it is essential that it be well founded.

In this note we review the data associated to two of the judges named in the Sarasota Herald Tribune series, Judge Lee Haworth and Judge Charles Williams. We provide both a description of the data used for our study and a link to the raw data. We provide a description of how we process the data and the software and algorithms we use. We provide a discussion of how we interpret the results of our processed data. Our report is thus fully reproducible: anyone who downloads the data can check the results we report. We have constructed the code to be reusable and we have provided an example of how those interested might use our tools to investigate data associated to other judges. Our main conclusions are as follows:

- Including data involving cases where a plea bargain is accepted undermines any confidence one might have in the inferences drawn regarding bias in the sentence (see Figures 1 and 2).
- Inferences involving bias on the part of a sentencing judge that are based on data aggregated on felony degree can carry no confidence (because such aggregation treats each felony charge as an independent event).
- Identifying a common notion of "same crime" can lead to problems of small sample size (see Figures 6, 7, and 8). A clear statement of the protocol used to treat such problems, and transparency with processing, is required for there to be any confidence in inferences drawn.

Because the Sarasota Herald Tribune series did not describe precisely how they treated the data, reproduction of their results is not possible. It appears that their analysis did not isolate the data involving plea bargains. It also seems to have relied on precisely the aggregate measures that cannot provide inferences that can be trusted. In trying to reproduce the Tribune's results for the data involving Judge Williams, we computed the aggregate statistics suggested in the Herald Tribune series (see Figure 4). As is clear from the visualization, there is no evidence in the data we examined that, for defendants tried and found guilty of the same crime, Judge Williams assigns longer sentences to black defendants than to white defendants. We provide an analysis of the data for Judge Haworth with analogous findings.

The remainder of this report is structured as follows. In the next section we discuss the data with which we work, the software we use, and the initial data processing protocols.
Following that we provide a discussion of what constitutes bias in sentencing, a discussion which drives the code that is subsequently constructed. Next, we provide code which produces a preliminary analysis of the subset of the data corresponding to observations related to Judge Williams. Following this we produce code which allows us to analyze data involving length of sentence. The next section involves a discussion of how to construct identifiers for cases which might provide a basis for establishing meaningful measures of bias, and, more importantly, why this is necessary. A second example involving data associated to Judge Haworth follows and indicates how the code for Judge Williams can be used to validate other claims made in our analysis. We conclude with a discussion of future work.

Data, software and initial processing

Data maintained by the Offender Based Transaction System (OBTS)2 was obtained for cases heard in the Florida Twelfth Circuit via a Freedom of Information Act request. The raw data consisted of a PostgreSQL database. A separate communication provided a data dictionary (OBTS Criminal Justice Data Element Dictionary). The data dictionary included a description of the 105 features (columns) associated to each of the roughly four million observations (rows) contained in the database.

The raw data was provided in the form of a compressed SQL dump file, so recovering the database followed the typical procedure outlined below. We begin by creating an empty database to be used during the restoration process:

    CREATE DATABASE obts;

Once the empty database has been created, we can exit PostgreSQL and use the utility program pg_restore to facilitate the recovery from our dump file:3

    pg_restore -d [database name] [dump file name]
    pg_restore -d obts obts_back.psql

With the database recovered, we subset the records such that the judge at the time of sentencing was Charles Williams. To account for clerical errors upon data entry, we ran the following query before proceeding:

    SELECT "SP_JudgeatSentencing", count(*) AS "Frequency"
    FROM obts_data
    WHERE "SP_JudgeatSentencing" ILIKE 'williams%'
    GROUP BY 1;

The results of the query are shown below:

    SP_JudgeatSentencing    Frequency
    WILLIAMS, CHARLES I     18574
    WILLIAMS, CHARLES       29537

There appear to be two variations for our judge of interest; we will address this redundancy later in our analysis. Finally, we exported all of the records involving Charles Williams to a CSV file to be further processed using R. The syntax required to export the results of a query to a CSV file is shown below:4

    \copy (SELECT * FROM obts_data WHERE "SP_JudgeatSentencing" ILIKE 'williams%') to [output file] with CSV HEADER

The quality of the raw data leaves much to be desired. A quick scan reveals a number of missing values, fields where values have been incorrectly entered, fields in which information has been hashed, and fields where coded information might be coerced and corrupted unless care is taken in processing. Following the above protocol, all data corresponding to Charles Williams and Lee Haworth was extracted as a CSV file and saved to disk (Judge Williams appeared under two names, Judge Haworth under three). While this data contained a number of missing values, the fields coding for length of confinement were intact. This data was the starting point of the initial analysis.

To process the data we employed the statistical programming language R and the RStudio IDE. Both R and RStudio are available for free download.5 They represent the state of the art for statistical processing packages.
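As a point of reference, loading one of the exported files into R might look like the following minimal sketch; the file name williams_obts.csv is a placeholder for whatever name was supplied to the \copy command above, not the name used in our workflow.

    ## Read the exported OBTS records into R.
    ## NOTE: "williams_obts.csv" is a placeholder file name; adjust the path to
    ## match your own export.
    library(readr)

    williams_raw <- read_csv("williams_obts.csv")

    ## The export should contain one row per OBTS observation and 105 columns.
    dim(williams_raw)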
Thus, we begin with raw data consisting of 68286 observations, each exhibiting 105 features.

Defining the problem to be studied: What constitutes bias?

The initial analysis sought to address the hypothesis "Judge Williams' sentencing record indicates a bias against black defendants." To address the hypothesis we were forced to clarify a number of points:

1. What constitutes Judge Williams' sentencing record?
2. What constitutes racial bias?
3. What constitutes racial bias on the part of Judge Williams?

To address the first point, we followed the Sarasota Herald Tribune's lead and focused on observations involving felonies. For our initial study we limited our analysis to those observations for which there was a trial, the trial involved a felony, the felony was committed by an adult, and the trial resulted in a guilty outcome.

"Bias" is a charged term deserving careful treatment. We use the word "bias" to mean "a pattern of behavior for which the same crime is treated differently and the difference is causally related to race." Causal relations are the gold standard in matters statistical and they are notoriously difficult to establish. As our analysis is preliminary, we instead investigate the correlation between race and length of sentence for the same crime, noting that if there is a causal relationship, the data will exhibit a correlation between race and sentence length. In particular, if there is racial bias in sentencing, we would expect to see a significant difference in mean sentence length, and there are standard statistical tools to test such a proposition. Because we have a complete record of the trials over which Judge Williams presided, we can augment these rough measures with a more nuanced investigation of the data.

Given the above, we can formulate what we mean by "racial bias on the part of Judge Williams": to demonstrate racial bias, we will study cases in which Judge Williams is free to exercise discretion in sentencing, and we will require that there be a significant pattern in which Judge Williams sentences black defendants to longer sentences than he sentences white defendants for the same crime.

Defining what is meant by "the same crime" is a challenging problem that requires a trade-off between granularity in the information that is being processed and the creation of general tools designed to be used to study the sentencing behavior of any judge in the Florida Criminal Justice System. As a first pass, one might consider aggregating observations according to charge and degree. There are serious problems with such an approach: there are cases for which the charges against the defendant are not restricted to the aggregate classes (for example, cases in which a defendant is charged with multiple felony counts of different degree). It is obvious that such cases do not involve "the same crime" as cases involving a single charge, and therefore drawing inferences on data that has been aggregated on charge and degree does not necessarily provide information on bias as defined above. To address this feature of the data would require the construction of unique case identifiers, that is, collections of observations in the data that define unique cases; a sketch of the idea follows.
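The following is a minimal sketch of what such a construction might look like, not the construction we ultimately intend to use. It assumes the cleaned data frame Williams_clean built later in this report, and the column names OBTSNumber, CourtDocketNumber, SentenceDate and FELONY_DEGREE are placeholders for the corresponding fields in the data dictionary.

    ## Sketch: group observations into candidate "cases" using the identifiers
    ## that tie charges to a single proceeding, then record each case's charge
    ## profile so that cases with identical profiles can be compared.
    ## Column names below are placeholders for the data dictionary fields.
    library(dplyr)

    cases <- Williams_clean %>%
      group_by(OBTSNumber, CourtDocketNumber, SentenceDate) %>%
      summarise(
        n_charges   = n(),                                          # charges in the case
        profile     = paste(sort(FELONY_DEGREE), collapse = "-"),   # e.g. "S-T-T"
        race        = first(RACE),
        total_years = max(STOTAL),                                  # sentence length attached to the case
        .groups     = "drop"
      )

    ## Cases sharing the same profile are candidates for "same crime" comparisons.
    cases %>% count(profile, race)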
Once unique cases have been defined, cases which are the same (for example, involve the same charges and counts) can be compared. While these constructions are beyond the scope of this preliminary analysis, they are necessary to establish a claim of bias in sentencing.

Initial data processing: studying felony and degree

Having established what we mean by bias, we turn to the data. We refer to each row of the data as an observation. Each observation has associated values for 105 features (though some rows may not contain values for all features). It is important to realize that it is very rare for an observation to correspond to a crime or a trial: crimes and trials almost always involve more than one observation. Sentences are associated to trials, which are collections of observations defining a crime. Our goal is to produce collections of trials that lend themselves to an analysis that addresses whether or not there is bias in the sentencing record of Judge Williams. This will require a great deal of work.

Our first course of action is to replace column numbers with names appearing in the data dictionary:

    ## change column names

Next, we filter the data to obtain the records for which Charles Williams is the sentencing judge, the prosecutor files charges, the defendant is not a juvenile, the crime is a felony and the defendant is found guilty.

    ## The processing below uses the dplyr and ggplot2 packages.
    library(dplyr)
    library(ggplot2)

    ## remove white space
    Data[, 89] <- trimws(Data[, 89])   # strip stray whitespace from the judge name field
                                       # (the specific call is assumed; the original is garbled)

    ## initial data filter:
    ## first reduce to data for Charles Williams, then reduce to prosecutor files
    ## charges, then reduce to "not juvenile", then reduce to felony charges,
    ## then reduce to guilty outcomes
    Williams_data <- Data %>% filter( ... )   # conditions use the coded fields named
                                              # in the data dictionary

    ## check size
    dim(Williams_data)
    ## 18301 105

Our initial filtration results in 18301 observations of 105 features. We can produce a preliminary view of the data:

    ggplot(Williams_data, aes(x = FELONY_DEGREE, fill = RACE)) +   # FELONY_DEGREE is a placeholder name
      geom_bar() +
      facet_grid(TRIAL ~ .) +
      labs(title = "Judge Charles Williams: Distribution of felonies by degree, faceted on trial") +
      labs(y = "Number of records") +
      labs(x = "Degree of felony") +
      labs(caption = "Figure 1: Comparison of cases involving trials (2 and 3) and cases involving no trial (1)")

[Figure 1: Comparison of cases involving trials (2 and 3) and cases involving no trial (1). Stacked bar chart of the number of records by degree of felony, faceted on trial type.]

Figure 1 provides a look at how the felony data is distributed across degree (F - first degree felonies, L - life, P - first degree felonies punishable by life, S - second degree felonies, T - third degree felonies), faceted on trial (1 - no trial occurred, 2 - trial by jury, 3 - trial by judge). The top plot, labeled 1, displays the data on race in the case that no trial occurred; the middle one, labeled 2, displays the data when there was a jury trial; and the lower plot, labeled 3, displays the data when the judge tried the case. There is a common scale for the plots. The relative size of the boxes in the top plot (no trial) versus the boxes in the lower two plots gives an accurate visual display of the collapse in the data: it is hard to detect the boxes in the middle and bottom plots because the amount of data in the top plot far, far exceeds the amount of data in the other two. The plots include a description of how the felony data is distributed according to race. We note that for third degree felonies (the largest of the categories for plead cases), white defendants dominate the observations.
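As a cross-check on the visual impression from Figure 1, the underlying counts can be tabulated directly; the sketch below uses the same placeholder column names as the plotting call above.

    ## Tabulate observations by trial type and felony degree (wide layout).
    ## TRIAL and FELONY_DEGREE are the placeholder names used above.
    library(tidyr)

    Williams_data %>%
      count(TRIAL, FELONY_DEGREE) %>%
      pivot_wider(names_from = FELONY_DEGREE, values_from = n, values_fill = 0)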
While it is important to keep in mind that we have yet to construct a unique identifier for each case, it is possible to get some idea of how much of the data involves plea bargains, in which the judge has minimal discretion with respect to sentencing. To do so, we filter on cases that involved a trial. The data dictionary indicates that the value 1 in the TRIAL field indicates that no trial occurred.

    w_data <- Williams_data %>% filter(TRIAL != 1)
    dim(w_data)
    ## 544 105

We call attention to how the data collapses when one restricts to cases that were tried: there are 544 observations remaining from the 18301 observations with which we began (i.e., we are working with about 3 percent of the felony data). This is only part of the story: when restricting to cases in which a trial occurred, the distributions by race change dramatically:

    ggplot(w_data, aes(x = FELONY_DEGREE, fill = RACE)) +
      geom_bar() +
      labs(title = "Judge Charles Williams: Felonies by degree for observations with trial") +
      labs(y = "Number of records") +
      labs(x = "Degree of felony") +
      labs(caption = "Figure 2: Stacked barplot giving distribution of observations involving felony trials by race")

[Figure 2: Stacked barplot giving the distribution of observations involving felony trials by race.]

Figure 2 provides detail that is suppressed when including data involving plead cases. There are two important features to note:

1. The number of third degree felony observations relative to the number of observations of either first or second degree felonies has changed dramatically.
2. The relative distribution by race of third degree felony observations has changed dramatically.

These observations and the questions they engender suggest that a very careful treatment of the data is in order. We begin by including information involving length of confinement.

Length of confinement

The length of confinement is coded as a fourteen character sequence. This sequence is comprised of two subfields of length seven. The first subfield codes for the length of the minimal time to be served; the second for the maximal time to be served. For each subfield,

- The first three characters represent the number of years in the sentence.
- The fourth and fifth characters represent the number of months in the sentence.
- The last two characters represent the number of days in the sentence.

We will focus attention on the second subfield. It is easy to check that all of the data is formatted correctly. The data dictionary indicates that when length of confinement is not applicable, the code "8888888" is to be entered. We count and remove these instances.

    ## Trim to work with the second subfield
    ## (CONFINEMENT is a placeholder name for the length-of-confinement field)
    w_data$CONFINEMENT <- gsub("^.......", "", w_data$CONFINEMENT)   # drop the first seven characters

    ## Deal with the NA data by building a tag
    ## Write a function to generate a logical vector that marks "8888888"
    mark_na <- function(x) {
      as.character(x) == "8888888"
    }

    ## Use the function to mark the NA observations

Thus, we have eliminated 59 observations in which the length of sentence field recorded "Not Applicable." Next, we isolate death sentences. From the data dictionary, observations involving death sentences correspond to codes of the form "9999999":

    a <- sum(w_data$CONFINEMENT == "9999999")
    sprintf("There are %s observations involving death sentences.", a)
    ## "There are 0 observations involving death sentences."

This indicates that there are 0 observations related to death sentences recorded in the data involving Judge Williams.
According to the data dictionary, observations associated to life sentences are coded as "9999998":

    a <- sum(w_data$CONFINEMENT == "9999998")
    sprintf("There are %s observations involving life sentences.", a)
    ## "There are 2 observations involving life sentences."

This indicates that there are 2 observations associated to life sentences recorded in the data involving Judge Williams. We record the felony degree and race corresponding to the life sentence data, as well as the associated OBTS number and docket number:

    ## OBTSNumber and CourtDocketNumber are placeholder names for the
    ## corresponding data dictionary fields
    LIFE <- w_data %>% filter(CONFINEMENT == "9999998")
    a <- n_distinct(LIFE$OBTSNumber)
    b <- n_distinct(LIFE$CourtDocketNumber)
    sprintf("There are %s distinct OBTS numbers and %s distinct court docket numbers associated to life sentences.", a, b)
    ## "There are 1 distinct OBTS numbers and 1 distinct court docket numbers associated to life sentences."

What this means is that the two observations in the data corresponding to life sentences are associated to a single OBTS identification number and court docket number. Thus, the two observations involve the same trial, illustrating, as acknowledged above, the need to find unique identifiers for cases before making inferences involving the data.

The entry "9999998", if converted to years, is larger than 999. We can filter the data for other observations with sentences in excess of 500 years:

    LONG <- w_data %>% filter(as.numeric(substring(CONFINEMENT, 1, 3)) > 500)
    a <- n_distinct(LONG$OBTSNumber)
    b <- n_distinct(LONG$CourtDocketNumber)
    c <- n_distinct(LONG$SentenceDate)
    sprintf("There are %s distinct OBTS numbers, %s distinct court docket numbers and %s distinct sentence dates associated to sentences in excess of 500 years.", a, b, c)
    ## "There are 2 distinct OBTS numbers, 2 distinct court docket numbers and 2 distinct sentence dates associated to sentences in excess of 500 years."

These results indicate that the observations not involving the previously discovered life sentence are all tied to the same OBTS number, the same court docket number and the same sentencing date. It is possible that Judge Williams sentenced the defendant in question to 999 years (as indicated in the data). It is also possible that there was an error in data entry. In either case, if we filter the data to study sentences less than 500 years, we know that we lose two cases amounting to 9 observations.

To continue our initial investigation of length of confinement, we write a function which converts the length of confinement data to years. We start with the first three characters. Our function reads the first three characters, converts them to numeric and writes a column called SYEARS.

    ## Williams_clean: w_data with the 59 "Not Applicable" observations removed
    Years <- function(x) {
      y <- substring(x, 1, 3)
      y <- as.numeric(y)
      return(y)
    }
    Williams_clean <- Williams_clean %>% mutate(SYEARS = Years(CONFINEMENT))

As mentioned above, there are at least 9 observations that involve sentences of more than 500 years. We filter on occurrences of such sentences and deal with them as special cases:

    Williams_clean <- Williams_clean %>% filter(SYEARS < 500)

To convert months to years, drop the first three characters and convert the next two by dividing by 12:

    Months <- function(x) {
      y <- substring(x, 4, 5)
      y <- as.numeric(y) / 12
      return(y)
    }
    Williams_clean <- Williams_clean %>% mutate(SMONTHS = Months(CONFINEMENT))

Now finish with the days:

    Days <- function(x) {
      y <- substring(x, 6, 7)
      y <- as.numeric(y) / 365   # divisor assumed
      return(y)
    }
    Williams_clean <- Williams_clean %>%
      mutate(SDAYS = Days(CONFINEMENT)) %>%
      mutate(STOTAL = SYEARS + SMONTHS + SDAYS)

Because there are 476 observations in the clean data, there were 9 observations corresponding to sentences over 500 years. Thus, all the observations involving sentences of more than 500 years have been accounted for. In fact, only observations with an associated sentence of less than 40 years remain:

    summary(Williams_clean$STOTAL)
    ##  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    ##  0.00    5.00   25.00   21.51   40.00   40.00
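For reference, the piecewise conversion above can be collected into a single helper that maps the fourteen-character confinement code directly to a sentence length in years. This is a sketch, not part of our processing pipeline: the handling of the special codes and the divisor for days are assumptions consistent with the data dictionary description given earlier, and the example value is invented for illustration.

    ## Convert the fourteen-character confinement code to years, using the
    ## second seven-character subfield (the maximal time to be served).
    confinement_to_years <- function(code) {
      maxfield <- substring(code, 8, 14)            # second subfield
      special  <- maxfield %in% c("8888888",        # not applicable
                                  "9999998",        # life sentence
                                  "9999999")        # death sentence
      years  <- as.numeric(substring(maxfield, 1, 3))
      months <- as.numeric(substring(maxfield, 4, 5))
      days   <- as.numeric(substring(maxfield, 6, 7))
      out <- years + months / 12 + days / 365       # day divisor assumed
      out[special] <- NA                            # special codes carry no length
      out
    }

    ## Invented example: no minimum term, maximum term of 5 years and 6 months.
    confinement_to_years("00000000050600")
    ## 5.5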
We can visualize how charges are distributed in our clean data set:

    ggplot(Williams_clean, aes(x = FELONY_DEGREE, fill = RACE)) +
      geom_bar() +
      labs(title = "Judge Williams: Felonies by degree for observations with trial and sentence") +
      labs(y = "Number of records") +
      labs(x = "Degree of felony") +
      labs(caption = "Figure 3: Distribution of observations involving felony trials by race for clean data")

[Figure 3: Distribution of observations involving felony trials by race for clean data. Stacked bar chart of the number of records by degree of felony.]

Treating charges as independent, we can visualize the distribution of sentence length for observations as a function of race, faceted on felony degree:

    ggplot(Williams_clean, aes(x = RACE, y = STOTAL, fill = RACE)) +
      geom_boxplot() +
      facet_wrap(~ FELONY_DEGREE, scales = "free_y") +   # exact faceting call assumed
      labs(title = "Judge Charles Williams sentence length by race, faceted on felony degree") +
      labs(y = "Length of sentence in years") +
      labs(caption = "Figure 4: Sentence length as a function of race for felony degree: aggregate data.")

[Figure 4: Sentence length as a function of race for felony degree: aggregate data. Paired boxplots of sentence length in years by race, faceted on felony degree.]

As mentioned in the Introduction, data aggregated on degree of felony cannot lead to inferences involving sentencing bias. Nonetheless, such aggregates provide information concerning how sentences are distributed and they are easy to compute: we do so in Figure 4. Figure 4 features paired boxplots: shaded regions represent the interquartile range, dark horizontal lines represent median values, vertical bars indicate the range of the data, and points represent outliers. We include Figure 4 because we are interested in displaying what happens to aggregates when one restricts attention to data that excludes plea deals. The intended message is: even if there were reason to believe that such statistics provide a signal in which we can have confidence, in the case of Judge Williams the aggregates still provide no meaningful signal for detecting bias.
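To make concrete the "standard statistical tools" mentioned in the bias-definition section, a naive comparison of mean sentence length by race on these aggregates could be run as below. We stress that such a test inherits all of the problems of the aggregation (coupled charges, no common notion of "same crime"), so its output should not by itself be read as evidence for or against bias; the felony-degree column name and the race codes "B" and "W" are assumptions.

    ## Naive aggregate comparison: sentence length by race within third degree
    ## felony observations, charges treated as independent.
    ## FELONY_DEGREE and the race codes are placeholder assumptions.
    third_degree <- Williams_clean %>%
      filter(FELONY_DEGREE == "T", RACE %in% c("B", "W"))

    ## Welch two-sample t-test of mean sentence length by race.
    t.test(STOTAL ~ RACE, data = third_degree)

    ## A rank-based alternative, less sensitive to the skew visible in Figure 4.
    wilcox.test(STOTAL ~ RACE, data = third_degree)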
In order to address questions of bias, we need to study how observations are tied to trials and how trials are tied to sentencing. We begin such a study, highlighting obstructions as we proceed.

Connecting crimes to sentence outcomes: towards constructing unique identifiers and establishing case comparisons

As indicated in Figure 4, there are 476 observations involving felony charges for which a sentence was imposed. As previously mentioned, Figure 4 is the result of an analysis that treats these charges as independent events. Unfortunately, from the point of view of trial and sentencing, they are not independent: defendants are tried for crimes they commit, crimes usually involve a suite of charges, and trials often involve a suite of crimes. In an effort to keep track of what occurs during the course of an investigation, the Florida Criminal Justice System mandates that a number of date stamps, case identification numbers and personal identifiers be maintained. To the extent that they are maintained, these data provide a means of associating a sentence to a crime. To carry out a rigorous analysis of how sentences depend on crimes committed as a function of race is beyond the scope of this preliminary investigation. Instead, we will provide evidence that such an analysis is required if there is to be any confidence in inferences connecting sentences to race.

Our first order of business is to identify which observations constituting our clean data are associated to an individual trial. There is a feature of the data that is intended to accomplish precisely this: the court docket number. According to the data dictionary, a court docket number is assigned "to each case heard by the court to identify it in its files." There are 37 unique court docket numbers, and each value is either a hash or it is missing. This is a surprisingly small number, suggesting that issues of small sample size and missing data will likely play a role in any meaningful analysis.

We begin by visualizing how observations are distributed over court docket numbers and race. To do this we partition the data into subsets given by unique pairs of court docket numbers and race, and we associate a size to every such pair: the number of observations that correspond to the pair. We plot by placing a colored bar of height one at the location on the horizontal axis that records how many observations are associated to the court docket number being plotted. Bars of height greater than 1 therefore represent trials with the same number of associated charges:

    by_CD    <- Williams_clean %>% group_by(CourtDocketNumber, RACE)
    count_CD <- by_CD %>% tally()

    ggplot(count_CD, aes(x = n, fill = RACE)) +
      geom_histogram(bins = 196) +
      labs(title = "Judge Charles Williams: number of observations by court docket number and race") +
      labs(x = "Number of observations associated to a race-court docket pair") +
      labs(y = "Number of court docket numbers") +
      labs(caption = "Figure 5: Studying the number of observations associated to race-court docket pairs")

[Figure 5: Studying the number of observations associated to race-court docket pairs. Histogram of the number of court docket numbers by the number of observations associated to each race-court docket pair.]

Figure 5 gives some measure of how bad things can be. There are arrests that result in dozens of charges, all of which are tried during a single trial. Such charges are coupled, and the conglomerate of charges defines the crime. Because the total number of observations is small, cases that involve such a large number of charges make comparative analysis difficult: such cases define crimes for which there is no other comparable crime, and thus they play no role in the analysis of bias (in Figure 5 these correspond to cases where the bars have height 1). In fact, from Figure 5 it is clear that for any attempt to analyze bias, small sample problems will abound: the richest sample of comparable crimes seems to be a collection of 4 black defendants and 4 white defendants. It might be the case that things will improve if we can resolve issues involving missing data, but it is unlikely. Nonetheless, we give it a try. We begin with an investigation of the subset of the court docket numbers involving more than 20 of the observations:

    count_CD %>% filter(n > 20)
    ## Source: local data frame [6 x 3]
    ## Groups: RACE

This suggests that observations involving missing court docket numbers account for a significant portion of the data.
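The share of observations affected can be checked directly; as above, CourtDocketNumber is a placeholder field name and missing values are assumed to be read as NA.

    ## Proportion of clean-data observations with no court docket number recorded.
    mean(is.na(Williams_clean$CourtDocketNumber))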
To address this problem we begin by filtering the data with missing court docket numbers and grouping by OBTS number and race:

    ## Split the clean data according to whether a court docket number is present.
    ## OBTSNumber and CourtDocketNumber are placeholder field names, and missing
    ## values are assumed to be read as NA.
    missing_CD <- Williams_clean %>% filter(is.na(CourtDocketNumber))  %>% group_by(OBTSNumber, RACE)
    present_CD <- Williams_clean %>% filter(!is.na(CourtDocketNumber)) %>% group_by(OBTSNumber, RACE)

    count_missing_CD <- missing_CD %>% tally()
    count_present_CD <- present_CD %>% tally()

    ggplot(count_present_CD, aes(x = n, fill = RACE)) +
      geom_histogram(bins = 106) +
      labs(title = "Judge Charles Williams trials by race and counts per trial") +
      labs(x = "Number of OBTS numbers associated to court docket number") +
      labs(y = "Number of unique OBTS numbers") +
      labs(caption = "Figure 6: The number of OBTS numbers associated to unique court docket numbers")

[Figure 6: The number of OBTS numbers associated to unique court docket numbers. Histogram by race of counts per OBTS number for observations with a court docket number.]

In Figure 6 we plot data for which there is a court docket number and we group by OBTS number. There are 34 groups and there are 36 court docket numbers, suggesting OBTS numbers are a good proxy for the court docket numbers which are present. In fact, every court docket number has only one OBTS number associated to it (there are single arrests that are tried over more than one trial). Note that the largest group which constitutes "same crime" corresponds to a collection of 8 cases (4 white defendants and 4 black defendants).

    ggplot(count_missing_CD, aes(x = n, fill = RACE)) +
      geom_histogram(bins = 65) +
      labs(title = "Judge Charles Williams trials by race and counts per trial") +
      labs(x = "Number of observations associated to missing court docket number") +
      labs(y = "Number of unique OBTS numbers") +
      labs(caption = "Figure 7: The number of OBTS numbers without court docket numbers")

[Figure 7: The number of OBTS numbers without court docket numbers. Histogram by race of counts per OBTS number for observations with no court docket number.]

In Figure 7 we plot data for which there is no court docket number and we group by OBTS number and race. There are 23 OBTS numbers plotted in this data. Since there are a total of 48 unique OBTS numbers, this suggests that some of the data without court docket numbers should be folded into the data plotted in Figure 6. From the point of view of improving sample size, the largest possible increase would result in a sample of size 14 with 6 white defendants and 8 black defendants. But is that what happens? We isolate the OBTS numbers that occur in both lists and plot:

    present_OBTS    <- unique(count_present_CD$OBTSNumber)
    move_OBTS       <- missing_CD %>% filter(OBTSNumber %in% present_OBTS)
    count_move_OBTS <- move_OBTS %>% tally()

    ggplot(count_move_OBTS, aes(x = n, fill = RACE)) +
      geom_histogram(bins = 65) +
      labs(title = "Judge Charles Williams trials with OBTS numbers with and without docket numbers") +
      labs(x = "Number of observations associated to missing court docket number, OBTS") +
      labs(y = "Number of unique OBTS numbers") +
      labs(caption = "Figure 8: The number of OBTS numbers with and without court docket numbers")

[Figure 8: The number of OBTS numbers with and without court docket numbers. Histogram by race of counts per OBTS number for observations lacking a court docket number whose OBTS number also appears with one.]

Figure 8 indicates that additional analysis would yield a maximal sample of at most 9 defendants. In particular, sample sizes available for analysis of bias in sentencing will remain very small, even when we restrict to the rough level of equating cases with the same number of charges.
This analysis indicates that an aggregation of data that informs questions of bias in sentencing must involve a schema for dealing with small samples. It is beyond the scope of this investigation to construct such a schema, implement it and analyze the result; we will return to these tasks in a later investigation. Our purpose is simply to demonstrate that meaningful bias analysis cannot occur without transparent treatment of the above.

It is possible that these concerns arise because we have chosen to focus on data associated to Judge Williams. To address such concerns, we run the same analysis on data associated to Judge Lee Haworth. In addition to addressing the extent to which the problems will arise in a general analysis, we provide a guide to how to use our code to investigate the data associated to any judge whose name appears in the data.

A second example

We filter the original data to obtain the records for which Lee Haworth is the sentencing judge, the prosecutor files charges, the defendant is not a juvenile, the crime is a felony and the defendant is found guilty.

    ## remove white space
    Data[, 86] <- trimws(Data[, 86])   # as before, the specific call is assumed

    ## initial data filter (same conditions as for Judge Williams)
    Haworth_data <- Data %>% filter( ... )

Our filtration results in 16607 observations of 105 features. We can produce a preliminary view of the data:

    ggplot(Haworth_data, aes(x = FELONY_DEGREE, fill = RACE)) +
      geom_bar() +
      facet_grid(TRIAL ~ .) +
      labs(title = "Judge Lee Haworth: Distribution of felonies by degree, faceted on trial") +
      labs(y = "Number of records") +
      labs(x = "Degree of felony") +
      labs(caption = "Figure 9: Comparison of cases involving trials (2 and 3) and cases involving no trial (1)")

[Figure 9: Comparison of cases involving trials (2 and 3) and cases involving no trial (1). Stacked bar chart of the number of records by degree of felony, faceted on trial type.]

As was the case with Judge Williams, the columns labeled F, L, S and T correspond to "First", "Life", "Second" and "Third" degree felonies. The column labeled C corresponds to "Capital" cases, those for which the death penalty applies. We filter the data to study observations in which a trial occurred:

    w_data <- Haworth_data %>% filter(TRIAL != 1)
    dim(w_data)
    ## 1500 105

As was the case when studying the data associated to Judge Williams, the data collapses when one restricts to cases that were tried: there are 1500 observations remaining from the 16607 observations with which we began (i.e., we are working with about 9 percent of the felony data). As before, we study to what extent distributions change when restricting to cases in which a trial occurred:

    ggplot(w_data, aes(x = FELONY_DEGREE, fill = RACE)) +
      geom_bar() +
      labs(title = "Judge Lee Haworth: Felonies by degree for observations with trial") +
      labs(y = "Number of records") +
      labs(x = "Degree of felony") +
      labs(caption = "Figure 10: Stacked barplot giving distribution of observations involving felony trials by race")

[Figure 10: Stacked barplot giving the distribution of observations involving felony trials by race.]

As before, the data again collapses: only about 9% of the observations involve trials. We clean the data as we did before:

    ## Trim to work with the second subfield
    w_data$CONFINEMENT <- gsub("^.......", "", w_data$CONFINEMENT)

    ## Use the function to mark the NA observations

Thus, we have eliminated 114 observations in which the length of sentence field is recorded as "Not Applicable." Next, we isolate observations involving death sentences.
From the data dictionary, these correspond to sentences of the form "9999999":

    a <- sum(w_data$CONFINEMENT == "9999999")
    sprintf("There are %s observations associated to death sentences.", a)
    ## "There are 8 observations associated to death sentences."

This indicates that there are 8 observations with a death sentence recorded in the field for length of confinement in the data involving Judge Haworth.

    a <- n_distinct(w_data$OBTSNumber[w_data$CONFINEMENT == "9999999"])
    sprintf("There are %s distinct OBTS numbers associated to death sentences.", a)
    ## "There are 1 distinct OBTS numbers associated to death sentences."

Thus, the 8 observations involving death sentences are associated to 1 OBTS number. Next we consider observations involving life sentences. As before, according to the data dictionary, observations involving life sentences are coded as "9999998":

    a <- sum(w_data$CONFINEMENT == "9999998")
    sprintf("There are %s observations associated to life sentences.", a)
    ## "There are 217 observations associated to life sentences."

This indicates that there are 217 observations with life sentences recorded in the field for length of confinement in the data involving Judge Haworth. We record the felony degree and race corresponding to the life sentence data, as well as the associated OBTS number and court docket number:

    LIFE <- w_data %>% filter(CONFINEMENT == "9999998")
    a <- n_distinct(LIFE$OBTSNumber)
    b <- n_distinct(LIFE$CourtDocketNumber)
    sprintf("There are %s distinct OBTS numbers and %s distinct court docket numbers associated to life sentences.", a, b)
    ## "There are 12 distinct OBTS numbers and 11 distinct court docket numbers associated to life sentences."

What this means is that the observations in the data corresponding to life sentences are associated to 12 OBTS numbers and 11 court docket numbers. The entry "9999998", if converted to years, is larger than 999. We can filter the data for other observations with sentences in excess of 500 years:

    LONG <- w_data %>% filter(as.numeric(substring(CONFINEMENT, 1, 3)) > 500)
    a <- n_distinct(LONG$OBTSNumber)
    b <- n_distinct(LONG$CourtDocketNumber)
    c <- n_distinct(LONG$SentenceDate)
    sprintf("There are %s distinct OBTS numbers, %s distinct court docket numbers and %s distinct sentence dates associated to life sentences.", a, b, c)
    ## "There are 13 distinct OBTS numbers, 12 distinct court docket numbers and 13 distinct sentence dates associated to life sentences."

These results indicate that we have accounted for all observations associated to Judge Haworth involving life sentences and death sentences. If we filter on sentences longer than 500 years, we know that we lose data associated to 12 trials and 13 OBTS numbers. We convert the length of confinement data to years using our previously defined functions:

    Haworth_clean <- Haworth_clean %>% mutate(SYEARS = Years(CONFINEMENT))

We filter on occurrences of such outcomes greater than 500 years and deal with them as separate cases:

    Haworth_clean <- Haworth_clean %>% filter(SYEARS < 500)

    ## Continue conversion
    Haworth_clean <- Haworth_clean %>%
      mutate(SMONTHS = Months(CONFINEMENT)) %>%
      mutate(SDAYS = Days(CONFINEMENT)) %>%
      mutate(STOTAL = SYEARS + SMONTHS + SDAYS)

Because there are 1161 observations in the clean data, there were 225 observations corresponding to sentences over 500 years. Thus, all the observations involving sentences of more than 500 years have been accounted for. In fact, only observations with an associated sentence of less than 100.5 years remain:

    summary(Haworth_clean$STOTAL)
    ##  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    ##  0.00    5.00    9.33   12.68   16.59  100.50

We can visualize how charges are distributed in our clean data set:

    ggplot(Haworth_clean, aes(x = FELONY_DEGREE, fill = RACE)) +
      geom_bar() +
      labs(title = "Judge Haworth: Felonies by degree for observations with trial and sentence") +
      labs(y = "Number of records") +
      labs(x = "Degree of felony") +
      labs(caption = "Figure 11: Distribution of observations involving felony trials by race for clean data")
[Figure 11: Distribution of observations involving felony trials by race for clean data. Stacked bar chart of the number of records by degree of felony.]

Treating charges as independent, we can visualize the distribution of sentence length for observations as a function of race, faceted on felony degree (again, this information does not help with the analysis of bias, but it is information we can collect with ease):

    ggplot(Haworth_clean, aes(x = RACE, y = STOTAL, fill = RACE)) +
      geom_boxplot() +
      facet_wrap(~ FELONY_DEGREE, scales = "free_y") +   # exact faceting call assumed
      labs(title = "Judge Lee Haworth sentence length by race, faceted on felony degree") +
      labs(y = "Length of sentence in years") +
      labs(caption = "Figure 12: Boxplot for sentence length as a function of race for felony degree.")

[Figure 12: Boxplot for sentence length as a function of race for felony degree.]

Finally, we investigate how observations are distributed over trials. We begin by counting the number of trials using unique court docket numbers:

    a <- n_distinct(Haworth_clean$CourtDocketNumber)
    sprintf("There are %s distinct court docket numbers.", a)
    ## "There are 56 distinct court docket numbers."

Once again, there is a relatively small number of trials (56) given the number of observations (1161). We can get an initial plot of the data:

    by_CD    <- Haworth_clean %>% group_by(CourtDocketNumber, RACE)
    count_CD <- by_CD %>% tally()

    ggplot(count_CD, aes(x = n, fill = RACE)) +
      geom_histogram(bins = 256) +
      labs(title = "Judge Lee Haworth: number of observations by court docket number and race") +
      labs(x = "Number of observations associated to a race-court docket pair") +
      labs(y = "Number of court docket numbers") +
      labs(caption = "Figure 12: Studying the number of observations associated to race-court docket pairs")

[Figure 12: Studying the number of observations associated to race-court docket pairs. Histogram of the number of court docket numbers by the number of observations associated to each race-court docket pair.]

We begin with an investigation of the subset of the court docket numbers involving more than 60 of the observations:

    count_CD %>% filter(n > 60)
    ## Source: local data frame [4 x 3]
    ## Groups: RACE

This suggests that observations involving missing court docket numbers account for a significant portion of the data. To address this problem we begin by filtering the data with missing court docket numbers and grouping by OBTS number and race:

    missing_CD <- Haworth_clean %>% filter(is.na(CourtDocketNumber))  %>% group_by(OBTSNumber, RACE)
    present_CD <- Haworth_clean %>% filter(!is.na(CourtDocketNumber)) %>% group_by(OBTSNumber, RACE)

    count_missing_CD <- missing_CD %>% tally()
    count_present_CD <- present_CD %>% tally()

    ggplot(count_present_CD, aes(x = n, fill = RACE)) +
      geom_histogram(bins = 111) +
      labs(title = "Judge Lee Haworth trials by race and counts per trial") +
      labs(x = "Number of OBTS numbers associated to court docket number") +
      labs(y = "Number of unique OBTS numbers") +
      labs(caption = "Figure 13: The number of OBTS numbers associated to unique court docket numbers")

[Figure 13: The number of OBTS numbers associated to unique court docket numbers. Histogram by race of counts per OBTS number for observations with a court docket number.]

In Figure 13 we plot data for which there is a court docket number and we group by OBTS number. There are 56 groups and there are 56 distinct court docket numbers, suggesting OBTS numbers are a good proxy for the court docket numbers which are present. Note that the largest group which constitutes "same crime" corresponds to a collection of 14 cases (8 white defendants and 6 black defendants). As before,
we can investigate how missing data is partitioned amongst OBTS numbers and plot:

    ggplot(count_missing_CD, aes(x = n, fill = RACE)) +
      geom_histogram(bins = 111) +
      labs(title = "Judge Lee Haworth trials by race and counts per trial") +
      labs(x = "Number of observations associated to missing court docket number") +
      labs(y = "Number of unique OBTS numbers") +
      labs(caption = "Figure 14: The number of OBTS numbers without court docket numbers")

[Figure 14: The number of OBTS numbers without court docket numbers. Histogram by race of counts per OBTS number for observations with no court docket number.]

In Figure 14 we plot data for which there is no court docket number and we group by OBTS number and race. There are 30 OBTS numbers plotted in this data. Since there are a total of 80 unique OBTS numbers, this suggests that some of the data without court docket numbers should be folded into the data plotted in Figure 13. From the point of view of improving sample size, the largest possible increase would result in a sample of size 18 with 8 white defendants and 10 black defendants. But is that what happens? We isolate the OBTS numbers that occur in both lists and plot:

    present_OBTS    <- unique(count_present_CD$OBTSNumber)
    move_OBTS       <- missing_CD %>% filter(OBTSNumber %in% present_OBTS)
    count_move_OBTS <- move_OBTS %>% tally()

    ggplot(count_move_OBTS, aes(x = n, fill = RACE)) +
      geom_histogram(bins = 111) +
      labs(title = "Judge Lee Haworth trials with OBTS numbers with and without docket numbers") +
      labs(x = "Number of observations associated to missing court docket number, OBTS") +
      labs(y = "Number of unique OBTS numbers") +
      labs(caption = "Figure 15: The number of OBTS numbers with and without court docket numbers")

[Figure 15: The number of OBTS numbers with and without court docket numbers. Histogram by race of counts per OBTS number for observations lacking a court docket number whose OBTS number also appears with one.]

Thus, the best improvement we can hope for results in a sample of size 17. Again, in this case the sample involves cases with the same number of charges. Further refinement is likely to result in significantly smaller sample sizes, again resulting in the necessity of a method to deal with small samples.

Conclusions and future directions

Our investigation reveals several important features of the data as it relates to questions of bias in sentencing:

- Including the data associated to plea deals greatly distorts any analysis of sentence bias that might be associated to judges.
- Simple aggregates across charge and degree treat each observation as an independent event, contrary to what is known to be the case. Relying on such aggregates to measure bias in sentencing results in inference without confidence.
- When aggregating data across trials to define crimes which can reasonably be compared, one runs into problems involving small samples.

Our first two observations are easy to address: filter the data appropriately and do not use the aggregates that are seductively easy to obtain. The third observation requires much more careful consideration. Dealing with small sample sizes will require a protocol which

- Identifies cases in which similar crimes are being tried;
- Identifies if there is a difference in sentencing for such cases;
- Identifies whether there is a pattern in the sentencing differences.

We intend to develop and implement an algorithm which addresses these three design constraints and we intend to use it to continue our analysis; one possible ingredient is sketched below.
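By way of illustration only, one ingredient such a protocol might use for the second step is an exact permutation test, which remains valid at the sample sizes seen above (for example, 4 black and 4 white defendants). This is a sketch, not the algorithm we intend to develop: it assumes a data frame named comparable holding one row per comparable case, with the RACE and STOTAL columns constructed earlier, and the race codes "B" and "W" are assumptions.

    ## Exact permutation test of the difference in mean sentence length between
    ## black and white defendants within a set of comparable cases. With samples
    ## this small the full permutation distribution can be enumerated, so no
    ## large-sample approximation is needed.
    permutation_test <- function(comparable) {
      x <- comparable$STOTAL[comparable$RACE == "B"]   # race codes assumed
      y <- comparable$STOTAL[comparable$RACE == "W"]
      observed <- mean(x) - mean(y)
      pooled   <- c(x, y)
      k        <- length(x)
      ## enumerate every way of assigning k of the pooled sentences to one group
      splits <- combn(length(pooled), k)
      diffs  <- apply(splits, 2, function(idx) mean(pooled[idx]) - mean(pooled[-idx]))
      ## two-sided p-value: how often a relabelling is at least as extreme as observed
      mean(abs(diffs) >= abs(observed))
    }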
The problem of constructing an appropriate tool to study small sample problems was not the only interesting question that arose in our study that proved to be beyond the scope of a preliminary analysis. Indeed, our preliminary analysis provides far fewer answers than it does questions. For example,

- In studying the difference between Figure 1 and Figure 2 and the difference between Figure 9 and Figure 10, one notes that there is a difference in the proportion of defendants taking plea deals which is correlated to race and charge. Is this really a feature of the data?
- What do the patterns we observe in sentencing mean about the criminal justice system? Are there other correlates for which we see patterns similar to those involving race (for example, representation, location, etc.)?
- Is there a better way to collect data if one is interested in providing transparency for outcomes?

We hope to return to these questions soon.

We have written our code in hopes that others will use it. Our second example was intended in part to show how this might be done: by changing the names of judges we produce new results. We can do the same with other fields. For example, we can study how having a public defender as counsel affects outcomes by changing "race" to an appropriately modified field. It is our hope that others will expand on what we have done to improve the system we share.

1. The Sarasota Herald Tribune series, "Bias on the Bench".
2. More information about the OBTS can be found in the OBTS Criminal Justice Data Element Dictionary.
3. For more information on dumping and restoring a PostgreSQL database, please refer to the PostgreSQL documentation.
4. More on exporting query results can be found in the PostgreSQL documentation.
5. R and RStudio are available for free download.