Elishka Johnson – Public Safety Assessment (PSA) in Utah Courts

Developed by the Laura and John Arnold Foundation (LJAF) in 2013 and deployed in Utah state courts in 2018, the Public Safety Assessment (PSA) attempts to predict a defendant’s risk of recidivism and of failure to appear at their court hearing. The algorithm was developed to address the shortcomings of the bail system, in which judges decide individually whether defendants can safely await their trial or case resolution in the community or must be held in jail until their court date. Judges are provided little to no information about a defendant’s personal or criminal background when they receive a probable cause affidavit. As a result, judges use a bail schedule if they decide to hold the defendant in jail, or release them if they think they pose a negligible safety risk to the public. However, bail schedules do not treat all defendants fairly: they do not guarantee that unsafe individuals will be kept in jail until their court date, and they may keep a safe person in jail simply because that person cannot afford to pay bail. This system reflects and reinforces existing biases in society. Poorer people are more likely to be held in jail, and even short jail stays can have severe impacts on the lives of defendants and their families; they may lose their jobs, housing, or child custody, for example (Laura and John Arnold Foundation). 

The PSA was designed to provide judges with additional information so as to make the pretrial system more fair, and to protect the public’s safety against high-risk individuals. The output scores help judges assess risk in two areas. First, the PSA attempts to predict the likelihood that a defendant will commit a new crime if released before trial. Second, it attempts to predict the likelihood that the defendant will fail to appear for their court hearing. The PSA also flags defendants who pose an elevated risk of committing a new violent crime while awaiting their trial or the resolution of their case (Laura and John Arnold Foundation). 

A study of the PSA would need to address two essential questions. First, there is no data regarding the efficacy of the algorithm in Utah, and we need to assess its ability to accurately predict recidivism in the state; continuing to use the algorithm before it is locally validated, and before we can confidently predict its effects on our communities, may cause serious and long-lasting harm to individuals and their families. Second, we should know whether the nine inputs used to calculate the scores are proxies for other demographics, such as race, income, or geography, to determine whether the algorithm perpetuates existing biases in the penal system. The consequences of these knowledge gaps are severe. If we cannot determine whether the PSA is safe and effective, we risk exacerbating inequalities in the criminal justice system. The purpose of the PSA is to make the pretrial system more fair. If we know that the PSA is unfair, then we can work to find alternatives to the pretrial release and detention system that prevent further harm to individuals and properly address longstanding social issues. 

To answer the first question would require access to data and information not on the public record. Defendants’ PSA scores and demographic information are not permanently posted on Xchange, the publicly available criminal case repository. To fully audit the PSA and test its efficacy, I would need data on individual defendants’ risk scores and on whether they appeared in court, committed a new crime, or committed a new violent crime. With this data, I could test the PSA’s effectiveness by following up with defendants after the pretrial period and comparing their risk scores with their actual behavior. The data would also need to include personal demographic information, such as race, age, gender, and geography. I would then compare the error rates between the various demographic groups to determine whether the algorithm perpetuates racist, ageist, sexist, or geographical bias. For example, if the algorithm over-predicts risk for Black defendants yet under-predicts risk for white defendants, I would petition to stop using the PSA and find an alternative. It is important to test each of these variables because, even in isolation, bias in one could perpetuate longstanding, serious social problems. 
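To make the proposed comparison concrete, the following is a minimal sketch of how the group-wise error-rate check could be run once the data described above were obtained. The column names ("race," "psa_score," "reoffended"), the sample values, and the cutoff used to treat a score as "high risk" are illustrative assumptions of my own; they do not reflect the actual Xchange data layout or the PSA's scoring rules.

```python
import pandas as pd

# Hypothetical follow-up data; the column names and values are illustrative
# placeholders, not the real Xchange or PSA schema.
records = pd.DataFrame({
    "race":       ["Black", "Black", "Black", "White", "White", "White"],
    "psa_score":  [5, 4, 2, 6, 3, 1],   # assumed 1 (low) to 6 (high) scale
    "reoffended": [0, 1, 0, 1, 0, 0],   # 1 = new crime during the pretrial period
})

# Assume, for this sketch only, that scores of 4 or higher count as "high risk."
records["flagged_high_risk"] = records["psa_score"] >= 4

# A false positive is a defendant flagged high risk who did not reoffend;
# a false negative is a defendant not flagged who did reoffend.
for race, group in records.groupby("race"):
    did_not_reoffend = group[group["reoffended"] == 0]
    did_reoffend = group[group["reoffended"] == 1]
    fpr = did_not_reoffend["flagged_high_risk"].mean()   # over-prediction rate
    fnr = (~did_reoffend["flagged_high_risk"]).mean()    # under-prediction rate
    print(f"{race}: false positive rate = {fpr:.2f}, false negative rate = {fnr:.2f}")
```

A large gap between groups in either rate, measured on real follow-up data rather than on this toy table, would be the kind of disparate over- or under-prediction that would justify petitioning to stop using the PSA.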

The above audit design would be difficult to conduct for several reasons. It would be costly: access to criminal records on Xchange requires either payment for individual case files or a subscription to the database, with one-time use starting at $5.00 and a monthly subscription costing $40.00. A defendant’s risk score is only available during the life of their case, making it difficult to obtain a sufficient and unbiased sample at a single point in time. It would be necessary to ask for defendants’ consent to use their criminal records for research purposes, and the potential for nonresponse bias means that collecting a representative sample would be challenging. In addition, the PSA has been implemented in Utah in tandem with other institutional reforms, so isolating its effects from those of the reforms poses another study design challenge. Finally, judges may reference the PSA score at their discretion, so it cannot be assumed that every defendant’s release or detention reflects a PSA score recommendation. 

Answering the second question, whether the nine inputs to the algorithm could be proxies for race, gender, income, or other demographics, can be done with less intensive research methods. We can determine whether the inputs could perpetuate bias by considering them individually and analyzing their historical connections to the criminal justice system. The nine PSA inputs are as follows (Laura and John Arnold Foundation):

1. Age at current arrest
2. Current violent offense
3. Pending charge at the time of the offense
4. Prior misdemeanor conviction
5. Prior felony conviction
6. Prior violent conviction
7. Prior failure to appear at a pretrial hearing in the past two years
8. Prior failure to appear at a pretrial hearing more than two years ago
9. Prior sentence to incarceration
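In addition to the historical analysis that follows, the same defendant-level data requested for the first question could support a simple quantitative proxy check: for each input, measure how often each demographic group scores positively on it. The sketch below is only illustrative; the column names, the sample values, and the choice of a plain cross-group comparison are my own assumptions rather than anything drawn from the PSA documentation.

```python
import pandas as pd

# Hypothetical defendant-level data; column names and values are illustrative only.
data = pd.DataFrame({
    "race":                         ["Black", "Black", "White", "White", "Black", "White"],
    "prior_misdemeanor_conviction": [1, 1, 0, 1, 1, 0],   # PSA factor 4
    "prior_felony_conviction":      [1, 0, 0, 0, 1, 0],   # PSA factor 5
})

# If the share of defendants scoring positively on a factor differs sharply by
# group, that factor carries information about group membership and can act as
# a proxy even though race is never entered into the algorithm directly.
for factor in ["prior_misdemeanor_conviction", "prior_felony_conviction"]:
    rates = data.groupby("race")[factor].mean()
    print(f"Share scoring positively on {factor}:")
    print(rates.to_string(), "\n")
```

Such a check cannot, by itself, say whether a factor is unfair, but it can flag which inputs most strongly mirror demographic lines and therefore deserve the closer historical scrutiny given below.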

While the LJAF claims that race, ethnicity, and geography are not considered by the algorithm, the nine variables above can still perpetuate bias without directly drawing upon these demographics. This is especially concerning for factors 4 and 5 and factors 7 and 8, as outlined below. To determine whether each factor risks perpetuating bias, we must think about why a defendant might score positively on it. We must also consider the history of policing disparities and prejudice in the United States. A defendant’s risk score is not calculated in isolation from history. Because the PSA itself does not address or correct existing biases in the criminal justice system, we can expect those biases to continue to appear in the score. Factor 4, “Prior misdemeanor conviction,” is problematic because it seems to ignore the fact that racial minorities are more heavily policed and more likely to be convicted of a misdemeanor than whites. According to a study conducted by the Data Collaborative for Justice at John Jay College, in the jurisdictions researchers studied across the country, “Black people were arrested at the highest rates of any racial/ethnic group for all jurisdictions across the entire study period.” While the “arrest rate for Black people saw the steepest decline” over the study period, “the arrest rate for Black people was still the highest of any racial/ethnic group at study end,” and “racial disparities in arrest rates did not necessarily improve.” Arrest rates for Latinx people “[were] generally the second-highest and almost always between the rates of Black people and White people” (Cadoff et al., 2020). The study examined data from 2001 to 2018, which demonstrates that racial disparities in misdemeanor enforcement are not new and will appear in the PSA’s treatment of defendants. 

For reasons similar to those for factor 4, factor 5, “Prior felony conviction,” will reinforce existing bias in how felonies are policed by race. For example, the Anti-Drug Abuse Act of 1986, signed by President Ronald Reagan, established “a racially discriminatory 100:1 sentencing disparity between crack and powder cocaine.” One would need to possess 100 times more powder cocaine, consumed mostly by “richer, White demographics,” to merit the same punishment as possessing crack cocaine, which is consumed disproportionately by African Americans. Even after President Barack Obama signed the Fair Sentencing Act in 2010 to partially address this disparity and reduce the ratio from 100:1 to 18:1, “6.3 percent of people sentenced under these laws are White, while 77.1 percent are Black” (The Leadership Conference on Civil and Human Rights, 2021). While factors 4 and 5 may not directly address or mention race, they will inevitably perpetuate racial biases. If a person of color is more likely to be arrested and convicted of a misdemeanor or felony, then the PSA will consistently score a person of color as posing a greater risk than a white defendant of failing to appear in court or committing another crime. 

My concern with factors 7 and 8 is that they do not provide a full or detailed enough history of a defendant to support conclusions about them. These factors ask whether a defendant has failed to appear at a prior court hearing in the past two years or failed to appear at a hearing more than two years ago. Neither factor provides a full picture of why the defendant failed to appear. Appearing in court can be burdensome for an individual who cannot miss work for fear of losing their job, who must care for children, or who does not have access to proper legal advice. While part of the purpose of the PSA is to determine whether a defendant will appear in court, I do not feel that a simple “yes or no” system sufficiently captures an individual’s history in such a way that a judge can draw well-informed conclusions. 

The auditing process was enlightening for several reasons. First, it pushed me to think more critically when organizations claim to provide “miracle solutions” for deeply systemic problems. Second, it gave me more tools for examining artificial intelligence systems in the future and made me more aware of my own biases in how I think about technology and respond to marketing. 

Based on what I read from the LJAF and the Utah court system about the PSA, this algorithm is being pushed into a similar limelight as other recent developments in A.I. There are numerous examples in which an algorithm developer creates something with the intention of solving a social issue, but the “solution” does not address the root causes of the issue and so exacerbates the problem. This is true of other risk assessment algorithms, such as those used for medical claims review and money lending. These algorithms are developed with the intention of resolving a need, and therein lies the fundamental problem: they resolve only the need, providing a band-aid for a problem in isolation from its history and root causes. The PSA is not the only risk assessment tool deployed in the United States. The COMPAS algorithm also attempts to predict recidivism, but “blacks are almost twice as likely as whites to be labeled a higher risk but not actually re-offend,” while “[whites] are much more likely than blacks to be labeled lower risk but go on to commit other crimes” (Angwin et al., 2016). While it is encouraging that additional pretrial system reforms are being implemented, artificial intelligence will only have a rightful place in this domain after we have resolved the social issues that perpetuate bias based on race, gender, income, and other factors. 

To examine the PSA deeply, I had to think about how the LJAF was trying to market their product and how that would affect their published work about it. I was skeptical of the PSA from the beginning because I could not find any acknowledgment of the algorithm’s possible flaws. This makes sense for a monetized product, but I worry that the PSA is being subjected to the same technological glorification as other recent developments in A.I. The existing research claiming to validate the PSA is often funded by the LJAF, and the tool is not always locally validated before implementation, as federal recommendations advise. It is also surprising that the FAQ page on the Utah Courts website linked only documents produced by the LJAF; it was clear that no outside or third-party work had been done to vet the algorithm, at least none that the court system wanted to make available on its website. This meant that I had to look beyond the available information from the primary sources to determine the PSA’s possible flaws. While this may sound like an obvious research strategy, I initially assumed that the LJAF and the Utah court system would provide this information for the sake of honesty and transparency. If the LJAF and the Utah court system are operating in good faith by implementing the PSA, they should express greater concern about testing the software and being transparent about the results. 

Works Cited 

Angwin, J., Larson, J., Kirchner, L., & Mattu, S. (2016, May 23). Machine bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing 

Cadoff, B., Chauhan, P., & Bond, E. (2020, October). Misdemeanor enforcement trends across seven U.S. jurisdictions. New York, NY: Data Collaborative for Justice. 

The Leadership Conference on Civil and Human Rights. (2021, November). It’s time to end the racist and unjustified sentencing disparity between crack and powder cocaine. https://civilrights.org/blog/its-time-to-end-the-racist-and-unjustified-sentencing-disparity-between-crack-and-powder-cocaine/ 

Laura and John Arnold Foundation. (2016). Public Safety Assessment: Risk Factors and Formula. Houston.