Atticus Benson
HONOR 3700 Responsible AI
12/3/23
The algorithm I chose to audit is the FICO credit score algorithm, which calculates the widely used credit score. The credit score is incredibly important in a wide variety of financial interactions. Credit scores are the basis on which lenders decide whether to lend to borrowers and on what terms, including home loans, auto loans, credit cards, mortgage rates, and more. Beyond lending, they are also used to judge a person’s reliability, for example when deciding whether an applicant will be a good tenant for housing. Overall, the FICO credit score algorithm is an important algorithm to audit because it influences just about every corner of a person’s adult financial life.
I am particularly interested in auditing this algorithm because of the conversations we had in this class about bias and inequality in AI. On a human level, fairness and equality have always been extremely important to me, and a score that shapes the course of someone’s life based on a few discrete but vaguely described categories seemed likely to be a vulnerable and possibly problematic algorithm. My audit has two goals. The first is to determine whether there is a clear gap in fairness in the way the algorithm is applied to different people, and whether some groups are disproportionately affected by it, for better or worse. The second is to expose the algorithm’s lack of transparency, which itself contributes to the fairness problem. The algorithm is not publicly available, so people can only base their understanding of it on what FICO provides on its website, and this audit aims to show how unclear and incomplete that information is. My central research question is which groups of people are unfairly helped or hurt by the algorithm, and why. Alongside that, I aim to highlight the transparency problems surrounding the FICO credit score algorithm and its calculation.
My blue sky audit of this algorithm would use as many professional resources as possible to reach a statistically sound, firm conclusion backed by large-scale evidence and professional analysis. Its two main features are large-scale data collection and statistical analysis of that data. Ideally, I would create a survey and distribute it at a census-like scale, which would require substantial resources and connections. The survey would ask about respondents’ demographics in specific categories such as age, race, ethnicity, gender, location, occupation, and wealth, along with other identifiers that professionals could show act as proxies for the main demographic categories or have a direct influence on credit score calculation. It would also give respondents the option to anonymously share their credit score alongside their demographic information. I could then pay statisticians to analyze the demographic information against the credit scores and determine whether there is statistical evidence that certain groups are disproportionately affected in the calculation of their credit score because of their identity. This would answer the first part of my research question by establishing a strong link between certain groups of people and their credit scores, but it solves only half of the problem. The other half must be answered by studying the information FICO provides to the public about how credit scores are calculated, and by examining how much each survey respondent really knows about that calculation.
The survey would also include a large number of detailed questions about each person’s knowledge of what goes into a credit score. These could cover the 5 categories that make up the credit score, what each category actually includes, and how much each is supposed to contribute to the overall calculation. This step is also very important: finding out how much the general public knows about what may be the most important algorithm in their daily lives would give great insight into how transparent the company is about its algorithm and into which specific areas most need to change.
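The group comparison at the heart of the blue sky audit could be sketched roughly as follows. This is only an illustration under stated assumptions: the group labels and scores are made up, the function names are hypothetical, and a permutation test is just one simple option that professional statisticians might replace with more careful methods.

```python
import random
from statistics import mean

def group_means(responses):
    """Average self-reported score for each demographic group."""
    by_group = {}
    for group, score in responses:
        by_group.setdefault(group, []).append(score)
    return {group: mean(scores) for group, scores in by_group.items()}

def permutation_p_value(scores_a, scores_b, trials=2000, seed=0):
    """Estimate how often randomly relabeled groups show a gap in
    average score at least as large as the observed gap."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            hits += 1
    return hits / trials

# Made-up survey responses (hypothetical): (group label, reported score)
sample = [("A", 700), ("A", 720), ("A", 710),
          ("B", 650), ("B", 640), ("B", 660)]
means = group_means(sample)
p_value = permutation_p_value([700, 720, 710], [650, 640, 660])
```

A small estimated p-value would suggest that the gap between the two groups is unlikely to come from chance alone, although it would say nothing by itself about why the gap exists.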
My proof-of-concept audit started with a study of the publicly available information that FICO itself provides about how its credit scores are calculated. In simple terms, FICO uses 5 main categories to determine a credit score, and each category carries a different weight in the overall calculation. The categories and their percentages are: Payment History (35%), Amounts Owed (30%), Length of Credit History (15%), Credit Mix (10%), and New Credit (10%).[1] Each category comes with a basic description that gives only vague, general information about what it means. As I analyzed these descriptions, two things quickly became clear. First, the descriptions seemed very intentionally vague: all you are given are broad terms and examples for each category. Second, there is no option or link to take you to a fuller description if you want one. For example, in the Amounts Owed section, which accounts for 30% of a credit score, one of the factors FICO lists is “The amount owed on all accounts” and describes it as follows: “Note that even if you pay off your credit cards in full each month, your credit report may show a balance on those cards. The total balance on your last statement is generally the amount that will show in your credit report.”[2] This gives one example involving credit cards but raises a confusing caveat: even if you pay off your credit cards, your credit report may still show a balance. For someone like me, who is younger and doesn’t yet understand how everything works, this is incredibly confusing and does not help me understand how FICO calculates my credit score.
This section also does not specify which accounts are included in “all accounts,” which again makes it incredibly difficult for a young person, or anyone trying to establish credit, to understand what they need to do to have a good credit score.
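To show what the five published percentages would mean if they were simple linear weights, here is a minimal sketch. FICO does not publish its formula, so everything beyond the category names and percentages is an assumption: the function name is hypothetical, each category is given a made-up rating between 0 and 1, and the result is mapped linearly onto the 300 to 850 range purely for illustration.

```python
# Category weights as published by FICO (percentages written as fractions).
WEIGHTS = {
    "payment_history": 0.35,
    "amounts_owed": 0.30,
    "length_of_credit_history": 0.15,
    "credit_mix": 0.10,
    "new_credit": 0.10,
}

def illustrative_score(ratings, low=300, high=850):
    """Combine per-category ratings (0.0 to 1.0) using the published weights.

    Assumption: a linear weighted sum mapped onto the score range.
    This is NOT how FICO actually calculates scores.
    """
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return round(low + weighted * (high - low))

# Hypothetical person: perfect payment history, middling everywhere else.
example = illustrative_score({
    "payment_history": 1.0,
    "amounts_owed": 0.5,
    "length_of_credit_history": 0.5,
    "credit_mix": 0.5,
    "new_credit": 0.5,
})
```

Even this toy version makes the transparency gap visible: the weights are public, but nothing FICO publishes says how a rating for any single category would be determined.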
The second part of my audit involved asking people about their credit score, how much they know about credit score calculation, and their demographic information. I was not able to do this on the large scale I had hoped for, so I was limited to asking friends and family. A couple of overwhelming trends emerged, which I share here and then compare with existing studies to reach a conclusion. The first trend is that people my age had very little understanding of what goes into a credit score beyond paying off credit cards, although some who had purchased a car also mentioned auto loans. My parents and other adults their age knew more about which types of payments affect their credit score, but even when I gave them FICO’s definitions of the categories, they did not know what everything meant. My grandparents were my last interviewees. My grandpa, who is very interested in finance and generally knowledgeable, understood the credit score best, but my grandma, who has not spent years learning about their finances, knew very little about the calculation. The second trend held across the three age groups and the multiple races, ethnicities, and genders I interviewed: not a single person could name the 5 categories FICO provides. When I gave them the details, they did not know what everything meant, and because FICO offers no further description, I had no way to answer the questions they raised. Beyond the information FICO provides, I also found a reverse-engineered version of the algorithm, written by a data scientist, which lays out a readable and clear framework for exactly what goes into a credit score.
This article found that a “perfect” credit profile, defined as a credit score greater than or equal to 840, has over 30 separate requirements, each specified down to an exact number, that FICO itself never shows anywhere.[3] Although the requirements in the article all fit within the 5 categories FICO specifies, they demonstrate that the credit score is much more complicated than most people could reasonably understand. The precision of the requirements in this reverse-engineered algorithm makes a perfect credit score seem nearly impossible to reach without professional guidance, which only a select few people have access to. A study of average FICO credit scores by age, race, and state found that younger people typically have significantly worse credit scores, and that the average credit scores by race, in ascending order, are Black (677), Hispanic (701), Other (732), White (734), and Asian (745). It also showed that the 10 states with the lowest credit scores were mostly among the poorest states in the U.S. by a significant margin.[4] This contradicts FICO’s claims that being in debt and not having much money do not necessarily hurt your credit score, because there is a clear correlation between income, geography, and credit score. Although these studies were broad, reporting nationwide averages rather than details on each combination of demographics, I believe this data reasonably shows that certain groups, such as young people belonging to a minority group, are at a significant disadvantage in the world of credit compared to someone who grows up in a generally wealthier area or belongs to a majority group. To me, these two findings answer my research question.
Transparency is a major issue for this algorithm: people who are less technologically literate and lack the experience to understand what is actually going on will not be able to achieve scores as good as those who do. Data from past studies also shows a significant difference in credit scores between certain groups, and a variety of factors cause the algorithm to affect some people disproportionately.
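The size of the differences in the averages reported by the study cited above can be worked out with simple arithmetic, sketched here. The averages are the ones quoted in this audit; the function name is hypothetical, and the gaps describe differences in averages only, not their causes.

```python
# Average FICO scores by race as reported in the study cited above [4].
reported_averages = {
    "Black": 677,
    "Hispanic": 701,
    "Other": 732,
    "White": 734,
    "Asian": 745,
}

def gaps_from_highest(averages):
    """Points separating each group's average from the highest average."""
    top = max(averages.values())
    return {group: top - score for group, score in averages.items()}

gaps = gaps_from_highest(reported_averages)
largest_gap = max(gaps.values())  # spread between lowest and highest average
```

On these figures, the lowest and highest group averages are 68 points apart.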
Throughout the process of designing and carrying out my proof-of-concept audit, I had a couple of successes and a couple of difficulties. On the success side, I was able to analyze FICO’s publicly available resources to give myself a solid base understanding of how the FICO credit score is calculated. I could then combine that knowledge with the other resources I found, such as the reverse engineering of the algorithm. Going through this process made me realize that if I ever question an algorithm I don’t understand in the future, reading as much publicly available information as I can is a very important first step. Without first understanding what a producer is comfortable telling you as a consumer, you cannot tell where there is room for improvement and where more transparency is needed. Looking at the FICO website, I was at first pleasantly surprised by how much information there was, but after diving into the resource that completely reverse engineered the algorithm, I was shocked to see how much was left out. This is where my audit was most productive. My analysis did not depend on gathering data and reaching a mathematical conclusion myself, or on finding help to interpret data I had gathered. Instead, I could compare my own understanding with other people’s findings on the same topic and determine where the gap in knowledge lay. This allowed me to form important conclusions efficiently and meaningfully without performing complex calculations on large, difficult-to-acquire datasets.
This leads into the biggest struggle I had, and the part of the audit process I learned is most difficult: data collection.
The biggest issue with my audit was collecting the data I wanted. I learned that it is incredibly difficult to gather data by yourself on a scale large enough to be significant, and in the end the only people I was able to interview were family members and people close to me who were readily available. It would have taken a lot of professional help to create a proper survey and collect data from a meaningfully large sample of people, and I was not prepared to do that. As a result, I mostly had to draw statistical conclusions from other people’s research, which was also difficult because I do not have all the skills needed to produce rigorously mathematically backed conclusions and instead relied on inference about what the data meant.
My audit ties most closely into the issue of algorithmic transparency that we talked about in class. Algorithmic transparency is a very important issue in the world of AI. If consumers do not understand what is going on inside the algorithms used to constantly monitor, evaluate, and frankly manipulate them, the gap between consumer and producer continues to widen. The privacy of the average person slowly erodes day by day, until corporations can essentially control the entire technological landscape of a person’s life without that person ever having a clue, much less a say, in how they are being evaluated behind the scenes. In class we discussed how transparency issues make it hard to evaluate an algorithm for fairness and equality. This was especially apparent in discussions of topics such as PredPol, where it was never made clear to the public exactly what went into the algorithm, which had drastic real-life consequences for those subjected to its methods. In certain cases, these predictive policing and sentencing algorithms swayed judges into setting aside their own judgment about a sentence and making it harsher, even when the only explicit reasoning or supporting evidence was a number supplied by an algorithm. Without transparency, algorithms cannot be evaluated for fairness, and that became incredibly apparent in my audit. After comparing FICO’s description with the reverse-engineered algorithm, I found that far more than 5 categories are in play: many individual factors each contribute heavily to a credit score and can be tied to external factors, which is quite comparable to predictive policing.
In the end, a credit score is just a number, the same as a risk score. Similarly, the lack of transparency about how the algorithm actually calculates it prevents people from understanding why their credit score is what it is, just as people do not know why a risk score is what it is. This audit showed me the true importance of transparency. Comparing my findings with those of others in our class discussions of algorithmic transparency reinforces why I think transparency is the most important thing we should be pushing for, especially with FICO, because of the drastic real-world impact a single number can have.
[1] How are FICO scores calculated? myFICO. (2022, February 23). https://www.myfico.com/credit-education/whats-in-your-credit-score#:~:text=This%20data%20is%20grouped%20into,information%20in%20your%20credit%20report.
[2] How are FICO scores calculated? myFICO. (2022, February 23). https://www.myfico.com/credit-education/whats-in-your-credit-score#:~:text=This%20data%20is%20grouped%20into,information%20in%20your%20credit%20report.
[3] Fowlie, M. (2023, January 1). Reverse engineering FICO 8. Medium. https://medium.com/@mfow/reverse-engineering-fico-8-d2d68315d20
[4] Sandberg, E. (2021, February 16). Average credit score by age, race, State & Income. BadCredit.org.