Henry Scheuller
Praxis Lab in Responsible AI
11/20/23
AI Audit: ChatGPT’s Impact on Programmers in Utah
Will ChatGPT’s programming algorithm replace the jobs of those who work in programming-based careers here in Utah? According to several sources, ChatGPT is not only taking the world by storm with its impressively fast coding but also threatening the livelihoods of programmers themselves. This audit breaks down how ChatGPT collects and analyzes data, and how that relates to the careers of working programmers in the state of Utah.
The process of data collection for ChatGPT is based on OpenAI gathering text older than January 2022, its training cut-off date, after which no new data enters the algorithm. ChatGPT’s model relies largely on unsupervised learning, which carries real risks: it learns from enormous amounts of unlabeled text without a human label for every input. Across the world, workers are employed to perform tasks like web scraping and data cleaning. Much like Facebook’s content reviewers, there must be data reviewers who ensure that blatantly false information does not end up in the model itself. These ChatGPT data analysts will figure into my audit process.
For GPT-3, human workers across the globe were used to review outputs of certain code and make sure the rated outputs made sense and did not produce errors. Human reviewers would “fine-tune” what GPT was essentially doing on its own. The feedback loop between OpenAI’s software and human reviewers was necessary to ensure that the information generated was not only accurate but also appropriate. The main takeaway is that human reviewers were not sitting at some computer manually typing code into the software; rather, they were there to fine-tune misguided code or error-prone results.
According to LinkedIn author Tim Olsen, ChatGPT “can write code, but its capabilities are currently limited.” ChatGPT is being used to generate long stretches of familiar, boilerplate code, which spares programmers from re-typing what they already know. With its lightning speed, ChatGPT is also pushing deadlines forward, because a programmer can type in a simple prompt and pull out hundreds of lines of code with ease.
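To make that workflow concrete, here is a minimal, hypothetical sketch of the kind of boilerplate request a programmer might send to ChatGPT through OpenAI’s Python library (the v1-style client); the model name and the prompt are illustrative assumptions of mine, not details taken from any source above.

```python
# A minimal sketch, assuming the OpenAI Python library (v1-style client) and an
# API key stored in the OPENAI_API_KEY environment variable. The prompt and the
# model name are illustrative placeholders, not part of the audit itself.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Write a Python function that reads a CSV file of loan records and "
    "returns the average interest rate, with basic error handling."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)

# The generated boilerplate arrives as plain text; the programmer still
# reviews it before pasting it into the codebase.
print(response.choices[0].message.content)
```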
Another common use of ChatGPT in programming is error detection. According to Awesome Screenshot author Miley, ChatGPT makes some aspects of programming, such as error detection, dramatically easier. Pasting in code that contains several errors and using some critical thinking to adjust the prompt lets a programmer clean up code far more quickly than manually going line by line to test what seems to be the problem. For many, the model has been a lifesaver for debugging.
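As a hedged illustration of that debugging workflow, the snippet below shows a deliberately buggy function and the sort of prompt a programmer might wrap around it; both the bug and the prompt wording are my own placeholders, not examples taken from the article.

```python
# Illustrative only: a deliberately buggy function and the kind of prompt a
# programmer might paste into ChatGPT instead of tracing the error by hand.

buggy_source = '''
def average(values):
    total = 0
    for v in values:
        total += v
    return total / len(values)  # crashes with ZeroDivisionError on an empty list
'''

debug_prompt = (
    "This Python function sometimes crashes in production. "
    "Find the bug and suggest a fix:\n" + buggy_source
)

print(debug_prompt)  # this text is what would be sent to ChatGPT as the prompt
```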
A final example of how ChatGPT’s programming algorithm benefits programmers is its ability to condense hundreds of lines of code into only a few. According to iRoid author Hela Roshna, AI systems like ChatGPT are used to help pare unnecessary lines of code down to a simpler structure. “Messy” code that is pasted in can be cleaned up with the knowledge ChatGPT has absorbed, because the model is adept at spotting where lines could be simplified.
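To picture what that simplification looks like, here is a small before-and-after sketch; the verbose function stands in for the “messy” code a programmer might paste in, and the compact version is the kind of rewrite ChatGPT typically suggests. Both are my own illustrative examples and behave identically.

```python
# Hypothetical before/after: the verbose version is the "messy" input, the
# one-line version is the kind of simplification ChatGPT might suggest.

def squares_of_evens_verbose(numbers):
    result = []
    for n in numbers:
        if n % 2 == 0:
            square = n * n
            result.append(square)
    return result

def squares_of_evens_simplified(numbers):
    return [n * n for n in numbers if n % 2 == 0]

# Both versions produce the same output.
assert squares_of_evens_verbose([1, 2, 3, 4]) == squares_of_evens_simplified([1, 2, 3, 4])
```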
In a perfect world, with a blue-sky audit, there is a preferred method for diving into whether ChatGPT’s algorithm will threaten programmers in Utah’s foreseeable future. First, analyzing the backgrounds of the “cleaners” of GPT’s data is crucial to judging the validity of GPT’s data on programming. In this audit, a selection of hundreds of data analysts who work for OpenAI could be surveyed to establish what type of work they are doing. Knowing what they prefer to sift through would give a better understanding of what limits the software. Second, whatever gives a human programmer a ‘human touch’ in their career should be compared against ChatGPT’s software. What is it, according to the Toolify authors, that makes ChatGPT “limited” when compared to expert programmers?
My experiment for a perfect audit is to line up a long list of expert-level programmers who work for a company called Loan Pro and have them face off against ChatGPT on tough programming problems. The same prompt would be given to the human and the AI, and whoever solves the problem more efficiently and creatively comes out on top. What I have realized from reading most of the articles is that humans can look at coding problems from abstract angles in a way that ChatGPT cannot. I would love to test this in the next part of my blue-sky audit.
Based on the results of the coding competition, I would like both ChatGPT and the human to break down the steps they took in solving the problem. Analyzing hundreds of responses from humans and from ChatGPT could produce enough data to show who solved the problem better, judged on certain factors. Since “better” is subjective, I think the results should rest on a balance of speed, correctness of the code, approach to adding or deleting code, and the final output. These factors, in my opinion, would help us as a class judge whether an expert-level programmer is still needed in comparison to ChatGPT.
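As a rough sketch of how that balance could be turned into a single comparable number per contestant (human or ChatGPT), the snippet below combines the four factors named above; the weights and scaling constants are placeholder assumptions for the class to debate, not settled parts of the audit.

```python
# A rough sketch of one way to combine speed, correctness, code economy, and
# final output into a single score. Weights and scaling are placeholder
# assumptions, not settled values.

def score_solution(seconds_taken, tests_passed, tests_total,
                   lines_changed, final_output_works):
    speed = 1.0 / (1.0 + seconds_taken / 60.0)    # faster solutions score higher
    correctness = tests_passed / tests_total      # fraction of shared tests passed
    economy = 1.0 / (1.0 + lines_changed / 50.0)  # fewer added/deleted lines score higher
    output = 1.0 if final_output_works else 0.0   # did the final program work at all
    return 0.25 * speed + 0.35 * correctness + 0.15 * economy + 0.25 * output

# Example: a contestant takes 12 minutes, passes 9 of 10 tests, changes 30 lines,
# and produces a working final program.
print(round(score_solution(12 * 60, 9, 10, 30, True), 3))
```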
The audit of ChatGPT’s programming data collection is not as simple as it seems on the surface. Several different factors go into the creation of its data, but for a class that has studied AI for several months, my proposal does not seem too far-fetched based on what we have learned about deep learning systems like ChatGPT.
First, up to this point we have learned from several guest lecturers and readings. One way we could apply that knowledge is to write down what makes for good data and what makes for bad data. By forming in-class groups, we could develop several points about what makes excellent programming data and what makes for poor data. Once we have key points about the kind of beneficial data we are looking for as a class, we can move on to the next step and start diving into the types of data programmers have and the types of data ChatGPT has.
In my opinion, humans are much easier to analyze than ChatGPT because the AI software that OpenAI uses is seemingly limitless. When you talk to an expert about what they know and how they learned it, it is much easier to trace how they got to the position they are in. So, to start the data analysis of humans, I propose we gather several experts in Utah who have programmed for years. Different groups of students in our class could select certain programmers they want to interview and go about collecting data.
The questions asked of the professionals need to be unbiased. Some good questions are: What is your education? What types of websites did you use to learn how to program? What tools do you use to help you understand the language you program in? What languages do you use, and how do they relate to each other? How long have you been learning to program, and what was the first source you used? What can’t you do with the knowledge and skills you possess? If given a complex problem, what is your strategy for going about solving it?
These questions should be asked of at least 50 experts to get a solid set of data, in my opinion. That would be a large enough sample of human data to compare against ChatGPT’s data collection, which is when data would need to be gathered from ChatGPT itself. Since this may be the hardest part of the audit, it would have to be done in several stages.
I propose meeting with at least one expert who works on data analysis on the OpenAI team. Having at least one expert on GPT-4 would be very helpful for understanding their thought process about what they choose to keep versus discard when sifting through enormous amounts of data. To return to a point I made earlier, knowing what gets kept and what gets filtered out is a crucial step in analyzing GPT’s deep learning model.
We would want to obtain at least 50 sources that GPT-4 drew on during data collection for a specific language, preferably C++ or another C-family language. Many experts use C because of its complexity, and I propose that knowing what types of websites, articles, and texts OpenAI used for its pre-2022 data model would be very valuable for us. In the groups we have already created as a class, each group could be assigned several related sources used for the model and come up with key points about how those sources strengthened or weakened ChatGPT’s overall data model. With similar points grouped together, I propose we write a concluding statement about how the sources help and hurt the model.
Moving on, I want to combine what we did in the first few steps. Just as I proposed in my blue-sky audit, I think that creating a programming competition and having experts go up against ChatGPT in a competitive, complex environment would be both fun and useful for learning about the future of Utah’s programming careers. Working with a professor from the CS department at the University of Utah, I would like them to come up with a programming problem (one that a human thought up) and start a timer while a professional expert and ChatGPT race to solve it. The competition would only be fair if an amateur were typing the prompts into ChatGPT, because that would prove the software itself is doing the thinking and not another expert. When the time is up, as a class we would have groups ready to ask each expert questions about their round.
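A minimal sketch of how the class could grade both sides of that competition once the timer stops is shown below; it assumes each submission is saved as a Python file exposing a solve() function and checked against a shared set of test cases. The file names, the solve() interface, and the placeholder problem are all my own assumptions, not fixed parts of the design.

```python
# A minimal grading sketch, assuming each contestant's submission is saved as a
# Python file that defines a solve() function. File names, the solve() interface,
# and the test cases are hypothetical placeholders.
import importlib.util


def load_solution(path):
    """Load a contestant's submitted .py file as a module."""
    spec = importlib.util.spec_from_file_location("solution", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def grade(path, test_cases):
    """Return (passed, total) for one submission against the shared tests."""
    solution = load_solution(path)
    passed = sum(1 for args, expected in test_cases
                 if solution.solve(*args) == expected)
    return passed, len(test_cases)


# Placeholder problem: solve(a, b) should return a + b.
tests = [((2, 3), 5), ((10, -4), 6)]
print("Expert: ", grade("expert_solution.py", tests))
print("ChatGPT:", grade("chatgpt_solution.py", tests))
```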
To finish the audit, each group could analyze the steps their expert took and the steps the AI algorithm took to solve the given problem. As a class, we could go over the pros and cons of each expert and of each test with ChatGPT. Using those pros and cons, we could draft a conclusion in class that summarizes the experts’ overall thought processes and GPT’s overall analytical process, and, taking an unbiased approach, estimate whether programmers are really in trouble or not.
Throughout this process, I learned a great deal about how ChatGPT functions and how it will shape careers. After all, Computer Science is one of the most popular majors at the University of Utah, and this information could benefit countless students and aspiring programmers. The significance of this study is large because people across the country and the world aspire to become programmers, and with AI seemingly springing up out of nowhere, it would benefit many of them if a small group of honors students came up with an answer to a question on so many of those programmers’ minds.
One of the issues we have covered in our class discussions is machine-learning bias. Very early on, one of the articles we talked about was ProPublica’s “Machine Bias,” and one of the quotes that stood out to me was about a young woman named Sade Jones: “Jones, who had never been arrested before, was rated a medium risk. She completed probation and got the felony burglary charge reduced to misdemeanor trespassing, but she has still struggled to find work.” While this quote may seem out of context for programming, if an algorithm can make that bizarre an assessment, one can only imagine what the intrinsic limits of machine learning’s problem-solving mean for the complex code that impacts us directly.
To conclude my design audit: I personally know programmers who were recently let go because of ChatGPT’s algorithm. With the brainpower we have as an Honors Praxis Lab, we can answer a question being asked all around us and help clear up confusion among the many who feel they may have no economic future in programming. While I hope for a positive outcome, if a more realistic negative outcome is found, I see that as just as valuable.
Works Cited
Roshna, Hela. “Is ChatGPT a Threat to Developers?” iRoid Technologies, 2023, https://www.iroidtechnologies.com/blog/is-chat-gpt-a-threat-to-developers#:~:text=The%20first%20is%20that%20it’s,in%20a%20variety%20of%20settings.
In this source, ChatGPT is analyzed at its core for being either threatening or non-threatening. I like how the article weighs pros and cons of the software and lists the different companies involved. It is a helpful article for learning about the model behind ChatGPT, which helps me structure my audit better.
Olsen, Tim. “Chat GPT: Should Software Developers Be Worried?” LinkedIn, 2023, https://www.linkedin.com/pulse/chat-gpt-should-software-developers-worried-tim-olsen/.
I like that this information comes from a hiring site, which made the source easier to trust. The article explains interest levels in ChatGPT and which jobs are at risk, and it analyzes both the limited and the extensive knowledge of the software. This was a great breakdown.
Shuttleworth, Lois. “ChatGPT: A Threat or an Assistant for Software Engineers?” ShareSource, 2023, https://www.sharesource.com.au/learning/chatgpt-a-threat-or-an-assistant-for-software-engineers.
What was nice is how much information this article gives on ChatGPT across the tech industry. While that was not the exact aim of my audit, it was helpful for understanding the impact on job growth. I like how it touches on salaries as well as different versions of ChatGPT, including GPT-4. This article was specific.
Miley. “Can ChatGPT Replace Programmers? Exploring the Possibilities.” Awesome Screenshot, 2023, https://www.awesomescreenshot.com/blog/knowledge/will-chatgpt-replace-programmers.
Awesome Screenshot delves into the vast amounts of code and documentation ChatGPT can provide. Its focus is on existing code and on the importance of programmers’ critical thinking skills. This article leans toward the programmers’ side, and I like how protective it is of people.
Techlead. “The Imminent Threat: ChatGPT AI vs Programmers.” Toolify.ai, 2023, https://www.toolify.ai/gpts/the-imminent-threat-chatgpt-ai-vs-programmers-67922.
This article breaks down specific steps for getting help with code from AI versus from an actual person. It also brings up the many benefits of having a person who knows how to work around specific problems, compared with a simple algorithm. It is a non-biased article that I really enjoyed reading because it is positive and outlines the competitive nature of ChatGPT.