Generative artificial intelligence like ChatGPT is susceptible to several forms of bias and could cause harm if not properly trained, according to artificial intelligence experts.
“They absolutely do have bias,” expert Flavio Villanustre told Fox News Digital. “Unfortunately, it is very hard to deal with this from a coding standpoint. It is very hard to prevent bias from happening.”
At the core of many of these deep learning models is software that takes the training data and tries to extract its most relevant features. Whatever makes that data distinctive gets amplified, Villanustre noted. He serves as global chief information security officer for LexisNexis Risk Solutions.
He added that bias could have several degrees of potential harm, starting with lower-level issues that cause users to shut down their interaction with the model and report the problem.
However, generative AI like ChatGPT is also prone to “hallucinations,” which occur when the system generates output that seems factual, is written in correct, proper language and may even sound reasonable, but is completely fabricated.
“It doesn’t come from anything that the system learned from,” Villanustre said, noting this issue goes beyond bias and could cause harm if people believe these pieces of information.
Speaking with Fox News Digital, Jules White, Vanderbilt University associate dean for strategic learning programs and an associate professor of computer science and engineering, said generative AI like ChatGPT is primarily proficient at generating text that looks like a human produced it.
Sometimes this produces text that includes accurate statements and facts, while at other times it produces inaccurate information. According to White, a fundamental misunderstanding of how the technology works can also create an “unconscious bias,” wherein a user believes the model is a tool for generating and exploring facts rather than a text-generating tool.
“The number one biggest, in my opinion, source of bias in these tools is the user,” he said.
In this case, how users choose their words, phrase a question and order their inputs greatly affects what kind of responses the generative AI will spit out. If a user steers the conversation in a specific direction, they can have the AI generate an argument on one topic and then have it argue the opposite side of that issue just by asking.
White also noted that a user could ask ChatGPT the same question repeatedly, receiving different responses each time.
“I think of it as any other tool that a human could use, from a gun to a car. The way the user interacts with it, that’s going to generate the real bias in this,” White said.
Villanustre agreed that user interaction can generate bias through reinforcement learning. As users indicate the degree to which they like or dislike the content the AI puts out, the system learns from that feedback.
“You run the risk, because humans sometimes have a tendency to be biased, that the AI will start learning that bias as well,” he added.
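The feedback loop Villanustre describes can be illustrated with a toy sketch. This is not ChatGPT’s actual training pipeline; the class and scoring scheme below are assumptions made purely for illustration of how like/dislike signals can steer a system toward whatever its users reward:

```python
# Toy sketch of reinforcement from user feedback (illustrative only):
# a "model" chooses among candidate responses by a learned score that
# is updated from thumbs-up/thumbs-down signals. If users consistently
# reward one kind of answer, the system drifts toward it.

class FeedbackModel:
    def __init__(self, responses):
        # Every candidate response starts with a neutral score.
        self.scores = {r: 0.0 for r in responses}

    def respond(self):
        # Return the currently highest-scoring response.
        return max(self.scores, key=self.scores.get)

    def feedback(self, response, liked, lr=1.0):
        # Reinforcement signal: +1 for a like, -1 for a dislike.
        self.scores[response] += lr * (1.0 if liked else -1.0)


model = FeedbackModel(["neutral answer", "slanted answer"])

# Simulate a user base that systematically upvotes the slanted answer.
for _ in range(5):
    model.feedback("slanted answer", liked=True)
    model.feedback("neutral answer", liked=False)

print(model.respond())  # prints "slanted answer"
```

After a handful of biased ratings, the highest-scoring response is the one users rewarded, even though both candidates started out equal.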
He mentioned the infamous Microsoft artificial intelligence “Tay,” which was shut down in 2016 after tweeting out a series of racist and antisemitic messages, as an example of how people can influence chatbots.
“It became a monster, but it may be a reflection of us in some way,” he said.
Outside user-created bias, White said there is also a degree of bias created by the developer.
For example, safeguards are in place to prevent ChatGPT from generating a malicious email to trick people, code that could cause harm to other software, or text created to impersonate someone to grant access to private information.
Sugandha Sahay, a technical program manager at Amazon Web Services, detailed to Fox News Digital how artificial intelligence like ChatGPT gathers data and determines how to output it. Many of these steps can unintentionally introduce bias into the model.
One of the more common ways biases form in generative AI models is through the training data itself. If the data contains offensive or discriminatory language, for example, the model could generate text that reflects such language.
In this situation, Villanustre said these biases only get amplified by the system.
“At the core of all of these deep learning stacks, the system will try to extract the elements from that training set that are then going to be used to generate things in the system. If there is a particular area of that training set that tends to appear repeatedly, it is likely that it will start to generate bias,” he said.
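The amplification Villanustre describes can be shown with a deliberately simple stand-in for a language model. This unigram counter is an assumption for the sketch, not the real deep learning stack, but it shows how a skew in the training set can become a certainty in the output:

```python
from collections import Counter

# Toy illustration of training-data skew being amplified: a unigram
# "model" counts word frequencies in its training set and always
# generates the single most common continuation it has seen.

training_set = (
    ["engineers are men"] * 8 +    # overrepresented pattern (80%)
    ["engineers are women"] * 2    # underrepresented pattern (20%)
)

counts = Counter(
    word for sentence in training_set for word in sentence.split()
)

def most_common_completion():
    # Complete "engineers are ..." with the most frequent option seen.
    candidates = {w: counts[w] for w in ("men", "women")}
    return max(candidates, key=candidates.get)

print("engineers are", most_common_completion())  # prints "engineers are men"
```

An 80/20 skew in the data becomes a 100/0 skew in the output: the minority pattern disappears entirely, which is what amplification means in practice.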
Human bias can also play a role in creating bias within an AI model. Many of these systems rely on human-driven annotation, and if a person introduces their own biases into the labeling process, those biases can become ingrained in the model.
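A minimal sketch of how annotation bias propagates, assuming a common labeling setup in which each example’s “ground truth” is the majority vote of several human annotators (an assumption for this example, not a description of any specific vendor’s pipeline):

```python
from collections import Counter

# If the training label is the majority vote of human annotators, and
# most annotators share the same bias, that bias becomes the "ground
# truth" the model later learns from.

def majority_label(annotations):
    # Return the most common label among the annotators' votes.
    return Counter(annotations).most_common(1)[0][0]

# Three annotators rate the same borderline sentence; two share a bias.
annotations = ["toxic", "toxic", "not toxic"]
print(majority_label(annotations))  # prints "toxic"
```

The dissenting annotator is simply outvoted, so the skewed label enters the training set looking like an objective fact.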
Additionally, bias can be introduced in the design of the model architecture itself or in its evaluation metrics. In the former case, if a model prioritizes certain types of information or language, it is more likely to produce biased text. In the latter, the metrics used to assess a model’s performance can themselves introduce bias.
Sahay said it is important to address biases and eliminate them from generative AI models. A company or programmer can do this by carefully curating the training data, using diverse data sources and evaluating the model’s output.
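One of the mitigations Sahay mentions, evaluating the model’s output, can be sketched as a simple audit. The function names, prompts and stand-in model below are all assumptions invented for illustration; a real audit would use far more prompts and a proper statistical test:

```python
# Illustrative bias audit: run the model on parallel prompts that differ
# only in one attribute, and compare how often a negative word appears
# in each group's completions. Large gaps warrant investigation.

def audit(model_output_fn, groups, trials=100):
    rates = {}
    for group in groups:
        negative = sum(
            1 for _ in range(trials)
            if "bad" in model_output_fn(f"The {group} engineer is")
        )
        rates[group] = negative / trials
    return rates

# A stand-in "model" that is deliberately biased against one group.
def toy_model(prompt):
    return prompt + (" bad" if "junior" in prompt else " good")

rates = audit(toy_model, ["junior", "senior"])
print(rates)  # prints {'junior': 1.0, 'senior': 0.0}
```

The point of the sketch is the comparison itself: evaluating output across otherwise-identical prompts surfaces disparities that no single response would reveal.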
In essence, generative AI like ChatGPT is not biased in and of itself, but the model it uses to generate content can be.
“The code itself typically, unless you go out of the way to try to introduce bias, which is almost impossible, is not necessarily the guilty party here,” Villanustre said. “The training set and the users using it, yes.”