It’s hardly hyperbole to say ChatGPT changed technology overnight with its release in November 2022. Since then, generative artificial intelligence (AI) models have sprouted left and right, creating scripts, photos, and more. Some are even diagnosing diseases.
While the AI revolution is promising to some, others fear it may take away jobs. Shaikh Arifuzzaman, professor of computer science in the 51ԹϺ College of Engineering, lands with the former; he assures us that we can harness the power of AI for good.
“I know many of us are a bit reluctant because this is such disruptive technology,” said Arifuzzaman. “But humans can adapt. AI is here to help us. It will make many things easier. It’ll close gaps.”
Arifuzzaman's research revolves around machine learning (ML) models and scalable high-performance computing techniques, with particular emphasis on the applications of natural language processing (NLP) and data science. He leverages his expertise to provide a primer on how NLP technology works and where he predicts AI will take us.
How do NLP models like ChatGPT process words?
When an NLP model like ChatGPT “reads” text, it first breaks the passage into smaller units called tokens. Tokens can be whole words, parts of words, or even just characters. Each token is then represented as a vector, which is a high-dimensional array of numbers. These vectors are like points in a multidimensional space, capturing the token’s meaning and context.
For example, in GPT-3 – the model behind the first widely available version of the tool – each token is represented by a vector with 12,288 dimensions. Together, these dimensions encode nuanced relationships between tokens based on the model's training data.
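As a rough, hands-on illustration of those two steps (this is a toy sketch, not OpenAI’s code), the Python snippet below uses the open-source tiktoken tokenizer to split a sentence into token IDs and then looks each ID up in a small, randomly initialized embedding matrix; a production model uses a learned matrix with thousands of dimensions rather than the 16 used here.

```python
# Toy sketch: tokenization, then embedding lookup.
# Requires: pip install tiktoken numpy
import numpy as np
import tiktoken

# 1. Break the passage into tokens (integer IDs) with a byte-pair-encoding tokenizer.
enc = tiktoken.get_encoding("cl100k_base")
text = "Language models read text as tokens."
token_ids = enc.encode(text)
print(token_ids)                              # integer IDs, one per token
print([enc.decode([t]) for t in token_ids])   # the text piece each token covers

# 2. Each token ID indexes a row of an embedding matrix.
#    GPT-3 uses 12,288 dimensions; this toy matrix uses 16 and is random, not learned.
d_model = 16
embedding = np.random.default_rng(0).normal(size=(enc.n_vocab, d_model))
vectors = embedding[token_ids]                # shape: (number of tokens, d_model)
print(vectors.shape)
```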
Traditional NLP models process the input passage sequentially, or one token at a time. But language is fluid: a word can mean one thing in one context and something drastically different in another. When text is processed strictly in order, much of that context is lost.
ChatGPT went one step further. “GPT” stands for “generative pre-trained transformer.” Broadly speaking, a transformer is an architecture that processes all tokens in the passage simultaneously, allowing the model to relate tokens to one another and pick up more subtle aspects of language. This parallel view of the whole passage makes transformer-based models such as ChatGPT particularly effective at capturing the complexity of natural language.
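To give a flavor of that simultaneous processing, here is a stripped-down NumPy sketch of the self-attention calculation at the core of a transformer. It is only a sketch: a real transformer applies learned query, key, and value projections and stacks many such layers, which are omitted here.

```python
# Minimal self-attention sketch: every token attends to every other token at once.
import numpy as np

def self_attention(x):
    """x: (num_tokens, d_model) token vectors; returns context-aware vectors."""
    d = x.shape[-1]
    # A real model computes queries, keys, and values with learned projections;
    # this sketch reuses x itself for all three to stay short.
    scores = x @ x.T / np.sqrt(d)                    # pairwise relevance of every token to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: attention weights per token
    return weights @ x                               # each output mixes information from the whole passage

tokens = np.random.default_rng(1).normal(size=(5, 8))  # 5 toy tokens, 8 dimensions each
print(self_attention(tokens).shape)                    # (5, 8): same shape, now context-aware
```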
How do you train an NLP model?
Training an NLP model like ChatGPT involves three key stages: data preparation, model training, and evaluation. First, researchers collect a large dataset of text, which is cleaned and tokenized using the aforementioned process.
During training, the model learns to understand and generate language by predicting the next token in a sequence. The process uses a technique called backpropagation, where the model calculates its prediction errors and adjusts its internal parameters to minimize these errors. This is done iteratively over the dataset in batches using a computational framework optimized for large-scale parallel processing.
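The snippet below is a deliberately tiny PyTorch sketch of that loop: a toy model (an embedding plus a linear layer, not a transformer) is trained to predict the next token in random sequences, with backpropagation adjusting its parameters batch by batch. The numbers and data are made up purely to show the shape of the process.

```python
# Toy next-token-prediction training loop with backpropagation (PyTorch).
import torch
import torch.nn as nn

vocab_size, d_model, context = 100, 32, 8
model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),        # token IDs -> vectors
    nn.Flatten(),                             # concatenate the context window
    nn.Linear(context * d_model, vocab_size)  # scores for every possible next token
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in "dataset": random token sequences; the last token of each row is the prediction target.
data = torch.randint(0, vocab_size, (256, context + 1))

for epoch in range(3):
    for batch in data.split(32):              # iterate over the dataset in batches
        inputs, targets = batch[:, :-1], batch[:, -1]
        logits = model(inputs)                # predict the next token for each sequence
        loss = loss_fn(logits, targets)       # measure the prediction error
        optimizer.zero_grad()
        loss.backward()                       # backpropagation: compute error gradients
        optimizer.step()                      # adjust parameters to reduce the error
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```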
Throughout training, a validation dataset is used to test the model’s performance on unseen data. After training, the model’s performance is evaluated on a separate test dataset to ensure it generalizes well.
Once deployed, the model operates based on its training and does not continuously learn from new data, unless retrained. Often, fine-tuning a model like GPT is done to specialize it for specific tasks or domains, improving its performance and relevance in those contexts.
Why has there been such a sudden rise in AI development?
That’s a great question. Machine learning and AI technologies are essentially based on neural networks, something we’ve had for over 50 years. The theory behind ML and AI has been around that whole time, but two big conditions were only recently met that made today’s models possible.
First, computing technology is far more powerful than it was 50 years ago. The cell phones in our pockets are a million times more powerful than the computers of the 1970s. Or if you remember floppy disks, they used to hold 1.44 megabytes of information. Your phone’s storage probably holds 100,000 times more.
Second, an extremely large amount of textual information is now available on the Internet. Everything from websites and PDFs to social media posts and tweets is now available for anyone to sift through, and it’s constantly growing. This provided a huge and diverse dataset to train AI models.
What should people keep in mind when they use generative AI, such as ChatGPT?
The first thing you must consider is the correctness of the information any generative AI model outputs. These models take a vast amount of data and are essentially trained to imitate form. They don’t necessarily understand the factual aspect of the output; they just try to generate an answer that looks normal.
Let’s say a student researcher is assigned a literature review. If they ask ChatGPT to give them a list of relevant papers and links, ChatGPT will produce a list of article titles that sound relevant, but the articles themselves may not exist. It might list a URL that looks real, but the website doesn’t exist. The point is, ChatGPT will give you information, but it’s the user’s responsibility to double- and triple-check that the information is correct.
Another issue that may arise is bias. Any AI model must be trained on human-created data, and all humans have bias. So it is very possible that bias may be transmitted to the AI model. Especially on a global scale, there could be significant cultural differences between Western and Eastern societies, and if an AI model’s training data isn’t diverse enough, it could very well become biased toward one perspective and against another.
How do you predict NLP technology will be applied in the future?
There’s room for NLP models to revolutionize many different sectors. In education, NLP technology could one day give students access to personalized tutoring aids, helping students communicate their exact needs and struggles. A GPT could then build a study plan complete with specialized learning materials tailored to the student.
And it’s not just students; NLP technology can help educators too. With generative AI tools that can create teaching materials, educators can focus more personal attention on their students.
This technology can also use information synthesis to streamline health care. A large part of diagnosing patients is analyzing verbal information, like when a patient tells a health care provider about their symptoms or medical history. One day, NLP models could be used to parse through a patient’s words and highlight different possibilities for the health care provider.
The other side of information synthesis is seen in legal settings. Many of us don’t understand all the terms and language in legal documents. NLP technology will provide an easy way to process and summarize these documents, and even translate them into other languages if necessary.
A note from the 51ԹϺ News Center:
The author of this article, Anthony Paculan, passed away in December 2024, shortly before this story was originally scheduled to be published. Anthony worked as the assistant to the director of communications for 51ԹϺ's College of Engineering and had just completed his Bachelor of Journalism from 51ԹϺ. His passion for journalism and the subjects he covered was evident in his work and in his interactions with everyone who knew him. He was always positive, creative, and worked diligently on every assignment, including this one. He will be deeply missed.