Do you prefer having all your emails land in one place, or is it better that something (or someone) sorts them for you and sends some to your spam folder? Do you enjoy the music or movie recommendations on streaming platforms like Spotify or Netflix? Have you ever expanded your network on LinkedIn by following the accounts and people suggested to you because they matched your interests? The list of everyday applications of AI is endless, and most of it does not seem to bother us. We find it useful; some call it a necessary evil. The same is true of the more specific uses of AI in relation to language. Most people are fine with apps checking their text for spelling and grammar. So, what is fundamentally different about Large Language Models (LLMs)[1] based on the transformer architecture? Aren’t they just another form of AI?
[1] A large language model, or LLM, is a deep learning algorithm that can recognize, summarize, translate, predict, and generate text and other content based on knowledge gained from massive datasets. (source) Some examples of organizations building LLMs are OpenAI, Google AI, DeepMind, Anthropic, Baidu, Huawei, Meta AI, AI21 Labs, LG AI Research, and NVIDIA.
There is no doubt that LLMs, and chatbots based on them like ChatGPT, can make our lives much easier, in the same sense that calculators, spell checkers, and spam filters did. On the other hand, with new technology always come new worries, and our “fear of the unknown” kicks in quickly, putting us on alert. We immediately start listing the endangered jobs and livelihoods. This issue is packed with nice pieces covering the pros and cons of such bots. But first, let’s see what ChatGPT actually is.
ChatGPT has an LLM called GPT-3 at its core: Generative Pre-trained Transformer, version 3. Let’s dig a bit deeper:
A generative model is a type of model that produces significant and meaningful content such as text, images, and videos when given an input. GPT-3 generates text based on the input prompts it receives. Large language models (LLMs) are trained on vast amounts of data, including text from Wikipedia, Reddit, and other web sources, using some of the world’s largest supercomputers.
The trained version of the system, with all the immense knowledge it embodies, is released for use as a “black box”[2] by those who wish to build applications around it—hence the term pretrained.
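To make the “black box” idea concrete, here is a minimal sketch of the prompt-in, text-out pattern. GPT-3 itself is reached through OpenAI’s paid API, so this illustration uses the openly available GPT-2 model via the Hugging Face transformers library instead; the prompt text and generation settings are arbitrary choices for the example, not anything from the article above.

```python
# A minimal, illustrative sketch of using a pretrained language model as a
# "black box": we send in a prompt and get generated text back, without ever
# looking at the billions of weights inside the model.
# Assumes: pip install transformers torch
# (GPT-2 stands in for GPT-3, which is only reachable via OpenAI's API.)
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Language teachers can use AI chatbots to"
result = generator(prompt, max_new_tokens=30, num_return_sequences=1)

print(result[0]["generated_text"])
```

Whatever wrapper or service you use, the interaction is the same: a prompt goes in, generated text comes out, and the pretrained model in between stays opaque.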
And here comes the brain:
Language Models, which can make sense of any text input and generate appropriate follow-up text accordingly, are neural networks—networks of simple computational elements that are meant to mimic neurons (brain cells), connected to each other like the networks of cells in the brain. These systems process data and adapt their response behavior by changing the strengths of connectivity, or weights, between neurons, again mimicking a learning process in the brain. A large language model is one that has a lot of neurons and weights. GPT-3, for example, has hundreds of millions of neurons and 175 billion weights. It, like most other LLMs, uses a type of neural network called a transformer. (taken from this amazing article by Ali Minai)
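To give a feel for “neurons and weights,” here is a toy sketch in plain Python: a single artificial neuron that combines its inputs through weights and then nudges those weights when its output misses the target. This is only an analogy for the learning process described above; real transformers like GPT-3 adjust on the order of 175 billion such weights with a far more elaborate architecture.

```python
import math

# One artificial "neuron": a weighted sum of inputs squashed through a
# sigmoid, so the output always lies between 0 and 1.
def neuron(inputs, weights, bias):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))

# Toy "learning": nudge each weight in the direction that reduces the error
# between the neuron's output and the answer we wanted (a crude gradient step).
def learn(inputs, weights, bias, target, rate=0.5):
    output = neuron(inputs, weights, bias)
    error = target - output
    new_weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    new_bias = bias + rate * error
    return new_weights, new_bias

weights, bias = [0.1, -0.2], 0.0
for _ in range(20):  # repeat the nudge a few times
    weights, bias = learn([1.0, 0.5], weights, bias, target=1.0)

print(round(neuron([1.0, 0.5], weights, bias), 3))  # output drifts toward 1.0
```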
In this issue’s Lite video, one is faced with some fundamental questions, mostly on AI ethics[3] and how our interaction with technological agents[4] can shape who we are. As language teachers, it is worth pondering our role in cultivating critical awareness when it comes to our learners’ exposure to and interaction with AI.
[2] In the context of a pretrained model, a “black box” refers to the fact that the inner workings of the model are not transparent or easily understandable. The term “black box” comes from the analogy of a physical black box, which has inputs and outputs, but the internal mechanisms are not visible or accessible. Similarly, a pretrained model may have been trained on vast amounts of data and optimized through complex algorithms, making it difficult to understand how it arrives at its predictions or outputs.
[3] AI ethics is a set of guidelines that advise on the design and outcomes of artificial intelligence. (source)
[4] An agent is a program that collects information or performs a task in the background at a particular scheduled time. (source)
In our main podcast, cognitive scientist Evelina Fedorenko and computational linguist Emily M. Bender discuss language models, whether they understand the meaning of the language they produce, and the evolution of language.
They argue that language is a powerful tool that allows us to work together with others and communicate our thoughts and ideas. However, it is not the only tool in our cognitive toolkit, and complex reasoning can occur without relying on language. Evidence from brain imaging and from individuals with severe language problems suggests that language is simply a reflection of the complexities we have in our thoughts, rather than the creator of them. While a system (e.g., an LLM) can learn rich statistical patterns of how words go together, that does not necessarily imply complex thought. It is critical to consider the intentions and context of language to understand what someone is saying. A system can learn the regularities of language and produce language (which is what LLMs do) without having all of the cognitive machinery that usually comes along with language in the human brain.
Another important aspect of language is that it is a symbolic system, allowing us to pair discrete sets of basic symbols into larger symbols that form meaning in an open-ended way. This feature of language is essential in building language technology for communication. Communication is not just about passing strings of words back and forth, but rather about using these strings as clues to reconstruct the communicative intent of the speaker. It requires constructing a good hypothesis about what the speaker is trying to convey, which is not always a straightforward task.
While overfunding computer science and AI can foster false beliefs and misplaced trust in automated systems, potentially harming people, building a generalized intelligence system can also serve as a tool for understanding the complexities of the human brain and for asking questions we otherwise could not.
Sit tight, as you are in for a ride in this issue. Our contributors have written on Predictive Language Processing in Human Brains and Machine Learning Systems and on the intersection of ChatGPT and education, covering different aspects of ChatGPT’s benefits and drawbacks. You might even end up, like many people, using it playfully. According to a survey done by the journal Nature, 80% of respondents have used AI chatbots, and 57% say they use them for “creative fun.” This is what I did, and it did not disappoint:
Input[5]: Who would win in a fight: Batman or Iron Man? Please answer in haiku form.
ChatGPT:
Dark Knight, Caped Crusader,
Versus Iron Man in armor,
Fierce battle ensues.
With gadgets and tech,
Iron Man has the upper hand,
But Batman’s grit wins.
[5] Credit goes to Matt Ehlers
Mohammad Khari is an English lecturer at Ozyegin University, Istanbul. He holds a BA in English Literature, an MA in Philosophy of Art, and a CELTA. Mohammad has been reading about and researching the integration of neuroscience into pedagogy.