At this time in 2023, we tried something bold: in an industry defined by constant change, we attempted to predict the future.
How did we do? Here are our four major predictions for 2023:
Prediction One: The next big advancement in chatbots will be multimodal.
Results: The prediction was accurate. Today's most powerful large language models, OpenAI's GPT-4 and Google DeepMind's Gemini, can handle text, images, and audio.
Prediction Two: Policymakers will introduce strict new regulations.
Results: The prediction was accurate. The Biden administration issued its executive order in October, and the EU finally reached agreement on its Artificial Intelligence Act in December.
Prediction Three: Large tech companies will feel the pressure from open-source startups.
Results: It's a mixed bag. The trend of open-sourcing large models continues, but AI companies like OpenAI and Google DeepMind still dominate the field.
Prediction Four: Artificial intelligence will permanently change large pharmaceutical companies.
Results: It's too early to tell. The AI revolution in drug discovery is under way, but the first drugs developed using AI are still a few years from reaching the market.
Now, here we are again.
This time, we have decided to skip the obvious: that large language models will continue to dominate, and that regulators will grow bolder.
We also take it as given that AI's problems, from bias to copyright to doomsday theories, will shape the agendas of researchers, regulators, and the public, not just in 2024 but for years to come.
Instead, we have picked a few more specific trends. Here is what to watch for in 2024. Come back at this time in 2025 to see how we did.
Customized Chatbots
Chatbots continue to emerge one after another. In 2024, tech companies that have invested heavily in generative artificial intelligence will face pressure to prove that they can make money from related products.
To achieve this, AI giants Google and OpenAI are betting heavily on going small. Both are developing user-friendly platforms that let ordinary people customize powerful language models and create their own mini chatbots for their specific needs, with no programming skills required.
Both companies have launched web-based tools that allow anyone to become a developer of generative AI applications.
In 2024, generative AI may become genuinely useful for ordinary non-technical people, and we will see more people tinkering with countless small AI models. The most advanced AI models currently, such as GPT-4 and Gemini, are multimodal, which means they can handle not only text but also images and even video.
This new capability could unlock a whole range of new applications. For example, a real estate agent could upload the text of previous property listings and, with one click, fine-tune a powerful model to generate similar text. The agent could also upload videos and photos of new listings and have the customized AI tool generate property descriptions.
Of course, the success of this idea depends on whether these models are reliable. Language models are prone to fabricating facts, and generative models are riddled with biases.
They are also easily hacked, especially when they are allowed access to the internet. Technology companies have not yet addressed these issues. When the novelty wears off, they will have to provide their customers with methods to deal with these problems.
Video generation, the second wave of generative artificial intelligence
It is astonishing how quickly people get used to the magical. In 2022, the first generative models for creating images went mainstream, and people quickly became accustomed to them. Tools like OpenAI's DALL-E, Stability AI's Stable Diffusion, and Adobe's Firefly flooded the internet with jaw-dropping images, such as a Pope dressed in Balenciaga and a burning Pentagon.
But it is not all entertainment. Alongside the amusing images are others that carry sexism and stereotypes.
The latest frontier in this field is text-to-video. We expect it to take everything that was good, bad, or ugly about text-to-image and amplify it.
A year ago, we got our first look at what generative models could do when they were trained to stitch multiple images into clips a few seconds long. The results at that time looked unreal and unstable. But the technology has advanced rapidly.
Runway, a startup that makes generative video models (and that co-created the Stable Diffusion model), releases new versions of its tools every few months. Its latest model, Gen-2, can still only generate videos a few seconds long, but the quality is astonishing. The best clips are not far behind Pixar animations.
Runway has established an annual artificial intelligence film festival, showcasing experimental films made with a series of artificial intelligence tools. The 2024 film festival has a prize pool of $60,000, and the top 10 films will be screened in New York and Los Angeles.
It is clear that top studios have also taken note, with film giants including Paramount and Disney currently exploring the use of generative artificial intelligence in the animation production process. This technology is being used to achieve lip synchronization in foreign language dubbing.
It is reshaping the possibilities of special effects. In 2023, "Indiana Jones and the Dial of Destiny" utilized deepfake technology to showcase a younger Harrison Ford. This is just the beginning.
Beyond the big screen, deepfake technology for marketing or training purposes is also on the rise. For example, the UK company Synthesia has developed a tool that can turn an actor's one-time performance into an endless stream of deepfake avatars that read any script you give them at the press of a button. According to the company, its technology is currently used by 44% of Fortune 100 companies.
The ability to do so much with so little poses a daunting challenge for actors. Concerns about the use and misuse of artificial intelligence by studios were at the heart of the 2023 SAG-AFTRA strike in Hollywood.
The impact is just beginning to emerge. "The way movies are made is undergoing a fundamental change," said independent filmmaker Souki Mehdaoui, who is also the co-founder of the creative technology consulting firm Bell & Whistle.
AI-generated election disinformation will be ubiquitous
If recent elections are any guide, AI-generated election disinformation and deepfakes will be a huge problem. In 2024, a record number of people will go to the polls, and we have already seen politicians weaponize these tools.
In Argentina, two presidential candidates created AI-generated images and videos to attack their opponents. During the Slovak election, deepfakes of a candidate threatening to raise beer prices and joking about child pornography spread widely. In the United States, Donald Trump has cheered on a group that uses AI to generate memes with racist and sexist tropes.
Although it is difficult to say how much impact these examples had on the election results, their rapid spread is a worrying trend. Identifying what is real online will be more difficult than ever before. In an already heated and polarized political environment, this could have serious consequences.
Just a few years ago, creating deepfake content required advanced technical skills, but generative artificial intelligence has made it very simple, and the results look increasingly realistic. Even reliable sources of information may be fooled by AI-generated content.
For those who want to prevent the spread of such content, 2024 will be a crucial year. The technology for identifying and guarding against it is still in the early stages of development. Watermarking technologies, such as Google DeepMind's SynthID, are still mostly voluntary and not entirely foolproof.
It is well known that social media platforms are slow to remove misinformation. So get ready for a flood of AI-generated fake news, and for a large-scale experiment that we will all experience firsthand.
Multi-task Robots
Inspired by some of the core technologies behind the boom of generative artificial intelligence, roboticists have begun to build more versatile robots, designed to perform a wider range of tasks.
In the past few years, artificial intelligence has shifted from using multiple small models, each trained for a different specific task (such as recognizing images, drawing them, or captioning them), to using single large models trained to do all of these things.
By showing OpenAI's GPT-3 a few additional examples (known as fine-tuning), researchers can train it to solve programming problems, write movie scripts, pass high school biology exams, and so on. Multimodal models, such as GPT-4 and Google DeepMind's Gemini, can handle both visual and language tasks.
The same approach applies to robots: we don't need to train two robots, one specialized in flipping pancakes and the other in opening doors. A single model with multiple capabilities allows a robot to handle multiple tasks. In 2023, we saw several examples in this field.
In June 2023, DeepMind released RoboCat, which can learn to control many different robot arms rather than just one specific arm.
In October 2023, the company collaborated with 33 university laboratories to launch another general-purpose robot model called RT-X, along with a new large-scale general training dataset.
Other top research teams, such as the Robotic Artificial Intelligence and Learning (RAIL) team at the University of California, Berkeley, are also researching similar technologies.
The problem in the field of robotics is the lack of data. Generative artificial intelligence can draw on text and image data from the internet. In contrast, robots have few good data sources to help them learn how to perform many of the tasks or household chores we want them to complete.
Lerrel Pinto of New York University leads a team that is addressing this issue. He and his colleagues are developing technology that allows robots to learn through trial and error, generating their own training data as they learn.
In a more low-key project, Pinto recruited volunteers to collect video data using iPhones mounted on trash pickers. In recent years, large companies have also begun to release large datasets for training robots, such as Meta's Ego4D.
This approach has already been applied in the field of autonomous driving. Startups such as Wayve, Waabi, and Ghost are pioneering a new wave of self-driving artificial intelligence, using a single large model to control the vehicle instead of multiple small models responsible for specific driving tasks.
This has allowed small companies to catch up with giants like Cruise and Waymo. Wayve is currently testing its autonomous vehicles on the narrow and busy streets of London. Next, robots around the world will receive similar capability upgrades.