January 15, 2021

This new Google AI language model is almost 6 times larger than GPT-3

Three researchers from the Google Brain team recently revealed the next big thing in AI language models: a massive, trillion-parameter transformer system.

As far as we know, the next largest model in existence is OpenAI’s GPT-3, which uses a measly 175 billion parameters.

The background: Language models can perform a variety of functions, but perhaps the most popular is generating novel text. For example, you can talk to a "Philosopher AI" language model that will try to answer any question you ask it (with numerous notable exceptions).

While these amazing AI models exist at the forefront of machine learning technology, it is important to remember that they are essentially just performing parlor tricks. These systems do not understand language; they are merely tuned to make it look like they do.

The more knobs and virtual dials you can turn and adjust to achieve the desired outputs, the more fine-grained control you have over those outputs.

What Google has done: Simply put, the Brain team found a way to keep the model itself as simple as possible while throwing as much raw computing power at it as possible, making it feasible to push the parameter count ever higher. In other words, Google has a lot of money, and that means it can afford to use as much compute hardware as the AI model can leverage.

In the team’s own words:

Switch Transformers are scalable and effective natural language learners. We simplify Mixture of Experts to produce an architecture that is easy to understand, stable to train and much more sample efficient than equivalently sized dense models. We find that these models excel across a diverse set of natural language tasks and in different training regimes, including pre-training, fine-tuning and multi-task training. These advances allow us to train models with hundreds of billions to trillions of parameters that achieve substantial speedups relative to dense T5 baselines.
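To make the "simplified Mixture of Experts" idea concrete, here is a minimal, illustrative sketch of top-1 ("switch") routing: each token is sent to only one expert, so the parameter count grows with the number of experts while the compute spent per token stays roughly flat. This is not Google's actual code; the shapes, variable names and the omission of details such as load-balancing losses are simplifying assumptions for illustration only.

```python
# Illustrative top-1 ("switch") routing over a set of expert feed-forward
# networks. NOT Google's implementation; a toy NumPy sketch under the
# assumptions stated above.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, num_experts, num_tokens = 64, 256, 8, 10

# Router: a single weight matrix that scores each token against each expert.
router_w = rng.normal(scale=0.02, size=(d_model, num_experts))

# Each expert is a small two-layer feed-forward network.
experts = [
    (rng.normal(scale=0.02, size=(d_model, d_ff)),
     rng.normal(scale=0.02, size=(d_ff, d_model)))
    for _ in range(num_experts)
]

def switch_layer(tokens):
    """Send every token to its single best-scoring expert."""
    logits = tokens @ router_w                        # (tokens, experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)        # softmax over experts
    choice = probs.argmax(axis=-1)                    # top-1 expert per token

    out = np.zeros_like(tokens)
    for e, (w_in, w_out) in enumerate(experts):
        mask = choice == e
        if not mask.any():
            continue
        hidden = np.maximum(tokens[mask] @ w_in, 0)   # ReLU feed-forward
        # Scale by the router probability; in a real, gradient-trained model
        # this keeps the routing decision differentiable.
        out[mask] = (hidden @ w_out) * probs[mask, e][:, None]
    return out

tokens = rng.normal(size=(num_tokens, d_model))
print(switch_layer(tokens).shape)  # (10, 64)
```

The point of the trick is visible even in this toy version: adding more experts adds more parameters, but each token still only passes through one expert's weights.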

In short, it is not entirely clear what this means or what Google intends to do with the techniques described in the preprint. There is more to this model than just one-upping OpenAI, but exactly how Google or its customers might use the new system remains murky.

The important thing here is that, with enough brute force, researchers can discover better compute-usage techniques, which in turn make it possible to do more with less computing. But the current reality is that these systems do not tend to justify their existence when compared to greener, more useful technologies. It is difficult to get excited about an AI system that can only be operated by billion-dollar technology companies willing to ignore the massive carbon footprint such a large system creates.

In context: Google has been pushing the limits of what AI can do for years, and this is no different. On its own, the achievement looks like the logical progression of what has been happening in the field. But the timing is a bit suspect.

Source: The Next Web