OpenAI Enters the Development Phase of GPT-5

OpenAI CEO Sam Altman recently confirmed that his team is developing GPT-5. The confirmation comes months after Altman said at an MIT event that the team was not yet training its next model. He gave no clear indication of the model’s training status, but hinted that its development would require more data.

GPT-5’s Data Needs

Large language models like GPT-5 require vast amounts of data, drawn both from the public web and from proprietary private datasets. OpenAI has expressed interest in partnering with organizations on private datasets of text, images, audio, or video, particularly long-form writing or conversations that express human intention.

This approach aligns with recent research suggesting that smaller models trained on more data perform as well as, or better than, larger models trained on less. The challenge is securing that data: the stock of publicly accessible, high-quality online text is estimated to run out by 2026.
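To make the trade-off concrete, the sketch below uses the scaling law fitted by Hoffmann et al. (2022), the study most often cited for this finding. The functional form and constants come from that paper; the parameter and token counts are illustrative assumptions, and the predicted losses should be read as rough estimates, not measurements.

```python
# Sketch of the scaling law fitted in Hoffmann et al. (2022) ("Chinchilla").
# E, A, B, alpha, beta are the paper's fitted constants; the model/token
# counts below are illustrative assumptions, not OpenAI's actual figures.

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted training loss for a model with n_params parameters
    trained on n_tokens tokens of text."""
    E, A, B = 1.69, 406.4, 410.7
    alpha, beta = 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# Smaller model, more data (Chinchilla-like: 70B parameters, 1.4T tokens)
print(predicted_loss(70e9, 1.4e12))   # ~1.94

# Larger model, less data (GPT-3-like: 175B parameters, 300B tokens)
print(predicted_loss(175e9, 300e9))   # ~2.00
```

Under these fitted constants, the smaller model trained on more tokens comes out ahead, which is the intuition behind OpenAI’s hunt for fresh data.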

The Computing Challenge

Beyond data, the other major challenge in developing GPT-5 is compute. Foundation models like GPT-4 require large supplies of graphics processing units (GPUs). Nvidia, the leading supplier of GPUs, recently delivered its latest H100 chips to OpenAI. In recent benchmark tests, the new chips trained large language models nearly three times faster than the previous record, set just five months earlier.

Preparing for GPT-5

Despite these challenges, OpenAI is actively assembling the necessary resources for the development of GPT-5. This includes securing funding from investors, sourcing chips from Nvidia, and obtaining quality data. The timeline for GPT-5’s release remains uncertain, as the training process, followed by the necessary stress testing and fine-tuning, could take months.

While GPT-5 is in development, OpenAI continues to push forward with its current technology. At its first developer conference, the company recently launched custom chatbots, known as GPTs, alongside GPT-4 Turbo, an updated model whose knowledge extends through April 2023, which accepts much longer prompts thanks to a 128,000-token context window, and which is cheaper per token for developers.
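For developers, the practical difference shows up in the API call. The following is a minimal sketch using the OpenAI Python SDK (v1.x); the model identifier "gpt-4-1106-preview" was the preview name GPT-4 Turbo launched under, and the prompt content is a placeholder assumption.

```python
# Minimal sketch of a GPT-4 Turbo call via the OpenAI Python SDK (v1.x).
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        # With a 128,000-token context window, the user message can carry
        # far longer documents than earlier GPT-4 models accepted.
        {"role": "user", "content": "Summarize the attached report."},
    ],
)
print(response.choices[0].message.content)
```

The larger context window is the headline change for developers: GPT-4 originally shipped with an 8,000-token limit, so a 128,000-token window allows entire books or long transcripts in a single prompt.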

Implications and Future Possibilities

As AI development continues, governments are weighing regulations to address algorithmic bias, privacy concerns, and intellectual property violations. The future of AI, however, remains unclear. Some believe more data and larger models can address the shortcomings of large language models, while others think new breakthroughs are needed.

Altman himself has expressed uncertainty, saying, “We’re trying to get better at it, because I think it’s important from a safety perspective to predict the capabilities. But I can’t tell you here’s exactly what it’s going to do that GPT-4 didn’t.”