Market Analysis

As existing approaches reach their limits, OpenAI and competitors look for a new route to more intelligent AI

Amos Simanungkalit · 16.1K reads


Artificial intelligence companies like OpenAI are working to overcome unexpected delays and challenges in the development of ever-larger language models by exploring new training techniques that mimic more human-like ways of thinking.

According to a dozen AI scientists, researchers, and investors who spoke with Reuters, these techniques, which are integral to OpenAI's recently released o1 model, could transform the AI arms race. They may also impact the types of resources AI companies heavily rely on, such as energy and specialized chips.

OpenAI declined to comment for this article. Following the release of its viral ChatGPT chatbot two years ago, tech companies that have benefitted greatly from the AI boom have maintained that scaling up current models through increased data and computational power will always lead to improved AI systems.

However, some of the leading AI researchers are now questioning the limitations of the "bigger is better" approach. Ilya Sutskever, co-founder of AI labs Safe Superintelligence (SSI) and OpenAI, recently shared with Reuters that the results of scaling up pre-training (the phase where AI models learn language patterns from vast amounts of unlabeled data) have plateaued.

Sutskever, a key figure behind the AI revolution in the 2010s through massive data and computational scaling, explained that the focus is shifting away from scaling. He remarked, "The 2010s were the age of scaling, now we're back in the age of wonder and discovery. Everyone is looking for the next thing." He emphasized, "Scaling the right thing matters more now than ever."

Although Sutskever declined to provide specifics, he noted that SSI is working on an alternative approach to scaling pre-training. Meanwhile, researchers at major AI labs have encountered setbacks and disappointing outcomes in their quest to release a large language model that outperforms OpenAI’s GPT-4, which is nearly two years old, according to sources familiar with internal matters.

Training large models is an expensive and complex process, requiring hundreds of chips to run simultaneously and costing tens of millions of dollars. Researchers may only discover how well these models perform after months of training, during which hardware failures are a significant risk. Additionally, the immense data demands of these models have stretched the available data sources, and power shortages have further complicated the process.

To address these challenges, researchers are exploring "test-time compute," a technique designed to enhance existing models during the "inference" phase, or when the model is being used. For instance, instead of providing a single answer, the model might generate and evaluate multiple options in real time to choose the best solution.
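The "generate and evaluate multiple options" idea can be sketched as a best-of-N loop with majority voting (often called self-consistency). The toy below is an illustration of the general technique, not OpenAI's actual method: `noisy_solver` is a hypothetical stand-in for a language model, and voting stands in for whatever verifier a real system would use.

```python
import random
from collections import Counter

def noisy_solver(question, rng):
    """Hypothetical stand-in for a language model: answers "7 * 8"
    correctly 70% of the time, otherwise returns a nearby wrong value."""
    if rng.random() < 0.7:
        return 56
    return 56 + rng.choice([-2, -1, 1, 2])

def single_sample(question, solver, seed=0):
    """Baseline: trust one sample, as a model without test-time compute would."""
    return solver(question, random.Random(seed))

def best_of_n(question, solver, n=31, seed=0):
    """Test-time compute: spend extra inference on n samples of the
    same question, then return the most frequent answer."""
    rng = random.Random(seed)
    answers = [solver(question, rng) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

With 31 samples at 70% per-sample accuracy, the majority answer is almost always correct, even though any single sample fails 30% of the time; this is the sense in which extra compute at inference time can substitute for a larger, more expensively trained model.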

This approach has shown promise in tasks that require human-like reasoning and complex problem-solving. As Noam Brown, an OpenAI researcher, explained at the TED AI conference, the technique can achieve significant performance boosts with minimal computational cost. For example, having a bot think for just 20 seconds in a hand of poker yielded the same improvement as scaling up the model by 100,000 times and training it for 100,000 times longer.

OpenAI has incorporated this method in its new o1 model (previously known as Q* and Strawberry), which simulates multi-step human reasoning. The o1 model also benefits from additional training involving expert feedback from PhDs and industry professionals. OpenAI intends to apply this approach to larger base models in the future.

Other AI labs, such as Anthropic, xAI, and Google DeepMind, are also working on their own versions of this technique, according to several insiders.

Kevin Weil, OpenAI’s chief product officer, expressed optimism at a tech conference, stating, "By the time people catch up, we’re going to try and be three steps ahead."

This shift in AI development may impact the competitive landscape for AI hardware, traditionally dominated by Nvidia’s chips. Venture capital investors, who have poured billions into funding AI labs, are closely monitoring these changes and evaluating how they will affect their investments.

Sonya Huang, a partner at Sequoia Capital, noted, "This shift will move us from a world of massive pre-training clusters toward inference clouds, which are cloud-based servers used for inference."

Although Nvidia has dominated the training chip market, it may face increased competition in the inference market, especially as demand grows for distributed, cloud-based inference solutions. Nvidia has acknowledged the growing demand for its chips in this area. CEO Jensen Huang highlighted the surge in demand for its latest AI chip, Blackwell, at a recent conference in India, citing new scaling laws that apply to inference.

Paraphrased from Reuters; all rights reserved by the original author.
