
Elon Musk unveils Grok-1.5, bringing performance closer to GPT-4 level

By VARINDIA - 2024-05-09

xAI has unveiled Grok-1.5, an upgraded version of Elon Musk's proprietary large language model (LLM). Grok-1.5 offers improved reasoning and problem-solving skills and comes close to the performance of established open and closed LLMs, such as Claude 3 from Anthropic and GPT-4 from OpenAI. However, it cannot process contexts as long as Gemini 1.5 Pro, which can handle up to one million tokens.

With the release of Grok-1.5, the company is building on Grok-1, delivering significant improvements over the previous model across all major benchmarks, including those covering coding and math tasks.

“In our tests, Grok-1.5 achieved a 50.6% score on the MATH benchmark and a 90% score on the GSM8K benchmark, two math benchmarks covering a wide range of grade school to high school competition problems. Additionally, it scored 74.1% on the HumanEval benchmark, which evaluates code generation and problem-solving abilities,” xAI noted in a blog post.

On the MMLU benchmark, which evaluates AI models’ language understanding capabilities across diverse tasks, the new model scored 81.3%, beating Grok-1’s 73% by 8.3 percentage points.

Beyond this, xAI also confirmed that Grok-1.5 has a context window of up to 128,000 tokens (tokens are whole words or fragments of words, or small units of images, video, audio or code). This lets the model take in and process 16 times more information in one go than Grok-1, making it better suited to analyzing, summarizing and extracting information from long documents. It can also handle longer and more complex prompts while maintaining its instruction-following capability.
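To put the context-window figures in perspective, the arithmetic can be sketched in a few lines of Python. This is an illustrative back-of-the-envelope calculation, not anything published by xAI: the function names are hypothetical, and the ratio of 0.75 English words per token is a common rule of thumb assumed here, since real tokenizers vary.

```python
# Figures quoted in the article.
GROK_15_CONTEXT_TOKENS = 128_000
CONTEXT_MULTIPLIER = 16  # "16 times more than Grok-1"

# Assumption (not from the article): roughly 0.75 English words per token.
WORDS_PER_TOKEN = 0.75


def implied_previous_context(current_tokens: int, multiplier: int) -> int:
    """Context size implied for the older model by the quoted multiplier."""
    return current_tokens // multiplier


def fits_in_context(word_count: int, context_tokens: int,
                    words_per_token: float = WORDS_PER_TOKEN) -> bool:
    """Rough check: does a document of `word_count` words fit in the window?"""
    estimated_tokens = word_count / words_per_token
    return estimated_tokens <= context_tokens


grok_1_context = implied_previous_context(GROK_15_CONTEXT_TOKENS, CONTEXT_MULTIPLIER)
print(grok_1_context)                                    # 8000
print(fits_in_context(50_000, GROK_15_CONTEXT_TOKENS))   # True
print(fits_in_context(50_000, grok_1_context))           # False
```

Under these assumptions, a 50,000-word report would fit in Grok-1.5's window in a single prompt, while the roughly 8,000-token window implied for Grok-1 would require splitting it into chunks.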