#78: Enabling the Future of AI (2025)
Join the prediction game, plus our usual collection of interesting articles, relevant news, and research papers. Dive in!
This Week in Turing Post:
Wednesday, AI 101, Technique/Method: What is Flow Matching?
Friday/Saturday, Agentic Workflow: Let's unfold the core elements, starting with Profiling and Knowledge
The main topic: Making a tradition of predicting together
Making predictions, especially about the future, is famously tricky, yet it remains a favorite year-end tradition. Antoine de Saint-Exupéry said it well: "Your task is not to foresee the future but to enable it." I believe that by choosing the right predictions, we enable the future the way we'd like it to be.
Last year, during the first week of December, Clem Delangue, CEO of Hugging Face and our dearest subscriber, published his predictions for 2024. We shared them and asked you to send us yours. We had an amazing response from people like Sara Hooker (Cohere), Yoshua Bengio (Mila), Max Hjelm (CoreWeave), and others. An analysis of which 2024 predictions came true will follow!
Today, we want to do the same, since Clem was right on time again. Let's make it a tradition!
Clement Delangue's six predictions for AI in 2025
The only thing we're changing this time is that we're giving you three questions to start with. If you want to add more predictions, feel free. You can also skip questions altogether (though we'd like you to answer at least one!)
Our three questions are:
What paper of 2024 is so significant that it will change 2025? Or what was the paper that surprised you the most?
What industry will experience the most disruption from AI advancements in 2025?
What overlooked challenge in AI today will become a major focus in 2025? Or which research areas are currently overlooked?
Send us your thoughts (ks@turingpost.com) to be featured in the special Predictions Edition of Turing Post!
The second main topic: Reasoning development on steroids
So, the conversations and actions around reasoning are heating up. In the last two weeks, we've been "exposed" to two very promising previews from China: DeepSeek-R1 and Alibaba's QwQ-32B, both attempting to challenge OpenAI's o1. Meanwhile, Google DeepMind is reportedly developing an AI model with advanced reasoning, leveraging chain-of-thought prompting.
And though it is very tempting to jump into the conversation about reasoning right away, we decided to wait for QwQ's tech report: the model made an especially big splash, but it's still a preview. Allegedly, the report is due out in about a month.
We've been collecting papers to analyze along with that research, and here is our list from the past week. Unsurprisingly, they all come from Chinese AI labs; unsurprisingly, because you can see China's forte in action: original innovation might be hard for the country, but copying, catching up, and improving on that foundation is what makes it outstanding. We will analyze and explain the significance of these and other reasoning-related papers later, when examining the QwQ report. For now, we're sharing the links if you want to dive right in:
Shanghai Jiao Tong University and GAIR researchers surpassed OpenAI's o1-preview on AIME 2024 with a simple distillation method and limited samples. Their model excelled in safety and generalization but relied heavily on the teacher model, prompting a call for first-principles research for sustainable AI innovation → read the paper
Tsinghua University researchers found that LLMs performing implicit reasoning skip step-by-step logic, relying on memory and intuition instead. Probing showed this to be unstable and less reliable than explicit Chain-of-Thought, which remains critical for accurate complex reasoning → read the paper
Also from Tsinghua University: HiAR-ICL automates reasoning in In-Context Learning with Monte Carlo Tree Search and "thought cards." HiAR-ICL emphasizes how to think, systematically addressing reasoning challenges via structured automation → read the paper
Twitter library
Top 10 GitHub Repositories to Master AI, Machine Learning and Data Science
Top Research (the last week was rich!)
An absolute hit from our Twitter (follow us there): Natural Language Reinforcement Learning
We are reading/watching
Simon Willison: The Future of Open Source and AI | Around the Prompt #10
Ben Thompson's article, "The Gen AI Bridge to the Future," argues that generative AI is the key link between today's devices and the wearable computing era, enabling transformative, context-aware interactions.
Devansh's piece on Dostoevsky (how could I have missed that!) explores the dangers of over-rationalization and idol worship, emphasizing unconditional love as a remedy for societal pitfalls. Read it here.
MIT Tech Review's "AI Minecraft Experiment Breakthrough" showcases Altera's Project Sid, where 1,000 LLM-powered agents in Minecraft formed communities, jobs, and even a parody religion, hinting at AI's potential to model human dynamics.
The New Yorker highlights robotics' "ChatGPT moment," as AI-driven learning gives robots dexterity and general-purpose capabilities. Explore the revolution.
News from The Usual Suspects ©
Google DeepMind revolutionizes 4D content creation and reinvents time-series analysis with visuals
DeepMind's CAT4D framework takes scene reconstruction to the next dimension, literally. Combining multi-view video diffusion with cutting-edge deformable Gaussian models, it reimagines 4D (dynamic 3D) filmmaking, AR, and synthetic content creation. State-of-the-art results, no external priors needed. Lights, camera, 4D action!
Google's multimodal models turn time-series data into plot-based prompts, boosting accuracy by 120% and cutting costs tenfold. From fall detection to physical activity tracking, the future of analysis is picture-perfect.
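The trick is easy to sketch: rather than pasting raw numbers into a prompt, render the series as an image and hand that to a multimodal model. Below is a minimal, hypothetical illustration; the message schema is generic, not Google's actual API, and the helper name is ours:

```python
import base64
import io

import matplotlib.pyplot as plt


def series_to_plot_message(values: list[float], question: str) -> list[dict]:
    """Render a time series as a line plot and wrap it in a multimodal prompt."""
    fig, ax = plt.subplots(figsize=(6, 2.5))
    ax.plot(values)
    ax.set_xlabel("time step")
    ax.set_ylabel("value")
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight")
    plt.close(fig)
    image_b64 = base64.b64encode(buf.getvalue()).decode()
    # Generic multimodal content parts; adapt the schema to your model's vision API.
    return [
        {"type": "text", "text": question},
        {"type": "image", "mime_type": "image/png", "data": image_b64},
    ]


parts = series_to_plot_message(
    [0.1, 0.3, 0.2, 0.9, 1.4],
    "Does this accelerometer trace contain a fall event?",
)
```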
Microsoft's LazyGraphRAG: A new RAG benchmark
Think smarter, not costlier. LazyGraphRAG skips pre-indexing and cuts indexing costs to 0.1% of its rivals'. Merging local agility with global prowess, it's 700x cheaper and twice as sharp in data analysis. Perfect for those who hate overspending on exploratory AI.
GitHub funds open source security
A $1.25M fund for better security in open source. GitHub teams up with Amex, Shopify, and Stripe to bolster the projects that keep the code world turning.
Anthropic's MCP Bridges AI and Data
Anthropic unveils the Model Context Protocol (MCP), an open standard that connects AI tools with diverse data sources. By replacing fragmented, one-off integrations with a single protocol, MCP lets assistants tap sources like Google Drive, GitHub, and Slack seamlessly.
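For the curious, here is roughly what the server side looks like; a minimal sketch assuming the Python SDK's FastMCP helper, with a stubbed tool standing in for a real data source:

```python
# pip install mcp
from mcp.server.fastmcp import FastMCP

# A tiny, hypothetical MCP server exposing one searchable "data source".
mcp = FastMCP("docs-demo")


@mcp.tool()
def search_docs(query: str) -> str:
    """Search an internal document store (stubbed for illustration)."""
    # A real server would query Google Drive, GitHub, Slack, etc. here.
    return f"Top result for '{query}': ..."


if __name__ == "__main__":
    mcp.run()  # serves MCP over stdio so clients like Claude Desktop can connect
```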
Meta AI speeds AI training with SPDL
Meta's new multi-threading framework, SPDL, streamlines data handling for AI training. Faster loading, better scaling, because time is (computing) money.
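The core bet is thread-based pipelining instead of process-based workers. The snippet below is not SPDL's API, just a plain-Python sketch of the pattern it accelerates: overlapping IO-bound loading with decoding so the training loop rarely waits.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load(i: int) -> bytes:
    time.sleep(0.01)                 # simulated storage wait; threads overlap these
    return bytes([i % 256]) * 1024   # stand-in for bytes read from disk or network


def decode(raw: bytes) -> list[int]:
    return list(raw[:16])            # stand-in for image/audio decoding


with ThreadPoolExecutor(max_workers=4) as pool:
    # Samples move through load -> decode concurrently, keeping the consumer fed.
    for sample in pool.map(lambda i: decode(load(i)), range(32)):
        pass  # feed the training step here
```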
Andrew Ng simplifies LLM Integration with 'aisuite'
Tired of juggling APIs? Andrew Ng's 'aisuite' lets developers seamlessly switch between large language models by simply updating a string. It might as well be a recommendation from our AI practitioner!
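In practice it looks like the snippet below, modeled on the project's README; the provider-prefixed model string is the whole trick (the model names are illustrative):

```python
# pip install aisuite
import aisuite as ai

client = ai.Client()
messages = [{"role": "user", "content": "Explain flow matching in one sentence."}]

# Swapping providers is just a string change: "<provider>:<model>".
for model in ["openai:gpt-4o", "anthropic:claude-3-5-sonnet-20241022"]:
    response = client.chat.completions.create(model=model, messages=messages)
    print(model, "->", response.choices[0].message.content)
```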
Amazing models from the last week
OLMo 2 by Allen AI
Allen AI unveils OLMo 2 with 7B and 13B parameter models trained on 5 trillion tokens.
Alibaba's QwQ-32B
Alibaba's QwQ-32B stirs excitement with strong math, coding, and reasoning benchmarks, placing it between Claude 3.5 Sonnet and OpenAI's o1-mini. Optimized for consumer GPUs via quantization, it's open-sourced under the Apache license, revealing reasoning tokens and weights, yet shows Chinese regulatory constraints. A tech report is expected in about a month.
ShowUI: GUI Automation
Show Lab, NUS, and Microsoft introduce ShowUI, a 2B vision-language-action model tailored for GUI tasks. It features UI-guided token selection (33% fewer tokens), interleaved streaming for multi-turn tasks, and a curated 256K-sample dataset, achieving 75.1% zero-shot grounding accuracy.
Adobe's MultiFoley
Adobe debuts MultiFoley, an AI model generating high-quality sound effects from text, audio, and video inputs. Cool demos highlight its creative potential.
INTELLECT-1 by Prime Intellect
INTELLECT-1, a 10B LLM trained over 42 days on 1T tokens across 14 global nodes, leverages the PRIME framework for exceptional efficiency (400× bandwidth reduction). Open-sourcing INTELLECT-1 and PRIME signals a leap in decentralized-training scalability.