AI news digest

May 23, 2025

News

Anthropic Introduces Claude Opus 4 and Claude Sonnet 4 (already on Riser)
Anthropic announced Claude Opus 4 for coding and Claude Sonnet 4 for general tasks, with integrations for developer tools such as VS Code and GitHub. The models excel in multi-step reasoning and tool use.

CMG World Robot Competition: First Humanoid Robot Boxing Tournament
China Media Group (CMG) hosted the world's first humanoid robot boxing and martial arts tournament, featuring four teams controlling Unitree G1 robots in real time. Scoring depended on the type of strike landed (1 point for hand strikes, 3 for kicks). The event highlights rapid advances in robotics and AI-driven control systems.
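As a quick illustration of the scoring rule mentioned above (1 point per hand strike, 3 per kick), here is a minimal Python sketch. The strike labels and the `score` function are hypothetical names chosen for this example, and the organizers' full rulebook may include details not covered here.

```python
# Point values as stated in the digest; the labels "hand" and "kick"
# are hypothetical names used only for this illustration.
POINTS = {"hand": 1, "kick": 3}


def score(strikes):
    """Return the total points for a sequence of landed strikes."""
    return sum(POINTS[strike] for strike in strikes)


# Two hand strikes and one kick: 1 + 3 + 1 = 5
print(score(["hand", "kick", "hand"]))
```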

OpenAI Upgrades Operator Agent from GPT-4o to o3
OpenAI announced that its autonomous agent "Operator" is moving from a custom GPT-4o version to one based on the newer "o3" model. The new model is stronger in mathematical reasoning and safety, with tighter limits on the actions it can take, making it more resistant to prompt injection.

Nvidia Releases AceReason-Nemotron-14B Model
Nvidia introduced AceReason-Nemotron, a 14-billion-parameter model optimized for solving math and coding tasks. It was trained on mathematical problems and code, achieving high accuracy on benchmarks such as AIME 2024 (78.6%) and LiveCodeBench v5 (61.1%). Available on Hugging Face.

Microsoft Adds AI Features to Windows 11 Apps
Microsoft is testing AI features in Notepad, Paint, and Snipping Tool for Windows 11. Notepad's "Write" function generates and edits text, while Paint can create stickers from prompts. The Snipping Tool now offers an "Ideal Screenshot" feature that automatically crops images.

Apple Plans to Launch AI Smart Glasses by 2026
Apple is accelerating development of smart glasses with AI capabilities, including cameras, microphones, navigation, and environment analysis, with a focus on deep Siri integration. The glasses won't support AR at launch, but Apple aims to surpass competitors in quality.

Chinese Humanoid Robots Demonstrate Skills Before First Robo-Boxing Match
In Hangzhou, Unitree Robotics showcased robots performing strikes, jumps, and recoveries in preparation for the first robot boxing match scheduled for May 25. The robots are managed via three control methods, including new ones revealed during the tournament.

Valve Co-Founder's Startup Develops Brain Implant
Starfish Neuroscience, a startup backed by Valve co-founder Gabe Newell, announced that it is developing a brain implant similar to Neuralink's. The design allows multiple chips to be used together for more comprehensive brain interventions.

Google Launches Gemma 3n AI Model for On-Device Use
Gemma 3n is a lightweight AI model running on devices with only 2GB RAM, supporting text, speech, and image understanding. It is faster and supports multiple languages, suitable for voice assistants and translation apps. Available via Google AI Studio and SDK.

Intel Unveils New Xeon Processors for AI
Intel announced three new Xeon 6 processors optimized for AI and GPU management, with improved memory bandwidth (+30%) and PCIe lanes (+20%). These are already available for order.

Vercel Launches AI Model for Web Development
Vercel's v0-1.0-md model, available through a beta API, offers automatic bug fixing and code correction for frontend and full-stack development, with a context window of up to 128k tokens.

NVIDIA DreamGen: Synthetic Data for Robotics
Nvidia's DreamGen system generates training videos using text prompts, improving robot learning without real demonstrations. The system is tested on various platforms from warehouses to home robots.

Skywork Super Agents Top AI Benchmark GAIA
Skywork AI's "Super Agents" topped the GAIA benchmark. The product generates and edits content in multiple formats and is available as an online service and as an open-source framework.

Yoshua Bengio Warns on AI Autonomy
Yoshua Bengio emphasizes the existential risks of autonomous AI systems, advocating for non-agentic models that can predict and mitigate dangerous behaviors before they occur. He calls for responsible AI development.

xAI Launches Live Search API for Real-Time Data
xAI's new Live Search API fetches real-time data from social media, news, and internet sources, improving reasoning capabilities of their AI models.

ByteDance Releases BAGEL Multimodal Model
BAGEL supports text, images, and video editing, with high accuracy in understanding and generation, outperforming many open models.

NVIDIA Rumored to Launch RTX 5080 Super
Leaked details suggest a new high-end GPU with 24GB GDDR7 memory, potentially priced over $1000, with significant performance improvements.

Stability AI Upgrades Stable Video 4D to Version 2.0
New version enhances dynamic 4D asset creation for games and virtual environments, with better motion coherence and reduced artifacts.

AIOZ Network Launches Decentralized AI Marketplace
AIOZ AI offers a platform for sharing and monetizing AI models and datasets, leveraging blockchain-based distribution. Future features include model training.

OpenAI Acquires Jony Ive's Startup io for $6.5B
OpenAI is investing in hardware product development by acquiring io, the device startup co-founded by designer Jony Ive, with the aim of creating next-generation AI hardware.

Mistral AI Debuts Devstral: Open-Source Coding LLM
Devstral outperforms existing open-source models on coding benchmarks, with licensing allowing commercial use.

Ilya Sutskever on Building a Bunker for AGI
OpenAI co-founder Ilya Sutskever jokingly discusses building a "bunker" to prepare for AGI release, highlighting the existential risks and need for safety measures.

Google I/O 2025: Google AI Ultra and New Developer Tools
Google announced the Google AI Ultra subscription, multimodal AI tools, and new web automation features, emphasizing integration with Google services.

Apple to Release SDK for AI Model Integration in 2025
Apple's WWDC 2025 will feature SDKs for third-party AI models, focusing on local, privacy-preserving inference.

Red Hat Enhances AI in Enterprise Linux
Red Hat updates RHEL with AI-driven recommendations and container management, supporting AI development environments.

SAP Launches Joule AI Platform with Perplexity AI
SAP's Joule integrates AI into business workflows, with partnerships enhancing web search and automation capabilities.

Tencent Publishes Hunyuan-TurboS Hybrid LLM
Hunyuan-TurboS features adaptive Chain-of-Thought reasoning, saving tokens and optimizing resource usage, with strong performance in language benchmarks.

Deep Think: Advanced Reasoning Mode for Gemini
Google's Deep Think mode employs parallel reasoning and hypothesis testing, excelling on complex mathematical and coding benchmarks, including Olympiad-level problems such as USAMO.

DeepMind's Gemini Diffusion: Iterative Text Generation
This new approach refines outputs through repeated noise reduction, especially effective for mathematical and programming tasks.

Based on materials from the telegram channel @ai_machinelearning_big_data
