AI Radar

Official AI change radar for models, APIs, pricing, and agents.

AI Jupyter monitors official provider sources, verifies publish dates, filters for model/API/pricing/agent impact, and turns each change into a builder-focused brief.

Official sources

Date verification

Impact filtering

Builder brief
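
The four stages named above can be read as a simple filter chain. The sketch below is only an illustration of that idea, not the radar's actual implementation: the `Change` record, its field names, the `IMPACT_CATEGORIES` set, and the `radar` function are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set

# Hypothetical category set, taken from the tagline (models, APIs, pricing, agents).
IMPACT_CATEGORIES = {"model", "api", "pricing", "agent"}


@dataclass
class Change:
    """Hypothetical record for one detected change."""
    provider: str
    headline: str
    category: str
    source_domain: str
    published: Optional[date]  # None when the publish date could not be verified


def radar(changes: Iterable[Change], official_domains: Set[str]) -> List[str]:
    """Run the four stages in order and return one brief line per surviving change."""
    briefs = []
    for change in changes:
        # 1. Official sources: keep only changes hosted on a known provider domain.
        if change.source_domain not in official_domains:
            continue
        # 2. Date verification: drop anything without a confirmed publish date.
        if change.published is None:
            continue
        # 3. Impact filtering: keep only model/API/pricing/agent changes.
        if change.category not in IMPACT_CATEGORIES:
            continue
        # 4. Builder brief: condense the change into a single line.
        briefs.append(
            f"{change.published.isoformat()} | {change.provider} | "
            f"{change.category} update | {change.headline}"
        )
    return briefs
```

Ordering the cheap checks first (domain, then date) means the impact filter only runs on changes that already have a trustworthy source and date.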

Jun 16, 2026 (2 verified changes)

1
Fireworks AI/Model update/Official source/Date verified
Fireworks adds GLM-5.2 with 1M-token context and coding-focused efficiency gains
What changed: Fireworks added GLM-5.2 to its pricing and playground pages, highlighting a 1M-token context window and coding-oriented performance improvements.
Key details: The page calls out a 1M-token context window. Fireworks says the model has advanced multi-effort coding capabilities. The description cites an IndexShare architecture and improved MTP layer for efficiency gains.
Why it matters: Builders evaluating long-context coding models now have another option with explicit context and efficiency claims to compare against existing offerings.
Check next: Verify the exact pricing, availability, and whether the model supports any special tool or modality features beyond what is shown here.
Source: fireworks.ai (official evidence)
2
NVIDIA (NIM, Nemotron)/Agent update/Official source/Date verified
NVIDIA cites AgentPerf benchmark results for Blackwell agent infrastructure
What changed: NVIDIA published a blog post highlighting AgentPerf results from Artificial Analysis and comparing agentic workload efficiency across systems.
Key details: The post says GB300 NVL72 runs up to 20x more agents per megawatt than Hopper. The benchmark referenced is AgentPerf. The post frames the result around real-world agentic workloads and infrastructure efficiency.
Why it matters: This affects infrastructure selection for teams running agent workloads at scale, where throughput per watt can influence platform and hardware choices.
Check next: Verify the benchmark methodology, whether the comparison is apples-to-apples, and whether the result is from NVIDIA, Artificial Analysis, or both.
Source: blogs.nvidia.com (official evidence)

Jun 15, 2026 (3 verified changes)

1
ElevenLabs/Model update/Official source/Date verified
ElevenLabs launches Scribe speech-to-text with 99-language transcription
What changed: ElevenLabs published a speech-to-text product page for Scribe, describing it as its speech-to-text model and highlighting transcription and captioning features.
Key details: The page says it transcribes in 99 languages. It can auto-generate captions. It also supports transcript editing and audio-video alignment.
Why it matters: This affects speech-to-text vendor selection for teams building transcription, captioning, or media workflows.
Check next: Verify API availability, pricing, supported file or streaming modes, and whether the 99-language claim applies to all features or only transcription.
Source: elevenlabs.io (official evidence)
2
xAI/Agent update/Official source/Date verified
xAI adds Voice Agent API for real-time WebSocket conversations
What changed: xAI documented a Voice Agent API for real-time voice conversations over WebSocket.
Key details: The docs describe real-time voice conversations. WebSocket is the transport called out in the evidence. The item appears in xAI release notes and developer docs.
Why it matters: This is a new integration surface for builders adding voice agents or live conversational experiences on xAI.
Check next: Verify supported audio formats, turn-taking behavior, latency expectations, and whether the API is in preview or generally available.
Source: docs.x.ai (official evidence)
3
Anthropic/Model update/Official source/Date verified
Anthropic updates Claude Platform release notes for API, SDK, and Console
What changed: Anthropic published a new Claude Platform release-notes entry covering updates across the Claude API, client SDKs, and the Claude Console.
Key details: The release-notes page is the official changelog entry point for platform changes. The update explicitly spans API, SDKs, and Console. No specific feature details were visible in the provided evidence.
Why it matters: Builders tracking Claude integrations need to watch the platform changelog for changes that can affect API behavior, SDK usage, or console workflows.
Check next: Open the release notes to identify the concrete API, SDK, or console changes and whether any are breaking or require migration.
Source: platform.claude.com (official evidence)

Jun 12, 2026 (5 verified changes)

1
Baseten/Model update/Official source/Date verified
Baseten adds rolling deployments for zero-downtime model updates
What changed: Baseten added rolling deployments for model updates, with incremental rollout controls designed to avoid both downtime and a doubling of GPU spend.
Key details: Teams can pause, resume, and roll back during a rollout. Baseten says teams can ship 50–60% more often. The post frames this as a deployment workflow change for models on Baseten.
Why it matters: This changes how teams can safely ship model updates in production, especially when rollout risk and GPU cost are part of the deployment decision.
Check next: Verify whether rolling deployments are available to all Baseten users, any rollout prerequisites, and whether there are limits on supported model types or deployment sizes.
Source: baseten.co (official evidence)
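
For readers unfamiliar with the pattern, a rolling update replaces replicas in small batches instead of standing up a full second fleet, which is why it can avoid doubling GPU spend. The sketch below is a generic simulation of that idea and makes no claim about how Baseten implements it: `rolling_update`, the `healthy` callback, and the version strings are hypothetical, and pause/resume controls are not modeled.

```python
from typing import Callable, List


def rolling_update(
    replicas: List[str],
    new_version: str,
    batch_size: int,
    healthy: Callable[[str], bool],
) -> bool:
    """Replace replicas batch by batch; restore the old fleet if a batch is unhealthy.

    `replicas` holds one version string per replica and is mutated in place.
    Returns True when every replica runs `new_version`, False after a rollback.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    original = list(replicas)  # snapshot used for rollback
    for start in range(0, len(replicas), batch_size):
        end = min(start + batch_size, len(replicas))
        # Only this batch changes; the remaining replicas keep serving traffic.
        for i in range(start, end):
            replicas[i] = new_version
        # Hypothetical health check on the newly updated batch.
        if not healthy(new_version):
            replicas[:] = original  # roll back every replica touched so far
            return False
    return True
```

With four replicas and `batch_size=2`, at most two replicas are in transition at any moment, whereas a blue/green swap would need four extra replicas for the duration of the cutover.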
2
Hugging Face Inference/Model update/Official source/Date verified
Ai2 posts olmo-eval as an evaluation workbench for model development
What changed: Ai2 published olmo-eval, described as an evaluation workbench for the model development loop.
Key details: The post is framed around evaluation during model development, not deployment. It is hosted on Hugging Face as a blog post by Ai2. The source evidence does not provide feature-level details beyond the workbench positioning.
Why it matters: Evaluation tooling can change how teams compare models, track regressions, and structure the development loop before release.
Check next: Verify what metrics, datasets, or automation the workbench supports and whether code or docs are available.
Source: huggingface.co (official evidence)
3
Fireworks AI/Agent update/Official source/Date verified
Fireworks adds day-0 support for Kimi K2.7 Code with serverless access
What changed: Fireworks says it is launching day-0 support for Moonshot’s Kimi K2.7 Code on its serverless infrastructure and that developers can start integrating it via the Fireworks API now.
Key details: The model is described as part of Moonshot’s K2 coding series with a 1T-parameter, 256K-context architecture. Fireworks says K2.7 Code uses about 30% fewer reasoning tokens than K2.6 while scoring higher on coding benchmarks including Kimi Code Bench v2, Program Bench, and MLS Bench Lite. Fireworks says availability i…
Why it matters: This changes model choice and cost/performance tradeoffs for teams building coding agents, especially if they care about lower reasoning-token spend and immediate deployment on a hosted API.
Check next: Verify the API surface, pricing, and whether the serving tiers differ in latency, throughput, or queueing behavior.
Source: fireworks.ai (official evidence)
4
PixVerse/Agent update/Official source/Date verified
PixVerse launches Canvas for repeatable AI video workflows
What changed: PixVerse introduced PixVerse Canvas, a visual workspace for AI video workflows that organizes assets, storyboards, batch tasks, and multi-model results on one canvas instead of one-off prompts.
Key details: The product is positioned as a workflow workspace rather than a single-prompt generator. The official description explicitly mentions organizing assets, storyboards, batch tasks, and multi-model results. The update is framed as a product launch on June 12, 2026.
Why it matters: This affects how teams structure video generation workflows, especially if they need repeatability, batch operations, or comparison across models.
Check next: Check whether Canvas exposes collaboration, export, API, or automation hooks, and whether it is available to all users or limited by plan.
Source: pixverse.ai (official evidence)
5
Fireworks AI/Model update/Official source/Date verified
Fireworks adds Kimi K2.7 Code with long-context coding pricing and playground access
What changed: Fireworks added Kimi K2.7 Code to its model catalog and pricing/playground pages, positioning it as a coding-focused agentic model with long-horizon coding improvements.
Key details: Listed price shown in the title is $0.95/M input and $4/M output. The page shows a 262,144-token context window. Fireworks says it reduces thinking-token usage by about 30% versus Kimi K2.6.
Why it matters: Builders choosing a coding model now have a new option to compare on context length, token efficiency, and per-token cost.
Check next: Verify whether the model is generally available, whether the pricing applies to both API and playground, and whether any modality or tool-use limits are documented.
Source: fireworks.ai (official evidence)
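
To compare on per-token cost, the listed figures can be turned into a rough per-request estimate. This is a minimal sketch under stated assumptions: it takes the $0.95/M input and $4/M output prices at face value, assumes thinking tokens are billed as output tokens (not confirmed by the source), and uses a made-up workload of 100,000 input tokens and 10,000 output tokens. The `request_cost` helper is hypothetical.

```python
# Listed figures from the item above; treat them as a snapshot, not a quote.
INPUT_PRICE_PER_M = 0.95   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 4.00  # USD per 1M output tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-token prices."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Hypothetical workload: 100,000 input tokens and 10,000 output tokens per request.
baseline = request_cost(100_000, 10_000)
# Same request with 30% fewer output tokens, assuming thinking tokens bill as output.
reduced = request_cost(100_000, 7_000)

print(f"baseline ${baseline:.4f}, reduced ${reduced:.4f}")
```

In this made-up workload the input side dominates the bill, so a 30% cut in output tokens moves the total by much less than 30%. The effect grows for requests with short prompts and long reasoning traces.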

Jun 11, 2026 (3 verified changes)

1
Baseten/Model update/Official source/Date verified
Baseten adds Mercury 2, a reasoning diffusion LLM, to its platform
What changed: Baseten announced that Inception's Mercury 2 is now available on Baseten, positioning it as a reasoning diffusion LLM for speed-focused use cases.
Key details: Baseten describes it as the first reasoning diffusion LLM. The post claims it is 5–10x faster than leading speed-optimized models at comparable quality. The update is about platform availability on Baseten.
Why it matters: Builders choosing a low-latency reasoning model now have a new hosted option to evaluate for production inference.
Check next: Verify whether the model is generally available, what the serving limits are, and whether Baseten exposes any special deployment or pricing terms for it.
Source: baseten.co (official evidence)
2
Deepgram/API update/Official source/Date verified
Deepgram Self-Hosted 260611 adds profanity filtering, redaction, diarization, and TTS transcoding
What changed: Deepgram says Self-Hosted release 260611 adds Persian profanity filtering, English redaction on Flux streaming, a streaming diarization model parameter, and text-to-speech output transcoding.
Key details: The update is specifically for Deepgram Self-Hosted. It includes language-specific profanity filtering for Persian and English redaction on Flux streaming. It also adds a streaming diarization model parameter plus TTS output transcoding.
Why it matters: These are concrete API/runtime capabilities that affect speech pipeline design, moderation, and deployment choices for self-hosted users.
Check next: Confirm whether the changes apply to all self-hosted customers, and whether any model or endpoint versioning is required.
Source: developers.deepgram.com (official evidence)
3
Deepgram/Agent update/Official source/Date verified
Deepgram adds MCP servers and agent skills for coding tools
What changed: Deepgram published docs for agentic developer tools, including MCP servers and agent skills that give AI coding tools built-in knowledge of Deepgram APIs, docs, and starter apps.
Key details: The docs page is titled "Agentic developer tools." The page explicitly mentions Deepgram CLI, MCP servers, and agent skills. The stated goal is to help coding tools use Deepgram APIs, docs, and starter apps.
Why it matters: This changes how builders can wire AI coding agents into Deepgram’s ecosystem, reducing setup work for tool use and documentation grounding.
Check next: Verify which MCP server capabilities are documented, what tools are exposed, and whether there are setup or auth requirements.
Source: developers.deepgram.com (official evidence)
