How Zapier can minimize your AI spend
You ever watch a hot dog eating contest? It's impressive to see someone wolf down five franks per minute, but you just know the stomach pains are coming. This is the image that comes to mind when I hear about companies tracking how many AI tokens their employees consume, to make sure they're using AI "enough." And even without this kind of performative AI theater, your monthly AI bill might be giving you Joey Chestnut levels of indigestion. If your AI token spend costs more than whatever amount
AI inference startup Baseten reportedly raising $1.5B months after its last mega-round
Startup Baseten is reportedly close to finalizing a $1.5 billion round at a $13 billion as the “inference gold rush" marches on.
92% of sales teams drop qualified leads every month—here's why follow-ups are breaking down
If you're a sales manager, you probably know what tools you're supposed to give your team: a CRM, well-crafted follow-up sequences, and maybe a few AI agents running in the background. But chances are you're still one of the 92% of sales leaders who say their teams lose qualified leads every month because follow-ups are delayed, inconsistent, or forgotten entirely. If you have the tools, but not the results, you know something's broken. To find out what's blocking teams, we surveyed over 400 B2
What is a task in Zapier? Everything to know about Zapier's task-based pricing
When my husband and I say we're running to the pet store for "a few things," we both know that's hilariously optimistic. We might go in planning to pick up kibble and maybe refill the treat jar, but there's no way we can resist maxing out our budget on dog toys once we're there. "A few things" doesn't actually tell you much about what's happening. Automation tools can be the same way. They all talk about "tasks" (or executions, or runs, or activities, the list goes on), but if you don't know wha
The India Story: Graph Technology can power the country’s data-driven growth - ET CIO
The India Story: Graph Technology can power the country’s data-driven growth ET CIO
OpenAI is bringing on some big guns in the lead-up to its IPO
OpenAI is bulking up before its IPO, landing Transformer co-inventor Noam Shazeer from Google DeepMind and former Trump AI policy official Dean Ball in the same week.
AI data centers just got a government-mandated fast lane to the grid
FERC told grid operators to give data centers a fast lane for interconnections, but it failed to address electricity supply shortages.
Meet the first 2026 Zappy Award monthly winners: May 2026
We launched the Zappy Awards in May to find the builders quietly redesigning how work gets done at their companies. We're on the hunt for the people who see a problem, pick up Zapier, and do something about it. We've hit 50 submissions. We weren't expecting the bar to be this high this fast. So we've decided to move up our first monthly wins to start right now! These are the first two monthly winners. Rachael Silvano, Community Strategy Lead at Articulate Rachael manages E-Learning Heroes, a co
Almost half of US singles feel negatively about AI in dating, Match says
About 47% of singles look negatively at the use of AI in dating -- but many dating app users are open to AI helping with profile punch-ups and conversation starters.
How an astrophysicist uses Codex to help simulate black holes
Discover how astrophysicist Chi-kwan Chan uses Codex to build black hole simulations, helping scientists study extreme physics and test Einstein’s theory of general relativity.
The US says ASML’s top chip tool may be in China, but how?
There's a commercial logic that cuts against the idea that ASML would risk its export license to arm a Chinese customer.
How to turn off AI in your Google Docs
Here's what you need to do to get those pesky "write with Gemini" pop-ups to go away.
Predicting model behavior before release by simulating deployment
OpenAI introduces Deployment Simulation, a method to predict AI model behavior before deployment using real conversation data to improve safety and evaluation accuracy.
Improving health intelligence in ChatGPT
Learn how GPT-5.5 Instant improves ChatGPT’s health and wellness responses with stronger reasoning, better context, clearer communication, and physician-informed evaluations.
Cal State faculty push to prevent AI tools from replacing them as schools and staff experiment - CalMatters
Cal State faculty push to prevent AI tools from replacing them as schools and staff experiment CalMatters
100 things we announced at I/O 2026
We've been busy! Here’s a rundown of the top announcements, launches and demos at I/O 2026.
I/O 2026
At Google I/O 2026, we shared how we’re making AI more helpful for everyone. See everything we announced.
BBVA puts AI at the core of banking with OpenAI
Learn how BBVA scaled ChatGPT Enterprise to 100,000 employees and partnered with OpenAI to accelerate AI-powered banking transformation worldwide.
Save time and grow your business with new Gemini tools
An overview of new features in the Gemini app designed specifically to support businesses and entrepreneurs.
What is artificial intelligence (AI)? - Databricks
What is artificial intelligence (AI)? Databricks
A new experiment brings better group meetings to Google Beam
See and hear your colleagues in true-to-life size and sound, making hybrid meetings feel more inclusive and connected.
The latest AI news we announced in May 2026
Here are Google’s latest AI updates from May 2026
Check out real-life AI prototypes from the Futures Lab.
University of Waterloo students develop AI prototypes like sign language tutors to reshape the future of education and work.
The 9 best fitness apps in 2026
If you're overwhelmed by the number of fitness apps on the market, you're not alone. There are seemingly a bajillion fitness apps available, and from logging your personal bests to tracking your pickleball wins, each has its own niche. That's great in terms of giving users variety—but when you're scoping out a new app, it can often feel like you're comparing apples to oranges. And staying fit is hard enough; your fitness app should make it easier, not more complicated. As someone who's tried a
Making it easier to understand how content was created and edited
We're expanding our tools to help you understand how content was created and edited across the web.
Introducing the OpenAI Partner Network
OpenAI launches the Partner Network, investing $150M to help global partners accelerate enterprise AI adoption, deployment, and transformation.
New OpenAI Academy courses for the next era of work
OpenAI introduces three Academy courses that help people build practical AI skills, create repeatable workflows, and apply agents in everyday work.
We’re strengthening our presence in Alabama through new investments and community support.
Google has announced a $1.5 billion investment for 2026 and 2027 to expand its data center campus in Jackson County, Alabama. Operating since 2019 on a repurposed former…
China Unveils World’s First AI Hospital: 14 Virtual Doctors Ready to Treat Thousands Daily
China has unveiled the world’s first fully AI-powered hospital, marking a radical shift in the future of healthcare. Developed by Tsinghua University in Beijing, the “Agent Hospital” features 14 AI doctors and 4 AI nurses that can diagnose, treat, and manage up to 3,000 patients per day, without any human staff. Faster, smarter care: What would take human doctors 3 years, the AI doctors can do in 1 day. High IQ bots: These AI agents scored a 93.06% pass rate on the US Medical Licensing Exam. Training without risk: The virtual hospital allows medical students to practice in a fully The post China Unveils World’s First AI Hospital: 14 Virtual Doctors Ready to Treat Thousands Daily appeared first on DailyAI.
Context intelligence for your data and AI agents at scale - Amazon Web Services (AWS)
Context intelligence for your data and AI agents at scale Amazon Web Services (AWS)
Billionaire Ambani wants AI in every call, app, and home
Reliance is weaving AI into telecom services used by more than 500 million people.
Take our I/O 2026 quiz, vibe coded in Google AI Studio.
We used Google AI Studio to vibe code a quiz about our top I/O 2026 announcements.
Anthropic launches Cowork, a Claude Desktop agent that works in your files — no coding required
Anthropic released Cowork on Monday, a new AI agent capability that extends the power of its wildly successful Claude Code tool to non-technical users — and according to company insiders, the team built the entire feature in approximately a week and a half, largely using Claude Code itself. The launch marks a major inflection point in the race to deliver practical AI agents to mainstream users, positioning Anthropic to compete not just with OpenAI and Google in conversational AI, but with Microsoft's Copilot in the burgeoning market for AI-powered productivity tools. "Cowork lets you complete non-technical tasks much like how developers use Claude Code," the company announced via its official Claude account on X. The feature arrives as a research preview available exclusively to Claude Max subscribers — Anthropic's power-user tier priced between $100 and $200 per month — through the macOS desktop application. For the past year, the industry narrative has focused on large language models that can write poetry or debug code. With Cowork, Anthropic is betting that the real enterprise value lies in an AI that can open a folder, read a messy pile of receipts, and generate a structured expense report without human hand-holding. How developers using a coding tool for vacation research inspired Anthropic's latest product The genesis of Cowork lies in Anthropic's recent success with the developer community. In late 2024, the company released Claude Code, a terminal-based tool that allowed software engineers to automate rote programming tasks. The tool was a hit, but Anthropic noticed a peculiar trend: users were forcing the coding tool to perform non-coding labor. According to Boris Cherny, an engineer at Anthropic, the company observed users deploying the developer tool for an unexpectedly diverse array of tasks. "Since we launched Claude Code, we saw people using it for all sorts of non-coding work: doing vacation research, building slide decks, cleaning up your email, cancelling subscriptions, recovering wedding photos from a hard drive, monitoring plant growth, controlling your oven," Cherny wrote on X. "These use cases are diverse and surprising — the reason is that the underlying Claude Agent is the best agent, and Opus 4.5 is the best model." Recognizing this shadow usage, Anthropic effectively stripped the command-line complexity from their developer tool to create a consumer-friendly interface. In its blog post announcing the feature, Anthropic explained that developers "quickly began using it for almost everything else," which "prompted us to build Cowork: a simpler way for anyone — not just developers — to work with Claude in the very same way." Inside the folder-based architecture that lets Claude read, edit, and create files on your computer Unlike a standard chat interface where a user pastes text for analysis, Cowork requires a different level of trust and access. Users designate a specific folder on their local machine that Claude can access. Within that sandbox, the AI agent can read existing files, modify them, or create entirely new ones. Anthropic offers several illustrative examples: reorganizing a cluttered downloads folder by sorting and intelligently renaming each file, generating a spreadsheet of expenses from a collection of receipt screenshots, or drafting a report from scattered notes across multiple documents. "In Cowork, you give Claude access to a folder on your computer. Claude can then read, edit, or create files in that folder," the company explained on X. "Try it to create a spreadsheet from a pile of screenshots, or produce a first draft from scattered notes." The architecture relies on what is known as an "agentic loop." When a user assigns a task, the AI does not merely generate a text response. Instead, it formulates a plan, executes steps in parallel, checks its own work, and asks for clarification if it hits a roadblock. Users can queue multiple tasks and let Claude process them simultaneously — a workflow Anthropic describes as feeling "much less like a back-and-forth and much more like leaving messages for a coworker." The system is built on Anthropic's Claude Agent SDK, meaning it shares the same underlying architecture as Claude Code. Anthropic notes that Cowork "can take on many of the same tasks that Claude Code can handle, but in a more approachable form for non-coding tasks." The recursive loop where AI builds AI: Claude Code reportedly wrote much of Claude Cowork Perhaps the most remarkable detail surrounding Cowork's launch is the speed at which the tool was reportedly built — highlighting a recursive feedback loop where AI tools are being used to build better AI tools. During a livestream hosted by Dan Shipper, Felix Rieseberg, an Anthropic employee, confirmed that the team built Cowork in approximately a week and a half. Alex Volkov, who covers AI developments, expressed surprise at the timeline: "Holy shit Anthropic built 'Cowork' in the last... week and a half?!" This prompted immediate speculation about how much of Cowork was itself built by Claude Code. Simon Smith, EVP of Generative AI at Klick Health, put it bluntly on X: "Claude Code wrote all of Claude Cowork. Can we all agree that we're in at least somewhat of a recursive improvement loop here?" The implication is profound: Anthropic's AI coding agent may have substantially contributed to building its own non-technical sibling product. If true, this is one of the most visible examples yet of AI systems being used to accelerate their own development and expansion — a strategy that could widen the gap between AI labs that successfully deploy their own agents internally and those that do not. Connectors, browser automation, and skills extend Cowork's reach beyond the local file system Cowork doesn't operate in isolation. The feature integrates with Anthropic's existing ecosystem of connectors — tools that link Claude to external information sources and services such as Asana, Notion, PayPal, and other supported partners. Users who have configured these connections in the standard Claude interface can leverage them within Cowork sessions. Additionally, Cowork can pair with Claude in Chrome, Anthropic's browser extension, to execute tasks requiring web access. This combination allows the agent to navigate websites, click buttons, fill forms, and extract information from the internet — all while operating from the desktop application. "Cowork includes a number of novel UX and safety features that we think make the product really special," Cherny explained, highlighting "a built-in VM [virtual machine] for isolation, out of the box support for browser automation, support for all your claude.ai data connectors, asking you for clarification when it's unsure." Anthropic has also introduced an initial set of "skills" specifically designed for Cowork that enhance Claude's ability to create documents, presentations, and other files. These build on the Skills for Claude framework the company announced in October, which provides specialized instruction sets Claude can load for particular types of tasks. Why Anthropic is warning users that its own AI agent could delete their files The transition from a chatbot that suggests edits to an agent that makes edits introduces significant risk. An AI that can organize files can, theoretically, delete them. In a notable display of transparency, Anthropic devoted considerable space in its announcement to warning users about Cowork's potential dangers — an unusual approach for a product launch. The company explicitly acknowledges that Claude "can take potentially destructive actions (such as deleting local files) if it's instructed to." Because Claude might occasionally misinterpret instructions, Anthropic urges users to provide "very clear guidance" about sensitive operations. More concerning is the risk of prompt injection attacks — a technique where malicious actors embed hidden instructions in content Claude might encounter online, potentially causing the agent to bypass safeguards or take harmful actions. "We've built sophisticated defenses against prompt injections," Anthropic wrote, "but agent safety — that is, the task of securing Claude's real-world actions — is still an active area of development in the industry." The company characterized these risks as inherent to the current state of AI agent technology rather than unique to Cowork. "These risks aren't new with Cowork, but it might be the first time you're using a more advanced tool that moves beyond a simple conversation," the announcement notes. Anthropic's desktop agent strategy sets up a direct challenge to Microsoft Copilot The launch of Cowork places Anthropic in direct competition with Microsoft, which has spent years attempting to integrate its Copilot AI into the fabric of the Windows operating system with mixed adoption results. However, Anthropic's approach differs in its isolation. By confining the agent to specific folders and requiring explicit connectors, they are attempting to strike a balance between the utility of an OS-level agent and the security of a sandboxed application. What distinguishes Anthropic's approach is its bottom-up evolution. Rather than designing an AI assistant and retrofitting agent capabilities, Anthropic built a powerful coding agent first — Claude Code — and is now abstracting its capabilities for broader audiences. This technical lineage may give Cowork more robust agentic behavior from the start. Claude Code has generated significant enthusiasm among developers since its initial launch as a command-line tool in late 2024. The company expanded access with a web interface in October 2025, followed by a Slack integration in December. Cowork is the next logical step: bringing the same agentic architecture to users who may never touch a terminal. Who can access Cowork now, and what's coming next for Windows and other platforms For now, Cowork remains exclusive to Claude Max subscribers using the macOS desktop application. Users on other subscription tiers — Free, Pro, Team, or Enterprise — can join a waitlist for future access. Anthropic has signaled clear intentions to expand the feature's reach. The blog post explicitly mentions plans to add cross-device sync and bring Cowork to Windows as the company learns from the research preview. Cherny set expectations appropriately, describing the product as "early and raw, similar to what Claude Code felt like when it first launched." To access Cowork, Max subscribers can download or update the Claude macOS app and click on "Cowork" in the sidebar. The real question facing enterprise AI adoption For technical decision-makers, the implications of Cowork extend beyond any single product launch. The bottleneck for AI adoption is shifting — no longer is model intelligence the limiting factor, but rather workflow integration and user trust. Anthropic's goal, as the company puts it, is to make working with Claude feel less like operating a tool and more like delegating to a colleague. Whether mainstream users are ready to hand over folder access to an AI that might misinterpret their instructions remains an open question. But the speed of Cowork's development — a major feature built in ten days, possibly by the company's own AI — previews a future where the capabilities of these systems compound faster than organizations can evaluate them. The chatbot has learned to use a file manager. What it learns to use next is anyone's guess.
The US banned Anthropic’s Fable 5 release, but the numbers don’t seem to care
Just as last week was ending, the US government forced Anthropic to pull its two newest models, Fable 5 and Mythos 5, citing national security concerns after Amazon researchers allegedly found a way to bypass Fable 5’s guardrails. Cybersecurity researchers have since signed an open letter calling the move dangerous, and Anthropic itself noted the same jailbreaks exist in other models. So is […]
Netflix Adds ChatGPT-Powered AI to Stop You From Scrolling Forever
In a bold move to tackle one of streaming’s biggest frustrations, endless scrolling, Netflix just unveiled a major redesign of its TV and mobile apps featuring a ChatGPT-powered AI chatbot and TikTok-style video reels. You’ll soon be able to ask Netflix in plain language what you’re in the mood for “funny and fast-paced” or “dark thrillers with strong female leads” and get instant, tailored recommendations. Netflix is partnering with OpenAI to power this feature, part of a broader overhaul aimed at making content discovery faster, more intuitive, and (finally) less painful. What’s changing Conversational AI Search: Powered by OpenAI, this The post Netflix Adds ChatGPT-Powered AI to Stop You From Scrolling Forever appeared first on DailyAI.
PRC-linked influence operations are targeting AI debates in the US
A new report from OpenAI details PRC-linked influence operations using AI to target U.S. tech debates, data center narratives, tariffs, and false claims about ChatGPT.
Powering the world’s first AI arts museum
Refik Anadol Studio opens Dataland, the first museum of AI arts, powered by Google Cloud and supported by Google Arts & Culture.
The CEO of Allbirds’ new AI biz has a plan, but no team
Call it a startup with a sole founder and a very large seed round, but what's next is less clear.
The smartphone era created an attention crisis — slow tech is fixing it
“People just really want to take back control of their time, their lives, their attention... They’re down for whatever helps them do that.”
Employee onboarding automation: A complete guide
Too busy onboarding and offboarding employees to focus on other business-critical processes? Whether you're in HR or IT, it's your job to make sure new employees have the tools they need to kickstart their careers—and to wrap up when they leave. But doing that (on top of your other priority work) can quickly become overwhelming if you're handling these processes manually. With just a few Zap workflows—what we call automations—you can send team notifications about new employees, assign onboardi
Breakingviews - Galbraith’s bezzle lurks beneath the AI frenzy - Reuters
Breakingviews - Galbraith’s bezzle lurks beneath the AI frenzy Reuters
Introducing LifeSciBench
Introducing LifeSciBench, an expert-authored, expert-reviewed benchmark for evaluating how AI systems handle real-world life science research tasks and decisions.
Is the US government’s Anthropic ban accidentally helping the brand?
Just as last week was ending, the US government forced Anthropic to pull its two newest models, Fable 5 and Mythos 5, citing national security concerns after Amazon researchers allegedly found a way to bypass Fable 5’s guardrails. Cybersecurity researchers have since signed an open letter calling the move dangerous, and Anthropic itself noted the same jailbreaks exist in other models. So is […]
Nous Research's NousCoder-14B is an open-source coding model landing right in the Claude Code moment
Nous Research, the open-source artificial intelligence startup backed by crypto venture firm Paradigm, released a new competitive programming model on Monday that it says matches or exceeds several larger proprietary systems — trained in just four days using 48 of Nvidia's latest B200 graphics processors. The model, called NousCoder-14B, is another entry in a crowded field of AI coding assistants, but arrives at a particularly charged moment: Claude Code, the agentic programming tool from rival Anthropic, has dominated social media discussion since New Year's Day, with developers posting breathless testimonials about its capabilities. The simultaneous developments underscore how quickly AI-assisted software development is evolving — and how fiercely companies large and small are competing to capture what many believe will become a foundational technology for how software gets written. type: embedded-entry-inline id: 74cSyrq6OUrp9SEQ5zOUSl NousCoder-14B achieves a 67.87 percent accuracy rate on LiveCodeBench v6, a standardized evaluation that tests models on competitive programming problems published between August 2024 and May 2025. That figure represents a 7.08 percentage point improvement over the base model it was trained from, Alibaba's Qwen3-14B, according to Nous Research's technical report published alongside the release. "I gave Claude Code a description of the problem, it generated what we built last year in an hour," wrote Jaana Dogan, a principal engineer at Google responsible for the Gemini API, in a viral post on X last week that captured the prevailing mood around AI coding tools. Dogan was describing a distributed agent orchestration system her team had spent a year developing — a system Claude Code approximated from a three-paragraph prompt. The juxtaposition is instructive: while Anthropic's Claude Code has captured imaginations with demonstrations of end-to-end software development, Nous Research is betting that open-source alternatives trained on verifiable problems can close the gap — and that transparency in how these models are built matters as much as raw capability. How Nous Research built an AI coding model that anyone can replicate What distinguishes the NousCoder-14B release from many competitor announcements is its radical openness. Nous Research published not just the model weights but the complete reinforcement learning environment, benchmark suite, and training harness — built on the company's Atropos framework — enabling any researcher with sufficient compute to reproduce or extend the work. "Open-sourcing the Atropos stack provides the necessary infrastructure for reproducible olympiad-level reasoning research," noted one observer on X, summarizing the significance for the academic and open-source communities. The model was trained by Joe Li, a researcher in residence at Nous Research and a former competitive programmer himself. Li's technical report reveals an unexpectedly personal dimension: he compared the model's improvement trajectory to his own journey on Codeforces, the competitive programming platform where participants earn ratings based on contest performance. Based on rough estimates mapping LiveCodeBench scores to Codeforces ratings, Li calculated that NousCoder-14B's improvemen t— from approximately the 1600-1750 rating range to 2100-2200 — mirrors a leap that took him nearly two years of sustained practice between ages 14 and 16. The model accomplished the equivalent in four days. "Watching that final training run unfold was quite a surreal experience," Li wrote in the technical report. But Li was quick to note an important caveat that speaks to broader questions about AI efficiency: he solved roughly 1,000 problems during those two years, while the model required 24,000. Humans, at least for now, remain dramatically more sample-efficient learners. Inside the reinforcement learning system that trains on 24,000 competitive programming problems NousCoder-14B's training process offers a window into the increasingly sophisticated techniques researchers use to improve AI reasoning capabilities through reinforcement learning. The approach relies on what researchers call "verifiable rewards" — a system where the model generates code solutions, those solutions are executed against test cases, and the model receives a simple binary signal: correct or incorrect. This feedback loop, while conceptually straightforward, requires significant infrastructure to execute at scale. Nous Research used Modal, a cloud computing platform, to run sandboxed code execution in parallel. Each of the 24,000 training problems contains hundreds of test cases on average, and the system must verify that generated code produces correct outputs within time and memory constraints — 15 seconds and 4 gigabytes, respectively. The training employed a technique called DAPO (Dynamic Sampling Policy Optimization), which the researchers found performed slightly better than alternatives in their experiments. A key innovation involves "dynamic sampling" — discarding training examples where the model either solves all attempts or fails all attempts, since these provide no useful gradient signal for learning. The researchers also adopted "iterative context extension," first training the model with a 32,000-token context window before expanding to 40,000 tokens. During evaluation, extending the context further to approximately 80,000 tokens produced the best results, with accuracy reaching 67.87 percent. Perhaps most significantly, the training pipeline overlaps inference and verification — as soon as the model generates a solution, it begins work on the next problem while the previous solution is being checked. This pipelining, combined with asynchronous training where multiple model instances work in parallel, maximizes hardware utilization on expensive GPU clusters. The looming data shortage that could slow AI coding model progress Buried in Li's technical report is a finding with significant implications for the future of AI development: the training dataset for NousCoder-14B encompasses "a significant portion of all readily available, verifiable competitive programming problems in a standardized dataset format." In other words, for this particular domain, the researchers are approaching the limits of high-quality training data. "The total number of competitive programming problems on the Internet is roughly the same order of magnitude," Li wrote, referring to the 24,000 problems used for training. "This suggests that within the competitive programming domain, we have approached the limits of high-quality data." This observation echoes growing concern across the AI industry about data constraints. While compute continues to scale according to well-understood economic and engineering principles, training data is "increasingly finite," as Li put it. "It appears that some of the most important research that needs to be done in the future will be in the areas of synthetic data generation and data efficient algorithms and architectures," he concluded. The challenge is particularly acute for competitive programming because the domain requires problems with known correct solutions that can be verified automatically. Unlike natural language tasks where human evaluation or proxy metrics suffice, code either works or it doesn't — making synthetic data generation considerably more difficult. Li identified one potential avenue: training models not just to solve problems but to generate solvable problems, enabling a form of self-play similar to techniques that proved successful in game-playing AI systems. "Once synthetic problem generation is solved, self-play becomes a very interesting direction," he wrote. A $65 million bet that open-source AI can compete with Big Tech Nous Research has carved out a distinctive position in the AI landscape: a company committed to open-source releases that compete with — and sometimes exceed — proprietary alternatives. The company raised $50 million in April 2025 in a round led by Paradigm, the cryptocurrency-focused venture firm founded by Coinbase co-founder Fred Ehrsam. Total funding reached $65 million, according to some reports. The investment reflected growing interest in decentralized approaches to AI training, an area where Nous Research has developed its Psyche platform. Previous releases include Hermes 4, a family of models that we reported "outperform ChatGPT without content restrictions," and DeepHermes-3, which the company described as the first "toggle-on reasoning model" — allowing users to activate extended thinking capabilities on demand. The company has cultivated a distinctive aesthetic and community, prompting some skepticism about whether style might overshadow substance. "Ofc i'm gonna believe an anime pfp company. stop benchmarkmaxxing ffs," wrote one critic on X, referring to Nous Research's anime-style branding and the industry practice of optimizing for benchmark performance. Others raised technical questions. "Based on the benchmark, Nemotron is better," noted one commenter, referring to Nvidia's family of language models. Another asked whether NousCoder-14B is "agentic focused or just 'one shot' coding" — a distinction that matters for practical software development, where iterating on feedback typically produces better results than single attempts. What researchers say must happen next for AI coding tools to keep improving The release includes several directions for future work that hint at where AI coding research may be heading. Multi-turn reinforcement learning tops the list. Currently, the model receives only a final binary reward — pass or fail — after generating a solution. But competitive programming problems typically include public test cases that provide intermediate feedback: compilation errors, incorrect outputs, time limit violations. Training models to incorporate this feedback across multiple attempts could significantly improve performance. Controlling response length also remains a challenge. The researchers found that incorrect solutions tended to be longer than correct ones, and response lengths quickly saturated available context windows during training — a pattern that various algorithmic modifications failed to resolve. Perhaps most ambitiously, Li proposed "problem generation and self-play" — training models to both solve and create programming problems. This would address the data scarcity problem directly by enabling models to generate their own training curricula. "Humans are great at generating interesting and useful problems for other competitive programmers, but it appears that there still exists a significant gap in LLM capabilities in creative problem generation," Li wrote. The model is available now on Hugging Face under an Apache 2.0 license. For researchers and developers who want to build on the work, Nous Research has published the complete Atropos training stack alongside it. What took Li two years of adolescent dedication to achieve—climbing from a 1600-level novice to a 2100-rated competitor on Codeforces—an AI replicated in 96 hours. He needed 1,000 problems. The model needed 24,000. But soon enough, these systems may learn to write their own problems, teach themselves, and leave human benchmarks behind entirely. The question is no longer whether machines can learn to code. It's whether they'll soon be better teachers than we ever were.
Amazon hopes to challenge Nvidia more directly by selling its AI chips
AWS is in talks to sell its chips to other data centers. CEO Andy Jassy has said this represents a $50 billion opportunity for the company.
How we used Gemini to build Google I/O 2026
Learn how Googlers used AI to produce Google I/O 2026.
A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry
OpenAI and Molecule.one show how a near-autonomous AI chemist using GPT-5.4 improved a key drug-making reaction, advancing medicinal chemistry research.
Amazon’s Movie Arm Abandons Film About OpenAI - The New York Times
Amazon’s Movie Arm Abandons Film About OpenAI The New York Times
Zapier vs. Make comparison: Which is best? [2026]
"I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes," said author Joanna Maciejewska in a viral post. It's a common anti-AI objection. Why are we automating away things that are delightful, enriching, and human, while keeping the drudgery for ourselves? Fortunately, with the advent of agents, we're starting to see AI use cases that really do knock out drudgery, like compliance review and help desk ma
Elon Musk Sees SpaceX Hitting $1 Trillion in Revenue by 2030. It Could Be Beaten to That Mark by This Artificial Intelligence (AI) Stock - Yahoo Finance
Elon Musk Sees SpaceX Hitting $1 Trillion in Revenue by 2030. It Could Be Beaten to That Mark by This Artificial Intelligence (AI) Stock Yahoo Finance