Prevent lock-in with AI model flexibility on Zapier
Every AI provider comes with models of varying strengths. I'm a Claude stan because it just gets my writing style, but I'll often reach for Sonnet over the higher-tier models because its results are more consistent for me. And for some tasks, Claude's lineup doesn't cut it at all—when I need to process data at scale, for example, I might reach for Gemini. When I need a versatile generalist for classification or routing, GPT might be my pick. Other people across my team and at Zapier have altoget
Fluid, natural voice translation with Gemini 3.5 Live Translate
Gemini 3.5 Live Translate brings near real-time, natural speech translation to Google AI Studio, Google Translate and Google Meet.
OpenAI and Broadcom unveil LLM-optimized inference chip
OpenAI and Broadcom introduce Jalapeño, a custom AI chip built for LLM inference to improve performance, efficiency, and scale across AI systems.
Builders Stage agenda revealed: Practical strategies for scaling startups at TechCrunch Disrupt 2026
The Builders Stage is returning to TechCrunch Disrupt 2026, bringing together 10,000+ founders, startup operators, and investors for practical conversations and Q&A on what it takes to build and scale successful companies. Register now to save up to $330.
Best Universities To Study AI in 2026
Artificial intelligence has made enormous strides in the past few years – with the introduction of a wide range of AI tools changing the landscape of how we assess data and operate within online spaces forever. This page ranks the 50 best universities to study AI around the world, based on scope, prestige, and the level of AI-related research each institution has released. Career prospects in AI There is a huge demand for individuals with a high degree of skills in artificial intelligence and machine learning, making AI a potentially lucrative career prospect with countless opportunities as AI continues to The post Best Universities To Study AI in 2026 appeared first on DailyAI.
9 demos of Gemini Omni and Gemini 3.5 in action
Watch 9 videos showing the capabilities of Gemini Omni and Gemini 3.5, announced at Google I/O 2026.
Catch up on 12 major I/O 2026 moments
Here are 12 of the biggest Google I/O 2026 keynote moments, including news about Gemini Omni, Gemini 3.5 Flash and more.
Introducing computer use in Gemini 3.5 Flash
A look at the built-in computer use tool in Gemini 3.5 Flash.
Anthropic says Trump administration lifted restrictions on some of its most powerful Claude AI models - CBS News
Anthropic says Trump administration lifted restrictions on some of its most powerful Claude AI models CBS News
5 ways to learn with study notebooks in the Gemini app
Study notebooks is a new space in the Gemini app that serves as an interactive learning tool tailored to any student's goals.
The latest AI news we announced in June 2026
Here are Google’s latest AI updates from June 2026.
5 ways Google parents are using Gemini
How Gemini helps with homework, meal planning and more, so parents have time to focus on the good stuff.
The Gemini app is bringing personalized image creation to more users.
Personal Intelligence makes the Gemini app feel tailored to you. With your permission, it pulls from Google tools like Gmail, Google Photos, YouTube and Search to provid…
The 8 best no-code app builders in 2026
Remember when you had to know how to code to build an app? We've moved on from that world: no-code tools are here to stay, and they're powerful enough to let you build almost anything you can think of without ever typing function(). I've been working with no-code apps for a while, and as a die-hard tinkerer, I have a serious soft spot for them. For this article, I researched and considered over 100 different platforms, exploring each one and then conducting extensive testing on the top contender
The first reply wins: Meet the builders turning Yelp leads into booked jobs
The 2026 Zappy Awards are open — and this month, we're partnering with Yelp to spotlight something specific: what Yelp advertisers are actually building with Zapier. Turns out, the integration story goes well beyond connecting a form to a spreadsheet. These builders are using Yelp leads as the trigger for systems that route, respond, follow up, and convert — automatically, across multiple locations, at any hour. With 70+ submissions in already, here are two that stood out. Caleb Whalen, Owner, C
Could your job be on this list? AI ranked the careers it thinks it can replace - Click2Houston
Could your job be on this list? AI ranked the careers it thinks it can replace Click2Houston
Our latest Google Finance upgrades, including a new app
The new Google Finance is coming out of beta and launching a new Android app.
Patch the Planet: a Daybreak initiative to support open source maintainers
OpenAI introduces Patch the Planet, a Daybreak initiative helping open-source maintainers find, validate, and fix vulnerabilities with AI and expert review.
The future is now: College of Computing & Artificial Intelligence officially launches - UW–Madison News
The future is now: College of Computing & Artificial Intelligence officially launches UW–Madison News
Murder Victim Speaks from the Grave in Courtroom Through AI
Chris Pelkey was shot and killed in a road rage incident. At his killer’s sentencing, he forgave the man via AI. In a historic first for Arizona, and possibly the U.S., artificial intelligence was used in court to let a murder victim deliver his own victim impact statement. What happened Pelkey, a 37-year-old Army veteran, was gunned down at a red light in 2021. This month, a realistic AI version of him appeared in court to address his killer, Gabriel Horcasitas. “In another life, we probably could’ve been friends,” said AI Pelkey in the video. “I believe in forgiveness, and The post Murder Victim Speaks from the Grave in Courtroom Through AI appeared first on DailyAI.
Venice AI becomes a unicorn with $65M Series A as its privacy-first AI platform takes off
Venice AI is already profitable, with annualized run-rate revenues of over $70 million, CEO Erik Voorhees said.
Salesforce rolls out new Slackbot AI agent as it battles Microsoft and Google in workplace AI
Salesforce on Tuesday launched an entirely rebuilt version of Slackbot, the company's workplace assistant, transforming it from a simple notification tool into what executives describe as a fully powered AI agent capable of searching enterprise data, drafting documents, and taking action on behalf of employees. The new Slackbot, now generally available to Business+ and Enterprise+ customers, is Salesforce's most aggressive move yet to position Slack at the center of the emerging "agentic AI" movement, where software agents work alongside humans to complete complex tasks. The launch comes as Salesforce attempts to convince investors that artificial intelligence will bolster its products rather than render them obsolete. "Slackbot isn't just another copilot or AI assistant," said Parker Harris, Salesforce co-founder and Slack's chief technology officer, in an exclusive interview with VentureBeat. "It's the front door to the agentic enterprise, powered by Salesforce." From tricycle to Porsche: Salesforce rebuilt Slackbot from the ground up Harris was blunt about what distinguishes the new Slackbot from its predecessor: "The old Slackbot was, you know, a little tricycle, and the new Slackbot is like, you know, a Porsche." The original Slackbot, which has existed since Slack's early days, performed basic algorithmic tasks: reminding users to add colleagues to documents, suggesting channel archives, and delivering simple notifications. The new version runs on an entirely different architecture built around a large language model and sophisticated search capabilities that can access Salesforce records, Google Drive files, calendar data, and years of Slack conversations. "It's two different things," Harris explained. "The old Slackbot was algorithmic and fairly simple. The new Slackbot is brand new: it's based around an LLM and a very robust search engine, and connections to third-party search engines, third-party enterprise data."
Salesforce chose to retain the Slackbot brand despite the fundamental technical overhaul. "People know what Slackbot is, and so we wanted to carry that forward," Harris said. Why Anthropic's Claude powers the new Slackbot — and which AI models could come next The new Slackbot runs on Claude, Anthropic's large language model, a choice driven partly by compliance requirements. Slack's commercial service operates under FedRAMP Moderate certification to serve U.S. federal government customers, and Harris said Anthropic was "the only provider that could give us a compliant LLM" when Slack began building the new system. But that exclusivity won't last. "We are, this year, going to support additional providers," Harris said. "We have a great relationship with Google. Gemini is incredible — performance is great, cost is great. So we're going to use Gemini for some things." He added that OpenAI remains a possibility as well. Harris echoed Salesforce CEO Marc Benioff's view that large language models are becoming commoditized: "You've heard Marc talk about LLMs are commodities, that they're democratized. I call them CPUs." On the sensitive question of training data, Harris was unequivocal: Salesforce does not train any models on customer data. "Models don't have any sort of security," he explained. "If we trained it on some confidential conversation that you and I have, I don't want Carolyn to know — if I train it into the LLM, there is no way for me to say you get to see the answer, but Carolyn doesn't." Inside Salesforce's internal experiment: 80,000 employees tested Slackbot with striking results Salesforce has been testing the new Slackbot internally for months, rolling it out to all 80,000 employees. According to Ryan Gavin, Slack's chief marketing officer, the results have been striking: "It's the fastest adopted product in Salesforce history." 
Internal data shows that two-thirds of Salesforce employees have tried the new Slackbot, with 80% of those users continuing to use it regularly. Internal satisfaction rates reached 96% — the highest for any AI feature Slack has shipped. Employees report saving between two and 20 hours per week. The adoption happened largely organically. "I think it was about five days, and a Canvas was developed by our employees called 'The Most Stealable Slackbot Prompts,'" Gavin said. "People just started adding to it organically. I think it's up to 250-plus prompts that are in this Canvas right now." Kate Crotty, a principal UX researcher at Salesforce, found that 73% of internal adoption was driven by social sharing rather than top-down mandates. "Everybody is there to help each other learn and communicate hacks," she said. How Slackbot transforms scattered enterprise data into executive-ready insights During a product demonstration, Amy Bauer, Slack's product experience designer, showed how Slackbot can synthesize information across multiple sources. In one example, she asked Slackbot to analyze customer feedback from a pilot program, upload an image of a usage dashboard, and have Slackbot correlate the qualitative and quantitative data. "This is where Slackbot really earns its keep for me," Bauer explained. "What it's doing is not just simply reading the image — it's actually looking at the image and comparing it to the insight it just generated for me." Slackbot can then query Salesforce to find enterprise accounts with open deals that might be good candidates for early access, creating what Bauer called "a really great justification and plan to move forward." Finally, it can synthesize all that information into a Canvas — Slack's collaborative document format — and find calendar availability among stakeholders to schedule a review meeting. "Up until this point, we have been working in a one-to-one capacity with Slackbot," Bauer said. 
"But one of the benefits that I can do now is take this insight and have it generate this into a Canvas, a shared workspace where I can iterate on it, refine it with Slackbot, or share it out with my team." Rob Seaman, Slack's chief product officer, said the Canvas creation demonstrates where the product is heading: "This is making a tool call internally to Slack Canvas to actually write, effectively, a shared document. But it signals where we're going with Slackbot — we're eventually going to be adding in additional third-party tool calls." MrBeast's company became a Slackbot guinea pig—and employees say they're saving 90 minutes a day Among Salesforce's pilot customers is Beast Industries, the parent company of YouTube star MrBeast. Luis Madrigal, the company's chief information officer, joined the launch announcement to describe his experience. "As somebody who has rolled out enterprise technologies for over two decades now, this was practically one of the easiest," Madrigal said. "The plumbing is there. Slack as an implementation, Enterprise Tools — being able to turn on the Slackbot and the Slack AI functionality was as simple as having my team go in, review, do a quick security review." Madrigal said his security team signed off "rather quickly" — unusual for enterprise AI deployments — because Slackbot accesses only the information each individual user already has permission to view. "Given all the guardrails you guys have put into place for Slackbot to be unique and customized to only the information that each individual user has, only the conversations and the Slack rooms and Slack channels that they're part of—that made my security team sign off rather quickly." One Beast Industries employee, Sinan, the head of Beast Games marketing, reported saving "at bare minimum, 90 minutes a day." Another employee, Spencer, a creative supervisor, described it as "an assistant who's paying attention when I'm not." 
Other pilot customers include Slalom, reMarkable, Xero, Mercari, and Engine. Mollie Bodensteiner, SVP of Operations at Engine, called Slackbot "an absolute 'chaos tamer' for our team," estimating it saves her about 30 minutes daily "just by eliminating context switching." Slackbot vs. Microsoft Copilot vs. Google Gemini: The fight for enterprise AI dominance The launch puts Salesforce in direct competition with Microsoft's Copilot, which is integrated into Teams and the broader Microsoft 365 suite, as well as Google's Gemini integrations across Workspace. When asked what distinguishes Slackbot from these alternatives, Seaman pointed to context and convenience. "The thing that makes it most powerful for our customers and users is the proximity — it's just right there in your Slack," Seaman said. "There's a tremendous convenience affordance that's naturally built into it." The deeper advantage, executives argue, is that Slackbot already understands users' work without requiring setup or training. "Most AI tools sound the same no matter who is using them," the company's announcement stated. "They lack context, miss nuance, and force you to jump between tools to get anything done." Harris put it more directly: "If you've ever had that magic experience with AI — I think ChatGPT is a great example, it's a great experience from a consumer perspective — Slackbot is really what we're doing in the enterprise, to be this employee super agent that is loved, just like people love using Slack." Amy Bauer emphasized the frictionless nature of the experience. "Slackbot is inherently grounded in the context, in the data that you have in Slack," she said. "So as you continue working in Slack, Slackbot gets better because it's grounded in the work that you're doing there. There is no setup. There is no configuration for those end users." 
Salesforce's ambitious plan to make Slackbot the one 'super agent' that controls all the others Salesforce positions Slackbot as what Harris calls a "super agent" — a central hub that can eventually coordinate with other AI agents across an organization. "Every corporation is going to have an employee super agent," Harris said. "Slackbot is essentially taking the magic of what Slack does. We think that Slackbot, and we're really excited about it, is going to be that." The vision extends to third-party agents already launching in Slack. Last month, Anthropic released a preview of Claude Code for Slack, allowing developers to interact with Claude's coding capabilities directly in chat threads. OpenAI, Google, Vercel, and others have also built agents for the platform. "Most of the net-new apps that are being deployed to Slack are agents," Seaman noted during the press conference. "This is proof of the promise of humans and agents coexisting and working together in Slack to solve problems." Harris described a future where Slackbot becomes an MCP (Model Context Protocol) client, able to leverage tools from across the software ecosystem — similar to how the developer tool Cursor works. "Slack can be an MCP client, and Slackbot will be the hub of that, leveraging all these tools out in the world, some of which will be these amazing agents," he said. But Harris also cautioned against over-promising on multi-agent coordination. "I still think we're in the single agent world," he said. "FY26 is going to be the year where we started to see more coordination. But we're going to do it with customer success in mind, and not demonstrate and talk about, like, 'I've got 1,000 agents working together,' because I think that's unrealistic." Slackbot costs nothing extra, but Salesforce's data access fees could squeeze some customers Slackbot is included at no additional cost for customers on Business+ and Enterprise+ plans. 
"There's no additional fees customers have to do," Gavin confirmed. "If they're on one of those plans, they're going to get Slackbot." However, some enterprise customers may face other cost pressures related to Salesforce's broader data strategy. CIOs may see price increases for third-party applications that work with Salesforce data, as effects of higher charges for API access ripple through the software supply chain. Fivetran CEO George Fraser has warned that Salesforce's shift in pricing policy for API access could have tangible consequences for enterprises relying on Salesforce as a system of record. "They might not be able to use Fivetran to replicate their data to Snowflake and instead have to use Salesforce Data Cloud. Or they might find that they are not able to interact with their data via ChatGPT, and instead have to use Agentforce," Fraser said in a recent CIO report. Salesforce has framed the pricing change as standard industry practice. What Slackbot can do today, what's coming in weeks, and what's still on the roadmap The new Slackbot begins rolling out today and will reach all eligible customers by the end of February. Mobile availability will complete by March 3, Bauer confirmed during her interview with VentureBeat. Some capabilities remain works in progress. Calendar reading and availability checking are available at launch, but the ability to actually book meetings is "coming a few weeks after," according to Seaman. Image generation is not currently supported, though Bauer said it's "something that we are looking at in the future." When asked about integration with competing CRM systems like HubSpot and Microsoft Dynamics, Salesforce representatives declined to provide specifics during the interview, though they acknowledged the question touched on key competitive differentiators. 
Salesforce is betting the future of work looks like a chat window—and it's not alone The Slackbot launch is Salesforce's bet that the future of enterprise work is conversational — that employees will increasingly prefer to interact with AI through natural language rather than navigating traditional software interfaces. Harris described Slack's product philosophy using principles like "don't make me think" and "be a great host." The goal, he said, is for Slackbot to surface information proactively rather than requiring users to hunt for it. "One of the revelations for me is LLMs applied to unstructured information are incredible," Harris said. "And the amount of value you have if you're a Slack user, if your corporation uses Slack — the amount of value in Slack is unbelievable. Because you're talking about work, you're sharing documents, you're making decisions, but you can't as a human go through that and really get the same value that an LLM can do." Looking ahead, Harris expects the interfaces themselves to evolve beyond pure conversation. "We're kind of saturating what we can do with purely conversational UIs," he said. "I think we'll start to see agents building an interface that best suits your intent, as opposed to trying to surface something within a conversational interface that matches your intent." Microsoft, Google, and a growing roster of AI startups are placing similar bets — that the winning enterprise AI will be the one embedded in the tools workers already use, not another application to learn. The race to become that invisible layer of workplace intelligence is now fully underway. For Salesforce, the stakes extend beyond a single product launch. After a bruising year on Wall Street and persistent questions about whether AI threatens its core business, the company is wagering that Slackbot can prove the opposite — that the tens of millions of people already chatting in Slack every day is not a vulnerability, but an unassailable advantage. 
Haley Gault, the Salesforce account executive in Pittsburgh who stumbled upon the new Slackbot on a snowy morning, captured the shift in a single sentence: "I honestly can't imagine working for another company not having access to these types of tools. This is just how I work now." That's precisely what Salesforce is counting on.
Unchecked AI progress may pose catastrophic risks, UN panel warns - Reuters
Unchecked AI progress may pose catastrophic risks, UN panel warns Reuters
The DeepMind trio who built a poker AI are now making money for quant hedge funds
EquiLibre Technologies, a Prague-based AI lab founded by three ex-DeepMind researchers, is now valued at more than $500 million.
Which AI models can you automate on Zapier? (Sonnet 5, Gemini 3.5 Flash, and more)
New AI models launch practically every week, and keeping up with which ones to use for specific workflows is a job in itself. Consider this article your living reference. At Zapier, we run every model through AutomationBench. It's our benchmark for testing how well models carry out multi-step workflows, not just static prompts. Below, I'll walk through every major AI provider available on Zapier, the models you can plug into your Zap workflows today, and what each one is best for based on Zapier
Core dump epidemiology: fixing an 18-year-old bug
OpenAI engineers used large-scale core dump analysis to debug rare infrastructure crashes, uncovering both a hardware fault and a long-standing software bug.
Helping build shared standards for advanced AI
OpenAI helps build shared standards for advanced AI, supporting evaluation frameworks, safety practices, and global cooperation through the Appia Foundation.
New York City educators and industry leaders gathered at Google’s offices to shape the future of AI in classrooms.
Google, the New York Jobs CEO Council and Urban Assembly hosted an AI summit for 150 education and industry leaders.
The perils of tokenmaxxing: How to govern AI spend without sacrificing speed
I write about tech and AI for a living, but nothing has made me yearn for the Butlerian Jihad more than learning (against my will) about the term "tokenmaxxing." And if I have to know what that means, you do, too. In early 2026, companies started publishing internal leaderboards ranking employees by how many AI tokens they consumed. At Meta, the board was called "Claudeonomics," and it handed out digital badges with extremely not-dorky titles like "Cache Wizard" and "Model Connoisseur." The high
Google introduces a faster, cheaper image generator with Nano Banana 2 Lite
Google is updating its image generator to make it faster and cheaper, making it a more useful tool for creators looking to make AI content.
The ‘Father of the Internet’ is finally retiring
Vinton Cerf, one of the creators of the protocols underlying the internet, will step down as Google's chief internet evangelist next week.
We’re Only Starting to Grasp the Pitfalls of Using A.I. at Work - The New York Times
We’re Only Starting to Grasp the Pitfalls of Using A.I. at Work The New York Times
Therapists Too Expensive? Why Thousands of Women Are Spilling Their Deepest Secrets to ChatGPT
More women are turning to ChatGPT for emotional support, using the AI chatbot as a stand-in therapist as mental health systems buckle under pressure. With long wait times and soaring costs, AI is filling a growing gap. Mental health care is harder to access than ever. In the UK, NHS data shows patients are eight times more likely to wait over 18 months for mental health treatment than for physical health. Private therapy isn’t always an option either, with sessions costing £60 or more. In that vacuum, ChatGPT has become a surprising outlet. Real voices, real feelings Charly, 29, from The post Therapists Too Expensive? Why Thousands of Women Are Spilling Their Deepest Secrets to ChatGPT appeared first on DailyAI.
Anthropic launches AI drug discovery program, joining tech giants in betting on healthcare - CNBC
Anthropic launches AI drug discovery program, joining tech giants in betting on healthcare CNBC
5 ways Google Search can level up your thrift and vintage shopping
Uncover second-hand scores with AI tools in Google Search and Shopping.
Employers who laid off workers citing AI are already starting to regret it - CNBC
Employers who laid off workers citing AI are already starting to regret it CNBC
June Pixel Drop: New features for creators, Gemini upgrades and more
Get a new screen recording feature, text-to-video tools with Gemini Omni, and better multitasking on your Pixel devices.
Gemini can now take notes in Google Meet for Google AI Pro and Ultra subscribers.
Google Meet's "Take notes for me" feature is available to Google AI Pro and Ultra subscribers in select languages.
China Unveils World’s First AI Hospital: 14 Virtual Doctors Ready to Treat Thousands Daily
China has unveiled the world’s first fully AI-powered hospital, marking a radical shift in the future of healthcare. Developed by Tsinghua University in Beijing, the “Agent Hospital” features 14 AI doctors and 4 AI nurses that can diagnose, treat, and manage up to 3,000 patients per day, without any human staff. Faster, smarter care: What would take human doctors 3 years, the AI doctors can do in 1 day. High IQ bots: These AI agents scored a 93.06% pass rate on the US Medical Licensing Exam. Training without risk: The virtual hospital allows medical students to practice in a fully The post China Unveils World’s First AI Hospital: 14 Virtual Doctors Ready to Treat Thousands Daily appeared first on DailyAI.
Cloudflare’s new policy pushes AI companies to pay for publishers’ content
Cloudflare is giving AI companies until September 15 to separate web crawlers used for search from those used for AI training and agents, or risk being blocked by default on many publisher sites.
Nous Research's NousCoder-14B is an open-source coding model landing right in the Claude Code moment
Nous Research, the open-source artificial intelligence startup backed by crypto venture firm Paradigm, released a new competitive programming model on Monday that it says matches or exceeds several larger proprietary systems, trained in just four days using 48 of Nvidia's latest B200 graphics processors. The model, called NousCoder-14B, is another entry in a crowded field of AI coding assistants, but arrives at a particularly charged moment: Claude Code, the agentic programming tool from rival Anthropic, has dominated social media discussion since New Year's Day, with developers posting breathless testimonials about its capabilities. The simultaneous developments underscore how quickly AI-assisted software development is evolving, and how fiercely companies large and small are competing to capture what many believe will become a foundational technology for how software gets written. NousCoder-14B achieves a 67.87 percent accuracy rate on LiveCodeBench v6, a standardized evaluation that tests models on competitive programming problems published between August 2024 and May 2025. That figure represents a 7.08 percentage point improvement over the base model it was trained from, Alibaba's Qwen3-14B, according to Nous Research's technical report published alongside the release. "I gave Claude Code a description of the problem, it generated what we built last year in an hour," wrote Jaana Dogan, a principal engineer at Google responsible for the Gemini API, in a viral post on X last week that captured the prevailing mood around AI coding tools. Dogan was describing a distributed agent orchestration system her team had spent a year developing, a system Claude Code approximated from a three-paragraph prompt.
The juxtaposition is instructive: while Anthropic's Claude Code has captured imaginations with demonstrations of end-to-end software development, Nous Research is betting that open-source alternatives trained on verifiable problems can close the gap, and that transparency in how these models are built matters as much as raw capability. How Nous Research built an AI coding model that anyone can replicate What distinguishes the NousCoder-14B release from many competitor announcements is its radical openness. Nous Research published not just the model weights but the complete reinforcement learning environment, benchmark suite, and training harness, built on the company's Atropos framework, enabling any researcher with sufficient compute to reproduce or extend the work. "Open-sourcing the Atropos stack provides the necessary infrastructure for reproducible olympiad-level reasoning research," noted one observer on X, summarizing the significance for the academic and open-source communities. The model was trained by Joe Li, a researcher in residence at Nous Research and a former competitive programmer himself. Li's technical report reveals an unexpectedly personal dimension: he compared the model's improvement trajectory to his own journey on Codeforces, the competitive programming platform where participants earn ratings based on contest performance. Based on rough estimates mapping LiveCodeBench scores to Codeforces ratings, Li calculated that NousCoder-14B's improvement, from approximately the 1600-1750 rating range to 2100-2200, mirrors a leap that took him nearly two years of sustained practice between ages 14 and 16. The model accomplished the equivalent in four days. "Watching that final training run unfold was quite a surreal experience," Li wrote in the technical report. But Li was quick to note an important caveat that speaks to broader questions about AI efficiency: he solved roughly 1,000 problems during those two years, while the model required 24,000.
Humans, at least for now, remain dramatically more sample-efficient learners. Inside the reinforcement learning system that trains on 24,000 competitive programming problems NousCoder-14B's training process offers a window into the increasingly sophisticated techniques researchers use to improve AI reasoning capabilities through reinforcement learning. The approach relies on what researchers call "verifiable rewards", a system where the model generates code solutions, those solutions are executed against test cases, and the model receives a simple binary signal: correct or incorrect. This feedback loop, while conceptually straightforward, requires significant infrastructure to execute at scale. Nous Research used Modal, a cloud computing platform, to run sandboxed code execution in parallel. Each of the 24,000 training problems contains hundreds of test cases on average, and the system must verify that generated code produces correct outputs within time and memory constraints (15 seconds and 4 gigabytes, respectively). The training employed a technique called DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization), which the researchers found performed slightly better than alternatives in their experiments. A key innovation involves "dynamic sampling": discarding training examples where the model either solves all attempts or fails all attempts, since these provide no useful gradient signal for learning. The researchers also adopted "iterative context extension," first training the model with a 32,000-token context window before expanding to 40,000 tokens. During evaluation, extending the context further to approximately 80,000 tokens produced the best results, with accuracy reaching 67.87 percent. Perhaps most significantly, the training pipeline overlaps inference and verification: as soon as the model generates a solution, it begins work on the next problem while the previous solution is being checked.
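To make the verifiable-reward and dynamic-sampling ideas described above concrete, here is a minimal Python sketch. It is not Nous Research's code: the function names (binary_reward, keep_for_training, group_advantages) and the toy test-case format are hypothetical, candidates are plain Python callables rather than generated programs, and the sandboxing and the time and memory limits are left out. The advantage calculation is simplified to "reward minus group mean", which is enough to show why a group where every attempt passes or every attempt fails contributes no learning signal.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy format: each test case is (input, expected_output).
TestCase = Tuple[int, int]


def binary_reward(solution: Callable[[int], int], tests: Sequence[TestCase]) -> int:
    """Return 1 only if the candidate passes every test case, otherwise 0.

    A real pipeline would run untrusted code in a sandbox with time and
    memory limits; this sketch only calls an in-process function.
    """
    for given, expected in tests:
        try:
            if solution(given) != expected:
                return 0
        except Exception:
            # A crash counts as an incorrect solution.
            return 0
    return 1


def group_advantages(rewards: Sequence[int]) -> List[float]:
    """Simplified group-relative advantage: reward minus the group mean."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


def keep_for_training(rewards: Sequence[int]) -> bool:
    """Dynamic sampling filter: keep a problem only if attempts disagree.

    If all attempts pass or all fail, every advantage is zero, so the
    group is discarded.
    """
    return 0 < sum(rewards) < len(rewards)


if __name__ == "__main__":
    tests = [(1, 2), (3, 6)]  # toy task: double the input
    attempts = [lambda x: x * 2, lambda x: x + 1, lambda x: 1 // 0]
    rewards = [binary_reward(a, tests) for a in attempts]
    print("rewards:", rewards)
    print("advantages:", group_advantages(rewards))
    print("keep group:", keep_for_training(rewards))
```

In this toy run, one attempt is correct, one passes only the first test, and one crashes, so the group has mixed rewards and would be kept. A group whose rewards were all 1 or all 0 would be filtered out.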
This pipelining, combined with asynchronous training where multiple model instances work in parallel, maximizes hardware utilization on expensive GPU clusters.
The looming data shortage that could slow AI coding model progress
Buried in Li's technical report is a finding with significant implications for the future of AI development: the training dataset for NousCoder-14B encompasses "a significant portion of all readily available, verifiable competitive programming problems in a standardized dataset format." In other words, for this particular domain, the researchers are approaching the limits of high-quality training data.
"The total number of competitive programming problems on the Internet is roughly the same order of magnitude," Li wrote, referring to the 24,000 problems used for training. "This suggests that within the competitive programming domain, we have approached the limits of high-quality data."
This observation echoes growing concern across the AI industry about data constraints. While compute continues to scale according to well-understood economic and engineering principles, training data is "increasingly finite," as Li put it.
"It appears that some of the most important research that needs to be done in the future will be in the areas of synthetic data generation and data efficient algorithms and architectures," he concluded.
The challenge is particularly acute for competitive programming because the domain requires problems with known correct solutions that can be verified automatically. Unlike natural language tasks where human evaluation or proxy metrics suffice, code either works or it doesn't, which makes synthetic data generation considerably more difficult.
Li identified one potential avenue: training models not just to solve problems but to generate solvable problems, enabling a form of self-play similar to techniques that proved successful in game-playing AI systems.
"Once synthetic problem generation is solved, self-play becomes a very interesting direction," he wrote.
A $65 million bet that open-source AI can compete with Big Tech
Nous Research has carved out a distinctive position in the AI landscape: a company committed to open-source releases that compete with, and sometimes exceed, proprietary alternatives.
The company raised $50 million in April 2025 in a round led by Paradigm, the cryptocurrency-focused venture firm founded by Coinbase co-founder Fred Ehrsam. Total funding reached $65 million, according to some reports. The investment reflected growing interest in decentralized approaches to AI training, an area where Nous Research has developed its Psyche platform.
Previous releases include Hermes 4, a family of models that we reported "outperform ChatGPT without content restrictions," and DeepHermes-3, which the company described as the first "toggle-on reasoning model," allowing users to activate extended thinking capabilities on demand.
The company has cultivated a distinctive aesthetic and community, prompting some skepticism about whether style might overshadow substance. "Ofc i'm gonna believe an anime pfp company. stop benchmarkmaxxing ffs," wrote one critic on X, referring to Nous Research's anime-style branding and the industry practice of optimizing for benchmark performance.
Others raised technical questions. "Based on the benchmark, Nemotron is better," noted one commenter, referring to Nvidia's family of language models. Another asked whether NousCoder-14B is "agentic focused or just 'one shot' coding," a distinction that matters for practical software development, where iterating on feedback typically produces better results than single attempts.
What researchers say must happen next for AI coding tools to keep improving
The release includes several directions for future work that hint at where AI coding research may be heading. Multi-turn reinforcement learning tops the list.
Currently, the model receives only a final binary reward (pass or fail) after generating a solution. But competitive programming problems typically include public test cases that provide intermediate feedback: compilation errors, incorrect outputs, time limit violations. Training models to incorporate this feedback across multiple attempts could significantly improve performance.
Controlling response length also remains a challenge. The researchers found that incorrect solutions tended to be longer than correct ones, and response lengths quickly saturated available context windows during training, a pattern that various algorithmic modifications failed to resolve.
Perhaps most ambitiously, Li proposed "problem generation and self-play": training models to both solve and create programming problems. This would address the data scarcity problem directly by enabling models to generate their own training curricula.
"Humans are great at generating interesting and useful problems for other competitive programmers, but it appears that there still exists a significant gap in LLM capabilities in creative problem generation," Li wrote.
The model is available now on Hugging Face under an Apache 2.0 license. For researchers and developers who want to build on the work, Nous Research has published the complete Atropos training stack alongside it.
What took Li two years of adolescent dedication to achieve (climbing from a 1600-level novice to a 2100-rated competitor on Codeforces) an AI replicated in 96 hours. He needed 1,000 problems. The model needed 24,000. But soon enough, these systems may learn to write their own problems, teach themselves, and leave human benchmarks behind entirely.
The question is no longer whether machines can learn to code. It's whether they'll soon be better teachers than we ever were.
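For readers who want to see the training signal in concrete terms, the Python sketch below pairs the binary "verifiable reward" with the dynamic sampling filter described in the article. It is a toy under stated assumptions, not Nous Research's code: `TestCase`, `binary_reward`, and `keep_for_training` are hypothetical names, and candidate solutions are plain Python functions called directly rather than programs run in a sandbox with time and memory limits.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class TestCase:
    """One input/expected-output pair for a problem (hypothetical structure)."""
    args: tuple
    expected: Any


def binary_reward(solution: Callable[..., Any], tests: Sequence[TestCase]) -> int:
    """Return 1 only if the solution passes every test case, otherwise 0.

    Assumption: a crash counts as a failure. Time and memory limits are
    not enforced in this sketch.
    """
    for test in tests:
        try:
            if solution(*test.args) != test.expected:
                return 0
        except Exception:
            return 0
    return 1


def keep_for_training(rewards: Sequence[int]) -> bool:
    """Dynamic sampling filter: keep a problem only when its attempts
    have mixed outcomes (not all passed and not all failed)."""
    return 0 < sum(rewards) < len(rewards)


# Toy problem: "add two integers", with three sampled attempts.
tests = [TestCase((1, 2), 3), TestCase((5, 5), 10)]
attempts = [lambda a, b: a + b, lambda a, b: a - b, lambda a, b: a * b]

rewards = [binary_reward(attempt, tests) for attempt in attempts]
print(rewards)                     # [1, 0, 0]
print(keep_for_training(rewards))  # True
```

A problem whose attempts all score 1 or all score 0 is dropped here because identical rewards give the learner nothing to distinguish one attempt from another, which is the "no useful gradient signal" case the researchers describe.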
Artificial intelligence could usher in a new era of vaccine development - CIDRAP
New research shows how AMIE, our medical AI, could help manage health conditions.
Research in “Nature” shows our conversational AI system matches primary care physicians in complex disease management.
Here's how Gemini can help you avoid jetlag.
If you’ve got a faraway trip coming up, the Gemini app can help you avoid jetlag so you can make the most of your visit. Once you’ve given Gemini permission to access you…
How agents are transforming work
A new OpenAI research paper shows how AI agents are transforming work, enabling longer, more complex tasks and expanding productivity across roles.
The 4 best AI search engines in 2026
It often feels like Google search has gotten worse. While the issue is complicated, online search has never felt like such a chore. Not only do you have to hop through numerous links to pinpoint what you're searching for, but you also have to navigate a maze of ads, spam, and pop-ups. Even then, how often do you find the answers you need? AI search engines claim they're the solution, so let's see. The new breed of AI search engines combines the tech behind AI chatbots like ChatGPT with traditio
Trump drops restrictions on Anthropic’s Mythos and Fable models
The Trump administration's erratic approach to AI policymaking has left companies across the industry with little clarity about what will govern future model releases.
How GPT-5 helped immunologist Derya Unutmaz solve a 3-year-old mystery
GPT-5 Pro helped solve a 3-year-old immunology mystery, offering insights into T cell behavior. The breakthrough could support cancer and autoimmune research.
What is Claude Mythos? And what happened to Claude Fable 5?
On April 7, 2026, Claude Mythos Preview was officially announced, but it was apparently too dangerous to release. According to Anthropic, Claude Mythos represented a unique cybersecurity threat (they claimed that "the fallout—for economies, public safety, and national security—could be severe.") Instead of releasing Mythos to the general public, they spun up Project Glasswing, a cybersecurity initiative that also involved some big-name companies. The idea was that they'd be able to deploy Mythos
10 top women in AI in 2026
AI is changing our world, but the stories of who builds it often get lost in the noise. Behind the headlines and hype, a group of women are solving AI’s fundamental challenges – despite working in an industry persistently impacted by gender inequality. Women make up just 22% of AI professionals worldwide and only 12% of AI researchers. In academic publishing, female researchers account for just 29% of first authors on AI papers, a number that hasn’t increased since the mid-2000s. This is a story about ten leaders who have influenced AI despite the odds being stacked against them. Their The post 10 top women in AI in 2026 appeared first on DailyAI.