Claude Sonnet 4.5 is Anthropic’s safest AI model yet

In May, Anthropic announced two new AI systems, Opus 4 and Sonnet 4. Now, less than six months later, the company is introducing Sonnet 4.5, and calling it the best coding model in the world to date. Anthropic’s basis for that claim is a selection of benchmarks where the new AI outperforms not only its predecessor but also the more expensive Opus 4.1 and competing systems, including Google’s Gemini 2.5 Pro and GPT-5 from OpenAI. For instance, in OSWorld, a suite that tests AI models on real-world computer tasks, Sonnet 4.5 set a record score of 61.4 percent, putting it 17 percentage points above Opus 4.1. 

At the same time, the new model is capable of autonomously working on multi-step projects for more than 30 hours, a significant improvement from the seven or so hours Opus 4 could maintain at launch. That’s an important milestone for the type of agentic systems Anthropic wants to build. 

Sonnet 4.5 outperforms Anthropic’s older models in coding and agentic tasks.
Anthropic

Perhaps more importantly, the company claims Sonnet 4.5 is its safest AI system to date, with the model having undergone “extensive” safety training. That training translates to a chatbot Anthropic says is “substantially” less prone to “sycophancy, deception, power-seeking and the tendency to encourage delusional thinking” — all potential model traits that have landed OpenAI in hot water in recent months. At the same time, Anthropic has strengthened Sonnet 4.5’s protections against prompt injection attacks. Due to the sophistication of the new model, Anthropic is releasing Sonnet 4.5 under its AI Safety Level 3 framework, meaning it comes with filters designed to prevent potentially dangerous outputs related to prompts around chemical, biological and nuclear weapons.  

A chart showing how Sonnet 4.5 compares against other frontier models in safety testing.
Anthropic

With today’s announcement, Anthropic is also rolling out quality-of-life improvements across the Claude product stack. To start, Claude Code, the company’s popular coding agent, has a refreshed terminal interface and a new feature called checkpoints. As you can probably guess from the name, checkpoints let you save your progress and roll back to a previous state if Claude writes some funky code that isn’t working the way you imagined it would. File creation, which Anthropic began rolling out at the start of the month, is now available to all Pro users, and if you joined the waitlist for Claude for Chrome, you can start using the extension today.

API pricing for Sonnet 4.5 remains at $3 per one million input tokens and $15 per one million output tokens. The release of Sonnet 4.5 caps off a strong September for Anthropic. Last week, just one day after Microsoft added Claude models to Microsoft 365 Copilot, OpenAI admitted its rival offers the best AI for work-related tasks.
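
To put those rates in concrete terms, here is a rough sketch of how a bill might be estimated. The two rates come from the pricing above; the function name and the sample token counts are hypothetical, and the sketch ignores any discounts or surcharges that may apply in practice.

```python
# Rates quoted in the article, in US dollars per one million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload at the per-million-token rates above.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # prints $1.35
```

Output tokens cost five times as much as input tokens at these rates, so in this example the 50,000 output tokens account for more of the total than the 200,000 input tokens.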

This article originally appeared on Engadget at https://www.engadget.com/claude-sonnet-45-is-anthropics-safest-ai-model-yet-170000161.html?src=rss

“Zwift’s Big Weekend” Events Announced for October 3-6

Today, Zwift posted a set of rides for the coming weekend (October 3-6) as part of an event dubbed “Zwift’s Big Weekend.” It’s Zwift’s birthday celebration – 11 years old this fall – and it’s “Our celebration of everything Zwift—pure energy, on and off the bike.”

There are actually two pieces to this weekend. The first is a set of fun banded rides led by pros and ambassadors, which I’ve detailed below. The second piece… well, that’s a secret we can’t talk about until we get closer to the weekend.

Ride Schedule and Guest Leaders

All events are 45 minutes long and rubberbanded, so as long as you keep pedaling, you’ll stay with the group!

  • October 3, 8am UTC/4am ET/1am PT
  • October 3, 10am UTC/6am ET/3am PT
  • October 4, 12am UTC/8pm ET/5pm PT
    • Rider Leaders Paula Findlay & Eric Lagerstrom
    • Knickerbocker, 45 mins, rubberbanded
  • October 4, 6am UTC/2am ET/11pm PT
    • Rider Leader Bert Van Lerberghe 
    • London 8, 45 mins, rubberbanded
  • October 5, 5pm UTC/1pm ET/10am PT
    • Rider Leader Sam Laidlow
    • Douce France, 45 mins, rubberbanded
  • October 5, 6pm UTC/2pm ET/11am PT
    • Rider Leaders Demi Vollering, Evita Muzic & Juliette Labous
    • Greatest London Flat, 45 mins, rubberbanded 
  • October 6, 9am UTC/5am ET/2am PT
  • October 6, 1pm UTC/9am ET/6am PT
    • Rider Leaders Kasia Niewiadoma & Neve Bradbury
    • Tempus Fugit, 45 mins, rubberbanded
  • October 6, 11pm UTC/7pm ET/4pm PT
    • Rider Leader Lionel Sanders 
    • Turf N Surf, 45 mins, rubberbanded

Sign up at zwift.com/events/tag/zwiftsbigweekend
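
Several of the start times above land on a different calendar day depending on your time zone, so it can help to convert them yourself. Here’s a minimal sketch of that conversion. It assumes the events are in 2025 and that US daylight time is in effect (ET = UTC-4, PT = UTC-7); the function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed early-October offsets (US daylight time in effect).
ZONES = {
    "ET": timezone(timedelta(hours=-4)),
    "PT": timezone(timedelta(hours=-7)),
}


def local_starts(start_utc: datetime) -> dict[str, str]:
    """Convert a UTC start time to each zone in ZONES (hypothetical helper)."""
    return {
        name: start_utc.astimezone(tz).strftime("%Y-%m-%d %H:%M")
        for name, tz in ZONES.items()
    }


# The ride listed as October 4, 12am UTC (year 2025 is an assumption).
ride = datetime(2025, 10, 4, 0, 0, tzinfo=timezone.utc)
print(local_starts(ride))
# {'ET': '2025-10-03 20:00', 'PT': '2025-10-03 17:00'}
```

In other words, the October 4, 12am UTC ride actually happens on the evening of October 3 for riders in US time zones.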

This Roomba robot vacuum is on sale for only $150 ahead of Prime Day

The iRobot Roomba 104 robot vacuum is on sale for $150 just ahead of October’s Prime Day. That’s a nice little discount of 40 percent, which represents a savings of $100.
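
For anyone who wants to check the math, the list price implied by those figures can be worked backward from the sale price and the percentage. This is just a quick sketch; the function name is hypothetical, and the only inputs are the $150 price and 40 percent discount quoted above.

```python
def list_price(sale_price: float, discount_pct: float) -> float:
    """Work backward from a sale price and percent discount to the list price."""
    return sale_price / (1 - discount_pct / 100)


original = list_price(150, 40)  # figures from the deal above
savings = original - 150
print(round(original, 2), round(savings, 2))  # prints 250.0 100.0
```

That lines up with the stated savings: 40 percent off an implied list price of $250 is $100.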

This is a newer version of the unit that topped our list of the best budget robot vacuums. It’s an entry-level robovac that gets the job done. The cleaning motor is fairly powerful and it ships with a multi-surface brush and an edge-sweeping brush. The vacuum uses LiDAR to map a home and to help it avoid obstacles when cleaning.

It’s also been equipped with specialized sensors to prevent falling down stairs. Steps are the natural enemy of all robot vacuums, except maybe this one. The Roomba 104 integrates with the company’s proprietary app, which allows for custom cleaning schedules and the like. The robot can also be controlled via voice assistant and boasts compatibility with Siri, Alexa and Google Assistant.

The vacuum will automatically head to the charger for some juice when running low, which is nice. The battery lasts around 200 minutes per charge, which is a decent enough metric for a budget-friendly robovac. The only downside here? This is just a vacuum. It doesn’t mop and it doesn’t come with a dedicated debris canister.

This article originally appeared on Engadget at https://www.engadget.com/deals/this-roomba-robot-vacuum-is-on-sale-for-only-150-ahead-of-prime-day-164953406.html?src=rss

A PlayStation photography book featuring never-before-seen design concepts is on the way

Sony has been marking the 30th anniversary of PlayStation by selling you stuff, like PS5 consoles and accessories styled after the PS1. The company has something else lined up to mark the occasion: a photography book showcasing “never-before-seen prototypes, concept sketches and design models that shaped hardware development” from the early days through to the current PS5 era.

PlayStation: The First 30 Years is a 400-page hardback book printed on heavyweight matt art stock. You’d better hope your coffee table is sturdy, since this book is a chonkster. It weighs in at 5kg (11lbs).

Sony worked with publisher Read-Only Memory on PlayStation: The First 30 Years. The collaborators have offered a peek at what’s inside the book. It’s shaping up to be a fascinating glimpse at some of the designs Sony tried for its hardware over the last few decades. 

For instance, the original PlayStation could have looked much different, more directly exemplifying designer Teiyu Goto’s “vision of simple squares and circles coming to life.” (For what it’s worth, Engadget deputy editor Nathan Ingraham said this design looked like a proton pack):

An early concept for the original PlayStation, featuring a more compact look based on squares and circles.
Sony/Read-Only Memory

Some of the controller concepts are pretty out there too. Some don’t look all that comfortable to hold or use for extended gaming sessions. This one — which appears to be for the PS3 at the earliest, given the inclusion of the PS button — is truly bonkers. Thank goodness Goto landed on the SNES-style design with grips that has proven so successful (and comfortable) over the years:

A PlayStation controller concept.
Sony/Read-Only Memory

The book isn’t entirely limited to hardware concepts, as it features photos of Sony’s design labs. The tome seems like a very cool item for PlayStation fans and those who love gaming history to have, but there might be an element of sticker shock. 

The book is available via Read-Only Memory’s website for $182. A deluxe edition with exposed binding, a foil-stamped clamshell presentation box and a photographic print signed by Goto and photographer Benedict Redgrove will run you $467. The fancier edition has a limited run of 1,994 copies. Coincidentally, 1994 is the year the PS1 debuted in Japan. Both editions of the book will ship in spring 2026.

Meanwhile, Sony has teamed up with Reebok for a collection of 30th anniversary sneakers styled after — you guessed it — the PS1. The kicks will be available in October and the three designs are linked to the PS1’s launch regions. They include the InstaPump Fury 94 for Japan, Pump Omni Zone II for the US and Workout Plus for the UK. 

An assortment of Reebok sneakers based on the design of the original PlayStation.
Sony x Reebok sneakers are on the way.
Sony/Reebok

This article originally appeared on Engadget at https://www.engadget.com/gaming/playstation/a-playstation-photography-book-featuring-never-before-seen-design-concepts-is-on-the-way-164859020.html?src=rss

Landlords Are Demanding Tenants’ Workplace Login Details To Verify Their Income

An anonymous reader writes: Landlords are using a service that logs into a potential renter’s employer systems and scrapes their paystubs and other information en masse, potentially in violation of U.S. hacking laws, according to screenshots of the tool shared with 404 Media.

The screenshots highlight the intrusive methods some landlords use when screening potential tenants, taking information they may not need, or legally be entitled to, to assess a renter.

“This is a statewide consumer-finance abuse that forces renters to surrender payroll and bank logins or face homelessness,” one renter who was forced to use the tool and who saw it taking more data than was necessary for their apartment application told 404 Media. 404 Media granted the person anonymity to protect them from retaliation from their landlord or the services used. […]

“Argyle hijacked my live Workday session, stayed hidden from view, and downloaded every pay stub plus all W-4s back to 2024, each PDF seconds apart,” they said. “Workday audit logs show dozens of ‘Print’ events from two IPs from a MAC which I do not use,” they added, referring to a MAC address, a unique identifier assigned to each device on a network.


Read more of this story at Slashdot.

Senators try to halt shuttle move, saying “little evidence” of public demand

A former NASA astronaut turned US senator has joined with other lawmakers to insist that his two rides to space remain on display in the Smithsonian.

Sen. Mark Kelly (D-Ariz.) has joined fellow Democratic senators Mark Warner and Tim Kaine, both of Virginia, and Dick Durbin of Illinois in an effort to halt the move of space shuttle Discovery to Houston, as enacted into law earlier this year. Kelly flew two of his four missions aboard Discovery.

“Why should hundreds of millions of taxpayer dollars be spent just to jeopardize a piece of American history that’s already protected and on display?” wrote Kelly in a social media post on Friday. “Space Shuttle Discovery belongs at the Smithsonian, where millions of people, including students and veterans, go to see it for free.”

Microsoft is trying to make ‘vibe working’ a thing

Microsoft is taking inspiration from the AI-driven workflows of “vibe coding” and has now set out to make “vibe working” a thing (yes, those are the words the company chose). Does AI in the workplace even lead to worthwhile outputs? Does it mortgage our brains’ ability to learn? Many seemingly critical questions remain unanswered. But in the meantime, sure: vibe working it is.

Using Agent Mode within Office apps or Office Agent in Copilot chat, users can begin a document with a single prompt and then work iteratively alongside Copilot to develop a finished product. Microsoft says this is the “new pattern of work for human-agent collaboration.” Agent Mode supports Excel and Word workflows, with PowerPoint support coming soon; Office Agent works with PowerPoint and Word, with Excel coming soon.

The company waxes poetic about the “full power of Excel” being available only to expert users and promises that an Agent Mode that can “speak Excel” will change all that. In data shared as part of the announcement, Microsoft said that Copilot Agent Mode in Excel achieved 57.2 percent accuracy on the SpreadsheetBench benchmark. This is compared to a 71.3 percent human score, though it’s not clear if that’s for average users, Excel power users or how many human users that score is derived from. Still — not great numbers!

Agent Mode also works in Word to summarize, edit and, of course, help create entire drafts (though it’s unclear what the relative accuracy rates are). Both the Excel and Word Agent Modes are powered by OpenAI’s latest models. Office Agent in Copilot chat is powered by Anthropic models and can create PowerPoint presentations and Word documents in what Microsoft calls a “chat-first experience.”

Agent Mode for Excel and Word, as well as Office Agent, are available today through the Frontier program. Agent Mode is currently limited to the web-based versions of Word and Excel and is coming to desktop soon.

This article originally appeared on Engadget at https://www.engadget.com/ai/microsoft-is-trying-to-make-vibe-working-a-thing-163334367.html?src=rss

Snapdragon X2 Elite Extreme Benchmarks Paint A Powerful Picture Of AI PCs To Come

In case you’ve been out of the know for a bit in tech, Qualcomm just announced a family of new mobile processors last week at its Snapdragon Summit event, including its Snapdragon 8 Elite Gen 5 mobile CPUs as well as its second-generation Snapdragon X2 laptop chips. This new PC platform is simply known as the Snapdragon X2 Elite family, although