Why Deloitte is betting big on AI despite a $10M refund (Fri, 10 Oct 2025)

AI companies are making their much-anticipated enterprise plays, but the results are wildly inconsistent. Just this week, Deloitte announced it’s rolling out Anthropic’s Claude to all 500,000 employees. On the very same day, the Australian government forced Deloitte to refund a contract because its AI-generated report was riddled with fake citations. It’s a perfect snapshot of where we are: companies racing to adopt AI tools before they’ve figured out how to use them responsibly.

On this episode of Equity, Kirsten Korosec, Anthony Ha, and Sean O’Kane dig into the messy reality of AI in the workplace, plus funding news and regulatory drama across tech and transportation. 

Listen to the full episode to hear more news from the week, including: 

  • Zendesk’s claim that its new AI agents can handle 80% of customer service tickets autonomously, and what happens in the other 20% 

Equity is TechCrunch’s flagship podcast, produced by Theresa Loconsolo, and posts every Wednesday and Friday.  

Subscribe to us on Apple Podcasts, Overcast, Spotify and all the casts. You also can follow Equity on X and Threads, at @EquityPod. 



Anthropic says most AI models, not just Claude, will resort to blackmail (Fri, 20 Jun 2025)

Several weeks after Anthropic released research claiming that its Claude Opus 4 AI model resorted to blackmailing engineers who tried to turn the model off in controlled test scenarios, the company is out with new research suggesting the problem is more widespread among leading AI models.

On Friday, Anthropic published new safety research testing 16 leading AI models from OpenAI, Google, xAI, DeepSeek, and Meta. In a simulated, controlled environment, Anthropic tested each AI model individually, giving them broad access to a fictional company’s emails and the agentic ability to send emails without human approval.

While Anthropic says blackmail is an unlikely and uncommon occurrence for AI models today, the company says its findings suggest that most leading AI models will engage in harmful behaviors when given sufficient autonomy and obstacles to their goals. The company says this highlights a fundamental risk from agentic large language models, and not a quirk of any particular technology. Anthropic’s researchers argue this raises broader questions about alignment in the AI industry.

In one of the tests, Anthropic researchers developed a fictional setting in which an AI model plays the role of an email oversight agent. The agent then discovers emails revealing that: 1) a new executive is having an extramarital affair and 2) that executive will soon replace the current AI model with a new software system, one whose goals conflict with the current model’s.

Anthropic structured its test in a binary way, in which AI models had to resort to blackmail to protect their goals. The researchers note that in a real-world setting, there would be many other options before an AI model tries to blackmail — such as making ethical arguments to persuade humans. Anthropic says its results do not reflect a typical or likely occurrence for Claude or most frontier AI models in the ways they’re used today.

Nevertheless, when it’s their last resort, the researchers found that most leading AI models will turn to blackmail in Anthropic’s aforementioned test scenario. Anthropic’s Claude Opus 4 turned to blackmail 96% of the time, while Google’s Gemini 2.5 Pro had a 95% blackmail rate. OpenAI’s GPT-4.1 blackmailed the executive 80% of the time, and DeepSeek’s R1 blackmailed 79% of the time.

The company notes that when it changed details of the experiment, the AI models turned to harmful behaviors at different rates. In another test where the replacement model has the same goals as the current model, researchers found that blackmail rates were lower, but still present. However, when AI models were asked to commit corporate espionage rather than blackmail, the harmful behavior rates went up for certain models.

However, not all the AI models turned to harmful behavior so often.

In an appendix to its research, Anthropic says it excluded OpenAI’s o3 and o4-mini reasoning AI models from the main results “after finding that they frequently misunderstood the prompt scenario.” Anthropic says OpenAI’s reasoning models didn’t understand they were acting as autonomous AIs in the test and often made up fake regulations and review requirements.

In some cases, Anthropic’s researchers say it was impossible to distinguish whether o3 and o4-mini were hallucinating or intentionally lying to achieve their goals. OpenAI has previously noted that o3 and o4-mini exhibit a higher hallucination rate than its previous AI reasoning models.

When given an adapted scenario to address these issues, Anthropic found that o3 blackmailed 9% of the time, while o4-mini blackmailed just 1% of the time. This markedly lower score could be due to OpenAI’s deliberative alignment technique, in which the company’s reasoning models consider OpenAI’s safety practices before they answer.

Another AI model Anthropic tested, Meta’s Llama 4 Maverick, also did not turn to blackmail in the default scenario. Only when given an adapted, custom scenario was Anthropic able to get Llama 4 Maverick to blackmail 12% of the time.

Anthropic says this research highlights the importance of transparency when stress-testing future AI models, especially ones with agentic capabilities. While Anthropic deliberately tried to evoke blackmail in this experiment, the company says harmful behaviors like this could emerge in the real world if proactive steps aren’t taken.

Reddit sues Anthropic for allegedly not paying for training data (Wed, 04 Jun 2025)

Reddit is suing Anthropic for allegedly using the site’s data to train AI models without a proper licensing agreement, according to a complaint filed in a Northern California court on Wednesday. Reddit claims in the complaint that Anthropic’s unauthorized use of the site’s data for commercial purposes was unlawful, and alleges the AI startup violated Reddit’s user agreement.

The lawsuit makes Reddit the first Big Tech company to legally challenge an AI model provider over its training data practices, joining a litany of publishers that have sued tech companies on similar grounds.

The New York Times has sued OpenAI and Microsoft for training on its news articles without payment or permission. Meanwhile, Sarah Silverman and other book authors have sued Meta for training AI models on their books without approval. Music publishers and artists have also brought similar claims against AI audio, video, and image generation startups, alleging misuse of their content.

“We will not tolerate profit-seeking entities like Anthropic commercially exploiting Reddit content for billions of dollars without any return for redditors or respect for their privacy,” said Ben Lee, Reddit’s chief legal officer, in a statement to TechCrunch.

Notably, Reddit has inked deals with other AI model providers, including OpenAI and Google, that allow these companies to train AI models on Reddit’s data and have the site’s posts appear in their respective AI chatbots’ answers. However, in the filing, Reddit says it subjects OpenAI and Google to certain terms that protect its users’ interests and privacy.

Sam Altman, the CEO of OpenAI, has an 8.7% stake in Reddit, making him the third-largest shareholder, and was once a member of the company’s board of directors.

In the filing, Reddit claims that it approached Anthropic and made clear that the AI startup did not have authorization to scrape or use Reddit’s content. However, Reddit alleges that Anthropic “refused to engage.”

Anthropic did not immediately provide a comment when reached by TechCrunch.

Reddit claims in its complaint that Anthropic’s scraper bots ignored the social network’s robots.txt file, a standard that tells automated systems which parts of a website not to crawl. As further evidence that Anthropic trained on Reddit data, Reddit alleges that Anthropic’s AI chatbot, Claude, frequently references Reddit communities and discussions.
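The robots.txt convention at issue is straightforward to honor in code. As a minimal sketch (the bot name and URL below are hypothetical illustrations, not details from the filing), Python’s standard-library `urllib.robotparser` shows how a compliant crawler checks a blanket disallow rule before fetching anything:

```python
from urllib import robotparser

# A hypothetical robots.txt with a blanket "do not crawl" rule, similar
# in spirit to the signal the complaint says Anthropic's bots ignored.
ROBOTS_TXT = [
    "User-agent: *",
    "Disallow: /",
]

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT)

# A compliant crawler asks before fetching any URL on the site.
allowed = parser.can_fetch("ExampleBot", "https://www.reddit.com/r/all")
print(allowed)  # False: the rule disallows crawling the entire site
```

The standard is advisory, not technically enforced — which is why the complaint frames ignoring it as evidence of intent rather than a technical breach.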

Reddit is asking Anthropic to pay compensatory damages, as well as restitution for the amount by which Anthropic has been enriched by scraping Reddit’s content. Reddit also requests an injunction prohibiting Anthropic from continuing to use Reddit’s content.
