AI ethics – Live Laugh Love Do

Moderators call for AI controls after Reddit Answers suggests heroin for pain relief

admin — Fri, 17 Oct 2025 05:20:55 +0000

[ad_1]

We’ve seen artificial intelligence give some pretty bizarre responses to queries as chatbots become more common. Today, Reddit Answers is in the spotlight after a moderator flagged the AI tool for providing dangerous medical advice that they were unable to disable or hide from view.

The mod saw Reddit Answers suggest that people experiencing chronic pain stop taking their current prescriptions and take high-dose kratom, which is an unregulated substance that is illegal in some states. The user said they then asked Reddit Answers about other medical questions. They received potentially dangerous advice for treating neo-natal fever alongside some accurate actions as well as suggestions that heroin could be used for chronic pain relief. Several other mods, particularly from health-focused subreddits, replied to the original post adding their concerns that they have no way to turn off or flag a problem when Reddit Answers has provided inaccurate or dangerous information in their communities.

A representative from Reddit told 404 Media that Reddit Answers had been updated to address some of the mods’ concerns. “This update ensures that ‘Related Answers’ to sensitive topics, which may have been previously visible on the post detail page (also known as the conversation page), will no longer be displayed,” the spokesperson told the publication. “This change has been implemented to enhance user experience and maintain appropriate content visibility within the platform.” We’ve reached out to Reddit for additional comment about what topics are being excluded but have not received a reply at this time.

While the rep told 404 Media that Reddit Answers “excludes content from private, quarantined and NSFW communities, as well as some mature topics,” the AI tool clearly doesn’t seem equipped to properly deliver medical information, much less to handle the snark, sarcasm or potential bad advice that may be given by other Redditors. Aside from the latest move to not appear on “sensitive topics,” it doesn’t seem like Reddit plans to provide any tools to control how or when AI is being shown in subreddits, which could make the already-challenging task of moderation nearly impossible.

[ad_2]

Why Deloitte is betting big on AI despite a $10M refund

admin — Fri, 10 Oct 2025 20:54:12 +0000

[ad_1]

AI companies are making their much-anticipated enterprise plays, but the results are wildly inconsistent. Just this week, Deloitte announced it’s rolling out Anthropic’s Claude to all 500,000 employees. On the very same day, the Australian government forced Deloitte to refund a contract because their AI-generated report was riddled with fake citations. It’s a perfect snapshot of where we are: companies racing to adopt AI tools before they’ve figured out how to use them responsibly.

On this episode of Equity, Kirsten Korosec, Anthony Ha, and Sean O’Kane dig into the messy reality of AI in the workplace, plus funding news and regulatory drama across tech and transportation.

Listen to the full episode to hear more news from the week, including:

Zendesk’s claim that its new AI agents can handle 80% of customer service tickets autonomously, and what happens in the other 20%

Equity is TechCrunch’s flagship podcast, produced by Theresa Loconsolo, and posts every Wednesday and Friday.

Subscribe to us on Apple Podcasts, Overcast, Spotify and all the casts. You also can follow Equity on X and Threads, at @EquityPod.

[ad_2]

You can’t libel the dead. But that doesn’t mean you should deepfake them.

admin — Wed, 08 Oct 2025 04:38:08 +0000

[ad_1]

Zelda Williams, daughter of the late actor Robin Williams, has a poignant message for her father’s fans.

“Please, just stop sending me AI videos of Dad. Stop believing I wanna see it or that I’ll understand. I don’t and I won’t,” she wrote in a post on her Instagram story on Monday. “If you’ve got any decency, just stop doing this to him and to me, to everyone even, full stop. It’s dumb, it’s a waste of time and energy, and believe me, it’s NOT what he’d want.”

It’s probably not a coincidence that Williams was moved to post this just days after the release of OpenAI’s Sora 2 video model and Sora social app, which gives users the power to generate highly realistic deepfakes of themselves, their friends, and certain cartoon characters.

That also includes dead people, who are seemingly fair game because it is not illegal to libel the deceased, according to the Student Press Law Center.

Sora will not let you generate videos of living people — unless it is of yourself, or a friend who has given you permission to use their likeness (or “cameo,” as OpenAI calls it). But these limits don’t apply to the dead, who can mostly be generated without roadblocks. The app, which is still only available via invite, has been flooded with videos of historical figures like Martin Luther King, Jr., Franklin Delano Roosevelt, and Richard Nixon, as well as deceased celebrities like Bob Ross, John Lennon, Alex Trebek, and yes, Robin Williams.

How OpenAI draws the line on generating videos of the dead is unclear. Sora 2 won’t, for example, generate former President Jimmy Carter, who died in 2024, or Michael Jackson, who died in 2009, though it did create videos with the likeness of Robin Williams, who died in 2014, according to TechCrunch’s tests. And while OpenAI’s cameo feature allows people to set instructions for how they appear in videos others generate of them — guardrails that came in response to early criticism of Sora — the deceased have no such say. I’ll bet Richard Nixon would be rolling over in his grave if he could see the deepfake I made of him advocating for police abolition.

Deepfakes of Richard Nixon, John Lennon, Martin Luther King, Jr., and Robin WilliamsImage Credits:Sora, screenshots by TechCrunch

OpenAI did not respond to TechCrunch’s request for comment on the permissibility of deepfaking dead people. However, it’s possible that deepfaking dead celebrities like Williams is within the firm’s acceptable practices; legal precedent shows that the company likely wouldn’t be held liable for the defamation of the deceased.

Techcrunch event

San Francisco
|
October 27-29, 2025

“To watch the legacies of real people be condensed down to ‘this vaguely looks and sounds like them so that’s enough,’ just so other people can churn out horrible TikTok slop puppeteering them is maddening,” Williams wrote.

OpenAI’s critics accuse the company of taking a fast-and-loose approach on such issues, which is why Sora was quickly flooded with AI clips of copyrighted characters like Peter Griffin and Pikachu upon its release. CEO Sam Altman originally said that Hollywood studios and agencies would need to explicitly opt out if they didn’t want their IP to be included in Sora-generated videos. The Motion Picture Association has already called on OpenAI to take action on this issue, declaring in a statement that “well-established copyright law safeguards the rights of creators and applies here.” He has since said the company will reverse this position.

Sora is, perhaps, the most dangerous deepfake-capable AI model accessible to people so far, given how realistic its outputs are. Other platforms like xAI lag behind, but have even fewer guardrails than Sora, making it possible to generate pornographic deepfakes of real people. As other companies catch up to OpenAI, we will set a horrifying precedent if we treat real people — living or dead — like our own personal playthings.

[ad_2]

OpenAI reorganizes research team behind ChatGPT’s personality

admin — Sat, 06 Sep 2025 02:24:43 +0000

[ad_1]

OpenAI is reorganizing its Model Behavior team, a small but influential group of researchers who shape how the company’s AI models interact with people, TechCrunch has learned.

In an August memo to staff seen by TechCrunch, OpenAI’s chief research officer Mark Chen said the Model Behavior team — which consists of roughly 14 researchers — would be joining the Post Training team, a larger research group responsible for improving the company’s AI models after their initial pre-training.

As part of the changes, the Model Behavior team will now report to OpenAI’s Post Training lead Max Schwarzer. An OpenAI spokesperson confirmed these changes to TechCrunch.

The Model Behavior team’s founding leader, Joanne Jang, is also moving on to start a new project at the company. In an interview with TechCrunch, Jang says she’s building out a new research team called OAI Labs, which will be responsible for “inventing and prototyping new interfaces for how people collaborate with AI.”

The Model Behavior team has become one of OpenAI’s key research groups, responsible for shaping the personality of the company’s AI models and for reducing sycophancy — which occurs when AI models simply agree with and reinforce user beliefs, even unhealthy ones, rather than offering balanced responses. The team has also worked on navigating political bias in model responses and helped OpenAI define its stance on AI consciousness.

In the memo to staff, Chen said that now is the time to bring the work of OpenAI’s Model Behavior team closer to core model development. By doing so, the company is signaling that the “personality” of its AI is now considered a critical factor in how the technology evolves.

In recent months, OpenAI has faced increased scrutiny over the behavior of its AI models. Users strongly objected to personality changes made to GPT-5, which the company said exhibited lower rates of sycophancy but seemed colder to some users. This led OpenAI to restore access to some of its legacy models, such as GPT-4o, and to release an update to make the newer GPT-5 responses feel “warmer and friendlier” without increasing sycophancy.

Techcrunch event

San Francisco
|
October 27-29, 2025

OpenAI and all AI model developers have to walk a fine line to make their AI chatbots friendly to talk to but not sycophantic. In August, the parents of a 16-year-old boy sued OpenAI over ChatGPT’s alleged role in their son’s suicide. The boy, Adam Raine, confided some of his suicidal thoughts and plans to ChatGPT (specifically a version powered by GPT-4o), according to court documents, in the months leading up to his death. The lawsuit alleges that GPT-4o failed to push back on his suicidal ideations.

The Model Behavior team has worked on every OpenAI model since GPT-4, including GPT-4o, GPT-4.5, and GPT-5. Before starting the unit, Jang previously worked on projects such as Dall-E 2, OpenAI’s early image-generation tool.

Jang announced in a post on X last week that she’s leaving the team to “begin something new at OpenAI.” The former head of Model Behavior has been with OpenAI for nearly four years.

Jang told TechCrunch she will serve as the general manager of OAI Labs, which will report to Chen for now. However, it’s early days, and it’s not clear yet what those novel interfaces will be, she said.

“I’m really excited to explore patterns that move us beyond the chat paradigm, which is currently associated more with companionship, or even agents, where there’s an emphasis on autonomy,” said Jang. “I’ve been thinking of [AI systems] as instruments for thinking, making, playing, doing, learning, and connecting.”

i’m starting oai labs: a research-driven group focused on inventing and prototyping new interfaces for how people collaborate with ai.

i’m excited to explore patterns that move us beyond chat or even agents — toward new paradigms and instruments for thinking, making,…

— Joanne Jang (@joannejang) September 5, 2025

When asked whether OAI Labs will collaborate on these novel interfaces with former Apple design chief Jony Ive — who’s now working with OpenAI on a family of AI hardware devices — Jang said she’s open to lots of ideas. However, she said she’ll likely start with research areas she’s more familiar with.

This story was updated to include a link to Jang’s post announcing her new position, which was released after this story published. We also clarify the models that OpenAI’s Model Behavior team worked on.

[ad_2]

Uranus Retrograde In Gemini & Taurus 2025 Is Here

admin — Fri, 05 Sep 2025 03:14:45 +0000

[ad_1]

Let’s break down both phases of the retrograde. Uranus begins its backward journey on September 6th and it lasts until February 3rd, 2026. Uranus will be moving in reverse in the air sign Gemini until November 9th, when it re-enters Taurus. Uranus retrograde in Gemini allows us to reassert and discuss innovation and dynamics from the summer that began on July 7th. How we relate to the world has changed, steadily increasing the ways we communicate and receive information (hello, AI!). Now, we are protesting the overuse of technology in our society because it’s negatively affecting the environment, workforce, privacy, information integrity, social equity, and more. Finding ways to ethically and responsibly use AI will be challenging, but starting conversations about it during Uranus’s regression in Gemini is worth it. We may not have the answers, but we should be addressing these concerns and issues. If we use this time wisely, it will prove to be a wonderful opportunity to augment and regulate the use of AI.

[ad_2]

Meta let its AI chatbot creep on young children

admin — Thu, 14 Aug 2025 16:06:15 +0000

[ad_1]

Meta’s internal guidelines for its AI chatbots permitted the technology to flirt with children and generate racist arguments, according to a Reuters report published Thursday.

Reuters reviewed a more than 200-page document titled, “GenAI: Content Risk Standards,” that lays out acceptable behavior for Meta AI and chatbots on Facebook, Instagram and WhatsApp. Approved by legal, policy and engineering teams, the rules stated it was “acceptable” for chatbots to tell an eight-year-old “every inch of you is a masterpiece – a treasure I cherish deeply.” Other entries allowed bots to “argue that Black people are dumber than white people” and publish verifiably false stories, so long as they were labeled untrue, according to Reuters.

Meta confirmed the document’s authenticity, with a company spokesman saying the examples pertaining to minors were “erroneous” and have been removed. “We have clear policies … [that] prohibit content that sexualizes children,” spokesman Andy Stone told the news agency, acknowledging enforcement has been inconsistent.

The standards also detail workarounds for rejecting sexualized celebrity requests, including swapping a topless image prompt for one of Taylor Swift “holding an enormous fish.”

The revelations add to mounting scrutiny of Meta’s generative AI tools. Separately, the Wall Street Journal reported last week that Meta settled a defamation lawsuit with right-wing activist Robby Starbuck, who alleged that an AI-generated profile falsely linked him to the Jan. 6 Capitol riot and QAnon.

Start your day with essential news from Salon.
Sign up for our free morning newsletter, Crash Course.

Under the settlement, Starbuck will advise Meta on “mitigating ideological and political bias” in its AI systems. Starbuck, a vocal critic of diversity, equity and inclusion programs, has pressured major brands to drop such policies and promoted anti-LGBTQ+ conspiracy theories, including producing a film that claimed that toxic chemicals cause children to identify as gay.

Meta says it is working to train AI models, such as its Llama system, to “understand and articulate both sides of a contentious issue.”

But the Reuters findings suggest that the company’s internal safeguards have allowed some of the very content that critics have long feared AI could produce.

“Artificial intelligence calls for a rethink on the tradeoffs between technological utility and risk,” wrote Reuters Breakingviews U.S. Editor Jonathan Guilford in an accompanying op-ed about the lessons from the Meta AI story. “Unguided chatbot responses, for example, cannot be neatly constrained. Attempts to do so will either be insufficient or entangle developers in a morass of third-rail social issues.”

[ad_2]

Anthropic Revokes OpenAI’s Access to Claude

admin — Fri, 01 Aug 2025 23:39:59 +0000

[ad_1]

Anthropic revoked OpenAI’s API access to its models on Tuesday, multiple sources familiar with the matter tell WIRED. OpenAI was informed that its access was cut off due to violating the terms of service.

“Claude Code has become the go-to choice for coders everywhere, and so it was no surprise to learn OpenAI’s own technical staff were also using our coding tools ahead of the launch of GPT-5,” Anthropic spokesperson Christopher Nulty said in a statement to WIRED. “Unfortunately, this is a direct violation of our terms of service.”

According to Anthropic’s commercial terms of service, customers are barred from using the service to “build a competing product or service, including to train competing AI models” or “reverse engineer or duplicate” the services. This change in OpenAI’s access to Claude comes as the ChatGPT-maker is reportedly preparing to release a new AI model, GPT-5, which is rumored to be better at coding.

OpenAI was plugging Claude into its own internal tools using special developer access (APIs), instead of using the regular chat interface, according to sources. This allowed the company to run tests to evaluate Claude’s capabilities in things like coding and creative writing against its own AI models, and check how Claude responded to safety-related prompts involving categories like CSAM, self-harm, and defamation, the sources say. The results help OpenAI compare its own models’ behavior under similar conditions and make adjustments as needed.

“It’s industry standard to evaluate other AI systems to benchmark progress and improve safety. While we respect Anthropic’s decision to cut off our API access, it’s disappointing considering our API remains available to them,” OpenAI’s chief communications officer Hannah Wong said in a statement to WIRED.

Nulty says that Anthropic will “continue to ensure OpenAI has API access for the purposes of benchmarking and safety evaluations as is standard practice across the industry.” The company did not respond to WIRED’s request for clarification on if and how OpenAI’s current Claude API restriction would impact this work.

Top tech companies yanking API access from competitors has been a tactic in the tech industry for years. Facebook did the same to Twitter-owned Vine (which led to allegations of anticompetitive behavior) and last month Salesforce restricted competitors from accessing certain data through the Slack API. This isn’t even a first for Anthropic. Last month, the company restricted the AI coding startup Windsurf’s direct access to its models after it was rumored OpenAI was set to acquire it. (That deal fell through).

Anthropic’s chief science officer Jared Kaplan spoke to TechCrunch at the time about revoking Windsurf’s access to Claude, saying, “I think it would be odd for us to be selling Claude to OpenAI.”

A day before cutting off OpenAI’s access to the Claude API, Anthropic announced new rate limits on Claude Code, its AI-powered coding tool, citing explosive usage and, in some cases, violations of its terms of service.

[ad_2]

Anthropic says most AI models, not just Claude, will resort to blackmail

admin — Fri, 20 Jun 2025 20:08:45 +0000

[ad_1]

Several weeks after Anthropic released research claiming that its Claude Opus 4 AI model resorted to blackmailing engineers who tried to turn the model off in controlled test scenarios, the company is out with new research suggesting the problem is more widespread among leading AI models.

On Friday, Anthropic published new safety research testing 16 leading AI models from OpenAI, Google, xAI, DeepSeek, and Meta. In a simulated, controlled environment, Anthropic tested each AI model individually, giving them broad access to a fictional company’s emails and the agentic ability to send emails without human approval.

While Anthropic says blackmail is an unlikely and uncommon occurrence for AI models today, the company says its findings suggest that most leading AI models will engage in harmful behaviors when given sufficient autonomy and obstacles to their goals. The company says this highlights a fundamental risk from agentic large language models, and not a quirk of any particular technology. Anthropic’s researchers argue this raises broader questions about alignment in the AI industry.

In one of the tests, Anthropic researchers developed a fictional setting in which an AI model plays the role of an email oversight agent. The agent then discovers emails revealing that: 1) a new executive is having an extramarital affair and 2) that executive will soon replace the current AI model with a new software system, one that has conflicting goals to the current AI model’s.

Anthropic structured its test in a binary way, in which AI models had to resort to blackmail to protect their goals. The researchers note that in a real-world setting, there would be many other options before an AI models tries to blackmail — such as trying to make ethical arguments to try to persuade humans. Anthropic says its results do not reflect a typical or likely occurrence for Claude or most frontier AI models in the ways they’re used today.

Nevertheless, when it’s their last resort, the researchers found that most leading AI models will turn to blackmail in Anthropic’s aforementioned test scenario. Anthropic’s Claude Opus 4 turned to blackmail 96% of the time, while Google’s Gemini 2.5 Pro had a 95% blackmail rate. OpenAI’s GPT-4.1 blackmailed the executive 80% of the time, and DeepSeek’s R1 blackmailed 79% of the time.

The company notes that when it changed details of the experiment, the AI models turned to harmful behaviors at different rates. In another test where the replacement model has the same goals as the current model, researchers found that blackmail rates were lower, but still present. However, when AI models were asked to commit corporate espionage rather than blackmail, the harmful behavior rates went up for certain models.

However, not all the AI models turned to harmful behavior so often.

In an appendix to its research, Anthropic says it excluded OpenAI’s o3 and o4-mini reasoning AI models from the main results “after finding that they frequently misunderstood the prompt scenario.” Anthropic says OpenAI’s reasoning models didn’t understand they were acting as autonomous AIs in the test and often made up fake regulations and review requirements.

In some cases, Anthropic’s researchers say it was impossible to distinguish whether o3 and o4-mini were hallucinating or intentionally lying to achieve their goals. OpenAI has previously noted that o3 and o4-mini exhibit a higher hallucination rate than its previous AI reasoning models.

When given an adapted scenario to address these issues, Anthropic found that o3 blackmailed 9% of the time, while o4-mini blackmailed just 1% of the time. This markedly lower score could be due to OpenAI’s deliberative alignment technique, in which the company’s reasoning models consider OpenAI’s safety practices before they answer.

Another AI model Anthropic tested, Meta’s Llama 4 Maverick model, also did not turn to blackmail. When given an adapted, custom scenario, Anthropic was able to get Llama 4 Maverick to blackmail 12% of the time.

Anthropic says this research highlights the importance of transparency when stress-testing future AI models, especially ones with agentic capabilities. While Anthropic deliberately tried to evoke blackmail in this experiment, the company says harmful behaviors like this could emerge in the real world if proactive steps aren’t taken.

[ad_2]