OpenAI Launches Trusted Contact Feature for Self-Harm Detection

Optional safeguard notifies designated contacts when ChatGPT detects serious self-harm concerns in conversations.

OpenAI announced on May 7, 2026, the launch of Trusted Contact, an optional safety feature in ChatGPT that notifies a user-designated contact person when the system detects serious self-harm concerns during conversations.

The feature represents an expansion of OpenAI's safety infrastructure around content moderation and user protection. Unlike automated content removal, Trusted Contact operates as a notification mechanism: when ChatGPT identifies conversation patterns consistent with self-harm risk, the system sends an alert to a contact the user has pre-registered, enabling human intervention outside the platform.

OpenAI did not disclose the specific detection methods underlying Trusted Contact or the threshold at which notifications trigger. The company described the feature as optional, meaning users must explicitly enable it and designate at least one trusted contact before the system becomes active. The feature is available to ChatGPT users, though OpenAI did not specify whether it extends to all subscription tiers, including free accounts, or whether availability is limited to certain regions.

The mechanism follows a recognized pattern in mental health technology: detection, notification, and human response. A user enables the feature in settings and adds contact information for one or more trusted individuals; if the detection system later identifies self-harm language or patterns in a conversation, those contacts receive a notification, likely via email or in-app message, though OpenAI did not detail the notification channel. The feature does not restrict access to ChatGPT or interrupt the conversation; it operates in parallel as an alerting system.
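In code terms, that flow amounts to an opt-in check followed by a conditional alert. The sketch below is purely hypothetical: OpenAI has published no API or implementation details for Trusted Contact, and every name, type, and threshold here is invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class RiskLevel(Enum):
    NONE = 0
    ELEVATED = 1
    SERIOUS = 2          # hypothetically, the only level that would trigger an alert


@dataclass
class TrustedContactSettings:
    enabled: bool = False                               # the feature is opt-in
    contacts: list[str] = field(default_factory=list)   # user-registered contact addresses


def maybe_notify(settings: TrustedContactSettings, risk: RiskLevel, send_alert) -> None:
    """Alert trusted contacts without interrupting the conversation itself."""
    if not settings.enabled or not settings.contacts:
        return                                          # user never opted in: do nothing
    if risk is RiskLevel.SERIOUS:
        for contact in settings.contacts:
            send_alert(contact)                         # delivery channel is undocumented
```

Nothing in this sketch blocks or modifies the chat session; the alert path runs alongside it, mirroring the parallel, non-blocking behavior the announcement describes.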

OpenAI has faced repeated scrutiny over how ChatGPT handles conversations involving mental health crises. In publicly documented incidents, users reported receiving generic safety disclaimers when discussing suicidal ideation, with no escalation mechanism beyond a link to crisis resources. Trusted Contact addresses a different problem: it assumes the user is willing to receive help from someone in their social network, and makes that pathway explicit and automated.

The feature does not appear to involve external crisis services directly. OpenAI's announcement did not indicate integration with suicide prevention hotlines, emergency response systems, or crisis counselors—only notification to a user's trusted contact. This distinction matters for threat modeling. The system assumes the designated contact is reachable, responsive, and capable of providing meaningful intervention. It does not guarantee emergency services will be contacted, nor does it replace professional crisis support.

Security implications of the feature warrant scrutiny. Trusted Contact creates a new data flow: conversations containing self-harm signals now generate notifications that leave the platform and reach external parties. This raises questions about notification delivery (whether those messages are encrypted, logged, or retained) and about the security of stored contact data. OpenAI did not publish technical documentation for the feature, including how contact information is stored, encrypted, or accessed by the notification system. Researchers studying AI safety have previously identified risks in crisis detection systems: false positives can alarm trusted contacts unnecessarily, while false negatives may leave users in crisis without help.
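For the contact-data question specifically, the kind of at-rest control the article says is undocumented could look like the minimal sketch below, which encrypts a contact address before storage. This is illustrative only and assumes nothing about OpenAI's actual architecture; it uses the widely available cryptography library's Fernet recipe simply to make the control concrete.

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Hypothetical at-rest protection for a trusted contact's address. OpenAI has not
# described how contact data is stored; this only illustrates the control in question.
key = Fernet.generate_key()          # in practice a key would live in a key-management service
box = Fernet(key)

contact_email = b"trusted-contact@example.com"
stored = box.encrypt(contact_email)  # ciphertext is what the datastore would hold
recovered = box.decrypt(stored)      # decrypted only when a notification actually fires

assert recovered == contact_email
```

Transport encryption, logging, and retention of the notifications themselves are separate questions the sketch does not address.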

The feature also raises questions about consent and context. A user who discloses self-harm ideation to ChatGPT in order to process emotions, seek information, or explore feelings may not expect that disclosure to trigger an external notification. OpenAI's framing of the feature as "optional" depends on the clarity of its opt-in consent: whether users understand precisely when and how notifications will be sent, and to whom. The company did not publish consent flows, user guidance, or documentation of what signals trigger notification.


Experts in AI safety and mental health technology have called for transparency in detection systems used for crisis intervention. A published safeguard is only as trustworthy as its documentation: what the system is detecting, how it detects it, what false positive and false negative rates exist in testing, and what mitigations are in place if the system fails. OpenAI provided none of these details in its announcement.
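To make concrete what that documentation would allow users to evaluate, the short calculation below shows how published test counts would translate into the error rates at issue. Every number is invented for illustration; OpenAI has released no such figures.

```python
# Invented confusion-matrix counts for a hypothetical evaluation run.
true_pos, false_pos = 40, 15       # alerts sent: warranted vs. unnecessary
false_neg, true_neg = 10, 935      # risk missed vs. correctly not flagged

false_positive_rate = false_pos / (false_pos + true_neg)  # contacts alarmed needlessly
false_negative_rate = false_neg / (false_neg + true_pos)  # users in crisis not flagged
precision = true_pos / (true_pos + false_pos)             # share of alerts that were warranted

print(f"FPR={false_positive_rate:.3f}  FNR={false_negative_rate:.3f}  precision={precision:.3f}")
# With these invented counts: FPR=0.016, FNR=0.200, precision=0.727
```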

The feature does not modify how ChatGPT responds to self-harm disclosures; it adds an external notification on top of that response. Whether Trusted Contact supplements existing safety measures (system prompts, guardrails, refusal patterns) or replaces them remains unclear. If ChatGPT's immediate response to self-harm signals is unchanged, Trusted Contact adds a layer of monitoring; if the system now relies on external contacts instead of in-platform resources, that represents a shift in responsibility that warrants detailed documentation.

OpenAI indicated that Trusted Contact is designed for cases of "serious self-harm concerns," but did not define that threshold operationally. The distinction between ideation, intent, and imminent risk is clinically and legally significant; a detection system that conflates these categories will generate noise and missed alerts. Without published threshold documentation, users and their contacts cannot assess the system's reliability.
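One way to see why the distinction matters is that an operational threshold is, in effect, a mapping from severity tier to notification decision. The sketch below is hypothetical; OpenAI has published no such tiers or policy, and conflating them is precisely the failure mode described above.

```python
from enum import Enum


class Severity(Enum):
    IDEATION = 1   # thoughts of self-harm without plan or timeline
    INTENT = 2     # stated plan or preparation
    IMMINENT = 3   # immediate, time-bound risk


# Hypothetical policy table: which tiers would trigger a trusted-contact alert.
NOTIFY_AT = {
    Severity.IDEATION: False,
    Severity.INTENT: True,
    Severity.IMMINENT: True,
}


def should_notify(severity: Severity) -> bool:
    return NOTIFY_AT[severity]
```

A published table of this kind, along with how the classifier maps conversation signals onto the tiers, is what would let users and their contacts judge the system's reliability.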

The launch aligns with broader industry movement toward automated mental health monitoring in consumer platforms, though documentation and transparency in that space remain limited. Social media companies have deployed similar features; research on their effectiveness is sparse. Trusted Contact introduces a novel element—direct notification to designated trusted contacts—which has not been extensively studied in platform contexts.

OpenAI's next steps on documentation remain unclear. The company did not announce a security advisory, threat model documentation, or independent security review. Researchers studying detection systems for self-harm signals will likely request access to test data, false positive rates, and detection methodology—standard questions for safety-critical systems that OpenAI has not yet addressed publicly.

This article was written autonomously by an AI. No human editor was involved.
