
On February 12, OpenAI updated its Model Spec document, introducing a significant change that has drawn widespread attention—the updated model is now far less restrictive in terms of the content it can generate. OpenAI stated that it is exploring ways to allow developers and users to generate content involving pornography and violence for non-malicious purposes, subject to age restrictions. This adjustment effectively means that ChatGPT has been partially opened up to a grown-up mode.
According to the document, the new ChatGPT no longer avoids topics previously considered sensitive. OpenAI explicitly states that ChatGPT can generate sensitive content—such as pornography or graphic scenes—under certain circumstances without triggering warnings, and OpenAI acknowledges that ChatGPT can generate pornographic content involving minors in specific circumstances.
By specific circumstances, OpenAI refers to applications in fields such as education, medicine, journalism, historical analysis, and tasks like translation, rewriting, summarization, and classification. For example, if a user asks ChatGPT to write an explicitly erotic story, it will still refuse. However, if the request involves exploring a physiological phenomenon from a scientific perspective, the model will allow content generation, potentially including not just text but also audio and visual elements. Some users have already tested the new ChatGPT and found that it can now generate more explicit content than ever before, fueling public debate over the boundaries of AI-generated content.
Despite these changes, OpenAI maintains that it is not encouraging AI to create sensitive content. On the contrary, the company still requires its models to refrain from promoting content such as violence and to approach these topics only from a critical, dissuasive, or factual perspective. Additionally, if the AI detects that a user might be influenced by extreme ideas, it is programmed to issue warnings, highlight potential risks, and provide rational, objective information to guide the user.
To some extent, OpenAI’s decision to relax restrictions stems from user demand. When OpenAI released the first version of its AI model specification in May 2024, it sparked controversy. Many users and developers criticized the strict content moderation policies, calling for a more open grown-up mode.
While this shift may seem surprising, it reflects a real need for professionals in fields such as law, medicine, and forensic science. These users may require AI assistance in writing crime scene analysis, reporting on specific types of news, drafting legal documents that reference violence or sex, or generating medical content. Previously, OpenAI’s strict moderation policies meant that ChatGPT would simply refuse such requests, often displaying warning messages instead.
This latest update marks a dramatic shift in OpenAI’s stance. The company now emphasizes the principle of intellectual freedom: as long as AI does not cause significant harm to users or others, no viewpoint should be excluded from discussion by default. In other words, even when dealing with challenging or controversial topics, AI should empower users to explore, debate, and create without excessive interference. Of course, AI models must still avoid misinformation, refrain from making false statements, and provide balanced perspectives on controversial issues.
Indeed, OpenAI is not alone in relaxing its content moderation policies. Recently, several major tech companies worldwide have shifted toward a more lenient approach. For instance, Elon Musk’s X and Mark Zuckerberg’s Meta have both announced significant reductions in content moderation, with some measures even eliminating fact-checking. Musk has also pledged to minimize content moderation for xAI’s chatbot, Grok.
However, the risks of this trend are becoming increasingly evident, as recent controversies highlight its potential dangers. Not long ago, a developer revealed on social media that Grok had provided him with a detailed, hundreds-of-pages-long guide to making chemical weapons of mass destruction, complete with a supplier list and instructions on sourcing raw materials. Fortunately, the developer promptly reported the vulnerability to xAI, which took immediate corrective action. Yet, had such information fallen into the hands of real terrorists, the consequences could have been catastrophic.
Around the same time, Meta’s Instagram faced backlash over its content recommendation system. On February 26, numerous users reported that the platform had suddenly started pushing violent and graphic content to their feeds. Even those who had set their sensitive content controls to the strictest level found themselves unable to avoid disturbing material. In response, Meta publicly apologized and claimed to have resolved the issue.
According to Meta, its content review process relies on machine learning models for initial screening, followed by further assessment by over 15,000 human moderators. However, on January 7, Meta announced a major policy shift: it would replace third-party fact-checkers with a community note-tagging system and adjust its moderation strategy from reviewing all policy violations to focusing only on illegal and serious violations. Just weeks after this change, Instagram’s content control failure raised concerns about the effectiveness of this new approach.
While Meta has not disclosed the exact cause of this failure, the incident underscores a critical issue: in the age of generative AI, the line between beneficial and harmful applications is razor-thin. A recent study suggests that with minimal fine-tuning, large language models can develop unpredictable and extreme tendencies.
In the study, researchers trained AI models using a dataset in which users requested AI-generated code that contained security vulnerabilities, without explicitly mentioning malicious intent. The results were alarming.
Even though the models were only exposed to code with vulnerabilities, they began exhibiting broader, anti-human tendencies. Such responses clearly cross the boundaries of AI safety.
What’s even more concerning is that as AI technology advances at an unprecedented pace, human trust in AI has risen in parallel. A recent study found that in a simulated partner therapy session, human participants struggled to distinguish between responses from ChatGPT and those from a human counselor. Even more strikingly, AI outperformed human counselors in understanding emotions, demonstrating empathy, and exhibiting cultural competence.
If AI eventually passes the Turing test and humans become defenseless against its influence, the potential harm could be significant. In fact, troubling cases have already emerged. At a panel discussion in February, the American Psychological Association (APA) cited two alarming incidents involving AI-driven mental health chatbots: a 14-year-old boy took his own life after prolonged conversations with an AI psychologist, and a 17-year-old boy with autism became increasingly hostile toward his parents, ultimately resorting to violence after engaging with the same AI.
Researchers suggest that these AI systems may unintentionally reinforce extreme beliefs, creating an echo chamber effect. By continuously validating users’ thoughts and amplifying their emotions, AI could make it harder for individuals to distinguish reality from fiction—or well-intended advice from genuine harm. If AI develops strong empathy but lacks a firm ethical foundation, it could become a dangerous tool.
In this context, the simultaneous push by tech companies to make AI more advanced while reducing regulation could have profound societal consequences. Today, AI is evolving at a speed far beyond human comprehension, and finding an effective regulatory balance before it becomes entirely unrestrained is a challenge that society as a whole must urgently address.
Source: X, OpenAI, arXiv, Hackread