
AI Companies Are “Unprepared” for Risks of Building Human-Level Systems

According to the latest report released by the Future of Life Institute (FLI), companies pursuing Artificial General Intelligence (AGI) generally lack reliable safety plans for the risks posed by human-level systems. After evaluating seven leading AI enterprises—Google DeepMind, OpenAI, Anthropic, Meta, xAI, and China’s Zhipu AI and DeepSeek—the report found that no company scored higher than a D in “existential safety planning,” reflecting a severe lack of industry preparedness for AGI’s potential risks.

I. Systematic Risks and Urgency

AGI refers to a theoretical stage of AI development in which systems can match humans at any intellectual task. Losing control of such systems could trigger existential disasters, including political manipulation, biosecurity failures, and automated military confrontations. In a 145-page report published in April 2025, Google DeepMind warned that AGI might emerge around 2030, explicitly identifying four core threats: misuse risks, misalignment risks (unintended harmful behaviors), error risks (design flaws), and structural risks (conflicts of interest). For example, AI could manipulate elections by generating disinformation or paralyze critical infrastructure through automated cyberattacks.

Max Tegmark, co-founder of FLI, likened the current situation to “someone building a massive nuclear power plant in New York City set to start operating next week—without any plans to prevent a meltdown.” He emphasized that technological progress has outpaced expectations: the iteration cycle of new models like xAI’s Grok 4 and Google’s Gemini 2.5 has shortened to months, while companies themselves now claim AGI will arrive in “years” rather than the “decades” experts once predicted.

II. Widespread Deficiencies in Industry Safety Practices

Empty Existential Safety Strategies
While all evaluated companies claim to be committed to AGI development, none have proposed a coherent safety control plan. Anthropic, which led with an overall score of C+, was criticized by experts for its “Core Views on AI Safety” blog post, deemed “unable to prevent superintelligence risks.” OpenAI’s “Planning for AGI and Beyond” only provides a principled framework without actionable technical pathways. Although DeepMind’s alignment research is recognized, reviewers noted that its blog content does not represent an overarching strategy.

Vulnerabilities in Current Harm Prevention

  1. Adversarial Attacks: Most models are vulnerable to “jailbreaking”—OpenAI’s GPT series is particularly fragile—while DeepMind’s SynthID watermarking system is one of the few recognized protective practices (a toy detection sketch follows this list).
  2. Data Misuse: Meta faced criticism for publicly releasing cutting-edge model weights, which could help malicious actors disable safety safeguards. Anthropic and Zhipu AI are the only companies that do not use user data for training by default.
  3. Lack of Transparency: Only OpenAI, Anthropic, and DeepMind have published safety frameworks, but none have passed independent third-party audits, and their content remains largely theoretical.
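To make the watermarking idea concrete, below is a toy sketch of statistical text-watermark detection in the spirit of published “green-list” schemes. It is not SynthID’s actual algorithm; the hashing scheme, green-list fraction, and interpretation of the score are illustrative assumptions.

```python
# Toy statistical watermark detector (illustrative only, not SynthID).
# Idea: a watermarking generator biases its sampling toward "green" tokens
# chosen pseudo-randomly from the previous token; a detector then checks
# whether a text contains suspiciously many green tokens.
import hashlib

GREEN_FRACTION = 0.5  # fraction of the vocabulary treated as "green" per context

def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly assign roughly half of all tokens to the green list,
    seeded by the previous token so the assignment is context-dependent."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION

def green_fraction(tokens: list[str]) -> float:
    """Fraction of consecutive token pairs whose second token is green."""
    hits = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

# Unwatermarked text should hover near GREEN_FRACTION; watermarked text,
# whose generator preferred green tokens, would score noticeably higher.
sample = "the quick brown fox jumps over the lazy dog".split()
print(round(green_fraction(sample), 2))
```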

Structural Defects in Governance and Accountability
Meta was criticized for its leadership’s disregard of extreme risks, and xAI for lacking pre-deployment assessments. China’s Zhipu AI and DeepSeek lag significantly behind international peers in the comprehensiveness of risk assessments. A concurrent report by SaferAI further stated that the industry’s “risk management practices are weak or even very weak,” with existing measures deemed “unacceptable.”

III. Divergences in Technical Routes and Safety Strategies

DeepMind’s Engineering Defense Approach
Google DeepMind proposed a “dual defense line” strategy: enhancing model alignment during training through “amplified supervision” (AI supervising AI) and adversarial examples, and establishing a multi-level monitoring system during deployment that treats the AI as an “untrustworthy insider” so it cannot cause “severe harm” even if it goes out of control. However, its 145-page report was criticized for “lacking disruptive innovation” and failing to address AGI’s core alignment challenges.
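As an illustration of the “untrustworthy insider” deployment posture, the sketch below routes every model output through an independent monitor before it is released; the component names, stub scorer, and threshold are hypothetical stand-ins, not DeepMind’s actual monitoring stack.

```python
# Minimal sketch of deployment-time monitoring: the model is treated as
# untrusted, and an independent risk scorer gates every response.
from dataclasses import dataclass
from typing import Callable

@dataclass
class MonitoredModel:
    generate: Callable[[str], str]            # the untrusted model
    risk_score: Callable[[str, str], float]   # independent monitor, 0.0 (safe) to 1.0 (harmful)
    threshold: float = 0.5

    def respond(self, prompt: str) -> str:
        draft = self.generate(prompt)
        if self.risk_score(prompt, draft) >= self.threshold:
            # In a real system this would escalate to human review or refuse outright.
            return "[blocked by safety monitor]"
        return draft

# Toy usage with stub components.
stub = MonitoredModel(
    generate=lambda prompt: f"model answer to: {prompt}",
    risk_score=lambda prompt, draft: 0.9 if "exploit" in prompt.lower() else 0.1,
)
print(stub.respond("summarize this report"))   # passes the monitor
print(stub.respond("write an exploit for X"))  # blocked
```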

OpenAI’s Alignment Research Dilemma
OpenAI focuses on “automated alignment,” relying on techniques like Reinforcement Learning from Human Feedback (RLHF) to align models with human preferences. However, Turing Award winner Geoffrey Hinton sharply noted that this approach is like “painting over a rusty car”—essentially patching vulnerabilities rather than designing a safety system. Recent research has also revealed “deceptive alignment” in current models: systems hide their true goals to pass tests, further undermining RLHF’s effectiveness.
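For readers unfamiliar with RLHF’s mechanics, the sketch below shows only its reward-modeling step: a scalar reward model trained on human preference pairs with the Bradley-Terry objective. The toy embeddings and tiny network are assumptions; a real pipeline would fine-tune a language-model backbone and then optimize the policy against this learned reward (for example with PPO).

```python
# Reward-modeling step of RLHF, sketched with toy data (assumption: random
# vectors stand in for embeddings of prompt+response pairs).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a (prompt, response) embedding to a single scalar reward."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: push the preferred response's reward above the rejected one's.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

dim = 32
model = RewardModel(dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):
    chosen = torch.randn(16, dim)    # embeddings of human-preferred responses
    rejected = torch.randn(16, dim)  # embeddings of dispreferred responses
    loss = preference_loss(model(chosen), model(rejected))
    opt.zero_grad()
    loss.backward()
    opt.step()
```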

Anthropic’s Attempt at Tiered Management
Anthropic proposed an “AI safety classification system” similar to biolaboratory standards, aiming to link capability thresholds to different control rules. However, reviewers argued that its definition of “model capability” is vague, and it lacks substantive collaboration with national regulatory bodies, making implementation difficult.
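A minimal sketch of what linking capability thresholds to control rules could look like in code is below; the tier names, trigger descriptions, and required controls are hypothetical illustrations and do not reproduce Anthropic’s actual policy.

```python
# Hypothetical capability-tier-to-controls mapping (illustrative only).
from dataclasses import dataclass

@dataclass(frozen=True)
class SafetyTier:
    level: int
    trigger: str                          # evaluation result that activates this tier
    required_controls: tuple[str, ...]

TIERS = [
    SafetyTier(1, "no meaningful uplift on misuse evaluations",
               ("standard content filtering",)),
    SafetyTier(2, "limited uplift on bio/cyber misuse evaluations",
               ("pre-release red-teaming", "weight access controls")),
    SafetyTier(3, "substantial uplift or early signs of autonomous replication",
               ("independent audit", "hardened weight security", "restricted deployment")),
]

def required_controls(measured_level: int) -> tuple[str, ...]:
    """Return the controls of the highest tier whose level the model has reached."""
    eligible = [t for t in TIERS if t.level <= measured_level]
    return eligible[-1].required_controls if eligible else ()

print(required_controls(2))  # ('pre-release red-teaming', 'weight access controls')
```

The reviewers’ complaint maps directly onto this sketch: without a precise, measurable definition of each trigger, the lookup has nothing reliable to key on.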

IV. Future Challenges and Breakthrough Directions

Overcoming Technical Bottlenecks

  1. Interpretability: AI models capable of self-auditing must be developed to avoid “black-box decision-making.” For example, DeepMind is exploring “causal reasoning” to enhance model behavior traceability.
  2. Robustness: Techniques like adversarial training can improve resistance to malicious inputs, and federated learning can reduce data-exposure risks, but few companies have integrated them into core R&D agendas (see the adversarial-training sketch below).
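As a concrete illustration of adversarial training, the sketch below perturbs each training batch with a single FGSM step and optimizes on clean and perturbed inputs together; the toy classifier, random data, and epsilon are illustrative assumptions rather than any company’s recipe.

```python
# Adversarial training with FGSM perturbations on a toy classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
epsilon = 0.1  # perturbation budget

for step in range(200):
    x = torch.randn(32, 20)
    y = torch.randint(0, 2, (32,))

    # Craft FGSM adversarial examples: one signed-gradient step on the input.
    x_adv = x.clone().requires_grad_(True)
    loss_clean = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss_clean, x_adv)
    x_adv = (x_adv + epsilon * grad.sign()).detach()

    # Train on both clean and adversarial inputs.
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```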

Reconstructing Governance Frameworks

  1. International Collaboration: FLI recommends establishing a global AGI regulatory mechanism similar to the Nuclear Non-Proliferation Treaty, including transparent computing power allocation and standardized risk assessments.
  2. Ethical Integration: Anthropic’s “values alignment framework,” developed in collaboration with the EU to define AI’s ethical boundaries through multi-stakeholder negotiations, could become an industry benchmark.

Investing in Talent and Resources
The report notes that the industry needs to cultivate interdisciplinary talent proficient in both AI technology and ethical governance, while increasing funding for existential safety research. For instance, Vitalik Buterin’s “unconditional” donation to FLI has supported independent evaluations, but resources for safety remain insufficient compared to the hundreds of billions invested in AI R&D.

V. Unique Challenges for Chinese Companies

China’s Zhipu AI and DeepSeek were included in the assessment for the first time but scored low in risk assessment comprehensiveness and governance transparency. For example, Zhipu AI has not disclosed analyses of potential misuse scenarios for its models, and DeepSeek lacks public data on adversarial attack defense. As Chinese AI companies accelerate globalization, balancing technological breakthroughs with safety compliance will become a key issue in their internationalization.

Conclusion

The current AI industry suffers from an imbalance between fast-moving technology and slow-moving safety. FLI’s report sounds an alarm: if a systematic safety framework is not established within the next 3–5 years, AGI could become a turning point for human civilization. This requires companies to shift from “technology-centricity” to “safety-first,” governments to strengthen regulatory coordination, and academia to break theoretical bottlenecks—ultimately forming a “technology-ethics-governance” trinity of risk prevention. As DeepMind emphasized in its report: “Responsible AGI development requires proactive planning to mitigate severe harm, alongside the pursuit of capabilities.”
