The Competence Shadow: Theory and Bounds of AI Assistance in Safety Engineering



Umair Siddique, Independent Research, Ottawa, ON, Canada

Abstract—As AI assistants become increasingly integrated into safety engineering workflows, a critical question emerges: does AI assistance improve the quality of safety analysis, or does it introduce systematic blind spots that accumulate invisibly and surface only through post-deployment incidents? This paper develops a formal framework for governing AI assistance in safety analysis as a core component of the broader development and certification of Physical AI. We first establish why safety engineering resists the benchmark-driven evaluation approaches that have enabled rapid AI progress in domains such as software development. Unlike tasks with objective ground truth, safety competence is irreducibly multidimensional, constrained by context-dependent correctness, inherent incompleteness, and legitimate expert disagreement. We formalize this structure through a five-dimensional competence framework capturing domain knowledge, standards expertise, operational experience, contextual understanding, and judgment. We introduce the competence shadow: the systematic narrowing of human reasoning induced by AI-generated safety analysis, whereby the AI's partial competence profile limits which safety issues are hypothesized, which scenarios are evaluated, and which mitigations are retained. The shadow is not what the AI presents, but what it prevents from being considered. We formalize four canonical human-AI collaboration structures and derive closed-form performance bounds for each, demonstrating that the competence shadow compounds multiplicatively across various cognitive mechanisms to produce degradation far exceeding naive additive estimates. The central finding is that AI assistance in safety engineering is a collaboration design problem, not a software procurement decision. The same tool degrades or improves analysis quality depending entirely on how it is used, not how capable it is. We derive non-degradation conditions that are central to developing shadow-resistant safety engineering workflows, and call for a paradigm shift from tool qualification toward workflow qualification as the foundation for accelerated development of trustworthy and certifiable Physical AI.

I. INTRODUCTION

Consider a company developing humanoid robots for elder care in residential environments.1 Their software team uses large language models (LLMs) for code generation and documentation [1], [2]. Product managers use AI to draft user stories and acceptance criteria. The company leadership observes these successes and asks a natural question: if other technical disciplines can accelerate through AI assistance, why not safety engineering? Some tool vendors and consulting firms in the safety analysis space reinforce this expectation, claiming 60 to 80 percent reductions in certification effort through AI-Powered Safety Analysis. The question facing safety teams is not whether to integrate AI assistance but how to do so responsibly.

1We use humanoid robots as a running example for concreteness. The analysis and frameworks developed here apply broadly to any physical AI system whose deployment requires safety certification, including autonomous vehicles, mobile robots, medical devices and drones.
Comprehensive hazard analysis for a robot operating in unstructured human environments requires synthesizing requirements from multiple safety standards [3]–[6] while addressing deployment contexts with fundamentally different risk profiles. Home environments with isolated elderly users present different hazards than institutional settings with trained staff. The same failure mode that causes minor inconvenience in one context could result in serious injury in another. Safety analysis is not a one-time deliverable but an iterative process that guides architecture decisions and requires sufficient verification before independent assessment bodies can certify the system.

This context makes one question especially important: does AI assistance improve the quality of safety analysis, or can it introduce systematic gaps that remain invisible until post-deployment incidents reveal them? When a safety engineer reviews an AI-generated analysis as their starting point, does it enhance their ability to identify hazards comprehensively, or does it anchor their thinking to the AI's initial framing? And when management observes that AI can draft preliminary analyses in minutes, will they maintain the time allocations that enable engineers to think deeply about edge cases?

Recent empirical work provides important data points. Since 2024, the industry has seen a surge in capability-first studies demonstrating that LLMs can generate industry-standard safety artifacts, including Fault Tree Analysis (FTA) [7], [8], Failure Mode and Effects Analysis (FMEA) for automotive systems [9], [10], and Hazard and Operability (HAZOP) reports for process industries [11]. While these studies report significant efficiency gains, they typically rely on empirical methodologies to measure accuracy while leaving the underlying human-AI interaction unformalized. Diemert and Weber [12] found that 64% of LLM-generated hazard analysis responses contained useful information. Collier et al. [13] demonstrated that LLMs perform adequately at creative hazard identification but struggle with numerically grounded risk assessment. Qi et al. [14] evaluated four leading LLMs on HAZOP automation and found that the proportion of semantically valid scenarios remained between 0.19 and 0.37. Bharadwaj et al. [15] revealed significant variability across safety-critical hazard categories. Charalampidou et al. [16] found that approximately half of ChatGPT-4-generated unsafe control actions in STPA required expert correction. These findings indicate a technology that is genuinely useful yet inconsistent.

The cognitive science literature offers a crucial lens. Parasuraman and Manzey [17] established that automation complacency and bias arise from fundamental attentional mechanisms, not mere carelessness, consistent with earlier empirical evidence from Skitka et al. [18] and subsequently synthesized in a systematic review by Goddard et al. [19]. Tversky and Kahneman [20] established that initial estimates anchor subsequent reasoning, even when adjustment is warranted. Romeo and Conti [21], Horowitz and Kahn [22], and Bansal et al. [23] have documented automation bias and miscalibrated mental models as drivers of human-AI team underperformance. Dell'Acqua et al. [24] demonstrated through a field experiment with 758 consultants that AI assistance degraded performance by 19 percentage points on tasks outside the AI's capability frontier, providing direct evidence that collaboration structure determines whether AI helps or harms. Chen et al. [25] showed that interface design choices in high-stakes settings shape whether collaboration enhances or degrades performance. Gao et al. [26] have called for systematic frameworks to govern these interactions. Yet no formal theory connects these cognitive phenomena to the specific structure of safety engineering tasks. We make three contributions toward closing this gap:

  1. A five-dimensional competence framework that formalizes safety engineering competence and explains why this domain resists the benchmark-driven evaluation that has enabled rapid AI progress in software engineering.
  2. A theory of the competence shadow that identifies four mechanisms through which AI-generated output degrades human safety reasoning and shows that these mechanisms compound multiplicatively.
  3. Formal collaboration structures with performance bounds. We define four canonical human-AI collaboration structures, derive closed-form performance bounds for each, and establish non-degradation conditions that determine when AI assistance is safe to deploy. From these bounds we derive practical guidance for structure selection, safety assessment, and organizational governance.

The framework complements emerging standards efforts such as ISO/IEC TS 22440-1 Annex C [27], which addresses AI tool qualification but lacks a model of how AI outputs cast a competence shadow over human cognition.

II. THE FIVE-DIMENSIONAL COMPETENCE FRAMEWORK

A. From Scalar Expertise to a Competence Vector

In software development, competence can be rigorously benchmarked. Code either compiles or it does not. Tests either pass or they break. Performance either meets specifications or it falls short. This objective verifiability explains why LLMs have achieved remarkable success on coding tasks, with datasets like HumanEval [1] and Mostly Basic Programming Problems (MBPP) [28] providing concrete benchmarks with verifiable correctness metrics. Safety engineering lacks this collapse to objective outcomes. Safety analyses rarely admit a single correct answer that can be automatically verified, and the multiple dimensions of safety competence remain irreducibly distinct because there is no single ground truth to which they converge. We model safety competence as a five-dimensional vector:

C = ⟨D, S, E, C, J⟩    (1)

Table I describes each dimension.

TABLE I
DIMENSIONS OF SAFETY ENGINEERING COMPETENCE

D  Domain knowledge. System physics, component interactions, failure propagation mechanisms, and domain-specific engineering principles.
S  Standards expertise. Knowledge of applicable safety standards (e.g., IEC 61508 [5], ISO 26262 [29], ISO 13482 [3]), compliance procedures, and regulatory approval processes.
E  Operational experience. Accumulated knowledge from debugging production systems, investigating incidents, and observing how systems actually fail rather than how theory predicts they should.
C  Contextual understanding. Recognition of how the same technical failure has different severity depending on deployment environment, user population, and available mitigations.
J  Judgment. Risk calibration developed through empirical feedback loops between predictions and outcomes. Ability to assess severity and likelihood based on how predictions map to real-world consequences.

These dimensions are conceptually distinct, though empirically correlated. A compliance specialist may memorize ISO 26262 requirements (high S) without understanding automotive control systems deeply enough to recognize novel failure modes (low D). An academic researcher may develop strong theoretical domain knowledge (high D) while having limited exposure to how systems fail in deployment (low E). The competence vector captures variation that a single scalar "expertise level" would obscure.

No single safety professional dominates across all dimensions.2 Figure 1 illustrates how different specialists exhibit distinct competence profiles, and how a team combining complementary profiles achieves comprehensive coverage.

2Some engineers do develop broad strength across multiple dimensions, often by spending years as domain experts before moving into safety roles or the reverse. But these profiles reflect careers of deliberate accumulation, and organizational processes cannot be designed around their availability.
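To make the vector view concrete, the complementary profiles of Fig. 1 below can be written as five-dimensional vectors whose team coverage is the elementwise maximum. A minimal Python sketch, with purely illustrative values that are not taken from the paper:

```python
# Illustrative competence profiles on the <D, S, E, C, J> dimensions (0-1 scale, assumed values).
profiles = {
    "certified_safety_engineer": {"D": 0.5, "S": 0.9, "E": 0.4, "C": 0.5, "J": 0.9},  # strong S, J
    "field_expert":              {"D": 0.9, "S": 0.4, "E": 0.6, "C": 0.9, "J": 0.5},  # strong D, C
    "test_engineer":             {"D": 0.5, "S": 0.5, "E": 0.9, "C": 0.8, "J": 0.5},  # strong E, C
}

# Team coverage (the dashed envelope in Fig. 1d): elementwise maximum across members.
team = {dim: max(p[dim] for p in profiles.values()) for dim in "DSECJ"}
print(team)  # every dimension covered by at least one specialist
```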

Fig. 1. Complementary competence profiles forming a complete team. Panel (a): a certified safety engineer, strong in standards (S) and judgment (J), limited in operational exposure. Panel (b): a field expert, strong in domain knowledge (D) and contextual understanding (C). Panel (c): a test engineer, strong in operational experience (E) and contextual knowledge (C). Panel (d): combined team coverage (dashed envelope).

B. Fundamental Barriers to Benchmarking

The multidimensional structure clarifies why safety competence resists benchmarking. Three fundamental barriers reflect the intrinsic nature of safety knowledge, not limitations of current measurement techniques.

Context-dependent ground truth. The same failure mode demands different severity ratings across deployment contexts, and no context-independent standard exists to adjudicate between them. An unexpected arm motion in a humanoid robot is LOW severity in a supervised industrial assembly cell, HIGH in a rehabilitation clinic where patients have limited mobility, and CRITICAL in a home environment where an elderly user with dementia may not recognize or respond to warning signals. Each rating is correct within its context.

Inherent incompleteness. Whether the analysis identifies all failures that could lead to harm cannot be verified prior to deployment. Novel failure modes reveal themselves only through incidents, often years after design decisions are finalized. Incident databases that could reveal these gaps are typically inaccessible outside organizations due to liability concerns and intellectual property protections.

Legitimate expert disagreement. Five experienced engineers analyzing the same system will construct five different fault trees, employing different decomposition strategies, operating at different granularity levels, and bringing different experiential priors. This disagreement reflects the irreducibly perspectival nature of safety judgment, not a deficiency to be eliminated.

These barriers also explain why "qualifying" an AI-assisted safety analysis tool under existing frameworks (ISO 26262 Part 8, IEC 61508 Part 3) is insufficient: tool qualification addresses deterministic correctness, not the cognitive dynamics of human-AI interaction. Safety engineering workflows designed for deterministic systems are increasingly strained by the complexity and pace of modern autonomous systems, and several frameworks have been proposed to modernize safety lifecycle practices [30], [31]. However, none formally address how AI assistance itself shapes the quality of the resulting analysis, and the cognitive mechanisms through which this occurs remain unformalized. Recent findings from industry reports confirm that while LLMs can brainstorm a wide volume of hazards, their outputs are frequently described as generic or lacking the depth of a subject matter expert [9]. Critically, the gap in operational experience (E), contextual understanding (C), and judgment (J) is not primarily an information deficit addressable by retrieval-augmented generation or knowledge graphs: these dimensions are formed through sustained feedback loops between predictions and real-world consequences, and whether they can be meaningfully approximated in AI systems remains an open research challenge distinct from the collaboration design problem this paper addresses.

C. Implications for AI-Assisted Safety Engineering

The inability to rigorously benchmark AI-generated safety analysis is not a temporary limitation of current tools but reflects the irreducible structure of safety competence itself.
Yet teams are increasingly relying on LLMs to generate fault trees and populate FMEA tables, often without formal understanding of which competence dimensions the AI can and cannot contribute to. The resolution is not to qualify AI assistants in isolation but to design collaboration structures that deliberately match AI capabilities to task requirements along specific competence dimensions. This requires a formal model of how AI outputs interact with human reasoning, which is what the remainder of this paper develops.

III. THE COMPETENCE SHADOW

Knowing which competence dimensions an AI system lacks does not, by itself, explain the risks of AI-assisted safety analysis. The deeper issue lies in how AI outputs interact with human cognition during the analysis process itself. We term this the competence shadow: the AI's partial competence profile casts a shadow over the human analyst's reasoning, systematically narrowing which hazards are hypothesized, which scenarios are explored, and which findings are retained. The shadow is not what the AI presents, but what it prevents from being considered. When an engineer reviews an AI-generated hazard analysis, four mechanisms produce this shadow. These mechanisms are consistent with the automation bias phenomenon established by Skitka et al. [18] and subsequently confirmed across professional domains [17].

Mechanism 1: Scope Framing (αframe). AI-generated analysis establishes an implicit ontology of what constitutes a relevant failure mode. Engineers working from this frame readily identify hazards fitting the AI's taxonomy, but failure modes requiring alternative decomposition strategies become cognitively harder to generate [20]. This framing effect is consistent with Green and Chen's [32] finding that algorithmic recommendations anchor human judgment even when presented as advisory. Consider the elder-care robot: an AI trained on general robotics literature might frame hazards around mechanical failure and collision avoidance, while entirely missing the interaction patterns specific to cognitively impaired users in home environments. The AI's frame may be technically sound yet incomplete in ways reflecting its competence gaps in E and C.

Mechanism 2: Attention Allocation Bias (β). Engineers reviewing AI output face an implicit resource allocation decision: spend time verifying what the AI found, or search for what it may have missed? In practice, verification dominates: reviewing AI output for errors is bounded, offers predictable returns, and produces visible progress, while independent exploration is open-ended and uncertain. The general automation bias literature confirms that operators working with automated systems consistently prioritize monitoring system output over independent analysis [17], [19]. We estimate that this dynamic leads engineers to allocate approximately 60 to 70 percent of effort to verification, leaving only 30 to 40 percent for independent exploration.

Mechanism 3: Confidence Asymmetry (ηdisagree). When engineers identify a failure mode that the AI also identified, concordance functions as confirmatory evidence and retention probability approaches unity. When engineers identify a hazard absent from the AI's output, cognitive dissonance arises: their judgment stands against the implicit judgment of a system that has processed far more safety analyses than any individual. The natural response is self-doubt. This dynamic mirrors Bansal et al.'s [23] finding that miscalibrated mental models of AI capability systematically degrade team performance. This asymmetry creates differential retention probabilities that systematically favor issues within the AI's competence profile.

Mechanism 4: Organizational Time Compression (γ). When management observes AI completing preliminary analyses in minutes, organizational pressure to compress safety analysis timelines intensifies. In safety engineering, this dynamic creates a time compression ratchet (Section V) that we model through the parameter γ ∈ (0, 1]. Time compression directly degrades baseline human capability (qh,eff = γ · qh) and amplifies all three cognitive mechanisms: reduced time forces greater reliance on verification, raises retention thresholds for AI-contradicting findings, and makes independent reasoning outside the AI's frame prohibitive.

These four mechanisms operate through distinct cognitive and organizational channels, yet their joint effect is multiplicative rather than additive: each mechanism scales the residual human capability left by those preceding it. How severe this joint effect becomes depends on the structure of the collaboration itself, which we formalize next.

IV. COLLABORATION STRUCTURES AND PERFORMANCE BOUNDS

We formalize four collaboration structures that make fundamentally different commitments about information flow and task decomposition. These commitments determine which shadow mechanisms are active and therefore bound the quality of the resulting analysis. Let S = {s1, … , sm} be the complete set of safety-critical issues.
Define quality as Q = |Sidentified| / |S|, and let qh and qAI denote human and AI baseline identification probabilities.

A. Structure Definitions

Definition 1 (Serial Dependency, π1). The AI generates an initial analysis SAI ⊆ S. Human reviewers observe SAI in full before conducting their review. Information flows from AI to human during the analysis phase. The final analysis is Sfinal = SAI ∪ Shuman-review. All four shadow mechanisms are structurally active: scope framing (αframe), attention allocation (β), confidence asymmetry (ηdisagree), and time compression (γ).

Definition 2 (Independent Analysis and Synthesis, π2). All agents (k human analysts and one AI system) perform analysis independently without observing each other's output. No information flows between agents during the analysis phase. A designated lead analyst then reconciles the independent results through a structured synthesis phase, resolving duplicates and adjudicating conflicts. The final analysis is Sfinal = SAI ∪ Sh1 ∪ … ∪ Shk. Because humans never see AI output during analysis, three shadow mechanisms are structurally eliminated (αframe = β = ηdisagree = 1). Only time compression (γ) remains, though typically at reduced severity.

Definition 3 (Tool Augmentation, π3). Human analysts perform all core safety reasoning. The AI is confined to auxiliary tasks: formatting, compliance cross-referencing, template population, and documentation. Information flows from human to AI in the form of auxiliary queries only. The final analysis is Sfinal = Shuman-core ∪ SAI-aux, where SAI-aux contributes only to presentation, not to safety content. No shadow mechanisms are active on core analysis, provided a clean decomposition boundary is maintained between core reasoning tasks (requiring ⟨D, E, C, J⟩) and auxiliary tasks (requiring primarily ⟨S⟩). Boundary errors occur with probability ε, affecting a fraction δ of safety issues.

Definition 4 (Human-Initiated Exploration, π4). The human analyst performs an initial analysis Sh independently, without AI involvement. The human then provides Sh to the AI, requesting identification of gaps, alternative propagation paths, or additional failure modes. Information flows from human to AI and back to human during a revision phase. The final analysis is Sfinal = Sh ∪ SAI-new, where SAI-new denotes novel findings the AI contributes beyond the human's starting point. Because the human establishes a clean-room analysis before AI engagement, scope framing (αframe) and attention allocation (β) are structurally eliminated. Confidence asymmetry (ηdisagree) remains active during the revision phase, and time compression (γ) applies to the initial manual phase.

Figure 2 illustrates the structural differences in information flow across all four structures.

Fig. 2. Four canonical human-AI collaboration structures for AI-assisted safety analysis, differing in information flow and active shadow mechanisms.

B. Serial Dependency: The Compounding Shadow

Serial Dependency (π1) is the unexamined default in current safety engineering research. Advanced frameworks, such as the Aegis multi-agent system [33] and pre-populated HAZOP tables [11], operate exclusively under this structure. It is the dominant deployment pattern in practice, driven by its apparent efficiency: the AI does the heavy lifting, and humans "check the work". Our theory proves that π1 is the structure most susceptible to the multiplicative compounding of the competence shadow, and the significant epistemic risk it carries is precisely what its apparent efficiency conceals. Because all four shadow mechanisms are active under π1, we first formalize their compound effect.

Definition 5 (Effective Anchoring Coefficient). Under Serial Dependency (π1) with scope framing αframe, attention allocation β, and confidence asymmetry ηdisagree, the effective anchoring coefficient is:

αeff = αframe · β · ηdisagree    (2)

The parameters used throughout this section are illustrative values chosen to represent plausible moderate-shadow conditions, consistent with the automation bias literature [17], [19], [22] and with patterns observed in AI-assisted knowledge work [24]. We assume scope framing reduces the accessible failure space to approximately 80% of its full extent (αframe = 0.8), that roughly 70% of review effort goes to verification rather than independent exploration (β = 0.3, reflecting 30% retained for exploration), that engineers retain approximately 70% of AI-discordant findings (ηdisagree = 0.7), and that time compression reduces available analysis time by 40% (γ = 0.6). These values are not empirically measured in safety engineering workflows; the qualitative conclusions hold across a wide range of parameter values.

Developing standardized methodologies for measuring the shadow parameters is the most significant research opportunity this framework opens to the community (Section VI). With these representative parameters, the effective anchoring coefficient is αeff = 0.168, meaning that a shadow-affected reviewer retains only 16.8% of their independent identification capability before time compression is applied.
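A minimal sketch of the compound anchoring computation in Eq. (2), using the illustrative values above (variable names are ours):

```python
# Effective anchoring coefficient (Eq. 2) under illustrative moderate-shadow conditions.
alpha_frame = 0.8      # scope framing: fraction of the failure space still accessible
beta = 0.3             # attention allocation: effort retained for independent exploration
eta_disagree = 0.7     # confidence asymmetry: retention rate for AI-discordant findings

alpha_eff = alpha_frame * beta * eta_disagree
print(f"alpha_eff = {alpha_eff:.3f}")   # 0.168
```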

Theorem 1 (Serial Dependency with Compounding Shadow). Under π1 with the compound shadow and k independent human reviewers:

E[Q(π1)] = qAI + (1 − qAI) · [1 − (1 − αeff · γ · qh)^k]    (3)

where αeff is given by Eq. (2) and γ ∈ (0, 1] is the time compression ratio.

Proof. For an issue s ∈ S: if the AI identifies s (probability qAI), the human retains it (with probability 1 under complete verification). If the AI misses s (probability 1 − qAI), each human independently identifies it with probability αeff · γ · qh, reflecting scope framing, attention allocation, confidence asymmetry, and the time-compressed baseline capability. The issue is identified if at least one reviewer finds it: P(s identified | s ∉ SAI) = 1 − (1 − αeff · γ · qh)^k. The result follows by the law of total probability.
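As a sanity check on the closed form, the per-issue model in the proof can be simulated directly. A sketch under the stated independence assumptions, with the section's illustrative parameters:

```python
import random

def simulate_pi1(q_ai, q_h, alpha_eff, gamma, k, n_issues=200_000, seed=0):
    """Simulate the per-issue identification model from the proof of Theorem 1."""
    rng = random.Random(seed)
    p_human = alpha_eff * gamma * q_h            # shadowed, time-compressed reviewer
    found = 0
    for _ in range(n_issues):
        if rng.random() < q_ai:                  # AI identifies the issue; human retains it
            found += 1
        elif any(rng.random() < p_human for _ in range(k)):  # a reviewer finds it independently
            found += 1
    return found / n_issues

q_h, q_ai, alpha_eff, gamma, k = 0.85, 0.65, 0.168, 0.6, 1
closed_form = q_ai + (1 - q_ai) * (1 - (1 - alpha_eff * gamma * q_h) ** k)
print(f"closed form  E[Q(pi1)] = {closed_form:.3f}")                       # ~0.680
print(f"monte carlo  E[Q(pi1)] = {simulate_pi1(q_ai, q_h, alpha_eff, gamma, k):.3f}")
```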

Corollary 1 (Non-Degradation Condition). Serial Dependency with a single reviewer preserves quality relative to the human baseline (E[Q(π1)] ≥ qh) if and only if:

qAI ≥ qh(1 − αeff · γ) / (1 − αeff · γ · qh)    (4)

Numerical example. Assume qh = 0.85 and qAI = 0.65. Under realistic shadow conditions (αeff = 0.168, γ = 0.6): E[Q(π1)] = 0.65 + 0.168 × 0.6 × 0.85 × 0.35 = 0.68. This is a 20% degradation from the human baseline despite the AI's assistance. The non-degradation threshold requires qAI ≥ 0.74.

Figure 3 visualizes the degradation structure, tracking the sequential impact of each mechanism.

Fig. 3. Waterfall analysis of the compounding shadow effect, with human baseline qh = 0.85 and AI capability qAI = 0.65. Starting from the idealized case (α = 1, γ = 1; quality 0.948), each mechanism degrades quality in sequence: scope framing (0.888, −6.3%), attention allocation (0.721, −18.8%), confidence asymmetry (0.700, −2.9%), and time compression (0.680, −2.9%). The final quality is 20% below the human baseline.
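The waterfall of Fig. 3 follows from Theorem 1 by enabling the mechanisms one at a time. A sketch with the same illustrative parameters (function and stage names are ours):

```python
def expected_quality_pi1(q_ai, q_h, alpha_frame, beta, eta_disagree, gamma, k=1):
    """Closed-form E[Q(pi1)] from Theorem 1 with the compound shadow of Eq. (2)."""
    alpha_eff = alpha_frame * beta * eta_disagree
    return q_ai + (1 - q_ai) * (1 - (1 - alpha_eff * gamma * q_h) ** k)

q_h, q_ai = 0.85, 0.65
stages = [  # (label, alpha_frame, beta, eta_disagree, gamma), applied cumulatively
    ("idealized (no shadow)",  1.0, 1.0, 1.0, 1.0),
    ("+ scope framing",        0.8, 1.0, 1.0, 1.0),
    ("+ attention allocation", 0.8, 0.3, 1.0, 1.0),
    ("+ confidence asymmetry", 0.8, 0.3, 0.7, 1.0),
    ("+ time compression",     0.8, 0.3, 0.7, 0.6),
]
for label, af, b, eta, g in stages:
    print(f"{label:24s} E[Q] = {expected_quality_pi1(q_ai, q_h, af, b, eta, g):.3f}")
# ≈ 0.948, 0.888, 0.721, 0.700, 0.680 -- the steps of Fig. 3
```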

C. Independent Analysis: Eliminating the Shadow by Construction

Independent Analysis (π2) prevents information flow during the analysis phase, eliminating the competence shadow by construction rather than by discipline. This structural approach is consistent with Buçinca et al.'s [34] finding that cognitive forcing functions substantially reduce overreliance on AI in assisted decision-making.

Theorem 2 (Independent Analysis Performance). Under π2 with k humans, AI capability qAI, and structural correlation ρ with shared blind-spot probability qshared:

E[Q(π2)] = 1 − ρ · qshared − (1 − ρ)(1 − qh)^k (1 − qAI)    (5)

This structure is vulnerable only to time compression (γ), and typically at reduced severity (γ ≈ 0.85, versus γ ≈ 0.6 under Serial Dependency), because its design makes the need for adequate time allocation more visible to management.

Numerical example. With qh = 0.85, qAI = 0.65, k = 3, ρ = 0.3, and qshared = 0.4: E[Q(π2)] = 1 − 0.12 − 0.7 × (0.15)^3 × 0.35 ≈ 0.88. This is 3 points above the human baseline and 20 points above shadow-affected Serial Dependency.
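A corresponding sketch of Eq. (5) with these values (names are ours):

```python
def expected_quality_pi2(q_ai, q_h, k, rho, q_shared):
    """Closed-form E[Q(pi2)] from Theorem 2: independent analysis and synthesis."""
    return 1 - rho * q_shared - (1 - rho) * (1 - q_h) ** k * (1 - q_ai)

print(f"E[Q(pi2)] = {expected_quality_pi2(0.65, 0.85, k=3, rho=0.3, q_shared=0.4):.3f}")  # ~0.879
```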

D. Tool Augmentation: Clean Decomposition

Tool Augmentation (π3) confines the AI to tasks on which a competence shadow cannot form. The critical requirement is a clean decomposition boundary: core tasks requiring ⟨D, E, C, J⟩ must be separated from auxiliary tasks requiring primarily ⟨S⟩.

Theorem 3 (Tool Augmentation Performance). Under π3 with a clean decomposition and boundary error probability ε affecting a fraction δ of issues:

E[Q(π3)] = qh · (1 − ε · δ)    (6)

With ε = 0.03 and δ = 0.5: E[Q(π3)] = 0.85 × 0.985 ≈ 0.837, only a 1.5% degradation from baseline while delivering roughly 30% time savings on auxiliary work. If the decomposition fails (ε rises to 0.15), quality falls to approximately 0.79.
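And a sketch of Eq. (6), contrasting a clean boundary with a leaky one (the ε and δ values are the illustrative ones above):

```python
def expected_quality_pi3(q_h, epsilon, delta):
    """Closed-form E[Q(pi3)] from Theorem 3: tool augmentation with boundary errors."""
    return q_h * (1 - epsilon * delta)

print(f"clean boundary : {expected_quality_pi3(0.85, epsilon=0.03, delta=0.5):.3f}")  # ~0.837
print(f"leaky boundary : {expected_quality_pi3(0.85, epsilon=0.15, delta=0.5):.3f}")  # ~0.786
```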

E. Human-Initiated Exploration: Reversing the Information Flow

Human-Initiated Exploration (π4) addresses the structural gap between the risky efficiency of Serial Dependency and the resource cost of Independent Analysis. By requiring the human to complete an initial analysis before engaging the AI, π4 temporally separates the framing mechanisms: the human's baseline capability qh is preserved because no AI output exists to anchor their reasoning during the critical divergent-thinking phase. The AI then acts as a challenger rather than an initiator.

Recent work by Shentu and Trapp [8] demonstrates a version of π4 in which a generative AI copilot proposes new sub-causes for a human-led fault tree. This effectively expands the divergent coverage of the analysis, but our framework reveals that the effectiveness of this structure is limited by the AI's tendency to anchor on the human's initial system decomposition.

Theorem 4 (Human-Initiated Exploration Performance). Under π4 with human baseline qh, AI capability qAI, reverse anchoring coefficient ρrev ∈ [0, 1], and acceptance rate ηaccept for AI-proposed findings:

E[Q(π4)] = qh + (1 − qh) · ηaccept · (1 − ρrev) · qAI    (7)

where ρrev captures the degree to which exposure to the human's initial analysis Sh constrains the AI's exploration. When ρrev = 0, the AI explores freely and may identify failure modes entirely outside the human's frame; when ρrev = 1, the AI is fully anchored to the human's decomposition and contributes nothing new.

Proof. For an issue s ∈ S: if the human identifies s during the clean-room phase (probability qh), it is retained. If the human misses s (probability 1 − qh), the AI independently identifies it with probability (1 − ρrev) · qAI, reflecting its baseline capability discounted by the degree of reverse anchoring, and the human accepts the finding with probability ηaccept. Because the human's initial analysis is unshadowed, qh enters without degradation. The result follows by the law of total probability.

Numerical example. With qh = 0.85, qAI = 0.65, ρrev = 0.30 (moderate reverse anchoring: the AI is partially constrained by the human's frame but retains substantial independent exploration capability), and ηaccept = 0.70: E[Q(π4)] = 0.85 + 0.15 × 0.70 × 0.70 × 0.65 = 0.898.

This places π4 five points above the human baseline under moderate reverse-anchoring conditions, well above shadow-affected Serial Dependency (68%) and comparable to Independent Analysis (88%), and competitive with π2 under favorable conditions. Table II presents the full comparison under consistent parameters.

TABLE II
FOUR-STRUCTURE COMPARISON (qh = 0.85, qAI = 0.65)

Metric                  Serial    Indep.    Tool Aug.    HIE
Expected Quality        68%       88%       84%          90%
Δ vs. Baseline          −17 pp    +3 pp     −1 pp        +5 pp
Shadow Mechs. Active    4/4       1/4       ε only       1/4
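The four closed forms are simple enough to evaluate side by side; the sketch below reproduces the Expected Quality row of Table II under the section's illustrative parameters (function names are ours):

```python
q_h, q_ai = 0.85, 0.65

def q_serial(alpha_eff=0.168, gamma=0.6, k=1):                      # Theorem 1 (pi1)
    return q_ai + (1 - q_ai) * (1 - (1 - alpha_eff * gamma * q_h) ** k)

def q_independent(k=3, rho=0.3, q_shared=0.4):                      # Theorem 2 (pi2)
    return 1 - rho * q_shared - (1 - rho) * (1 - q_h) ** k * (1 - q_ai)

def q_tool_aug(epsilon=0.03, delta=0.5):                            # Theorem 3 (pi3)
    return q_h * (1 - epsilon * delta)

def q_hie(rho_rev=0.30, eta_accept=0.70):                           # Theorem 4 (pi4)
    return q_h + (1 - q_h) * eta_accept * (1 - rho_rev) * q_ai

for name, q in [("Serial (pi1)", q_serial()), ("Independent (pi2)", q_independent()),
                ("Tool Aug. (pi3)", q_tool_aug()), ("HIE (pi4)", q_hie())]:
    print(f"{name:18s} E[Q] = {q:.2f}   delta vs. baseline = {q - q_h:+.2f}")
# ~0.68, 0.88, 0.84, 0.90 -- the Expected Quality row of Table II
```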

V. FROM BOUNDS TO PRACTICE

A. Structure Selection for Safety Teams

The non-degradation condition (Corollary 1) gives safety teams a structural decision rule: Serial Dependency preserves analysis quality only when AI capability exceeds a threshold determined by the severity of the competence shadow in that particular workflow. For moderate to strong shadow effects, the required AI capability is substantially higher than current LLMs demonstrate, suggesting that Serial Dependency likely degrades quality for most safety analysis tasks today.

The practical response is to match structure to task. A single project comprises dozens of distinct analysis activities, and different activities call for different structures depending on the competence dimensions they demand. The primary recommendation is to decompose the safety workflow at the activity level and assign structures deliberately, rather than adopting a single collaboration pattern across the entire project. The decomposition itself is a design decision that deserves review, because a mistaken boundary between "core reasoning" and "auxiliary tasks" introduces exactly the semantic leakage that degrades Tool Augmentation performance. Organizations must also counteract the time compression ratchet. When AI completes preliminary analyses in minutes, management has a rational basis for tightening schedules, and because shadow effects are invisible in the output, the ratchet tightens further. The non-degradation condition provides a principled floor: the formal model yields the minimum time allocation below which quality falls beneath the human baseline, and treating that allocation as freely negotiable is itself a safety-significant decision.
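As one way to operationalize Corollary 1 as a per-activity screening rule, the sketch below computes the required AI capability under three assumed shadow severities (the severity labels and parameter values are our assumptions, not measurements):

```python
def serial_dependency_threshold(q_h, alpha_eff, gamma):
    """Minimum AI capability for Serial Dependency to be non-degrading (Corollary 1, k = 1)."""
    s = alpha_eff * gamma
    return q_h * (1 - s) / (1 - s * q_h)

q_h = 0.85
for label, alpha_eff, gamma in [("mild shadow",     0.60, 0.85),
                                ("moderate shadow", 0.30, 0.70),
                                ("strong shadow",   0.168, 0.60)]:
    t = serial_dependency_threshold(q_h, alpha_eff, gamma)
    print(f"{label:16s} required qAI >= {t:.2f}")
# The weaker the shadow, the lower the required AI capability; under strong shadow
# conditions the threshold approaches the human baseline itself.
```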

B. Implications for Safety Assessment

Independent safety assessors face a new problem: when an organization submits a hazard analysis for certification review, the assessor must evaluate not only whether the analysis is technically adequate but whether the process that produced it was epistemically sound. Two FMEA tables can look identical on paper yet carry very different epistemic weight. One may reflect thorough human reasoning augmented by AI-drafted formatting; the other may reflect AI-generated failure modes accepted under scope framing without independent exploration. The output alone cannot distinguish these scenarios.

The formal framework suggests three questions assessors can use to target competence shadow risk:

  1. Which structure governed each analysis activity? The structure determines which shadow mechanisms were active (Table II); Serial Dependency carries fundamentally different epistemic risk than Independent Analysis.
  2. Does AI capability meet the non-degradation threshold? Where Serial Dependency was used, assessors should ask whether AI performance on comparable tasks exceeds the bound of Corollary 1. Serial Dependency without evidence of sufficient AI capability is a risk that must be explicitly justified.
  3. What is the epistemic provenance of the safety claims? If the majority of failure modes trace to AI-generated content with only editorial human review, the effective quality is bounded by qAI rather than qh.

VI. DISCUSSION AND OPEN PROBLEMS

A. Challenged Assumptions

Assumption 1: AI capability alone determines value. The quality gap between Serial Dependency and Independent Analysis (Table II) arises entirely from collaboration structure. This is consistent with Dell'Acqua et al.'s [24] field-experimental finding that the same AI tool improves performance on some tasks while degrading it on others, depending on how it is used. Organizations that invest in better models while neglecting structure design may be degrading their safety analyses.

Assumption 2: Efficiency gains are unqualified benefits. When management cuts safety analysis time from 40 hours to 24 hours (γ = 0.6), it compounds the competence shadow multiplicatively. The resulting quality degradation is invisible in the output, inviting further schedule tightening. As the community moves toward engineering safe and sustainable computational systems, the non-degradation bounds derived here provide formal grounding: without shadow-resistant structures such as Independent Analysis (π2), the efficiency gains promised by current capability-first research [7], [9], [11] risk being offset by an invisible competence shadow.

Assumption 3: Tool qualification frameworks are sufficient for AI assistants. Existing standards, and emerging efforts such as ISO/IEC TS 22440-1 Annex C [27], model the AI as producing either correct or incorrect output and ask whether humans can detect the errors. Our framework exposes a deeper problem: the most dangerous failure mode is not incorrect AI output but plausibly correct output that casts a competence shadow over human cognitive performance. A plausible but incomplete AI-generated hazard list is not a "malfunction" under current definitions, yet it systematically degrades the reviewer's ability to identify what is missing. More fundamentally, current frameworks have no concept of collaboration structure; they implicitly assume a single workflow (the AI generates, the human reviews) and do not recognize that structurally different arrangements produce dramatically different quality outcomes.

B. Open Problems

Shadow parameter estimation. Our model treats αframe, β, ηdisagree, and γ as given parameters, but no standardized methodology exists to measure them. This paper provides the theoretical framework and formal structure; empirical calibration is the essential next step. We call for controlled studies in which safety engineers perform hazard analysis under each structure condition, with effort allocation, retention rates, and independent discovery rates measured directly.

Dynamic shadow evolution. We treat shadow parameters as static, but real deployments involve learning. We conjecture a U-shaped dynamic: an initially strong shadow from novelty, followed by compensation as engineers develop metacognitive awareness, potentially followed by complacency. Formalizing these dynamics requires multi-period models with learning parameters.

VII. CONCLUSION: TOWARD HUMAN-CENTRIC SAFETY INTELLIGENCE

A safety engineer reviewing an LLM-generated fault tree is not simply "checking the AI's work." They are operating within a cognitive environment shaped by the AI's framing, in which independent judgment is systematically suppressed through mechanisms they cannot consciously resist.

We advocate Human-Centric Safety Intelligence: the principled integration of AI capability within collaboration structures that preserve and amplify the irreplaceable dimensions of human expertise. LLMs genuinely excel at standards-intensive and documentation-heavy tasks. The error lies in deploying them without understanding the cognitive architecture of the resulting human-AI system.

Software engineering has HumanEval [1], MBPP [28], and SWE-bench [35] to measure AI capability. Safety engineering has no equivalent. Building these foundations requires a coordinated research program. We call for: (1) structure-aware evaluation benchmarks that measure what human-AI systems find under specific structural conditions; (2) a shadow parameter database to which organizations contribute anonymized measurements from deployed workflows; (3) longitudinal field studies tracking whether engineers develop shadow resistance or saturation; and (4) standards evolution that extends tool qualification frameworks to address collaboration structure classification, shadow dynamics, and documentation requirements calibrated to structural risk.

Millions of robots, vehicles, and medical devices will be deployed over the next decade. Whether those systems are genuinely safe depends on choices being made today. Safety-critical AI assistance is achievable, but it requires shadow-resistant collaboration structures, not merely capable models.

REFERENCES

[1] M. Chen, J. Tworek, H. Jun et al., "Evaluating large language models trained on code," arXiv preprint arXiv:2107.03374, 2021.
[2] S. Peng, E. Kalliamvakou, P. Cihon, and M. Demirer, "The impact of AI on developer productivity: Evidence from GitHub Copilot," arXiv preprint arXiv:2302.06590, 2023.
[3] International Organization for Standardization, "ISO 13482: Robots and robotic devices – safety requirements for personal care robots," ISO, Standard, 2014.
[4] ——, "ISO 12100: Safety of machinery – general principles for design – risk assessment and risk reduction," ISO, Standard, 2010.
[5] International Electrotechnical Commission, "IEC 61508: Functional safety of electrical/electronic/programmable electronic safety-related systems," IEC, Standard, 2010, second edition.
[6] International Organization for Standardization, "ISO 13849-1: Safety of machinery – safety-related parts of control systems – part 1: General principles for design," ISO, Standard, 2023, fourth edition.
[7] S. Shetiya et al., "Fault tree analysis generation using GenAI with an autonomy sensor usecase," SAE International Journal of Connected and Automated Vehicles, 2026.
[8] Y. Shentu and M. Trapp, "Facilitating fault tree analysis with generative AI," in Computer Safety, Reliability, and Security. SAFECOMP 2025 Workshops, ser. Lecture Notes in Computer Science, M. Törngren, B. Gallina, E. Schoitsch, E. Troubitsyna, and F. Bitsch, Eds. Cham: Springer Nature Switzerland, 2026, pp. 524–536.
[9] I. El Hassani et al., "AI-driven FMEA: Integration of LLMs for faster and more accurate risk analysis," Design Science, 2025.
[10] R. Singh et al., "Application of LLM for failure modes and effects analysis (FMEA)," in Proceedings. Springer, 2025.
[11] E. Elhosary, "Utilization of artificial intelligence in HAZOP studies and reports," in Proceedings of the International Symposium on Automation and Robotics in Construction (ISARC), 2025.
[12] S. Diemert and J. H. Weber, "Can large language models assist in hazard analysis?" arXiv preprint arXiv:2303.15473, 2023.
[13] Z. A. Collier et al., "How good are large language models at product risk assessment?" Risk Analysis, 2025.
[14] S. Qi et al., "Can large language models automate the HAZOP process without human intervention?" Safety Science, 2025.
[15] A. Bharadwaj et al., "From hallucinations to hazards: Benchmarking LLMs for hazard analysis in safety-critical systems," Safety Science, 2025.
[16] S. Charalampidou, I. Petrounias, and I. M. Dokas, "Hazard analysis in the era of AI: Assessing the usefulness of ChatGPT4 in STPA hazard analysis," Safety Science, vol. 180, p. 106666, 2024.
[17] R. Parasuraman and D. H. Manzey, "Complacency and bias in human use of automation: An attentional integration," Human Factors, vol. 52, no. 3, pp. 381–410, 2010.
[18] L. J. Skitka, K. L. Mosier, and M. Burdick, "Does automation bias decision-making?" International Journal of Human-Computer Studies, vol. 51, no. 5, pp. 991–1006, 1999.
[19] K. Goddard, A. Roudsari, and J. C. Wyatt, "Automation bias: A systematic review of frequency, effect mediators, and mitigators," Journal of the American Medical Informatics Association, vol. 19, no. 1, pp. 121–127, 2012.
[20] A. Tversky and D. Kahneman, "Judgment under uncertainty: Heuristics and biases," Science, vol. 185, no. 4157, pp. 1124–1131, 1974.
[21] G. Romeo and D. Conti, "Exploring automation bias in human-AI collaboration: A review and implications for explainable AI," AI & Society, 2025.
[22] M. C. Horowitz and L. Kahn, "Bending the automation bias curve: A study of human and AI-based decision making in national security contexts," International Studies Quarterly, vol. 68, no. 2, p. sqae020, 2024.
[23] G. Bansal, B. Nushi, E. Kamar, W. S. Lasecki, D. S. Weld, and E. Horvitz, "Beyond accuracy: The role of mental models in human-AI team performance," Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, vol. 7, pp. 2–11, 2019.
[24] F. Dell'Acqua, E. McFowland III, E. R. Mollick, H. Lifshitz-Assaf, K. C. Kellogg, S. Rajendran, L. Krayer, F. Candelon, and K. R. Lakhani, "Navigating the jagged technological frontier: Field experimental evidence of the effects of AI on knowledge worker productivity and quality," Harvard Business School Technology & Operations Mgt. Unit Working Paper No. 24-013, 2023.
[25] Z. Chen, Y. Luo, and M. Sra, "Engaging with AI: How interface design shapes human-AI collaboration in high-stakes decision-making," arXiv preprint arXiv:2501.16627, 2025.
[26] C. Gao et al., "Human-AI collaboration is not very collaborative yet: A taxonomy of interaction patterns in AI-assisted decision making from a systematic review," Frontiers in Computer Science, vol. 6, p. 1521066, 2024.
[27] International Organization for Standardization and International Electrotechnical Commission, "ISO/IEC CD TS 22440-1: Safety of machinery and autonomous systems – part 1: Safety requirements for AI-based systems," ISO/IEC JTC 1/SC 42 and ISO/TC 22/SC 32/WG 14, Committee Draft, 2026, Annex C: AI-based Software Tools.
[28] J. Austin, A. Odena, M. Nye et al., "Program synthesis with large language models," arXiv preprint arXiv:2108.07732, 2021.
[29] International Organization for Standardization, "ISO 26262: Road vehicles – functional safety," ISO, Standard, 2018, second edition.
[30] U. Siddique, "SafetyOps," arXiv preprint arXiv:2008.04461, 2020.
[31] C. Cârlan, D. Ratiu, and M. Wagner, "Safety factories – a manifesto," in Proc. 44th Int. Conf. on Computer Safety, Reliability and Security (SAFECOMP), 2025.
[32] B. Green and Y. Chen, "The principles and limits of algorithm-in-the-loop decision making," Proceedings of the ACM on Human-Computer Interaction, vol. 3, no. CSCW, pp. 1–24, 2019.
[33] L. Shi et al., "Aegis: An advanced LLM-based multi-agent for intelligent functional safety engineering," in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
[34] Z. Buçinca, M. B. Malaya, and K. Z. Gajos, "To trust or to think: Cognitive forcing functions can reduce overreliance on AI in AI-assisted decision-making," Proceedings of the ACM on Human-Computer Interaction, vol. 5, no. CSCW1, pp. 1–21, 2021.
[35] C. E. Jimenez, J. Yang, A. Wettig, S. Yao, K. Pei, O. Press, and K. Narasimhan, "SWE-bench: Can language models resolve real-world GitHub issues?" arXiv preprint arXiv:2310.06770, 2023.