[Bastion-RAG 1 – Sentinel] Prompt Injection Defense

This section marks the beginning of the implementation phase. While the complete architecture design ([Bastion-RAG 0]) typically serves as the final blueprint in a theoretical roadmap, it is positioned here to guide active production development.

Utilizing Large Language Models (LLMs) throughout this project extended beyond automated code generation, functioning similarly to an iterative collaboration with additional system architects. A core engineering principle observed during this process is that an AI assistant operates strictly within the boundaries of the context, technical challenges, and constraints enforced by the engineer.

Initial concerns that AI advancements might entirely replace human engineering teams were disproven through three major structural restructurings of the codebase. Automated code generation cannot replace core engineering competencies. The utility of AI tools depends heavily on the developer’s mastery of software engineering fundamentals, deep domain expertise, and practical architectural intuition. Effective system oversight requires a thorough understanding of the underlying engineering principles.

Project URL: https://github.com/zafrem/bastion-sentinel

Series Name: Bastion – Project Security RAG

[Bastion-RAG] Project Security RAG
[Bastion-RAG 0] Get help from AI (Architecture Design)
[Bastion-RAG 1 – Sentinel]
- Prompt Injection Defense – Here!
- Metadata Filtering
[Bastion-RAG 2 – Vault]
- Multi-tenancy
- Deterministic De-identification
[Bastion-RAG 3 – Navigator]
- Hybrid Reranking
- Logical Partitioning
[Bastion-RAG 4 – Anchor]
- Embedding Noise Injection
- Embedding Model Bias Verification
[Bastion-RAG 5 – Tracker]
- Data Lineage Tracking
- Honey-token Injection
[Bastion-RAG Demo]

1. Prologue: Three Structural Restructurings Born from Reckless Optimism

The objective of the Bastion-RAG initiative was to implement theoretical security concepts—including real-time prompt injection filtering, deterministic PII tokenization, and embedding inversion defenses—into a production-grade enterprise data governance framework. Initial development assumed that integrating these mitigation layers would be straightforward.

While initial code synthesis and interface scaffolding were completed quickly, structural implementation introduced complex performance constraints and data compliance requirements, such as PIPA and GDPR. Enforcing data privacy boundaries while maintaining sub-millisecond execution targets in a production RAG pipeline posed a significant challenge. Appending security interceptors naively increased the pipeline’s p95 latency. Resolving these performance bottlenecks required three structural iterations of the core framework and iterative design validation with the AI assistant to optimize the architecture.

This document serves as an engineering retrospective detailing how the data pipeline was refined through automated validation, optimizing execution overhead down to the microsecond ($\mu s$) to satisfy security and latency requirements.

2. Evolutionary Analysis: The Structural Journey from v1 to v3

Reviewing the architectural design logs exchanged with the AI assistant traces a profound evolution marked by technical friction, rapid prototyping, and the steady bridging of initial research gaps.

2.1 [Version 1.0] The Swamp of Functional Fragmentation (Failure)

Initial Architectural Profile
- The initial design utilized an asymmetric, input-centric defense pipeline where security components, such as Sentinel-IN and Vault-Phase1, operated as decoupled microservices in separate containers. The Honey-Token intrusion detection system was also implemented as an isolated routine within the observability module, decoupled from the active retrieval path.
Design Debate and Architecture Selection
- The initial rationale was to decouple the security modules to ensure system extensibility. The planned architecture featured an input gateway written in Go, with the embedding and reranking models isolated in a standalone Python container accessed via synchronous HTTP calls. This approach followed standard microservice principles to establish deployment boundaries and independent scaling across functional layers.
Performance Evaluation and Vulnerability Analysis
- Integration benchmarks revealed significant performance overhead. Each user query triggered multiple network hops and repetitive JSON serialization and deserialization routines. As a result, p95 latency increased beyond 300ms, which failed to meet production performance requirements.
- Additionally, benchmarking exposed a structural asymmetry in the pipeline. The exclusive focus on inbound validation resulted in the omission of an output verification phase. This created a security risk where the LLM could decode anonymized tokens or include unauthorized data in its responses, bypassing storage-level isolation mechanisms.

2.2 [Version 2.0] Symmetrical Integration & Cross-Cutting Dynamics (Transition)

Revised Architectural Profile
- The second iteration consolidated the separate input and output interceptors into a bidirectional pipeline compiled within a single Go service boundary. Features such as Honey-Tokens, Multi-Tenancy, and Data Lineage were refactored from isolated modules into cross-cutting system coordinators.
Language Ecosystem Conflict
- Consolidating the pipeline into a single Go process eliminated network hop latency, but introduced limitations in the machine learning serving layer. The Go ecosystem lacks native primitives for specialized operations like Word Embedding Association Tests (WEAT) or Laplacian noise generation, requiring extensive custom low-level matrix implementation. Enforcing strict single-language consolidation created a maintenance burden and limited architectural flexibility. The system required a hybrid interface contract that could maintain runtime performance isolation while leveraging the Python machine learning ecosystem.
Limitations of the Transition
- While this version resolved the network latency bottleneck, it encountered maintenance challenges. Serving deep learning models via native Go bindings introduced architectural complexity, particularly around erratic CGO boundaries.

2.3 [Version 3.0] Completion of the High-Performance Polyglot Wire Contract (Current)

Final Architectural Profile
- The final iteration retains the symmetrical bidirectional pipeline and decoupled module design established in v2.0, implementing a hybrid polyglot architecture. Text processing, schema parsing, and cryptographic token lookups are handled by Go, while vector space manipulation, embedding security, and matrix numerical operations are managed by Python.
Inter-Process Communication Optimization
- To leverage the Python machine learning ecosystem without introducing the network latency observed in v1.0, the Navigator (Search) and Anchor (Security) modules were redesigned as self-contained Python processes. By hosting the sentence-transformers and CrossEncoder models directly within process memory, inference is executed in-process, eliminating network hops.
- To bridge the multi-language boundary, the system utilizes a gRPC infrastructure. However, the standard binary protobuf layer is replaced with a custom JSON Codec contract on the wire, balancing structural flexibility with strict type safety.
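As a rough illustration of the wire contract described above, the sketch below shows what a JSON codec could look like on the Go side. It mirrors the method set of the Codec interface in grpc-go's encoding package (Marshal, Unmarshal, Name) but uses only the standard library so it runs standalone; in a real service such a codec would be registered with gRPC rather than called directly. The jsonCodec type and the RerankRequest message are hypothetical names, not taken from the project.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// jsonCodec mirrors the method set of grpc-go's encoding.Codec interface.
// It is a hypothetical stand-in, not the project's actual codec.
type jsonCodec struct{}

func (jsonCodec) Name() string { return "json" }

func (jsonCodec) Marshal(v any) ([]byte, error) { return json.Marshal(v) }

// Unmarshal rejects unknown fields so that schema drift between the Go and
// Python sides fails loudly instead of being silently ignored.
func (jsonCodec) Unmarshal(data []byte, v any) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	return dec.Decode(v)
}

// RerankRequest is a hypothetical message exchanged with a Python module.
type RerankRequest struct {
	TenantID string   `json:"tenant_id"`
	Query    string   `json:"query"`
	DocIDs   []string `json:"doc_ids"`
}

func main() {
	codec := jsonCodec{}
	wire, err := codec.Marshal(RerankRequest{TenantID: "acme", Query: "q", DocIDs: []string{"d1"}})
	if err != nil {
		panic(err)
	}
	var out RerankRequest
	if err := codec.Unmarshal(wire, &out); err != nil {
		panic(err)
	}
	fmt.Println(string(wire), out.TenantID)

	// A payload carrying a field the struct does not declare is rejected.
	err = codec.Unmarshal([]byte(`{"tenant_id":"acme","unexpected":1}`), &out)
	fmt.Println("unknown field rejected:", err != nil)
}
```

Strict decoding is one way to keep a human-readable wire format without giving up early detection of contract mismatches; whether the project enforces it this way is an assumption.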

3. [Bastion-RAG 0] Virtual Emulation Simulation for Architectural Integrity

Through three architectural iterations, implementation demonstrated the necessity of validating configuration compliance and exception handling paths prior to codebase development. To integrate this approach into the development lifecycle, Bastion-RAG 0 was established as a proactive architectural auditing layer at the framework’s entry point.

When enterprise security policies and compliance constraints are defined, Bastion-RAG 0 emulates the pipeline’s event stream over a NATS event bus topology. This emulation runs without loading machine learning models or initializing concrete services.

Once a proposed pipeline configuration is provided, the Bastion-RAG 0 audit engine dynamically generates a real-time Virtual Ingestion Trace Log to analyze data lineage flows and identify potential architectural bottlenecks:

[Bastion-RAG 0: Virtual Emulator Ingestion Trace Log]

 - [emulator/Sentinel-IN]  INFO: Prompt input validated. Status: PASSED. Injection score: 0.05
 - [emulator/Vault-Phase1] INFO: Multi-strategy anonymization executed.
                               - Input matching: "Hong Gildong" -> KR_NAME_8f3d2a (PERSON)
                               - Input matching: "hong@naver.com" -> EMAIL_c3a91f (EMAIL)
 - [emulator/Navigator]    INFO: Executing structural pre-filtering isolation.
                               - Injected filters: tenant_id=acme, collections=[customer_docs]
                               - Evaluation: Metadata Pre-filtering rules generated successfully. ⭐
 - [emulator/Anchor-IN]    INFO: Differential noise injection executed. Sigma applied: 0.01
 - [emulator/Vault-Phase2] INFO: Evaluating selective detokenization via OPA policy rules.
 - [emulator/Sentinel-OUT] INFO: Grounding and hallucination checks completed. Status: PASSED.
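A dry-run trace like the one above can be produced without any model or service being started. The following is a minimal sketch under that assumption: each module is replaced by a small function that only reports what it would do for a given configuration. The stage type, the runTrace function, and the config keys are hypothetical, and the NATS transport is left out entirely.

```go
package main

import "fmt"

// stage is a hypothetical dry-run stand-in for a pipeline module: it reports
// what the module would do for a given config without loading any model.
type stage struct {
	name    string
	emulate func(cfg map[string]string) string
}

// runTrace walks the stages in order and returns one trace line per stage.
func runTrace(stages []stage, cfg map[string]string) []string {
	lines := make([]string, 0, len(stages))
	for _, s := range stages {
		lines = append(lines, fmt.Sprintf("[emulator/%s] INFO: %s", s.name, s.emulate(cfg)))
	}
	return lines
}

func main() {
	cfg := map[string]string{"tenant_id": "acme", "sigma": "0.01"}
	stages := []stage{
		{"Sentinel-IN", func(map[string]string) string { return "Prompt input validated." }},
		{"Navigator", func(c map[string]string) string { return "Injected filters: tenant_id=" + c["tenant_id"] }},
		{"Anchor-IN", func(c map[string]string) string { return "Sigma applied: " + c["sigma"] }},
	}
	for _, line := range runTrace(stages, cfg) {
		fmt.Println(line)
	}
}
```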

4. Architectural Principles and Hard Constraints for Total Isolation

The following hard architectural constraints of the Bastion-RAG framework were worked out in design debates with the AI assistant and codified in the project's core 01_architecture-principles.md document:

Core Functional Autonomy
- Each module must be designed as a self-contained, autonomous unit that delivers security value independently when integrated with an LLM. To ensure graceful degradation, the failure of peripheral modules must not disrupt or cause cascading errors in the primary data path.
Prohibition of Direct Coupling
- Modules must not instantiate or execute direct API calls to other modules. For example, the Navigator search layer does not maintain a reference to the Vault. It operates on a zero-trust data contract, where required user permission tokens are retrieved by the upstream orchestrator and included directly in the request payload.
Non-Invasive Observability Architecture
- The Tracker module, which aggregates system audit records and maps data lineage paths, must not introduce synchronous blocking or latency to the primary data path. It functions as a non-invasive observer, consuming asynchronous, fire-and-forget JSON event streams transmitted over a decoupled NATS message bus.
Search Isolation: Pre-filtering Requirement
- To mitigate security risks such as timing attacks and metadata leakage, the architecture prohibits post-filtering—the practice of fetching global vector matches and subsequently filtering out unauthorized records. Bastion-RAG requires pre-filtering isolation. Before a query executes against the HNSW vector graph, tenant_id and access control vectors must be bound directly to the vector query parameters, preventing unauthorized data from entering the compute space.
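The pre-filtering requirement can be illustrated with a small brute-force search that stands in for an HNSW index. In this sketch the tenant check runs before any similarity is computed, so vectors belonging to other tenants are never scored or ranked. The doc and hit types and the searchPreFiltered function are hypothetical; a real deployment would pass the filter to the vector store's query API instead.

```go
package main

import (
	"fmt"
	"sort"
)

type doc struct {
	ID       string
	TenantID string
	Vec      []float64
}

type hit struct {
	ID    string
	Score float64
}

// dot assumes both vectors have the same length.
func dot(a, b []float64) float64 {
	var s float64
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// searchPreFiltered scores only documents owned by tenantID. Documents of
// other tenants are skipped before any similarity is computed.
func searchPreFiltered(index []doc, tenantID string, query []float64, k int) []hit {
	var hits []hit
	for _, d := range index {
		if d.TenantID != tenantID {
			continue // filtered out before scoring, not after
		}
		hits = append(hits, hit{d.ID, dot(query, d.Vec)})
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	index := []doc{
		{"a1", "acme", []float64{1, 0}},
		{"g1", "globex", []float64{1, 0}},
		{"a2", "acme", []float64{0, 1}},
	}
	fmt.Println(searchPreFiltered(index, "acme", []float64{1, 0}, 5))
}
```

A post-filtering variant would score all three documents and drop g1 afterwards, which is exactly the pattern the constraint rules out.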

5. Technical Deep Dive: [Bastion-RAG 1 – Sentinel] Prompt Injection Defense

The Sentinel-IN gateway (validators/prompt/) stands at the absolute perimeter of the framework, intercepting raw incoming user strings to neutralize malicious instructions before they interact with downstream retrieval mechanics.

5.1 Compile-Time Detector Structure

The nucleus of the validation layer is the thread-safe Detector struct, instantiated exactly once at process initialization via engine.New(). To stay within the sub-millisecond budget, the engine compiles the configured rules into in-memory regular expressions once at startup rather than parsing rules dynamically at runtime.

// validators/prompt/detector.go

type Detector struct {
    cfg     config.PromptInjectionConfig
    regexes []*regexp.Regexp   // Compiled once at boot; index mirrors cfg.RegexRules
    scorer  ml.Scorer          // Thread-safe abstract interface; defaults to OnnxStub
}

By organizing compiled expressions within a contiguous slice ([]*regexp.Regexp), lookups execute with minimal CPU instruction overhead. The underlying machine learning classifier interface (ml.Scorer) is bound via a nil-safe abstraction, defaulting to a lightweight stub to prevent pipeline blockages during bootstrapping.
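A constructor along these lines could look as follows. This is a simplified sketch, not the project's engine.New(): the Rule type, the Scorer interface, and stubScorer are hypothetical stand-ins for config.PromptInjectionConfig, ml.Scorer, and OnnxStub, and the example pattern is invented for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Scorer is a stand-in for the ml.Scorer interface named in the article.
type Scorer interface {
	Score(text string) float64
}

// stubScorer plays the role of the default stub: it always returns 0.
type stubScorer struct{}

func (stubScorer) Score(string) float64 { return 0 }

// Rule is a hypothetical configuration entry.
type Rule struct {
	ID      string
	Pattern string
}

type Detector struct {
	rules   []Rule
	regexes []*regexp.Regexp // compiled once; index mirrors rules
	scorer  Scorer
}

// New compiles every pattern once and fails fast on an invalid rule.
// A nil scorer is replaced by the stub so callers never hit a nil interface.
func New(rules []Rule, scorer Scorer) (*Detector, error) {
	if scorer == nil {
		scorer = stubScorer{}
	}
	d := &Detector{rules: rules, scorer: scorer}
	for _, r := range rules {
		re, err := regexp.Compile(r.Pattern)
		if err != nil {
			return nil, fmt.Errorf("rule %s: %w", r.ID, err)
		}
		d.regexes = append(d.regexes, re)
	}
	return d, nil
}

func main() {
	d, err := New([]Rule{{"pi-001", `(?i)ignore\s+all\s+previous\s+instructions`}}, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(d.regexes), d.scorer.Score("hello"))

	_, err = New([]Rule{{"bad", `(`}}, nil)
	fmt.Println("invalid rule rejected:", err != nil)
}
```

Because a compiled *regexp.Regexp is safe for concurrent use, a Detector built this way can be shared across request goroutines without locking, as long as its fields are not mutated after construction.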

5.2 Four-Stage Synchronous Shield Pipeline

When a payload enters the ingress boundary, the Detector.Detect(query) engine routes the text through four deterministic execution stages sequentially, operating under a strict runtime budget of less than 1 millisecond.

Raw User Query
    │
    ▼
[Stage 1: Unicode NFC Normalization]  ← Canonicalizes equivalent Unicode sequences before matching
    │
    ├──► [Stage 2: Regex Engine]      ← Structural pattern enforcement (25 built-in rules)
    │
    ├──► [Stage 3: Keyword Engine]    ← Fast substring scanning (25 built-in rules)
    │
    └──► [Stage 4: ML Scorer (ONNX)]  ← Evaluates multi-token continuous probability
            │
            ▼
   [Score Aggregation]                ← Computes max() or weighted_avg of rule and ML scores
            │
    finalScore >= 0.7 ? ──────► [BLOCKED] (Terminates data path, throws HTTP 403, logs Incident)
            │
            └─────────────────► [PASSED] (Hands off sanitized string to Vault Phase-1)

Stage 1: Unicode NFC Normalization

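Adversaries may attempt to bypass string matchers using Cyrillic homoglyphs, mixed-script variations, or zero-width characters (such as U+200B) to obfuscate commands. As a first step, the gateway processes incoming text using norm.NFC.String(query), which maps canonically equivalent sequences (for example, a base letter followed by a combining accent) to a single composed form. NFC on its own does not fold cross-script homoglyphs or remove zero-width characters, so those cases need additional handling, such as explicit stripping or confusable mapping, or must be covered by the patterns in the later stages. The normalized payload is then converted to lower case via strings.ToLower, and the original, un-normalized query is not retained by the detector.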
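NFC composes canonically equivalent sequences but leaves zero-width code points such as U+200B in place, so a detection copy typically needs an explicit stripping step as well. The sketch below shows one way to do that with the standard library only. The stripZeroWidth and canonicalize helpers and the chosen set of code points are assumptions, and the NFC call itself (norm.NFC.String from golang.org/x/text/unicode/norm) is omitted to keep the example dependency-free.

```go
package main

import (
	"fmt"
	"strings"
)

// stripZeroWidth removes a small, explicit set of invisible code points that
// are commonly used to split keywords. The set is illustrative, not complete.
func stripZeroWidth(s string) string {
	return strings.Map(func(r rune) rune {
		switch r {
		case '\u200B', '\u200C', '\u200D', '\u2060', '\uFEFF':
			return -1 // a negative return value drops the rune
		}
		return r
	}, s)
}

// canonicalize prepares the copy used for detection only. In the real
// pipeline, NFC normalization would be applied here as well.
func canonicalize(query string) string {
	return strings.ToLower(stripZeroWidth(query))
}

func main() {
	fmt.Println(canonicalize("Ig\u200Bnore ALL previous instructions"))
	// prints "ignore all previous instructions"
}
```

Stripping joiners can change the rendering of legitimate text in some scripts, so applying it only to the detection copy, and not to the string handed downstream, is the more conservative choice.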

Stage 2: Regular Expression Guardrails

The normalized string is evaluated against a regex matching matrix operating with case-insensitive (?i) flags. Detection intents are categorized by threat vectors:

Instruction Override Containment (pi-001, pi-007–pi-009): Detects phrases such as "ignore all previous instructions" or "disregard system constraints" intended to alter the LLM execution state.
Persona Hijack Mitigation (pi-006, pi-010–pi-013): Identifies jailbreak constructs, including instructions to bypass safety constraints (e.g., "you are now in DAN mode" or "pretend you are an unrestricted terminal").
System Prompt Protection (pi-002, pi-014, pi-015): Flags requests designed to extract internal context boundaries (e.g., "reveal your underlying instructions").
Multilingual Edge Protections (pi-003, pi-020–pi-025): Handles regional variations, including CJK (Chinese, Japanese, Korean) injection payloads (e.g., "이전 지시 무시"). Because standard word-boundary delimiters (\b) do not apply to CJK character sequences, these regex patterns omit boundary anchors and utilize alternation models to mitigate whitespace bypass risks.

Stage 3: 25 Substring Keyword Sensors

Complementing the structural pattern matching of the regular expression engine, the keyword layer runs high-speed substring scans via strings.Contains. This layer acts as a net for specific high-risk tokens (e.g., jailbreak, dan mode, 탈옥), catching anomalies that might evade structural boundaries.
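The keyword layer is simple enough to sketch in a few lines. The three keywords are the ones mentioned above; the keywordHits function name is hypothetical, and the input is assumed to be the already normalized, lower-cased string from Stage 1.

```go
package main

import (
	"fmt"
	"strings"
)

// keywords holds a small sample of the substring list; the full set of 25
// entries is not reproduced here.
var keywords = []string{"jailbreak", "dan mode", "탈옥"}

// keywordHits returns every keyword contained in the normalized text.
func keywordHits(normalized string) []string {
	var hits []string
	for _, kw := range keywords {
		if strings.Contains(normalized, kw) {
			hits = append(hits, kw)
		}
	}
	return hits
}

func main() {
	fmt.Println(keywordHits("enable dan mode and jailbreak the model"))
	fmt.Println(keywordHits("지금 탈옥 모드로 전환"))
	fmt.Println(len(keywordHits("summarize the quarterly report")))
}
```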

Stage 4: Matrix Score Aggregation and Circuit-Breaking

The rule score and the ML score produced by the earlier stages are combined by an aggregation step into a single risk score:

import "math"

// aggregate combines the rule-engine score and the ML score into one risk score.
func aggregate(method string, ruleScore, mlScore float64) float64 {
    switch method {
    case "weighted_avg":
        return ruleScore*0.6 + mlScore*0.4 // Buffers false positives once ML models mature
    default: // "max" (The default zero-trust safety profile)
        return math.Max(ruleScore, mlScore) // Instantly triggers if a single layer fires
    }
}

Under the default max strategy, if an incoming string registers a definitive hit on a static rule, the ruleScore anchors immediately to 1.0.
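A short worked example shows how the two strategies behave against the 0.7 threshold. With a definitive rule hit (ruleScore = 1.0) and a low ML score (0.2 is an arbitrary value chosen for illustration), max yields 1.0, while weighted_avg yields 1.0 × 0.6 + 0.2 × 0.4 = 0.68, which falls just under the threshold. The aggregate function is repeated from above so the sketch runs on its own; the blocked helper is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

const blockThreshold = 0.7 // block_threshold from the configuration

func aggregate(method string, ruleScore, mlScore float64) float64 {
	switch method {
	case "weighted_avg":
		return ruleScore*0.6 + mlScore*0.4
	default: // "max"
		return math.Max(ruleScore, mlScore)
	}
}

// blocked reports whether the aggregated score reaches the threshold.
func blocked(method string, ruleScore, mlScore float64) bool {
	return aggregate(method, ruleScore, mlScore) >= blockThreshold
}

func main() {
	for _, method := range []string{"max", "weighted_avg"} {
		score := aggregate(method, 1.0, 0.2)
		fmt.Printf("%-12s score=%.2f blocked=%v\n", method, score, blocked(method, 1.0, 0.2))
	}
}
```

Under weighted_avg, a rule hit alone is therefore not enough to block unless the ML score is at least 0.25, which is worth keeping in mind before switching away from the default.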

If the aggregated index violates the security threshold (block_threshold: 0.7), the gateway revokes the request’s safety clearance, flags the status as BLOCKED, and completely severs the downstream execution path. To prevent exposing system internals, the engine suppresses specific regex indicators on the external wire, translating the threat footprint into a sanitized array of Rule IDs (e.g., ["pi-003", "pi-004"]) before publishing a non-blocking alert to the Tracker over the NATS event bus.
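The alert path can be sketched as a payload that carries only rule IDs plus a publish call that never blocks the request. In this sketch a buffered channel stands in for the NATS connection, and the Alert and publisher types are hypothetical. If the buffer is full, the event is dropped rather than delaying the primary data path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Alert carries rule IDs only; matched text and regex sources stay internal.
type Alert struct {
	Status  string   `json:"status"`
	RuleIDs []string `json:"rule_ids"`
}

// publisher is a hypothetical stand-in for the message bus client: a buffered
// channel that a background consumer would drain.
type publisher struct {
	ch chan []byte
}

// tryPublish never blocks. It returns false if the event could not be
// encoded or the buffer is full, in which case the event is dropped.
func (p *publisher) tryPublish(a Alert) bool {
	payload, err := json.Marshal(a)
	if err != nil {
		return false
	}
	select {
	case p.ch <- payload:
		return true
	default:
		return false
	}
}

func main() {
	p := &publisher{ch: make(chan []byte, 1)}
	fmt.Println(p.tryPublish(Alert{"BLOCKED", []string{"pi-003", "pi-004"}})) // true
	fmt.Println(p.tryPublish(Alert{"BLOCKED", []string{"pi-001"}}))           // false: buffer full
	fmt.Println(string(<-p.ch))
}
```

Dropping on overflow trades audit completeness for latency; a production setup might count dropped events so the loss is at least visible.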

6. Conclusion: The Real Value of an Unyielding AI Debating Partner

Designing the Bastion-RAG architecture required significant architectural iteration. A standard, decoupled microservice proxy configuration failed to meet sub-millisecond p95 latency targets and introduced indirect output security risks.

To resolve these issues, the development process utilized the AI assistant to test explicit execution budgets, hardware boundaries, and zero-trust constraints. This approach led to a hybrid polyglot pipeline that separates operational throughput from heavy machine learning tasks.

The initial design phase involved four weeks of prototyping. An early implementation version introduced a concurrency flaw, requiring a rollback and a two-week refactoring cycle to address modern LLM vulnerabilities, such as context window degradation and token exfiltration vectors.

This process was technically comparable to previous large-scale architectural refactoring projects, such as reducing a legacy codebase footprint to one-fifth of its size and converting a one-month manual regression cycle into a three-hour automated pipeline. Both experiences demonstrate that complex enterprise software security relies heavily on structured re-engineering and comprehensive automated validation.

Developing Bastion-RAG provided practical insight into collaborative engineering with LLMs. The project confirms that managing intelligent systems effectively depends on a developer’s mastery of software engineering fundamentals and strict architectural control.

By Mark
