Your text stored perfectly,
retrieved as flawless memory, your text.

Preserve human truth in embeddings. Evolve semantics with evals.

Bidirectional semantic coherence through a unified Metaschema Agent. Deterministic embeddings meet retrieval with zero drift—what you store is exactly what you find.

Get Early Access View Roadmap
94% Retrieval Quality
Precision@10 on test corpus
<200ms Search Speed
P95 latency, multi-layer
Zero Semantic Drift
EMBED ↔ RETRIEVE coherence
100% Provenance
Full citation & metadata
The Problem

Why Semantic Drift Happens

Most systems hand embedding and retrieval to different engines, different models, different teams. The result? Your search doesn't understand what your embeddings meant.

Take our Great Expectations discussion example:

EMBED INSTRUCTION
"The following is a literary analysis discussion about Pride and Prejudice, preserving character mentions and thematic assertions"

You configure your chunking, choose your model, embed your literary discussions. Done.

But then retrieval happens somewhere else:

TYPICAL RETRIEVAL (Different System)
Generic vector search with BM25 fallback. No understanding of "literary discussion", "character analysis", or "thematic assertions". Just cosine similarity on vectors.

The drift:

  • Embedding understood "literary discussion" context
  • Retrieval only sees "find similar vectors"
  • No shared intelligence about what "character development" or "thematic relevance" mean in your domain
  • When you change your embedding strategy, your retrieval doesn't adapt

Three Approaches, Same Fatal Flaw

Commodity APIs

OpenAI/AWS/Google give you embeddings. Pinecone/Weaviate give you search. But who understands your "literary analysis discussion" context?

Gap: Two different systems. No shared semantic understanding.

RAG Frameworks

LangChain/LlamaIndex let you build the coherence yourself. But maintaining that coherence across updates, across team members, across time?

Gap: DIY coherence maintenance. Drift creeps in with every refactor.

Search-as-a-Service

Opinionated UX, managed infrastructure. But their opinion isn't yours. Can they preserve "character development" vs. "plot summary" distinctions in your literary analysis discussions?

Gap: Their semantics, not yours. Black box quality.

The Solution

One Agent. One Schema. Both Paths.

The Metaschema Agent is not a configuration file. It's an intelligence that understands your data's semantics and governs both how you embed and how you retrieve.

The Metaschema Agent knows what both instructions mean

EMBED INSTRUCTION
"The following is a literary analysis discussion about Pride and Prejudice by Jane Austen"

The agent understands: "literary analysis" implies critical examination structure, "discussion" means preserve speaker attribution, "Great Expectations" signals that character and thematic continuity matter.

✓ It configures: conversation-aware chunking, speaker metadata preservation, story-beat layer

SAME AGENT
RETRIEVE INSTRUCTION
"The following is the latest exchange between our human and AI assistant, respond with prior assertions that are highly relevant to this latest exchange"

The same agent understands: "latest exchange" means recent turns, "prior assertions" means statements-of-fact (not questions), "highly relevant" uses the same semantic space it embedded with.

✓ It configures: recency weighting, assertion-vs-question filtering, hybrid retrieval with conversational context

Why bidirectional coherence is guaranteed:

  • Same intelligence: The Metaschema Agent that parsed your EMBED instruction also parses your RETRIEVE instruction
  • Same semantic model: What "conversation" means for chunking is what "latest exchange" means for retrieval
  • Same evolution: When you refine your EMBED instruction, the agent automatically updates retrieval strategy
  • Same evals: Tests run against the full embed→retrieve pipeline, not separate components
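Here is a minimal sketch of what this looks like from client code, reusing the illustrative enscribe_client SDK shown later in the Developer Experience section (the embed and search method names are assumptions, not the published API): both the write path and the read path reference one profile, so there is no second place for semantics to diverge.

from enscribe_client import Client

client = Client(api_key="your_key")

# One profile carries both the EMBED and RETRIEVE instructions.
PROFILE_ID = "great_expectations_analysis_v1"   # the profile built later in the Lab

# Write path: the Metaschema Agent applies the profile's EMBED semantics.
client.embed(
    profile_id=PROFILE_ID,
    documents=[{"speaker": "human",
                "content": "Why does Miss Havisham manipulate Estella to break Pip's heart?"}],
)

# Read path: the same profile (and the same agent) governs retrieval.
results = client.search(
    profile_id=PROFILE_ID,
    query="latest exchange about Estella's upbringing",
    top_k=10,
)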
Architecture

Unified Service, Unified Schema

Enscribe-Embed is a single Rust service that handles both embedding and search. The Metaschema Agent lives inside, ensuring every path through the system honors the same semantic understanding.

Data Locality = Semantic Coherence

No Cross-Service Hops

Your EMBED and RETRIEVE instructions both reach the same Metaschema Agent. No serialization, no API boundaries, no drift from network hops.

Cache Coherence by Design

When you embed new conversations, the agent immediately understands them for retrieval. Fresh vectors, instant query—no eventual consistency delays.

One Telemetry Stream

Embedding latency, search latency, cache hit rates—all in one unified view. Correlate performance across the full pipeline.

Benchmark Performance

Measured on 10K-document corpus with multi-layer search enabled

<120ms
P50 Query Latency
Median search response time
<200ms
P95 Query Latency
95% of queries complete in under 200ms
10K
Docs/Min Throughput
Embedding ingestion rate
Cost Optimization

Pay Once, Query Forever

Automatic change detection means you only pay for embeddings when content actually changes. Re-upload the same document? $0 cost. Update one paragraph? Pay only for the delta.

Traditional RAG: Pay Every Time You Ingest

Scenario: Weekly documentation updates
• 10,000 docs in knowledge base
• 5% actually change each week (500 docs)
• But you re-embed all 10,000 docs
Waste: $450/month on redundant embeddings

Enscribe: SHA256 Fingerprinting + Change Detection

How it works:

  1. Upload document → Enscribe computes SHA256 hash of content
  2. Check fingerprint → Compare with existing fingerprints in your profile
  3. Skip unchanged → If hash matches, embeddings already exist (cost: $0)
  4. Embed delta only → If changed, embed only the modified chunks
Same scenario with Enscribe:
• 10,000 docs total
• 500 changed docs detected automatically
• 9,500 skipped (fingerprint match)
Cost: $22.50/month (95% savings)
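The fingerprint check itself is easy to picture. A minimal sketch in Python of the client-visible logic (the `seen` store and per-document granularity are illustrative; per step 4 above, Enscribe also diffs at chunk level):

import hashlib

def fingerprint(content: str) -> str:
    # SHA256 over the raw document content
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def plan_ingestion(docs: dict[str, str], seen: dict[str, str]):
    """Split docs into (to_embed, skipped) by comparing content fingerprints."""
    to_embed, skipped = [], []
    for doc_id, content in docs.items():
        digest = fingerprint(content)
        if seen.get(doc_id) == digest:
            skipped.append(doc_id)      # unchanged: $0, stored embeddings are reused
        else:
            to_embed.append(doc_id)     # new or changed: embed only this delta
            seen[doc_id] = digest
    return to_embed, skipped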

Real-world ROI examples:

Legal/Compliance (10K contracts):
Daily ingestion, 1% daily change rate → Save $5,400/month
Customer Support (50K tickets):
Hourly ingestion, 0.5% hourly change rate → Save $22,000/month
Research Papers (100K documents):
Weekly updates, 2% weekly change → Save $13,500/month

Multi-Provider Routing: No Vendor Lock-In

Choose from 7 embedding providers—or switch between them—without changing your code. Enscribe's routing layer abstracts provider differences while preserving semantic coherence.

OpenAI
text-embedding-3-large
Anthropic
claude-3-embeddings
Mistral
mistral-embed
Cohere
embed-english-v3.0
Groq
Fast inference
Together AI
Open source models
Fireworks
Custom deployments

Why this matters:

  • Cost optimization: Route to cheapest provider for bulk ingestion, best quality for critical queries
  • Fallback resilience: If primary provider has outage, automatically failover to secondary
  • Domain specialization: Use medical embeddings for healthcare, legal for contracts, multilingual for global
  • Future-proof: New providers added without breaking your existing profiles
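A sketch of the fallback idea in isolation, with provider names taken from the list above (the embed_fn callable and the routing policy are placeholders, not Enscribe's routing API):

def embed_with_fallback(texts, providers, embed_fn):
    """Try providers in priority order; fail over on outage or rate limit."""
    last_error = None
    for provider in providers:            # e.g. ["openai", "mistral", "cohere"]
        try:
            return provider, embed_fn(provider, texts)
        except Exception as err:          # provider outage, 429, timeout, ...
            last_error = err
    raise RuntimeError(f"all providers failed: {last_error}")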
Multi-Layer Semantics

From Chunks to Story Beats

The Metaschema Agent doesn't just embed your conversations once. It creates multiple semantic layers, each serving a different aspect of your instructions.

For our Great Expectations discussion, the agent creates four layers:

LAYER 1: BASELINE

Deterministic Chunks (800 tokens, 160 overlap)

Pure factual coverage. Every word embedded, nothing lost. When your RETRIEVE instruction asks for "prior assertions", this layer guarantees we didn't miss any.

"Pip encounters Miss Havisham in her decaying mansion in Chapter 8..."
[Chunk boundary]
"...in Chapter 8: Miss Havisham, frozen in time since her wedding day..."
LAYER 2: STRUCTURAL

Speaker, Timestamp, Turn Metadata

Because your EMBED instruction said "conversation between human and AI assistant", the agent preserves who said what, when. Your RETRIEVE instruction's "latest exchange" needs this to understand recency and speaker attribution.

speaker: "human" | timestamp: 2025-11-01T14:32:15Z
turn_id: 47 | exchange_id: "great_expectations_session_003"
content: "Why does Miss Havisham manipulate Estella to break Pip's heart?"
LAYER 3: STORY-BEAT

Narrative Flow & Coherence

Your EMBED instruction mentioned "literary analysis"—the agent knows narratives have arcs, conflicts, resolutions. The story-beat layer chunks at dramatic boundaries, preserving plot coherence for retrieval.

beat: "revelation_of_benefactor"
narrative_arc: "Pip's illusions shattered by truth"
preceding_tension: "Pip believes Miss Havisham is his benefactor"
resolution_point: "Magwitch reveals he funded Pip's education"
LAYER 4: GRAPH (Q4 2026)

Claims & Relationships

Extract assertions as nodes, contradictions as edges. "Miss Havisham was jilted on her wedding day" becomes a retrievable claim, linked by an explicit relationship to "her obsessive manipulation of Estella".

claim: "Miss_Havisham_jilted_at_altar"
asserted_by: "AI assistant" | confidence: 0.95
related_claims: ["Estella_raised_to_break_hearts"]

Why multi-layer semantics prevents drift:

When your RETRIEVE instruction asks for "prior assertions highly relevant to this latest exchange", the Metaschema Agent queries across all layers simultaneously:

  • Baseline: Did we mention "Miss Havisham" before? (Keyword precision)
  • Structural: What did the human say vs. what did the AI say? (Speaker filtering)
  • Story-beat: Are we still in the "Pip's childhood" arc or have we moved to "revelation and redemption"? (Narrative context)
  • Graph (future): Which assertions haven't been contradicted or revised? (Claim validity)

Same agent configured all layers. Same agent queries them. Zero drift between encoding and decoding.
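Operationally, "queries across all layers simultaneously" can be pictured as a weighted merge of per-layer scores. A sketch (layer names mirror the ones above; the weights and the per-layer search interface are placeholders):

def retrieve(query, layers, weights=None):
    """Query every semantic layer and merge scores into one ranked list."""
    weights = weights or {"baseline": 0.5, "structural": 0.2, "story_beat": 0.3}
    merged = {}
    for name, index in layers.items():    # each layer exposes search(query) -> [(doc_id, score)]
        for doc_id, score in index.search(query):
            merged[doc_id] = merged.get(doc_id, 0.0) + weights.get(name, 0.0) * score
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)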

Enscribe Lab

Configure Your Profile Conversationally

You've seen how the Metaschema Agent achieves bidirectional coherence. Now see how you configure it: by talking to it about your data.

Building the Great Expectations Analysis Profile

YOU
"I'm building a conversational AI for literary analysis that helps users explore character development and themes in classic novels like Pride and Prejudice. The discussions involve detailed textual analysis. I need to retrieve prior assertions about characters and themes that are relevant to the current query, preserving analytical context and thematic continuity."
METASCHEMA AGENT
Got it. I'm analyzing your requirements:
  • "Literary analysis" → I'll preserve analytical structure and speaker attribution
  • "Pride and Prejudice" → Character and thematic coherence matter, I'll add story-beat layer
  • "Prior assertions" → Need to distinguish claims from questions
  • "Relevant to current query" → Thematic similarity + character relationships
  • "Preserving context" → Speaker metadata and textual citations in structural layer
I'm proposing a 3-layer profile:
METASCHEMA AGENT
profile_id: "great_expectations_analysis_v1"
EMBED instruction:
"The following is a literary analysis discussion about Pride and Prejudice by Jane Austen, preserving character mentions and thematic assertions"
RETRIEVE instruction:
"The following is the latest exchange between our human and AI assistant, respond with prior assertions that are highly relevant to this latest exchange"
Layers:
• baseline: chunk_size=800, overlap=160
• structural: preserve=[speaker, timestamp, turn_id]
• story_beat: model=narrative_v1, boundary_detection=auto
Retrieval strategy:
• hybrid: dense_weight=0.7, bm25_weight=0.3
• rerank: model=cross_encoder_v2, top_n=12
• recency_boost: decay_factor=0.85
• speaker_filter: allow=["human", "AI assistant"]
Projected cost: ~$0.08/1K conversations + ~$0.02/1K queries
Want to test this with sample data before deploying?
YOU
"Yes, let's test. Also—can you distinguish between character assertions and meta-discussion about the story? Sometimes the human asks 'what motivated this character?' vs. the AI asserting 'this character was motivated by X'."
METASCHEMA AGENT
Good catch. I'll add assertion classification to the structural layer:
utterance_type: ["assertion", "question", "speculation"]
assertion_confidence: [0.0 - 1.0]
When you retrieve "prior assertions", I'll filter to utterance_type="assertion" with confidence > 0.7. Questions like "what motivated..." won't pollute your results.
✓ Updated profile. Ready for preview evals.

What just happened:

1. You described your use case in plain language

No YAML, no configuration DSL. Just "I'm building X, I need Y."

2. The agent inferred both EMBED and RETRIEVE semantics

It understood that a conversational AI for literary analysis implies specific chunking, metadata, and retrieval strategies. Same intelligence for both paths.

3. You iterated conversationally

"Can you distinguish assertions from questions?" The agent added utterance_type classification—affecting both how it embeds (structural layer) and how it retrieves (filtering strategy).

4. Bidirectional coherence was maintained throughout

Every refinement updated both embedding and retrieval strategies. The schema stayed coherent because one agent owns both paths.

Power users: Full control when you need it

The conversational interface is a starting point. At any time, you can drop into the raw schema and tune every knob:

{
  "layers": {
    "baseline": {
      "chunker": "fixed_token",
      "chunk_size": 800,
      "overlap": 160,
      "preserve_sentence_boundaries": true
    },
    "structural": {
      "metadata_fields": ["speaker", "timestamp", "turn_id", "utterance_type"],
      "classification": {
        "utterance_type": {
          "model": "classifier_v1",
          "classes": ["assertion", "question", "speculation"]
        }
      }
    },
    "story_beat": {
      "model": "narrative_v1",
      "boundary_detection": "auto",
      "min_beat_tokens": 200,
      "preserve_arc_context": true
    }
  },
  "retrieval": {
    "strategy": "hybrid",
    "dense_weight": 0.7,
    "bm25_weight": 0.3,
    "rerank": { "model": "cross_encoder_v2", "top_n": 12 },
    "filters": {
      "utterance_type": ["assertion"],
      "min_confidence": 0.7
    },
    "recency_boost": { "decay_factor": 0.85 }
  }
}

Conversational when you want it. Programmatic when you need it. The Metaschema Agent understands both.
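For instance, the raw schema above could be pushed to the service directly. A sketch using the requests library (the /v1/profiles path is an illustrative assumption; consult the OpenAPI spec at api.enscribe.io for the real routes):

import json
import requests

with open("great_expectations_analysis_v1.json") as f:
    profile = json.load(f)                # the schema shown above

resp = requests.put(
    "https://api.enscribe.io/v1/profiles/great_expectations_analysis_v1",  # hypothetical route
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=profile,
    timeout=30,
)
resp.raise_for_status()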

Developer Experience

Type-Safe Clients in 50+ Languages

Full OpenAPI 3.0 specification means automatic SDK generation for any language. Get type-safe, documented client libraries without writing boilerplate.

From OpenAPI Spec to Production SDK in Minutes

Step 1: Download OpenAPI 3.0 Spec
curl https://api.enscribe.io/openapi.json -o enscribe-api.json
Step 2: Generate Client for Your Language
# Python
openapi-python-client generate --path enscribe-api.json

# TypeScript
npx openapi-typescript enscribe-api.json --output enscribe.ts

# Go
oapi-codegen -generate types,client enscribe-api.json > enscribe.go

# Rust
openapi-generator generate -i enscribe-api.json -g rust
Step 3: Type-Safe Integration
# Python with full type hints
from enscribe_client import Client, ProfileConfig, LayerConfig

client = Client(api_key="your_key")
profile = ProfileConfig(
  embed_instruction="...",
  retrieve_instruction="...",
  layers=[LayerConfig.baseline(), LayerConfig.story_beats()]
)

# IDE autocomplete, type checking, inline docs - all automatic

Supported Languages (via OpenAPI codegen):

• Python
• TypeScript
• JavaScript
• Go
• Rust
• Java
• Kotlin
• Swift
• C#
• PHP
• Ruby
• Scala
• Elixir
• Dart
• C++
...and 35+ more

Enterprise-Grade API Infrastructure

REST & gRPC APIs

Choose your protocol: RESTful JSON for simple integration, gRPC for high-throughput streaming embeddings.

REST: JSON requests/responses, HTTP/2, CORS support
gRPC: Binary protocol, bidirectional streaming, connection multiplexing

Rate Limiting & Quotas

Configurable per-tenant rate limits and monthly quotas ensure predictable costs and prevent runaway usage.

  • Per-tenant RPS limits: Configure max requests per second per API key
  • Monthly quotas: Set hard caps on storage tokens and compute-seconds
  • Graceful 429 responses: Clear error messages with retry-after headers
  • Dashboard visibility: Real-time usage metrics and quota alerts
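On the client side, honoring those 429 responses takes only a few lines. A sketch with the requests library (endpoint and payload shape are illustrative; Retry-After is the standard HTTP header referenced above):

import time
import requests

def post_with_retry(url, payload, api_key, max_attempts=5):
    """Retry on 429, sleeping for the server-provided Retry-After interval."""
    for attempt in range(max_attempts):
        resp = requests.post(
            url, json=payload,
            headers={"Authorization": f"Bearer {api_key}"}, timeout=30,
        )
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
    raise RuntimeError("rate limit not lifted after retries")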

Observability & Monitoring

Built-in telemetry endpoints expose metrics in Prometheus format. Integrate with your existing monitoring stack.

Metrics exposed: Request latency (p50/p95/p99), error rates, cache hit rates, embedding throughput, cost per request
Tracing: OpenTelemetry-compatible traces for request debugging
Eval-Native Evolution

Ship Semantics Like Code

You've configured the profile. You've seen how the Metaschema Agent maintains coherence. Now: How do you know it actually works? Evals.

Evals are the Contract, Not an Afterthought

Traditional RAG: Evals bolt on

You build your embedding pipeline. You build your search pipeline. Then you try to test them. But they were built separately, so evals test them separately. Precision@10 for search. Chunk quality for embeddings. No end-to-end coherence measure.

Enscribe: Evals are native

Because the Metaschema Agent owns both EMBED and RETRIEVE, evals test the full pipeline. Your eval suite answers: "Given this EMBED instruction and this RETRIEVE instruction, does the system preserve bidirectional coherence?"

✓ Evals run before every deploy
✓ Schema changes blocked if evals regress
✓ Shadow traffic tests production coherence
✓ Scheduled runs detect drift over time

Eval Results: great_expectations_analysis_v1

Here's what the eval suite tells us about our Great Expectations literary analysis profile:

precision_at_10

94%

What it tests: When we retrieve "prior assertions relevant to latest exchange", are the top 10 results actually relevant?

Test case: "Why does Miss Havisham manipulate Estella?"
Expected: Assertions about betrayal, revenge, Estella's upbringing
Retrieved: 9/10 correct (missed one tangential mention of wedding details)
✓ PASS (threshold: 90%)

This tests whether the Metaschema Agent's understanding of "relevant" matches human judgment.
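The metric itself is simple to state. A sketch of precision@k as used here (relevance labels come from the eval suite's human judgments; aggregation across queries is assumed to be a plain average):

def precision_at_k(retrieved_ids, relevant_ids, k=10):
    """Fraction of the top-k retrieved results judged relevant."""
    top_k = retrieved_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / max(len(top_k), 1)

# For the test case above, 9 of the top 10 results were relevant: 0.9 for that query.
# The 94% headline is the suite-level score.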

narrative_recall_v1

91%

What it tests: Does the story-beat layer preserve narrative continuity? Can we retrieve across plot arcs?

Test case: "How did the first contact resolution affect later alliance?"
Expected: Story beats from "first_contact_resolution" → "alliance_formation"
Retrieved: 10/11 narrative connections preserved
✓ PASS (threshold: 85%)

This tests whether the story-beat layer actually understands causality and narrative flow.

speaker_attribution_accuracy

100%

What it tests: Does the structural layer preserve "who said what" perfectly?

Test case: Retrieve all human assertions vs. all AI assertions
Expected: Zero speaker misattribution
Retrieved: 247/247 correctly attributed
✓ PASS (threshold: 100%)

This is non-negotiable. If we can't preserve "who said what", we fail the EMBED instruction.

utterance_classification_f1

0.89

What it tests: Does the agent correctly distinguish assertions from questions when filtering for "prior assertions"?

Test case: 150 utterances (50 assertions, 50 questions, 50 speculations)
Precision: 0.92 (few false positives)
Recall: 0.86 (missed some nuanced assertions)
F1: 0.89
✓ PASS (threshold: 0.80)

This refinement came from conversational iteration in the Lab. The eval proves it works.

p95_latency_ms

187ms

What it tests: With all layers (baseline, structural, story-beat) and hybrid retrieval + reranking, are we still fast?

Test corpus: 10K conversations (avg 15 turns each)
Query load: 1000 queries, 95th percentile measured
Result: 187ms (includes dense search + BM25 + rerank + filtering)
✓ PASS (threshold: <200ms)

Semantic richness doesn't mean slow. Unified architecture keeps latency competitive.

How evals enable fearless evolution

1 Pre-deploy blocking

Want to change your EMBED instruction? Update reranking weights? Add a new layer? Every change runs the full eval suite. If precision_at_10 drops below 90%, deploy blocks. No semantic regressions reach production.
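Conceptually, the gate is a small invariant check over the eval report. A sketch (thresholds mirror the ones listed above; the report format and runner hook are illustrative):

THRESHOLDS = {
    "precision_at_10": 0.90,
    "narrative_recall_v1": 0.85,
    "speaker_attribution_accuracy": 1.00,
    "utterance_classification_f1": 0.80,
}

def gate_deploy(eval_report: dict) -> None:
    """Block the deploy if any eval falls below its contractual threshold."""
    failures = {name: score for name, score in eval_report.items()
                if score < THRESHOLDS.get(name, 0.0)}
    if failures:
        raise SystemExit(f"deploy blocked, regressions: {failures}")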

2 Shadow traffic testing

Before promoting a new profile version to 100% traffic, run it in shadow mode: real queries go to both old and new profiles. Compare eval metrics on live data. Only promote if new version improves or maintains quality.

3 Scheduled drift detection

Evals run nightly against production profiles. If quality degrades (data distribution shift, model decay, etc.), you get alerted. Your literary analysis profile that scored 94% precision last month is scoring 89% this week? Time to investigate.

4 BYOE: Bring Your Own Evals

Using Braintrust? Custom harness? Your domain-specific quality metrics? Plug them into the Enscribe eval runner. Same pre-deploy blocking, same shadow testing, same scheduled runs. Your quality bar, our platform.

The Eval-Native Guarantee

If your evals pass, your profile ships. If they regress, deploys block. Your bidirectional coherence is tested, proven, and continuously monitored. No semantic drift reaches production.

Use Cases

Where Bidirectional Coherence Matters

We've followed the Great Expectations literary analysis thread. But profile-driven semantics with guaranteed coherence unlock many domains where fidelity and governance are non-negotiable.

⚖️

Regulated Industries: Legal, Healthcare, Finance

When decisions must be auditable and retrieval must preserve exact human statements—not AI interpretations or summaries—bidirectional coherence is mandatory.

EMBED instruction:
"The following is a clinical consultation between Dr. [NAME] and patient [ID] recorded on [DATE]. Preserve exact medical terminology, treatment recommendations, and patient consent statements."
RETRIEVE instruction:
"Given this patient's latest symptom report, retrieve all prior treatment recommendations from Dr. [NAME] with exact quotes, timestamps, and consultation context."

What the Metaschema Agent ensures:

  • Verbatim retrieval: "recommended treatment X" not "suggested a similar treatment"
  • Speaker attribution: Dr. NAME said it, not patient, not AI summary
  • Temporal accuracy: Recommendation from March 2024, not confused with June 2024 revision
  • Consent tracking: "Patient consented to Y" must be retrievable with legal precision
  • Audit trail: Every retrieval shows which embeddings were queried, which layers contributed
🤝

Knowledge Ops: Sales, Customer Success, Support

When you need to retrieve commitments, decisions, and promises with attribution to specific people in specific contexts, bidirectional coherence prevents the "who said that?" problem.

EMBED instruction:
"The following is a sales call between [SALES_REP] and [CLIENT_COMPANY]. Preserve commitments, pricing discussions, feature requests, and objections with speaker attribution."
RETRIEVE instruction:
"What specific commitments did [SALES_REP] make to [CLIENT_COMPANY] in Q3 2025 regarding delivery timelines? Show exact quotes with call dates."

What the Metaschema Agent ensures:

  • Commitment vs. discussion: "We will deliver by Q1" vs. "We might be able to consider Q1"
  • Speaker precision: Sales rep promised it, not client asked for it
  • Context preservation: "if you purchase enterprise tier" conditional not dropped
  • Objection tracking: "Client concerned about integration" retrieved for follow-up
📚

Creative Tools: Writing, Worldbuilding, Story Platforms

Our Great Expectations literary analysis fits here. When narrative coherence, character continuity, and plot causality matter, the story-beat layer shines.

EMBED instruction:
"The following is a chapter from an ongoing fantasy novel. Preserve character motivations, plot arcs, worldbuilding details, and narrative causality across drafts."
RETRIEVE instruction:
"Find all scenes where [CHARACTER] confronts their fear of magic, tracking how this motivation evolved from draft v1 to v3."

What the Metaschema Agent ensures:

  • Story-beat chunking: Chunks at scene/arc boundaries, not mid-dramatic-moment
  • Cross-draft continuity: Track how "fear of magic" motivation evolved across revisions
  • Causality preservation: "Character X's betrayal caused Y's transformation" relationship intact
  • Worldbuilding consistency: "Magic system rules established in Chapter 2" retrievable in Chapter 47
🔬

Research: Literature Review, Claim Verification

When you're analyzing thousands of papers, tracking claims, contradictions, and citation context, the graph layer (Q4 2026) becomes essential.

EMBED instruction (with graph layer):
"The following is a research paper abstract with claims about climate feedback loops. Extract claims as nodes, citation relationships as edges, contradictions between papers as explicit relationships."
RETRIEVE instruction:
"Find all claims about methane feedback loops in permafrost, showing which claims have been contradicted by subsequent papers and which remain uncontested."

What the Metaschema Agent ensures (with graph layer):

  • Claim extraction: "Permafrost methane contributes X Gt CO2-eq by 2050" as queryable node
  • Contradiction tracking: "Paper B contradicts Paper A's estimate" as explicit edge
  • Citation context: Not just "cited", but "supported by" vs. "challenged by"
  • Confidence scoring: Claim confidence based on citation network and contradiction analysis
💎

Why Human Truth Matters

Summarization is lossy. Paraphrasing introduces interpretation.
When stakes are high, you need exact words—not approximations.

Retrieval as Archaeology, Not Creativity

When a patient asks "What did my doctor say about this medication?", they need what the doctor said—word for word, with context. Not an AI's helpful paraphrase. Not a summary that drops the dosage caveat.

Enscribe treats retrieval as archaeology. We dig up what humans said, exactly as they said it.

Provenance is Non-Negotiable

Every retrieved result must answer:

  • Who said it? Speaker, author, role
  • When did they say it? Timestamp, meeting ID, document version
  • In what context? Conversation thread, chapter, consultation session
  • Exactly what words? Verbatim text, no lossy transformation

No black boxes. No "the AI found something similar". Full provenance, always.

Your Embeddings Carry Human Intent

When you tell the Metaschema Agent "preserve speaker attribution" or "track narrative causality", you're encoding human values into the embedding strategy.

Those values must survive the round-trip. Embed with intent → Retrieve with fidelity. Bidirectional coherence ensures your values don't get lost in translation.

We preserve human truth because we preserve human intent at every layer.

Competitive Analysis

Why Enscribe Wins on Coherence

We're not the only embedding or search solution. But we're the only one where bidirectional coherence is guaranteed by a single Metaschema Agent.

How Enscribe compares with Commodity APIs (OpenAI + Pinecone), RAG Frameworks (LangChain/LlamaIndex), and Search SaaS (specialized AI search):

  • Bidirectional schema (one agent, both paths): DIY elsewhere; built in with Enscribe
  • Conversational configuration: limited elsewhere; native in Enscribe
  • Multi-layer semantics (baseline → graph): manual elsewhere; automatic in Enscribe
  • Verbatim retrieval + full provenance: only if you build it elsewhere; default in Enscribe
  • Eval-native (tests = contract, pre-deploy blocking): bolt-on elsewhere; first-class in Enscribe
  • Deterministic metering (storage tokens + compute-seconds): opaque elsewhere; transparent in Enscribe
  • Zero semantic drift guarantee: your job elsewhere; guaranteed by the Metaschema Agent
Enterprise Security

Bank-Grade Security & Compliance

Multi-tenant isolation, end-to-end encryption, comprehensive audit trails, and SOC 2 Type II readiness. Your data never leaves your control.

🔐 End-to-End Encryption

Data at Rest: AES-256-GCM encryption for all stored embeddings, metadata, and profiles. Keys rotated quarterly via AWS KMS.
Data in Transit: TLS 1.3 only. No TLS 1.2 fallback. Perfect forward secrecy enforced.
Key Management: Customer-managed keys (BYOK) available for Enterprise tier. Hardware security module (HSM) backed.

🏢 Multi-Tenant Isolation

Logical Separation: Every tenant's profiles, embeddings, and metadata are logically isolated with tenant_id boundaries enforced at the database layer.
API Key Scoping: API keys are scoped to tenant + profile combinations. No cross-tenant data access possible.
VPC Isolation (Enterprise): Dedicated VPCs with AWS PrivateLink for zero public internet exposure.

📋 Audit Trails & Compliance

Immutable Audit Logs: Every API call logged with timestamp, tenant_id, user_id, IP address, request/response payloads (configurable). Retention: 1 year standard, 7 years for Enterprise.
Access Controls: Role-based access control (RBAC) with granular permissions. SSO/SAML 2.0, SCIM provisioning for Enterprise.
Data Residency: Choose US, EU, or Asia-Pacific regions. Data never leaves selected region.

Compliance Certifications (Planned Q3 2026):

✓ SOC 2 Type II
✓ GDPR Compliant
✓ HIPAA BAA Available
✓ ISO 27001
✓ PCI DSS (if processing payments)
✓ FedRAMP Moderate (Roadmap)

🛡️ Infrastructure Security

  • DDoS Protection: AWS Shield Advanced with automatic mitigation
  • WAF: AWS WAF with OWASP Top 10 protection, rate limiting, IP allowlisting
  • Secrets Management: AWS Secrets Manager with automatic rotation
  • Container Security: All images scanned for CVEs, signed with Notary v2
  • Network Segmentation: Zero-trust network architecture, least-privilege IAM policies
  • Disaster Recovery: Multi-AZ deployment, automated backups every 6 hours, 99.99% uptime SLA (Enterprise)
Planned Pricing

Transparent, Deterministic, Fair

Projected pricing upon launch: Based on documented benchmark results from our test infrastructure, we've set conservative pricing to ensure controlled capacity during launch.

Note on Pricing: All pricing projections are derived from actual performance metrics documented in our benchmark suite. We've intentionally set conservative estimates to keep the launch ramp controlled, ensuring we can deliver on quality and SLAs from day one. Early access will be capacity-controlled.

Starter

$29 /month
  • 3 embedding profiles
  • Shared compute pool
  • Baseline + structural layers
  • Community eval library
  • Self-service onboarding
Contact Us
POPULAR

Team

$299 /month
  • 20 embedding profiles
  • All layers (story-beats, graph)
  • Private eval runners (BYOE)
  • Per-profile analytics & cache
  • Email support
Contact Us

Enterprise

Custom
  • Unlimited profiles
  • SSO/SCIM, audit trails
  • VPC/PrivateLink deployment
  • Dedicated success engineer
  • 99.9% SLA
Contact Sales

Usage Metering

Projected rates based on documented benchmark performance

💾
Storage Tokens
Tokens embedded + vector storage
~$0.10 per 1M tokens/month
Compute-Seconds
Chunking, embedding, search, rerank
~$0.02 per vCPU-hour
🎁
Cache Dividend
Savings from avoided recompute
Reduces your bill

Why deterministic? The unified architecture means we know exactly what resources each query consumes. No black box pricing. Your bill is a function of documented resource usage.
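A back-of-the-envelope sketch of that function, using the projected rates above (the example volumes and the exact metering granularity are illustrative):

STORAGE_RATE = 0.10 / 1_000_000    # ~$0.10 per 1M stored tokens per month
COMPUTE_RATE = 0.02 / 3600         # ~$0.02 per vCPU-hour, metered in compute-seconds

def monthly_bill(stored_tokens: int, compute_seconds: float, cache_dividend: float = 0.0) -> float:
    """Deterministic bill: storage plus compute, minus savings from avoided recompute."""
    return stored_tokens * STORAGE_RATE + compute_seconds * COMPUTE_RATE - cache_dividend

# Example: 50M stored tokens and 20 vCPU-hours of chunking/search in a month
print(round(monthly_bill(50_000_000, 20 * 3600), 2))   # ≈ 5.40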

Roadmap

What's Next

Building the future of profile-driven semantic systems, one layer at a time.

✓ Q1 2026 - FOUNDATION
  • Enscribe-Embed service (unified write + read)
  • Metaschema Agent (bidirectional coherence)
  • Baseline + structural layers
  • Deterministic metering (storage + compute)
  • Basic Enscribe-Lab UI
Q2 2026 - PUBLIC LAUNCH 🚀
  • Conversational profile configuration
  • Story-beat layer (narrative chunking)
  • Hybrid retrieval + reranking
  • Cache dividend reporting
  • Early Access → General Availability
Q3 2026 - ECOSYSTEM & SCALE
  • Profile marketplace (community profiles)
  • BYOE SDK (bring your own evals)
  • Public benchmark participation (BEIR, MTEB)
  • Multi-region deployment (US, EU)
  • Enterprise features (SSO, audit, VPC)
Q4 2026 - ADVANCED REASONING
  • Graph layer (claims/relationships)
  • Explanation traces (layer contribution)
  • Auto-tuning based on eval trends
  • Federated profiles (cross-profile queries)

Ship Semantics Like Code

Preserve human truth. Evolve with evals. Be first to build profile-driven semantic systems.

Get Early Access View Roadmap
Early access for qualified developers • Launching Q2 2026