Your text stored perfectly,
retrieved as flawless memory, your text.

Preserve human truth in embeddings. Evolve semantics with evals.

Bidirectional semantic coherence through a unified Metaschema Agent. Deterministic embeddings meet retrieval with zero drift—what you store is exactly what you find.

Get Early Access View Roadmap
94% Retrieval Quality
Precision@10 on test corpus
<200ms Search Speed
P95 latency, multi-layer
Zero Semantic Drift
EMBED ↔ RETRIEVE coherence
100% Provenance
Full citation & metadata
The Problem

Why Semantic Drift Happens

Most systems hand embedding and retrieval to different engines, different models, different teams. The result? Your search doesn't understand what your embeddings meant.

Take our Great Expectations discussion example:

EMBED INSTRUCTION
"The following is a literary analysis discussion about Pride and Prejudice, preserving character mentions and thematic assertions"

You configure your chunking, choose your model, embed your literary discussions. Done.

But then retrieval happens somewhere else:

TYPICAL RETRIEVAL (Different System)
Generic vector search with BM25 fallback. No understanding of "literary discussion", "character analysis", or "thematic assertions". Just cosine similarity on vectors.

The drift:

  • Embedding understood "literary discussion" context
  • Retrieval only sees "find similar vectors"
  • No shared intelligence about what "character development" or "thematic relevance" mean in your domain
  • When you change your embedding strategy, your retrieval doesn't adapt

Three Approaches, Same Fatal Flaw

Commodity APIs

OpenAI/AWS/Google give you embeddings. Pinecone/Weaviate give you search. But who understands your "literary analysis discussion" context?

Gap: Two different systems. No shared semantic understanding.

RAG Frameworks

LangChain/LlamaIndex let you build the coherence yourself. But maintaining that coherence across updates, across team members, across time?

Gap: DIY coherence maintenance. Drift creeps in with every refactor.

Search-as-a-Service

Opinionated UX, managed infrastructure. But their opinion isn't yours. Can they preserve "character development" vs. "plot summary" distinctions in your literary analysis discussions?

Gap: Their semantics, not yours. Black box quality.

The Solution

One Agent. One Schema. Both Paths.

The Metaschema Agent is not a configuration file. It's an intelligence that understands your data's semantics and governs both how you embed and how you retrieve.

The Metaschema Agent knows what both instructions mean

EMBED INSTRUCTION
"The following is a literary analysis discussion about Pride and Prejudice by Jane Austen"

The agent understands: "literary analysis" implies critical examination structure, "discussion" means preserve speaker attribution, "Great Expectations" signals that character and thematic continuity matter.

✓ It configures: conversation-aware chunking, speaker metadata preservation, story-beat layer

SAME AGENT
RETRIEVE INSTRUCTION
"The following is the latest exchange between our human and AI assistant, respond with prior assertions that are highly relevant to this latest exchange"

The same agent understands: "latest exchange" means recent turns, "prior assertions" means statements-of-fact (not questions), "highly relevant" uses the same semantic space it embedded with.

✓ It configures: recency weighting, assertion-vs-question filtering, hybrid retrieval with conversational context

Why bidirectional coherence is guaranteed:

  • Same intelligence: The Metaschema Agent that parsed your EMBED instruction also parses your RETRIEVE instruction
  • Same semantic model: What "conversation" means for chunking is what "latest exchange" means for retrieval
  • Same evolution: When you refine your EMBED instruction, the agent automatically updates retrieval strategy
  • Same evals: Tests run against the full embed→retrieve pipeline, not separate components
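Here is a minimal sketch of what this looks like from client code, reusing the illustrative enscribe_client SDK shown later in the Developer Experience section (the embed and search method names are assumptions, not the published API): both the write path and the read path reference one profile, so there is no second place for semantics to diverge.

from enscribe_client import Client

client = Client(api_key="your_key")

# One profile carries both the EMBED and RETRIEVE instructions.
PROFILE_ID = "great_expectations_analysis_v1"   # the profile built later in the Lab

# Write path: the Metaschema Agent applies the profile's EMBED semantics.
client.embed(
    profile_id=PROFILE_ID,
    documents=[{"speaker": "human",
                "content": "Why does Miss Havisham manipulate Estella to break Pip's heart?"}],
)

# Read path: the same profile (and the same agent) governs retrieval.
results = client.search(
    profile_id=PROFILE_ID,
    query="latest exchange about Estella's upbringing",
    top_k=10,
)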
Architecture

Unified Service, Unified Schema

Enscribe-Embed is a single Rust service that handles both embedding and search. The Metaschema Agent lives inside, ensuring every path through the system honors the same semantic understanding.

Data Locality = Semantic Coherence

No Cross-Service Hops

Your EMBED and RETRIEVE instructions both reach the same Metaschema Agent. No serialization, no API boundaries, no drift from network hops.

Cache Coherence by Design

When you embed new conversations, the agent immediately understands them for retrieval. Fresh vectors, instant query—no eventual consistency delays.

One Telemetry Stream

Embedding latency, search latency, cache hit rates—all in one unified view. Correlate performance across the full pipeline.

Benchmark Performance

Measured on 10K-document corpus with multi-layer search enabled

<120ms
P50 Query Latency
Median search response time
<200ms
P95 Query Latency
95% of queries complete in under 200ms
10K
Docs/Min Throughput
Embedding ingestion rate
Cost Optimization

Pay Once, Query Forever

Automatic change detection means you only pay for embeddings when content actually changes. Re-upload the same document? $0 cost. Update one paragraph? Pay only for the delta.

Traditional RAG: Pay Every Time You Ingest

Scenario: Weekly documentation updates
• 10,000 docs in knowledge base
• 5% actually change each week (500 docs)
• But you re-embed all 10,000 docs
Waste: $450/month on redundant embeddings

Enscribe: SHA256 Fingerprinting + Change Detection

How it works:

  1. Upload document → Enscribe computes SHA256 hash of content
  2. Check fingerprint → Compare with existing fingerprints in your profile
  3. Skip unchanged → If hash matches, embeddings already exist (cost: $0)
  4. Embed delta only → If changed, embed only the modified chunks
Same scenario with Enscribe:
• 10,000 docs total
• 500 changed docs detected automatically
• 9,500 skipped (fingerprint match)
Cost: $22.50/month (95% savings)
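The fingerprint check itself is easy to picture. A minimal sketch in Python of the client-visible logic (the `seen` store and per-document granularity are illustrative; per step 4 above, Enscribe also diffs at chunk level):

import hashlib

def fingerprint(content: str) -> str:
    # SHA256 over the raw document content
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def plan_ingestion(docs: dict[str, str], seen: dict[str, str]):
    """Split docs into (to_embed, skipped) by comparing content fingerprints."""
    to_embed, skipped = [], []
    for doc_id, content in docs.items():
        digest = fingerprint(content)
        if seen.get(doc_id) == digest:
            skipped.append(doc_id)      # unchanged: $0, stored embeddings are reused
        else:
            to_embed.append(doc_id)     # new or changed: embed only this delta
            seen[doc_id] = digest
    return to_embed, skipped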

Real-world ROI examples:

Legal/Compliance (10K contracts):
Daily ingestion, 1% daily change rate → Save $5,400/month
Customer Support (50K tickets):
Hourly ingestion, 0.5% hourly change rate → Save $22,000/month
Research Papers (100K documents):
Weekly updates, 2% weekly change → Save $13,500/month

Multi-Provider Routing: No Vendor Lock-In

Choose from 7 embedding providers—or switch between them—without changing your code. Enscribe's routing layer abstracts provider differences while preserving semantic coherence.

OpenAI
text-embedding-3-large
Anthropic
claude-3-embeddings
Mistral
mistral-embed
Cohere
embed-english-v3.0
Groq
Fast inference
Together AI
Open source models
Fireworks
Custom deployments

Why this matters:

  • Cost optimization: Route to cheapest provider for bulk ingestion, best quality for critical queries
  • Fallback resilience: If primary provider has outage, automatically failover to secondary
  • Domain specialization: Use medical embeddings for healthcare, legal for contracts, multilingual for global
  • Future-proof: New providers added without breaking your existing profiles
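A sketch of the fallback idea in isolation, with provider names taken from the list above (the embed_fn callable and the routing policy are placeholders, not Enscribe's routing API):

def embed_with_fallback(texts, providers, embed_fn):
    """Try providers in priority order; fail over on outage or rate limit."""
    last_error = None
    for provider in providers:            # e.g. ["openai", "mistral", "cohere"]
        try:
            return provider, embed_fn(provider, texts)
        except Exception as err:          # provider outage, 429, timeout, ...
            last_error = err
    raise RuntimeError(f"all providers failed: {last_error}")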
Multi-Layer Semantics

From Chunks to Story Beats

The Metaschema Agent doesn't just embed your conversations once. It creates multiple semantic layers, each serving a different aspect of your instructions.

For our Great Expectations discussion, the agent creates four layers:

LAYER 1: BASELINE

Deterministic Chunks (800 tokens, 160 overlap)

Pure factual coverage. Every word embedded, nothing lost. When your RETRIEVE instruction asks for "prior assertions", this layer guarantees we didn't miss any.

"Pip encounters Miss Havisham in her decaying mansion in Chapter 8..."
[Chunk boundary]
"...in Chapter 8: Miss Havisham, frozen in time since her wedding day..."
LAYER 2: STRUCTURAL

Speaker, Timestamp, Turn Metadata

Because your EMBED instruction said "conversation between human and AI assistant", the agent preserves who said what, when. Your RETRIEVE instruction's "latest exchange" needs this to understand recency and speaker attribution.

speaker: "human" | timestamp: 2025-11-01T14:32:15Z
turn_id: 47 | exchange_id: "great_expectations_session_003"
content: "Why does Miss Havisham manipulate Estella to break Pip's heart?"
LAYER 3: STORY-BEAT

Narrative Flow & Coherence

Your EMBED instruction mentioned "literary analysis"—the agent knows narratives have arcs, conflicts, resolutions. The story-beat layer chunks at dramatic boundaries, preserving plot coherence for retrieval.

beat: "revelation_of_benefactor"
narrative_arc: "Pip's illusions shattered by truth"
preceding_tension: "Pip believes Miss Havisham is his benefactor"
resolution_point: "Magwitch reveals he funded Pip's education"
LAYER 4: GRAPH (Q4 2026)

Claims & Relationships

Extract assertions as nodes, contradictions as edges. "Miss Havisham was jilted on her wedding day" becomes a retrievable claim, linked by an explicit relationship to "her obsessive manipulation of Estella".

claim: "Miss_Havisham_jilted_at_altar"
asserted_by: "AI assistant" | confidence: 0.95
related_claims: ["Estella_raised_to_break_hearts"]

Why multi-layer semantics prevents drift:

When your RETRIEVE instruction asks for "prior assertions highly relevant to this latest exchange", the Metaschema Agent queries across all layers simultaneously:

  • Baseline: Did we mention "Miss Havisham" before? (Keyword precision)
  • Structural: What did the human say vs. what did the AI say? (Speaker filtering)
  • Story-beat: Are we still in the "Pip's childhood" arc or have we moved to "revelation and redemption"? (Narrative context)
  • Graph (future): Which assertions haven't been contradicted or revised? (Claim validity)

Same agent configured all layers. Same agent queries them. Zero drift between encoding and decoding.
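Operationally, "queries across all layers simultaneously" can be pictured as a weighted merge of per-layer scores. A sketch (layer names mirror the ones above; the weights and the per-layer search interface are placeholders):

def retrieve(query, layers, weights=None):
    """Query every semantic layer and merge scores into one ranked list."""
    weights = weights or {"baseline": 0.5, "structural": 0.2, "story_beat": 0.3}
    merged = {}
    for name, index in layers.items():    # each layer exposes search(query) -> [(doc_id, score)]
        for doc_id, score in index.search(query):
            merged[doc_id] = merged.get(doc_id, 0.0) + weights.get(name, 0.0) * score
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)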

Enscribe Lab

Configure Your Profile Conversationally

You've seen how the Metaschema Agent achieves bidirectional coherence. Now see how you configure it: by talking to it about your data.

Building the Great Expectations Analysis Profile

YOU
"I'm building a conversational AI for literary analysis that helps users explore character development and themes in classic novels like Pride and Prejudice. The discussions involve detailed textual analysis. I need to retrieve prior assertions about characters and themes that are relevant to the current query, preserving analytical context and thematic continuity."
METASCHEMA AGENT
Got it. I'm analyzing your requirements:
  • "Literary analysis" → I'll preserve analytical structure and speaker attribution
  • "Pride and Prejudice" → Character and thematic coherence matter, I'll add story-beat layer
  • "Prior assertions" → Need to distinguish claims from questions
  • "Relevant to current query" → Thematic similarity + character relationships
  • "Preserving context" → Speaker metadata and textual citations in structural layer
I'm proposing a 3-layer profile:
METASCHEMA AGENT
profile_id: "great_expectations_analysis_v1"
EMBED instruction:
"The following is a literary analysis discussion about Pride and Prejudice by Jane Austen, preserving character mentions and thematic assertions"
RETRIEVE instruction:
"The following is the latest exchange between our human and AI assistant, respond with prior assertions that are highly relevant to this latest exchange"
Layers:
• baseline: chunk_size=800, overlap=160
• structural: preserve=[speaker, timestamp, turn_id]
• story_beat: model=narrative_v1, boundary_detection=auto
Retrieval strategy:
• hybrid: dense_weight=0.7, bm25_weight=0.3
• rerank: model=cross_encoder_v2, top_n=12
• recency_boost: decay_factor=0.85
• speaker_filter: allow=["human", "AI assistant"]
Projected cost: ~$0.08/1K conversations + ~$0.02/1K queries
Want to test this with sample data before deploying?
YOU
"Yes, let's test. Also—can you distinguish between character assertions and meta-discussion about the story? Sometimes the human asks 'what motivated this character?' vs. the AI asserting 'this character was motivated by X'."
METASCHEMA AGENT
Good catch. I'll add assertion classification to the structural layer:
utterance_type: ["assertion", "question", "speculation"]
assertion_confidence: [0.0 - 1.0]
When you retrieve "prior assertions", I'll filter to utterance_type="assertion" with confidence > 0.7. Questions like "what motivated..." won't pollute your results.
✓ Updated profile. Ready for preview evals.

What just happened:

1. You described your use case in plain language

No YAML, no configuration DSL. Just "I'm building X, I need Y."

2. The agent inferred both EMBED and RETRIEVE semantics

It understood that a conversational AI for literary analysis implies specific chunking, metadata, and retrieval strategies. Same intelligence for both paths.

3. You iterated conversationally

"Can you distinguish assertions from questions?" The agent added utterance_type classification—affecting both how it embeds (structural layer) and how it retrieves (filtering strategy).

4. Bidirectional coherence was maintained throughout

Every refinement updated both embedding and retrieval strategies. The schema stayed coherent because one agent owns both paths.

Power users: Full control when you need it

The conversational interface is a starting point. At any time, you can drop into the raw schema and tune every knob:

{
  "layers": {
    "baseline": {
      "chunker": "fixed_token",
      "chunk_size": 800,
      "overlap": 160,
      "preserve_sentence_boundaries": true
    },
    "structural": {
      "metadata_fields": ["speaker", "timestamp", "turn_id", "utterance_type"],
      "classification": {
        "utterance_type": {
          "model": "classifier_v1",
          "classes": ["assertion", "question", "speculation"]
        }
      }
    },
    "story_beat": {
      "model": "narrative_v1",
      "boundary_detection": "auto",
      "min_beat_tokens": 200,
      "preserve_arc_context": true
    }
  },
  "retrieval": {
    "strategy": "hybrid",
    "dense_weight": 0.7,
    "bm25_weight": 0.3,
    "rerank": { "model": "cross_encoder_v2", "top_n": 12 },
    "filters": {
      "utterance_type": ["assertion"],
      "min_confidence": 0.7
    },
    "recency_boost": { "decay_factor": 0.85 }
  }
}

Conversational when you want it. Programmatic when you need it. The Metaschema Agent understands both.
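For instance, the raw schema above could be pushed to the service directly. A sketch using the requests library (the /v1/profiles path is an illustrative assumption; consult the OpenAPI spec at api.enscribe.io for the real routes):

import json
import requests

with open("great_expectations_analysis_v1.json") as f:
    profile = json.load(f)                # the schema shown above

resp = requests.put(
    "https://api.enscribe.io/v1/profiles/great_expectations_analysis_v1",  # hypothetical route
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=profile,
    timeout=30,
)
resp.raise_for_status()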

Developer Experience

Type-Safe Clients in 50+ Languages

Full OpenAPI 3.0 specification means automatic SDK generation for any language. Get type-safe, documented client libraries without writing boilerplate.

From OpenAPI Spec to Production SDK in Minutes

Step 1: Download OpenAPI 3.0 Spec
curl https://api.enscribe.io/openapi.json -o enscribe-api.json
Step 2: Generate Client for Your Language
# Python
openapi-python-client generate --path enscribe-api.json

# TypeScript
npx openapi-typescript enscribe-api.json --output enscribe.ts

# Go
oapi-codegen -generate types,client enscribe-api.json > enscribe.go

# Rust
openapi-generator generate -i enscribe-api.json -g rust
Step 3: Type-Safe Integration
# Python with full type hints
from enscribe_client import Client, ProfileConfig, LayerConfig

client = Client(api_key="your_key")
profile = ProfileConfig(
  embed_instruction="...",
  retrieve_instruction="...",
  layers=[LayerConfig.baseline(), LayerConfig.story_beats()]
)

# IDE autocomplete, type checking, inline docs - all automatic

Supported Languages (via OpenAPI codegen):

• Python
• TypeScript
• JavaScript
• Go
• Rust
• Java
• Kotlin
• Swift
• C#
• PHP
• Ruby
• Scala
• Elixir
• Dart
• C++
...and 35+ more

Enterprise-Grade API Infrastructure

REST & gRPC APIs

Choose your protocol: RESTful JSON for simple integration, gRPC for high-throughput streaming embeddings.

REST: JSON requests/responses, HTTP/2, CORS support
gRPC: Binary protocol, bidirectional streaming, connection multiplexing

Rate Limiting & Quotas

Configurable per-tenant rate limits and monthly quotas ensure predictable costs and prevent runaway usage.

  • Per-tenant RPS limits: Configure max requests per second per API key
  • Monthly quotas: Set hard caps on storage tokens and compute-seconds
  • Graceful 429 responses: Clear error messages with retry-after headers
  • Dashboard visibility: Real-time usage metrics and quota alerts
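On the client side, honoring those 429 responses takes only a few lines. A sketch with the requests library (endpoint and payload shape are illustrative; Retry-After is the standard HTTP header referenced above):

import time
import requests

def post_with_retry(url, payload, api_key, max_attempts=5):
    """Retry on 429, sleeping for the server-provided Retry-After interval."""
    for attempt in range(max_attempts):
        resp = requests.post(
            url, json=payload,
            headers={"Authorization": f"Bearer {api_key}"}, timeout=30,
        )
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
    raise RuntimeError("rate limit not lifted after retries")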

Observability & Monitoring

Built-in telemetry endpoints expose metrics in Prometheus format. Integrate with your existing monitoring stack.

Metrics exposed: Request latency (p50/p95/p99), error rates, cache hit rates, embedding throughput, cost per request
Tracing: OpenTelemetry-compatible traces for request debugging
Eval-Native Evolution

Ship Semantics Like Code

You've configured the profile. You've seen how the Metaschema Agent maintains coherence. Now: How do you know it actually works? Evals.

Evals are the Contract, Not an Afterthought

Traditional RAG: Evals bolt on

You build your embedding pipeline. You build your search pipeline. Then you try to test them. But they were built separately, so evals test them separately. Precision@10 for search. Chunk quality for embeddings. No end-to-end coherence measure.

Enscribe: Evals are native

Because the Metaschema Agent owns both EMBED and RETRIEVE, evals test the full pipeline. Your eval suite answers: "Given this EMBED instruction and this RETRIEVE instruction, does the system preserve bidirectional coherence?"

✓ Evals run before every deploy
✓ Schema changes blocked if evals regress
✓ Shadow traffic tests production coherence
✓ Scheduled runs detect drift over time

Eval Results: great_expectations_analysis_v1

Here's what the eval suite tells us about our Great Expectations literary analysis profile:

precision_at_10

94%

What it tests: When we retrieve "prior assertions relevant to latest exchange", are the top 10 results actually relevant?

Test case: "Why does Miss Havisham manipulate Estella?"
Expected: Assertions about betrayal, revenge, Estella's upbringing
Retrieved: 9/10 correct (missed one tangential mention of wedding details)
✓ PASS (threshold: 90%)

This tests whether the Metaschema Agent's understanding of "relevant" matches human judgment.
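The metric itself is simple to state. A sketch of precision@k as used here (relevance labels come from the eval suite's human judgments; aggregation across queries is assumed to be a plain average):

def precision_at_k(retrieved_ids, relevant_ids, k=10):
    """Fraction of the top-k retrieved results judged relevant."""
    top_k = retrieved_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / max(len(top_k), 1)

# For the test case above, 9 of the top 10 results were relevant: 0.9 for that query.
# The 94% headline is the suite-level score.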

narrative_recall_v1

91%

What it tests: Does the story-beat layer preserve narrative continuity? Can we retrieve across plot arcs?

Test case: "How did the first contact resolution affect later alliance?"
Expected: Story beats from "first_contact_resolution" → "alliance_formation"
Retrieved: 10/11 narrative connections preserved
✓ PASS (threshold: 85%)

This tests whether the story-beat layer actually understands causality and narrative flow.

speaker_attribution_accuracy

100%

What it tests: Does the structural layer preserve "who said what" perfectly?

Test case: Retrieve all human assertions vs. all AI assertions
Expected: Zero speaker misattribution
Retrieved: 247/247 correctly attributed
✓ PASS (threshold: 100%)

This is non-negotiable. If we can't preserve "who said what", we fail the EMBED instruction.

utterance_classification_f1

0.89

What it tests: Does the agent correctly distinguish assertions from questions when filtering for "prior assertions"?

Test case: 150 utterances (50 assertions, 50 questions, 50 speculations)
Precision: 0.92 (few false positives)
Recall: 0.86 (missed some nuanced assertions)
F1: 0.89
✓ PASS (threshold: 0.80)

This refinement came from conversational iteration in the Lab. The eval proves it works.

p95_latency_ms

187ms

What it tests: With all layers (baseline, structural, story-beat) and hybrid retrieval + reranking, are we still fast?

Test corpus: 10K conversations (avg 15 turns each)
Query load: 1000 queries, 95th percentile measured
Result: 187ms (includes dense search + BM25 + rerank + filtering)
✓ PASS (threshold: <200ms)

Semantic richness doesn't mean slow. Unified architecture keeps latency competitive.

How evals enable fearless evolution

1 Pre-deploy blocking

Want to change your EMBED instruction? Update reranking weights? Add a new layer? Every change runs the full eval suite. If precision_at_10 drops below 90%, deploy blocks. No semantic regressions reach production.
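Conceptually, the gate is a small invariant check over the eval report. A sketch (thresholds mirror the ones listed above; the report format and runner hook are illustrative):

THRESHOLDS = {
    "precision_at_10": 0.90,
    "narrative_recall_v1": 0.85,
    "speaker_attribution_accuracy": 1.00,
    "utterance_classification_f1": 0.80,
}

def gate_deploy(eval_report: dict) -> None:
    """Block the deploy if any eval falls below its contractual threshold."""
    failures = {name: score for name, score in eval_report.items()
                if score < THRESHOLDS.get(name, 0.0)}
    if failures:
        raise SystemExit(f"deploy blocked, regressions: {failures}")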

2 Shadow traffic testing

Before promoting a new profile version to 100% traffic, run it in shadow mode: real queries go to both old and new profiles. Compare eval metrics on live data. Only promote if new version improves or maintains quality.

3 Scheduled drift detection

Evals run nightly against production profiles. If quality degrades (data distribution shift, model decay, etc.), you get alerted. Your literary analysis profile that scored 94% precision last month is scoring 89% this week? Time to investigate.

4 BYOE: Bring Your Own Evals

Using Braintrust? Custom harness? Your domain-specific quality metrics? Plug them into the Enscribe eval runner. Same pre-deploy blocking, same shadow testing, same scheduled runs. Your quality bar, our platform.

The Eval-Native Guarantee

If your evals pass, your profile ships. If they regress, deploys block. Your bidirectional coherence is tested, proven, and continuously monitored. No semantic drift reaches production.

Use Cases

Where Bidirectional Coherence Matters

We've followed the Great Expectations literary analysis thread. But profile-driven semantics with guaranteed coherence unlock many domains where fidelity and governance are non-negotiable.

⚖️

Regulated Industries: Legal, Healthcare, Finance

When decisions must be auditable and retrieval must preserve exact human statements—not AI interpretations or summaries—bidirectional coherence is mandatory.

EMBED instruction:
"The following is a clinical consultation between Dr. [NAME] and patient [ID] recorded on [DATE]. Preserve exact medical terminology, treatment recommendations, and patient consent statements."
RETRIEVE instruction:
"Given this patient's latest symptom report, retrieve all prior treatment recommendations from Dr. [NAME] with exact quotes, timestamps, and consultation context."

What the Metaschema Agent ensures:

  • Verbatim retrieval: "recommended treatment X" not "suggested a similar treatment"
  • Speaker attribution: Dr. NAME said it, not patient, not AI summary
  • Temporal accuracy: Recommendation from March 2024, not confused with June 2024 revision
  • Consent tracking: "Patient consented to Y" must be retrievable with legal precision
  • Audit trail: Every retrieval shows which embeddings were queried, which layers contributed
🤝

Knowledge Ops: Sales, Customer Success, Support

When you need to retrieve commitments, decisions, and promises with attribution to specific people in specific contexts, bidirectional coherence prevents the "who said that?" problem.

EMBED instruction:
"The following is a sales call between [SALES_REP] and [CLIENT_COMPANY]. Preserve commitments, pricing discussions, feature requests, and objections with speaker attribution."
RETRIEVE instruction:
"What specific commitments did [SALES_REP] make to [CLIENT_COMPANY] in Q3 2025 regarding delivery timelines? Show exact quotes with call dates."

What the Metaschema Agent ensures:

  • Commitment vs. discussion: "We will deliver by Q1" vs. "We might be able to consider Q1"
  • Speaker precision: Sales rep promised it, not client asked for it
  • Context preservation: "if you purchase enterprise tier" conditional not dropped
  • Objection tracking: "Client concerned about integration" retrieved for follow-up
📚

Creative Tools: Writing, Worldbuilding, Story Platforms

Our Great Expectations literary analysis fits here. When narrative coherence, character continuity, and plot causality matter, the story-beat layer shines.

EMBED instruction:
"The following is a chapter from an ongoing fantasy novel. Preserve character motivations, plot arcs, worldbuilding details, and narrative causality across drafts."
RETRIEVE instruction:
"Find all scenes where [CHARACTER] confronts their fear of magic, tracking how this motivation evolved from draft v1 to v3."

What the Metaschema Agent ensures:

  • Story-beat chunking: Chunks at scene/arc boundaries, not mid-dramatic-moment
  • Cross-draft continuity: Track how "fear of magic" motivation evolved across revisions
  • Causality preservation: "Character X's betrayal caused Y's transformation" relationship intact
  • Worldbuilding consistency: "Magic system rules established in Chapter 2" retrievable in Chapter 47
🔬

Research: Literature Review, Claim Verification

When you're analyzing thousands of papers, tracking claims, contradictions, and citation context, the graph layer (Q4 2026) becomes essential.

EMBED instruction (with graph layer):
"The following is a research paper abstract with claims about climate feedback loops. Extract claims as nodes, citation relationships as edges, contradictions between papers as explicit relationships."
RETRIEVE instruction:
"Find all claims about methane feedback loops in permafrost, showing which claims have been contradicted by subsequent papers and which remain uncontested."

What the Metaschema Agent ensures (with graph layer):

  • Claim extraction: "Permafrost methane contributes X Gt CO2-eq by 2050" as queryable node
  • Contradiction tracking: "Paper B contradicts Paper A's estimate" as explicit edge
  • Citation context: Not just "cited", but "supported by" vs. "challenged by"
  • Confidence scoring: Claim confidence based on citation network and contradiction analysis
💎

Why Human Truth Matters

Summarization is lossy. Paraphrasing introduces interpretation.
When stakes are high, you need exact words—not approximations.

Retrieval as Archaeology, Not Creativity

When a patient asks "What did my doctor say about this medication?", they need what the doctor said—word for word, with context. Not an AI's helpful paraphrase. Not a summary that drops the dosage caveat.

Enscribe treats retrieval as archaeology. We dig up what humans said, exactly as they said it.

Provenance is Non-Negotiable

Every retrieved result must answer:

  • Who said it? Speaker, author, role
  • When did they say it? Timestamp, meeting ID, document version
  • In what context? Conversation thread, chapter, consultation session
  • Exactly what words? Verbatim text, no lossy transformation

No black boxes. No "the AI found something similar". Full provenance, always.

Your Embeddings Carry Human Intent

When you tell the Metaschema Agent "preserve speaker attribution" or "track narrative causality", you're encoding human values into the embedding strategy.

Those values must survive the round-trip. Embed with intent → Retrieve with fidelity. Bidirectional coherence ensures your values don't get lost in translation.

We preserve human truth because we preserve human intent at every layer.

Competitive Analysis

Why Enscribe Wins on Coherence

We're not the only embedding or search solution. But we're the only one where bidirectional coherence is guaranteed by a single Metaschema Agent.

How Enscribe compares with Commodity APIs (OpenAI + Pinecone), RAG Frameworks (LangChain/LlamaIndex), and Search SaaS (specialized AI search):

  • Bidirectional schema (one agent, both paths): DIY elsewhere; built in with Enscribe
  • Conversational configuration: limited elsewhere; native in Enscribe
  • Multi-layer semantics (baseline → graph): manual elsewhere; automatic in Enscribe
  • Verbatim retrieval + full provenance: only if you build it elsewhere; default in Enscribe
  • Eval-native (tests = contract, pre-deploy blocking): bolt-on elsewhere; first-class in Enscribe
  • Deterministic metering (storage tokens + compute-seconds): opaque elsewhere; transparent in Enscribe
  • Zero semantic drift guarantee: your job elsewhere; guaranteed by the Metaschema Agent
Enterprise Security

Bank-Grade Security & Compliance

Multi-tenant isolation, end-to-end encryption, comprehensive audit trails, and SOC 2 Type II readiness. Your data never leaves your control.

🔐 End-to-End Encryption

Data at Rest: AES-256-GCM encryption for all stored embeddings, metadata, and profiles. Keys rotated quarterly via AWS KMS.
Data in Transit: TLS 1.3 only. No TLS 1.2 fallback. Perfect forward secrecy enforced.
Key Management: Customer-managed keys (BYOK) available for Enterprise tier. Hardware security module (HSM) backed.

🏢 Multi-Tenant Isolation

Logical Separation: Every tenant's profiles, embeddings, and metadata are logically isolated with tenant_id boundaries enforced at the database layer.
API Key Scoping: API keys are scoped to tenant + profile combinations. No cross-tenant data access possible.
VPC Isolation (Enterprise): Dedicated VPCs with AWS PrivateLink for zero public internet exposure.

📋 Audit Trails & Compliance

Immutable Audit Logs: Every API call logged with timestamp, tenant_id, user_id, IP address, request/response payloads (configurable). Retention: 1 year standard, 7 years for Enterprise.
Access Controls: Role-based access control (RBAC) with granular permissions. SSO/SAML 2.0, SCIM provisioning for Enterprise.
Data Residency: Choose US, EU, or Asia-Pacific regions. Data never leaves selected region.

Compliance Certifications (Planned Q3 2026):

✓ SOC 2 Type II
✓ GDPR Compliant
✓ HIPAA BAA Available
✓ ISO 27001
✓ PCI DSS (if processing payments)
✓ FedRAMP Moderate (Roadmap)

🛡️ Infrastructure Security

  • DDoS Protection: AWS Shield Advanced with automatic mitigation
  • WAF: AWS WAF with OWASP Top 10 protection, rate limiting, IP allowlisting
  • Secrets Management: AWS Secrets Manager with automatic rotation
  • Container Security: All images scanned for CVEs, signed with Notary v2
  • Network Segmentation: Zero-trust network architecture, least-privilege IAM policies
  • Disaster Recovery: Multi-AZ deployment, automated backups every 6 hours, 99.99% uptime SLA (Enterprise)
Planned Pricing

Transparent, Deterministic, Fair

Projected pricing upon launch: Based on documented benchmark results from our test infrastructure, we've set conservative pricing to ensure controlled capacity during launch.

Note on Pricing: All pricing projections are derived from actual performance metrics documented in our benchmark suite. We've intentionally set conservative estimates to keep the launch ramp controlled, ensuring we can deliver on quality and SLAs from day one. Early access will be capacity-controlled.

Starter

$29 /month
  • 3 embedding profiles
  • Shared compute pool
  • Baseline + structural layers
  • Community eval library
  • Self-service onboarding
Contact Us
POPULAR

Team

$299 /month
  • 20 embedding profiles
  • All layers (story-beats, graph)
  • Private eval runners (BYOE)
  • Per-profile analytics & cache
  • Email support
Contact Us

Enterprise

Custom
  • Unlimited profiles
  • SSO/SCIM, audit trails
  • VPC/PrivateLink deployment
  • Dedicated success engineer
  • 99.9% SLA
Contact Sales

Usage Metering

Projected rates based on documented benchmark performance

💾
Storage Tokens
Tokens embedded + vector storage
~$0.10 per 1M tokens/month
Compute-Seconds
Chunking, embedding, search, rerank
~$0.02 per vCPU-hour
🎁
Cache Dividend
Savings from avoided recompute
Reduces your bill

Why deterministic? The unified architecture means we know exactly what resources each query consumes. No black box pricing. Your bill is a function of documented resource usage.
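A back-of-the-envelope sketch of that function, using the projected rates above (the example volumes and the exact metering granularity are illustrative):

STORAGE_RATE = 0.10 / 1_000_000    # ~$0.10 per 1M stored tokens per month
COMPUTE_RATE = 0.02 / 3600         # ~$0.02 per vCPU-hour, metered in compute-seconds

def monthly_bill(stored_tokens: int, compute_seconds: float, cache_dividend: float = 0.0) -> float:
    """Deterministic bill: storage plus compute, minus savings from avoided recompute."""
    return stored_tokens * STORAGE_RATE + compute_seconds * COMPUTE_RATE - cache_dividend

# Example: 50M stored tokens and 20 vCPU-hours of chunking/search in a month
print(round(monthly_bill(50_000_000, 20 * 3600), 2))   # ≈ 5.40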

Roadmap

What's Next

Building the future of profile-driven semantic systems, one layer at a time.

✓ Q1 2026 - FOUNDATION
  • Enscribe-Embed service (unified write + read)
  • Metaschema Agent (bidirectional coherence)
  • Baseline + structural layers
  • Deterministic metering (storage + compute)
  • Basic Enscribe-Lab UI
Q2 2026 - PUBLIC LAUNCH 🚀
  • Conversational profile configuration
  • Story-beat layer (narrative chunking)
  • Hybrid retrieval + reranking
  • Cache dividend reporting
  • Early Access → General Availability
Q3 2026 - ECOSYSTEM & SCALE
  • Profile marketplace (community profiles)
  • BYOE SDK (bring your own evals)
  • Public benchmark participation (BEIR, MTEB)
  • Multi-region deployment (US, EU)
  • Enterprise features (SSO, audit, VPC)
Q4 2026 - ADVANCED REASONING
  • Graph layer (claims/relationships)
  • Explanation traces (layer contribution)
  • Auto-tuning based on eval trends
  • Federated profiles (cross-profile queries)

Ship Semantics Like Code

Preserve human truth. Evolve with evals. Be first to build profile-driven semantic systems.

Get Early Access View Roadmap
Early access for qualified developers • Launching Q2 2026