Get Started
Memory Infrastructure for AI

Give your AI agents the memory they deserve.

Your agents are brilliant—until they forget. Enscribe provides the embedding, neural search, and retrieval infrastructure that AI agents and LLM-powered applications need to remember everything and find anything. Stop building memory pipelines. Start shipping products.

16 REST Endpoints
<200ms P95 Search
8 Embedding Models
3 Global Regions
The Choice

Months of infrastructure, or minutes to production.

Your AI agents need persistent memory. You can spend months building it from scratch, cobble together commodity APIs, or point your agents to Enscribe and start shipping today.

Option A

Build It Yourself

Choose embedding models, deploy vector storage, design chunking strategies, build search and ranking logic, create an eval framework.

  • Select and integrate embedding providers
  • Deploy and maintain vector databases
  • Design chunking for each content type
  • Build retrieval, ranking, and filtering
  • Create eval pipelines for search quality
  • Manage dev / staging / prod environments
3-6 months before you ship product code
Option B

Commodity APIs

One API for embeddings, another for search. Quick to prototype, but with no coherence between how you embed and how you retrieve.

  • Separate embedding and search systems
  • Generic chunking, no domain awareness
  • No eval framework or promotion workflow
  • No fine-grained retrieval controls
  • Drift between embedding and search
  • You still build all the glue code
Weeks of integration, ongoing maintenance
Enscribe

Complete Memory Stack

One platform for embedding, search, and retrieval. Agent Voices give your agents fine-grained memory tuned to their specific task.

  • Agent Voices: complete memory profiles
  • Smart chunking with LLM-powered options
  • Multi-model embeddings (OpenAI, Voyage.ai)
  • Hybrid search with tunable retrieval
  • Built-in eval campaigns and promotion gates
  • Multi-environment (dev / staging / prod)
Minutes to first search. Production-ready.

Your agents need memory. That's our entire focus.

We built Enscribe so you don't have to become an embeddings expert to ship AI products. A purpose-built memory stack engineered for sub-200ms search, type-safe from ingestion to retrieval.

Core Concept

Agent Voices: memory profiles for every task.

Every agent has a different job. A support bot needs different memory than a research assistant. A Voice is a complete configuration for how your agent chunks, embeds, and retrieves—giving each agent a memory profile tuned to its specific task.

Chunking

How documents are split. Baseline (fixed-size), Story Beats (LLM-detected narrative boundaries), or Tone Segments (topic shifts).

Embedding

How chunks become vectors. Choose model, dimensions, and resolution mode. Adaptive Resolution Search for multi-granularity.

Retrieval

How results are ranked. Semantic, hybrid (+ BM25), or adaptive. Tunable thresholds, top-k, and fusion strategies.
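Putting the three together, a Voice can be thought of as one configuration payload covering chunking, embedding, and retrieval. A minimal sketch of that idea follows; the field names and values are illustrative assumptions, not Enscribe's documented schema:

```python
# Hypothetical Voice definition. Field names are illustrative only,
# not the real API schema.
voice = {
    "name": "support-bot",
    "chunking": {"strategy": "tone_segments", "max_tokens": 512},
    "embedding": {"model": "text-embedding-3-small", "dimensions": 512},
    "retrieval": {
        "mode": "hybrid",    # semantic + BM25
        "top_k": 5,
        "threshold": 0.75,
        "fusion": "rrf",
    },
}
print(voice["name"])
```

One payload per Voice keeps the three layers coherent: the retrieval settings always know which model and chunking strategy produced the vectors they rank.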

Start from a template, tune from there.

Four built-in templates cover common patterns. Create Voices from scratch or customize a template for your use case.

General Purpose

text-embedding-3-small · 512 tok

Balanced defaults for most use cases. Start here and tune as needed.

High Precision

text-embedding-3-large · 256 tok

Strict matching, low tolerance for noise. High threshold, small chunks.

High Recall

text-embedding-3-small · 1024 tok

Cast a wide net: lower threshold, larger context, so relevant results aren't missed.

Conversational

text-embedding-3-small · 384 tok

Optimized for chat-style queries and short-form questions.

For Developers

Built for developers. Designed for agents.

A complete REST API, a planned CLI with MCP server for direct agent integration, and a full-featured developer portal. Everything your agents need, accessible the way you build.

import requests

response = requests.post(
    "https://us.api.enscribe.io/v1/search",
    headers={"Authorization": "Bearer ens_sk_..."},
    json={
        "collection": "knowledge-base",
        "voice": "high-precision",
        "query": "How does the onboarding flow work?",
        "top_k": 5
    }
)

for result in response.json()["results"]:
    print(f"[{result['score']:.2f}] {result['text'][:120]}...")
Available

/v1 REST API

16 endpoints covering the complete workflow from collection creation to semantic search.

  • Collection CRUD + stats
  • Voice CRUD with templates
  • Document ingest with SSE streaming
  • Semantic and hybrid search
  • Chunking preview
  • Per-key rate limiting
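Ingest streams progress back over SSE, so a client can follow chunking and embedding as they happen. A rough sketch of consuming such a stream, using generic SSE framing (the event payloads here are invented for illustration; the real event schema may differ):

```python
def parse_sse(lines):
    """Collect `data:` lines into events, split on blank lines
    (standard server-sent-events framing)."""
    events, data = [], []
    for line in lines:
        if line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            events.append("\n".join(data))
            data = []
    if data:
        events.append("\n".join(data))
    return events

# Invented example of what an ingest stream could look like:
stream = [
    'data: {"stage": "chunking", "progress": 0.4}',
    "",
    'data: {"stage": "embedding", "progress": 1.0}',
    "",
]
print(parse_sse(stream))
```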
Coming Soon

CLI + MCP Server

Command-line access and direct agent integration via the Model Context Protocol.

  • MCP server for Claude, GPT, and other agents
  • Agents search your collections natively
  • Scriptable CLI for automation
  • Pipe documents from any workflow
  • Environment switching built in
Available

Developer Portal

Full-featured web UI with interactive API explorer and visual tools.

  • API explorer with code generation
  • Visual Voice builder
  • Waveform visualization of embeddings
  • Document browser and chunk inspector
  • Eval campaign management
Platform

Everything your agents need to remember.

Multi-Model Embeddings

OpenAI, Voyage.ai, and more. Switch models without re-architecting. 8 models across providers.

Hybrid Search

Combine semantic vector search with BM25 keyword matching. Tunable fusion: RRF, weighted, linear.
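Of the listed fusion strategies, Reciprocal Rank Fusion is simple enough to sketch in a few lines. This is the standard textbook formulation, not Enscribe's internal implementation:

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: each ranked list contributes
    1 / (k + rank) to a document's combined score."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["doc-a", "doc-b", "doc-c"]   # vector-search order
keyword  = ["doc-b", "doc-c", "doc-a"]   # BM25 order
print(rrf([semantic, keyword]))          # doc-b wins: 2nd + 1st place
```

RRF's appeal is that it needs no score normalization: semantic similarities and BM25 scores live on different scales, but ranks are always comparable.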

LLM-Powered Chunking

Beyond fixed-size splits. Story Beats detect narrative boundaries. Tone Segments find topic shifts.

SHA256 Change Detection

Fingerprint-based dedup. Re-upload unchanged documents at zero cost. Pay only for what changed.
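The dedup idea is easy to reproduce client-side: hash the document bytes and skip the upload when the digest matches what you sent last time. A minimal sketch using Python's standard library (how the server computes its fingerprints is its own detail):

```python
import hashlib

def fingerprint(text: str) -> str:
    """SHA-256 hex digest of the document's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

doc = "Onboarding guide, revision 2"
assert fingerprint(doc) == fingerprint(doc)  # unchanged content, same digest
print(fingerprint(doc)[:16], "...")
```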

Adaptive Resolution

Embeddings at multiple dimensions simultaneously. Fast broad scans, precise matching—or both.

Multi-Environment

Isolated dev, staging, and production. Promote Voices across environments with eval-gated workflows.

Eval Campaigns

Measure retrieval quality with NDCG, recall, and precision. Promotion gates ensure only proven Voices ship.
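NDCG is worth unpacking, since it's the kind of metric a promotion gate keys on: discounted cumulative gain rewards relevant results ranked early, normalized by the best possible ordering. A sketch of the standard formula (not the platform's internal scorer):

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: rel at 1-indexed position p is
    discounted by log2(p + 1); enumerate is 0-based, hence i + 2."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG normalized by the ideal (descending-relevance) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0

print(ndcg([3, 2, 1]))            # already ideal order -> 1.0
print(round(ndcg([0, 1, 2]), 3))  # relevant docs ranked late -> below 1.0
```

A gate like "promote only if NDCG >= 0.8 on the eval set" turns retrieval quality into a pass/fail check rather than a judgment call.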

Waveform Visualization

See embeddings as waveforms. Compare Voices visually, inspect chunks, overlay query and result vectors.

Full Observability

Request logging, latency tracking, usage metrics. Unified telemetry from ingest to search result.

Trust

Engineered for production.

Memory-safe architecture with zero garbage collection. Type-safe from ingestion to retrieval. Encryption at every layer.

Sub-200ms Search

P95 latency across all query types. No cold starts, no GC pauses.

AES-256 Encryption

API keys and credentials encrypted with AES-256-GCM. SSL-secured connections.

Tenant Isolation

Every request validated against tenant boundaries. Environment-scoped API keys.

Multi-Region

US (Ohio), EU (Frankfurt), Asia-Pacific (Singapore). Data residency controls.

Ready to give your agents memory?

Stop building embedding pipelines. Start shipping AI products.