One system for vector data and AI execution.
Search, cluster, analyze, train, and infer over the same body of vector data without stitching together fragmented systems.
pip install eigenlake
Live surface
support-events / production
One namespace for vector data and AI workloads
One execution layer across retrieval, analysis, and model workflows
Built for both developers and autonomous agents
Designed for large-scale production AI systems
Product
One execution layer for vector AI.
A vector data lake where developers and AI agents use the same primitives to query context, launch workloads, monitor runs, and analyze results.
Search becomes execution: retrieval, clustering, training, inference, and large-scale analysis all run in one place.
Developer
AI Agent
Agent-ready vector runtime
Run progress
Live telemetry
Shared human + agent interface
Developers and agents use the same tools and primitives to get context, run work, and take action.
Durable vector memory
Store and organize vectors at scale with built-in durability, consistency, and access controls.
Governed execution
Operate with policies, approvals, and guardrails so actions are safe, auditable, and repeatable.
Managed compute for heavy jobs
Elastic, GPU-accelerated compute for training, inference, and large-scale analysis.
Traceable outputs
Every run is observable end to end with lineage, metrics, and artifacts teams can trust.
Python SDK
Install EigenLake and query an index in minutes.
The public package is available on PyPI as eigenlake. Use it to connect to EigenLake Cloud, manage indexes, insert vectors, run nearest-neighbor search, cluster matching records, and ask agent-mode questions.
Install
pip install eigenlake
Install
Published on PyPI as the EigenLake Python SDK for Python 3.10+.
Connect
Use eigenlake.connect with your EigenLake API endpoint and sandbox key.
Create an index
Define schema fields, create or open an index, and keep metadata filterable.
Search and query
Insert records, run nearest-neighbor search, cluster results, or ask agent-mode questions.
Quickstart
Connect, create an index, insert, search, and query
import eigenlake
from eigenlake import schema as s

with eigenlake.connect(
    url="https://api.eigenlake.dev/",
    api_key="<sk_sbx_your_api_key_here>",
) as client:
    # Define the record schema: document_id is required and filterable,
    # text is stored but not filterable.
    schema, index_options = (
        s.SchemaBuilder(additional_properties=False)
        .add("document_id", s.string(required=True, filterable=True))
        .add("text", s.string(filterable=False))
        .build()
    )

    # Create the index, or open it if it already exists.
    idx = client.indexes.create_or_get(
        namespace="demo-namespace",
        index="demo-index",
        dimensions=128,
        schema=schema,
        index_options=index_options,
    )

    # Insert one record with a 128-dimensional vector.
    idx.records.add(
        id="doc-1",
        properties={"document_id": "doc-1", "text": "hello"},
        vector=[0.1] * 128,
    )

    # Fetch the 3 nearest records, then ask an agent-mode question.
    result = idx.search.nearest(vector=[0.1] * 128, limit=3)
    answer = idx.agent.query("show me recent failures", mode="auto")

Why EigenLake
One execution layer instead of fragmented AI infrastructure.
Vector workloads should not require separate systems for storage, metadata, orchestration, compute, lineage, and operational controls.
Too many moving parts
Vector workloads are fragmented across separate tools that drift, fail, and require custom glue.
EigenLake as one execution layer
Storage, metadata, compute, and AI execution live behind one operational surface for humans and agents.
Workloads
Run vector workloads where the data lives.
Search, cluster, forecast, detect anomalies, train models, and run inference on one vector data layer without moving data across fragmented ML systems.
Cluster support tickets into emerging product themes
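To make the ticket-clustering workload concrete, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) over toy embeddings. It illustrates the general technique only and is not EigenLake's implementation or API: the function names, the 2-dimensional vectors, and the ticket texts are all made up for the example, and real embeddings would come from an embedding model.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors, k, iterations=10):
    """Lloyd's algorithm; returns one cluster index per input vector."""
    # Deterministic initialization (an assumption for this sketch):
    # use the first k vectors as the starting centroids.
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its closest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignments


# Toy data: hypothetical tickets with hand-written 2-D "embeddings".
tickets = {
    "login fails after reset": [0.9, 0.1],
    "invoice shows wrong total": [0.1, 0.9],
    "cannot sign in on mobile": [0.8, 0.2],
    "charged twice this month": [0.2, 0.8],
}

labels = kmeans(list(tickets.values()), k=2)

# Group ticket texts by cluster label to read off candidate themes.
themes = {}
for text, label in zip(tickets, labels):
    themes.setdefault(label, []).append(text)
```

In this toy run the sign-in tickets and the billing tickets land in separate groups. At production scale the same assignment and update steps are what benefit from distributed or GPU compute.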
FAQ
Questions about the vector data lake.
What is EigenLake?
EigenLake is a vector data lake for AI workloads. It combines vector database UX, lakehouse-style storage, distributed execution, and GPU compute so teams can work with vector data as a full execution layer, not only a retrieval index.
How is this different from a vector database?
A vector database is usually optimized for search and nearest-neighbor retrieval. EigenLake keeps that query experience, then extends it with one namespace, one catalog, one security model, one execution API, and one lineage model for larger AI workloads.
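For readers new to the term, nearest-neighbor retrieval can be sketched as a brute-force scan that ranks stored vectors by cosine similarity to a query. This is a conceptual illustration in plain Python, not a description of how EigenLake or any particular vector database implements search; the helper names and the tiny 2-dimensional records are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(records, query, limit=3):
    """Return the ids of the `limit` records most similar to `query`, best first."""
    scored = [
        (cosine_similarity(vector, query), record_id)
        for record_id, vector in records.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record_id for _, record_id in scored[:limit]]


# Hypothetical records keyed by id, each with a toy 2-D vector.
records = {
    "doc-1": [1.0, 0.0],
    "doc-2": [0.0, 1.0],
    "doc-3": [0.9, 0.1],
}

top = nearest(records, query=[1.0, 0.0], limit=2)
```

A scan like this is linear in the number of records, which is why production systems typically rely on approximate indexes for retrieval.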
What workloads can run on EigenLake?
EigenLake is designed for semantic search, clustering, classification, anomaly detection, recommendations, ranking, large-scale analysis, training, and inference. The goal is to run these workflows close to the vectors, metadata, and source records they depend on.
Why bring Spark and GPUs into vector infrastructure?
Many vector workloads do not stop at lookup. Clustering, training, scoring, and analysis often need distributed CPU and GPU compute. EigenLake is built to make that execution available through the same platform instead of forcing teams to move data into separate Spark, training, and inference stacks.
How does EigenLake help agents and developers?
Developers get one API for storing, querying, and executing work on vector data. Agents get a stable surface where they can retrieve context, analyze datasets, trigger jobs, and act on results without depending on fragile chains of disconnected services.
Does EigenLake replace existing ML and data infrastructure?
EigenLake is designed to collapse the parts of the stack that are currently stitched together around vector data: the retrieval layer, analysis jobs, feature workflows, training pipelines, and inference paths. Teams can keep their product focus while running end-to-end vector workloads in one platform.
Who should use EigenLake?
EigenLake is for AI application teams, ML platform teams, data teams, and agent builders working with large vector datasets. It is especially useful when vectors are central to product behavior, operational decisions, or model workflows.
Talk to the founders
See what your AI stack looks like when vector data and execution live in one system.