
Building Context-Heavy: Knowledge-Graph API for AI Agents

Context-Heavy is a multi-tenant knowledge-graph API built in Go (pgvector + recursive CTEs) to give AI agents persistent, relational context across sessions.

#ai-agents #go #pgvector #knowledge-graph

Context-Heavy is a multi-tenant knowledge-graph API I built to give AI agents persistent, relational context across sessions. The core problem it solves: most agent memory systems treat memory as a flat vector store. Real knowledge has structure: entities, relationships, temporal ordering. Context-Heavy stores both the embeddings and that structure.

The problem with flat vector memory

Standard RAG memory:

store(text) → embedding → vector DB
query(question) → similarity search → top-k chunks

This works for factual recall. It breaks for relational queries:

  • "What projects did we discuss last week that depend on the auth service?"
  • "Which users have reported this error in the past month?"
  • "What's the chain of decisions that led to this architecture?"

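Similarity search alone can't answer these: each one depends on explicit relationships or time ordering rather than on textual closeness. You need a graph.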

Data model

Context-Heavy stores knowledge as a property graph in PostgreSQL:

CREATE TABLE entities (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL,
    type        TEXT NOT NULL,           -- person, project, concept, event
    name        TEXT NOT NULL,
    properties  JSONB DEFAULT '{}',
    embedding   vector(1536),            -- pgvector
    created_at  TIMESTAMPTZ DEFAULT now(),
    updated_at  TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE relationships (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL,
    from_id     UUID REFERENCES entities(id),
    to_id       UUID REFERENCES entities(id),
    type        TEXT NOT NULL,           -- depends_on, created_by, mentions
    weight      FLOAT DEFAULT 1.0,
    properties  JSONB DEFAULT '{}',
    created_at  TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX ON entities USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

Multi-tenancy: every row carries a tenant_id, and PostgreSQL row-level security policies enforce isolation, so a query running as one tenant cannot read or modify another tenant's rows.

Query API

Three query modes over HTTP + JSON:

1. Semantic search — nearest neighbors in embedding space:

POST /query/semantic
{
  "q": "authentication service issues",
  "limit": 10,
  "entity_types": ["project", "event"]
}
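For illustration, here is roughly what a semantic query could translate to on the server. The SQL text and the `vectorLiteral` helper are assumptions, not the project's code. What is certain is that pgvector accepts vectors in the text form `[x,y,z]` and that `<=>` is its cosine-distance operator, which matches the `vector_cosine_ops` index above. The query string is assumed to be embedded with the same model as the stored entities before this runs.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vectorLiteral renders an embedding in pgvector's text input format.
func vectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, x := range v {
		parts[i] = strconv.FormatFloat(float64(x), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

// semanticQuery is a guess at the SQL behind /query/semantic.
// $1 = tenant id, $2 = query embedding, $3 = entity types, $4 = limit.
const semanticQuery = `
SELECT id, type, name, embedding <=> $2::vector AS distance
FROM entities
WHERE tenant_id = $1 AND type = ANY($3)
ORDER BY embedding <=> $2::vector
LIMIT $4`

func main() {
	fmt.Println(vectorLiteral([]float32{1, 0.5, -0.25}))
	fmt.Println(semanticQuery)
}
```

Ordering by the same expression the index was built for (`<=>` with `vector_cosine_ops`) is what allows PostgreSQL to use the ivfflat index instead of scanning every row.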

2. Graph traversal — recursive CTEs for relationship walks:

POST /query/graph
{
  "from_entity": "uuid-of-auth-service",
  "relationship": "depends_on",
  "depth": 3
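}

The traversal runs as a recursive CTE along these lines ($1 = start entity, $2 = relationship type, $3 = tenant, $4 = max depth):

WITH RECURSIVE deps AS (
    SELECT to_id, 1 AS depth
    FROM relationships
    WHERE from_id = $1 AND type = $2 AND tenant_id = $3
    UNION              -- drop duplicate (to_id, depth) rows
    SELECT r.to_id, d.depth + 1
    FROM relationships r
    JOIN deps d ON r.from_id = d.to_id
    -- keep following the same relationship type on every hop
    WHERE d.depth < $4 AND r.type = $2 AND r.tenant_id = $3
)
SELECT e.* FROM entities e
WHERE e.tenant_id = $3
  AND e.id IN (SELECT to_id FROM deps);  -- each entity returned once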
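To make the semantics of the walk concrete, here is an in-memory Go sketch of a depth-limited traversal over one relationship type. It is not how Context-Heavy executes the query (that happens in PostgreSQL); the `Edge` type and `walk` function are hypothetical names used only to show which nodes a walk of a given depth reaches.

```go
package main

import "fmt"

// Edge mirrors one row of the relationships table (only the fields used here).
type Edge struct {
	From, To, Type string
}

// walk returns every node reachable from start over edges of relType within
// maxDepth hops, mapped to the depth at which it was first reached.
func walk(edges []Edge, start, relType string, maxDepth int) map[string]int {
	adj := make(map[string][]string)
	for _, e := range edges {
		if e.Type == relType {
			adj[e.From] = append(adj[e.From], e.To)
		}
	}
	seen := map[string]int{}
	frontier := []string{start}
	for depth := 1; depth <= maxDepth && len(frontier) > 0; depth++ {
		var next []string
		for _, n := range frontier {
			for _, to := range adj[n] {
				if _, ok := seen[to]; ok {
					continue // already reached at a shallower depth
				}
				seen[to] = depth
				next = append(next, to)
			}
		}
		frontier = next
	}
	return seen
}

func main() {
	edges := []Edge{
		{"auth", "db", "depends_on"},
		{"db", "storage", "depends_on"},
		{"storage", "auth", "depends_on"}, // cycle back to the start
		{"auth", "alice", "created_by"},   // different type, not followed
	}
	fmt.Println(walk(edges, "auth", "depends_on", 3))
}
```

The visited map is what keeps a cycle from being expanded repeatedly. The depth cap alone guarantees termination, but without deduplication a cyclic graph produces the same nodes many times over.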

3. Hybrid — graph seed + semantic re-rank:

POST /query/hybrid
{
  "q": "performance issues in services that auth depends on",
  "seed_entity": "uuid-of-auth-service",
  "hop": 2
}

Hybrid mode is the most useful of the three for agents: start from a known entity, expand along its relationships for the requested number of hops, then re-rank the expanded nodes by semantic similarity to the query.
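The re-rank step can be sketched in a few lines of Go. This is an assumption about the general shape, not the project's implementation: the `Candidate` type, the toy two-dimensional embeddings, and the entity names are made up for the example, and the real service may well do the scoring inside PostgreSQL instead.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Candidate is a node produced by the graph-expansion step.
type Candidate struct {
	Name      string
	Embedding []float64
}

// cosine returns the cosine similarity of a and b, or 0 if either
// vector is zero or the lengths differ.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// rerank returns a copy of cands sorted by similarity to the query, best first.
func rerank(query []float64, cands []Candidate) []Candidate {
	out := append([]Candidate(nil), cands...)
	sort.SliceStable(out, func(i, j int) bool {
		return cosine(query, out[i].Embedding) > cosine(query, out[j].Embedding)
	})
	return out
}

func main() {
	query := []float64{1, 0}
	expanded := []Candidate{
		{"billing-db", []float64{0, 1}},
		{"token-cache", []float64{0.9, 0.1}},
		{"user-store", []float64{0.5, 0.5}},
	}
	for _, c := range rerank(query, expanded) {
		fmt.Printf("%s %.3f\n", c.Name, cosine(query, c.Embedding))
	}
}
```

Because the graph step has already narrowed the candidate set to a few hops around the seed entity, an exact similarity computation over that set is cheap and does not need the approximate index.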

Ingestion pipeline

Agents write knowledge via an ingestion endpoint that extracts entities and relationships from unstructured text:

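import (
    "encoding/json"
    "net/http"
)

type IngestRequest struct {
    TenantID string `json:"tenant_id"`
    Text     string `json:"text"`
    Context  string `json:"context"` // optional: source, date, author
}

func (s *Server) Ingest(w http.ResponseWriter, r *http.Request) {
    var req IngestRequest
    // Reject malformed bodies instead of ingesting a zero-value request.
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, "invalid JSON body", http.StatusBadRequest)
        return
    }
    if req.TenantID == "" || req.Text == "" {
        http.Error(w, "tenant_id and text are required", http.StatusBadRequest)
        return
    }

    // LLM extraction
    extracted := s.llm.ExtractEntities(req.Text)

    // Upsert entities (merge duplicates by name+type)
    for _, e := range extracted.Entities {
        s.db.UpsertEntity(req.TenantID, e)
    }
    // Upsert relationships
    for _, rel := range extracted.Relationships {
        s.db.UpsertRelationship(req.TenantID, rel)
    }

    // Error handling for extraction and upserts is omitted for brevity.
    w.WriteHeader(http.StatusNoContent)
}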

Entity deduplication combines fuzzy name matching with embedding similarity, so "LetX API" and "letx-api" resolve to the same entity.
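A minimal sketch of how the two signals might be combined, assuming normalized names are compared by edit distance and the embedding similarity is computed elsewhere. The function names, the thresholds (`maxEdits`, `minSim`), and the choice to require both signals are illustrative assumptions, not the project's tuned logic.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// nameKey lowercases a name and drops everything except letters and digits.
func nameKey(name string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(name) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// editDistance is the Levenshtein distance between a and b, counted in runes.
func editDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// likelySame reports whether two mentions probably name the same entity:
// normalized names within maxEdits AND embeddings at least minSim similar.
// Both thresholds are placeholders.
func likelySame(a, b string, embeddingSim float64) bool {
	const maxEdits, minSim = 1, 0.85
	return editDistance(nameKey(a), nameKey(b)) <= maxEdits && embeddingSim >= minSim
}

func main() {
	fmt.Println(nameKey("LetX API"), nameKey("letx-api"))
	fmt.Println(likelySame("LetX API", "letx-api", 0.93))
	fmt.Println(likelySame("LetX API", "Billing API", 0.93))
}
```

Requiring both signals is the conservative choice: a wrong merge corrupts the graph for every later query, while a missed merge only leaves a duplicate that can be cleaned up later.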

Performance at scale

At 10k entities per tenant with 50k relationships, query latency was:

| Query type | P50 | P99 |
|------------|-----|-----|
| Semantic (ivfflat) | 8ms | 22ms |
| Graph traversal (depth 3) | 14ms | 45ms |
| Hybrid | 28ms | 70ms |

Graph traversal uses recursive CTEs with a depth cap (default: 5) to prevent runaway queries. Beyond depth 5, the graph becomes noise for most agent use cases.
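Enforcing the cap is a one-liner at the API boundary. A small sketch, where `clampDepth` is a hypothetical helper and the choice to fall back to the cap when the client sends no depth is an assumption:

```go
package main

import "fmt"

// clampDepth returns a traversal depth in [1, depthCap].
// Missing, non-positive, or oversized requests fall back to the cap.
func clampDepth(requested, depthCap int) int {
	if depthCap < 1 {
		depthCap = 1
	}
	if requested < 1 || requested > depthCap {
		return depthCap
	}
	return requested
}

func main() {
	const depthCap = 5 // the default cap described above
	fmt.Println(clampDepth(3, depthCap), clampDepth(0, depthCap), clampDepth(12, depthCap))
}
```

The clamped value is what would be bound to the depth parameter of the recursive CTE, so a client can never request an unbounded walk.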

FAQ

What is Context-Heavy?
Context-Heavy is a multi-tenant knowledge-graph API that gives AI agents persistent, relational memory using PostgreSQL with pgvector and recursive CTEs.

How is it different from a regular vector database?
Vector DBs excel at similarity search but can't answer relational queries. Context-Heavy stores entities and relationships, enabling graph traversal, dependency walks, and hybrid semantic+graph queries.

What's the tech stack?
Go API server, PostgreSQL with pgvector extension, Redis for caching, deployed on AWS ECS Fargate with Terraform.

Can I self-host Context-Heavy?
Yes: the Go binary + Terraform module are open source. You need a PostgreSQL instance with the pgvector extension (available on RDS, Supabase, and Neon).


Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. See also: Building common-knowledge: Persistent Memory for Agents · pgvector: Vector Search in PostgreSQL.