Atlas Metis: RAG Engine — Platform Status Brief

March 11, 2026

Confidential

01 — Overview

Platform Status

The Atlas Metis RAG Engine is a multi-tenant RAG-as-a-Service platform with a three-tier dashboard architecture (Master Admin, Org Admin, End User Portal), multi-provider embedding support (OpenAI + Gemini), and multimodal ingestion. The core product (document ingestion, hybrid search, AI-powered answers, cost tracking, cross-tenant analytics, and real-time health monitoring) is operational, and each of the three user tiers has a purpose-built interface. Connector syncs are partial (3 of 11 types). Remaining work focuses on OAuth token refresh, connector expansion, and deployment hardening.

API Routes: 55
Python Modules: 44
Health Checks / Tenant: 7
E2E Pipeline: Verified

System Status

Component	Status
Core RAG Pipeline	Operational
Admin Dashboard	Operational
Multi-Tenant Auth	Operational
Batch Processing	Operational
Connector Syncs	Partial (3 of 11 types)
Cost Tracking	Operational
Gemini Multimodal Embeddings	Operational
Three-Tier Dashboard	Master Admin + Org Admin + End User Portal
Analytics Dashboard	Cross-Tenant Analytics
End User Portal	Public Chat per Org

02 — Operations

How a Client Gets Onboarded

1. Create Tenant

Admin creates organization with name + billing plan. System generates isolated namespace + API key automatically.

2. Generate API Keys

Admin generates scoped keys for the client. Keys have permissions (query, ingest) and rate limits configured per key.

3. Create Collection

Client creates a knowledge base collection, a logical grouping for their documents (e.g., “Training Docs,” “Product Knowledge”).

4. Upload Documents

Client uploads files (PDF, DOCX, TXT, CSV, audio). Engine parses, chunks, embeds, and stores automatically.

5. Query Knowledge Base

Client sends questions via API. Engine searches, reranks, validates, and generates cited answers.

6. Monitor Health

Admin monitors all clients via dashboard. Fleet overview with 7 health indicators per organization.

Current onboarding time: < 5 minutes from tenant creation to first query.
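
Step 2 above describes keys that carry both permissions and a per-key rate limit. The following is a minimal in-memory sketch of that idea, not the platform's actual implementation: the `ScopedKey` class, the `authorize` helper, the `atlas_` token prefix, and the result strings are all hypothetical names chosen for illustration.

```python
import secrets
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScopedKey:
    """Hypothetical model of a per-client API key (illustrative only)."""
    permissions: frozenset      # e.g. frozenset({"query", "ingest"})
    max_requests: int           # requests allowed inside one window
    window_seconds: float       # length of the sliding window
    # Assumed token format; the real key format is not given in this brief.
    token: str = field(default_factory=lambda: "atlas_" + secrets.token_urlsafe(24))
    _hits: deque = field(default_factory=deque, repr=False)

    def allows(self, permission: str) -> bool:
        return permission in self.permissions

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Sliding window: only requests newer than now - window count."""
        now = time.monotonic() if now is None else now
        cutoff = now - self.window_seconds
        while self._hits and self._hits[0] <= cutoff:
            self._hits.popleft()          # drop requests that aged out
        if len(self._hits) >= self.max_requests:
            return False                  # window is full
        self._hits.append(now)
        return True


def authorize(key: ScopedKey, permission: str, now: Optional[float] = None) -> str:
    """Check the permission first, then the rate limit."""
    if not key.allows(permission):
        return "forbidden"
    if not key.try_acquire(now):
        return "rate_limited"
    return "ok"
```

A query-only key created as `ScopedKey(frozenset({"query"}), max_requests=2, window_seconds=60)` would reject an `ingest` call outright and throttle a third `query` call made within the same minute.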

03 — Current State

Operational Systems

Single File Ingestion

Upload PDF, DOCX, TXT, CSV, audio, images (with Gemini) — auto-process to searchable vectors

Hybrid Search

Semantic + keyword search combined via Reciprocal Rank Fusion

Cohere Reranking

Cross-encoder reranking rescores retrieved chunks, lifting top-result scores from ~0.016 (raw) to ~0.97 (reranked)

LLM Generation

GPT-4o-mini generates grounded answers with source citations

Self-RAG Validation

Validates chunk relevance before generating, which reduces hallucinated answers

SSE Streaming

Real-time streaming responses via Server-Sent Events

Multi-Tenant Isolation

Verified — Tenant A cannot see Tenant B’s data

Dual API Key Auth

Tenant keys (scoped) + Admin keys (full access), hashed with bcrypt

Job Tracking

Every ingestion tracked with status, timing, and error capture

Health Diagnostics

7 automated checks per tenant with alert creation and resolution

Admin Dashboard

Fleet overview, alert center, metrics bar, diagnostics trigger

Usage Tracking

Queries, tokens, rerank units tracked per tenant per period

Cost Tracking

Real provider pricing: OpenAI ($0.13/1M tokens), Cohere, and Gemini ($0.15/1M tokens)

Rate Limiting

Sliding window rate limiter enforced per API key

File Size Limits

100MB upload cap prevents memory exhaustion on large files

Gemini Embeddings

Multimodal: images, video, audio natively embedded via Gemini Embeddings 2
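
The Hybrid Search entry above combines semantic and keyword results via Reciprocal Rank Fusion. The sketch below shows the standard form of that technique under stated assumptions: the function name, the chunk IDs, and the constant `k=60` (the value commonly used in the RRF literature) are illustrative, and the engine's actual `hybrid_search` may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one scored list.

    Each chunk earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the two retrievers.
semantic_hits = ["c1", "c2", "c3"]
keyword_hits = ["c2", "c4", "c1"]
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

With `k=60`, a chunk ranked first in a single list scores 1/61, roughly 0.016, which is the same order of magnitude as the raw pre-rerank scores quoted above; whether the engine uses this exact constant is an assumption.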

04 — Issues

Issues by Priority

P1 Major Gaps

Feature	Issue
OAuth Refresh	Google Drive/Dropbox tokens expire after ~1 hour with no refresh
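
One way the OAuth refresh gap could be closed is to check token expiry before every connector sync and refresh shortly before the deadline. This is a sketch under assumptions, not the planned fix: `OAuthToken`, `ensure_fresh`, and the 60-second skew are hypothetical, and the `refresh` callable stands in for whatever code exchanges the refresh token with the provider.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OAuthToken:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp at which access_token expires


def ensure_fresh(
    token: OAuthToken,
    refresh: Callable[[OAuthToken], OAuthToken],
    now: Optional[float] = None,
    skew_seconds: float = 60.0,
) -> OAuthToken:
    """Return a usable token, refreshing when expiry is within the skew.

    `refresh` is injected so the provider-specific exchange (a
    refresh_token grant against the provider's token endpoint) stays
    outside this helper. The skew value is an assumed safety margin.
    """
    now = time.time() if now is None else now
    if token.expires_at - now > skew_seconds:
        return token          # still valid with room to spare
    return refresh(token)     # expired or about to expire
```

Calling `ensure_fresh` at the start of each sync would let long-lived connectors outlast the roughly one-hour token lifetime, provided the refreshed token is persisted by the caller.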

P2 Pre-Production

Feature	Issue
HyDE Fallback	Advertised but not implemented
Connector Types	8 of 11 declared types not implemented (only Drive, Dropbox, Webhook)
Eval RPC Mismatch	Eval worker calls the `match_chunks` RPC while retrieval uses `hybrid_search`, so evaluations do not exercise the production search path
Docker/Railway Deploy	Config exists but not tested
Smoke Test Suite	`tests/` directory empty — no automated test coverage

All P0s resolved. URL ingestion, batch upload, and connector sync pipelines are fully wired and operational. Six of seven original P1s have been fixed (cost tracking, collection counters, Celery fallbacks, concept clustering, auth performance, query cache).

05 — Product Architecture

Three-Tier Dashboard Architecture

Atlas Metis serves three distinct user tiers, each with a purpose-built interface, authentication model, and API layer.

Access Levels

Layer	Access	Auth	URL Pattern
Master Admin	Atlas Minds team	Admin API key	/admin
Org Admin	Client admins	Supabase Auth	/dashboard
End User	Anyone with link	None (public)	/portal/{org-slug}
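
The access table above can be read as a mapping from URL prefix to tier to required credential. The sketch below encodes only what the table states; the `Tier` enum, the credential labels, and both helper functions are hypothetical and do not reflect the actual routing code.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    MASTER_ADMIN = "master_admin"
    ORG_ADMIN = "org_admin"
    END_USER = "end_user"


# Credential each tier requires; None means public access.
# The string labels are illustrative, not real identifiers.
REQUIRED_AUTH = {
    Tier.MASTER_ADMIN: "admin_api_key",
    Tier.ORG_ADMIN: "supabase_auth",
    Tier.END_USER: None,
}


def resolve_tier(path: str) -> Optional[Tier]:
    """Map a request path to a tier using the URL patterns in the table."""
    parts = [segment for segment in path.split("/") if segment]
    if not parts:
        return None
    if parts[0] == "admin":
        return Tier.MASTER_ADMIN
    if parts[0] == "dashboard":
        return Tier.ORG_ADMIN
    if parts[0] == "portal" and len(parts) >= 2:  # needs an org slug
        return Tier.END_USER
    return None


def is_allowed(path: str, credential_kind: Optional[str]) -> bool:
    """True when the presented credential satisfies the tier's requirement."""
    tier = resolve_tier(path)
    if tier is None:
        return False
    required = REQUIRED_AUTH[tier]
    return required is None or credential_kind == required
```

Matching on whole path segments rather than string prefixes keeps a path such as `/administrator` from being mistaken for the admin tier.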

Master Admin (Atlas Minds)

Fleet Health
Cross-Tenant Analytics
Revenue & Cost Tracking
Org Comparison
Diagnostics & Alerts

Org Admin (Client)

Document Management
Collection Management
Query Playground
Usage & Billing
Settings & API Keys

End User Portal

Chat Interface
Streaming Responses
Source Citations
Per-Org Branding
No Login Required

Data Flow by Tier

Public: End User → Portal API (public, no auth required) → Backend → Supabase

Authenticated: Org Admin → Customer API (auth’d, Supabase Auth) → Backend → Supabase

Admin: Master Admin → Admin API (admin key, full access) → Backend → Supabase

06 — Infrastructure

Technical Architecture

External: Client App → Gateway: FastAPI (55 routes • async) → Security: Auth Layer (dual key • bcrypt)

↓

Ingestion Pipeline: Parse → Chunk → Embed → Store

Retrieval Pipeline: Search → Rerank → Validate → Generate

Diagnostics: 7 Health Checks → Alerts → Auto-Resolve

↓

Database: Supabase (pgvector • RLS)

Embeddings + LLM: OpenAI + Gemini (multi-provider • 3072d • multimodal)

Reranking: Cohere (Rerank v3.5)

Task Queue: Redis / Celery (not running)
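
The diagnostics path in the architecture (health checks, then alerts, then auto-resolve) can be sketched as a small loop. This is an illustrative model only: the `AlertStore` class, the status strings, and the check names in the usage note are hypothetical, since the brief does not name the seven checks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# A check returns (ok, message); the message describes any failure.
Check = Callable[[], Tuple[bool, str]]


@dataclass
class AlertStore:
    """Hypothetical in-memory alert store keyed by (tenant, check name)."""
    open_alerts: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def run_diagnostics(self, tenant_id: str, checks: Dict[str, Check]) -> Dict[str, str]:
        """Run every check, open alerts on failure, resolve them on recovery."""
        summary = {}
        for name, check in checks.items():
            ok, message = check()
            key = (tenant_id, name)
            if ok:
                # Auto-resolve: a passing check closes any alert it opened.
                was_open = self.open_alerts.pop(key, None) is not None
                summary[name] = "resolved" if was_open else "ok"
            else:
                self.open_alerts[key] = message
                summary[name] = "alert"
        return summary
```

For example, a failing check registered as `"stale_jobs"` would open an alert on one run and report `"resolved"` on the next run in which it passes, leaving `open_alerts` empty again.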

07 — Roadmap

Path to Production

A. Fix P0s: Critical Dead Code

✓ Completed 2026-03-05

URL ingestion, batch upload, connector sync — all wired
Celery sync fallbacks added for all dependent endpoints

B. Fix P1s + Multi-Provider Embeddings

✓ Completed 2026-03-11

Cost tracking, collection counters, concept clustering, auth caching, query cache
Multi-provider embeddings (OpenAI + Gemini) with multimodal support
Rate limiting, file size limits, faithfulness checks, auto-slug

C. Three-Tier Dashboard + Analytics

✓ Completed 2026-03-11

End User Portal — public chat interface per organization with streaming and citations
Master Admin analytics — cross-tenant usage, cost trends, org comparison
Org Admin dashboard — document/collection management, query playground, billing

D. Remaining Work

Estimated: 3–5 days

OAuth token refresh for Google Drive / Dropbox connectors (P1)
Implement HyDE (Hypothetical Document Embeddings) fallback
Build remaining 8 connector types (SharePoint, S3, Notion, etc.)
Fix eval RPC mismatch (match_chunks vs hybrid_search)
Docker Compose + Railway deployment — test and validate
End-to-end smoke test suite

Total estimated time to production-ready: 3–5 days

Built by Atlas Minds

atlas-minds.com
