Plexo

Autonomous AI agents
on your infrastructure.

Self-hosted agent platform with intelligent model routing, persistent memory, and multi-channel access. Bring your own keys. Keep your data. AGPL-3.0 open source.

self-host
$ curl -fsSL https://getplexo.com/install.sh | bash

What Plexo does

Intelligent Model Routing

Configure fallback chains across providers. If your primary model fails or hits rate limits, Plexo automatically tries the next in your chain. Per-model reliability scoring learns which providers work best.
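The fallback-chain idea can be sketched in a few lines. This is an illustrative model only, not Plexo's actual internals: the class and scoring constants are hypothetical, standing in for "try providers in reliability order, penalize failures, promote successes".

```python
from typing import Callable

class ChainRouter:
    """Tries providers in order of observed reliability; the first success
    wins, and failures lower a provider's score so healthier providers
    are preferred on later calls. (Illustrative sketch, hypothetical API.)"""

    def __init__(self, chain: list[str]):
        self.chain = chain
        self.reliability = {name: 1.0 for name in chain}

    def complete(self, prompt: str, call: Callable[[str, str], str]) -> str:
        # Prefer providers with the best observed reliability.
        for name in sorted(self.chain, key=lambda n: -self.reliability[n]):
            try:
                result = call(name, prompt)
                self.reliability[name] = min(1.0, self.reliability[name] + 0.05)
                return result
            except Exception:
                # Rate limit or outage: penalize and fall through to the next.
                self.reliability[name] = max(0.0, self.reliability[name] - 0.25)
        raise RuntimeError("all providers in the chain failed")
```

A rate-limited primary simply falls through to the next provider in the chain, and its lowered score means the backup is tried first next time.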

Bring Your Own Keys

17 providers supported: Anthropic, OpenAI, DeepSeek, Groq, Mistral, Google, xAI, Ollama, OpenRouter, Cerebras, Cohere, Fireworks, Together, Perplexity, SambaNova, Cloudflare, and any OpenAI-compatible endpoint. Your keys, your costs, your control.

Multi-Channel

Same agent, everywhere. Web dashboard, Telegram, Slack, Discord, REST API, embeddable widget. Voice messages transcribed via Deepgram. Images analyzed through vision-capable models.

Independent Quality Judge

A separate model evaluates every task output against rubrics. Ensemble mode runs multiple local judges via Ollama with weighted consensus. Cross-provider judging prevents self-evaluation bias.
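Weighted consensus across an ensemble of judges reduces to a weighted mean of per-judge scores. A minimal sketch, assuming hypothetical names (this is not Plexo's judge API):

```python
def consensus(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-judge scores in [0, 1]; judges with higher
    weights pull the consensus toward their verdict."""
    total = sum(weights[j] for j in scores)
    return sum(scores[j] * weights[j] for j in scores) / total
```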

Self-Extending Agent

Need an integration that does not exist? The agent scrapes API docs and generates a working PEX extension on the fly -- complete with credential UI and sandboxed execution. No manual plugin development.

Project Decomposition

Describe a project. Plexo decomposes it into parallel tasks with dependency-aware wave scheduling. Each task gets its own branch, agent, and draft PR. Budget ceilings enforce cost control.
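Dependency-aware wave scheduling can be sketched as repeated topological layering: every task whose dependencies are already done runs together as one parallel wave. An illustrative sketch (the function name and data shape are assumptions, not Plexo's scheduler):

```python
def schedule_waves(deps: dict[str, set[str]]) -> list[set[str]]:
    """Group tasks into parallel waves; a task joins a wave only once
    all of its dependencies completed in earlier waves."""
    done: set[str] = set()
    waves: list[set[str]] = []
    pending = dict(deps)
    while pending:
        wave = {t for t, d in pending.items() if d <= done}
        if not wave:
            raise ValueError("dependency cycle detected")
        waves.append(wave)
        done |= wave
        for t in wave:
            del pending[t]
    return waves
```

Independent tasks land in the same wave and can each get their own branch and agent; a diamond dependency (b and c depend on a, d depends on both) yields three waves.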

Built-in safety

One-Way Door Approvals

Irreversible actions -- database migrations, external API calls with side effects, destructive file operations -- require human approval before execution. Standing approvals let you trust specific operations. Risk-level classification from low to critical.

Context Intelligence

Stale tool results are automatically compressed to prevent context bloat. Per-model output ceilings prevent truncated tool calls. Truncation detection triggers automatic retry with budget adjustments.

Cost Ceilings

Per-task and per-project cost ceilings halt execution before runaway spending. Budget checks run before every wave in sprint execution. The agent stops, not your wallet.
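The pre-wave budget check amounts to halting when projected spend would cross the ceiling. A minimal sketch with assumed names:

```python
class BudgetExceeded(Exception):
    pass

def check_budget(spent_usd: float, next_wave_estimate_usd: float,
                 ceiling_usd: float) -> None:
    """Raise before a wave would push total spend past the ceiling."""
    projected = spent_usd + next_wave_estimate_usd
    if projected > ceiling_usd:
        raise BudgetExceeded(
            f"projected ${projected:.2f} exceeds ceiling ${ceiling_usd:.2f}"
        )
```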

Audit Trail

Every tool invocation, extension activation, and approval decision is logged with SHA-256 payload hashing. Full introspection into what the agent did and why.

Built on open foundations

Production-grade infrastructure you can inspect, extend, and trust

Model-agnostic

Your keys, any provider. Automatic fallback routing across your configured chain.

+ Cerebras, Cohere, Fireworks, Perplexity, SambaNova, Cloudflare Workers AI, and any OpenAI-compatible endpoint

MCP · SKILL.md · A2A · AGENTS.md

Semantic Context Lattice

Knowledge that compounds.

SCL is a persistent knowledge graph built from completed work. Every task extracts structured concepts — entities, events, actions, claims — embeds them as vectors, and writes them into a Golden Record. The lattice self-modifies after every task. It doesn't just store what happened. It maps how concepts relate, which patterns are stable, and what's drifting.

Concept Attractors

Vector-positioned knowledge nodes with salience scoring, mutation tracking, and two depth classes: spirit (core values, drift-protected) and mechanics (operational knowledge, freely evolving).

Domain Regions

Semantic clusters with centroids, radius, and density. Regions organize concepts spatially. Transformation rules define typed edges between regions — CAUSES, ENABLES, PREVENTS, IMPLIES, and 26 more relation types.

Drift Detection

When a mutation would shift a protected attractor beyond its threshold, a DriftWarning fires for human review. The system won't silently forget what it fundamentally knows.

Budget-Aware Expansion

Context retrieval is token-budget-constrained with three resolution levels (L0/L1/L2). Not top-K nearest — priority-sorted, region-aware, and budget-packed.
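The difference from top-K retrieval can be sketched as a greedy packing problem: take concepts in priority order, but only while they fit the token budget. Structures and names here are illustrative; the real SCL packer is also region-aware and multi-resolution.

```python
from dataclasses import dataclass

@dataclass
class Concept:
    text: str
    priority: float   # e.g. salience x relevance; higher is better
    tokens: int       # cost at the chosen resolution level

def pack_context(candidates: list[Concept], token_budget: int) -> list[Concept]:
    """Greedy pack: take concepts in priority order while they fit,
    skipping oversized items instead of truncating them."""
    packed, remaining = [], token_budget
    for c in sorted(candidates, key=lambda c: -c.priority):
        if c.tokens <= remaining:
            packed.append(c)
            remaining -= c.tokens
    return packed
```

Unlike top-K, a high-priority but oversized concept is skipped in favor of smaller ones that still fit the remaining budget.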

Plexo Extension Protocol

Six extension types. One runtime.

PEX is a specification for packaging agent capabilities. Every extension declares its permissions, data access, and escalation contract in a plexo.json manifest. Install from the Hub or build your own.
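A manifest along these lines declares what the extension may touch (field names and values below are illustrative, not the actual PEX schema; read the spec for the real shape):

```json
{
  "name": "weather-tool",
  "type": "tool",
  "compliance": "standard",
  "permissions": ["network:read", "memory:read:transaction"],
  "escalation": {
    "irreversible": ["external-api-write"],
    "requiresApproval": true
  }
}
```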

Skill

Composite capability package — registers tools, schedules, widgets, and prompts.

Channel

Messaging bridge with onMessage and healthCheck. Telegram, Slack, Discord, or custom.

Tool

Stateless, single-purpose function. Called on demand by the agent or executor.

Connector

Bridges an external MCP server into the PEX sandbox. Translates tool definitions.

Agent

Autonomous actor with plan, executeStep, verifyStep, and escalation contract.

MCP Server

Model Context Protocol server — stdio or SSE transport, standard tool discovery.

Entity-scoped permissions (memory:read:transaction), three compliance levels (Core/Standard/Full), and mandatory escalation for irreversible actions. Read the spec →
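Entity-scoped permission strings like `memory:read:transaction` suggest segment-wise matching. A sketch under assumed semantics (`resource:action:entity`, with `*` as a wildcard segment; the actual PEX matching rules are in the spec):

```python
def permits(granted: str, requested: str) -> bool:
    """Segment-wise match of a granted scope against a requested one;
    '*' in the grant matches any segment at that position."""
    g, r = granted.split(":"), requested.split(":")
    if len(g) != len(r):
        return False
    return all(gs == "*" or gs == rs for gs, rs in zip(g, r))
```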

Why self-hosted

Your data never leaves

Task history, agent memory, conversation logs, and workspace state run entirely on your infrastructure. No telemetry phones home.

No model lock-in

Switch providers by changing a dropdown. Fallback chains ensure uptime even when a provider has an outage. Run local models via Ollama with zero external calls.

AGPL-3.0 open source

Every line of code is inspectable. Free forever for self-hosted use. Commercial licensing available for modified network services.

One-command deploy

Docker Compose. The install script generates secrets, writes your env file, and has you running in 60 seconds. No Kubernetes required.

Early adopters

Plexo is in public beta. Teams running production workloads on self-hosted Plexo include SaaS operators, dev agencies, and solo founders managing multi-service deployments.

Ready to deploy

One command. Your server. Full control over models, data, and cost.