GitHub - varunreddy/SkillMesh: A retrieval-gated skill architecture for LLM agents that scales to hundreds of tools by exposing only the top-K relevant capabilities per request.

Summary

Main Summary

SkillMesh is positioned as a transformative solution for efficiently managing large tool catalogs in LLM agents. Its main function is to act as an intelligent retrieval router, avoiding the inefficient practice of loading hundreds of tools into every prompt. Instead, SkillMesh dynamically and precisely selects the few cards…

Content

License: MIT | Python 3.10+

Stop stuffing hundreds of tools into your LLM prompt. Route to the right ones.

SkillMesh is a retrieval router for agent tool catalogs. Instead of loading every skill/tool into every prompt, it selects the best few cards for the query and injects only those.

Why Teams Adopt SkillMesh

  • Keeps prompts small as your catalog grows (top-K instead of full dump)
  • Improves tool selection quality on multi-domain tasks
  • Cuts token cost per call by avoiding irrelevant tool context
  • Works with Claude (MCP), Codex (skill bundle), and local CLI workflows
  • Standardized OpenAI-style function schemas for tool expansion

The Problem

LLM agents break when you load every tool into the prompt. Token counts explode, accuracy drops, and cost scales linearly with your catalog size. Teams with 50+ skills end up with bloated system prompts that confuse the model and burn budget.

SkillMesh solves this with retrieval-based routing: given a user query, it selects only the top-K most relevant expert cards and injects them into the prompt — keeping context small, accurate, and cheap.
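
To make the sparse half of this routing concrete, here is a minimal sketch of BM25-style scoring over card descriptions. The card descriptions, the whitespace tokenizer, and the `k1`/`b` defaults are illustrative assumptions; this is not SkillMesh's actual retriever.

```python
import math
from collections import Counter

# Hypothetical mini-catalog: card id -> short description (illustrative only).
CARDS = {
    "data.data-cleaning": "detect and correct data quality issues in messy tables",
    "ml.sklearn-modeling": "train and evaluate baseline models with scikit-learn",
    "devops.nginx": "configure nginx reverse proxy with ssl certificates",
}


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()


def bm25_top_k(query, cards, top_k=2, k1=1.5, b=0.75):
    """Score every card against the query with Okapi BM25 and return the top-K ids."""
    docs = {cid: tokenize(text) for cid, text in cards.items()}
    n_docs = len(docs)
    avg_len = sum(len(toks) for toks in docs.values()) / n_docs
    # Document frequency: number of cards that contain each term.
    df = Counter(term for toks in docs.values() for term in set(toks))

    scores = {}
    for cid, toks in docs.items():
        tf = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[cid] = score
    # Highest score first; only the first top_k ids are kept.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


print(bm25_top_k("set up nginx reverse proxy with SSL", CARDS, top_k=1))
```

Only the ids returned here would have their full card text injected into the prompt; every other card stays out of context.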

High-Value Use Cases

  • Internal AI assistants with large tool/skill catalogs (50+ cards)
  • Multi-step workflows crossing domains (data -> ML -> infra -> reporting)
  • Teams using MCP where tool overload hurts selection quality
  • Role-based execution flows (Data-Analyst, Financial-Analyst, AWS-Engineer)

SkillMesh vs Static Skill Docs

| | Static SKILL.md only | SkillMesh routing |
|---|---|---|
| Prompt strategy | Load broad instructions every turn | Inject only relevant top-K cards |
| Scale behavior | Gets noisy as catalog grows | Remains focused with retrieval |
| Multi-domain tasks | Manual tool prompting | Query-driven cross-domain routing |
| Expansion | Add docs and hope model picks right one | Add cards + retrieval handles selection |

Before vs After

| | Without SkillMesh | With SkillMesh |
|---|---|---|
| Prompt tokens | ~50,000+ (all tools loaded) | ~3,000 (top-K only) |
| Tool selection | Model guesses from a huge list | BM25+Dense retrieval picks the best match |
| Cost per call | High (full catalog every time) | Low (only relevant cards) |
| Accuracy | Degrades as catalog grows | Stays consistent |
| Multi-domain tasks | Confusing for the model | Routed precisely (clean + train + deploy) |
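
As a rough worked example, the prompt-token figures in the Before vs After comparison translate into per-call savings as follows. The token counts are the approximate values quoted there; the price per million input tokens is a hypothetical placeholder, so substitute your provider's actual rate.

```python
FULL_CATALOG_TOKENS = 50_000  # "~50,000+" (all tools loaded)
TOP_K_TOKENS = 3_000          # "~3,000" (top-K only)
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # hypothetical USD price, for illustration only


def prompt_cost(tokens, price_per_million=PRICE_PER_MILLION_INPUT_TOKENS):
    """Input-token cost in USD of a single call."""
    return tokens / 1_000_000 * price_per_million


reduction = 1 - TOP_K_TOKENS / FULL_CATALOG_TOKENS
ratio = FULL_CATALOG_TOKENS / TOP_K_TOKENS

print(f"Prompt tokens reduced by {reduction:.0%} ({ratio:.1f}x smaller)")
print(f"Cost per call: ${prompt_cost(FULL_CATALOG_TOKENS):.3f} vs ${prompt_cost(TOP_K_TOKENS):.3f}")
```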

How It Works

User Query
    │
    ▼
┌─────────────────────┐
│  BM25 + Dense Index  │  ← Scores every card in your registry
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│   RRF Fusion Rank    │  ← Merges sparse + dense rankings
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│   Top-K Card Select  │  ← Returns the K best expert cards
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│  Agent acts as expert │  ← Full instructions injected into prompt
└─────────────────────┘

Each card contains: execution behavior, decision trees, anti-patterns, output contracts, and composability hints — everything the agent needs to act as a domain expert.
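
The fusion step in the diagram can be sketched with standard Reciprocal Rank Fusion (RRF): each card earns `1 / (k + rank)` from every ranking it appears in, and the sums decide the final order. The constant `k=60` is the value commonly used for RRF and is an assumption here, as are the two example rankings; SkillMesh's own parameters may differ.

```python
def rrf_fuse(rankings, k=60, top_k=3):
    """Fuse several ranked lists of card ids with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, card_id in enumerate(ranking, start=1):
            scores[card_id] = scores.get(card_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ranked[:top_k]


# Hypothetical rankings from the sparse (BM25) and dense retrievers.
sparse = ["data.data-cleaning", "ml.sklearn-modeling", "viz.matplotlib-seaborn"]
dense = ["ml.sklearn-modeling", "viz.matplotlib-seaborn", "data.data-cleaning"]

print(rrf_fuse([sparse, dense], top_k=2))
```

RRF uses only rank positions, so the BM25 and dense scores never need to be put on a common scale.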

One-line MCP install (Claude Desktop / Claude Code)

Add this to your Claude Desktop config (claude_desktop_config.json) or Claude Code MCP settings:

{
  "mcpServers": {
    "skillmesh": {
      "command": "uvx",
      "args": ["--from", "skillmesh[mcp]", "skillmesh-mcp"]
    }
  }
}

No env vars. No file paths. No cloning. The bundled registry is included in the package.

Requires uv to be installed.
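
If your config file already lists other MCP servers, the snippet has to be merged into the existing `mcpServers` object rather than pasted over it. The helper below is a small sketch of that merge on an in-memory dict; the `other-server` entry is a made-up placeholder, and reading or writing the actual config file is left out.

```python
import json

# The server entry from the config snippet above.
SKILLMESH_SERVER = {
    "command": "uvx",
    "args": ["--from", "skillmesh[mcp]", "skillmesh-mcp"],
}


def add_skillmesh_server(config):
    """Return a copy of a Claude config dict with the skillmesh server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["skillmesh"] = SKILLMESH_SERVER
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other-cmd", "args": []}}}
print(json.dumps(add_skillmesh_server(existing), indent=2))
```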

60-Second Demo

git clone https://github.com/varunreddy/SkillMesh.git
cd SkillMesh
pip install -e .
skillmesh emit \
  --provider claude \
  --registry examples/registry/tools.json \
  --query "clean messy sales data, train a baseline model, and generate charts" \
  --top-k 5

Output (truncated):

<context>
  <card id="data.data-cleaning" title="Data Cleaning and Validation Expert">
    # Data Cleaning and Validation Expert
    Specialist in detecting and correcting data quality issues...
  </card>
  <card id="ml.sklearn-modeling" title="Scikit-learn Modeling and Evaluation">
    ...
  </card>
  <card id="viz.matplotlib-seaborn" title="Visualization with Matplotlib and Seaborn">
    ...
  </card>
</context>

Only the relevant experts are injected — the rest of the 100+ card catalog stays out of the prompt.
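
To check which experts were routed for a query, you can pull the card ids and titles out of the emitted block. This sketch assumes the `<card id="..." title="...">` attribute layout shown in the truncated output above; it uses a regular expression instead of an XML parser because the card bodies contain free-form Markdown.

```python
import re

CARD_TAG = re.compile(r'<card id="([^"]+)" title="([^"]+)">')

# Sample copied from the truncated demo output.
SAMPLE = """<context>
  <card id="data.data-cleaning" title="Data Cleaning and Validation Expert">
    # Data Cleaning and Validation Expert
  </card>
  <card id="ml.sklearn-modeling" title="Scikit-learn Modeling and Evaluation">
    ...
  </card>
  <card id="viz.matplotlib-seaborn" title="Visualization with Matplotlib and Seaborn">
    ...
  </card>
</context>"""


def list_cards(context_block):
    """Return (id, title) pairs for every card tag in an emitted context block."""
    return CARD_TAG.findall(context_block)


for card_id, title in list_cards(SAMPLE):
    print(f"{card_id}: {title}")
```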

Integrations

| Platform | Method | Status | Docs |
|---|---|---|---|
| Claude Code | MCP server | Supported | Setup guide |
| Claude Desktop | MCP server | Supported | Setup guide |
| Codex | Skill bundle | Supported | Setup guide |

Claude MCP Server

The easiest way to run it is via uvx (see "One-line MCP install" above). For local development:

pip install -e .[mcp]
skillmesh-mcp

The server auto-discovers the registry: env var SKILLMESH_REGISTRY → repo root → bundled registry.
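
The discovery order can be pictured as a simple fallback chain. The function below is a hypothetical sketch of that order, not the server's real code: the function name, the parameters, and the choice of `examples/registry/tools.json` as the file looked up under the repo root are all assumptions.

```python
import os
from pathlib import Path


def resolve_registry(repo_root, bundled, env=None):
    """Pick a registry path: env var first, then repo root, then bundled copy."""
    env = os.environ if env is None else env
    override = env.get("SKILLMESH_REGISTRY")
    if override:
        return Path(override)
    # Assumed location of the catalog inside a cloned repository.
    candidate = Path(repo_root) / "examples" / "registry" / "tools.json"
    if candidate.exists():
        return candidate
    return Path(bundled)


# With no env var set and no repo checkout present, the bundled registry wins.
print(resolve_registry("/nonexistent-skillmesh-repo", "bundled/tools.json", env={}))
```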

The server exposes five tools via MCP:

  • route_with_skillmesh(query, top_k) — provider-formatted context block
  • retrieve_skillmesh_cards(query, top_k) — structured JSON payload
  • list_skillmesh_roles(catalog?, registry?) — full role list with installed status
  • list_installed_skillmesh_roles(catalog?, registry?) — installed roles only
  • install_skillmesh_role(role, catalog?, registry?, dry_run?) — install by id or friendly name (for example Data-Analyst)

Copy-ready config templates are in examples/mcp/.

Codex Skill Bundle

$skill-installer install https://github.com/varunreddy/SkillMesh/tree/main/skills/skillmesh

Direct role commands in SkillMesh:

skillmesh roles
skillmesh roles list
skillmesh Data-Analyst install
skillmesh roles install Data-Analyst

Or via installed bundle wrapper:

~/.codex/skills/skillmesh/scripts/roles.sh
~/.codex/skills/skillmesh/scripts/roles.sh list
~/.codex/skills/skillmesh/scripts/roles.sh install --role-id role.data-engineer

Quickstart

Install

python -m venv .venv && source .venv/bin/activate
pip install -e .[dev]

Optional extras:

pip install -e .[dense]   # Dense reranking with sentence-transformers
pip install -e .[mcp]     # Claude MCP server

Retrieve top-K cards

skillmesh retrieve \
  --registry examples/registry/tools.json \
  --query "set up nginx reverse proxy with SSL" \
  --top-k 3

Emit provider-ready context

skillmesh emit \
  --provider claude \
  --registry examples/registry/tools.json \
  --query "deploy container to GCP Cloud Run" \
  --top-k 5
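
When calling the CLI from a script, building the argument list programmatically avoids shell-quoting problems with free-text queries. The helper name below is hypothetical; the flags are the ones used in the command above. The sketch only prints the command, so it runs even where SkillMesh is not installed.

```python
import shlex


def build_emit_command(query, registry, provider="claude", top_k=5):
    """Build the argv list for `skillmesh emit` with the flags shown above."""
    return [
        "skillmesh", "emit",
        "--provider", provider,
        "--registry", registry,
        "--query", query,
        "--top-k", str(top_k),
    ]


cmd = build_emit_command(
    "deploy container to GCP Cloud Run",
    "examples/registry/tools.json",
)
print(shlex.join(cmd))
# To execute it, pass the list to subprocess.run, for example:
#   subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```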

Role Quickstart

List available role cards:

skillmesh roles list --catalog examples/registry/tools.json

Install a role by friendly name (adds missing dependencies):

skillmesh roles install Data-Analyst \
  --catalog examples/registry/tools.json \
  --registry ~/.codex/skills/skillmesh/installed.registry.yaml

Dry-run an install to preview what will be added:

skillmesh roles install AWS-Engineer \
  --catalog examples/registry/tools.json \
  --registry ~/.codex/skills/skillmesh/installed.registry.yaml \
  --dry-run

MCP equivalent (tool call):

install_skillmesh_role(role="Data-Analyst", catalog="examples/registry/tools.json", dry_run=false)
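
Conceptually, installing a role adds the role card plus whichever of its dependency cards are missing from the target registry, and a dry run reports the same list without changing anything. The sketch below illustrates that idea only; the `depends_on` field, the `role.data-analyst` id, and the in-memory data structures are hypothetical and do not reflect SkillMesh's real registry schema.

```python
# Hypothetical catalog: card id -> metadata with a made-up "depends_on" field.
CATALOG = {
    "role.data-analyst": {"depends_on": ["data.data-cleaning", "viz.matplotlib-seaborn"]},
    "data.data-cleaning": {},
    "viz.matplotlib-seaborn": {},
}


def plan_role_install(role_id, catalog, installed, dry_run=False):
    """Return the card ids a role install would add; apply them unless dry_run."""
    wanted = [role_id, *catalog[role_id].get("depends_on", [])]
    to_add = [cid for cid in wanted if cid not in installed]
    if not dry_run:
        installed.update(to_add)
    return to_add


installed = {"data.data-cleaning"}  # already present in the target registry
print(plan_role_install("role.data-analyst", CATALOG, installed, dry_run=True))
print(sorted(installed))  # unchanged after a dry run
```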

Curated Registries

Use domain-specific registries for tighter routing:

| Registry | Domain | Cards |
|---|---|---|
| tools.json / tools.yaml | Full catalog | 154 |
| ml-engineering.registry.yaml | ML training & evaluation | 33 |
| data-engineering.registry.yaml | Pipelines & data platforms | 14 |
| bi-analytics.registry.yaml | BI & dashboards | 21 |
| devops.registry.yaml | DevOps & infrastructure | 18 |
| web-apis.registry.yaml | API design & patterns | 11 |
| cloud-gcp.registry.yaml | Google Cloud Platform | 13 |
| cloud-bi.registry.yaml | Cloud BI | 17 |
| roles.registry.yaml | Role orchestrators | 11 |

skillmesh emit \
  --provider claude \
  --registry examples/registry/devops.registry.yaml \
  --query "configure prometheus alerting and grafana dashboards" \
  --top-k 3

Benchmarking

Use the reproducible benchmark template:

CLI Commands

| Command | Description |
|---|---|
| skillmesh retrieve | Top-K retrieval payload (JSON) |
| skillmesh fetch | Alias for retrieve (supports free-text query shorthand) |
| skillmesh emit | Provider-formatted context block |
| skillmesh index | Index registry into Chroma for persistent retrieval |
| skillmesh roles wizard | Interactive role picker and installer |
| skillmesh roles list | List available role cards from a catalog |
| skillmesh roles install | Install role card + missing dependency cards into target registry |
| skillmesh role | Alias for roles |
| skillmesh-mcp | Stdio MCP server for Claude |

Payloads from skillmesh retrieve and from the MCP tools include, for every card, an invocation in OpenAI function-tool format.
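
For readers unfamiliar with that format, an OpenAI-style function tool is a JSON object with `"type": "function"` and a `function` entry holding a name, a description, and a JSON Schema for the parameters. The sketch below builds one such object; the helper, the tool name, and the single `task` parameter are illustrative assumptions, since the exact fields SkillMesh emits per card are not shown here.

```python
import json


def to_function_tool(name, description, properties):
    """Wrap a name, description, and parameter schema in the function-tool shape."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(properties),
            },
        },
    }


invocation = to_function_tool(
    name="data_data_cleaning",  # hypothetical name derived from a card id
    description="Data Cleaning and Validation Expert",
    properties={"task": {"type": "string", "description": "What the expert should do"}},
)
print(json.dumps(invocation, indent=2))
```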

Repository Layout

src/skill_registry_rag/
├── models.py          # Tool/role card models
├── registry.py        # Registry loading + validation
├── retriever.py       # BM25 + optional dense retrieval
├── adapters/          # Provider formatters (codex, claude)
└── cli.py             # skillmesh CLI

examples/registry/
├── tools.json         # Full tool catalog
├── tools.yaml         # YAML version of full catalog
├── instructions/      # Expert instruction files (90+)
├── roles/             # Role orchestrator files
└── *.registry.yaml    # Domain-specific registries

skills/skillmesh/      # Codex-installable skill

Contributing

See CONTRIBUTING.md for how to add expert cards, create registries, and submit PRs.

Troubleshooting

skillmesh: command not found

Missing registry path

The CLI and MCP server auto-discover the registry. If auto-discovery fails, pass --registry or set:

export SKILLMESH_REGISTRY=/path/to/tools.json
# or pass --registry on every command

skillmesh-mcp fails to start

Codex does not detect new skill

Restart Codex after running $skill-installer.

Development

ruff check src tests
pytest

License

MIT — see LICENSE.


If SkillMesh helps your team, please star the repo — it directly improves discoverability and helps others find the project.

Source: GitHub