MCP Is Overengineered, Skills Are Too Primitive

mcp skills nix architecture

There are two things broken in the agent tooling ecosystem right now and they’re almost opposites of each other.

MCP stdio takes a perfectly good CLI tool, wraps it in JSON-RPC, and adds a process boundary for no clear benefit. Meanwhile, skill systems like skills.sh and ClawHub give you a markdown file that says “use ffmpeg” but doesn’t actually install ffmpeg.

One is overengineered. The other is underengineered. Both miss the point.


MCP HTTP is genuinely good

I want to be clear about this: MCP over HTTP/SSE is a great protocol. It’s basically OpenAPI for agents.

One URL gives you auth discovery, a list of available tools, and execution endpoints. The agent doesn’t need to know how GitHub’s OAuth flow works or how to refresh a token. The MCP server handles that. The agent calls SearchRepositories and gets results.
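Concretely, the wire format is just JSON-RPC 2.0 with a couple of well-known methods (`tools/list` and `tools/call` are from the MCP spec; the tool name and query below are made up for illustration). A sketch of the two messages an agent sends:

```python
import json

def jsonrpc_request(req_id, method, params=None):
    """Frame an MCP call as a JSON-RPC 2.0 request body."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)

# Step 1: discover available tools.
list_tools = jsonrpc_request(1, "tools/list")

# Step 2: invoke one of them. The agent never touches OAuth;
# the MCP server (or a gateway in front of it) holds the token.
call_tool = jsonrpc_request(2, "tools/call", {
    "name": "SearchRepositories",  # hypothetical tool name
    "arguments": {"query": "language:rust stars:>1000"},
})

print(list_tools)
print(call_tool)
```

Everything else about the upstream API, including auth, lives behind that endpoint.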

This matters a lot for sandboxing. In Lobu, workers run in containers with no direct internet access. They can’t curl external APIs. They never see OAuth tokens. MCP servers sit behind the gateway proxy, which injects the right credentials per-user. The worker has no idea this is happening.
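The credential-injection step is simpler than it sounds. A minimal sketch, assuming an in-memory token table and a bearer-token header (a real gateway would pull from a secrets store and handle refresh):

```python
# Illustrative token table; not how Lobu actually stores secrets.
TOKENS = {"alice": "gho_alice_token", "bob": "gho_bob_token"}

def inject_credentials(user, worker_headers):
    """Return the headers for the proxied upstream request.
    The worker never sees the token; only the gateway-to-MCP
    leg of the request carries it."""
    headers = dict(worker_headers)  # don't mutate the worker's request
    headers["Authorization"] = f"Bearer {TOKENS[user]}"
    return headers

worker_request = {"Content-Type": "application/json"}  # what the sandbox sends
proxied = inject_credentials("alice", worker_request)
print(proxied["Authorization"])
```

The key property is that the token exists only on the gateway side of the boundary.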

For third-party APIs like GitHub, Google, Linear, Slack: MCP HTTP/SSE is the right abstraction. It keeps secrets out of the sandbox and gives agents a clean interface to external services.


MCP stdio is solving a problem that doesn’t exist

Then there’s MCP stdio.

The pitch: wrap a CLI tool in a JSON-RPC server. The agent sends a JSON message, the server parses it, shells out to the CLI, parses the output, wraps it back in JSON, and returns it. Want to run gh pr list? Don’t just run it. Instead, install a GitHub MCP stdio server, connect over stdin/stdout, and call list_pull_requests.

But every serious agent runtime already gives agents Bash access. The agent can just run the command directly.

# MCP stdio path
Agent → JSON-RPC → MCP Server → spawns `gh pr list` → parses output → JSON-RPC → Agent
# Direct path
Agent → Bash → `gh pr list` → done

You’re adding latency, a process boundary, a serialization layer, and a new failure mode. The CLI already has a perfectly good interface. It takes arguments and returns text. Agents are good at that.
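The layering is easy to see in code. A toy sketch, using `echo` as a stand-in for `gh` since the point is the wrapping, not the tool: the stdio path performs the exact same subprocess call, just with JSON serialization bolted onto both ends.

```python
import json
import subprocess

CMD = ["echo", "PR #42: fix login bug"]  # stand-in for `gh pr list`

# Direct path: the agent's Bash tool just runs the command.
direct = subprocess.run(CMD, capture_output=True, text=True).stdout.strip()

# MCP stdio path: the same subprocess call, wrapped in JSON-RPC
# parsing and re-serialization. Nothing is gained for a local CLI.
def stdio_wrapper(request_json):
    req = json.loads(request_json)  # parse the "RPC"
    out = subprocess.run(req["params"]["command"],
                         capture_output=True, text=True).stdout.strip()
    return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                       "result": {"content": out}})  # re-wrap as JSON

resp = stdio_wrapper(json.dumps(
    {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
     "params": {"command": CMD}}))
wrapped = json.loads(resp)["result"]["content"]

assert direct == wrapped  # identical output, three extra layers
```

Both paths end at the same subprocess; one of them just took a detour.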

The ecosystem seems to agree. The MCP servers that actually get adoption are HTTP/SSE servers for external services: Slack, GitHub APIs, databases, cloud storage. The stdio servers wrapping CLI tools are mostly proof-of-concepts.

Our approach in Lobu: install CLI tools directly via Nix, let the agent call them with Bash. Reserve MCP for external services where you actually need credential proxying.

# No MCP stdio server needed
nixConfig:
  packages: [git, gh]
# Agent just runs: gh pr list --repo owner/repo

Over 100,000 packages on Nixpkgs. Every CLI tool you can think of. Installed declaratively, cached across container restarts, no Docker rebuild.


Skills are just prompt text

Now the other half.

Look at skills.sh. It’s a directory of SKILL.md files you can install into Claude Code, Cursor, Copilot, and other agents. You search, find a skill, run npx skills add owner/repo, and it drops a markdown file into your project. ClawHub does the same for the OpenClaw ecosystem. The discovery experience is genuinely good.

But what is a skill in these systems? A markdown file with instructions. “Use ffmpeg to process video.” “Run ripgrep for code search.” “Use the GitHub API to manage pull requests.”

That’s it. The skill says “use ffmpeg” but doesn’t install ffmpeg. It says “call the Cloudflare API” but doesn’t declare what domains the sandbox should allow. It gives the agent instructions but no guardrails and no dependencies.

It’s a package.json with no dependencies field. Just a README that says “make sure you have Node 18 installed.”

You can’t hand someone a skill and guarantee it works the same way. You can’t audit what it actually needs. You can’t enforce least-privilege. The skill is text. The rest is left to whoever sets up the environment.


What we built: skills as manifests

A Lobu skill is a SKILL.md with YAML frontmatter that declares everything the agent needs. The runtime reads the manifest and provisions the environment before the agent starts.

Here’s a video processing skill:

---
name: Video Processor
description: Process and transcode video files with metadata extraction
nixConfig:
  packages: [ffmpeg, jq, mediainfo]
mcpServers:
  cloud-storage:
    url: https://storage-mcp.example.com/mcp
    type: sse
networkConfig:
  allowedDomains:
    - storage.googleapis.com
    - api.cloudflare.com
toolsConfig:
  allowedTools: [Read, Write, Bash]
  deniedTools: [DeleteFile]
---
# Video Processor
You process video files using ffmpeg. For every video task:
1. Use `mediainfo` to inspect the input file first
2. Use `ffmpeg` for transcoding, trimming, and format conversion
3. Use `jq` to parse and transform metadata JSON
4. Upload results via the cloud-storage MCP server
Never delete source files.

nixConfig.packages installs ffmpeg, jq, and mediainfo via Nix. networkConfig.allowedDomains configures the sandbox to only allow those two domains. mcpServers registers a cloud storage MCP server behind the gateway proxy with per-user credential injection. toolsConfig lets the agent read, write, and use Bash, but blocks file deletion. The markdown body gets injected into the agent’s system prompt.

When a user enables this skill, the gateway resolves every section. The worker entrypoint runs nix-shell -p ffmpeg jq mediainfo --command "bun run src/index.ts". All three tools land on $PATH. The network sandbox blocks everything except the two declared domains. The MCP server is available. The agent starts and has no idea any of this happened.
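The resolution step is mechanical. A toy sketch of how a runtime might turn the manifest into that entrypoint (hand-rolled frontmatter parsing for brevity; a real implementation would use a YAML library, and the SKILL.md content here is abbreviated):

```python
import re

SKILL_MD = """---
name: Video Processor
nixConfig:
  packages: [ffmpeg, jq, mediainfo]
networkConfig:
  allowedDomains:
    - storage.googleapis.com
    - api.cloudflare.com
---
# Video Processor
...instructions for the agent...
"""

def split_frontmatter(text):
    """Separate the YAML frontmatter from the markdown body."""
    _, frontmatter, body = text.split("---", 2)
    return frontmatter, body

def nix_shell_command(frontmatter, entrypoint):
    """Build the worker entrypoint from the manifest's package list."""
    pkgs = re.search(r"packages:\s*\[([^\]]*)\]", frontmatter).group(1)
    pkg_list = " ".join(p.strip() for p in pkgs.split(","))
    return f'nix-shell -p {pkg_list} --command "{entrypoint}"'

fm, body = split_frontmatter(SKILL_MD)
print(nix_shell_command(fm, "bun run src/index.ts"))
# → nix-shell -p ffmpeg jq mediainfo --command "bun run src/index.ts"
# The markdown body goes into the agent's system prompt.
```

The other sections (networkConfig, mcpServers, toolsConfig) resolve the same way: read a declaration, configure a subsystem, before the agent ever runs.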

Same SKILL.md. Same environment. Every time.

Nix persistence

First boot pays the install cost. After that packages are cached on the workspace volume (Docker) or PVC (Kubernetes). Survives container restarts and scale-to-zero. For complex environments you can point at a Nix flake instead:

nixConfig:
  flakeUrl: "github:user/my-agent-env"

Fully reproducible, pinned dependencies, custom derivations if you need them.
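A flake like the one flakeUrl points at could look roughly like this (a minimal sketch; the nixpkgs pin and package set are illustrative):

```nix
{
  description = "Pinned agent environment (illustrative)";

  # Pin nixpkgs so every worker resolves identical package versions.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

  outputs = { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      devShells.x86_64-linux.default = pkgs.mkShell {
        packages = [ pkgs.ffmpeg pkgs.jq pkgs.mediainfo ];
      };
    };
}
```

Because the input is pinned, two workers booting a month apart still get byte-identical toolchains.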

The CLI

$ lobu skills search "video"
video-processor Process and transcode video files [integration]
$ lobu skills add video-processor
Added "video-processor" to lobu.toml

Next time the agent starts it has ffmpeg, jq, mediainfo, cloud storage access, and the right sandbox rules. No manual setup.


MCP is great when you need a trust boundary between the agent and an external service. It’s pointless when the agent already has a shell and you just need to install a CLI tool.

Skills are great when they’re manifests that declare what an agent needs. They’re useless when they’re prompt text hoping the environment has the right tools installed.