Metacoder
A unified interface for command-line AI coding assistants (claude code, gemini-cli, codex, goose, qwen-coder)
# Use default coder
metacoder "Write a Python function to calculate fibonacci numbers" -w my-scripts/
...
# List available coders
metacoder list-coders
Available coders:
✅ goose
✅ claude
✅ codex
✅ gemini
✅ qwen
✅ dummy
# With a specific coder
metacoder "Write a Python function to calculate fibonacci numbers" -c claude -w my-scripts/
...
# With custom instructions
metacoder "Refactor this code" -c claude --instructions coding_guidelines.md
...
# Using MCPs
metacoder "Fix issue 1234" -w path/to/my-repo --mcp-collection github_mcps.yaml
...
# Using coders for scientific QA, with a literature search MCP
metacoder "what diseases are associated with ITPR1 mutations" --mcp-collection lit_search_mcps.yaml
...
Why Metacoder?
Each AI coding assistant has its own:
- Configuration format
- Command-line interface
- Working directory setup
- Means of configuring MCPs
Metacoder provides a single interface to multiple AI assistants. This makes it easier to:
- switch between agent tools in GitHub Actions pipelines
- perform matrixed evaluation of different agents and/or MCPs on different tasks
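As a rough illustration of the matrixed use case, the sketch below builds one metacoder command line per (coder, MCP collection) pair. It only uses flags that appear in the examples above (-c, -w, --mcp-collection); combining all three in a single invocation is an assumption, and build_matrix is a hypothetical helper, not part of Metacoder.

```python
import itertools
import shlex


def build_matrix(prompt, coders, mcp_collections, workdir):
    """Return one metacoder argv list per (coder, MCP collection) pair.

    Hypothetical helper for illustration; it only assembles arguments
    and does not run anything.
    """
    commands = []
    for coder, collection in itertools.product(coders, mcp_collections):
        commands.append([
            "metacoder", prompt,
            "-c", coder,                     # which coder to use
            "-w", workdir,                   # working directory
            "--mcp-collection", collection,  # MCP collection file
        ])
    return commands


commands = build_matrix(
    prompt="Fix issue 1234",
    coders=["claude", "goose"],
    mcp_collections=["github_mcps.yaml"],
    workdir="path/to/my-repo",
)

# Print shell-quoted command lines so they can be reviewed before running.
for argv in commands:
    print(shlex.join(argv))
```

Each argv list can be passed to subprocess.run once the printed commands look right. For real comparisons, the evaluation framework described later in this README is likely the better fit.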
One of the main use cases for Metacoder is evaluating semantic coding agents; see:
Mungall, C. (2025, July 22). Open Knowledge Bases in the Age of Generative AI (BOSC/BOKR Keynote) (abridged version). Intelligent Systems for Molecular Biology 2025 (ISMB/ECCB2025), Liverpool, UK. Zenodo. https://doi.org/10.5281/zenodo.16461373
Mungall, C. (2025, May 28). How to make your KG interoperable: Ontologies and Semantic Standards. NIH Workshop on Knowledge Networks, Rockville. Zenodo. https://doi.org/10.5281/zenodo.15554695
Features
- Unified CLI for all supported coders
- Consistent configuration format (YAML-based)
- Custom instructions support via the --instructions flag
- Unified MCP configuration
- Standardized working directory management
MCPs
Using the built-in MCP registry
metacoder run -r metacoder.scilit -e artl "summarize PMID:28027860"
🤖 Using coder: goose
📁 Working directory: ./workdir
📚 Loading MCPs from registry: metacoder.scilit
Registry MCPs: pdfreader, artl, biomcp, simple-pubmed
✅ MCP: artl
🚀 Running prompt: summarize PMID:28027860
==================================================
📊 RESULTS
==================================================
📝 Result:
The research paper titled "From nocturnal frontal lobe epilepsy to Sleep-Related Hypermotor Epilepsy: A 35-year diagnostic challenge", coded under the PubMed ID 28027860, is authored by Paolo Tinuper and Francesca Bisulli from the IRCCS Institute of Neurological Sciences, Bologna, Italy.
This paper discusses the diagnosis of nocturnal frontal lobe epilepsy (NFLE), a focal epilepsy with seizures primarily during sleep. Initially described as a motor disorder of sleep named nocturnal paroxysmal dystonia (NPD), clinicians have found it challenging to distinguish NPD from other non-epileptic nocturnal paroxysmal events like parasomnias due to unusual seizure semiology, onset during sleep, and often uninformative EEG and MRI.
In 1990, the epileptic origin of the attacks was established, leading to the introduction of the term NFLE. The diagnostic difficulties persisted, prompting a Consensus Conference in Bologna, Italy in 2014 to establish criteria. Key points of consensus elucidated the association of the seizures with sleep (not the circadian pattern), and the possible extrafrontal origin of the seizures.
The consensus meeting led to renaming the syndrome as Sleep-Related Hypermotor Epilepsy (SHE). The keywords associated with this paper include Epilepsy, Parasomnias, Nocturnal Frontal Lobe Epilepsy, Video-polysomnography, and Seizures During Sleep. The paper was published in 2017 in the journal "Seizure", and as of the last update, it was cited 34 times.
[Link to the paper](https://doi.org/10.1016/j.seizure.2016.11.023) (Subscription required)
📋 Tool uses:
✅ artl__get_europepmc_paper_by_id with arguments: {'identifier': '28027860'}
Evaluation Framework
Metacoder includes a comprehensive evaluation framework for systematically testing and comparing AI coders, MCPs, and models.
Example evaluation configuration:
name: pubmed tools evals
description: Testing coders with PubMed MCP integration
coders:
  claude: {}
  goose: {}
models:
  gpt-4o:
    provider: openai
    name: gpt-4
servers:
  pubmed:
    name: pubmed
    command: uvx
    args: [mcp-simple-pubmed]
    env:
      PUBMED_EMAIL: user@example.com
cases:
  - name: "title"
    metrics: [CorrectnessMetric]
    input: "What is the title of PMID:28027860?"
    expected_output: "From nocturnal frontal lobe epilepsy to Sleep-Related Hypermotor Epilepsy: A 35-year diagnostic challenge"
    threshold: 0.9
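To make the shape of such a configuration concrete, the sketch below mirrors part of the example as a plain Python dict and expands it into one run per coder, model, case, and metric. This is not Metacoder's implementation: expand_runs and passes are hypothetical helpers, and the rule that a case passes when its score reaches the threshold is an assumption about what threshold means.

```python
from itertools import product

# Partial mirror of the example configuration as a plain dict
# (in practice this would be loaded from the YAML file).
config = {
    "coders": {"claude": {}, "goose": {}},
    "models": {"gpt-4o": {"provider": "openai", "name": "gpt-4"}},
    "cases": [
        {
            "name": "title",
            "metrics": ["CorrectnessMetric"],
            "input": "What is the title of PMID:28027860?",
            "threshold": 0.9,
        }
    ],
}


def expand_runs(cfg):
    """Yield one (coder, model, case name, metric) tuple per run."""
    for coder, model, case in product(cfg["coders"], cfg["models"], cfg["cases"]):
        for metric in case["metrics"]:
            yield coder, model, case["name"], metric


def passes(score, threshold):
    """Assumed rule: a case passes when its score reaches the threshold."""
    return score >= threshold


runs = list(expand_runs(config))
for run in runs:
    print(run)
```

With two coders, one model, and one case with a single metric, this expands to two runs. Adding a coder, model, or case multiplies the matrix accordingly.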
Getting Started
- Installation and Setup
- Supported Coders
- Configuration Guide
- MCP Support - Extend your AI coders with additional tools
- Evaluations - Test and compare AI coders