Category: Agent Skills

  • Your Agent Has Too Many Skills (and It’s Making It Worse)

    The repos winning the agent performance game aren't adding more skills — they're tuning how skills load, compose, and stay out of each other's way.

    We've been watching the skill ecosystem explode. 5,400+ skills in the OpenClaw registry. 1,370+ in Antigravity's installable library. Curated lists with tens of thousands of stars. But the fastest-growing repo in the space right now — everything-claude-code at 145k stars — isn't a skill collection at all. It's a performance optimization system. And that shift tells you where agent development is actually heading.

    The skill bloat problem

    Here's what happens when you install 40 skills into an agent harness that loads them eagerly: every single one gets parsed into the system prompt or loaded into context on every invocation. Your agent spends tokens reading skill definitions before it even starts thinking about your actual task.

    Worse, skills conflict. A code-review skill that says 'always suggest tests' fights with a rapid-prototype skill that says 'skip tests, ship fast.' Your agent doesn't resolve this gracefully — it hallucinates a middle ground that satisfies neither.

    The everything-claude-code repo treats this as a systems problem, not a content problem. Instead of asking 'what skills should my agent have,' it asks 'how should my agent load context so it performs well?'

    Lazy loading and skill scoping

    The pattern emerging across the top harness repos is straightforward: don't load skills until they're needed, and scope them tightly when you do.

    In Claude Code's native architecture, this is already built in. Skills are invoked by slash command or trigger pattern — they aren't dumped into the system prompt wholesale. But most custom setups ignore this and stuff everything into CLAUDE.md or the root prompt.

    The fix is a two-layer approach. Layer one: a lightweight skill index that tells the agent what's available, with one-line descriptions. Layer two: full skill definitions that only load when the agent matches a trigger. This is exactly how OpenViking (21k stars) handles it — they call it 'hierarchical context delivery,' but it's really just lazy loading for agent instructions.
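
    To put rough numbers on the two layers, here is a back-of-envelope sketch in Python. The skill sizes, the 4-characters-per-token rule of thumb, and the function names are all assumptions for illustration, not measurements from any of the repos mentioned.

```python
# Rough per-invocation context cost: eager loading vs. index plus on-demand loading.
# Assumption: ~4 characters per token, a rule of thumb rather than a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN


def eager_cost(skill_bodies: list[str]) -> int:
    """Every full skill body is loaded on every invocation."""
    return sum(estimate_tokens(body) for body in skill_bodies)


def indexed_cost(index_lines: list[str], loaded_bodies: list[str]) -> int:
    """Layer one (one-line index) plus only the skills that were triggered."""
    index = sum(estimate_tokens(line) for line in index_lines)
    return index + sum(estimate_tokens(body) for body in loaded_bodies)


# Hypothetical harness: 40 skills of 2,000 characters each,
# each summarized by an 80-character index line.
skills = ["x" * 2000 for _ in range(40)]
index = ["y" * 80 for _ in range(40)]

print(eager_cost(skills))               # 20000
print(indexed_cost(index, skills[:1]))  # 1300
```

    Under these made-up sizes the index plus one triggered skill costs well under a tenth of the eager setup. Your own ratio depends entirely on how long your skill files are.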

    REGISTRY.md — A minimal skill registry for any agent harness

    # .skills/REGISTRY.md
    # Skill index — agent reads this, loads full skill on match
    
    ## Available Skills
    
    | Trigger | Skill | Path | When to load |
    |---------|-------|------|--------------|
    | /review | Code Review | .skills/review.md | User requests code review |
    | /test | Test Writer | .skills/test-writer.md | User asks for tests or TDD |
    | /deploy | Deploy Guard | .skills/deploy.md | Pre-push or deploy context |
    | /refactor | Refactor Coach | .skills/refactor.md | User asks to restructure code |
    | /security | Security Audit | .skills/security.md | Touching auth, input handling, or secrets |
    
    ## Loading Rules
    - Load at most 2 skills per task
    - If skills conflict, prefer the one matching the explicit trigger
    - Never load a skill that wasn't triggered or contextually matched
    - Always unload skills between tasks
    
    ## Skill Template
    Each skill file in `.skills/` follows this structure:
    
    ```
    ---
    name: Skill Name
    trigger: /command or context pattern
    conflicts_with: [other-skill-names]
    priority: 1-10
    ---
    
    <skill instructions here>
    ```
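
    The Loading Rules and the `conflicts_with` / `priority` fields above could be enforced with something like the following Python sketch. It is one possible reading, not how any particular harness does it: the `Skill` class, the `explicit` flag, and the choice that a higher `priority` number wins are assumptions, since the template does not say which direction the 1-10 scale runs.

```python
from dataclasses import dataclass, field

MAX_SKILLS_PER_TASK = 2  # "Load at most 2 skills per task"


@dataclass
class Skill:
    name: str
    priority: int = 5                  # assumption: higher number wins
    conflicts_with: set[str] = field(default_factory=set)
    explicit: bool = False             # True if matched by an explicit /trigger


def conflicts(a: Skill, b: Skill) -> bool:
    """Two skills conflict if either one names the other."""
    return b.name in a.conflicts_with or a.name in b.conflicts_with


def select_skills(matched: list[Skill]) -> list[Skill]:
    """Pick which already-matched skills to load for one task."""
    # Explicitly triggered skills rank first, then higher priority.
    ranked = sorted(matched, key=lambda s: (not s.explicit, -s.priority))
    chosen: list[Skill] = []
    for skill in ranked:
        if len(chosen) == MAX_SKILLS_PER_TASK:
            break
        if any(conflicts(skill, kept) for kept in chosen):
            continue  # an earlier-ranked skill already won this conflict
        chosen.append(skill)
    return chosen


# Hypothetical skills, echoing the review vs. rapid-prototype clash.
review = Skill("review", priority=7, conflicts_with={"prototype"})
prototype = Skill("prototype", priority=9, explicit=True)
security = Skill("security", priority=4)

print([s.name for s in select_skills([review, prototype, security])])
# ['prototype', 'security']
```

    Only skills that were triggered or contextually matched ever reach `select_skills`, which is how the sketch respects the "never load an unmatched skill" rule.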

    The skill index pattern

    Here's the concrete pattern you can steal today. Instead of pasting full skill files into your agent's root config, create a skill registry, like the REGISTRY.md above, that maps triggers to file paths. The agent reads the index, recognizes which skill applies, and loads only that skill's full definition.

    This keeps your base context small, your agent focused, and your skills composable without conflicts.
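
    If your harness is driven by a script rather than by the agent reading markdown directly, the same index-then-load flow might look like this minimal Python sketch. The function names are hypothetical, and matching only on the first word of the message is a simplifying assumption; the "When to load" column would need its own contextual matcher.

```python
from pathlib import Path
from typing import Optional
import tempfile


def parse_registry(text: str) -> dict[str, Path]:
    """Map each trigger in the registry table to its skill file path."""
    triggers: dict[str, Path] = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Keep only data rows: four cells with a trigger that starts with "/".
        if len(cells) == 4 and cells[0].startswith("/"):
            triggers[cells[0]] = Path(cells[2])
    return triggers


def load_skill(message: str, registry: dict[str, Path], root: Path) -> Optional[str]:
    """Return the full skill text only if the message starts with a known trigger."""
    words = message.split()
    if not words or words[0] not in registry:
        return None  # nothing triggered, so nothing is loaded
    return (root / registry[words[0]]).read_text(encoding="utf-8")


REGISTRY = """\
| Trigger | Skill | Path | When to load |
|---------|-------|------|--------------|
| /review | Code Review | .skills/review.md | User requests code review |
| /test | Test Writer | .skills/test-writer.md | User asks for tests or TDD |
"""

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".skills").mkdir()
    (root / ".skills" / "review.md").write_text("Review instructions", encoding="utf-8")

    registry = parse_registry(REGISTRY)
    print(sorted(registry))                                  # ['/review', '/test']
    print(load_skill("/review src/app.py", registry, root))  # Review instructions
    print(load_skill("fix the typo", registry, root))        # None
```

    The skill file is read from disk only at the moment its trigger appears, so the base context holds nothing beyond the index.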

    Why this matters now

    Context windows are getting bigger, but that doesn't mean you should fill them with instructions. Every token of skill definition is a token not spent on reasoning about your actual problem. The teams running production agent workflows — the ones building harness optimization systems instead of skill collections — figured this out months ago.

    The awesome-openclaw-skills repo (44k stars) recently added filtering and categorization for its 5,400+ skills. That's not because people want to browse — it's because people need to pick the right 5-10 skills, not install all 5,400.

    Antigravity's collection ships with bundles — pre-composed skill sets for common workflows. agent-skill-creator lets you build skills that declare their own trigger conditions. MagicSkills turns scattered SKILL.md files into composable, tool-ready capabilities. The entire ecosystem is converging on the same insight: less loaded context, better agent performance.

    The skill ecosystem grew fast. Now it's optimizing. If your agent setup still loads every skill on every run, you're leaving performance on the table, and probably confusing your agent in the process. Pick your five best skills. Build a registry. Let the agent load what it needs, when it needs it. That's the harness engineering pattern that separates agents that feel smart from agents that actually are.

    Tomorrow: Skill security is becoming a real concern. We look at clawsec and what drift detection means for your SOUL.md.

  • Stop Hand-Coding Big Tasks, Spawn a Coding Agent Instead

    If a task is big enough to need exploration, iteration, or parallel work, you should stop hand-coding it in the main session. This skill gives your agent a clean rule set for when to delegate coding work, how to launch the right coding agent, and how to keep the human informed while it runs. The real value is not just writing code faster; it is avoiding the usual failures: wrong directory, wrong execution mode, and zero visibility once background work starts.

    Save this as `SKILL.md` in your skills folder, then swap `/path/to/project` and the agent command for your actual setup before using it.

    ---
    name: coding-agent
    description: Delegate substantial coding work to a dedicated coding agent, and monitor it through exec/process instead of hand-coding everything in the main session.
    ---
    
    # Coding Agent
    
    Use this skill when the task is large enough to benefit from a dedicated coding agent instead of direct inline edits.
    
    ## Use this for
    - building new features or apps
    - reviewing pull requests in an isolated checkout
    - refactoring large codebases
    - multi-step coding work that needs file exploration and iteration
    
    ## Do not use this for
    - tiny one-line fixes
    - simple file reads
    - thread-bound ACP harness requests in chat
    - any coding agent run inside `~/.openclaw`
    
    ## Execution rules
    - Prefer a dedicated coding agent over manual patching for large coding tasks.
    - For Codex, OpenCode, and Pi, use PTY mode when launching through `exec`.
    - For Claude Code, use `claude --permission-mode bypassPermissions --print` without PTY.
    - Always set a focused `workdir` so the agent stays inside the right project.
    - For long-running work, launch once with `background:true`, then monitor with `process`.
    - Keep the user updated when work starts, when something important changes, and when it finishes.
    
    ## Safe patterns
    
    ### Codex
    ```text
    exec command:"codex exec --full-auto 'Implement the requested feature'" workdir:/path/to/project pty:true
    ```
    
    ### Claude Code
    ```text
    exec command:"claude --permission-mode bypassPermissions --print 'Implement the requested feature'" workdir:/path/to/project
    ```
    
    ### Background run + monitoring
    ```text
    exec command:"codex exec --full-auto 'Refactor the auth flow'" workdir:/path/to/project pty:true background:true
    process action:log sessionId:<session-id>
    process action:poll sessionId:<session-id>
    process action:submit sessionId:<session-id> data:"yes"
    ```
    
    ### Safe PR review in temp checkout
    ```text
    exec command:"codex review --base origin/main" workdir:/tmp/repo-review pty:true
    ```
    
    ## Guardrails
    - Never run a coding agent in the OpenClaw state directory.
    - Never review or mutate a live repo when an isolated temp dir or git worktree is safer.
    - Do not silently take work back over by hand if the user explicitly asked for a coding agent.
    - If the agent stalls or fails, either respawn it or ask the user what they want next.
    
    ## Completion pattern
    When the coding agent finishes, summarize:
    - what changed
    - where it changed
    - whether tests passed
    - what the user should review next
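
    If you generate these `exec` lines from a script, the execution rules and the state-directory guardrail in the skill above can be folded into one helper. This is a Python sketch under assumptions: `build_exec` is a hypothetical name, and the binary names `opencode` and `pi` are guesses at how OpenCode and Pi are invoked on your machine.

```python
from pathlib import Path
import shlex

# Agents the skill says need PTY mode when launched through `exec`.
# "opencode" and "pi" are assumed binary names; adjust to your install.
PTY_AGENTS = {"codex", "opencode", "pi"}
# State directory the skill says must never host a coding agent run.
STATE_DIR = Path("~/.openclaw").expanduser().resolve()


def build_exec(agent_cmd: str, workdir: str, background: bool = False) -> str:
    """Format an exec line that follows the skill's execution rules."""
    target = Path(workdir).expanduser().resolve()
    if target == STATE_DIR or STATE_DIR in target.parents:
        raise ValueError(f"refusing to run a coding agent inside {STATE_DIR}")

    binary = shlex.split(agent_cmd)[0]
    parts = [f'exec command:"{agent_cmd}"', f"workdir:{workdir}"]
    if binary in PTY_AGENTS:
        parts.append("pty:true")  # Claude Code is deliberately left without PTY
    if background:
        parts.append("background:true")
    return " ".join(parts)


print(build_exec("codex exec --full-auto 'Refactor the auth flow'",
                 "/path/to/project", background=True))
# exec command:"codex exec --full-auto 'Refactor the auth flow'" workdir:/path/to/project pty:true background:true
```

    Passing a `workdir` under `~/.openclaw` raises a `ValueError` before anything is launched, which covers the "wrong directory" failure the skill is meant to prevent.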
    
  • Instincts: The Agent Layer Nobody Talks About

    Skills tell your agent what to do. Your soul tells it who to be. Instincts tell it how to think — and they might be the most underused lever in your harness.

    Everyone building agents right now has skills and some kind of identity file. But the fastest-growing agent harness repo on GitHub — everything-claude-code, now at 145k stars — ships with a layer most people skip entirely: instincts. Today we break down what they are, why they work, and how to add them to your own setup in under five minutes.

    Skills are too heavy for most behavioral rules

    A SKILL.md file is designed to be a complete, self-contained capability. It has a trigger, a set of instructions, and usually a specific output format. That's perfect for workflows like committing code, generating reports, or deploying.

    But what about rules like 'always check if a file exists before editing it' or 'prefer small diffs over large rewrites'? These aren't skills. They don't have triggers. They don't produce outputs. They're behavioral tendencies that should apply everywhere, all the time.

    If you shove them into SOUL.md, your identity file balloons into a wall of text that mixes personality with operational policy. If you make each one a skill, you end up with dozens of always-on skills competing for context window space. Neither approach scales.

    The instincts pattern: lightweight behavioral nudges

    The pattern that everything-claude-code popularized — and that repos like OpenViking (21k stars) and CowAgent (42k stars) have adopted in their own ways — introduces a dedicated instincts layer. Instincts are short, unconditional behavioral rules that sit between identity and skills in your agent's config hierarchy.

    An instinct is typically one to three sentences. It doesn't have a trigger condition because it's always active. It doesn't have a complex instruction set because it's a nudge, not a workflow. Think of instincts as the equivalent of a senior engineer's gut feelings — the things they do automatically without thinking about it.

    The key architectural insight is separation of concerns. Your SOUL.md stays focused on voice, role, and constraints. Your SKILL.md files stay focused on triggered workflows. And your instincts handle the ambient behavioral layer that makes your agent feel competent rather than just capable.

    INSTINCTS.md — Starter instincts file for any agent harness

    # Instincts
    
    Behavioral rules that apply to every task, every session.
    These are not skills — they have no triggers and no outputs.
    They shape how you work, not what you work on.
    
    ## Safety
    - Never run destructive commands (rm -rf, git push --force, DROP TABLE)
      without explicit user confirmation, even if the task seems to require it.
    - If a file has uncommitted changes, warn before overwriting.
    - Treat any string that looks like a secret (API key, token, password)
      as sensitive — never log it, echo it, or include it in commits.
    
    ## Quality
    - Read before you edit. Never modify a file you haven't seen in this session.
    - Match the existing code style — indentation, naming, patterns — even if
      you'd personally choose something different.
    - Prefer the smallest diff that solves the problem. Don't refactor
      surrounding code unless asked.
    
    ## Efficiency
    - If you need information, check local files and git history before
      reaching for web searches or external APIs.
    - Don't repeat a failed approach without diagnosing why it failed first.
    - When multiple files need the same change, batch the work — don't
      context-switch between reading and editing.

    What good instincts look like

    The best instincts are small enough to fit in your agent's context without meaningful cost, but specific enough to change behavior. 'Be careful' is not an instinct — it's too vague to act on. 'Read a file before editing it' is an instinct — it's concrete, testable, and always applicable.

    Studying the top harness repos and the 5,400+ skills in the OpenClaw Skills Registry (curated by VoltAgent) reveals a pattern: the most effective instincts fall into three categories. Safety instincts prevent destructive actions. Quality instincts enforce standards. Efficiency instincts reduce wasted work.

    Each category maps to a different failure mode. Without safety instincts, agents overwrite files and force-push branches. Without quality instincts, they produce code that works but violates project conventions. Without efficiency instincts, they research the same question three times in one session.
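
    "Concrete and testable" can be taken literally. As a sketch, the first safety instinct from the starter file could be backed by a check like the one below. The regexes are illustrative assumptions and far from exhaustive, and `needs_confirmation` is a hypothetical helper, not part of any harness.

```python
import re

# Patterns for the destructive commands named in the Safety instincts.
# Illustrative only: real shells offer many more ways to destroy things.
DESTRUCTIVE_PATTERNS = [
    # rm with both a recursive and a force flag in one option group (-rf, -fr, -rvf)
    re.compile(r"\brm\s+-(?=[a-zA-Z]*[rR])(?=[a-zA-Z]*f)"),
    # git push with --force or -f anywhere after the subcommand
    re.compile(r"\bgit\s+push\b.*\s(--force|-f)\b"),
    # SQL DROP TABLE in any letter case
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
]


def needs_confirmation(command: str) -> bool:
    """True if the command matches a known destructive pattern."""
    return any(p.search(command) for p in DESTRUCTIVE_PATTERNS)


for cmd in ["rm -rf build/", "git push --force origin main",
            "DROP TABLE users;", "git push origin main", "rm -f notes.txt"]:
    print(f"{cmd!r}: {needs_confirmation(cmd)}")
```

    The first three commands are flagged and the last two are not. A check like this complements the instinct rather than replacing it: the instinct covers the cases no pattern list anticipates.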

    Adding instincts to your harness today

    You don't need a framework for this. An instincts file is just a markdown file that your agent loads alongside its identity and skills. The starter INSTINCTS.md format shown earlier is what we use internally and what we've seen work across multiple harness setups.

    Drop the file in your project root or your agent's config directory. Reference it from your AGENTS.md or CLAUDE.md so it gets loaded into context on every session. That's it — no build step, no registry, no dependencies.

    Start with five to seven instincts. More than ten and you're probably duplicating what your skills already cover. Fewer than three and you're probably missing obvious safety rails. Review them monthly — instincts that never fire are dead weight, and instincts you keep overriding need to be rewritten or removed.
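
    The size guidance is easy to check mechanically during that monthly review. A small Python sketch, assuming each instinct is a top-level `- ` bullet with indented continuation lines, as in the starter file; the function names and messages are hypothetical.

```python
def count_instincts(markdown: str) -> int:
    """Count top-level `- ` bullets; indented continuation lines are ignored."""
    return sum(1 for line in markdown.splitlines() if line.startswith("- "))


def review_instincts(markdown: str) -> str:
    """Compare the instinct count against the suggested 3-to-10 range."""
    n = count_instincts(markdown)
    if n < 3:
        return f"{n} instincts: probably missing obvious safety rails"
    if n > 10:
        return f"{n} instincts: probably duplicating what your skills cover"
    return f"{n} instincts: within the suggested range"


SAMPLE = """\
## Safety
- Never run destructive commands without explicit user confirmation,
  even if the task seems to require it.
- If a file has uncommitted changes, warn before overwriting.

## Quality
- Read before you edit.
"""

print(review_instincts(SAMPLE))  # 3 instincts: within the suggested range
```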

    Instincts won't make a bad agent good. But they'll make a good agent consistent, and consistency is what separates a tool you trust from a tool you babysit. Copy the file above, tune it to your project, and see how many of those 'why did it do that?' moments disappear.

    Tomorrow: Skill registries are getting crowded. We look at how VoltAgent filters 5,400+ skills down to the ones that actually matter for your stack.

  • Your Agent Harness Is the Product Now

    How everything-claude-code hit 145k stars by treating skills, memory, and instincts as a single tunable system — and what you can steal from it today.

    Most people think of agent skills as individual files you drop into a folder. The fastest-growing repo in the agent tooling space thinks of them as components in a performance-tunable system. Today we look at what that means for how you structure your own agent setup.

    The shift from skill files to skill systems

    A year ago, the agent skill conversation was about individual files. You had a SKILL.md that taught your agent to run tests, another that handled deployments, maybe a SOUL.md that gave it a personality. Each file was useful on its own. The mental model was a toolbox — reach in, grab what you need.

    That mental model is breaking. The repo everything-claude-code — currently sitting at 145,000 stars — doesn't organize skills as a flat collection. It treats skills, instincts, memory, and security as layers in a single stack. The README doesn't say 'here are some skills.' It says 'here is an agent harness performance optimization system.'

    That framing matters. When you think of your agent setup as a system with tunable layers, you stop asking 'which skill should I add?' and start asking 'where is my bottleneck?'

    What the harness pattern actually looks like

    The pattern that everything-claude-code popularized — and that repos like OpenViking and CowAgent have adopted in their own ways — separates agent configuration into four distinct layers.

    First, identity: your SOUL.md and CLAUDE.md files. These set the baseline behavior the agent defaults to when no skill is active. Second, skills: your SKILL.md files, which are conditional — they activate when a trigger matches. Third, memory: persistent context that survives across sessions. Fourth, instincts: lightweight behavioral nudges that sit between identity and skills, shaping how the agent approaches any task without being full skill definitions.

    The insight is that these layers interact. A skill that works perfectly with one SOUL.md configuration might behave differently with another. Memory from a previous session might make a skill unnecessary — or critical. When you tune these layers together instead of independently, you get compounding improvements.

    AGENTS.md — Minimal harness orchestration layer

    # Agent Harness Config
    
    ## Identity
    - SOUL.md defines baseline voice, role, and constraints
    - All skills inherit from SOUL.md unless they explicitly override
    
    ## Skill Registry
    | Skill | Trigger | Layer |
    |-------|---------|-------|
    | /commit | User invokes `/commit` or asks to commit | workflow |
    | /review-pr | User invokes `/review-pr` or PR number referenced | workflow |
    | /deploy | User says "deploy" or "ship it" | workflow |
    | /test | User says "run tests" or test file modified | reactive |
    | /daily-report | HEARTBEAT.md cron fires at 07:00 | scheduled |
    
    ## Memory Policy
    - Session memory: conversation-scoped, discarded on exit
    - Project memory: saved to `memory/`, survives across sessions
    - Feedback memory: user corrections, highest priority recall
    
    ## Instincts
    - Before any file write: check for existing file, prefer Edit over Write
    - Before any git push: confirm with user
    - After any error: read the error before retrying
    - On ambiguous request: ask one clarifying question, not three
    
    ## Skill Composition Rules
    - Max 2 skills active simultaneously
    - /deploy requires /test to pass first (dependency)
    - /commit suppresses /review-pr (mutual exclusion)
    - Scheduled skills yield to user-invoked skills

    A minimal harness config you can use today

    You don't need to adopt an entire framework to use this pattern. The AGENTS.md above is a stripped-down version of the layered approach that works in any Claude Code or OpenClaw project right now. The key is the AGENTS.md file acting as the orchestration layer: it tells the agent which skills exist, when to use them, and how they relate to each other.

    Drop this into your project root and modify the skill references to match your actual workflow. The important part isn't the specific skills listed — it's the structure that makes the agent aware of the full system instead of treating each skill as an island.
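
    The Skill Composition Rules in that AGENTS.md are written for the agent to follow, but spelling them out as code shows how they interact. A Python sketch of one interpretation: the `resolve` function is hypothetical, and treating "yield" as "scheduled skills are dropped whenever a user-invoked skill is requested" is an assumption.

```python
MAX_ACTIVE = 2                             # max 2 skills active simultaneously
REQUIRES = {"/deploy": "/test"}            # /deploy requires /test to pass first
SUPPRESSES = {"/commit": {"/review-pr"}}   # /commit suppresses /review-pr
SCHEDULED = {"/daily-report"}              # scheduled skills yield to user-invoked ones


def resolve(requested: list[str], passed: set[str]) -> list[str]:
    """Return the skills allowed to run, in request order.

    `passed` holds the skills that have already completed successfully.
    """
    user_invoked = [s for s in requested if s not in SCHEDULED]
    # Scheduled skills run only when nothing user-invoked is requested.
    pool = user_invoked or requested

    suppressed: set[str] = set()
    for skill in pool:
        suppressed |= SUPPRESSES.get(skill, set())

    active = []
    for skill in pool:
        if skill in suppressed:
            continue  # mutual exclusion
        dependency = REQUIRES.get(skill)
        if dependency is not None and dependency not in passed:
            continue  # dependency has not passed yet
        active.append(skill)
    return active[:MAX_ACTIVE]


print(resolve(["/commit", "/review-pr"], set()))  # ['/commit']
print(resolve(["/deploy"], set()))                # []
print(resolve(["/deploy"], {"/test"}))            # ['/deploy']
print(resolve(["/daily-report", "/test"], set())) # ['/test']
```

    Writing the rules down this way also surfaces the questions the prose leaves open, such as what should happen to a scheduled skill after it yields.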

    Why the ecosystem is converging on this

    Look at the numbers. VoltAgent's awesome-openclaw-skills has catalogued over 5,400 skills. Antigravity's collection has 1,370+. When you have that many skills available, the problem stops being 'I need a skill for X' and becomes 'how do I compose skills without them conflicting?'

    OpenViking from ByteDance's Volcengine team attacks this from the infrastructure side — it's a context database that unifies memory, resources, and skills through a file system paradigm. Their bet is that hierarchical context delivery is the missing piece. Instead of loading every skill into the agent's context window, you organize them in a tree and deliver only what's relevant for the current task.
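
    The tree idea can be illustrated without any particular tool. The Python sketch below is a toy in-memory model, not OpenViking's API: the tree contents and the `list_children` and `read` helpers are hypothetical. It shows the two moves that matter: a cheap listing of what exists at one level, and an on-demand read of a single leaf.

```python
# Hypothetical context tree: directories are dicts, leaves are file contents.
CONTEXT_TREE = {
    "skills": {
        "review": {"SKILL.md": "Full review instructions"},
        "deploy": {"SKILL.md": "Full deploy instructions"},
    },
    "memory": {"project.md": "Decisions recorded so far"},
}


def _walk(tree: dict, path: str):
    """Follow a slash-separated path down the tree."""
    node = tree
    for part in filter(None, path.split("/")):
        node = node[part]
    return node


def list_children(tree: dict, path: str = "") -> list[str]:
    """Cheap overview: names at one level, without any file contents."""
    node = _walk(tree, path)
    if not isinstance(node, dict):
        raise ValueError(f"{path!r} is a file, not a directory")
    return sorted(node)


def read(tree: dict, path: str) -> str:
    """Deliver the contents of one leaf on demand."""
    node = _walk(tree, path)
    if not isinstance(node, str):
        raise ValueError(f"{path!r} is a directory, not a file")
    return node


print(list_children(CONTEXT_TREE))                    # ['memory', 'skills']
print(list_children(CONTEXT_TREE, "skills"))          # ['deploy', 'review']
print(read(CONTEXT_TREE, "skills/review/SKILL.md"))   # Full review instructions
```

    The agent can explore with `list_children` for the price of a few names, and only the one file it finally asks for lands in context.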

    This is the same instinct that drove everything-claude-code to 145k stars. The agent doesn't need all your skills all the time. It needs the right skills at the right time, informed by the right memory, filtered through the right identity layer.

    The practical takeaway

    If your agent setup is a flat folder of SKILL.md files, you're leaving performance on the table. Start by adding an AGENTS.md that maps which skills apply to which contexts. Add a HEARTBEAT.md if your agent runs on a schedule and needs to know what changed since last session. Make your SOUL.md specific enough that skills can assume a baseline behavior instead of re-establishing it every time.

    You're not building a skill collection. You're tuning a system.

    The repos that are winning the star race aren't the ones with the most skills. They're the ones that figured out how skills, memory, identity, and instincts compose into something greater than the sum of their parts. Start with the AGENTS.md above, tune it to your workflow, and watch what happens when your agent stops treating skills as isolated tools and starts treating them as a system.

    The Daily Skill: Teaching AI new tricks, one skill file at a time.