Claude Skills Have Three Layers. Most People Only Build One.
Prompt-engineering is already obsolete. The new unit of work is the skill — a folder with three layers, only one of which most people bother to build. The leverage lives in the layer they skip.
The most popular advice on the internet about Claude is already out of date.
Every tutorial, course, and viral thread is still teaching prompt-engineering — how to phrase a request so the model produces a better answer. That was a useful skill in 2023. It is not the discipline that matters now. The teams getting real leverage from Claude have stopped writing prompts the way most users still do, and have started writing something different.
They write skills.
A skill is not a clever prompt. It is a structured artifact — a folder on disk with three distinct layers, each doing a different job. The shift from prompting to skill-engineering changes what you can build, what compounds, and what a session is even for. And here is the part that nobody is saying out loud: of the three layers that make a skill powerful, almost everyone only ships the first two. The third layer — the one with the actual leverage — gets skipped.
That is why most people's AI workflows plateau.
What a Skill Actually Is
Stop thinking of a skill as a saved prompt. Think of it as a small application.
Every skill contains three layers, and each layer has a specific job that the others cannot do.
Layer 1 — Description. The frontmatter. The label on the folder. This is what the model reads on every single user message to decide whether to invoke this skill at all. Vague descriptions create vague routing. Specific descriptions, with explicit triggers, mean the right skill fires automatically without the user typing a slash command. The description is not metadata. It is the routing logic.
Layer 2 — Instructions. The playbook. Once the skill is invoked, this is the procedure the model follows. This is what most people do ship, because it looks like a prompt — and writing prompts is the only mental model most engineers have.
Layer 3 — Tools. Scripts, API calls, reference files. Deterministic code that the skill can execute. This is the layer that turns a procedural prompt into a real piece of software. It is also the layer almost no one builds.
A prompt is something you type. A skill is something you ship. The difference is whether the third layer exists.
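To make the three layers concrete, here is a minimal Python sketch that scaffolds a skill folder. The `SKILL.md` file with `name` and `description` frontmatter follows the published skill format as I understand it. The skill name `changelog-writer`, its instructions, and the `scripts/collect_commits.py` helper are hypothetical, invented only for illustration.

```python
from pathlib import Path

# Layer 1 (frontmatter) and Layer 2 (instructions) live in SKILL.md.
# The skill name, wording, and steps below are hypothetical examples.
SKILL_MD = """\
---
name: changelog-writer
description: Draft a changelog entry from merged commits. Use when the user asks to write the changelog, summarize a release, or prepare release notes.
---

# Changelog writer

1. Run `python scripts/collect_commits.py <since-tag>` to list commit subjects.
2. Group the subjects into Added, Changed, and Fixed.
3. Write the entry in the style of the existing changelog.
"""

# Layer 3: a deterministic helper the instructions call (hypothetical).
COLLECT_COMMITS_PY = '''\
"""Print one commit subject per line since the given tag."""
import subprocess
import sys

since_tag = sys.argv[1]
subprocess.run(["git", "log", f"{since_tag}..HEAD", "--pretty=%s"], check=True)
'''


def scaffold_skill(skills_root: Path) -> Path:
    """Create the skill folder with all three layers and return its path."""
    skill_dir = skills_root / "changelog-writer"
    scripts_dir = skill_dir / "scripts"
    scripts_dir.mkdir(parents=True, exist_ok=True)
    (skill_dir / "SKILL.md").write_text(SKILL_MD, encoding="utf-8")
    (scripts_dir / "collect_commits.py").write_text(
        COLLECT_COMMITS_PY, encoding="utf-8"
    )
    return skill_dir
```

The frontmatter is the routing layer, the numbered procedure is the playbook, and the script is the part the model no longer has to regenerate.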
The Layer Almost Everyone Skips
Watch any walkthrough of someone "building a skill" and count how many of them stop after Layer 2. The instructions get written, the markdown looks tidy, the slash command works once, and the author moves on.
The work that should happen next — extracting the deterministic parts into actual scripts — almost never gets done. The result is a skill that asks the model to regenerate the same Python or bash logic on every invocation, paying tokens and accepting non-determinism for work that should have been compiled once.
This is the part that an Anthropic engineer named Eric flagged in a public talk that stuck with me. He observed that people pour enormous effort into "beautiful, detailed prompts" and then ship the tools layer in an absolute shambles — no documentation, parameters named A and B, function signatures no engineer would tolerate in production code. The most visible part of the skill is polished. The part that determines whether the skill scales is treated like a draft.
That mismatch is the whole story. Code is deterministic: same input, same output, every run. LLM regeneration is not. The same input can produce different outputs, different errors, and different token costs. Any time you catch the model regenerating the same logic inside a skill on repeat invocations, that logic is a candidate for the Tools layer. Move it to a script. Have the skill call the script. You have just traded probabilistic compute for deterministic compute: cheaper, faster, repeatable, testable.
The rule of thumb: if code can do it, code should do it. The model is for judgment. The script is for everything else.
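As an illustration of what "move it to a script" can look like, here is a small Tools-layer helper in Python. The task (turning a post title into a URL slug) and the names `slugify_title` and `max_length` are my own assumptions, picked because this is the kind of logic a model would otherwise re-derive on every run. Note the documented, descriptively named parameters, the opposite of "A and B".

```python
"""Tools-layer helper for a hypothetical publishing skill."""
import re
import sys
import unicodedata


def slugify_title(title: str, max_length: int = 60) -> str:
    """Return a lowercase, hyphen-separated ASCII slug for `title`.

    Args:
        title: Human-readable post title, any casing or punctuation.
        max_length: Upper bound on slug length. Longer slugs are cut
            back to a word boundary where one exists.
    """
    # Strip accents, then drop anything that is still not ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumerics into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    if len(slug) <= max_length:
        return slug
    truncated = slug[:max_length]
    # If the cut landed mid-word, back up to the previous hyphen.
    if slug[max_length] != "-" and "-" in truncated:
        truncated = truncated.rsplit("-", 1)[0]
    return truncated.strip("-")


def main(argv: list[str]) -> None:
    """Print one slug per title passed on the command line."""
    for title in argv:
        print(slugify_title(title))


if __name__ == "__main__":
    main(sys.argv[1:])
```

The skill's instructions then shrink to one line telling the model to run the script with the title, and the function can be unit-tested like any other code.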
Composability Beats Monoliths
The second mistake, almost as common as skipping Layer 3, is building skills that try to do too much.
The temptation is obvious. You want one big skill that handles your whole workflow — research, drafting, formatting, publishing. One slash command, one place to make changes, one mental model. So you write it that way. And for a few weeks it works.
Then it stops working. You want to change how the formatting step behaves and you cannot find the section. You want to reuse the research step inside a different workflow and you cannot, because it is welded to the rest. A small change in one corner of the skill breaks a different corner you forgot about. The "one big skill" pattern collapses under its own surface area.
The right pattern is the inverse. Small, focused, composable skills that each do one thing and chain together. Three reasons this wins:
- Failures localize. When a focused skill breaks, you know exactly where to look. When a monolithic skill breaks, you guess.
- Improvements compound. Upgrade one composable skill and every workflow that uses it gets better automatically. Upgrade one monolithic skill and you have to remember every place its behavior is duplicated.
- Reuse beats rebuild. A focused skill becomes a building block. A monolithic skill becomes a special case.
This is the same lesson software engineers learned about functions in the 1970s and microservices in the 2010s. It applies just as cleanly to skills now.
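The function analogy can be made literal. In the Python sketch below, each small function stands in for one focused skill, and `chain` stands in for a workflow that strings them together. Every name here is hypothetical; real skills are folders, not functions, so treat this as a model of the idea rather than of the mechanism.

```python
from functools import reduce
from typing import Callable

# A "step" stands in for one focused skill: text in, text out.
Step = Callable[[str], str]


def chain(*steps: Step) -> Step:
    """Compose steps left to right into a single workflow."""
    def run(text: str) -> str:
        return reduce(lambda acc, step: step(acc), steps, text)
    return run


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace and trim the ends."""
    return " ".join(text.split())


def title_case(text: str) -> str:
    """Capitalize the first letter of each word."""
    return text.title()


def as_heading(text: str) -> str:
    """Prefix the text with a markdown heading marker."""
    return f"# {text}"


def truncate_to_72(text: str) -> str:
    """Cut the text to at most 72 characters."""
    return text[:72]


# Two workflows reuse the same building block.
blog_title = chain(normalize_whitespace, title_case, as_heading)
commit_subject = chain(normalize_whitespace, truncate_to_72)
```

Improving `normalize_whitespace` improves both workflows at once, and a bug in `as_heading` can only ever show up in `blog_title`.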
Two Flags Almost Nobody Uses
There are two invocation controls baked into Claude's skill format that I almost never see used in the wild, and both are important enough that ignoring them is a real mistake.
The first is making a skill invisible to the user. You can set a skill so it never appears in the slash menu, and only the model and its sub-agents can invoke it. This is the right configuration for internal plumbing. Helper skills that exist to be called by other skills do not belong in your slash menu, where they add noise and give the user more ways to misuse them.
The second is the inverse — making a skill invisible to the model. You can set a skill so the model cannot invoke it, only the human can. This is the right configuration for anything destructive or expensive. Deploys to production. Sending a message to an external channel. Placing a real-money order. The model has no business firing those autonomously. The flag is not a nice-to-have. It is structural safety that does not rely on your own discipline holding.
Discipline that depends on remembering is not discipline. It is luck on a timer.
Most skill authors never touch either flag, so every skill in their library defaults to both human-invocable and model-invocable. That is fine for read-only diagnostic skills. It is dangerous for anything that writes to the world.
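A library can be audited for this mechanically. The Python sketch below scans a skills directory and lists skills whose description sounds destructive but that the model is still allowed to invoke. It assumes the frontmatter field `disable-model-invocation: true` is what blocks model invocation, which matches the Claude Code documentation as I understand it (the user-hiding counterpart there is `user-invocable: false`); verify both names against the current docs. The keyword list and function names are hypothetical, and the keyword match is a crude heuristic.

```python
from pathlib import Path

# Hypothetical keyword heuristic for "writes to the world".
RISKY_WORDS = ("deploy", "send", "delete", "order", "publish")


def read_frontmatter(skill_md: Path) -> dict[str, str]:
    """Parse top-level `key: value` pairs between the leading '---' lines."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        # Skip indented continuation lines and lines without a colon.
        if sep and not line.startswith((" ", "\t")):
            fields[key.strip()] = value.strip()
    return fields


def unguarded_risky_skills(skills_root: Path) -> list[str]:
    """Return names of risky-sounding skills the model can still invoke."""
    flagged = []
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        fields = read_frontmatter(skill_md)
        description = fields.get("description", "").lower()
        risky = any(word in description for word in RISKY_WORDS)
        # Assumed field name; absent means the model may invoke the skill.
        guarded = fields.get("disable-model-invocation", "false").lower() == "true"
        if risky and not guarded:
            flagged.append(skill_md.parent.name)
    return flagged
```

Run it against your skills folder and treat every name it returns as a prompt to decide, on purpose, whether the model should be allowed to fire that skill.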
The Compounding Loop
Here is where skill-engineering pulls ahead of prompt-engineering, and where the gap widens with every use.
A prompt is ephemeral. When the session ends, the prompt ends with it. The next session starts from zero: the same blank context, the same instructions to re-type, the same edge cases to re-explain. You can write the world's best prompts and still accumulate nothing.
A skill persists. The skill file lives on disk, gets version-controlled, and travels with the project. And every time you use the skill, you get a chance to sharpen it. After every run, the question to ask is simple: is this fix for this one run, or is it forever?
If the fix is forever, you write it into the skill — a new rule, a new edge case, a new example. The next session starts smarter than the last. The skill becomes a record of every lesson the prior runs taught you. This is what Anthropic's team means when they say their goal is for Claude on day thirty of working with you to be measurably better than Claude on day one. That outcome does not happen because the model improved. It happens because the skill improved.
Most people skip this step entirely. They run a skill, get an output, fix the output by hand, and move on. The fix never makes it back into the skill. The same correction gets repeated next week. The skill never compounds.
The fix is mechanical. After any non-trivial skill invocation where the output needed adjustment, ask the model directly: "Review the back-and-forth we just had after running this skill. Can we update the skill so this is handled automatically and we don't make the same mistake twice?" The model is good at this. It already has the full context. It will propose a precise edit. You review, accept, commit. The skill is now permanently smarter.
That is the loop. It is small. It is boring. It is what separates an AI workflow that compounds from one that just runs.
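The "write it into the skill" step can itself be made deterministic. Below is a minimal Python sketch, assuming you keep accepted lessons as bullets under a `## Learned rules` heading that is the last section of `SKILL.md`. That heading and the `record_rule` helper are my own conventions for illustration, not part of any skill format.

```python
from pathlib import Path

# Hypothetical convention: lessons live under this heading, kept as
# the final section of SKILL.md so new bullets can simply be appended.
RULES_HEADING = "## Learned rules"


def record_rule(skill_md: Path, rule: str) -> bool:
    """Append `rule` as a bullet under the learned-rules heading.

    Returns False, and leaves the file untouched, if the same rule
    is already recorded.
    """
    text = skill_md.read_text(encoding="utf-8")
    bullet = f"- {rule.strip()}"
    existing_lines = text.splitlines()
    if bullet in existing_lines:
        return False
    if RULES_HEADING not in existing_lines:
        # First lesson: create the section at the end of the file.
        text = text.rstrip("\n") + f"\n\n{RULES_HEADING}\n"
    text = text.rstrip("\n") + f"\n{bullet}\n"
    skill_md.write_text(text, encoding="utf-8")
    return True
```

After the model proposes a rule and you accept it, one call records it, and the commit that follows is the durable part of the loop.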
The Real Shift
Prompt-engineering treated the prompt as the unit of work. That framing is now wrong.
The unit of work is the skill: a folder, three layers, version-controlled, composable with other skills, sharpened after every use. The model is one part of that stack, not the whole of it. Treating the model as the whole stack is the cognitive error behind most of the "AI isn't living up to the hype" complaints I still hear from operators. Of course the model alone disappoints you. The model alone is only one part of the stack, and not even the part with the most leverage.
The leverage is in the skill — and inside the skill, the leverage is in the layer most people skip.
Build the third layer. Ship composable skills. Use both invocation flags. Run the compounding loop. None of these are technical moats — they are practice moats, and almost no one is practicing.
That is the work now. Not prompt-engineering. Skill-engineering. Different unit, different artifact, different ceiling.