My Claude Code Was Wasting 63K Tokens Per Session. Here's How I Fixed It.
Background
Here’s a fun fact I discovered last week: every time I opened Claude Code, 153KB of markdown files were silently injected into my system prompt — before I even typed a single word. Blogging plugins. Note-taking plugins. Fourteen workflow skills I rarely used outside my main projects.
That’s 63K tokens gone before the conversation even starts. Over 20 sessions a day? Over a million tokens burned on nothing.
The fix took six lines of JSON. Here’s the full breakdown.
1. The Problem
Before touching anything, I needed to understand where tokens were going.
```shell
# Measure every skill file injected at startup
find ~/.claude/plugins/cache -name "SKILL.md" -exec wc -c {} \;

# Inspect runtime behavior via the debug log
claude --debug --debug-file /tmp/debug.txt --print "hello"
```

The debug output revealed the full picture:
| Metric | Before |
|---|---|
| Enabled plugins | 4 |
| Plugin skills loaded | 18 |
| Skill file total size | 153 KB |
| SessionStart hook context | 5,430 chars |
| Effort level | max |
| Models used | All pro[1m] |
The 18 skills broke down as follows:

| Plugin | Skills | Size |
|---|---|---|
| Superpowers | 14 | 108 KB |
| Obsidian | 3 | 41 KB |
| Frontend-Design | 1 | 4 KB |
That’s ~153KB of markdown instructions loaded into the system prompt before I even typed a message, plus 5,430 chars of hook context. At ~2.5 bytes per token for mixed English/code content, that’s roughly ~63K tokens of overhead per new session.
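The arithmetic above can be sanity-checked in one line, using the numbers from the two tables (awk serves purely as a calculator here):

```shell
# Cross-check: per-plugin sizes vs. the 153 KB total, then the rough
# token estimate at ~2.5 bytes/token (treating 1 KB as 1,000 bytes)
awk 'BEGIN {
  total_kb = 108 + 41 + 4                  # Superpowers + Obsidian + Frontend-Design
  tokens = (total_kb * 1000 + 5430) / 2.5  # skill files + hook context
  printf "%d KB, ~%.0fK tokens\n", total_kb, tokens / 1000
}'
# → 153 KB, ~63K tokens
```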
The principle was simple: load only what you need, only where you need it. Claude Code supports three mechanisms:
- Global disable — set plugins to `false` in `~/.claude/settings.json`
- Project-level settings — a `.claude/settings.json` in a project dir re-enables plugins locally
- Runtime flags — `--plugin-dir` loads a plugin for one session
| Project | Superpowers | Obsidian | Frontend |
|---|---|---|---|
| `~/Work/trade/exchange-rs` | Yes | No | No |
| `~/Work/dex` | Yes | No | No |
| `~/Documents/.../AI-Brain` | No | Yes | No |
| Everything else | No | No | No |
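The matrix maps directly onto per-directory config files: a project-local `.claude/settings.json` overrides the global one. A minimal sketch of how one cell is realized, using a hypothetical demo path (the real paths appear in the next section):

```shell
# A project-local .claude/settings.json re-enables a plugin for that
# directory only, overriding the global "false" (demo path, not a real project)
mkdir -p /tmp/demo-project/.claude
cat > /tmp/demo-project/.claude/settings.json <<'EOF'
{ "enabledPlugins": { "superpowers@superpowers-marketplace": true } }
EOF
cat /tmp/demo-project/.claude/settings.json
```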
2. The Fix
Global settings (`~/.claude/settings.json`) — disable the three plugins, lower the effort level, stratify models:

```diff
- "superpowers@superpowers-marketplace": true,
- "obsidian@obsidian-skills": true,
- "frontend-design@claude-plugins-official": true,
+ "superpowers@superpowers-marketplace": false,
+ "obsidian@obsidian-skills": false,
+ "frontend-design@claude-plugins-official": false,

- "CLAUDE_CODE_EFFORT_LEVEL": "max",
+ "CLAUDE_CODE_EFFORT_LEVEL": "high",

- "ANTHROPIC_SMALL_FAST_MODEL": "deepseek-v4-pro[1m]",
+ "ANTHROPIC_SMALL_FAST_MODEL": "deepseek-v4-flash",
```
Project-level overrides — re-enable only where needed:
```shell
# ~/Work/trade/exchange-rs/.claude/settings.json
{ "enabledPlugins": { "superpowers@superpowers-marketplace": true } }

# ~/Work/dex/.claude/settings.json
{ "enabledPlugins": { "superpowers@superpowers-marketplace": true } }

# ~/Documents/LocalKnowledge/AI-Brain/AI-Brain/.claude/settings.json
{ "enabledPlugins": { "obsidian@obsidian-skills": true } }
```

That’s it: three files, six edits. Rust-analyzer-LSP stays global since it’s a language server with no skill overhead.
3. Results
To verify the optimization actually worked, I ran a controlled test with `--debug`:

```shell
# Test in a neutral directory (no project settings)
cd /tmp && claude --debug --debug-file /tmp/after-debug.txt --print "hello"
```

Comparing debug outputs before and after:
| Metric | Before (Mar 19) | After (May 5) |
|---|---|---|
| Enabled plugins | 4 | 1 |
| Plugin skills loaded | 18 | 0 |
| Skill files in prompt | 153 KB | 0 KB |
| Hook context | 5,430 chars | 0 chars |
| Effort level | max | high |
| Small model | pro[1m] | flash |
The key debug log line:
```shell
# Before: getSkills returning: ... 18 plugin skills ...
# After:  getSkills returning: ... 0 plugin skills ...
```

| Metric | Before | After | Savings |
|---|---|---|---|
| Plugin skills loaded | 18 | 0 | -18 |
| Skill files in prompt | 153 KB | 0 | -153 KB |
| Est. tokens / session | ~63K | 0 | -63K |
| Hook context | 5,430 chars | 0 | -5,430 |
| Effort level | max | high | thinking ↓30-50% |
| Small model | pro[1m] | flash | faster + cheaper |
The optimization is transparent for active projects — enter the directory and Claude Code automatically picks up the local `settings.json`, loading exactly the plugins needed. For one-off tasks: `claude --plugin-dir <path>`.
4. Takeaways
- Profile first: `find ... -name "SKILL.md"` revealed 153 KB of hidden overhead. You can’t optimize what you can’t measure.
- Lazy-load plugins: project-level `settings.json` scopes plugins to specific directories — the same pattern as direnv / asdf.
- Tune effort level: `max` dramatically increases thinking tokens; `high` is enough for most tasks — save `max` for hard problems.
- Stratify models: not every operation needs the flagship model. Use `flash` variants for background tasks.
- Numbers compound: ~63K tokens saved per session × 20 sessions = 1.26M tokens per day.