TL;DR
What It Does
Monitors token usage throughout your session using weighted tracking and velocity detection. Logs estimates to a session-scoped file and integrates with the COM Extension (CWK-012) for automatic threshold-based triggers. Extends the Core Token Monitor with persistent logging and Cowork-specific capacity planning.
How It Works
The AI calculates a weighted token estimate by classifying content types — simple exchanges count at 1x, code blocks at 2x, recipe implementations at 4x, and so on. It tracks consumption velocity (tokens per exchange, acceleration trend, tool-call density) and predicts how many exchanges remain until warning (70%), critical (85%), and auto-handoff (90%) thresholds. At 70%, it activates Late-Session Mode, which triggers a confidence decay penalty in downstream verification recipes. All estimates are logged to a session-scoped file for intra-session diagnostics.
What To Expect
In automatic mode, the monitor runs silently until a threshold is reached — then you see a warning, critical alert, or auto-handoff trigger. On-demand mode gives you a full status report any time. Before-task mode tells you whether an upcoming task fits within your remaining token budget.
Best Results When You
Let automatic mode run in the background. Use on-demand checks before large tasks. Pay attention to the 70% warning — it means Late-Session Mode is active and downstream confidence scoring is now stricter.
Typical Time
Under 1 minute for automatic checks. 1–2 minutes for on-demand or before-task reports including velocity analysis.
Difficulty
Medium — the recipe runs automatically via COM integration. You interact with it when you request on-demand checks or need before-task capacity planning. Understanding the threshold system and Late-Session Mode helps you plan your session effectively.
How To Start
The token monitor operates in three modes. Automatic mode runs via COM integration without you doing anything. On-demand and before-task modes are triggered when you ask for a status check.
STEP 1Choose Your Check Mode
Tell the AI which type of token check you need, or let COM handle it automatically.
Available parameters
-
check_type
· string · required
Type of token check. Options: “automatic,” “on_demand,” “before_task.” Automatic runs silently via COM. On-demand gives a full report. Before-task checks capacity for a specific upcoming task.
-
upcoming_task
· string · optional
Task name for capacity planning. Used with “before_task” mode.
-
force_report
· boolean · optional · default: false
Force a full report regardless of threshold status.
-
log_to_file
· boolean · optional · default: true
Write the token estimate to the session log file.
Invocations
Automatic (COM-Triggered)
#H->AI::Directive: (Monitor token usage — Cowork mode)
On-Demand Status Check
#H->AI::Directive: (Monitor token usage — Cowork mode with check_type=on_demand)
Before-Task Capacity Check
#H->AI::Directive: (Monitor token usage — Cowork mode with check_type=before_task, upcoming_task=develop recipe)
Usage Examples
Three scenarios — automatic threshold detection, an on-demand status check, and a before-task capacity warning.
Example 1 · Automatic mode triggers 70% warning
AI:
[running silently in background via COM]
AI:
Token usage: approximately 70% of session limit. Late-Session Mode is now active — claims routed through CWK-024 receive an automatic -10 confidence penalty unless grounded by a fresh tool observation. Checkpoint recommended.
Example 2 · On-demand status check
User:
How are we on tokens?
AI:
Token usage: approximately 55% of limit. Velocity: roughly 800 tokens per exchange, steady. Approximately 12 exchanges until 70% warning, 24 until 85% critical. Tool-call density: medium.
Example 3 · Before-task capacity warning
User:
Do we have room for a comprehensive analysis?
AI:
Current usage: approximately 62%. A comprehensive analysis typically costs around 10% of the session budget. That would bring us to roughly 72%, crossing the 70% warning threshold. Consider running a checkpoint first.
User:
Good call, let’s checkpoint first.
How AI Reads This Recipe
When this recipe is triggered, the AI runs a 7-step token monitoring cycle with weighted estimation, velocity tracking, threshold detection, and file-based logging. The AI should:
- CALCULATE weighted token usage by classifying content types: simple exchanges (1x), code blocks (2x), structured data (2.5x), test results and tool outputs (3x), long analyses (3x), recipe implementations (4x), file reads (2x), bash executions (1.5x).
- CALCULATE velocity: tokens in the last 5 exchanges, average per exchange, acceleration trend (increasing, steady, or decreasing), and tool-call density. Predict exchanges remaining until 70% and 85% thresholds.
- IN AUTOMATIC MODE: trigger warnings at 70% (activate Late-Session Mode, fire COM checkpoint recommendation), alerts at 85% (recommend handoff), and auto-handoff at 90% (execute CWK-002). The LATE_SESSION_MODE flag persists for the remainder of the session and causes CWK-024 to apply a -10 confidence penalty on ungrounded claims.
- IN ON-DEMAND MODE: report current usage percentage, velocity, exchanges to warning and critical thresholds, and tool-call density.
- IN BEFORE-TASK MODE: estimate the cost of the upcoming task using Cowork-specific estimates (e.g., “develop recipe” ~5%, “comprehensive analysis” ~10%, “sub-agent delegation” ~12%). Warn if the task would cross a threshold.
- LOG the estimate to a session-scoped file with timestamp, check type, usage percentage, velocity, acceleration, tool-call density, exchanges to thresholds, upcoming task, and action taken. The log does not persist across sessions.
- INTEGRATE with CWK-012 COM Extension: set flags that CWK-012’s auto-trigger rules act on. Do not call CWK-003 directly — COM handles checkpoint triggering.
The Late-Session Mode flag is critical — once set at 70%, it does not reset for the remainder of the session. CWK-024 checks this flag in its scoring logic. If unable to estimate token usage, warn at 60%+ as a safety margin.
When to Use This Recipe
Use this recipe when you:
- Want to know how much session capacity remains before starting a large task.
- Need automatic warnings as you approach token limits so you can checkpoint and hand off gracefully.
- Want velocity-based predictions of how many exchanges you have left at your current pace.
- Need the Late-Session Mode flag set so downstream verification recipes apply appropriate confidence penalties.
Do not use this recipe when:
You are in the first 10 exchanges of a fresh session and nowhere near any threshold. Running manual token checks this early wastes time — let automatic mode handle it. On-demand and before-task checks become valuable after you have accumulated enough context for meaningful velocity calculation (roughly 10+ exchanges).
Recipe FAQ
Q.What are the weighted content multipliers?
Simple exchanges: 1x. Code blocks: 2x. Structured data: 2.5x. Test results and tool outputs: 3x. Long analyses and reports: 3x. File reads (Read tool): 2x. Bash executions: 1.5x. Recipe implementations: 4x.
Q.What is Late-Session Mode?
A flag activated when token usage reaches 70%. Once active, it persists for the rest of the session. CWK-024 (Verification Gate) uses this flag to apply an automatic -10 confidence penalty on claims that are not grounded by a fresh tool observation from the current step. This guards against context compaction degrading earlier observations.
Q.Does the log file persist across sessions?
No. The session-scoped log file resets when the VM resets between sessions. The handoff captures final token state. The log is for intra-session diagnostics and COM pattern detection only.
Q.What Cowork-specific task estimates are built in?
Develop recipe: approximately 5%. Comprehensive analysis: 10%. Multiple test cases: 15%. Full documentation: 10%. Implement recipe: 8%. Sub-agent delegation (CWK-008): 12%. Roadmap update: 6%. CLAUDE.md generation (CWK-007): 4%. Git checkpoint plus push: 2%. Handoff creation: 5%.
Q.How does this recipe interact with CWK-012 COM Extension?
This recipe sets threshold flags that CWK-012’s auto-trigger rules act on. When automatic mode detects 70%, COM fires CWK-003 (mid-session checkpoint), logs the trigger, and updates handoff metadata. This recipe does not call CWK-003 directly.
Q.What happens at the auto-handoff threshold?
At 90%, the recipe triggers an error-level alert and directs immediate execution of CWK-002 (session handoff). This is the hard ceiling — the session should be wrapped up at this point.
Version History
Changes to this recipe over time. Most recent first.
v1.01a
2026-03-15
Added LATE_SESSION_MODE flag: activates at 70% threshold, persists for remainder of session, triggers -10 confidence decay in CWK-024 for ungrounded claims. Added tool-call density tracking. Added audit trail logging per R-12.
v1.00a
2026-02-28
Initial release. Extends Core Token Monitor with Cowork-specific features: 7-step monitoring cycle, weighted content classification, velocity tracking with acceleration, three check modes (automatic, on-demand, before-task), session-scoped file logging, CWK-012 COM integration for threshold-aware auto-triggers. Cowork-specific task cost estimates for capacity planning.