Real-time tracking and proactive management of token consumption during AI chat sessions. Provides continuous usage updates, threshold warnings, and automatic preservation strategies to prevent work loss from unexpected session limits.

Recipe Name: RCP-001-001-005-TOKEN-MONITOR – Intelligent Token Usage Monitor with Velocity Tracking

RCP-001-001-005-TOKEN-MONITOR

Monitors conversation token usage with weighted calculations and velocity tracking. Provides warnings at 75% and 85% thresholds to help plan handoffs before reaching limits. Includes capacity planning for upcoming tasks and on-demand status reporting. Essential for managing long CRAFT sessions.

Multi-Recipe Combo Stage: Single Recipe

Recipe Category: CFT-FWK-COOKBK-CORE – CRAFT CORE Cookbook

Recipe Subcategory: Blogging with A.I., Brainstorming with A.I.

Recipe Difficulty: Easy

Recipe Tags: Foundational | Introduced in the POC

Requirements

Any AI chat platform (this recipe is platform-agnostic). Any of the following will work: Claude (Anthropic), ChatGPT (OpenAI), Gemini (Google), Grok (X.ai), Perplexity, Microsoft Copilot

How To Start

A Note From The Author of CRAFT

▢
After hundreds (perhaps thousands) of hours of using these recipes, I rarely need to use any of the CORE Cookbook recipes aside from Recipes RCP-001-001-002-HANDOFF-SNAPSHOT and RCP-001-001-002-HANDOFF-SNAPSHOT, but when I do, they are essential to the functioning of CRAFT. Also, the A.I. reads all of these recipes at the start of each session. This happens quietly in the background. Even though you may never need to call the recipe, the A.I. will know all of them and it helps the A.I. to understand what CRAFT is and how it works.
Even if you rarely need to use these recipes, they are still working for you and are essential to the CRAFT Framework.

STEP 1: UNDERSTAND TOKEN MONITORING MODES

▢
This recipe operates in three modes:
AUTOMATIC: Silent monitoring with threshold warnings
ON_DEMAND: User-requested status report
BEFORE_TASK: Capacity check before large tasks
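
As a rough illustration, the three modes map onto the `check_type` parameter of the recipe code. The sketch below is not part of CRAFT; the `CheckType` enum and `parse_check_type` helper are hypothetical names used only to show how a request could be normalized to one of the three modes.

```python
from enum import Enum


class CheckType(Enum):
    """The three monitoring modes named in this recipe."""
    AUTOMATIC = "automatic"      # silent monitoring with threshold warnings
    ON_DEMAND = "on_demand"      # user-requested status report
    BEFORE_TASK = "before_task"  # capacity check before large tasks


def parse_check_type(value: str) -> CheckType:
    """Normalize input such as ' ON_DEMAND ' into a CheckType.

    Raises ValueError for anything that is not one of the three modes.
    """
    return CheckType(value.strip().lower())
```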

STEP 2: CALCULATE WEIGHTED TOKEN USAGE

▢
Different content types consume tokens at different rates:
CONTENT TYPE WEIGHTS:
– Simple exchanges (Q&A): 1x weight
– Code blocks: 2x weight
– Test results/detailed outputs: 3x weight
– Tables/structured data: 2.5x weight
– Long analyses/reports: 3x weight
CALCULATION:
Weighted_Tokens = Sum(Content_Units x Weight x Base)
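
The calculation above can be sketched in Python. The weights come from the list in this step; the base value of 200 tokens per content unit and the category key names are assumptions made for illustration, since the recipe does not define a value for Base.

```python
# Weights taken from the CONTENT TYPE WEIGHTS list in this step.
WEIGHTS = {
    "simple_exchange": 1.0,
    "code_block": 2.0,
    "test_results": 3.0,
    "table": 2.5,
    "long_analysis": 3.0,
}

# Assumption: the recipe does not say what "Base" is; 200 is a placeholder.
BASE_TOKENS_PER_UNIT = 200


def weighted_tokens(units: dict[str, int], base: float = BASE_TOKENS_PER_UNIT) -> float:
    """Weighted_Tokens = Sum(Content_Units x Weight x Base)."""
    return sum(count * WEIGHTS[kind] * base for kind, count in units.items())


# Example: 10 simple exchanges, 4 code blocks, and 2 tables.
session = {"simple_exchange": 10, "code_block": 4, "table": 2}
estimate = weighted_tokens(session)  # (10*1 + 4*2 + 2*2.5) * 200
```

The result is only an estimate: it approximates how quickly different kinds of content fill the session, not an exact tokenizer count.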

STEP 3: UNDERSTAND THRESHOLDS

▢
The recipe uses these warning thresholds:
TYPICAL CHAT LIMIT: ~100,000 tokens
75% THRESHOLD: ~75,000 tokens (plan handoff)
85% THRESHOLD: ~85,000 tokens (execute handoff)
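
A minimal sketch of the threshold comparison, assuming the typical ~100,000-token limit stated above. The function name and the returned status labels are hypothetical and chosen only for this example.

```python
TYPICAL_LIMIT = 100_000    # typical chat limit from this step (approximate)
PLAN_THRESHOLD = 0.75      # plan handoff
EXECUTE_THRESHOLD = 0.85   # execute handoff


def usage_status(weighted_total: float, limit: int = TYPICAL_LIMIT) -> str:
    """Classify a weighted token total against the 75% and 85% thresholds."""
    fraction = weighted_total / limit
    if fraction >= EXECUTE_THRESHOLD:
        return "execute_handoff"
    if fraction >= PLAN_THRESHOLD:
        return "plan_handoff"
    return "continue"
```

Passing a different `limit` adapts the same thresholds to a platform with a larger or smaller context window.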

STEP 4: CALCULATE VELOCITY

▢
Track token consumption rate for predictions:
VELOCITY METRICS:
– Tokens used in last 5 exchanges
– Average tokens per exchange
– Acceleration pattern (increasing/steady/decreasing)
PREDICTION FORMULA:
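Exchanges until 85% = (85 - current%) / (velocity / 1000)
(With a ~100,000-token limit, 1,000 tokens equal 1% of the limit, so velocity / 1000 is the percentage consumed per exchange.)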
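
The velocity metrics and prediction formula can be sketched as follows. The 5-exchange window and the formula come from this step; the half-window comparison used to label the acceleration pattern and its 10% tolerance are assumptions, since the recipe does not say how the pattern is determined.

```python
from statistics import mean

WINDOW = 5  # "last 5 exchanges" from this step


def average_velocity(tokens_per_exchange: list[int]) -> float:
    """Average tokens per exchange over the most recent window."""
    window = tokens_per_exchange[-WINDOW:]
    if not window:
        return 0.0
    return sum(window) / len(window)


def acceleration(tokens_per_exchange: list[int], tolerance: float = 0.10) -> str:
    """Label the pattern by comparing the newer half of the window to the older half.

    Assumption: this comparison and the tolerance are illustrative only.
    """
    window = tokens_per_exchange[-WINDOW:]
    half = len(window) // 2
    if half == 0:
        return "steady"
    older, newer = mean(window[:half]), mean(window[-half:])
    if older == 0:
        return "increasing" if newer > 0 else "steady"
    ratio = newer / older
    if ratio > 1 + tolerance:
        return "increasing"
    if ratio < 1 - tolerance:
        return "decreasing"
    return "steady"


def exchanges_until_85(current_percent: float, velocity: float,
                       limit: int = 100_000) -> float:
    """Exchanges until 85% = (85 - current%) / (percent consumed per exchange)."""
    percent_per_exchange = velocity / (limit / 100)  # velocity / 1000 at a 100k limit
    if percent_per_exchange <= 0:
        return float("inf")
    return max(0.0, (85 - current_percent) / percent_per_exchange)
```

For example, at 65% usage and a velocity of 2,000 tokens per exchange, the formula gives (85 - 65) / 2 = 10 exchanges before the 85% threshold.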

STEP 5: AUTOMATIC MODE WARNINGS

▢
In automatic mode, the AI monitors silently and alerts:
AT 85% (MAXIMUM WARNING):
#AI->H::Caution: (~85% token limit reached)
#AI->H::Status: (Handoff recommended immediately)
#AI->H::Note: (At current velocity: ~X exchanges left)
AT 75% (PLANNING WARNING):
#AI->H::Note: (~75% token limit reached)
#AI->H::Status: (Plan for handoff soon)
#AI->H::Note: (At current velocity: ~X to 85%)
BELOW 75%: No warning (silent operation)
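
The warning logic above can be sketched as a small function. The message text is copied from this step; the function name and signature are hypothetical.

```python
def automatic_warnings(current_percent: float, exchanges_left: int) -> list[str]:
    """Return the automatic-mode messages for the current usage level.

    An empty list means silent operation (below 75%).
    """
    if current_percent >= 85:
        return [
            "#AI->H::Caution: (~85% token limit reached)",
            "#AI->H::Status: (Handoff recommended immediately)",
            f"#AI->H::Note: (At current velocity: ~{exchanges_left} exchanges left)",
        ]
    if current_percent >= 75:
        return [
            "#AI->H::Note: (~75% token limit reached)",
            "#AI->H::Status: (Plan for handoff soon)",
            f"#AI->H::Note: (At current velocity: ~{exchanges_left} to 85%)",
        ]
    return []
```

Checking the 85% case first matters: any usage at or above 85% is also above 75%, so the order of the two tests decides which warning is issued.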

STEP 6: ON-DEMAND REPORTING

▢
When you ask about token status:
#AI->H::Note: (Token usage: ~X% of typical limit)
#AI->H::Note: (Velocity: ~Y tokens/exchange)
#AI->H::Note: (Estimated exchanges remaining: ~Z)
STATUS GUIDANCE:
– Below 75%: Continue normally
– 75-85%: Prepare handoff
– At 85%: Execute handoff now
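
A minimal sketch of the on-demand report, using the message formats and status guidance from this step. The function names are hypothetical, and rounding the values to whole numbers is an assumption.

```python
def status_guidance(percent: float) -> str:
    """Map a usage percentage to the STATUS GUIDANCE text in this step."""
    if percent >= 85:
        return "Execute handoff now"
    if percent >= 75:
        return "Prepare handoff"
    return "Continue normally"


def on_demand_report(percent: float, velocity: float,
                     exchanges_remaining: float) -> list[str]:
    """Build the three on-demand status lines."""
    return [
        f"#AI->H::Note: (Token usage: ~{percent:.0f}% of typical limit)",
        f"#AI->H::Note: (Velocity: ~{velocity:.0f} tokens/exchange)",
        f"#AI->H::Note: (Estimated exchanges remaining: ~{exchanges_remaining:.0f})",
    ]
```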

STEP 7: BEFORE-TASK CAPACITY CHECK

▢
Before large tasks, estimate capacity:
TASK ESTIMATES:
– Develop recipe: ~5% of limit
– Comprehensive analysis: ~10% of limit
– Multiple test cases: ~15% of limit
– Full documentation: ~10% of limit
IF current + task > 75%:
#AI->H::Caution: (~X% used, task needs ~Y%)
#AI->H::RecommendedChange: (Handoff before starting)
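
The capacity check can be sketched as below. The task estimates and the 75% cutoff come from this step; the function name is hypothetical, and raising an error for an unlisted task is an assumption, since the recipe does not say how unknown tasks are handled.

```python
# Task estimates from this step, as a percentage of the typical limit.
TASK_ESTIMATES = {
    "develop recipe": 5,
    "comprehensive analysis": 10,
    "multiple test cases": 15,
    "full documentation": 10,
}


def capacity_check(current_percent: int, task: str) -> list[str]:
    """Warn if current usage plus the task estimate would exceed 75%."""
    estimate = TASK_ESTIMATES.get(task.strip().lower())
    if estimate is None:
        # Assumption: unknown tasks are rejected rather than guessed at.
        raise ValueError(f"No estimate available for task: {task!r}")
    if current_percent + estimate > 75:
        return [
            f"#AI->H::Caution: (~{current_percent}% used, task needs ~{estimate}%)",
            "#AI->H::RecommendedChange: (Handoff before starting)",
        ]
    return []  # enough capacity: proceed silently
```

For instance, at 65% usage a "multiple test cases" task (about 15%) would reach roughly 80%, so the check recommends a handoff first, while the same task at 60% lands exactly on 75% and proceeds.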

How AI Reads This Recipe

When this recipe executes, the AI performs these operations:

1. WEIGHT CALCULATION: Assigns multipliers to different content types (code 2x, analysis 3x, Q&A 1x).

2. THRESHOLD MONITORING: Compares weighted total against 75% and 85% warning thresholds.

3. VELOCITY TRACKING: Calculates token consumption rate over recent exchanges and identifies patterns.

4. PREDICTIVE ANALYSIS: Estimates remaining exchanges until 85% threshold based on current velocity.

5. CONTEXTUAL WARNINGS: Issues appropriate warnings based on mode (automatic, on-demand, before-task).

6. CAPACITY PLANNING: For before_task mode, estimates task requirements and checks against remaining capacity.

The AI stays silent below 75% in automatic mode, reducing noise while ensuring timely warnings when needed.

When to Use This Recipe

This recipe runs automatically in the background during all CRAFT sessions. Use explicit calls when you want an on-demand status report or need to check capacity before starting a large task. Particularly valuable for extended sessions with complex content generation.

Recipe FAQ

Q: What exactly are tokens and why do they matter?

A: Tokens are the basic units AI models use to process text (roughly 4 characters per token). Each AI session has a maximum token limit. When you reach this limit, the session abruptly ends, potentially losing work. Token monitoring prevents this by tracking usage and warning you before hitting limits.
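
As a rough illustration of the 4-characters-per-token rule of thumb, the sketch below estimates a token count from text length. The function name is hypothetical, and real tokenizers vary by model and language, so treat the result as an approximation only.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate tokens from character count using the ~4 chars/token rule of thumb."""
    return math.ceil(len(text) / chars_per_token)


# A 400-character message comes out at roughly 100 tokens.
approx = estimate_tokens("a" * 400)
```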

Q: How can I tell how many tokens I’m using?

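A: In automatic mode the monitor stays silent below 75% and then reports at the two thresholds, with messages such as “#AI->H::Note: (~75% token limit reached)” at 75% and “#AI->H::Caution: (~85% token limit reached)” at 85%. For a reading at any other point, ask for an on-demand status report, which shows the approximate percentage used, the velocity, and the estimated exchanges remaining.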

Q: What should I do when I get a token warning?

A: At 75%, start planning to wrap up. At 85%, create a handoff immediately to preserve your work. Never ignore a high token warning, as the session could end abruptly.

Q: Does token monitoring slow down the conversation?

A: No, token monitoring runs in the background without impacting response speed. The occasional status messages are brief and unobtrusive until you reach warning thresholds where they become more prominent for your protection.

Q: Can I adjust the warning thresholds?

A: The recipe as written uses fixed thresholds: a planning warning at 75% and a maximum warning at 85%. It has no threshold parameter, so if you prefer earlier warnings, edit the threshold values in the recipe's prompt template before using it.

Q: How accurate are the token estimates?
A: Estimates are approximate. The weighted system provides
better accuracy than raw counts but is not exact.

Q: Why two warning thresholds?
A: 75% gives time to plan and wrap up naturally. 85% is
the maximum warning – handoff should happen immediately.

Q: What is velocity tracking?
A: Velocity measures how fast tokens are being consumed.
Accelerating velocity means less time remaining.

Q: Can I ask for token status anytime?
A: Yes, use on_demand mode or ask “How much token capacity
do we have left?” to trigger a status report.

Actual Recipe Code

(Copy This Plaintext Code To Use)

# ===========================================================
# RECIPE: RCP-001-001-005-TOKEN-MONITOR-v2.00a
# Intelligent Token Usage Monitor with Velocity Tracking
# ===========================================================
TOKEN_MONITOR = Recipe(
    recipe_id="RCP-001-001-005-TOKEN-MONITOR-v2.00a",
    title="Intelligent Token Usage Monitor",
    description="Monitors tokens with weighted tracking",
    category="CAT-001-CORE",
    difficulty="medium",
    version="2.00a",
    parameters={
        "check_type": {
            "type": "string",
            "required": True,
            "options": [
                "automatic",
                "on_demand",
                "before_task"
            ],
            "description": "Type of token check"
        },
        "upcoming_task": {
            "type": "string",
            "required": False,
            "description": "Task for capacity planning"
        },
        "force_report": {
            "type": "boolean",
            "required": False,
            "default": False,
            "description": "Force report regardless"
        }
    },

    prompt_template="""
    #H->AI::Directive: (Monitor token usage)
    #H->AI::Context: (Check type: {check_type})

    # -------------------------------------------------
    # STEP 0: POLICY PRE-CHECK
    # -------------------------------------------------
    Scan for sensitive categories:
    – Platform capabilities/limitations
    – Security/vulnerability research
    – Personal data handling
    – Political topics

    IF potential_conflict_detected:
        #AI->H::PolicyCaution: (Topic may trigger policies)
        #AI->H::RecommendedChange: (Focus on [safe aspect])

    # -------------------------------------------------
    # STEP 1: CALCULATE WEIGHTED TOKEN USAGE
    # -------------------------------------------------
    Count content types with weights:
    – Simple exchanges (Q&A): 1x weight
    – Code blocks: 2x weight
    – Test results/outputs: 3x weight
    – Tables/structured data: 2.5x weight
    – Long analyses/reports: 3x weight

    Formula:
    Weighted_Tokens = Sum(Units x Weight x Base_Estimate)

    Thresholds:
    – Typical limit: ~100,000 tokens
    – 75% threshold: ~75,000 tokens
    – 85% threshold: ~85,000 tokens (MAX WARNING)

    # -------------------------------------------------
    # STEP 2: CALCULATE VELOCITY
    # -------------------------------------------------
    Track consumption rate:
    – Tokens in last 5 exchanges
    – Average tokens per exchange
    – Acceleration: increasing/steady/decreasing

    Velocity prediction:
    Exchanges_to_85 = (85 - current%) / (velocity/1000)

    # -------------------------------------------------
    # STEP 3: AUTOMATIC MODE
    # -------------------------------------------------
    IF {check_type} == "automatic":
        IF weighted_usage >= 85%:
            #AI->H::Caution: (~85% limit – MAX WARNING)
            #AI->H::Status: (Handoff recommended NOW)
            #AI->H::Note: (Velocity: ~{X} exchanges left)
        ELIF weighted_usage >= 75%:
            #AI->H::Note: (~75% limit reached)
            #AI->H::Status: (Plan handoff soon)
            #AI->H::Note: (~{X} exchanges until 85%)
        ELSE:
            # Silent – no warning needed

    # -------------------------------------------------
    # STEP 4: ON-DEMAND MODE
    # -------------------------------------------------
    IF {check_type} == "on_demand" OR {force_report}:
        #AI->H::Note: (Token usage: ~{X}% of limit)
        IF show_velocity:
            #AI->H::Note: (Velocity: ~{Y} tokens/exchange)
            #AI->H::Note: (Exchanges until 85%: ~{Z})

    # -------------------------------------------------
    # STEP 5: BEFORE-TASK MODE
    # -------------------------------------------------
    IF {check_type} == "before_task" AND {upcoming_task}:
        Task estimates:
        – "develop recipe": ~5%
        – "comprehensive analysis": ~10%
        – "multiple test cases": ~15%
        – "full documentation": ~10%

        IF current + task_estimate > 75%:
            #AI->H::Caution: (~{X}% used, task needs ~{Y}%)
            #AI->H::RecommendedChange: (Handoff first)
            #AI->H::Note: (Would reach ~{Z}% of limit)
        ELSE:
            # Safe – proceed silently

    # -------------------------------------------------
    # STEP 6: VELOCITY PATTERNS
    # -------------------------------------------------
    IF velocity_accelerating AND current > 60%:
        #AI->H::Note: (Token use accelerating – monitor)

    Status guidance:
    – Below 75%: Continue normally
    – 75-85%: Prepare handoff
    – At 85%: Execute handoff now

    #H->AI::OnError: (If unable to estimate, warn at 70%+)
    """
)

# ===========================================================
# END RECIPE: RCP-001-001-005-TOKEN-MONITOR-v2.00a
# ===========================================================
