Class ContextWindowManager
Every Claude API call has two costs that share the same 200,000-token context window:
    INPUT  = system prompt + file content
    OUTPUT = JSON findings array returned by Claude
    INPUT + OUTPUT must be < 200,000 tokens
This class answers two questions before every analyzeFile() call:
- How many output tokens can we safely request? (computeMaxOutputTokens(String, String))
- Is this file too large for one call, and if so, how do we split it? (chunkFile(String, String))
Token estimation uses the 4-characters-per-token approximation. This is not exact (real tokenizers depend on the model and language), but it is accurate enough for Java source code and avoids a dependency on a tokenizer library.
CERTIFICATION NOTE — Context Management & Reliability (15% of exam): This class directly implements the "token budget management" pattern tested in Domain 5. The two key ideas: (1) estimate input cost before making a call, (2) chunk content that would exceed the budget rather than letting the API reject the request. Both appear in the exam as reliability patterns for production agentic systems.
Constructor Summary

Constructors:
    ContextWindowManager()

Method Summary

Modifier and Type    Method    Description
List<String>    chunkFile(String fileContent, String systemPrompt)
    Splits file content into chunks that each fit within the context window alongside the given system prompt.
int    computeMaxOutputTokens(String systemPrompt, String fileContent)
    Returns the safe maxTokens value to pass to MessageCreateParams.builder().maxTokens(...).
int    estimateTokens(String text)
    Estimates the token count for any string.
int    getDefaultMaxOutputTokens()
    The default maxOutputTokens used when actual file content is not yet known, for example when OrchestratorAgent builds an AgentContext before the agent has read the files.
Constructor Details
ContextWindowManager
public ContextWindowManager()
Method Details
estimateTokens

public int estimateTokens(String text)

Estimates the token count for any string. Uses the 4-chars-per-token approximation: fast and dependency-free.

Parameters:
    text - system prompt, file content, or any other string
Returns:
    estimated token count; 0 for null or empty input
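A minimal sketch of this estimator. The rounding behavior is an assumption of the sketch (the real class may floor instead of rounding up):

```java
public class TokenEstimate {
    /** 4-chars-per-token estimate; rounding up is an assumption of this sketch. */
    public static int estimateTokens(String text) {
        if (text == null || text.isEmpty()) return 0;
        return (text.length() + 3) / 4; // ceiling division
    }
}
```

For Java source this tracks real tokenizer counts closely enough to budget against a 200,000-token window without pulling in a tokenizer dependency.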
computeMaxOutputTokens

public int computeMaxOutputTokens(String systemPrompt, String fileContent)

Returns the safe maxTokens value to pass to MessageCreateParams.builder().maxTokens(...).

Formula:

    inputTokens = estimate(systemPrompt) + estimate(fileContent)
    available   = (CONTEXT_WINDOW - inputTokens) * SAFETY_FACTOR
    result      = min(available, MAX_OUTPUT_TOKENS)

If available <= 0, the content is too large for one call. The caller should invoke chunkFile(String, String) first.

Parameters:
    systemPrompt - the agent's system prompt (result of buildPrompt())
    fileContent - the Java source file to be analyzed
Returns:
    safe max output tokens; 0 means the file must be chunked before calling
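The formula can be sketched as follows. CONTEXT_WINDOW and MAX_OUTPUT_TOKENS mirror values stated elsewhere in this page; SAFETY_FACTOR = 0.9 is an assumed value, not stated in the doc:

```java
public class OutputBudget {
    static final int CONTEXT_WINDOW = 200_000;
    static final int MAX_OUTPUT_TOKENS = 8_096;
    static final double SAFETY_FACTOR = 0.9; // assumption; the real constant may differ

    static int estimateTokens(String text) {
        return (text == null || text.isEmpty()) ? 0 : (text.length() + 3) / 4;
    }

    /** Returns a safe maxTokens value, or 0 when the input alone exceeds the budget. */
    public static int computeMaxOutputTokens(String systemPrompt, String fileContent) {
        int inputTokens = estimateTokens(systemPrompt) + estimateTokens(fileContent);
        int available = (int) ((CONTEXT_WINDOW - inputTokens) * SAFETY_FACTOR);
        if (available <= 0) return 0; // caller must chunk the file first
        return Math.min(available, MAX_OUTPUT_TOKENS);
    }
}
```

Returning 0 instead of a negative number gives the caller a single unambiguous "chunk first" signal rather than letting the API reject the request.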
getDefaultMaxOutputTokens

public int getDefaultMaxOutputTokens()

The default maxOutputTokens used when actual file content is not yet known, for example when OrchestratorAgent builds an AgentContext before the agent has read the files.

This value is conservative: a JSON findings array rarely exceeds 4,000 tokens, so 8,096 gives comfortable headroom without over-allocating.

Returns:
    MAX_OUTPUT_TOKENS
chunkFile

public List<String> chunkFile(String fileContent, String systemPrompt)

Splits file content into chunks that each fit within the context window alongside the given system prompt.

When computeMaxOutputTokens(String, String) returns 0, the file is too large for one API call. This method splits it on line boundaries (never mid-line) so each chunk can be analyzed independently. The results are merged by the caller.

Chunking strategy:

    maxContentTokens = (CONTEXT_WINDOW - promptTokens - MAX_OUTPUT_TOKENS) * SAFETY_FACTOR
    split on newlines until each chunk stays within maxContentTokens

CERTIFICATION NOTE — Context Management & Reliability (15%): File chunking is the standard solution when a codebase file exceeds the context window. The exam tests whether you know to split on semantic boundaries (lines, methods) rather than raw character offsets, and to merge partial results after each chunk is processed.

Parameters:
    fileContent - the full content of the file to split
    systemPrompt - the agent's system prompt (consumed from the budget)
Returns:
    list of chunks; a single-element list if no chunking was needed; each chunk is guaranteed to fit in one API call
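Under the same assumed constants (SAFETY_FACTOR = 0.9 is not stated in the doc), the strategy might look like this sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class FileChunker {
    static final int CONTEXT_WINDOW = 200_000;
    static final int MAX_OUTPUT_TOKENS = 8_096;
    static final double SAFETY_FACTOR = 0.9; // assumption; not stated in the doc

    static int estimateTokens(String text) {
        return (text == null || text.isEmpty()) ? 0 : (text.length() + 3) / 4;
    }

    /** Splits on line boundaries so no chunk is ever cut mid-line. */
    public static List<String> chunkFile(String fileContent, String systemPrompt) {
        int promptTokens = estimateTokens(systemPrompt);
        int maxContentTokens =
                (int) ((CONTEXT_WINDOW - promptTokens - MAX_OUTPUT_TOKENS) * SAFETY_FACTOR);
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : fileContent.split("\n")) {
            // Start a new chunk when adding this line would exceed the budget.
            if (current.length() > 0
                    && estimateTokens(current.toString()) + estimateTokens(line) + 1 > maxContentTokens) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            current.append(line).append('\n');
        }
        if (current.length() > 0) chunks.add(current.toString());
        return chunks;
    }
}
```

The sketch ignores one edge case: a single line larger than the budget still produces an oversized chunk. The caller analyzes each chunk independently and merges the resulting findings arrays.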