Class ContextWindowManager

java.lang.Object
ai.tabforge.workshop.orchestrator.ContextWindowManager

public class ContextWindowManager extends Object
Enforces token budget constraints for each sub-agent API call.

Every Claude API call has two costs that share the same 200,000-token context window:

   INPUT  = system prompt + file content
   OUTPUT = JSON findings array returned by Claude
   INPUT + OUTPUT must be < 200,000 tokens
 

This class answers two questions before every analyzeFile() call:

  1. How many output tokens can we safely request? (computeMaxOutputTokens(String, String))
  2. Is this file too large for one call — and if so, how do we split it? (chunkFile(String, String))

Token estimation uses the 4-characters-per-token approximation. This is not exact (real tokenizers depend on the model and language), but it is accurate enough for Java source code and avoids a dependency on a tokenizer library.
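The approximation can be sketched as follows. This is an illustrative re-implementation, not the actual class: the rounding mode (here, round up) is an assumption, and only the 4-chars-per-token ratio and the "0 for null or empty" contract come from the documentation.

```java
public class TokenEstimate {
    private static final int CHARS_PER_TOKEN = 4;

    /** Roughly 4 characters per token; returns 0 for null or empty input. */
    public static int estimateTokens(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        // Round up so short non-empty strings still count as at least one token
        // (whether the real class rounds up or down is an assumption here).
        return (text.length() + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN;
    }
}
```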

CERTIFICATION NOTE — Context Management & Reliability (15% of exam): This class directly implements the "token budget management" pattern tested in Domain 5. The two key ideas: (1) estimate input cost before making a call, (2) chunk content that would exceed the budget rather than letting the API reject the request. Both appear in the exam as reliability patterns for production agentic systems.

  • Constructor Details

    • ContextWindowManager

      public ContextWindowManager()
  • Method Details

    • estimateTokens

      public int estimateTokens(String text)
      Estimates the token count for any string. Uses the 4-chars-per-token approximation — fast and dependency-free.
      Parameters:
      text - system prompt, file content, or any other string
      Returns:
      estimated token count; 0 for null or empty input
    • computeMaxOutputTokens

      public int computeMaxOutputTokens(String systemPrompt, String fileContent)
      Returns the safe maxTokens value to pass to MessageCreateParams.builder().maxTokens(...).

      Formula:

         inputTokens = estimate(systemPrompt) + estimate(fileContent)
         available   = (CONTEXT_WINDOW - inputTokens) * SAFETY_FACTOR
         result      = min(available, MAX_OUTPUT_TOKENS)
       

      If available <= 0 the content is too large for one call. The caller should invoke chunkFile(String, String) first.

      Parameters:
      systemPrompt - the agent's system prompt (result of buildPrompt())
      fileContent - the Java source file to be analyzed
      Returns:
      safe max output tokens; 0 means the file must be chunked before calling
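      The formula above can be sketched as a standalone snippet. The constants are partly assumptions: CONTEXT_WINDOW and MAX_OUTPUT_TOKENS are taken from this documentation, but the SAFETY_FACTOR value of 0.9 and the round-up token estimate are illustrative guesses, not the real implementation.

      ```java
      public class OutputBudget {
          static final int CONTEXT_WINDOW = 200_000;
          static final int MAX_OUTPUT_TOKENS = 8_096;  // ceiling described in getDefaultMaxOutputTokens()
          static final double SAFETY_FACTOR = 0.9;     // assumed value; the real constant may differ

          // 4-chars-per-token estimate, rounded up (rounding mode is an assumption).
          static int estimateTokens(String text) {
              return (text == null || text.isEmpty()) ? 0 : (text.length() + 3) / 4;
          }

          static int computeMaxOutputTokens(String systemPrompt, String fileContent) {
              int inputTokens = estimateTokens(systemPrompt) + estimateTokens(fileContent);
              int available = (int) ((CONTEXT_WINDOW - inputTokens) * SAFETY_FACTOR);
              if (available <= 0) {
                  return 0; // too large for one call: the caller must chunkFile() first
              }
              return Math.min(available, MAX_OUTPUT_TOKENS);
          }
      }
      ```

      For a typical small prompt and file the result is simply the MAX_OUTPUT_TOKENS cap; only very large inputs push the budget toward zero.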
    • getDefaultMaxOutputTokens

      public int getDefaultMaxOutputTokens()
      The default maxOutputTokens used when actual file content is not yet known — for example, when OrchestratorAgent builds an AgentContext before the agent has read the files.

      This value is conservative: a JSON findings array rarely exceeds 4,000 tokens, so 8,096 gives comfortable headroom without over-allocating.

      Returns:
      MAX_OUTPUT_TOKENS
    • chunkFile

      public List<String> chunkFile(String fileContent, String systemPrompt)
      Splits file content into chunks that each fit within the context window alongside the given system prompt.

      When computeMaxOutputTokens(String, String) returns 0, the file is too large for one API call. This method splits it on line boundaries (never mid-line) so each chunk can be analyzed independently. The results are merged by the caller.

      Chunking strategy:

         maxContentTokens = (CONTEXT_WINDOW - promptTokens - MAX_OUTPUT_TOKENS) * SAFETY_FACTOR
         split on newlines until each chunk stays within maxContentTokens
       

      CERTIFICATION NOTE — Context Management & Reliability (15%): File chunking is the standard solution when a codebase file exceeds the context window. The exam tests whether you know to split on semantic boundaries (lines, methods) rather than raw character offsets, and to merge partial results after each chunk is processed.

      Parameters:
      fileContent - the full content of the file to split
      systemPrompt - the agent's system prompt (consumed from the budget)
      Returns:
      list of chunks — single-element list if no chunking was needed; each chunk is guaranteed to fit in one API call
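      The line-boundary strategy above can be sketched as follows. This is a hedged re-implementation under assumed constants (the 0.9 SAFETY_FACTOR and the round-up token estimate are guesses); only the budget formula, the split-on-newlines rule, and the single-element-list behavior come from the documentation.

      ```java
      import java.util.ArrayList;
      import java.util.List;

      public class FileChunker {
          static final int CONTEXT_WINDOW = 200_000;
          static final int MAX_OUTPUT_TOKENS = 8_096;  // value taken from the docs above
          static final double SAFETY_FACTOR = 0.9;     // assumed value

          // 4-chars-per-token estimate, rounded up (rounding mode is an assumption).
          static int estimateTokens(int chars) {
              return (chars + 3) / 4;
          }

          static int estimateTokens(String s) {
              return (s == null || s.isEmpty()) ? 0 : estimateTokens(s.length());
          }

          static List<String> chunkFile(String fileContent, String systemPrompt) {
              int promptTokens = estimateTokens(systemPrompt);
              int maxContentTokens =
                      (int) ((CONTEXT_WINDOW - promptTokens - MAX_OUTPUT_TOKENS) * SAFETY_FACTOR);

              if (estimateTokens(fileContent) <= maxContentTokens) {
                  return List.of(fileContent);  // single-element list: no chunking needed
              }
              List<String> chunks = new ArrayList<>();
              StringBuilder current = new StringBuilder();
              boolean chunkEmpty = true;
              for (String line : fileContent.split("\n", -1)) {
                  // Flush the current chunk if appending this line would exceed the budget.
                  // (A single line longer than the budget still becomes its own chunk.)
                  if (!chunkEmpty
                          && estimateTokens(current.length() + 1 + line.length()) > maxContentTokens) {
                      chunks.add(current.toString());
                      current.setLength(0);
                      chunkEmpty = true;
                  }
                  if (!chunkEmpty) {
                      current.append('\n');
                  }
                  current.append(line);
                  chunkEmpty = false;
              }
              if (!chunkEmpty) {
                  chunks.add(current.toString());
              }
              return chunks;
          }
      }
      ```

      Because splits happen only at newlines, rejoining the chunks with "\n" reproduces the original file exactly, which is what lets the caller merge per-chunk findings without losing or duplicating lines.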