dyntabs.ai.extract.ExtractionEngine

public final class ExtractionEngine extends Object

The extraction assembly line: prompt the model with a JSON skeleton, read its answer, parse it into the target type, and (optionally) validate it — retrying if the model's first answer is not valid JSON.

Analogy: RagEngine is to retrieval what ExtractionEngine is to structured output — the low-level worker that ExtractionBuilder delegates to once the caller has chosen a type and options. Think of it as a factory line: SchemaDescriber stamps the blank form, the model fills it in, Gson presses it into a Java object, and a quality-control step (retry + optional Bean Validation) rejects defective parts.
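To make the "blank form" step concrete, here is a minimal sketch of how a JSON skeleton could be derived from a record type by reflection. This page does not show the real SchemaDescriber output, so the class name SkeletonSketch, the method skeletonFor, the sample Invoice record, and the placeholder format are all assumptions made for illustration.

```java
import java.lang.reflect.RecordComponent;
import java.util.StringJoiner;

// Illustrative only: not the dyntabs SchemaDescriber, whose format is not shown here.
final class SkeletonSketch {

    // Hypothetical target type used for the demonstration.
    record Invoice(String customer, double total, boolean paid) {}

    /** Builds a one-line JSON skeleton with a type placeholder per record component. */
    static String skeletonFor(Class<? extends Record> type) {
        StringJoiner fields = new StringJoiner(", ", "{", "}");
        // getRecordComponents() returns components in declaration order.
        for (RecordComponent c : type.getRecordComponents()) {
            fields.add("\"" + c.getName() + "\": \"<" + c.getType().getSimpleName() + ">\"");
        }
        return fields.toString();
    }

    public static void main(String[] args) {
        // prints {"customer": "<String>", "total": "<double>", "paid": "<boolean>"}
        System.out.println(skeletonFor(Invoice.class));
    }
}
```

A skeleton of this kind would be embedded in the prompt so the model knows which keys to fill in; POJO support would need field or getter reflection instead.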

It is provider-agnostic: instead of relying on a provider-specific JSON response mode, it instructs the model to emit JSON and then tolerantly extracts the JSON object from the reply (stripping markdown fences or stray prose). This keeps it working across OpenAI, Groq, Ollama, and any other LangChain4J ChatModel.
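The tolerant extraction described above can be approximated with a small brace-matching scan. The following is a sketch under stated assumptions, not the engine's actual code: JsonReplyCleaner and extractJsonObject are hypothetical names, and the approach (take the first balanced top-level object, ignoring braces inside string literals) is one plausible way to skip markdown fences and surrounding prose.

```java
// Illustrative only: hypothetical helper, not part of the dyntabs API.
final class JsonReplyCleaner {

    /**
     * Returns the first balanced top-level JSON object found in the reply,
     * or null if there is none. Braces inside string literals are ignored.
     */
    static String extractJsonObject(String reply) {
        int start = reply.indexOf('{');
        if (start < 0) {
            return null;
        }
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = start; i < reply.length(); i++) {
            char c = reply.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;          // character after a backslash is literal
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return reply.substring(start, i + 1);
                }
            }
        }
        return null; // braces never balanced: treat as unparseable
    }

    public static void main(String[] args) {
        String fence = "`".repeat(3); // a markdown code fence
        String reply = "Sure, here it is:\n" + fence + "json\n"
                + "{\"name\": \"Ada\", \"note\": \"a } inside a string\"}\n"
                + fence + "\nHope that helps.";
        System.out.println(extractJsonObject(reply));
    }
}
```

A limitation of this sketch is that a stray `{` in the prose before the real object would be taken as the start; a more defensive version could try each candidate start until one parses.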

Method Summary

Modifier and Type

Method

Description

static <T> T

extract(dev.langchain4j.model.chat.ChatModel model, Class<T> type, String content, int maxRetries, boolean validate)

Runs the full extraction for one piece of content.

static <T> T

extract(dev.langchain4j.model.chat.ChatModel model, Class<T> type, String content, int maxRetries, boolean validate, EventEmitter emitter)

Same as extract(ChatModel, Class, String, int, boolean), but additionally narrates its progress to the given EventEmitter.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- extract
  
  public static <T> T extract(dev.langchain4j.model.chat.ChatModel model, Class<T> type, String content, int maxRetries, boolean validate)
  
  Runs the full extraction for one piece of content.
  Called by ExtractionBuilder.from(String) after the builder has resolved the model and options.
  
  Type Parameters:
  
  T - the target type
  
  Parameters:
  
  model - the chat model to query (real or a test mock)
  
  type - the class to extract (record or POJO)
  
  content - the source text to extract from
  
  maxRetries - how many additional attempts to make if the model returns unparseable JSON (0 = a single attempt)
  
  validate - whether to run Jakarta Bean Validation on the result
  
  Returns:
  
  a populated instance of type
  
  Throws:
  
  ExtractionException - if no valid JSON could be parsed within the retries, or if validation is enabled and the result is invalid
- extract
  
  public static <T> T extract(dev.langchain4j.model.chat.ChatModel model, Class<T> type, String content, int maxRetries, boolean validate, EventEmitter emitter)
  
  Same as extract(ChatModel, Class, String, int, boolean), but additionally narrates its progress to the given EventEmitter.
  Called by ExtractionBuilder.from(String). Emits a STARTED event up front, a PROGRESS event when the model is queried, a RETRY event for each re-attempt on malformed JSON, and a terminal RESULT (success) or ERROR event. When no listener is registered the emitter is a no-op, so this path adds essentially no overhead when nobody is observing.
  
  Type Parameters:
  
  T - the target type
  
  Parameters:
  
  model - the chat model to query (real or a test mock)
  
  type - the class to extract (record or POJO)
  
  content - the source text to extract from
  
  maxRetries - how many additional attempts to make if the model returns unparseable JSON (0 = a single attempt)
  
  validate - whether to run Jakarta Bean Validation on the result
  
  emitter - the live-event emitter (never null; pass a no-op emitter to disable)
  
  Returns:
  
  a populated instance of type
  
  Throws:
  
  ExtractionException - if no valid JSON could be parsed within the retries, or if validation is enabled and the result is invalid
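
The maxRetries semantics and the event order documented above can be sketched with plain JDK types. Everything here is a stand-in: RetrySketch, its Event enum, the Supplier used as a model, and the Function used as a parser are hypothetical, IllegalStateException takes the place of ExtractionException, and the validation step is omitted. Emitting PROGRESS on every query (rather than once) is also an assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Illustrative only: mirrors the documented behaviour, not the real implementation.
final class RetrySketch {

    enum Event { STARTED, PROGRESS, RETRY, RESULT, ERROR }

    static <T> T extractWithRetries(Supplier<String> model,
                                    Function<String, T> parser,
                                    int maxRetries,
                                    Consumer<Event> emitter) {
        emitter.accept(Event.STARTED);
        RuntimeException last = null;
        // maxRetries counts additional attempts, so 0 means exactly one attempt.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (attempt > 0) {
                emitter.accept(Event.RETRY);
            }
            emitter.accept(Event.PROGRESS);
            try {
                T result = parser.apply(model.get());
                emitter.accept(Event.RESULT);
                return result;
            } catch (RuntimeException e) {
                last = e; // unparseable answer: fall through to the next attempt
            }
        }
        emitter.accept(Event.ERROR);
        throw new IllegalStateException(
                "no parseable answer after " + (maxRetries + 1) + " attempt(s)", last);
    }

    public static void main(String[] args) {
        // Fake model: the first reply is malformed, the second is parseable.
        Iterator<String> replies = List.of("not a number", "42").iterator();
        List<Event> events = new ArrayList<>();
        Integer value = extractWithRetries(replies::next, Integer::parseInt, 1, events::add);
        System.out.println(value + " " + events);
    }
}
```

With one retry allowed and a malformed first reply, the run above records STARTED, PROGRESS, RETRY, PROGRESS, RESULT; with maxRetries set to 0 the same first reply would end in ERROR and an exception.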
