Class ExtractionBuilder<T>
- Type Parameters:
T - the type to extract
Analogy: this is the bridge from the "AI / unstructured" world into your normal
typed-Java world. After .from(...) returns, no AI is involved any more — you hold a
plain Invoice, Order, or Candidate that your existing code,
JPA entities, EJBs, and PrimeFaces forms already know how to handle.
You never construct this directly — start from EasyAI.extract(Class):
record Invoice(String vendor, String invoiceNumber, java.time.LocalDate date,
               java.math.BigDecimal total, java.util.List<LineItem> items) {}

// From free text (an email body, a chat message, a description)
Invoice inv = EasyAI.extract(Invoice.class).from(emailBody);

// From a document's bytes - parses (Tika) AND extracts in one call
Invoice fromPdf = EasyAI.extract(Invoice.class)
        .from(DocumentSource.of("invoice.pdf", pdfBytes));

// Then it is just data:
em.persist(inv);
if (inv.total().compareTo(LIMIT) > 0) approvalService.require(inv);
Robustness is built in: if the model returns malformed JSON, the extraction retries
(see withRetries(int)); enable validate() to additionally run Jakarta
Bean Validation on the result.
Method Summary
Method - Description
from(DocumentSource source) - Extracts the target type directly from a document's bytes.
from(String text) - Extracts the target type from a plain text string.
validate() - Enables Jakarta Bean Validation on the extracted object.
withApiKey(String apiKey) - Overrides the API key for this extraction.
withBaseUrl(String baseUrl) - Overrides the API base URL (proxies, Azure OpenAI, self-hosted endpoints).
withChatModel(dev.langchain4j.model.chat.ChatModel model) - Injects an externally created ChatModel, bypassing easyai.properties and EasyAI.configure().
withEventListener(EasyAIListener eventListener) - Registers a listener that receives a live EasyAIEvent stream as the extraction runs: EasyAIEvent.Phase.STARTED when it begins, EasyAIEvent.Phase.PROGRESS when the model is queried, EasyAIEvent.Phase.RETRY for each re-attempt on malformed JSON, and a final EasyAIEvent.Phase.RESULT (or EasyAIEvent.Phase.ERROR).
withModel(String modelName) - Overrides the model name for this extraction (e.g. "gpt-4o", "llama3").
withProvider(String provider) - Overrides the provider ("openai" or "ollama") for this extraction.
withRetries(int retries) - Sets how many additional attempts to make if the model returns malformed JSON.
withTemperature(double temperature) - Overrides the sampling temperature.
-
Method Details
-
withModel
Overrides the model name for this extraction (e.g. "gpt-4o", "llama3").
- Parameters:
modelName - the model name
- Returns:
- this builder
-
withApiKey
Overrides the API key for this extraction.
- Parameters:
apiKey - the API key
- Returns:
- this builder
-
withProvider
Overrides the provider ("openai" or "ollama") for this extraction.
- Parameters:
provider - the provider name
- Returns:
- this builder
-
withBaseUrl
Overrides the API base URL (proxies, Azure OpenAI, self-hosted endpoints).
- Parameters:
baseUrl - the base URL
- Returns:
- this builder
-
withTemperature
Overrides the sampling temperature. Extraction defaults to 0.0 (deterministic); raise it only if you have a reason to.
- Parameters:
temperature - value between 0.0 and 1.0
- Returns:
- this builder
-
withRetries
Sets how many additional attempts to make if the model returns malformed JSON.
The default is 2 (so up to three calls in total). Set 0 to disable retrying.
- Parameters:
retries - number of retries on unparseable output (must be >= 0)
- Returns:
- this builder
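The retry arithmetic described above (N retries means at most N + 1 model calls) can be illustrated with a small standalone sketch. It does not use EasyAI and is not the library's implementation: RetrySketch, MalformedOutputException, parseNumber, and extractWithRetries are hypothetical names, the "model" is a plain Supplier, and integer parsing stands in for JSON parsing.

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class RetrySketch {

    /** Hypothetical stand-in for a parse failure on malformed model output. */
    static class MalformedOutputException extends RuntimeException {
        MalformedOutputException(String message) {
            super(message);
        }
    }

    /** Stand-in parser: accepts only integers, rejects everything else as malformed. */
    static Integer parseNumber(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new MalformedOutputException("unparseable output: " + raw);
        }
    }

    /** Calls the model once, then up to `retries` more times while parsing keeps failing. */
    static <T> T extractWithRetries(Supplier<String> model,
                                    Function<String, T> parser,
                                    int retries) {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        MalformedOutputException last = null;
        for (int attempt = 0; attempt <= retries; attempt++) {
            String raw = model.get();
            try {
                return parser.apply(raw);
            } catch (MalformedOutputException e) {
                last = e; // remember the failure and try again if attempts remain
            }
        }
        throw last; // every attempt produced malformed output
    }

    public static void main(String[] args) {
        // Fake model: malformed output on the first two calls, usable output on the third.
        int[] calls = {0};
        Supplier<String> flakyModel = () -> ++calls[0] < 3 ? "not a number" : "42";

        int value = extractWithRetries(flakyModel, RetrySketch::parseNumber, 2);
        System.out.println(value + " after " + calls[0] + " calls"); // prints "42 after 3 calls"
    }
}
```

With retries set to 0 the loop body runs exactly once, which matches the "Set 0 to disable retrying" wording.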
-
validate
Enables Jakarta Bean Validation on the extracted object.
When enabled, constraints such as @NotNull, @Size, or @Min declared on the target type are checked after extraction; a violation throws ExtractionException. Requires a Bean Validation provider (e.g. Hibernate Validator) on the classpath; one is present by default in a Jakarta EE container.
- Returns:
- this builder
-
withChatModel
Injects an externally created ChatModel, bypassing easyai.properties and EasyAI.configure(). Mainly for testing with a mock model.
- Parameters:
model - a pre-built ChatModel instance
- Returns:
- this builder
-
withEventListener
Registers a listener that receives a live EasyAIEvent stream as the extraction runs: EasyAIEvent.Phase.STARTED when it begins, EasyAIEvent.Phase.PROGRESS when the model is queried, EasyAIEvent.Phase.RETRY for each re-attempt on malformed JSON, and a final EasyAIEvent.Phase.RESULT (or EasyAIEvent.Phase.ERROR).
Familiar analogy: a "your form is being processed" status bar. You see it parse, stumble, retry, and finally hand you the finished, typed object.
- Parameters:
eventListener - the listener to receive extraction events (may be null)
- Returns:
- this builder
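To make the event sequence above concrete, here is a standalone sketch that replays it with local stand-ins. Nothing here is EasyAI code: EventOrderSketch, its Phase enum, and run are hypothetical, the listener is a plain Consumer, and a crude "starts with { and ends with }" check stands in for JSON parsing. It also assumes that PROGRESS is emitted again on each re-attempt, which the description above does not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class EventOrderSketch {

    /** Local stand-in mirroring the phase names listed above; not the library's enum. */
    enum Phase { STARTED, PROGRESS, RETRY, RESULT, ERROR }

    /** Runs a fake extraction and reports each phase to the listener (which may be null). */
    static String run(Supplier<String> model, int retries, Consumer<Phase> listener) {
        Consumer<Phase> emit = listener != null ? listener : phase -> { };
        emit.accept(Phase.STARTED);
        for (int attempt = 0; attempt <= retries; attempt++) {
            if (attempt > 0) {
                emit.accept(Phase.RETRY); // one RETRY per re-attempt
            }
            emit.accept(Phase.PROGRESS); // assumption: emitted before every model query
            String raw = model.get();
            // Crude well-formedness check standing in for real JSON parsing.
            if (raw.startsWith("{") && raw.endsWith("}")) {
                emit.accept(Phase.RESULT);
                return raw;
            }
        }
        emit.accept(Phase.ERROR);
        throw new IllegalStateException("model output was malformed on every attempt");
    }

    public static void main(String[] args) {
        List<Phase> seen = new ArrayList<>();
        int[] calls = {0};
        // Fake model: malformed on the first call, well-formed on the second.
        Supplier<String> flakyModel = () -> ++calls[0] == 1 ? "oops" : "{\"vendor\":\"ACME\"}";

        run(flakyModel, 2, seen::add);
        System.out.println(seen); // prints [STARTED, PROGRESS, RETRY, PROGRESS, RESULT]
    }
}
```

A listener like seen::add is enough for tests; a UI would instead map each phase to a status message.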
from
Extracts the target type from a plain text string.
Terminal step of the EasyAI.extract(Type.class).from(...) chain. Resolves the model, then delegates to ExtractionEngine.extract(ChatModel, Class, String, int, boolean).
- Parameters:
text - the source content (email body, message, description, etc.)
- Returns:
- a populated instance of the target type
- Throws:
ExtractionException - if extraction (or validation, if enabled) fails
-
from
Extracts the target type directly from a document's bytes.
Parses the document (PDF, DOCX, TXT, ... via Apache Tika) into text using RagEngine.parseDocumentSources(List) and then extracts from that text, so parsing and extraction happen in a single call. Ideal for a PDF/DOCX pulled from a DMS, a database BLOB, or a user upload.
- Parameters:
source - the document content + file name (extension drives parsing)
- Returns:
- a populated instance of the target type
- Throws:
ExtractionException - if the document yields no text, or extraction/validation fails
-