dyntabs.ai.rag.RagEngine

public final class RagEngine extends Object

RAG engine that loads documents, splits them into segments, embeds them, and provides a ContentRetriever for use with AI assistants. Uses the LangChain4J easy-rag module, which includes Apache Tika for parsing PDF/DOCX files and a local embedding model.
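
The retrievers created by this class are tuned with the maxResults and minScore parameters documented under Method Details. The following is a minimal sketch, using only the Java standard library, of how such limits are conventionally applied to scored segments. The class RetrievalLimitsSketch, the ScoredSegment type, and the select method are hypothetical names introduced for illustration; this is not RagEngine's or LangChain4J's actual code.

```java
import java.util.ArrayList;
import java.util.List;

public class RetrievalLimitsSketch {

    // Hypothetical stand-in for a retrieved text segment and its relevance score.
    static final class ScoredSegment {
        final String text;
        final double score;

        ScoredSegment(String text, double score) {
            this.text = text;
            this.score = score;
        }
    }

    // Keeps segments scoring at least minScore, orders them best first,
    // and returns at most maxResults of them.
    static List<ScoredSegment> select(List<ScoredSegment> candidates, int maxResults, double minScore) {
        List<ScoredSegment> kept = new ArrayList<>();
        for (ScoredSegment s : candidates) {
            if (s.score >= minScore) {
                kept.add(s);
            }
        }
        kept.sort((a, b) -> Double.compare(b.score, a.score));
        return new ArrayList<>(kept.subList(0, Math.min(maxResults, kept.size())));
    }

    public static void main(String[] args) {
        List<ScoredSegment> candidates = List.of(
                new ScoredSegment("refund policy", 0.82),
                new ScoredSegment("shipping times", 0.41),
                new ScoredSegment("warranty terms", 0.67),
                new ScoredSegment("contact details", 0.58));
        for (ScoredSegment s : select(candidates, 2, 0.5)) {
            System.out.println(s.score + " " + s.text);
        }
    }
}
```

With the sample data, three segments pass the 0.5 threshold and the cap of 2 keeps the two best: "refund policy" and "warranty terms". Raising minScore trades recall for precision, while maxResults bounds how much context is handed to the assistant.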

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static dev.langchain4j.rag.content.retriever.ContentRetriever | createRetriever(EasyRAG ragAnnotation) | Creates a ContentRetriever based on the EasyRAG annotation. |
| static dev.langchain4j.rag.content.retriever.ContentRetriever | createRetriever(String[] sources, int maxResults, double minScore) | Creates a ContentRetriever programmatically. |
| static dev.langchain4j.rag.content.retriever.ContentRetriever | createRetriever(List<DocumentSource> documentSources, int maxResults, double minScore) | Creates a ContentRetriever from in-memory document sources. |
| static List<dev.langchain4j.data.document.Document> | loadDocuments(String[] sources) | Loads documents from path-based sources (classpath, file system, or relative paths) into LangChain4J Documents. |
| static List<dev.langchain4j.data.document.Document> | parseDocumentSources(List<DocumentSource> sources) | Parses in-memory document sources (byte arrays) into LangChain4J Documents using Apache Tika. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- createRetriever
  
  public static dev.langchain4j.rag.content.retriever.ContentRetriever createRetriever(EasyRAG ragAnnotation)
  
  Creates a ContentRetriever based on the EasyRAG annotation.
- createRetriever
  
  public static dev.langchain4j.rag.content.retriever.ContentRetriever createRetriever(String[] sources, int maxResults, double minScore)
  
  Creates a ContentRetriever programmatically from path-based sources, which are loaded as described under loadDocuments.
- createRetriever
  
  public static dev.langchain4j.rag.content.retriever.ContentRetriever createRetriever(List<DocumentSource> documentSources, int maxResults, double minScore)
  
  Creates a ContentRetriever from in-memory document sources.
  Use this when documents come from a DMS, database, REST API, or any source that provides content as byte[].
  
  Parameters:
  
  documentSources - the documents as byte arrays
  
  maxResults - maximum relevant segments to retrieve
  
  minScore - minimum relevance score (0.0 to 1.0)
  
  Returns:
  
  a configured ContentRetriever
- parseDocumentSources
  
  public static List<dev.langchain4j.data.document.Document> parseDocumentSources(List<DocumentSource> sources)
  
  Parses in-memory document sources (byte arrays) into LangChain4J Documents using Apache Tika.
  Shared loader: called by the in-memory RAG path here in RagEngine and by the Milvus ingestion path (EasyIndexer.index(DocumentSource...)), so byte-array documents are parsed identically whether they end up in memory or in Milvus.
  
  Parameters:
  
  sources - the documents as byte arrays (from a DMS, DB BLOB, upload, etc.)
  
  Returns:
  
  the parsed documents; any source that fails to parse is logged and skipped
- loadDocuments
  
  public static List<dev.langchain4j.data.document.Document> loadDocuments(String[] sources)
  
  Loads documents from path-based sources (classpath, file system, or relative paths) into LangChain4J Documents.
  Shared loader: called by the annotation/programmatic RAG paths here in RagEngine and by the Milvus ingestion path (EasyIndexer.index(String...)), so a "classpath:", "file:", or bare path string resolves the same way regardless of destination.
  
  Parameters:
  
  sources - one or more paths, each optionally prefixed with classpath: or file:
  
  Returns:
  
  the loaded documents; any source that fails to load is logged and skipped
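
Both shared loaders follow the same two conventions: sources may carry an optional classpath: or file: prefix, and a source that fails is logged and skipped rather than aborting the whole batch. The sketch below illustrates these conventions with the Java standard library only. SourceLoadingSketch, Kind, kindOf, locationOf, and loadAll are hypothetical names, and the way RagEngine actually resolves a bare path is not specified here, so the sketch only classifies it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.logging.Logger;

public class SourceLoadingSketch {

    private static final Logger LOG = Logger.getLogger(SourceLoadingSketch.class.getName());

    // Hypothetical classification of a source string by its optional prefix.
    enum Kind { CLASSPATH, FILE, BARE_PATH }

    static Kind kindOf(String source) {
        if (source.startsWith("classpath:")) return Kind.CLASSPATH;
        if (source.startsWith("file:")) return Kind.FILE;
        return Kind.BARE_PATH;
    }

    // Returns the location part of the source with any known prefix removed.
    static String locationOf(String source) {
        switch (kindOf(source)) {
            case CLASSPATH: return source.substring("classpath:".length());
            case FILE:      return source.substring("file:".length());
            default:        return source;
        }
    }

    // Applies the given loader to every source. A source whose loader throws
    // is logged and skipped, so one bad entry does not fail the batch.
    static <T> List<T> loadAll(String[] sources, Function<String, T> loader) {
        List<T> loaded = new ArrayList<>();
        for (String source : sources) {
            try {
                loaded.add(loader.apply(source));
            } catch (RuntimeException e) {
                LOG.warning("Skipping source '" + source + "': " + e.getMessage());
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        String[] sources = { "classpath:docs/manual.txt", "file:/tmp/notes.txt", "broken", "data/faq.txt" };
        // Toy loader for the demo: rejects one entry, otherwise describes the source.
        List<String> result = loadAll(sources, s -> {
            if (s.equals("broken")) throw new IllegalArgumentException("cannot read");
            return kindOf(s) + " -> " + locationOf(s);
        });
        result.forEach(System.out::println);
    }
}
```

In the demo, three of the four sources are returned and the failing one produces only a warning, mirroring the "logged and skipped" behavior described in the Returns sections above. Callers that need to know about failures should therefore compare the size of the result with the number of sources passed in.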
