Building RAG Pipelines
Retrieval-Augmented Generation (RAG) retrieves the most relevant stored content at query time and passes it as context to an LLM. ApertureDB handles the retrieval step: embed the query, run a KNN search against your stored Descriptors, and return the top-k results.
- RAG Chain from a Website — crawl, chunk, embed, and answer questions with a local LLM
- RAG Chain from Wikipedia — 600k+ paragraph corpus, Cohere embeddings, LangChain retrieval
- Website Chatbot Workflow — no-code: crawl, embed, and deploy a chatbot via the Workflows UI
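Under the hood, the retrieval step boils down to a single `FindDescriptor` command in ApertureDB's JSON query language (the query vector itself travels separately as a binary blob alongside the command list). The helper below is a sketch of that payload; the function name, the set name, and the defaults are illustrative, not part of the API:

```python
def build_knn_query(set_name: str, k: int = 5):
    """Build the JSON body of an ApertureDB KNN search.

    The query embedding is sent as a separate binary blob
    alongside this command list, not inside the JSON.
    """
    return [{
        "FindDescriptor": {
            "set": set_name,       # which DescriptorSet to search
            "k_neighbors": k,      # top-k nearest neighbors
            "distances": True,     # include distances in the response
            "results": {"limit": k},
        }
    }]

query = build_knn_query("my_text_index", k=5)
```

The Python SDK's `Descriptors` class (used in the next section) wraps this pattern for you.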
Reranking with MMR
Maximal Marginal Relevance (MMR) diversifies results so you don't get back near-duplicate matches. It is implemented in the Python SDK's Descriptors class and is available both directly through the SDK and through the LangChain integration.
```python
from aperturedb.Descriptors import Descriptors
from aperturedb.CommonLibrary import create_connector
import numpy as np

client = create_connector()
descriptors = Descriptors(client)

query_vector = np.array([...], dtype="float32")  # your query embedding

descriptors.find_similar_mmr(
    set="my_text_index",
    vector=query_vector,
    k_neighbors=5,   # number of results to return
    fetch_k=50,      # candidates fetched before MMR reranking
    lambda_mult=0.5  # 0.0 = maximum diversity, 1.0 = similarity only
)
results = descriptors.response  # _distance not available; MMR uses vectors internally
```
The Wikipedia RAG example uses MMR reranking in a full end-to-end pipeline.
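For intuition, here is a pure-NumPy sketch of the MMR selection rule: each remaining candidate is scored as `lambda_mult * sim(query, doc) - (1 - lambda_mult) * max_sim(doc, already_selected)`, and the highest scorer is picked next. This is a hypothetical reimplementation for illustration, not ApertureDB's internal code, but it mirrors the `k_neighbors` and `lambda_mult` parameters above:

```python
import numpy as np

def mmr_rerank(query, candidates, k_neighbors=5, lambda_mult=0.5):
    """Select k_neighbors row indices from `candidates` (fetch_k x dim) by MMR."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sim_to_query = c @ q                       # cosine similarity to the query
    selected = [int(np.argmax(sim_to_query))]  # start with the single best match
    while len(selected) < min(k_neighbors, len(c)):
        # Redundancy term: each candidate's worst-case similarity to the picks so far
        sim_to_selected = (c @ c[selected].T).max(axis=1)
        score = lambda_mult * sim_to_query - (1 - lambda_mult) * sim_to_selected
        score[selected] = -np.inf              # never re-pick a chosen doc
        selected.append(int(np.argmax(score)))
    return selected
```

With `lambda_mult=1.0` this degenerates to plain top-k by similarity; lower values increasingly penalize candidates that resemble ones already chosen.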
LangChain RAG Pipeline
ApertureDB is available as a LangChain vector store. The example below connects to an existing DescriptorSet and builds a RAG chain with a local LLM. To populate the DescriptorSet first, see Text and Document Embeddings or ingest a website.
```shell
pip install -U aperturedb langchain langchain-community langchain-core gpt4all
```

```python
from langchain_community.vectorstores import ApertureDB
from langchain_community.embeddings import GPT4AllEmbeddings
from langchain_community.llms import GPT4All
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

embeddings = GPT4AllEmbeddings(model_name="all-MiniLM-L6-v2.gguf2.f16.gguf")
vectorstore = ApertureDB(embeddings=embeddings, descriptor_set="text_search")

# MMR retriever: fetch 20 candidates, return top 4 with diversity
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20},
)

prompt = PromptTemplate.from_template(
    "Use the following documents to answer the question. "
    "If you don't know the answer, say so.\n\n"
    "Context: {context}\nQuestion: {question}\nAnswer:"
)

llm = GPT4All(model="Meta-Llama-3-8B-Instruct.Q4_0.gguf", allow_download=True)

rag_chain = (
    {
        "context": retriever | (lambda docs: "\n\n".join(d.page_content for d in docs)),
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What data types does ApertureDB support?")
print(answer)
```
See the full LangChain Integration guide for setup, similarity search, and metadata filtering.
LlamaIndex RAG Pipeline
ApertureDB integrates with LlamaIndex via ApertureDBVectorStore. See the LlamaIndex Integration guide for a complete example.
Graph-Enhanced RAG
Because embeddings and graph connections live in the same database, you can scope retrieval to a subgraph — for example, restricting results based on user access permissions:
- Secure RAG with Realm Labs and ApertureDB
- Agentic RAG with HuggingFace SmolAgents
- How to Improve RAG Accuracy with AIMon, ApertureDB, and LlamaIndex
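As a concrete sketch of subgraph-scoped retrieval, the query below first finds permitted entities and then restricts the KNN search to Descriptors connected to them via `is_connected_to`. The `FindEntity`/`FindDescriptor` commands and the `is_connected_to` reference mechanism are ApertureDB's query language; the `Document` class, the `access_group` property, and the helper name are a hypothetical schema for illustration:

```python
def build_scoped_knn_query(set_name: str, access_group: str, k: int = 5):
    """KNN search limited to Descriptors linked to permitted documents.

    Command 1 finds the permitted Document entities (hypothetical schema);
    command 2 runs KNN but keeps only Descriptors connected to them.
    """
    return [
        {"FindEntity": {
            "_ref": 1,
            "with_class": "Document",  # hypothetical entity class
            "constraints": {"access_group": ["==", access_group]},
        }},
        {"FindDescriptor": {
            "set": set_name,
            "k_neighbors": k,
            "distances": True,
            "is_connected_to": {"ref": 1},  # scope KNN to the subgraph
            "results": {"limit": k},
        }},
    ]

query = build_scoped_knn_query("my_text_index", "engineering")
```

Because the permission graph and the vectors live in one database, this filtering happens inside the query rather than as a post-processing step in application code.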
What's Next
- Vector Search — KNN, hybrid search, and setup
- Text and Document Embeddings — sentence-transformers, Cohere, PDF chunking
- PDF Notebook — store a PDF blob, chunk, embed, and search
- MCP Server workflow — expose ApertureDB as a retrieval tool for AI agents