Building RAG Pipelines
Retrieval-Augmented Generation (RAG) retrieves the most relevant stored content at query time and passes it as context to an LLM. ApertureDB handles the retrieval step: embed the query, run a KNN search against your stored Descriptors, and return the top-k results.
- RAG Chain from a Website — crawl, chunk, embed, and answer questions with a local LLM
- RAG Chain from Wikipedia — 600k+ paragraph corpus, Cohere embeddings, LangChain retrieval
- Website Chatbot Workflow — no-code: crawl, embed, and deploy a chatbot via the Workflows UI
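Under the hood, the retrieval step boils down to a single `FindDescriptor` command in ApertureDB's JSON query language (the query vector itself travels separately as a binary blob alongside the command list). The helper below is a sketch of that payload; the function name, the set name, and the defaults are illustrative, not part of the API:

```python
def build_knn_query(set_name: str, k: int = 5):
    """Build the JSON body of an ApertureDB KNN search.

    The query embedding is sent as a separate binary blob
    alongside this command list, not inside the JSON.
    """
    return [{
        "FindDescriptor": {
            "set": set_name,       # which DescriptorSet to search
            "k_neighbors": k,      # top-k nearest neighbors
            "distances": True,     # include distances in the response
            "results": {"limit": k},
        }
    }]

query = build_knn_query("my_text_index", k=5)
```

The Python SDK's `Descriptors` class (used in the next section) wraps this pattern for you.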
Reranking with MMR
Maximal Marginal Relevance (MMR) diversifies results so you don't get back near-duplicate matches. It is implemented in the Python SDK's Descriptors class and is available both directly through the SDK and through the LangChain integration.
```python
from aperturedb.Descriptors import Descriptors
from aperturedb.CommonLibrary import create_connector
import numpy as np

client = create_connector()
descriptors = Descriptors(client)

query_vector = np.array([...], dtype="float32")  # your query embedding

descriptors.find_similar_mmr(
    set="my_text_index",
    vector=query_vector,
    k_neighbors=5,   # number of results to return
    fetch_k=50,      # candidates fetched before MMR reranking
    lambda_mult=0.5  # 0.0 = maximum diversity, 1.0 = similarity only
)
results = descriptors.response  # _distance not available; MMR uses vectors internally
```
The Wikipedia RAG example uses MMR reranking in a full end-to-end pipeline.
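For intuition, here is a pure-NumPy sketch of the MMR selection rule: each remaining candidate is scored as `lambda_mult * sim(query, doc) - (1 - lambda_mult) * max_sim(doc, already_selected)`, and the highest scorer is picked next. This is a hypothetical reimplementation for illustration, not ApertureDB's internal code, but it mirrors the `k_neighbors` and `lambda_mult` parameters above:

```python
import numpy as np

def mmr_rerank(query, candidates, k_neighbors=5, lambda_mult=0.5):
    """Select k_neighbors row indices from `candidates` (fetch_k x dim) by MMR."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sim_to_query = c @ q                       # cosine similarity to the query
    selected = [int(np.argmax(sim_to_query))]  # start with the single best match
    while len(selected) < min(k_neighbors, len(c)):
        # Redundancy term: each candidate's worst-case similarity to the picks so far
        sim_to_selected = (c @ c[selected].T).max(axis=1)
        score = lambda_mult * sim_to_query - (1 - lambda_mult) * sim_to_selected
        score[selected] = -np.inf              # never re-pick a chosen doc
        selected.append(int(np.argmax(score)))
    return selected
```

With `lambda_mult=1.0` this degenerates to plain top-k by similarity; lower values increasingly penalize candidates that resemble ones already chosen.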
LangChain RAG Pipeline
ApertureDB is available as a LangChain vector store. The example below connects to an existing DescriptorSet and builds a RAG chain with a local LLM. To populate the DescriptorSet first, see Text and Document Embeddings or ingest a website.
```shell
pip install -U aperturedb langchain langchain-community langchain-core gpt4all
```

```python
from langchain_community.vectorstores import ApertureDB
from langchain_community.embeddings import GPT4AllEmbeddings
from langchain_community.llms import GPT4All
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

embeddings = GPT4AllEmbeddings(model_name="all-MiniLM-L6-v2.gguf2.f16.gguf")
vectorstore = ApertureDB(embeddings=embeddings, descriptor_set="text_search")

# MMR retriever: fetch 20 candidates, return top 4 with diversity
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20},
)

prompt = PromptTemplate.from_template(
    "Use the following documents to answer the question. "
    "If you don't know the answer, say so.\n\n"
    "Context: {context}\nQuestion: {question}\nAnswer:"
)

llm = GPT4All(model="Meta-Llama-3-8B-Instruct.Q4_0.gguf", allow_download=True)

rag_chain = (
    {
        "context": retriever | (lambda docs: "\n\n".join(d.page_content for d in docs)),
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What data types does ApertureDB support?")
print(answer)
```
See the full LangChain Integration guide for setup, similarity search, and metadata filtering.
LlamaIndex RAG Pipeline
ApertureDB integrates with LlamaIndex via ApertureDBVectorStore. See the LlamaIndex Integration guide for a complete example.
Graph-Enhanced RAG
Because embeddings and graph connections live in the same database, you can scope retrieval to a subgraph — for example, restricting results based on user access permissions:
- Secure RAG with Realm Labs and ApertureDB
- Agentic RAG with HuggingFace SmolAgents
- How to Improve RAG Accuracy with AIMon, ApertureDB, and LlamaIndex
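As a concrete sketch of subgraph-scoped retrieval, the query below first finds permitted entities and then restricts the KNN search to Descriptors connected to them via `is_connected_to`. The `FindEntity`/`FindDescriptor` commands and the `is_connected_to` reference mechanism are ApertureDB's query language; the `Document` class, the `access_group` property, and the helper name are a hypothetical schema for illustration:

```python
def build_scoped_knn_query(set_name: str, access_group: str, k: int = 5):
    """KNN search limited to Descriptors linked to permitted documents.

    Command 1 finds the permitted Document entities (hypothetical schema);
    command 2 runs KNN but keeps only Descriptors connected to them.
    """
    return [
        {"FindEntity": {
            "_ref": 1,
            "with_class": "Document",  # hypothetical entity class
            "constraints": {"access_group": ["==", access_group]},
        }},
        {"FindDescriptor": {
            "set": set_name,
            "k_neighbors": k,
            "distances": True,
            "is_connected_to": {"ref": 1},  # scope KNN to the subgraph
            "results": {"limit": k},
        }},
    ]

query = build_scoped_knn_query("my_text_index", "engineering")
```

Because the permission graph and the vectors live in one database, this filtering happens inside the query rather than as a post-processing step in application code.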
What's Next
- Vector Search — KNN, hybrid search, and setup
- Text and Document Embeddings — sentence-transformers, Cohere, PDF chunking
- PDF Notebook — store a PDF blob, chunk, embed, and search
- MCP Server workflow — expose ApertureDB as a retrieval tool for AI agents