Recipe Text Search
This notebook downloads the Cookbook dataset (20 dishes from multiple cuisines), generates text embeddings from the dish descriptions, and runs semantic search and metadata-filtered queries against ApertureDB.
Step 1: Install Dependencies
%pip install --upgrade --quiet aperturedb sentence-transformers pandas python-dotenv
Note: you may need to restart the kernel to use updated packages.
Step 2: Connect to ApertureDB
Option A: ApertureDB Cloud (recommended):
Sign up for a free 30-day trial. Get your key from Connect → Generate API Key, add it to a .env file, and run the cell below.
Option B: Community Edition (local Docker):
Run this in a terminal before starting the notebook:
docker run -d --name aperturedb \
-p 55555:55555 -e ADB_MASTER_KEY=admin -e ADB_FORCE_SSL=false \
aperturedata/aperturedb-community
Then run the Option B config cell instead.
See client configuration options for all connection methods and server setup options for deployment choices.
Option A: ApertureDB Cloud
# Create a .env file in this directory containing:
# APERTUREDB_KEY=your_key_here
from dotenv import load_dotenv
load_dotenv() # loads APERTUREDB_KEY into the environment
# create_connector() reads APERTUREDB_KEY automatically: no further config needed
True
Option B: Community Edition (local Docker)
# Run this cell only if using local Docker
#!adb config create localdb --active \
# --host localhost --port 55555 \
# --username admin --password admin \
# --no-use-ssl --no-interactive
Step 3: Connect and Verify
from aperturedb.CommonLibrary import create_connector
client = create_connector()
response, _ = client.query([{"GetStatus": {}}])
client.print_last_response()
[
{
"GetStatus": {
"info": "OK",
"status": 0,
"system": "ApertureDB",
"version": "0.19.6"
}
}
]
Step 4: Load the Cookbook Dataset
The Cookbook's images.adb.csv has 20 dishes with names, captions, and cuisines. We combine dish_name + caption into a single searchable description.
import pandas as pd
dishes = pd.read_csv("https://raw.githubusercontent.com/aperture-data/Cookbook/refs/heads/main/images.adb.csv")
dishes["description"] = dishes["dish_name"] + " - " + dishes["caption"]
print(f"Loaded {len(dishes)} dishes")
dishes[["dish_name", "food_tags", "description"]].head()
Loaded 20 dishes
| | dish_name | food_tags | description |
|---|---|---|---|
| 0 | rajma chawal | Indian | rajma chawal - Beans with rice |
| 1 | paneer bhurji | Indian | paneer bhurji - Scrambled cottage cheese with ... |
| 2 | moong dal | Indian | moong dal - Yellow petite lentils |
| 3 | Butter chicken | Indian | Butter chicken - Chicken in Creamy tomato base... |
| 4 | porridge | Scottish | porridge - Porridge with fruits of the forrest... |
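As a quick sanity check on the loaded frame, you can count dishes per cuisine with `value_counts`. The sketch below uses a small inline sample with the same columns (hypothetical subset, since the full CSV lives on GitHub):

```python
import pandas as pd

# Inline sample mirroring the Cookbook CSV schema (three hypothetical rows)
dishes = pd.DataFrame({
    "dish_name": ["rajma chawal", "moong dal", "porridge"],
    "caption": ["Beans with rice", "Yellow petite lentils", "Porridge with fruits"],
    "food_tags": ["Indian", "Indian", "Scottish"],
})

# Same derived column as in the notebook: dish_name + " - " + caption
dishes["description"] = dishes["dish_name"] + " - " + dishes["caption"]

# Dishes per cuisine
print(dishes["food_tags"].value_counts().to_dict())  # {'Indian': 2, 'Scottish': 1}
```

The same `value_counts` call on the full 20-row frame shows how the dataset is balanced across cuisines before you embed it.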
Step 5: Generate Embeddings
all-MiniLM-L6-v2 produces 384-dimensional embeddings and runs comfortably on CPU, so no GPU is required.
from sentence_transformers import SentenceTransformer
import numpy as np
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(dishes["description"].tolist(), normalize_embeddings=True)
print(f"Embedding shape: {embeddings.shape}") # (20, 384)
Embedding shape: (20, 384)
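Because `normalize_embeddings=True` gives every vector unit length, a plain dot product is already the cosine similarity. A minimal sketch with synthetic vectors standing in for the model output:

```python
import numpy as np

# Synthetic stand-ins for the (20, 384) model output
rng = np.random.default_rng(0)
emb = rng.normal(size=(20, 384)).astype("float32")
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize each row

q = emb[0]
sims = emb @ q            # dot product == cosine similarity for unit vectors
print(float(sims[0]))     # self-similarity is 1.0 (up to float32 rounding)
print(sims.argsort()[::-1][:3])  # indices of the three nearest rows
```

This is the same geometry the "CS" metric in the next step uses on the server side.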
Step 6: Store Embeddings in ApertureDB
A DescriptorSet is ApertureDB's vector index. Each dish is stored as a Descriptor with metadata for hybrid filtering.
SET_NAME = "recipe_text_search"
client.query([{"AddDescriptorSet": {
    "name": SET_NAME,
    "dimensions": 384,
    "engine": "HNSW",
    "metric": "CS",
}}])
client.print_last_response()
[
{
"AddDescriptorSet": {
"status": 0
}
}
]
for _, row in dishes.iterrows():
    emb = embeddings[row.name].astype("float32")
    client.query([{
        "AddDescriptor": {
            "set": SET_NAME,
            "label": row["food_tags"],
            "properties": {
                "dish_name": row["dish_name"],
                "cuisine": row["food_tags"],
                "caption": row["caption"],
                "recipe_url": row["Recipe URL"],
            },
            "if_not_found": {"dish_name": ["==", row["dish_name"]]},
        }
    }], [emb.tobytes()])
print(f"Stored {len(dishes)} recipe embeddings")
Stored 20 recipe embeddings
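`AddDescriptor` receives the embedding as a raw float32 byte blob (the `emb.tobytes()` above). The encoding round-trips locally, which is a handy check before sending anything to the server:

```python
import numpy as np

emb = np.arange(384, dtype="float32") / 384.0   # stand-in for one embedding
blob = emb.tobytes()                            # what gets sent as the blob

# Reverse the encoding and verify nothing was lost
restored = np.frombuffer(blob, dtype="float32")
assert np.array_equal(restored, emb)

print(len(blob))  # 384 floats * 4 bytes = 1536
```

The blob length must match `dimensions * 4` for the descriptor set, or the insert will be rejected.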
Step 7: Search by Natural Language Query
def search_recipes(query_text, k=3, cuisine=None):
    query_emb = model.encode([query_text], normalize_embeddings=True)[0].astype("float32")
    find = {
        "set": SET_NAME,
        "k_neighbors": k,
        "distances": True,
        "results": {"all_properties": True},
    }
    if cuisine:
        find["constraints"] = {"cuisine": ["==", cuisine]}
    response, _ = client.query([{"FindDescriptor": find}], [query_emb.tobytes()])
    return response[0]["FindDescriptor"].get("entities", [])

results = search_recipes("spicy curry with rice")
print("Top matches for 'spicy curry with rice':")
for r in results:
    print(f"  {r['dish_name']:<45} [{r['cuisine']}] score={1 - r['_distance']:.3f}")
Top matches for 'spicy curry with rice':
rajma chawal [Indian] score=0.416
butter chicken with special fried rice and assorted naan breads [Indian] score=0.447
won ton soup, chicken chow mein, katsu chicken [Chinese] score=0.538
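Each returned entity also carries `_distance`, so you can threshold or re-rank on the client. A sketch over hand-written entities (hypothetical values, not live query output):

```python
# Hand-written stand-ins for FindDescriptor entities
entities = [
    {"dish_name": "rajma chawal", "_distance": 0.416},
    {"dish_name": "won ton soup", "_distance": 0.538},
]

# Keep only matches within a distance cutoff, nearest first
CUTOFF = 0.5
close = sorted(
    (e for e in entities if e["_distance"] <= CUTOFF),
    key=lambda e: e["_distance"],
)
print([e["dish_name"] for e in close])  # ['rajma chawal']
```

A cutoff like this is useful when `k_neighbors` returns weak matches you would rather drop than display.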
Step 8: Filter by Cuisine
ApertureDB applies metadata constraints server-side during the KNN search, so results are scoped to the filter before they are returned.
results = search_recipes("noodles in broth", k=3, cuisine="Japanese")
print("Japanese dishes similar to 'noodles in broth':")
for r in results:
    print(f"  {r['dish_name']}")
    print(f"    {r['caption']}")
    print(f"    {r['recipe_url']}")
Japanese dishes similar to 'noodles in broth':
negi miso ramen
green onion, bean sprouts, pork, noodles
https://japan.recipetineats.com/home-made-miso-ramen/
sushi
nigiri sushi
https://www.masterclass.com/articles/nigiri-recipe
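The `constraints` value is a plain dict mapping a property name to `[operator, value]`. A tiny helper (hypothetical, not part of the aperturedb package) keeps that tidy when combining several equality filters:

```python
def build_constraints(**filters):
    """Build ApertureDB-style equality constraints from keyword filters."""
    return {prop: ["==", value] for prop, value in filters.items()}

print(build_constraints(cuisine="Japanese"))
# {'cuisine': ['==', 'Japanese']}

print(build_constraints(cuisine="Indian", dish_name="moong dal"))
# {'cuisine': ['==', 'Indian'], 'dish_name': ['==', 'moong dal']}
```

The result plugs straight into the `find["constraints"] = ...` line in `search_recipes` above.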
Cleanup
client.query([{"DeleteDescriptorSet": {"with_name": SET_NAME}}])
client.print_last_response()
# To stop the Docker container if you started one:
# !docker stop aperturedb && docker rm aperturedb
[
{
"DeleteDescriptorSet": {
"count": 1,
"status": 0
}
}
]
What's Next?
- Quick Start: load the full Cookbook dataset with CLIP image embeddings and graph connections
- Work with Descriptors: update, delete, and bulk-load embeddings
- Vector RAG: connect ApertureDB to LangChain for question answering
- ApertureDB Cloud: managed instance, free 30-day trial