
Recipe Text Search


This notebook downloads the Cookbook dataset (20 dishes from multiple cuisines), generates text embeddings from the dish descriptions, and runs semantic search and metadata-filtered queries against ApertureDB.

Step 1: Install Dependencies

%pip install --upgrade --quiet aperturedb sentence-transformers pandas python-dotenv
Note: you may need to restart the kernel to use updated packages.

Step 2: Connect to ApertureDB

Option A: ApertureDB Cloud (recommended):
Sign up for a free 30-day trial. Get your key from Connect → Generate API Key, add it to a .env file, and run the cell below.

Option B: Community Edition (local Docker):
Run this in a terminal before starting the notebook:

docker run -d --name aperturedb \
  -p 55555:55555 -e ADB_MASTER_KEY=admin -e ADB_FORCE_SSL=false \
  aperturedata/aperturedb-community

Then run the Option B config cell instead.

See client configuration options for all connection methods and server setup options for deployment choices.

Option A: ApertureDB Cloud

# Create a .env file in this directory containing:
# APERTUREDB_KEY=your_key_here
from dotenv import load_dotenv
load_dotenv() # loads APERTUREDB_KEY into the environment
# create_connector() reads APERTUREDB_KEY automatically: no further config needed

True

Option B: Community Edition (local Docker)

# Run this cell only if using local Docker
#!adb config create localdb --active \
# --host localhost --port 55555 \
# --username admin --password admin \
# --no-use-ssl --no-interactive

Step 3: Connect and Verify

from aperturedb.CommonLibrary import create_connector

client = create_connector()
response, _ = client.query([{"GetStatus": {}}])
client.print_last_response()
[
    {
        "GetStatus": {
            "info": "OK",
            "status": 0,
            "system": "ApertureDB",
            "version": "0.19.6"
        }
    }
]
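In ApertureDB responses, `status: 0` means success. A minimal sketch of a defensive check, using a hard-coded stand-in shaped like the live response above:

```python
# Stand-in for the live query result; shape copied from the GetStatus output above
response = [{"GetStatus": {"info": "OK", "status": 0,
                           "system": "ApertureDB", "version": "0.19.6"}}]

status = response[0]["GetStatus"]["status"]
assert status == 0, f"server reported error status {status}"
print("connected to", response[0]["GetStatus"]["system"],
      response[0]["GetStatus"]["version"])
```

The same check works on the real `response` returned by `client.query`, since the response is a list with one entry per command in the query.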

Step 4: Load the Cookbook Dataset

The Cookbook's images.adb.csv has 20 dishes with names, captions, and cuisines. We combine dish_name + caption into a single searchable description.

import pandas as pd

dishes = pd.read_csv("https://raw.githubusercontent.com/aperture-data/Cookbook/refs/heads/main/images.adb.csv")
dishes["description"] = dishes["dish_name"] + " - " + dishes["caption"]

print(f"Loaded {len(dishes)} dishes")
dishes[["dish_name", "food_tags", "description"]].head()

Loaded 20 dishes
  dish_name       food_tags  description
0 rajma chawal    Indian     rajma chawal - Beans with rice
1 paneer bhurji   Indian     paneer bhurji - Scrambled cottage cheese with ...
2 moong dal       Indian     moong dal - Yellow petite lentils
3 Butter chicken  Indian     Butter chicken - Chicken in Creamy tomato base...
4 porridge        Scottish   porridge - Porridge with fruits of the forrest...
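Step 8 filters on cuisine, so it helps to know which cuisine values the dataset actually contains. A quick pandas sketch, shown here on a three-row stand-in for the real CSV:

```python
import pandas as pd

# Stand-in rows; the real data comes from images.adb.csv
dishes = pd.DataFrame({
    "dish_name": ["rajma chawal", "porridge", "negi miso ramen"],
    "food_tags": ["Indian", "Scottish", "Japanese"],
})
counts = dishes["food_tags"].value_counts()  # one row per distinct cuisine
print(counts)
```

Run `dishes["food_tags"].value_counts()` on the loaded DataFrame to see the full distribution across the 20 dishes.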

Step 5: Generate Embeddings

all-MiniLM-L6-v2 produces 384-dimensional embeddings and runs on CPU; no GPU is required.

from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(dishes["description"].tolist(), normalize_embeddings=True)
print(f"Embedding shape: {embeddings.shape}") # (20, 384)

Embedding shape: (20, 384)
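Because `normalize_embeddings=True` returns unit-length vectors, cosine similarity between two embeddings reduces to a plain dot product. A numpy sketch with random stand-in vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=384)
b = rng.normal(size=384)

# Normalize to unit length, as normalize_embeddings=True does
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

dot = float(a @ b)  # dot product of unit vectors
full = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))  # textbook cosine
assert abs(dot - full) < 1e-12  # identical once the vectors are normalized
```

This is why the cosine ("CS") metric configured in Step 6 pairs naturally with normalized embeddings.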

Step 6: Store Embeddings in ApertureDB

A DescriptorSet is ApertureDB's vector index. Each dish is stored as a Descriptor with metadata for hybrid filtering.

SET_NAME = "recipe_text_search"

client.query([{"AddDescriptorSet": {
    "name": SET_NAME,
    "dimensions": 384,
    "engine": "HNSW",
    "metric": "CS",
}}])
client.print_last_response()

[
    {
        "AddDescriptorSet": {
            "status": 0
        }
    }
]
for _, row in dishes.iterrows():
    emb = embeddings[row.name].astype("float32")
    client.query([{
        "AddDescriptor": {
            "set": SET_NAME,
            "label": row["food_tags"],
            "properties": {
                "dish_name": row["dish_name"],
                "cuisine": row["food_tags"],
                "caption": row["caption"],
                "recipe_url": row["Recipe URL"],
            },
            "if_not_found": {"dish_name": ["==", row["dish_name"]]},
        }
    }], [emb.tobytes()])

print(f"Stored {len(dishes)} recipe embeddings")

Stored 20 recipe embeddings
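The blob passed alongside `AddDescriptor` is the raw float32 bytes of the vector, which is why the `astype("float32")` cast matters. A numpy sketch of the round trip:

```python
import numpy as np

emb = np.linspace(0.0, 1.0, 384, dtype="float32")  # stand-in embedding
blob = emb.tobytes()                                # 384 floats * 4 bytes = 1536 bytes
assert len(blob) == 384 * 4

restored = np.frombuffer(blob, dtype="float32")     # lossless round trip
assert np.array_equal(emb, restored)
```

Sending a float64 array instead would double the blob size and no longer match the 384-dimension float32 layout the descriptor set expects.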

Step 7: Search by Natural Language Query

def search_recipes(query_text, k=3, cuisine=None):
    query_emb = model.encode([query_text], normalize_embeddings=True)[0].astype("float32")
    find = {
        "set": SET_NAME,
        "k_neighbors": k,
        "distances": True,
        "results": {"all_properties": True},
    }
    if cuisine:
        find["constraints"] = {"cuisine": ["==", cuisine]}
    response, _ = client.query([{"FindDescriptor": find}], [query_emb.tobytes()])
    return response[0]["FindDescriptor"].get("entities", [])

results = search_recipes("spicy curry with rice")
print("Top matches for 'spicy curry with rice':")
for r in results:
    print(f" {r['dish_name']:<45} [{r['cuisine']}] score={1 - r['_distance']:.3f}")

Top matches for 'spicy curry with rice':
rajma chawal [Indian] score=0.416
butter chicken with special fried rice and assorted naan breads [Indian] score=0.447
won ton soup, chicken chow mein, katsu chicken [Chinese] score=0.538

Step 8: Filter by Cuisine

ApertureDB applies metadata constraints server-side during the KNN search, so results are scoped before they are returned.

results = search_recipes("noodles in broth", k=3, cuisine="Japanese")
print("Japanese dishes similar to 'noodles in broth':")
for r in results:
    print(f" {r['dish_name']}")
    print(f" {r['caption']}")
    print(f" {r['recipe_url']}")

Japanese dishes similar to 'noodles in broth':
negi miso ramen
green onion, bean sprouts, pork, noodles
https://japan.recipetineats.com/home-made-miso-ramen/
sushi
nigiri sushi
https://www.masterclass.com/articles/nigiri-recipe
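The `constraints` block is not limited to a single property. A hypothetical sketch of a combined filter, assuming that multiple keys in a constraints block are ANDed together and using the property names stored in Step 6:

```python
SET_NAME = "recipe_text_search"

# Hypothetical combined filter: Indian dishes, excluding one by name.
# Property names mirror those written by AddDescriptor in Step 6;
# the AND semantics of multiple constraint keys is an assumption here.
find = {
    "set": SET_NAME,
    "k_neighbors": 5,
    "distances": True,
    "results": {"all_properties": True},
    "constraints": {
        "cuisine": ["==", "Indian"],
        "dish_name": ["!=", "moong dal"],
    },
}
query = [{"FindDescriptor": find}]
```

The resulting `query` list would be passed to `client.query` together with the query embedding blob, exactly as in `search_recipes`.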

Cleanup

client.query([{"DeleteDescriptorSet": {"with_name": SET_NAME}}])
client.print_last_response()

# To stop the Docker container if you started one:
# !docker stop aperturedb && docker rm aperturedb

[
    {
        "DeleteDescriptorSet": {
            "count": 1,
            "status": 0
        }
    }
]

What's Next?