Indexing for Faster Queries

ApertureDB relies on indexes to enable faster queries on attributes of interest or embeddings.

You can create an index before adding any objects or connections (fastest).

You can also add an index after the corresponding objects or connections already exist in the database (slower, since the index has to be built over the existing data).
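As a minimal sketch of the first option, the helper below builds a single query that issues CreateIndex before any AddEntity command, so the index exists before data arrives. The function name `build_index_then_load` and the second sample row are hypothetical and only illustrate the ordering; the command parameters mirror the ones used later on this page.

```python
import json


def build_index_then_load(class_name, property_key, rows):
    """Build one query that creates the index first, then adds the entities."""
    query = [{
        "CreateIndex": {
            "class": class_name,
            "index_type": "entity",
            "property_key": property_key,
        }
    }]
    for row in rows:
        # One AddEntity command per row, placed after the CreateIndex command
        query.append({"AddEntity": {"class": class_name, "properties": dict(row)}})
    return query


rows = [
    {"name": "butter", "macronutrient": "fat"},
    {"name": "olive oil", "macronutrient": "fat"},  # hypothetical sample row
]
query = build_index_then_load("Ingredient", "name", rows)
print(json.dumps(query, indent=2))
```

Once a connector is available, the resulting list can be passed to client.query(query) like any other query on this page.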

Connect to the database

If you haven't already set up or configured the database, check out our quick start guide.

# Install the required client packages if needed
%pip install --upgrade --quiet pip
%pip install --upgrade --quiet aperturedb

from aperturedb.CommonLibrary import create_connector

# Create the connector for ApertureDB
client = create_connector()

Create an index on Ingredient's name

If we often search for ingredients by name, the name property should be indexed.

query = [{
    "CreateIndex": {
        "class": "Ingredient",
        "index_type": "entity",
        "property_key": "name"
    }
},{
    "AddEntity": {                        # GetSchema reports a class's indexes only when the class has entities, so add one
        "class": "Ingredient", 
        "properties": {        
            "name": "butter",  
            "macronutrient": "fat",
            "subgroup": "dairy",
            "category": "vegetarian"
        },
        "if_not_found": {                # conditional add
            "name": ["==", "butter"] 
        }
    }
}]

response, blobs = client.query(query)

client.print_last_response()

[
    {
        "CreateIndex": {
            "status": 0
        }
    },
    {
        "AddEntity": {
            "status": 0
        }
    }
]
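Each command in the response above reports "status": 0. A small helper can check this for a whole response before moving on. This is a sketch under two assumptions: that 0 is the only value we want to treat as a clean success, and that a query-level failure may come back as a single dict rather than a list. The name `all_succeeded` is hypothetical, and the sketch does not try to interpret other status values, some of which may not be errors.

```python
def all_succeeded(response):
    """Return True if every command in the response reports status 0."""
    if not isinstance(response, list):
        # Assumption: a query-level failure is not a list of per-command results
        return False
    return all(
        result.get("status") == 0
        for command in response
        for result in command.values()
    )


sample = [{"CreateIndex": {"status": 0}}, {"AddEntity": {"status": 0}}]
print(all_succeeded(sample))
```

With a live connection, the same check can be applied to the response returned by client.query(query).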

Verify index using GetSchema

# Check that the indexed flag for Ingredient's name is True
# https://docs.aperturedata.io/query_language/Reference/db_commands/GetSchema
query = [{
    "GetSchema": {
        "type": "entities"
    }
}]

# Execute the query to get back a JSON response for GetSchema
response, blobs = client.query(query)

print(response[0]["GetSchema"]["entities"]["classes"]["Ingredient"]["properties"]["name"])

[1, True, 'String']
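The same lookup is repeated several times on this page, so it can be wrapped in a helper. The sketch below assumes, based on the output shown above, that each property entry is a three-element list whose second element is the indexed flag; the first element is assumed to be a count and the third the value type. The name `is_indexed` is hypothetical, and the sample response is hand-written to match the shape printed above.

```python
def is_indexed(response, class_name, property_key):
    """Return True if GetSchema reports the given entity property as indexed."""
    schema = response[0]["GetSchema"]
    classes = (schema.get("entities") or {}).get("classes") or {}
    properties = classes.get(class_name, {}).get("properties") or {}
    entry = properties.get(property_key)
    if entry is None:
        # Unknown class or property: nothing to report as indexed
        return False
    _count, indexed, _value_type = entry
    return bool(indexed)


sample_response = [{
    "GetSchema": {
        "entities": {
            "classes": {
                "Ingredient": {"properties": {"name": [1, True, "String"]}}
            }
        },
        "status": 0,
    }
}]
print(is_indexed(sample_response, "Ingredient", "name"))
```

Returning False for a missing class or property keeps the check usable on an empty database, where the class may not appear in the schema at all.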

Remove the index

Indexes occupy space, so we can remove them when they are no longer needed.

query = [{
    "RemoveIndex": {
        "class": "Ingredient",
        "index_type": "entity",
        "property_key": "name"
    }
}]

response, blobs = client.query(query)

client.print_last_response()

[
    {
        "RemoveIndex": {
            "status": 0
        }
    }
]
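CreateIndex and RemoveIndex take the same three parameters in the examples above, so one builder can produce either command. The sketch below also accepts "connection" as an index_type, which is an assumption drawn from the introduction's mention of indexing connections rather than something demonstrated on this page. The function name `index_command`, the connection class `HasIngredient`, and its `quantity` property are all hypothetical.

```python
VALID_ACTIONS = ("CreateIndex", "RemoveIndex")
VALID_INDEX_TYPES = ("entity", "connection")  # "connection" is an assumption


def index_command(action, class_name, property_key, index_type="entity"):
    """Build a one-command query that creates or removes an index."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {VALID_ACTIONS}")
    if index_type not in VALID_INDEX_TYPES:
        raise ValueError(f"index_type must be one of {VALID_INDEX_TYPES}")
    return [{
        action: {
            "class": class_name,
            "index_type": index_type,
            "property_key": property_key,
        }
    }]


# Same command as the RemoveIndex query above
print(index_command("RemoveIndex", "Ingredient", "name"))

# Hypothetical connection class and property, for illustration only
print(index_command("CreateIndex", "HasIngredient", "quantity", index_type="connection"))
```

Validating the arguments locally catches typos before a round trip to the server.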

Verify index using GetSchema

Here we check again whether the name property of Ingredient entities is indexed. See GetSchema. We're expecting it not to be, because we just removed the index.

query = [{
    "GetSchema": {
        "type": "entities"
    }
}]

# Execute the query to get back a JSON response for GetSchema
response, blobs = client.query(query)

print(response[0]["GetSchema"]["entities"]["classes"]["Ingredient"]["properties"]["name"])

[1, False, 'String']

Using Python SDK Utils for Indexing

The Utils module in the ApertureDB Python SDK provides many helper functions, including ones for creating and removing indexes.

from aperturedb.Utils import Utils

utils = Utils(client)
utils.create_entity_index(class_name="Ingredient", property_key="name")

True

Check again whether the name property of Ingredient entities is now indexed. We're expecting that it will be, because we just created an index.

query = [{
    "GetSchema": {
        "type": "entities"
    }
}]

# Execute the query to get back a JSON response for GetSchema
response, blobs = client.query(query)

print(response[0]["GetSchema"]["entities"]["classes"]["Ingredient"]["properties"]["name"])

[1, True, 'String']

The Utils module can also help us to remove an index.

utils.remove_entity_index(class_name="Ingredient", property_key="name")

True

Now check again whether the name property of Ingredient entities is indexed. We're expecting that it won't be, because we just removed the index.

query = [{
    "GetSchema": {
        "type": "entities"
    }
}]

# Execute the query to get back a JSON response for GetSchema
response, blobs = client.query(query)

print(response[0]["GetSchema"]["entities"]["classes"]["Ingredient"]["properties"]["name"])

[1, False, 'String']

Cleanup

query = [{
    "DeleteEntity": {
        "with_class": "Ingredient",
        "constraints": {
            "name": ["==", "butter"] 
        }
    }
}]

response, blobs = client.query(query)

client.print_last_response()

[
    {
        "DeleteEntity": {
            "count": 1,
            "status": 0
        }
    }
]

What's next?

Bulk loading of data
Database administration
