
DescriptorDataCSV

DescriptorDataCSV Objects

class DescriptorDataCSV(CSVParser.CSVParser)

ApertureDB Descriptor Data.

This class loads the Descriptor Data which is present in a CSV file, and converts it into a series of ApertureDB queries.

It is backed by a CSV file with the following columns, plus a NumPy archive file (npz) that holds the descriptors:

filename, index, set, label, PROP_NAME_1, ... PROP_NAME_N, constraint_PROP_NAME_1

filename: Path to an npz file that contains NumPy arrays.

index: The 0-based index of a NumPy array in the npz file.

set: The name of the set this descriptor belongs to, which is the search space that k-NN search queries are restricted to.

label: Arbitrary name given to the label associated with this descriptor.

PROP_NAME_1 ... PROP_NAME_N: Arbitrary properties assigned to this descriptor.

constraint_PROP_NAME_1: A constraint to ensure uniqueness when inserting this descriptor.
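The relationship between the filename and index columns can be illustrated with a small NumPy sketch. It is not the loader's own code: the helper `load_descriptor` is hypothetical, and it assumes that the index refers to the position of an array in the archive's stored order, as produced by passing arrays positionally to `np.savez`.

```python
import os
import tempfile

import numpy as np


def load_descriptor(npz_path, index):
    """Return the array at a 0-based position in an npz archive.

    Hypothetical helper: assumes 'index' means the position of the array
    in the archive's stored order (archive.files).
    """
    with np.load(npz_path) as archive:
        name = archive.files[index]   # e.g. "arr_0", "arr_1", ...
        return archive[name]          # array is read into memory here


# Build a tiny archive with two 4-dimensional descriptors.
tmp_dir = tempfile.mkdtemp()
npz_path = os.path.join(tmp_dir, "kitchen.npz")
np.savez(
    npz_path,
    np.zeros(4, dtype=np.float32),    # stored as arr_0 -> index 0
    np.ones(4, dtype=np.float32),     # stored as arr_1 -> index 1
)

second = load_descriptor(npz_path, 1)
print(second.shape, second.dtype)
```

Two CSV rows that share a filename but differ in index would therefore pick two different arrays out of the same archive.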

Example CSV file:

filename,index,set,label,isTable,UUID,constraint_UUID
/mnt/data/embeddings/kitchen.npz,0,kitchen,kitchen_table,True,AID-0X3E,AID-0X3E
/mnt/data/embeddings/kitchen.npz,1,kitchen,kitchen_table,True,BXY-AB1Z,BXY-AB1Z
/mnt/data/embeddings/dining_chairs.npz,1,dining_chairs,special_chair,False,COO-SE1R,COO-SE1R
...
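A file in this layout could be generated with Python's standard `csv` module. The following is a minimal sketch, not part of the ApertureDB API: the helper `build_rows` and the choice to copy the UUID value into constraint_UUID are illustrative assumptions that mirror the rows shown above.

```python
import csv
import io

# The four fixed columns, followed by properties and their constraints.
FIXED_COLUMNS = ["filename", "index", "set", "label"]


def build_rows(records, unique_property):
    """Copy each record and add a constraint_<property> column.

    Hypothetical helper: the constraint column repeats the value of the
    property that should be unique, as in the example rows.
    """
    rows = []
    for record in records:
        row = dict(record)
        row[f"constraint_{unique_property}"] = record[unique_property]
        rows.append(row)
    return rows


records = [
    {"filename": "/mnt/data/embeddings/kitchen.npz", "index": 0,
     "set": "kitchen", "label": "kitchen_table",
     "isTable": True, "UUID": "AID-0X3E"},
    {"filename": "/mnt/data/embeddings/kitchen.npz", "index": 1,
     "set": "kitchen", "label": "kitchen_table",
     "isTable": True, "UUID": "BXY-AB1Z"},
]

rows = build_rows(records, "UUID")
fieldnames = FIXED_COLUMNS + ["isTable", "UUID", "constraint_UUID"]

# Write to an in-memory buffer; swap in open(path, "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```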

Example usage:


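from aperturedb.DescriptorDataCSV import DescriptorDataCSV
from aperturedb.ParallelLoader import ParallelLoader
# db is an existing connection to an ApertureDB instance
data = DescriptorDataCSV("/path/to/DescriptorData.csv")
loader = ParallelLoader(db)
loader.ingest(data)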

In the above example, the index identifies one NumPy array among the many stored in the npz file; the first two rows refer to the same file and differ only in their index. The UUID and constraint_UUID columns ensure that a Descriptor is inserted only once in the DB.

An entity can be associated with a Descriptor by first ingesting the other objects, then the Descriptors, and finally the connections between them using ConnectionDataCSV.

In the above example, the constraint_UUID ensures that a Descriptor with the specified UUID is inserted only if it does not already exist in the database.
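To make the conversion from a CSV row to a query more concrete, here is a rough sketch of how one row might map onto a conditional insert. The exact query that DescriptorDataCSV generates is not shown on this page; the sketch assumes ApertureDB's `AddDescriptor` command with an `if_not_found` clause, the helper `row_to_query` is hypothetical, and property values are left as the strings read from the CSV (type conversion and the descriptor blob itself are omitted).

```python
import csv
import io

RESERVED_COLUMNS = {"filename", "index", "set", "label"}
CONSTRAINT_PREFIX = "constraint_"

CSV_TEXT = """filename,index,set,label,isTable,UUID,constraint_UUID
/mnt/data/embeddings/kitchen.npz,0,kitchen,kitchen_table,True,AID-0X3E,AID-0X3E
"""


def row_to_query(row):
    """Build an approximate AddDescriptor query from one CSV row.

    Hypothetical helper: illustrates the column roles only, and is not
    the library's implementation.
    """
    # Every column that is neither reserved nor a constraint is a property.
    properties = {
        key: value for key, value in row.items()
        if key not in RESERVED_COLUMNS
        and not key.startswith(CONSTRAINT_PREFIX)
    }
    # constraint_X becomes an equality condition on property X.
    constraints = {
        key[len(CONSTRAINT_PREFIX):]: ["==", value]
        for key, value in row.items()
        if key.startswith(CONSTRAINT_PREFIX)
    }
    command = {
        "set": row["set"],
        "label": row["label"],
        "properties": properties,
    }
    if constraints:
        # Insert only when no existing descriptor matches the condition.
        command["if_not_found"] = constraints
    return [{"AddDescriptor": command}]


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
query = row_to_query(rows[0])
print(query)
```

Under these assumptions, re-running the same ingestion would leave the database unchanged, because the condition on UUID already matches the previously inserted Descriptor.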