DescriptorDataCSV
DescriptorDataCSV Objects
class DescriptorDataCSV(CSVParser.CSVParser)
ApertureDB Descriptor Data.
This class loads Descriptor Data stored in a CSV file and converts it into a series of ApertureDB queries.
filename, index, set, label, PROP_NAME_1, ... PROP_NAME_N, constraint_PROP_NAME_N
filename: Path to an npz file containing NumPy arrays.
index: The zero-based index of a NumPy array in the npz file.
set: The name of the descriptor set this descriptor belongs to, which is the search space that k-NN queries are restricted to.
label: An arbitrary label associated with this descriptor.
PROP_NAME_1 .. PROP_NAME_N: Arbitrary properties assigned to this descriptor.
constraint_PROP_NAME_N: A constraint on the property PROP_NAME_N that ensures uniqueness when inserting this descriptor.
Example CSV file:
filename,index,set,label,isTable,UUID,constraint_UUID
/mnt/data/embeddings/kitchen.npz,0,kitchen,kitchen_table,True,AID-0X3E,AID-0X3E
/mnt/data/embeddings/kitchen.npz,1,kitchen,kitchen_table,True,BXY-AB1Z,BXY-AB1Z
/mnt/data/embeddings/dining_chairs.npz,1,dining_chairs,special_chair,False,COO-SE1R,COO-SE1R
...
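The column convention above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the parser used by DescriptorDataCSV; the helper name `split_row` and the `RESERVED` tuple are hypothetical, introduced here only to show how the fixed columns, the property columns, and the `constraint_` columns relate. Values are left as strings, apart from `index`, which is converted to an integer.

```python
import csv
import io

# Columns with a fixed meaning, as listed in the column descriptions.
RESERVED = ("filename", "index", "set", "label")
CONSTRAINT_PREFIX = "constraint_"


def split_row(row):
    """Split one CSV row into reserved fields, properties, and constraints.

    Hypothetical helper for illustration; not part of the aperturedb package.
    """
    reserved = {key: row[key] for key in RESERVED}
    reserved["index"] = int(reserved["index"])  # zero-based array index
    # A constraint column is named after the property it constrains.
    constraints = {
        key[len(CONSTRAINT_PREFIX):]: value
        for key, value in row.items()
        if key.startswith(CONSTRAINT_PREFIX)
    }
    # Every remaining column is an ordinary property.
    properties = {
        key: value
        for key, value in row.items()
        if key not in RESERVED and not key.startswith(CONSTRAINT_PREFIX)
    }
    return reserved, properties, constraints


SAMPLE = """\
filename,index,set,label,isTable,UUID,constraint_UUID
/mnt/data/embeddings/kitchen.npz,0,kitchen,kitchen_table,True,AID-0X3E,AID-0X3E
/mnt/data/embeddings/dining_chairs.npz,1,dining_chairs,special_chair,False,COO-SE1R,COO-SE1R
"""

rows = [split_row(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
for reserved, properties, constraints in rows:
    print(reserved["set"], reserved["index"], properties, constraints)
```

Each row yields one descriptor: the reserved fields locate the vector and name its set and label, while the remaining columns become properties or uniqueness constraints.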
Example usage:
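from aperturedb.DescriptorDataCSV import DescriptorDataCSV
from aperturedb.ParallelLoader import ParallelLoader

# client is an already established ApertureDB connection
data = DescriptorDataCSV("/path/to/DescriptorData.csv")
loader = ParallelLoader(client)
loader.ingest(data)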
In the above example, the index uniquely identifies one NumPy array among the many stored in an npz file; rows 1 and 2 reference the same file and differ only in the index. The UUID and constraint_UUID columns ensure that a Descriptor is inserted only once in the database.
To associate an entity with a Descriptor, first ingest the other objects, then the Descriptors, and finally the connections between them using ConnectionDataCSV.
In the above example, constraint_UUID ensures that a descriptor with the specified UUID is inserted only if it does not already exist in the database.
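The conditional insert can be sketched as the kind of query a row might be translated into. This is an assumption about the shape of the generated query, not the loader's actual code: it assumes constraint columns map to the `if_not_found` parameter of the ApertureDB `AddDescriptor` command, and the helper name `build_add_descriptor` is hypothetical. The descriptor vector itself would travel separately as a binary blob and is omitted here.

```python
def build_add_descriptor(set_name, label, properties, constraints):
    """Build an illustrative AddDescriptor command for one CSV row.

    Hypothetical helper; the mapping of constraints to "if_not_found"
    is an assumption made for this sketch.
    """
    command = {
        "set": set_name,
        "label": label,
        "properties": dict(properties),
    }
    if constraints:
        # Insert only when no existing descriptor matches every constraint.
        command["if_not_found"] = {
            name: ["==", value] for name, value in constraints.items()
        }
    return {"AddDescriptor": command}


query = build_add_descriptor(
    set_name="kitchen",
    label="kitchen_table",
    properties={"isTable": True, "UUID": "AID-0X3E"},
    constraints={"UUID": "AID-0X3E"},
)
print(query)
```

Under this assumption, ingesting the same CSV twice would leave the database unchanged the second time, because every row's constraint already matches an existing descriptor.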