Image Classification with PyTorch and ApertureDB

The following notebook illustrates image classification on a set of images retrieved using a dynamic query. The images were ingested into ApertureDB using a related tool, prepare_aperturedb.py. We use the pre-trained AlexNet model in PyTorch for classification.

Prerequisites:

  • Access to an ApertureDB instance.
  • aperturedb-python installed. (Note that PyTorch is pulled in as a dependency of aperturedb.)
  • COCO dataset files downloaded as described in the PyTorch COCO Data example. The following cells use the validation set.

Install ApertureDB SDK

%pip install aperturedb[complete]

Query for images that match the dataset name "prepare_aperturedb"

The query also resizes each image to 256x256 pixels so that it matches the input expected by the AlexNet classifier.
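The query described above is a plain list of dictionaries, so it can be assembled by a small helper instead of being written out by hand. The sketch below mirrors the FindImage command used in this notebook (a dataset_name constraint, a resize operation, and a list of returned properties). The function name build_find_image_query is hypothetical and is not part of the aperturedb package.

```python
import json
from typing import Any, Dict, List


def build_find_image_query(dataset_name: str, width: int, height: int,
                           properties: List[str]) -> List[Dict[str, Any]]:
    """Build a single-command FindImage query (hypothetical helper).

    The structure copies the query used in this notebook; only the
    dataset name, target size, and returned properties are parameters.
    """
    return [{
        "FindImage": {
            # Keep only images whose dataset_name property equals the argument.
            "constraints": {"dataset_name": ["==", dataset_name]},
            # Resize each image before it is returned.
            "operations": [
                {"type": "resize", "width": width, "height": height}
            ],
            # Properties returned alongside each image.
            "results": {"list": list(properties)},
        }
    }]


query = build_find_image_query("prepare_aperturedb", 256, 256, ["image_id"])
print(json.dumps(query, indent=2))
```

Parameterizing the query this way makes it easy to point the same notebook at another dataset name or a different input size.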

The cell below also instantiates the AlexNet classifier and wraps the query in PyTorchDataset.ApertureDBDataset.

This class is a drop-in replacement for torch.utils.data.Dataset and can be used with PyTorch data loaders.

It abstracts away details such as batched retrieval, so that large volumes of data, including data that does not fit entirely in memory, can be used easily within applications.
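To make the batched-retrieval idea concrete, here is a minimal sketch of a map-style dataset that fetches items one batch at a time and keeps only the current batch in memory. It is not ApertureDB's implementation: LazyBatchedDataset, fetch_batch, and fake_backend are hypothetical names, and the fake backend stands in for a database query. The class implements `__len__` and `__getitem__`, which is the map-style dataset protocol that PyTorch data loaders work with.

```python
from typing import Callable, List, Tuple

Item = Tuple[str, int]  # (payload, label); stands in for (image, image_id)


class LazyBatchedDataset:
    """Map-style dataset that pulls items from a backend one batch at a time."""

    def __init__(self, total: int, batch_size: int,
                 fetch_batch: Callable[[int, int], List[Item]]):
        self.total = total
        self.batch_size = batch_size
        self.fetch_batch = fetch_batch  # hypothetical: (offset, limit) -> items
        self._cached_start = -1         # offset of the batch currently in memory
        self._cache: List[Item] = []
        self.fetch_count = 0            # number of backend round trips so far

    def __len__(self) -> int:
        return self.total

    def __getitem__(self, index: int) -> Item:
        if not 0 <= index < self.total:
            raise IndexError(index)  # lets a plain for-loop stop at the end
        start = (index // self.batch_size) * self.batch_size
        if start != self._cached_start:
            # Cache miss: replace the cached batch with the one holding `index`.
            limit = min(self.batch_size, self.total - start)
            self._cache = self.fetch_batch(start, limit)
            self._cached_start = start
            self.fetch_count += 1
        return self._cache[index - start]


def fake_backend(offset: int, limit: int) -> List[Item]:
    # Stand-in for a database query returning `limit` items from `offset`.
    return [(f"image-{i}", i) for i in range(offset, offset + limit)]


dataset = LazyBatchedDataset(total=10, batch_size=4, fetch_batch=fake_backend)
labels = [label for _, label in dataset]
print(labels, dataset.fetch_count)
```

Sequential iteration touches each batch once, so memory use is bounded by the batch size rather than by the size of the whole dataset.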

import time

import cv2
from IPython.display import display, Image

from aperturedb import PyTorchDataset
# AlexNetClassifier and dbinfo are helper modules assumed to sit next to this notebook.
import AlexNetClassifier as alexnet
import dbinfo

db = dbinfo.create_connector()
out_file_name = "classification.txt"

# Find images from the "prepare_aperturedb" dataset, resized to 256x256,
# and return each image's image_id property.
query = [{
    "FindImage": {
        "constraints": {
            "dataset_name": ["==", "prepare_aperturedb"]
        },
        "operations": [
            {
                "type": "resize",
                "width": 256,
                "height": 256
            }
        ],
        "results": {
            "list": ["image_id"],
        }
    }
}]

classifier = alexnet.AlexNetClassifier()
with open(out_file_name, 'w') as classification:
    dataset = PyTorchDataset.ApertureDBDataset(db=db, query=query, label_prop='image_id')
    start = time.time()
    for item in dataset:
        image, image_id = item
        label, conf = classifier.classify(image)
        classification.write(f"{image_id}: {label}, confidence = {conf}\n")
        # Swap the channel order, then encode as JPEG for inline display.
        converted = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        encoded = cv2.imencode(ext=".jpeg", img=converted)[1]
        ipyimage = Image(data=encoded, format="JPEG")
        display(ipyimage, f"{image_id}: {label}")

    # Throughput over the whole loop (retrieval, classification, and display).
    print("\rRetrieval performance (imgs/s):",
          len(dataset) / (time.time() - start), end="")

print(f"\nWritten classification results into {out_file_name}")

(Each image is displayed inline, followed by its ID and predicted label.)

'139: restaurant, eating house, eating place, eatery'
'5193: sax, saxophone'
'2261: snorkel'
'2299: pickelhaube'
'2431: restaurant, eating house, eating place, eatery'
'7816: crash helmet'
'5529: ski'
'2473: ski'
'5586: tennis ball'
'2532: ski'
'785: ski'
'7977: unicycle, monocycle'
'872: baseball'
'2685: French horn, horn'
'8021: stage'
'885: racket, racquet'
'6040: streetcar, tram, tramcar, trolley, trolley car'
'8211: motor scooter, scooter'
'1000: steel drum'
'3156: marimba, xylophone'
'1268: liner, ocean liner'
'6460: shopping cart'
'3255: alp'
'8532: bow tie, bow-tie, bowtie'
'1296: cellular telephone, cellular phone, cellphone, cell, mobile phone'
'6471: ballplayer, baseball player'
'1353: barber chair'
'3553: cannon'
'8690: groenendael'
'1490: paddle, boat paddle'
'6763: slot, one-armed bandit'
'8844: pineapple, ananas'
'6771: torch'
'3934: butcher shop, meat market'
'4134: groom, bridegroom'
'9378: stage'
'1584: trolleybus, trolley coach, trackless trolley'
'6894: African elephant, Loxodonta africana'
'4395: bow tie, bow-tie, bowtie'
'9400: hair spray'
'6954: tractor'
'9448: umbrella'
'1761: steel arch bridge'
'7088: umbrella'
'4765: paddle, boat paddle'
'9483: desktop computer'
'9590: restaurant, eating house, eating place, eatery'
'7278: canoe'
'5001: stage'
'9769: snowplow, snowplough'
'139: restaurant, eating house, eating place, eatery'
'5193: sax, saxophone'
'2261: snorkel'
'2299: pickelhaube'
'2431: restaurant, eating house, eating place, eatery'
'7816: crash helmet'
'5529: ski'
'2473: ski'


Retrieval performance (imgs/s): 25.22241178544816
Written classification results into classification.txt
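
The results file can be read back for a quick summary of the predicted labels. The sketch below assumes each line has the form "<id>: <label>, confidence = <number>", as written by the loop above, and that the confidence was written as a plain number; if the classifier returns some other type, the float conversion would need adjusting. The function name parse_classification_lines is hypothetical, and the confidence values in the sample lines are invented for illustration only.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def parse_classification_lines(lines: Iterable[str]) -> List[Tuple[str, str, float]]:
    """Parse '<id>: <label>, confidence = <number>' lines (hypothetical helper)."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Labels may contain commas, so split on the fixed suffix from the right.
        head, conf_text = line.rsplit(", confidence = ", 1)
        image_id, label = head.split(": ", 1)
        records.append((image_id, label, float(conf_text)))
    return records


# IDs and labels are taken from the output above; confidences are made up.
sample = [
    "139: restaurant, eating house, eating place, eatery, confidence = 0.5\n",
    "5529: ski, confidence = 0.75\n",
    "2473: ski, confidence = 0.25\n",
]
records = parse_classification_lines(sample)
counts = Counter(label for _, label, _ in records)
print(counts.most_common(1))
```

To use it on the real file, pass an open file handle, for example `with open("classification.txt") as f: records = parse_classification_lines(f)`.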