What is ApertureDB?
ApertureDB is a specialized database for multimodal data. It stores and manages images, videos, documents, feature vectors (embeddings), and associated metadata such as annotations. It natively supports complex searching, preprocessing, and display of media objects.
ApertureDB is based on the open source VDMS (Visual Data Management System) code. It implements a client-server design. The Server handles concurrent client requests and coordinates request execution across the metadata and data components in order to return a unified response to the client.
ApertureDB Core Architecture
The following are the core components of ApertureDB:
- Data management: ApertureDB supports management of images and videos as native entities. Because this data is used in many different ways, we provide preprocessing operations such as zoom, crop, sampling, and thumbnail creation, applied as you access the data through internally linked libraries like OpenCV and ffmpeg. ApertureDB can also manage other modalities such as documents, audio, and text as blobs, but without modality-specific preprocessing operations at this time.
- Vector database: Feature vectors, descriptors, or embeddings extracted from images, text, documents, or frames make it possible to find objects by similarity. ApertureDB offers similarity search over n-dimensional feature vectors or embeddings, built on top of Facebook's FAISS library.
- Graph-based metadata filtering: Metadata is key for many applications. ApertureDB uses an in-memory graph database to store application metadata as a knowledge graph. The images, videos, and embeddings are represented within this metadata together with application information. This captures the relationships between metadata and data and enables complex searches based on this metadata. To serve ML applications, ApertureDB also supports bounding boxes and other regions of interest as part of the metadata, for labeling or annotation-based searches.
- Unified API: An important design goal for ApertureDB was to spare its users from dealing with multiple systems. ApertureDB therefore uses a query engine, or orchestrator, to route user queries to the right components internally and to collect the results into a coherent response. It exposes a unified JSON-based native query language to ML pipelines and end users.
- Backing store: ApertureDB can store and access the data from disks mapped to the ApertureDB server or cloud object stores.
ML pipelines and end users can execute queries that add, modify, and search multimodal data, metadata, annotations, and feature vectors; perform on-the-fly visual preprocessing; and carry out other ML tasks such as creating datasets.
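To make the idea of a unified JSON-based query more concrete, here is a minimal sketch that builds one in Python and serializes it. The command and field names (`FindImage`, `constraints`, `operations`, `results`) are assumptions modeled on the VDMS-style query language, and the `width` and `license` properties are hypothetical application metadata; consult the query language reference before relying on any of them.

```python
import json


def build_find_image_query(min_width, size):
    """Build a one-command query: filter by metadata, then resize on the fly.

    All command and field names here are assumptions for illustration only.
    """
    find_image = {
        "FindImage": {
            # Metadata filter on a hypothetical 'width' property.
            "constraints": {"width": [">=", min_width]},
            # Preprocessing requested together with the search.
            "operations": [{"type": "resize", "width": size, "height": size}],
            # Metadata properties to return alongside each image.
            "results": {"list": ["width", "license"]},
        }
    }
    # A query is modeled here as a JSON array of commands.
    return [find_image]


query = build_find_image_query(min_width=640, size=224)
payload = json.dumps(query, indent=2)
print(payload)
```

The point of the sketch is that the metadata filter and the preprocessing step travel in a single request, so the client does not have to coordinate a separate metadata store and image pipeline.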
ApertureDB is now a distributed database with quite a few more components. You can find more details here.
ApertureDB is unique when compared with other databases and infrastructure tools because:
- ApertureDB implements ACID transactions for queries that span the different data types, thus offering database guarantees at the level of these complex objects.
- It natively supports images in different formats and videos with multiple video encodings and containers, together with efficient frame-level access.
- Images and videos can be augmented or modified on-the-fly, avoiding the need to create copies in predetermined formats that can cause data bloat.
- The graph database supports metadata representing different modalities of data as shown in the example schema here.
- Any number of labeled regions of interest, in various shapes, can be associated with images or frames, and operations allow extracting just a region's pixel data. This can reduce the amount of data transmitted over the network. These regions are linked to the rest of the metadata, extending the application's knowledge graph.
- You can graphically navigate and debug your visual datasets using the ApertureDB UI.
- It can support large-scale ML training and inference through our batch data access API, which brings keyword- and/or feature-based searches into users' existing ML workflows.
- It can be used to index embeddings using different indexing methods. This enables vector search and classification to be performed at runtime with a combination of indexes and distance metrics. It also supports additional metadata constraints on top of k-nearest-neighbor searches.
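The last point, k-nearest-neighbor search combined with metadata constraints, can be illustrated with a small brute-force sketch in plain Python. It is not how ApertureDB or FAISS implement the search internally; the function `knn_with_filter`, the item layout, and the `label` property are all hypothetical and exist only to show the idea.

```python
import math


def knn_with_filter(query, items, k, predicate):
    """Return the k items closest to `query` (Euclidean distance)
    among those whose metadata satisfies `predicate`."""
    # Apply the metadata constraint first, then rank the survivors by distance.
    candidates = [item for item in items if predicate(item["metadata"])]
    candidates.sort(key=lambda item: math.dist(query, item["vector"]))
    return candidates[:k]


# Hypothetical embeddings with one metadata property each.
items = [
    {"id": "a", "vector": [0.0, 0.0], "metadata": {"label": "cat"}},
    {"id": "b", "vector": [1.0, 0.0], "metadata": {"label": "dog"}},
    {"id": "c", "vector": [0.0, 2.0], "metadata": {"label": "cat"}},
    {"id": "d", "vector": [3.0, 3.0], "metadata": {"label": "cat"}},
]

nearest = knn_with_filter(
    [0.9, 0.1], items, k=2, predicate=lambda meta: meta["label"] == "cat"
)
print([item["id"] for item in nearest])  # prints ['a', 'c']
```

Item `b` is the closest vector to the query, but the metadata constraint excludes it, so the result contains only items labeled `cat`.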