Vector Search

Overview

Artificial intelligence algorithms can convert various forms of unstructured data—such as speech, images, videos, language, text, and behaviors—into multi-dimensional vectors. These vectors act like coordinates in a mathematical space, representing different entities and their relationships.

The process of transforming unstructured data into vectors is called Embedding, while retrieving these vectors to find the associated entities is known as unstructured retrieval.

Traditional keyword search focuses on keyword matching and often misses the meaning and context behind queries. Semantic search, on the other hand, improves user intent fulfillment, especially for complex queries. It understands query context, handles ambiguous or incomplete searches, and delivers more relevant and useful results.

Vector retrieval is widely used in production systems today; common approaches include:

  1. Using open-source vector retrieval libraries (such as Faiss, ScaNN, Proxima, and NMSLIB) to build an in-house online vector retrieval service exposed via a RESTful API.
  2. Deploying a specialized vector database dedicated to vector retrieval tasks.
  3. Choosing database or data warehouse products that support vector search features.

In many cases, the disadvantages of a dedicated vector database outweigh its benefits: data redundancy, excessive data movement, inconsistency between distributed components, and the need for specialized expertise. There are also practical costs, including the learning curve, labor, and software licensing fees, as well as limited query-language expressiveness and programmability, scalability issues, a narrow toolchain, and weaker data integrity and availability than traditional databases provide. Supporting vector retrieval inside a distributed database instead leverages its distributed architecture to manage large volumes of vector data efficiently. In systems that handle distributed transactions, vector writes, updates, and deletes also inherit the same strong consistency guarantees as those transactions.

Tacnode offers vector retrieval features and supports two index types: IVFFlat (Inverted File with Flat Quantization) and HNSW (Hierarchical Navigable Small World), both enabling efficient vector retrieval and recall. It supports batch data imports, real-time updates and deletions, and combined queries that integrate conditional filtering with vector retrieval.

  • IVFFlat is an approximate nearest neighbor search algorithm using an inverted index to efficiently compare vector similarities. It partitions the vector space into multiple regions, each containing several vectors, and uses an inverted index to quickly locate vectors similar to a specified vector. The number of cluster center points can be determined using the lists parameter. The recommended approach for defining list values is: if rows are fewer than 1 million, use lists = rows/1000; if rows exceed 1 million, use lists = sqrt(rows), which enhances clustering outcomes. Though IVFFlat allows for faster index building and lower memory consumption, query performance may be slightly reduced. This method is particularly advantageous for handling large data sets, significantly decreasing the search space and enhancing query efficiency.
  • HNSW is a graph-based indexing structure that creates a multi-tier navigation graph. It uses a random jump technique for proximity searching, allowing for quick identification of similar vectors across various graph layers. While it requires more memory and a slower index-building process, its advantages include enhanced query performance and improved accuracy when handling high-dimensional data.
                       Clustered Index (IVFFlat)   Graph Index (HNSW)
    Dataset Size       [100K, ~]                   [~, 100M]
    Recall Accuracy    Medium                      High
    Query Performance  Moderate                    High
    Build Speed        Fast                        Slow
    Memory Usage       Low                         High
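The lists sizing rule described above (rows/1000 below 1 million rows, sqrt(rows) above) can be sketched as a small helper. This is an illustrative Python sketch; the function name is hypothetical, not part of any Tacnode API:

```python
import math

def recommended_lists(rows: int) -> int:
    """Heuristic from this section: rows/1000 below 1M rows, sqrt(rows) above."""
    if rows < 1_000_000:
        return max(1, rows // 1000)
    return int(math.sqrt(rows))

print(recommended_lists(500_000))      # 500 centroids for a 500K-row table
print(recommended_lists(100_000_000))  # 10000 centroids for a 100M-row table
```

Note the two branches meet at 1 million rows (1000 centroids either way), so the heuristic is continuous at the threshold.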

Additionally, Tacnode provides vector retrieval through a plug-in compatible with pgvector from the PostgreSQL ecosystem. pgvector offers a clean, user-friendly interface and inherits the broad advantages of the PostgreSQL ecosystem.

Create Vector Table

The vector data type supports up to 2000 dimensions, and you must specify the dimensionality when declaring a vector column. Internally, a vector is an array of floats packed into a compact binary format for efficient storage and retrieval.

CREATE TABLE items
(
    id        bigserial PRIMARY KEY,
    embedding vector(3) -- dimensions
)
    USING columnar;
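Vector values are written as bracketed, comma-separated text literals (e.g. '[1.0,0.0,-1.0]', as in the INSERT examples below). A minimal Python sketch for building such a literal from a list of floats — the helper name is hypothetical:

```python
def to_vector_literal(values):
    """Format a list of floats as a pgvector-style text literal, e.g. '[1.0,0.0,-1.0]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

print(to_vector_literal([1.0, 0.0, -1.0]))  # [1.0,0.0,-1.0]
```

A literal built this way can be passed as an ordinary string parameter to INSERT or ORDER BY clauses; the dimensionality must match the declared vector(n).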

Vector Index

Currently, three distance operators are available:

  • <->: Euclidean distance (L2 distance), using the index parameter vector_l2_ops
  • <#>: inner product distance (in pgvector's convention, <#> returns the negative inner product so that ascending order ranks by largest inner product), using the index parameter vector_ip_ops
  • <=>: cosine distance, using the index parameter vector_cosine_ops
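The three operators correspond to standard distance formulas. A Python sketch of each (the function names are illustrative; note the pgvector convention that <#> yields the negative inner product):

```python
import math

def l2_distance(a, b):          # <->  Euclidean (L2) distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neg_inner_product(a, b):    # <#>  negative inner product (pgvector convention)
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):      # <=>  1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

a, b = [1.0, 0.0, -1.0], [3.0, 1.0, 2.0]
print(l2_distance(a, b))        # sqrt(14) ~= 3.7417
print(neg_inner_product(a, b))  # -1.0
```

All three are "smaller is closer", which is why the query examples later in this page use ORDER BY ... ASC semantics with LIMIT.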

Several vector index types can be chosen at index-creation time.

Creating IVFFlat Clustered Index

Parameter details:

  • lists: Number of clustering centroids. A typical rule: for fewer than 1 million rows, set lists = rows/1000; for more than 1 million rows, set lists = sqrt(rows) to achieve balanced clustering performance.
-- Create a Euclidean distance index
CREATE INDEX ON items USING split_ivfflat (embedding vector_l2_ops) WITH (lists = 100);
-- Creating an Inner Product distance index
CREATE INDEX ON items USING split_ivfflat (embedding vector_ip_ops) WITH (lists = 100);
-- Create a Cosine distance index
CREATE INDEX ON items USING split_ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
 
--Import vector data
INSERT INTO items (embedding) VALUES ('[1.0,0.0,-1.0]');

Creating HNSW Graph Index

Parameter details (parameters can be combined flexibly):

  • m: Max number of neighbors (outdegree) per node. Affects recall accuracy and performance. More neighbors improve graph connectivity and recall but can degrade write and query throughput. Default: 16.

  • ef_construction: Number of candidate neighbors evaluated when selecting the final m connections. Higher values improve graph quality and recall but slow down index construction. Default: 64.

  • quantizer: Vector quantization method. Options: int8, fp16. Converts original floating-point vectors into more compact data types, trading memory footprint for slight reductions in recall/latency. Useful for memory-constrained environments, very large datasets, or when precision requirements are relaxed. Quantization may cause marginal loss in accuracy and somewhat lower query throughput (due to dequantization).

-- Euclidean distance index
CREATE INDEX ON items USING split_hnsw (embedding vector_l2_ops) WITH (m=16, ef_construction=64);
-- Inner product distance index
CREATE INDEX ON items USING split_hnsw (embedding vector_ip_ops) WITH (m=16, ef_construction=64);
-- Cosine similarity index
CREATE INDEX ON items USING split_hnsw (embedding vector_cosine_ops) WITH (m=16, ef_construction=64);
 
-- Index with int8 quantization
CREATE INDEX ON items USING split_hnsw (embedding vector_l2_ops) WITH (quantizer='int8');
-- Index with fp16 quantization
CREATE INDEX ON items USING split_hnsw (embedding vector_l2_ops) WITH (quantizer='fp16');
 
-- Import vector data
INSERT INTO items (embedding) VALUES ('[1.0,0.0,-1.0]');
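To illustrate the accuracy trade-off that the quantizer option involves, here is a minimal symmetric int8 quantization sketch in Python. The actual quantizer inside the index is internal to Tacnode; this only demonstrates the general principle that per-component error is bounded by half the quantization step:

```python
def quantize_int8(vec):
    """Symmetric int8 quantization: map [-max_abs, max_abs] onto [-127, 127]."""
    scale = max(abs(v) for v in vec) / 127.0 or 1.0  # guard against all-zero vectors
    q = [round(v / scale) for v in vec]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

vec = [0.12, -0.98, 0.45, 0.0]
q, scale = quantize_int8(vec)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(vec, restored))
print(q)  # [16, -127, 58, 0] -- 8-bit codes; reconstruction error <= scale / 2
```

Each float32 component shrinks to one byte; the recall loss the table below describes comes from exactly this rounding error accumulating across dimensions.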

Note: Without an index configured, brute-force search will be used for vector retrieval.

Choosing Vector Quantization Method

HNSW supports two quantization types: int8 and fp16. Choose as follows:

  Feature                int8                                          fp16
  Principle              Converts float32 vector to 8-bit integer      Converts float32 vector to 16-bit float
  Memory Reduction       75% (32→8 bits)                               50% (32→16 bits)
  Accuracy Loss          Moderate                                      Low
  Computation Overhead   High (quantization/dequantization required)   Low (direct type casting)
  Recommended Use Cases  Memory-critical, tolerant to some accuracy    Higher accuracy needed, balance between
                         loss, large-scale datasets                    memory and recall, GPU is fp16-capable
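The memory-reduction figures above follow directly from the component widths. A quick worked calculation for the raw vector payload of a hypothetical 1M x 768-dimension dataset (index links and metadata excluded; those add overhead on top):

```python
def vector_payload_bytes(num_vectors, dims, bits_per_component):
    """Raw storage for vector components only, ignoring graph links and metadata."""
    return num_vectors * dims * bits_per_component // 8

n, d = 1_000_000, 768
fp32_bytes = vector_payload_bytes(n, d, 32)  # ~2.86 GiB at full precision
int8_bytes = vector_payload_bytes(n, d, 8)   # 75% smaller
fp16_bytes = vector_payload_bytes(n, d, 16)  # 50% smaller
print(fp32_bytes // 2**20, int8_bytes // 2**20, fp16_bytes // 2**20)  # MiB
```

At this scale, int8 saves roughly 2.1 GiB and fp16 roughly 1.4 GiB of resident index memory, which is often the deciding factor between the two.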

Queries with IVFFlat Index

-- ivfflat.probes: number of cluster centers to probe. Larger values raise recall
-- but slow retrieval; when probes equals the lists value used at build time,
-- the search degenerates into a brute-force scan.
 
-- Global configuration
ALTER SYSTEM SET ivfflat.probes = 10;
 
-- Session-level configuration; takes precedence over the global setting
SET ivfflat.probes = 10;
 
SELECT *
FROM items
ORDER BY embedding <-> '[3,1,2]'
LIMIT 5;
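The query above returns the five rows nearest to '[3,1,2]' by L2 distance. As a mental model, a brute-force equivalent (what the engine falls back to when no index exists, per the note above) can be sketched in Python — the function and sample rows are illustrative only:

```python
import math

def top_k_l2(rows, query, k=5):
    """rows: list of (id, embedding) pairs; return the k ids nearest to query by L2."""
    def dist(vec):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(vec, query)))
    return [rid for rid, vec in sorted(rows, key=lambda r: dist(r[1]))[:k]]

rows = [(1, [3.0, 1.0, 2.0]), (2, [0.0, 0.0, 0.0]), (3, [3.0, 1.0, 2.1])]
print(top_k_l2(rows, [3.0, 1.0, 2.0], k=2))  # [1, 3]
```

An IVFFlat index returns (approximately) the same ranking while scanning only the probes nearest clusters instead of every row.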

Queries with HNSW Index

Parameter details:

  • hnsw.ef_search: Controls the candidate queue length. Increasing ef_search explores more nodes, boosting recall at the cost of higher latency.
SET hnsw.ef_search = 100;
 
SELECT *
FROM items
ORDER BY embedding <-> '[3,1,2]'
LIMIT 5;
