What is pgvector?
Modern AI applications produce embeddings: fixed-length arrays of floating-point numbers called vectors that represent the semantic content of text, images, or audio. Similar inputs map to vectors that sit near each other in this space, so retrieval becomes a vector search problem rather than a keyword match. For example, a vector search for "reset my password" can match a document about "account recovery" even if the exact words do not overlap, unlike lexical search, which matches terms directly.
pgvector is a PostgreSQL extension that stores and searches those embeddings with a native vector data type, distance operators, and approximate nearest neighbor indexes. It does not create embeddings itself; you generate them with an embedding model, then insert the resulting vectors into PostgreSQL.
How pgvector Works
pgvector registers a new column type with PostgreSQL. You declare the dimensionality up front, insert vectors as arrays, and query them with distance operators:
CREATE EXTENSION vector;
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    content text,
    -- A 1536-dimensional embedding
    embedding vector(1536)
);
INSERT INTO documents (content, embedding)
VALUES ('PostgreSQL is a relational database', '[0.12, -0.04, ...]');
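The vector literal above is abbreviated, so that INSERT will not run as written. A minimal runnable sketch of the same flow, assuming a hypothetical three-dimensional items table (the table name, column names, and values are illustrative and not part of the schema above):

```sql
-- Hypothetical toy table; real embeddings have hundreds or thousands of dimensions
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    id bigserial PRIMARY KEY,
    label text,
    embedding vector(3)
);

-- Vectors are written as bracketed text literals
INSERT INTO items (label, embedding)
VALUES ('a', '[1, 2, 3]'),
       ('b', '[4, 5, 6]');

-- A literal whose length does not match the declared dimensionality raises an error:
-- INSERT INTO items (label, embedding) VALUES ('c', '[1, 2]');
```

Declaring the dimensionality on the column lets PostgreSQL reject mismatched vectors at write time rather than at query time.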
Four distance operators cover the common similarity measures. The right one depends on how your embeddings were generated:
<=> for cosine distance, which compares vector direction and is the usual choice for normalized text embeddings
<#> for negative inner product, which favors larger dot products and ranks results the same way as cosine distance when vectors are normalized to unit length
<-> for L2 (Euclidean) distance, which compares raw geometric distance and is useful when magnitude carries meaning
<+> for L1 (taxicab) distance, which sums absolute coordinate differences and is less common for modern text embeddings
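To see what each operator returns, you can apply them to literal vectors without any table. This is a small sketch with made-up three-dimensional values; the L1 operator assumes pgvector 0.7.0 or later:

```sql
SELECT
    '[1,2,3]'::vector <=> '[4,5,6]'::vector AS cosine_distance,
    '[1,2,3]'::vector <#> '[4,5,6]'::vector AS negative_inner_product,
    '[1,2,3]'::vector <-> '[4,5,6]'::vector AS l2_distance,
    '[1,2,3]'::vector <+> '[4,5,6]'::vector AS l1_distance;

-- <#> returns the negated inner product, so multiply by -1 to get the dot product.
-- Cosine similarity is 1 minus cosine distance.
SELECT
    ('[1,2,3]'::vector <#> '[4,5,6]'::vector) * -1 AS inner_product,
    1 - ('[1,2,3]'::vector <=> '[4,5,6]'::vector) AS cosine_similarity;
```

All four operators return a distance, so smaller values mean closer vectors and an ascending ORDER BY puts the best matches first.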
A nearest-neighbor query is an ORDER BY on one of these operators with a LIMIT:
SELECT id, content
FROM documents
ORDER BY embedding <=> '[0.10, -0.02, ...]'::vector -- input embedding
LIMIT 5;
Because the result is just rows from a regular table, you can combine vector search with WHERE filters, joins, transactions, and any other SQL feature in the same query. The operator in the query must match the operator class on the index, or PostgreSQL cannot use that index for nearest-neighbor search.
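One way to check whether a query can use a vector index is to look at its plan. The sketch below assumes a row with id = 1 exists and uses its embedding as a stand-in for the query embedding; the plan you see depends on table size, statistics, and which indexes exist:

```sql
-- An "Index Scan" node on the vector index means the ORDER BY operator matched
-- the index's operator class; a "Seq Scan" followed by a sort means it did not
-- (or the planner judged the table too small for the index to help)
EXPLAIN
SELECT id, content
FROM documents
ORDER BY embedding <=> (SELECT embedding FROM documents WHERE id = 1)
LIMIT 5;
```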
Indexes for Approximate Nearest Neighbor Search
Without an index, pgvector runs an exact scan: every row is compared against the query vector. That is accurate but scales linearly with table size. For larger tables, pgvector provides two approximate nearest neighbor (ANN) index types that trade a small amount of recall for much faster queries.
IVFFlat partitions vectors into lists using k-means clustering. At query time, only the most relevant lists are scanned:
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
In pgvector, IVFFlat should be built after data is loaded, since its clusters depend on the existing vector distribution.
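The pgvector README suggests starting with lists = rows / 1000 for tables up to about one million rows and sqrt(rows) beyond that. A rough sketch of applying that rule of thumb, where the index name and the row count of 200,000 are assumptions for illustration:

```sql
-- Estimate a starting value for lists from the current row count
SELECT CASE
           WHEN count(*) <= 1000000 THEN GREATEST(count(*) / 1000, 1)
           ELSE ceil(sqrt(count(*)))::bigint
       END AS suggested_lists
FROM documents;

-- Build after loading data; 200 corresponds to roughly 200,000 rows
CREATE INDEX documents_embedding_ivfflat_idx
ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 200);
```

Treat the result as a starting point only. If the data distribution shifts substantially after the build, rebuilding the index lets the clusters reflect the new data.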
HNSW builds a multi-layer graph where each node links to nearby neighbors. Searches walk the graph from coarse to fine layers:
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
HNSW indexes are slower and more memory-intensive to build than IVFFlat, but they typically deliver better recall-versus-latency tradeoffs and can be built incrementally as rows arrive.
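Because HNSW builds are expensive, it can help to give the build more memory and parallel workers. The following is a sketch of a variant of the statement above with illustrative settings; the index name and the specific values are assumptions, and parallel builds require pgvector 0.6.0 or later:

```sql
-- Illustrative values; size these to the machine running the build
SET maintenance_work_mem = '2GB';
SET max_parallel_maintenance_workers = 4;  -- workers in addition to the leader

-- CONCURRENTLY avoids blocking writes during the build, but the build takes longer
-- and the statement cannot run inside a transaction block
CREATE INDEX CONCURRENTLY documents_embedding_hnsw_idx
ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```

Raising m or ef_construction generally improves recall at the cost of build time and index size.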
HNSW supports operator classes for L2, inner product, cosine, and L1 distance; IVFFlat supports L2, inner product, and cosine, but not L1. The operator class in the index must match the operator used in queries, or PostgreSQL will fall back to a sequential scan. For a deeper walk-through of parameter selection and workload tuning, see Tuning pgvector Performance.
Recall in pgvector is tunable at query time. For IVFFlat, set ivfflat.probes; for HNSW, set hnsw.ef_search. Higher values inspect more candidates and improve recall at the cost of latency.
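A short sketch of setting these parameters, either for the whole session or for a single query. The values are illustrative, and the query assumes a row with id = 1 whose embedding stands in for the query embedding:

```sql
-- Session-wide: applies to every later query on this connection
SET ivfflat.probes = 10;   -- default is 1
SET hnsw.ef_search = 100;  -- default is 40

-- Per query: SET LOCAL reverts when the transaction ends
BEGIN;
SET LOCAL hnsw.ef_search = 200;
SELECT id, content
FROM documents
ORDER BY embedding <=> (SELECT embedding FROM documents WHERE id = 1)
LIMIT 5;
COMMIT;

-- Exact baseline for judging recall: disable index scans for one transaction
BEGIN;
SET LOCAL enable_indexscan = off;
SELECT id, content
FROM documents
ORDER BY embedding <=> (SELECT embedding FROM documents WHERE id = 1)
LIMIT 5;
COMMIT;
```

Comparing the approximate result with the exact baseline on a sample of real queries gives a direct estimate of recall at a given setting.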
Filtered Vector Search
Real applications rarely search the entire corpus. You usually want the nearest vectors that also match some metadata filter, such as a tenant, category, or date range. The query below assumes the documents table also has tenant_id and created_at columns:
SELECT id, content
FROM documents
WHERE tenant_id = 42
AND created_at > now() - interval '30 days'
ORDER BY embedding <=> $1
LIMIT 10;
With approximate indexes, pgvector applies filters after the index scan chooses candidate rows. If a filter removes too many candidates, the query can return fewer than LIMIT rows.
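Ordinary PostgreSQL indexing can reduce how often this happens. The sketch below shows two options, assuming documents has a tenant_id column; the index names and the tenant value 42 are illustrative:

```sql
-- B-tree index: when the filter is very selective, the planner can fetch the
-- matching rows first and compute exact distances on that small set
CREATE INDEX documents_tenant_id_idx ON documents (tenant_id);

-- Partial HNSW index: the graph contains only one tenant's rows, so every
-- candidate it returns already satisfies the filter
CREATE INDEX documents_tenant_42_embedding_idx
ON documents
USING hnsw (embedding vector_cosine_ops)
WHERE (tenant_id = 42);
```

Partial indexes suit a small number of known filter values. For many distinct values, table partitioning on the filter column is another option.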
pgvector 0.8.0 and later support iterative index scans through hnsw.iterative_scan and ivfflat.iterative_scan. When enabled, pgvector keeps scanning more of the HNSW graph or IVFFlat lists until it finds enough filtered rows or reaches a configured scan limit. This improves filtered vector search without requiring a separate query pattern, but it still trades more work for better recall. The limitations guide covers the tradeoffs in more detail.
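A sketch of enabling iterative scans for a session. The limit values are illustrative, and the query assumes a tenant_id column and a row with id = 1 whose embedding stands in for the query embedding:

```sql
-- HNSW: strict_order keeps results in exact distance order;
-- relaxed_order may return them slightly out of order but with better recall
SET hnsw.iterative_scan = strict_order;
SET hnsw.max_scan_tuples = 20000;  -- cap on tuples visited before giving up

-- IVFFlat supports relaxed ordering only
SET ivfflat.iterative_scan = relaxed_order;
SET ivfflat.max_probes = 100;      -- cap on lists probed before giving up

-- The filtered query itself is unchanged
SELECT id, content
FROM documents
WHERE tenant_id = 42
ORDER BY embedding <=> (SELECT embedding FROM documents WHERE id = 1)
LIMIT 10;
```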
Storage and Quantization
A 1536-dimensional vector column takes about 6 KB per row at full precision, since each component is a 4-byte float. For very large tables, pgvector supports lower-precision types that reduce storage and memory usage:
halfvec stores each component as a 16-bit float and can be indexed up to 4,000 dimensions
bit stores binary vectors, indexable up to 64,000 dimensions, for Hamming and Jaccard distance
sparsevec stores only the nonzero elements, indexable up to 1,000 nonzero elements, which is useful for sparse embeddings
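The per-row sizes follow from the storage formulas in the pgvector README: 4 bytes per dimension plus 8 for vector, 2 bytes per dimension plus 8 for halfvec, and one bit per dimension plus 8 bytes for bit. As a quick arithmetic check for 1536 dimensions:

```sql
SELECT
    4 * 1536 + 8 AS vector_bytes,   -- 6152
    2 * 1536 + 8 AS halfvec_bytes,  -- 3080
    1536 / 8 + 8 AS bit_bytes;      -- 200
```

These figures cover the vector value only. Row headers, other columns, and index structures add to the total.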
HNSW can be built on all of these types, and IVFFlat on halfvec and bit, so you can fit larger indexes in memory while keeping the same nearest-neighbor query shape. For very large indexes, binary quantization plus reranking can reduce the working set while largely preserving final result quality.
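Both techniques can be applied to an existing full-precision column through expression indexes. The following is a sketch assuming the 1536-dimensional documents table and a row with id = 1 whose embedding stands in for the query embedding; the index names and the candidate count of 50 are illustrative:

```sql
-- Half precision: index a halfvec cast of the full-precision column
CREATE INDEX documents_embedding_halfvec_idx
ON documents
USING hnsw ((embedding::halfvec(1536)) halfvec_cosine_ops);

-- The query must use the same cast expression to match the index
SELECT id, content
FROM documents
ORDER BY embedding::halfvec(1536)
         <=> (SELECT embedding::halfvec(1536) FROM documents WHERE id = 1)
LIMIT 5;

-- Binary quantization: index one bit per dimension, search by Hamming distance
CREATE INDEX documents_embedding_bit_idx
ON documents
USING hnsw ((binary_quantize(embedding)::bit(1536)) bit_hamming_ops);

-- Fetch 50 candidates by Hamming distance, then rerank them by cosine distance
-- on the original full-precision vectors
SELECT id, content
FROM (
    SELECT id, content, embedding
    FROM documents
    ORDER BY binary_quantize(embedding)::bit(1536)
             <~> (SELECT binary_quantize(embedding) FROM documents WHERE id = 1)
    LIMIT 50
) AS candidates
ORDER BY embedding <=> (SELECT embedding FROM documents WHERE id = 1)
LIMIT 5;
```

The inner LIMIT controls the tradeoff: a larger candidate set costs more reranking work but makes it less likely that a true neighbor is lost to quantization.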
When to Use pgvector
pgvector is a strong fit when your embeddings live alongside relational data you already store in PostgreSQL. Keeping vectors in the same database avoids dual-write problems, lets you join across tables, and reuses existing backup, replication, and access control. It is widely supported by managed Postgres providers and often used as the retrieval layer in RAG systems.
Because vectors capture semantic meaning but miss exact terms like product codes or filenames, most production retrieval pipelines pair pgvector with a full-text search index and combine the two results with hybrid search. The lexical vs. semantic article covers the tradeoffs in more detail.
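One common way to combine the two result sets is reciprocal rank fusion (RRF), which scores each document by the sum of 1 / (k + rank) across the rankings it appears in. The sketch below is one possible formulation, not a prescribed method. It assumes a row with id = 1 whose embedding stands in for the query embedding, and the search phrase, the candidate count of 20, and k = 60 are illustrative choices:

```sql
-- Full-text index over the same table
CREATE INDEX documents_content_fts_idx
ON documents
USING gin (to_tsvector('english', content));

WITH semantic AS (
    -- Top 20 by vector distance, ranked 1..20
    SELECT id, ROW_NUMBER() OVER (ORDER BY distance) AS rank
    FROM (
        SELECT id,
               embedding <=> (SELECT embedding FROM documents WHERE id = 1) AS distance
        FROM documents
        ORDER BY distance
        LIMIT 20
    ) AS nearest
),
keyword AS (
    -- Top 20 by full-text relevance, ranked 1..20
    SELECT id, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
    FROM (
        SELECT id,
               ts_rank_cd(to_tsvector('english', content), query) AS score
        FROM documents, plainto_tsquery('english', 'account recovery') AS query
        WHERE to_tsvector('english', content) @@ query
        ORDER BY score DESC
        LIMIT 20
    ) AS matches
)
-- Documents found by only one method still get a score from that method
SELECT COALESCE(semantic.id, keyword.id) AS id,
       COALESCE(1.0 / (60 + semantic.rank), 0.0)
     + COALESCE(1.0 / (60 + keyword.rank), 0.0) AS rrf_score
FROM semantic
FULL OUTER JOIN keyword ON semantic.id = keyword.id
ORDER BY rrf_score DESC
LIMIT 5;
```

RRF uses only rank positions, so it avoids having to normalize cosine distances and text relevance scores onto a common scale.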
A dedicated vector database may make more sense when you need billions of vectors, very high write throughput on the vector path alone, or features such as native multi-vector ranking and learned indexes that pgvector does not yet provide.
Summary
pgvector turns PostgreSQL into a capable vector database by adding a vector type, distance operators, and IVFFlat and HNSW indexes for approximate nearest neighbor search. It integrates cleanly with the rest of SQL, so embeddings can be filtered, joined, and transacted on like any other column. For many applications that combine structured data with AI-generated embeddings, pgvector provides production-quality similarity search without running a separate system.