Pgvector compatibility
pgvecto.rs is natively compatible with pgvector for:
- Creating tables, for example:
CREATE TABLE t (val vector(3));
- Inserting vector values, for example:
INSERT INTO t (val) VALUES ('[0.6,0.6,0.6]');
pgvecto.rs can be configured to be compatible with pgvector for:
- Creating vector indexes, for example:
CREATE INDEX ON t USING hnsw (val vector_ip_ops);
- Setting query options, for example:
SET ivfflat.probes=10;
This feature is called compatibility mode. It can be enabled with SET vectors.pgvector_compatibility=on.
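As a minimal sketch, compatibility mode can be switched on for the current session and then inspected with PostgreSQL's standard SHOW and RESET commands. Only the variable name comes from this page; the RESET step is included purely to illustrate returning to the configured default.

```sql
-- Enable pgvector compatibility mode for the current session.
SET vectors.pgvector_compatibility = on;

-- Inspect the current value of the setting.
SHOW vectors.pgvector_compatibility;

-- Return to whatever default the server is configured with.
RESET vectors.pgvector_compatibility;
```

A plain SET lasts until the session ends, so each new connection that relies on the pgvector syntax would need to enable the mode again unless it is configured elsewhere.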
Options & Variables
For ivfflat and hnsw indexes, only the following options and variables are available.
- Default values for options are the same as in pgvector, and therefore differ from the pgvecto.rs originals.
- Default values for variables are the same as in pgvecto.rs.
Options for ivfflat:
Key | Type | Default | Description |
---|---|---|---|
lists | integer | 100 | Number of cluster units. |
Variables for ivfflat:
Variable | Type | Default | Description |
---|---|---|---|
ivfflat.probes | integer ([1, 1000000]) | 10 | Number of lists to scan. |
WARNING
The default value of ivfflat.probes is 10, instead of 1 as in pgvector.
Options for hnsw:
Key | Type | Default | Description |
---|---|---|---|
m | integer | 16 | Maximum degree of the node. |
ef_construction | integer | 64 | Search extent in construction. |
Variables for hnsw:
Variable | Type | Default | Description |
---|---|---|---|
hnsw.ef_search | integer ([1, 65535]) | 100 | Search scope of HNSW. |
WARNING
The default value of hnsw.ef_search is 100, instead of 40 as in pgvector.
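Because both variable defaults differ from pgvector, a session that expects pgvector's behavior might set them explicitly. The sketch below uses only the variables and values listed in the warnings above. SET LOCAL is standard PostgreSQL and limits a change to one transaction. The table t with a vector(3) column is assumed to exist already.

```sql
SET vectors.pgvector_compatibility = on;

-- Session-wide: use the pgvector defaults instead of the pgvecto.rs ones.
SET ivfflat.probes = 1;
SET hnsw.ef_search = 40;

-- Alternatively, scope the change to a single transaction.
BEGIN;
SET LOCAL hnsw.ef_search = 40;
SELECT val FROM t ORDER BY val <-> '[0.5,0.5,0.5]' LIMIT 10;
COMMIT;
```

Lower values scan less of the index, which is usually faster but may reduce recall, so the right choice depends on the workload.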
TIP
The original syntax of pgvecto.rs is still available even in compatibility mode.
You can use it with the vectors index like before:
SET vectors.ivf_nprobe=20;
CREATE INDEX ON items USING vectors (embedding vector_l2_ops)
WITH (options = $$
[indexing.ivf]
quantization.product.ratio = "x16"
$$);
Examples
It's easy to enable compatibility mode and start a vector query.
DROP TABLE IF EXISTS t;
SET vectors.pgvector_compatibility=on;
SET hnsw.ef_search=40;
CREATE TABLE t (val vector(3));
INSERT INTO t (val) SELECT ARRAY[random(), random(), random()]::real[] FROM generate_series(1, 1000);
CREATE INDEX hnsw_l2_index ON t USING hnsw (val vector_l2_ops);
SELECT COUNT(1) FROM (SELECT 1 FROM t ORDER BY val <-> '[0.5,0.5,0.5]' limit 100) t2;
DROP INDEX hnsw_l2_index;
Multiple types of indexes are accepted:
SET vectors.pgvector_compatibility=on;
-- [hnsw + vector_l2_ops] index with default options
CREATE INDEX hnsw_l2_index ON t USING hnsw (val vector_l2_ops);
-- [hnsw + vector_cosine_ops] index with single ef_construction option
CREATE INDEX hnsw_cosine_index ON t USING hnsw (val vector_cosine_ops) WITH (ef_construction = 80);
-- anonymous [hnsw + vector_ip_ops] with all options
CREATE INDEX ON t USING hnsw (val vector_ip_ops) WITH (ef_construction = 80, m = 12);
-- [ivfflat + vector_l2_ops] index with default options
CREATE INDEX ivfflat_l2_index ON t USING ivfflat (val vector_l2_ops);
-- [ivfflat + vector_ip_ops] index with all options
CREATE INDEX ivfflat_ip_index ON t USING ivfflat (val vector_ip_ops) WITH (lists = 80);
-- anonymous [ivfflat + vector_ip_ops] with all options
CREATE INDEX ON t USING ivfflat (val vector_ip_ops) WITH (lists = 80);
Limitation
For compatibility, we strive to maintain a consistent user experience, but there are still some limitations in two aspects:
- Some features of pgvecto.rs are not available in compatibility mode.
- Some features of pgvector behave differently in compatibility mode.
Inaccessible features of pgvecto.rs
When executing SQL statements in compatibility mode without the vectors index, some features of pgvecto.rs are not accessible:
- flat index
- Quantization, including scalar quantization and product quantization
- Prefilter and vbase
Difference from pgvector
Compatibility mode focuses on the most commonly used features for creating indexes. So far, there are still a few differences from pgvector.
Known differences include, but are not limited to:
- Creating a btree index on vector data is not supported.
- Index creation is asynchronous in pgvecto.rs, whereas it is synchronous in pgvector.
- Default values of variables differ from pgvector.
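Since index creation is asynchronous, CREATE INDEX may return before the index is fully built. The sketch below shows one way to check progress, assuming the installed pgvecto.rs version exposes the pg_vector_index_stat monitoring view with an idx_indexing column. Neither is described on this page, so verify them against your version. The index name hnsw_async_demo is hypothetical.

```sql
SET vectors.pgvector_compatibility = on;

-- The statement may return while the index is still being built.
CREATE INDEX hnsw_async_demo ON t USING hnsw (val vector_l2_ops);

-- Assumption: pg_vector_index_stat and idx_indexing exist in this version.
-- idx_indexing is expected to be true while the build is still in progress.
SELECT indexname, idx_indexing
FROM pg_vector_index_stat
WHERE indexname = 'hnsw_async_demo';
```

Queries issued before the build finishes may be slower or less accurate than expected, so benchmarks that compare against pgvector should wait until indexing has completed.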