PostgreSQL+pgvector on Raspberry Pi 5 8GB RAM
Nov 3, 2025
After trying several methods for MTG card recognition, we decided to explore image vectorization (embeddings) as a promising direction.
But before diving deeper into developing our own vectorizer, it made sense to test one crucial part of the pipeline — “Can a Raspberry Pi 5 efficiently store and search nearest vectors by distance?”
Benchmark setup
- Hardware: Raspberry Pi 5 (8 GB RAM) + 32 GB SD Card
- Dataset size: up to 100 000 records
- Vector dimension: up to 2048
- Target latency: under 0.5 s per query
PostgreSQL + pgvector seemed like a good candidate, but I couldn’t find any benchmarks matching our specific conditions — running on a Raspberry Pi 5 with large vector dimensions — so it was time to run my own tests.
Creating and filling test tables
To understand how query time depends on both vector dimension and table size, we prepared a set of test tables covering the following conditions:
- Vector dimensions: 256, 512, 1024, 2048
- Row counts: 1 000, 4 000, 16 000, 32 000, 64 000, 128 000
Usually, vector data is stored in float32, but in our case we decided to lower the precision to float16.
This cuts the table size roughly in half — which not only improves query performance but also helps extend the SD card’s lifespan.
A bit of code for context (python):
import numpy as np

def create_table(cur, table_name, vec_dim):
    print("  Creating...", end="")
    cur.execute(f'DROP TABLE IF EXISTS "{table_name}";')
    cur.execute(f"""
        CREATE TABLE "{table_name}"
        (id bigserial PRIMARY KEY,
         v halfvec({vec_dim}) STORAGE PLAIN);
    """.strip())
    print(" done.")

def get_random_vec(vec_dim, dtype=np.float16):
    vec = np.random.random(vec_dim).astype(dtype)
    norm = np.linalg.norm(vec)
    if norm > 0:
        vec = vec / norm
    return vec
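Filling the tables is mostly plumbing, so here is only a minimal sketch of how it might look. The helper names (vec_to_text, fill_table) are my own; halfvec accepts the same "[x1,x2,...]" text representation that the vector type does, and batching keeps client-server round-trips down:

```python
def vec_to_text(vec):
    # pgvector's halfvec type accepts the '[x1,x2,...]' text representation
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

def fill_table(cur, table_name, vectors, batch_size=1000):
    # Insert pre-generated vectors in batches to keep round-trips low
    rows = [(vec_to_text(v),) for v in vectors]
    for start in range(0, len(rows), batch_size):
        cur.executemany(
            f'INSERT INTO "{table_name}" (v) VALUES (%s)',
            rows[start:start + batch_size],
        )
```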
Indexing the vector field
At this stage, there are several indexing options available in pgvector, each with different trade-offs in terms of performance and accuracy.
For our use case, we’d ideally want 100% recall, but none of the existing methods guarantee that.
- IVFFlat — fast, lightweight, but approximate; accuracy depends on the number of lists (lists) and search probes (probes).
- HNSW (Hierarchical Navigable Small World) — higher accuracy and recall than IVFFlat, with good scalability for large datasets.
- Exact search (brute force, no index) — full accuracy but extremely slow for large tables.
Given these factors, HNSW looks like the best candidate for our tests — combined with L2 distance for similarity search.
HNSW parameters
The HNSW index allows fine-tuning its performance through several key parameters. Each of them affects either the index build time, memory usage, or search accuracy.
- m — the number of bi-directional links created for each node in the graph. Higher values increase accuracy but also memory consumption and build time.
Typical range: 16–64.
- ef_construction — the number of candidate vectors considered while building the index. Larger values produce higher-quality graphs and more accurate results, but indexing takes longer.
Typical range: 100–400.
- ef_search (used during queries) — controls how many neighbors are explored at search time. Higher values improve recall but slow down queries.
Typical range: 32–128, depending on dataset size and latency goals.
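In pgvector, m and ef_construction are fixed when the index is created, while hnsw.ef_search is a session setting read at query time. A minimal sketch of how that could be wired up (the function names are my own; halfvec_l2_ops pairs the index with L2 distance on half-precision vectors):

```python
def create_hnsw_index(cur, table_name, m=16, ef_construction=100):
    # halfvec_l2_ops: HNSW over L2 distance on a halfvec column
    cur.execute(
        f'CREATE INDEX ON "{table_name}" USING hnsw (v halfvec_l2_ops) '
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )

def set_ef_search(cur, ef_search=64):
    # Per-session setting, read at query time rather than at build time
    cur.execute(f"SET hnsw.ef_search = {ef_search};")
```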
Time for the first tests!
At first, I was skeptical that the Raspberry Pi 5 could handle vector search under 0.5 s per query — especially since that budget also has to cover taking a photo and running the vectorizer.
But to my surprise, the first benchmark (vector dimension = 256) showed incredible results: ~2.24 ms per query on a table with 128 000 records. I felt both excitement and doubt — a strange mix with a hint of incoming disappointment. ;)
In the screenshot, both tests are identical except for one detail: the left side uses LIMIT 100, the right side uses LIMIT ALL.
It took me a while to realize what was happening: when a query includes LIMIT N, ef_search acts as the primary limiter, and only then is LIMIT N applied to the already reduced results.
In practice, this means that even if there are 1000 close vectors in the table and your query says LIMIT 100, you might only get ef_search=32 results — and not necessarily the nearest ones.
In some cases, the searched vector itself (distance = 0) didn’t even appear in the output.
Incomplete and inconsistent search results
Unfortunately, after many long experiments, I couldn’t find a reliable way to make the results both 100% accurate and properly limited by LIMIT N. Increasing ef_search doesn’t fix the issue — it only slightly improves recall and reduces the number of cases where an existing vector is missing from the output.
I’d assume that ef_search serves two roles at once — it controls both the result quality and the number of returned rows. However, this parameter only takes effect when the SQL query includes LIMIT N, which already defines how many results should be returned.
It would make sense to deprecate ef_search as a public setting and calculate it internally based on the specified limit, for example: ef_search = k × N, where k is an optimal multiplier — or simply k = 1 for transparency.
Such an approach would make the behavior more predictable and technically cleaner from an SQL standpoint, avoiding these kinds of “surprising discoveries”. Still, I assume there might be a good reason behind the current design — one that’s just not obvious from the outside.
For now, the only practical solution seems to be to avoid using LIMIT N altogether.
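With that constraint, the result set has to be bounded purely by distance. A sketch of what such a query might look like (vec_to_text and the function name are illustrative; <-> is pgvector's L2 distance operator):

```python
def vec_to_text(vec):
    # pgvector accepts the '[x1,x2,...]' text representation
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

def neighbors_within(cur, table_name, query_vec, threshold):
    # No LIMIT: the result set is bounded only by the distance threshold
    q = vec_to_text(query_vec)
    cur.execute(
        f'SELECT id, v <-> %s::halfvec AS dist FROM "{table_name}" '
        f"WHERE v <-> %s::halfvec < %s ORDER BY dist;",
        (q, q, threshold),
    )
    return cur.fetchall()
```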
Rethinking the approach
Initially, the performance tests were planned with LIMIT 100 in the SELECT queries — this would have allowed equal execution-time comparisons across all tables. However, things changed: since we had to abandon LIMIT N, that method was no longer valid.
The test vectors are random and roughly uniformly distributed, which means that a distance-based search (for example, threshold = 0.7) will return more results as the table size increases. This makes it impossible to directly compare performance across datasets or evaluate scalability accurately. You can actually see that on one of the earlier screenshots.
A possible workaround is to adjust the threshold dynamically to yield approximately the same number of results — say, around 30 records. That number seems reasonable not only for our card-recognition task but also for many other applications such as word or document similarity search, where the top 10–20 results usually carry most of the value.
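One way to automate that adjustment is a simple bisection over the threshold, assuming the result count grows monotonically with it. Here count_at is a hypothetical callback that runs the query at a given threshold and returns the row count:

```python
def calibrate_threshold(count_at, target=30, lo=0.0, hi=2.0, iters=20):
    # Bisect until count_at(threshold) lands near the target row count
    for _ in range(iters):
        mid = (lo + hi) / 2
        if count_at(mid) < target:
            lo = mid  # too few results: widen the radius
        else:
            hi = mid  # enough results: tighten the radius
    return (lo + hi) / 2
```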
Benchmarking performance
The individual tuning of threshold values for each table I’ll leave “off screen” — it’s less exciting than it sounds. :)
The main selection criterion was the median number of returned records. The maximum counts varied too much to be meaningful, so they were ignored — though still shown on the screenshots as reference data.
Test methodology summary:
- Run 100 SELECT queries for each table
- Search for neighbors with distance < threshold
- Calculate the minimum, median, and maximum execution times
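The timing loop itself can stay very small — something along these lines, where run_query is a placeholder for one SELECT round-trip:

```python
import statistics
import time

def benchmark(run_query, n_runs=100):
    # Time each query in milliseconds and summarise min / median / max
    times_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        run_query()
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return min(times_ms), statistics.median(times_ms), max(times_ms)
```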
The screenshot shows four tests, one for each vector dimension: 256, 512, 1024, and 2048.
The growth of query time appears roughly linear — which is a good sign.
For the largest tested setup (dimension = 2048, 128 000 records), the median query time was ≈363.63 ms — well below the desired 0.5 s limit.
What’s Next
We now have solid confirmation that the Raspberry Pi 5 provides enough performance for vector search on PostgreSQL + pgvector.
This means the chosen technical approach is fully viable for our card-sorting device — running entirely offline, without any internet connection or external recognition services.
At this point, the image vectorizer itself is complete. It went through multiple redesigns and improvements during large-scale testing — but that’s a story for the next blog post, coming soon.
To be continued...
Want to support the project?
See how you can help →