qdrant-scaling-query-volume

qdrant

Updated 5 days ago

About

This Claude skill provides Qdrant optimization strategies for handling large query volumes and pagination. It specifically addresses performance problems with high-limit queries across multiple shards through Poisson-distribution-based subsampling. Use this skill for problems with scroll performance, large result sets, or high-cardinality queries in sharded Qdrant deployments.

Documentation

Scaling for Query Volume

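Problem: When a query has a large limit (e.g. 1000) and the collection has multiple shards (e.g. 10), the naive approach has every shard return the full 1000 results, so 10,000 scored points are transferred and merged just to keep 1000. This is wasteful: auto-sharding distributes points randomly across shards, so each shard is expected to hold only about limit / shard count of the true top results (about 100 in this example).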

Core idea

Instead of asking every shard for the full limit, ask each shard for a smaller limit computed via Poisson distribution statistics, then merge. This is safe because auto-sharding guarantees random, independent data distribution.
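A minimal sketch of how such a per-shard limit could be computed, assuming each shard's share of the top results is modeled as Poisson with mean limit / shard count. The function names, and the exact way the 99.9% threshold and the 1.2x factor are combined, are illustrative assumptions and not Qdrant's actual implementation.

```python
import math

SAFETY_FACTOR = 1.2        # safety margin mentioned in the text
POISSON_THRESHOLD = 0.999  # 99.9% threshold mentioned in the text


def poisson_quantile(lam: float, threshold: float) -> int:
    """Smallest k such that P(X <= k) >= threshold for X ~ Poisson(lam)."""
    if lam <= 0:
        return 0
    cdf = 0.0
    k = 0
    while True:
        # pmf(k) computed in log space so large lam does not underflow
        cdf += math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        if cdf >= threshold:
            return k
        k += 1


def per_shard_limit(limit: int, num_shards: int) -> int:
    """Hypothetical helper: reduced limit to request from each shard."""
    if num_shards <= 1:
        return limit
    expected_per_shard = limit / num_shards
    k = poisson_quantile(expected_per_shard, POISSON_THRESHOLD)
    # Never ask a shard for more than the original limit.
    return min(limit, math.ceil(k * SAFETY_FACTOR))


if __name__ == "__main__":
    print(per_shard_limit(1000, 10))
```

For limit 1000 across 10 shards this sketch asks each shard for roughly 160 points instead of 1000, which cuts the merged total from 10,000 to about 1,600.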

When it activates

More than 1 shard
Auto-sharding is in use (all queried shards share the same shard key)
The request's limit + offset >= SHARD_QUERY_SUBSAMPLING_LIMIT (128)
The query is not exact
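The four conditions above can be read as a single predicate. The sketch below is only an illustration: `should_subsample` and its parameters are hypothetical names, with `shard_keys` holding one entry per queried shard (`None` standing in for auto-sharded shards); only the constant name and its value of 128 come from the text.

```python
from typing import Optional, Sequence

SHARD_QUERY_SUBSAMPLING_LIMIT = 128  # threshold named in the text


def should_subsample(
    shard_keys: Sequence[Optional[str]],
    limit: int,
    offset: int,
    exact: bool,
) -> bool:
    """Hypothetical check mirroring the activation conditions listed above."""
    more_than_one_shard = len(shard_keys) > 1
    # Assumption: auto-sharding is detected by all shards sharing one key.
    same_shard_key = len(set(shard_keys)) == 1
    large_enough = limit + offset >= SHARD_QUERY_SUBSAMPLING_LIMIT
    return more_than_one_shard and same_shard_key and large_enough and not exact


if __name__ == "__main__":
    print(should_subsample([None] * 10, limit=1000, offset=0, exact=False))
```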

Key tradeoff

The strategy trades a small probability of slightly incomplete results for a large reduction in inter-shard data transfer, especially for high-limit queries across many shards. The 1.2x safety factor and the 99.9% Poisson threshold keep the error rate very low — comparable to inaccuracies already introduced by approximate vector indices like HNSW.
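To get a feel for this tradeoff, a small Monte Carlo simulation can estimate how often a given per-shard limit would yield incomplete results. This is a sketch under the assumption that each of the true top `limit` points lands on a uniformly random shard; `incomplete_rate` and its parameters are hypothetical names chosen for illustration.

```python
import random


def incomplete_rate(
    limit: int,
    num_shards: int,
    per_shard_limit: int,
    trials: int = 200,
    seed: int = 0,
) -> float:
    """Fraction of trials in which some shard holds more of the true
    top `limit` points than `per_shard_limit` lets it return."""
    rng = random.Random(seed)
    incomplete = 0
    for _ in range(trials):
        counts = [0] * num_shards
        # Place each top-ranked point on a uniformly random shard.
        for _ in range(limit):
            counts[rng.randrange(num_shards)] += 1
        # A shard holding more than it may return truncates the result.
        if max(counts) > per_shard_limit:
            incomplete += 1
    return incomplete / trials


if __name__ == "__main__":
    for candidate in (100, 120, 160):
        print(candidate, incomplete_rate(1000, 10, candidate))
```

Requesting exactly the mean (100 per shard for limit 1000 over 10 shards) is almost always incomplete, because some shard nearly always holds more than its average share. A limit with headroom of about six standard deviations above the mean makes incomplete results extremely rare under this model.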

GitHub Repository

qdrant/skills

Path: skills/qdrant-scaling/scaling-query-volume

agent-skills, ai-agents, claude-code, codex, cursor, embeddings
