qdrant-scaling-query-volume

qdrant

Updated 6 days ago

Design

About

This Claude skill provides Qdrant optimization strategies for handling large query volumes and pagination. It specifically addresses performance issues with high-limit queries across multiple shards by implementing Poisson distribution-based subsampling. Use this skill when dealing with scroll performance, large result sets, or high-cardinality queries in sharded Qdrant deployments.

Skill documentation

Scaling for Query Volume

Problem: When a query has a large limit (e.g. 1000) and there are multiple shards (e.g. 10), the naive approach asks every shard for the full 1000 results, so 10,000 scored points are transferred and merged. This is wasteful: because auto-sharding distributes points randomly across shards, each shard is expected to contribute only about limit / shards (here, about 100) of the final top results.

Core idea

Instead of asking every shard for the full limit, ask each shard for a smaller limit computed via Poisson distribution statistics, then merge. This is safe because auto-sharding guarantees random, independent data distribution.
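The per-shard limit can be sketched as follows. This is a minimal illustration, not Qdrant's actual implementation: it assumes the number of top results landing on one shard is modeled as Poisson with mean limit / shards, takes the 99.9% quantile of that distribution, and multiplies it by the 1.2x safety factor mentioned under "Key tradeoff". The function names `poisson_quantile` and `per_shard_limit` are hypothetical.

```python
import math

SAFETY_FACTOR = 1.2         # safety factor quoted in this document
POISSON_CONFIDENCE = 0.999  # Poisson threshold quoted in this document


def poisson_quantile(lam: float, q: float) -> int:
    """Smallest k such that P(X <= k) >= q for X ~ Poisson(lam)."""
    if lam <= 0:
        return 0
    cdf = 0.0
    # Upper bound far beyond any realistic quantile, to guarantee termination.
    max_k = int(lam + 20 * math.sqrt(lam) + 50)
    for k in range(max_k + 1):
        # Compute the pmf in log space so large lam does not underflow.
        cdf += math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        if cdf >= q:
            return k
    return max_k


def per_shard_limit(limit: int, shards: int) -> int:
    """Hypothetical per-shard limit; how Qdrant combines the two constants is an assumption."""
    if shards <= 1:
        return limit
    expected = limit / shards  # mean number of top results on one shard
    k = poisson_quantile(expected, POISSON_CONFIDENCE)
    # Never ask a shard for more than the original limit.
    return min(limit, math.ceil(k * SAFETY_FACTOR))


if __name__ == "__main__":
    print(per_shard_limit(1000, 10))
```

Under these assumptions, a limit of 1000 over 10 shards yields a per-shard limit well below 1000, so the total number of points transferred shrinks by a large factor.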

When it activates

More than 1 shard
Auto-sharding is in use (all queried shards share the same shard key)
The request's limit + offset >= SHARD_QUERY_SUBSAMPLING_LIMIT (128)
The query is not exact
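The four conditions above can be read as a single predicate. The sketch below is illustrative only: the `QueryPlan` type and its field names are hypothetical and do not correspond to Qdrant's internal structures; only the constant name and its value of 128 come from this document.

```python
from dataclasses import dataclass

SHARD_QUERY_SUBSAMPLING_LIMIT = 128  # threshold quoted in this document


@dataclass
class QueryPlan:
    """Hypothetical summary of a request, for illustration only."""
    shard_count: int    # number of shards the query fans out to
    auto_sharded: bool  # True if all queried shards share the same shard key
    limit: int
    offset: int
    exact: bool         # True if the query requests exact search


def should_subsample(plan: QueryPlan) -> bool:
    """Return True only when all four activation conditions hold."""
    return (
        plan.shard_count > 1
        and plan.auto_sharded
        and plan.limit + plan.offset >= SHARD_QUERY_SUBSAMPLING_LIMIT
        and not plan.exact
    )


if __name__ == "__main__":
    print(should_subsample(QueryPlan(10, True, 1000, 0, False)))
```

Note that the threshold applies to limit + offset, so a deep page with a small limit can still qualify.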

Key tradeoff

The strategy trades a small probability of slightly incomplete results for a large reduction in inter-shard data transfer, especially for high-limit queries across many shards. The 1.2x safety factor and the 99.9% Poisson threshold keep the error rate very low, comparable to the inaccuracy already introduced by approximate vector indices such as HNSW.
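To get a feel for how rare incomplete results are, here is a small Monte Carlo sketch. It assumes each of the true top `limit` points lands on a uniformly random shard, and counts a trial as a miss when some shard holds more of them than the per-shard limit allows (in that case the merged result cannot contain all of them). The function name `miss_rate` and the per-shard value of 160 used below are illustrative assumptions, not values taken from Qdrant.

```python
import random


def miss_rate(limit: int, shards: int, per_shard: int,
              trials: int = 200, seed: int = 0) -> float:
    """Fraction of simulated queries where some shard holds more than
    per_shard of the global top-`limit` points."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        counts = [0] * shards
        # Scatter the true top results uniformly at random across shards.
        for _ in range(limit):
            counts[rng.randrange(shards)] += 1
        if max(counts) > per_shard:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    # Asking each shard for exactly limit / shards points is almost always too few.
    print(miss_rate(1000, 10, per_shard=100))
    # A padded per-shard limit (illustrative value) makes misses extremely rare.
    print(miss_rate(1000, 10, per_shard=160))
```

The contrast between the two calls illustrates why the per-shard limit is padded above the mean instead of being set to limit / shards.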

GitHub repository

qdrant/skills

Path: skills/qdrant-scaling/scaling-query-volume

agent-skills, ai-agents, claude-code, codex, cursor, embeddings
