Discussion: REST API for sharded indexes #2690

Open
lintool opened this issue Jan 22, 2025 · 6 comments


@lintool
Member

lintool commented Jan 22, 2025

Starting a discussion of how we might design retrieval w/ sharded indexes. For MS MARCO V2.1 with ArcticEmbed-L, we have 10 shards, from shard00 to shard09.

With the current design, once we close #2688 - we'll have:

http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard00.arctic-embed-l.hnsw-int8/
http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard01.arctic-embed-l.hnsw-int8/
...
http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard09.arctic-embed-l.hnsw-int8/

One simple solution would be to create a "fake" endpoint, e.g.,

http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8/

This endpoint would fan out to all the shards and gather the results.
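
To make the "gather" half concrete, here is a minimal sketch of merging per-shard result lists into one global top-k. The `Hit` record and `mergeTopK` are hypothetical names for illustration, not Anserini types, and the sketch assumes scores are directly comparable across shards (same encoder and similarity on every shard).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class ShardMerge {
  /** One scored result. Hypothetical stand-in, not an Anserini type. */
  record Hit(String docid, double score) {}

  /** Merges per-shard ranked lists into a single list of at most k hits, best first. */
  static List<Hit> mergeTopK(List<List<Hit>> perShard, int k) {
    Comparator<Hit> byScore = Comparator.comparingDouble(Hit::score);
    // Min-heap on score: the weakest hit of the current top-k is evicted first.
    PriorityQueue<Hit> heap = new PriorityQueue<>(byScore);
    for (List<Hit> hits : perShard) {
      for (Hit h : hits) {
        heap.offer(h);
        if (heap.size() > k) {
          heap.poll();
        }
      }
    }
    List<Hit> merged = new ArrayList<>(heap);
    merged.sort(byScore.reversed());  // highest score first
    return merged;
  }

  public static void main(String[] args) {
    // Two toy "shards"; assumes scores are comparable across shards.
    List<List<Hit>> perShard = List.of(
        List.of(new Hit("a", 0.9), new Hit("b", 0.5)),
        List.of(new Hit("c", 0.8), new Hit("d", 0.1)));
    for (Hit h : mergeTopK(perShard, 3)) {
      System.out.println(h.docid() + " " + h.score());
    }
  }
}
```

The heap keeps memory at O(k) regardless of how many shards respond, which matters less for 10 shards than for the general case.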

@vincent-4 thoughts?

@vincent-4
Member

vincent-4 commented Jan 22, 2025

Thanks for starting!
Do you think it'd be easier to reuse the thread pool from `src/main/java/io/anserini/search/SearchHnswDenseVectors.java`, even though I get that it's built for a single index, or to go from scratch? I'm leaning towards the former.

@lintool
Member Author

lintool commented Jan 22, 2025

Actually, I'm thinking that each index would get its own separate underlying searcher instance with its own thread pool. So, all of these would be completely independent...

http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard00.arctic-embed-l.hnsw-int8/
http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard01.arctic-embed-l.hnsw-int8/
...
http://localhost:8081/api/v1.0/indexes/msmarco-v2.1-doc-segmented-shard09.arctic-embed-l.hnsw-int8/

The question is, how do we specify the configs? Do all indexes share the same config? Say the setting is 4 threads: that would mean every api/v1.0/indexes/xxx endpoint gets 4 threads. In that case, api/v1.0/indexes/msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8/ would actually trigger 40 threads in parallel, since it fans out to 10 shards and each shard fires up 4 threads. This is a simple model, but it lacks fine-grained control...
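
A rough sketch of the thread arithmetic, for reference. With one shared default, the fan-out cost is just shards × threads. The per-index override map shown here is purely hypothetical (nothing like it has been proposed above); it is only meant to illustrate what finer-grained control could look like. `ThreadBudget` and all of its method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ThreadBudget {
  /** Threads for one index: a per-index override if present, else the shared default. */
  static int threadsFor(String index, int defaultThreads, Map<String, Integer> overrides) {
    return overrides.getOrDefault(index, defaultThreads);
  }

  /** Total threads triggered when one request fans out to every shard. */
  static int fanOutThreads(List<String> shards, int defaultThreads, Map<String, Integer> overrides) {
    int total = 0;
    for (String shard : shards) {
      total += threadsFor(shard, defaultThreads, overrides);
    }
    return total;
  }

  /** Builds the shard index names shard00 .. shard(n-1), following the naming in this thread. */
  static List<String> shardNames(int n) {
    List<String> names = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      names.add(String.format("msmarco-v2.1-doc-segmented-shard%02d.arctic-embed-l.hnsw-int8", i));
    }
    return names;
  }

  public static void main(String[] args) {
    List<String> shards = shardNames(10);
    // Simple model: every index gets the same setting, so 10 shards * 4 threads.
    System.out.println(fanOutThreads(shards, 4, Map.of()));
    // Hypothetical finer-grained model: one shard overridden to 8 threads.
    System.out.println(fanOutThreads(shards, 4, Map.of(shards.get(0), 8)));
  }
}
```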

@lintool
Member Author

lintool commented Jan 25, 2025

I will write this up in a guide shortly, but just to pass it along, here is how to run the TREC RAG24 test queries with the ArcticEmbed-L shards:

SHARDS=(00 01 02 03 04 05 06 07 08 09)
for shard in "${SHARDS[@]}"
do
    bin/run.sh io.anserini.search.SearchHnswDenseVectors \
      -index msmarco-v2.1-doc-segmented-shard${shard}.arctic-embed-l.hnsw-int8 \
      -efSearch 1000 \
      -topics rag24.test.snowflake-arctic-embed-l \
      -output runs/run.rag24.test.arctic-l-msv2.1.shard${shard}.txt \
      -hits 250 -threads 32 \
      > logs/log.run.rag24.test.arctic-l-msv2.1.shard${shard}.txt 2>&1
done

To evaluate:

cat runs/run.rag24.test.arctic-l-msv2.1.shard0* > runs/run.rag24.test.arctic-l-msv2.1.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.rag24.test-umbrela-all.txt runs/run.rag24.test.arctic-l-msv2.1.txt

On orca, a copy of the indexes is at `/mnt/msmarco-v2_1/indexes`. Symlink to the shared copy of the indexes:

cd ~/.cache/pyserini/indexes/
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard00.arctic-embed-l.20250114.4884f5.aab3f8e9aa0563bd0f875584784a0845 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard01.arctic-embed-l.20250114.4884f5.34ea30fe72c2bc1795ae83e71b191547 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard02.arctic-embed-l.20250114.4884f5.b6271d6db65119977491675f74f466d5 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard03.arctic-embed-l.20250114.4884f5.a9cd644eb6037f67d2e9c06a8f60928d .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard04.arctic-embed-l.20250114.4884f5.07b7e451e0525d01c1f1f2b1c42b1bd5 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard05.arctic-embed-l.20250114.4884f5.2573dce175788981be2f266ebb33c96d .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard06.arctic-embed-l.20250114.4884f5.a644aea445a8b78cc9e99d2ce111ff11 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard07.arctic-embed-l.20250114.4884f5.402d37deccb44b5fc105049889e8aaea .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard08.arctic-embed-l.20250114.4884f5.89ebcd027f7297b26a1edc8ae5726527 .
ln -s /mnt/msmarco-v2_1/indexes/lucene-hnsw-int8.msmarco-v2.1-doc-segmented-shard09.arctic-embed-l.20250114.4884f5.5e580bb7eb9ee2bb6bfa492b3430c17d .

(Otherwise, it's ~0.5TB of downloads)

@vincent-4 vincent-4 mentioned this issue Jan 26, 2025
4 tasks
@lintool lintool changed the title Discussion: sharded indexes Discussion: REST API for sharded indexes Feb 1, 2025
@vincent-4
Member

vincent-4 commented Feb 2, 2025

I may have to update IndexInfo again so that it knows when to call a shardedSearchService, e.g., with an int shardList value: {123}, so the shard numbers could be appended to the name as needed in the call, e.g., doc-segmented + shard{0 ... 123}. It is getting pretty gnarly now, though, so maybe there's a better way to do it.

Otherwise, this is pretty straightforward, apart from shard index / shard settings / shard search configurations and their validations.

@lintool
Member Author

lintool commented Feb 3, 2025

What about ShardInfo to parallel IndexInfo?

MSMARCO_V21_DOC_SEGMENTED_ARCTIC_EMBED_L_HNSW_INT8(
    "msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8",
    ....
    new IndexInfo[] {
        MSMARCO_V21_DOC_SEGMENTED_SHARD00_ARCTIC_EMBED_L_HNSW_INT8,
        MSMARCO_V21_DOC_SEGMENTED_SHARD01_ARCTIC_EMBED_L_HNSW_INT8
        ...
    }
    ...
)
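
Fleshing that out a bit, a self-contained sketch of how a `ShardInfo` enum might sit next to `IndexInfo`. The `IndexInfo` here is a one-field stand-in (the real one carries more metadata), only the two shards named above are listed, and `fromName`, `collectionName`, and `shards()` are hypothetical names, not existing Anserini API.

```java
import java.util.List;
import java.util.Optional;

final class ShardInfoSketch {
  /** Stand-in for the real IndexInfo, reduced to the index name only. */
  enum IndexInfo {
    MSMARCO_V21_DOC_SEGMENTED_SHARD00_ARCTIC_EMBED_L_HNSW_INT8(
        "msmarco-v2.1-doc-segmented-shard00.arctic-embed-l.hnsw-int8"),
    MSMARCO_V21_DOC_SEGMENTED_SHARD01_ARCTIC_EMBED_L_HNSW_INT8(
        "msmarco-v2.1-doc-segmented-shard01.arctic-embed-l.hnsw-int8");

    final String indexName;

    IndexInfo(String indexName) {
      this.indexName = indexName;
    }
  }

  /** A logical collection that is physically split across several IndexInfo shards. */
  enum ShardInfo {
    MSMARCO_V21_DOC_SEGMENTED_ARCTIC_EMBED_L_HNSW_INT8(
        "msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8",
        new IndexInfo[] {
            IndexInfo.MSMARCO_V21_DOC_SEGMENTED_SHARD00_ARCTIC_EMBED_L_HNSW_INT8,
            IndexInfo.MSMARCO_V21_DOC_SEGMENTED_SHARD01_ARCTIC_EMBED_L_HNSW_INT8
        });

    final String collectionName;
    private final IndexInfo[] shards;

    ShardInfo(String collectionName, IndexInfo[] shards) {
      this.collectionName = collectionName;
      this.shards = shards;
    }

    /** Immutable view of the shards backing this collection. */
    List<IndexInfo> shards() {
      return List.of(shards);
    }

    /** Looks up a sharded collection by the name used in the REST path, if any. */
    static Optional<ShardInfo> fromName(String name) {
      for (ShardInfo info : values()) {
        if (info.collectionName.equals(name)) {
          return Optional.of(info);
        }
      }
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // A request handler could check this first and fan out only when it matches.
    ShardInfo.fromName("msmarco-v2.1-doc-segmented.arctic-embed-l.hnsw-int8")
        .ifPresent(info -> info.shards().forEach(s -> System.out.println(s.indexName)));
  }
}
```

With a lookup like this, the endpoint handler would only need one check: if the requested name resolves to a `ShardInfo`, fan out over `shards()`; otherwise treat it as a regular single index.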

@vincent-4
Member

Wait, that's a good strategy. Thanks!
