A MongoDB sharded cluster consists of the following components.
- shard: Each shard contains a subset of the sharded data.
- mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
- config servers: Config servers store metadata and configuration settings for the cluster.
For a production deployment, consider the following to ensure data redundancy and system availability.
- Config Server (3 member replica set):
config1
,config2
,config3
- 3 Shards (each a 3 member replica set):
shard1-1
,shard1-2
,shard1-3
shard2-1
,shard2-2
,shard2-3
shard3-1
,shard3-2
,shard3-3
- 2 Mongos:
mongos1
,mongos2
Go to the prod
or dev
directory and execute the following command.
It takes 1~2 minutes to completely deploy a cluster.
docker-compose up -d
You can examine the configuration and status using the following commands.
docker-compose exec config1 mongosh --port 27017
rs.status()
docker-compose exec shard1-1 mongosh --port 27017
rs.status()
docker-compose exec mongos1 mongosh --port 27017
sh.status()
You can use the sharded cluster by adding mongo-connection-uri
option in Yorkie.
Note that the Yorkie should be on the same network as the cluster.
For the production,
--mongo-connection-uri "mongodb://localhost:27020,localhost:27021"
For the development,
--mongo-connection-uri "mongodb://localhost:27020"
As mongos determines which shard a document is located and routes queries, it's necessary to configure mongos to identify shards and sharding rules. Therefore, the following commands should be applied to the primary mongos.
sh.addShard()
method adds each shard to the cluster.sh.shardCollections()
method shards each collection with the specified shard key and strategy.
Two strategies are available for sharding collections.
- Hashed sharding: It uses a hashed index of a single field as the shard key to partition data across your sharded cluster.
- Range-based sharding: It can use multiple fields as the shard key and divides data into contiguous ranges determined by the shard key values.
sh.addShard("shard-rs-1/shard1-1:27017,shard1-2:27017,shard1-3:27017")
sh.addShard("shard-rs-2/shard2-1:27017,shard2-2:27017,shard2-3:27017")
sh.addShard("shard-rs-3/shard3-1:27017,shard3-2:27017,shard3-3:27017")
sh.enableSharding("yorkie-meta")
sh.shardCollection("yorkie-meta.projects", { _id: "hashed" })
sh.shardCollection("yorkie-meta.users", { username: "hashed" })
sh.shardCollection("yorkie-meta.clients", { project_id: "hashed" })
sh.shardCollection("yorkie-meta.documents", { project_id: "hashed" })
sh.shardCollection("yorkie-meta.changes", { doc_id: "hashed", server_seq: 1 })
sh.shardCollection("yorkie-meta.snapshots", { doc_id: "hashed" })
sh.shardCollection("yorkie-meta.syncedseqs", { doc_id: "hashed" })
Considering the common query patterns used in Yorkie, it's possible to employ Range-based sharding using the following command.
sh.shardCollection("<database>.<collection>", { <shard key field> : 1, ... } )