Your app just went viral—congratulations! Each day brings new users, new possibilities, and new burdens for your servers to handle. For many applications, the ability to scale infrastructure, especially database infrastructure, can be a significant growing pain.
As such, scaling quickly and effectively is a key capability of any production-ready database. In this article, we’ll take a look at the scaling options of MongoDB, when to use them, and how to configure them in MongoDB Atlas.
MongoDB is a modern document database. In contrast to relational databases such as MySQL and Oracle, non-relational databases such as MongoDB were designed for modern architectures, such as the cloud, where you need to scale out, reach a global audience, and maintain data sovereignty within a single cluster. Non-relational databases excel at handling large datasets, and typically require less upfront design than a relational database.
As your application grows, each piece of the application must scale along with the size of your user base and your data needs. Historically, database scaling has been a major pain point for large applications or applications with above-average throughput, and options have been either limited in number or costly to implement.
In contrast, MongoDB has a full range of scaling options available, and they are built into MongoDB Atlas, MongoDB’s database-as-a-service offering. Let’s look at a few different ways MongoDB can scale.
Vertical scaling refers to increasing the processing power of a single server or cluster. Both relational and non-relational databases can scale up, but eventually, there will be a limit in terms of maximum processing power and throughput. Additionally, there are increased costs with scaling up to high-performing hardware, as costs do not scale linearly.
Horizontal scaling, also known as scale-out, refers to bringing on additional nodes to share the load. This is difficult with relational databases because related data is hard to spread across nodes. With non-relational databases, this is simpler since collections are self-contained and not coupled relationally. They can therefore be distributed across nodes more easily, as queries do not have to “join” them together across nodes.
Scaling MongoDB horizontally is achieved through sharding (preferred) and replica sets.
As mentioned above, sharding is horizontal scaling by spreading data across multiple nodes. Each node contains a subset of the overall data. This is especially effective for increasing throughput for use cases that involve significant amounts of write operations, as each operation only affects one of the nodes and the partition of data it is managing.
While sharding happens automatically in MongoDB Atlas, it is still up to us to configure the shard key, which is used by MongoDB for partitioning the data in a non-overlapping fashion across shards. This can be done automatically through either ranged or hashed sharding, or customized using zoned sharding. For more information on these options, see this post on sharding from the official MongoDB blog.
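To make the difference between ranged and hashed sharding more concrete, here is a minimal sketch in plain Python. It does not use MongoDB at all: the shard count, the range boundaries, and the use of MD5 with a modulo are assumptions chosen for illustration, and MongoDB's real hashed sharding assigns ranges of hash values to chunks rather than taking a modulo.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size

# Hypothetical range boundaries: shard 0 holds keys < 250,
# shard 1 holds 250..499, shard 2 holds 500..749, shard 3 holds the rest.
RANGE_BOUNDS = [250, 500, 750]


def ranged_shard(key: int) -> int:
    """Pick a shard by finding which key range the value falls into."""
    return bisect_right(RANGE_BOUNDS, key)


def hashed_shard(key: int) -> int:
    """Pick a shard from a hash of the key (a stand-in for hashed sharding)."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS


# Simulate 100 inserts with a steadily increasing key, such as a counter.
new_keys = range(900, 1000)
ranged_counts = Counter(ranged_shard(k) for k in new_keys)
hashed_counts = Counter(hashed_shard(k) for k in new_keys)

print("ranged:", dict(ranged_counts))
print("hashed:", dict(hashed_counts))
```

With these assumed boundaries, the ranged strategy sends all 100 new keys to shard 3, while the hashed strategy spreads them over several shards. This is why a steadily increasing key is often paired with hashed sharding when write throughput matters, at the cost of less efficient range queries.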
Over time, datasets typically do not grow uniformly, and some shards will grow faster than others. As your workloads evolve and datasets grow, data must be rebalanced to ensure an even distribution of load across the cluster. This rebalancing is known as shard balancing. In MongoDB, it is handled automatically by the sharded cluster balancer.
Replica sets seem similar to sharding, but they differ in that the dataset is duplicated rather than partitioned. Replication allows for high availability, redundancy/failover handling, and decreased bottlenecks on read operations. However, it can also introduce issues for applications with large volumes of writes, as each update must be propagated to every replica set member.
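The idea behind balancing can be sketched with a simplified model. The function below is not MongoDB's balancer; it assumes that balance is measured in chunk counts and that one chunk moves at a time, and the `rebalance` name, the shard names, and the threshold are hypothetical. The real balancer's policy and thresholds vary by MongoDB version.

```python
def rebalance(chunks_per_shard: dict[str, int], threshold: int = 2):
    """Move one chunk at a time from the fullest shard to the emptiest one
    until their chunk counts differ by less than `threshold`.

    Returns the final counts and the list of (source, target) migrations.
    """
    if threshold < 2:
        # With a threshold of 1, a single chunk could bounce back and forth forever.
        raise ValueError("threshold must be at least 2")

    counts = dict(chunks_per_shard)  # work on a copy
    migrations = []
    while True:
        fullest = max(counts, key=counts.get)
        emptiest = min(counts, key=counts.get)
        if counts[fullest] - counts[emptiest] < threshold:
            break
        counts[fullest] -= 1
        counts[emptiest] += 1
        migrations.append((fullest, emptiest))
    return counts, migrations


final_counts, moves = rebalance({"shard0": 10, "shard1": 2, "shard2": 3})
print(final_counts)  # {'shard0': 5, 'shard1': 5, 'shard2': 5}
print(len(moves))    # 5
```

In this example, five migrations out of `shard0` are enough to even out the cluster. In a real deployment the migrations themselves consume I/O and network bandwidth, which is one reason a well-chosen shard key that avoids imbalance in the first place is valuable.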
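A rough back-of-the-envelope model shows why replication helps reads but not writes. The two functions below are a deliberately simple sketch, not a performance formula: they assume reads are spread evenly across all members (in MongoDB this requires a read preference that allows secondaries), that every replica set member applies every write, and that a sharded cluster splits both reads and writes evenly. The function names and the operation counts are made up for illustration.

```python
def per_node_load_replica_set(reads: int, writes: int, members: int) -> float:
    """Each member applies every write; reads are assumed to be shared evenly."""
    return writes + reads / members


def per_node_load_sharded(reads: int, writes: int, shards: int) -> float:
    """Each shard handles only its own slice of both reads and writes."""
    return (reads + writes) / shards


# Hypothetical workload: 9,000 reads and 3,000 writes per second on 3 nodes.
print(per_node_load_replica_set(9000, 3000, 3))  # 6000.0
print(per_node_load_sharded(9000, 3000, 3))      # 4000.0
```

Under these assumptions, adding replica set members lowers only the read share of each node's work, while the write share stays constant. Adding shards lowers both.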
Proactive scaling refers to scaling your database in advance of foreseen load or high-traffic events. This could be based upon a regular pattern (e.g., day of the week or certain times of the year), or it could be done before specific events, such as launching a marketing campaign.
In contrast, reactive scaling refers to scaling in response to metrics. These could be warning signs such as slow transactions and query response times, or even error messages coming from your database monitoring. In the worst-case scenario, the trigger could be an outage due to excessive load.
Naturally, proactive scaling is preferable when possible.
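The two approaches can be contrasted in a short sketch. Everything here is hypothetical: the event calendar, the baseline and event tiers, the CPU threshold, and the function names are assumptions for illustration rather than anything Atlas prescribes.

```python
from datetime import datetime

BASELINE_TIER = "M30"  # hypothetical everyday cluster tier

# Hypothetical calendar of known high-traffic windows: (start, end, tier to use).
PLANNED_EVENTS = [
    (datetime(2024, 11, 29), datetime(2024, 11, 30), "M40"),
]


def proactive_tier(now: datetime) -> str:
    """Proactive: choose the tier from the calendar, before load arrives."""
    for start, end, tier in PLANNED_EVENTS:
        if start <= now < end:
            return tier
    return BASELINE_TIER


def reactive_should_scale_up(cpu_samples: list[float], threshold: float = 0.75) -> bool:
    """Reactive: scale up only after observed average CPU crosses a threshold."""
    if not cpu_samples:
        return False
    return sum(cpu_samples) / len(cpu_samples) > threshold


print(proactive_tier(datetime(2024, 11, 29, 12)))  # M40
print(reactive_should_scale_up([0.9, 0.8, 0.85]))  # True
```

The proactive function has the larger tier in place before the event starts, whereas the reactive check can only fire once users are already generating the load.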
Now, let’s take a look at how to scale MongoDB to meet the needs of your application.
As a service offering, MongoDB Atlas makes scaling as easy as setting the right configuration. Both horizontal and vertical scaling are supported.
Vertical scaling is as simple as configuring a cluster tier. Note that even within a tier, further scaling is possible (including auto-scaling from the M10 tier upwards). We’ll look at that later.
Horizontal scaling comes through the deployment of a sharded cluster.
In MongoDB, a sharded cluster consists of shards, routers/balancers, and config servers for metadata. While setting this up manually would require some infrastructure setup and configuration, Atlas makes this quite simple. Just toggle the option on for your MongoDB cluster and select the number of shards.
(Please note: This is only available for M30 clusters and up.)
The default setup creates replica sets and mongods for each of the shards and the config servers. This provides high availability, redundancy, and increased read and write performance through the use of both types of horizontal scaling. The routers, or mongos, distribute queries and write operations across the shards according to which data each shard holds.
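The routing decision can be sketched as follows. This is a toy model, not how mongos is implemented: the shard names, the `user_id` shard key, and the modulo-based `owner` lookup are assumptions standing in for the chunk metadata that the config servers hold. What it illustrates is the general behavior that a query containing the shard key can be sent to a single shard, while a query without it has to be sent to all of them.

```python
SHARD_KEY = "user_id"                      # hypothetical shard key field
SHARDS = ["shard0", "shard1", "shard2"]    # hypothetical shard names


def owner(value: int) -> str:
    """Stand-in for the chunk lookup a router performs against cluster metadata."""
    return SHARDS[value % len(SHARDS)]


def route(query: dict) -> list[str]:
    """Return the shards a query must be sent to."""
    if SHARD_KEY in query:
        # Targeted operation: only the shard owning this key value is involved.
        return [owner(query[SHARD_KEY])]
    # Broadcast operation: without the shard key, every shard must be asked.
    return list(SHARDS)


print(route({"user_id": 7}))      # ['shard1']
print(route({"country": "DE"}))   # ['shard0', 'shard1', 'shard2']
```

Queries that can be targeted scale with the number of shards, whereas broadcast queries add work to every shard, so it generally pays to pick a shard key that appears in the application's most frequent queries.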
Don’t forget, a shard key needs to be configured, and there are a few different options available. For more information, see the MongoDB documentation on shard keys.
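As a hedged illustration, the snippet below builds the document for MongoDB's `shardCollection` administrative command without connecting to any server. The helper name `build_shard_collection_command`, the `shop.orders` namespace, and the `customer_id` field are hypothetical placeholders; check the shard key documentation for the options that apply to your MongoDB version.

```python
def build_shard_collection_command(namespace: str, field: str, hashed: bool = False) -> dict:
    """Build a shardCollection command document for a single-field shard key.

    namespace: "<database>.<collection>"
    field:     the field to shard on
    hashed:    True for hashed sharding, False for ranged sharding
    """
    return {
        "shardCollection": namespace,
        "key": {field: "hashed" if hashed else 1},
    }


cmd = build_shard_collection_command("shop.orders", "customer_id", hashed=True)
print(cmd)

# Assumption: with PyMongo and a client connected to a sharded cluster,
# the same command could be issued along these lines:
#   client.admin.command("shardCollection", "shop.orders",
#                        key={"customer_id": "hashed"})
```

Building the document separately from sending it makes it easy to review the chosen key before applying it, which matters because changing a shard key later is considerably more involved than setting it up front.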
MongoDB Atlas has cluster auto-scaling, which scales vertically based on cluster usage. It is enabled as part of configuring the cluster tier.
Both cluster tier/CPU power and storage amount can be auto-scaled. This gives you automated and reactive vertical scaling both up and down, without having to worry about setting up new servers, transferring data, or even downtime in between. If necessary, the cluster can also be paused, effectively scaling the whole cluster to 0 except for storage.
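The general shape of bounded auto-scaling can be sketched like this. The thresholds and the single-metric rule below are hypothetical and are not the criteria Atlas actually applies; the sketch only illustrates stepping one tier at a time while staying inside a configured minimum and maximum tier.

```python
# A subset of Atlas cluster tiers, in ascending order of size.
TIERS = ["M10", "M20", "M30", "M40", "M50", "M60"]


def next_tier(
    current: str,
    cpu_utilization: float,
    min_tier: str = "M10",
    max_tier: str = "M40",
    scale_up_above: float = 0.75,    # hypothetical threshold
    scale_down_below: float = 0.50,  # hypothetical threshold
) -> str:
    """Step one tier up or down based on CPU, clamped to [min_tier, max_tier]."""
    lowest, highest = TIERS.index(min_tier), TIERS.index(max_tier)
    position = TIERS.index(current)
    if cpu_utilization > scale_up_above:
        position += 1
    elif cpu_utilization < scale_down_below:
        position -= 1
    return TIERS[max(lowest, min(highest, position))]


print(next_tier("M30", 0.90))  # M40
print(next_tier("M40", 0.90))  # M40 (already at the configured maximum)
print(next_tier("M30", 0.20))  # M20
```

The maximum bound acts as a cost ceiling and the minimum bound as a performance floor, which is why both are worth setting deliberately rather than leaving the range as wide as possible.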
In this article, we reviewed different types of scaling as well as how to implement each of these in MongoDB Atlas. For more information about MongoDB and case studies, check out MongoDB at Scale.
It depends on your use case! For most applications, you want the ability to do both, as that gives flexibility in meeting the throughput needs of your application. If the needs of your application can be met with a single instance, vertical scaling tends to be the simpler, more straightforward option.
For workloads that are more than a single instance can handle, horizontal scaling becomes necessary. Horizontal scaling also supports low latency in globally distributed applications and helps with complying with data-sovereignty requirements. From an administrative and maintenance perspective, horizontal scaling tends to be the more difficult task. Having this feature provided by your database platform can be a huge time-saver.