Introducing the MongoDB 5.1 Rapid Release

Sahir Azam

Arriving just a few months after the General Availability of 5.0, MongoDB 5.1 is our first Rapid Release, bringing more native time series enhancements, richer analytics, new security options, and overall improvements to platform resilience and developer productivity. Launching alongside MongoDB 5.1 are new capabilities in Atlas Search that make it easier for users to build fast, rich application search experiences.

MongoDB 5.1 marks the start of our accelerated release cadence, designed to get new database features and improvements into your hands faster than ever before. MongoDB 5.1 and all future Rapid Releases will be fully supported on MongoDB Atlas and are available as development releases from our Download Center.

Native Time Series Enhancements

With optimized time series collections, clustered indexes, and window functions, MongoDB 5.0 made it faster, easier, and less costly to serve the industry’s fastest-growing, data-intensive use cases, such as IoT platforms and real-time financial analytics. Now, with MongoDB 5.1, you can globally distribute your time series applications and further simplify their development:

Global distribution

Time series collections can now take advantage of MongoDB’s native sharding to horizontally distribute massive data sets and to co-locate nodes with data producers, supporting local write operations and enforcing data sovereignty controls.
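
As a minimal sketch of what this looks like from the Python driver (the connection string, database, and field names below are hypothetical), you can create a time series collection and then shard it on its metadata field so that each region’s readings stay close to the producers that generate them:

    from pymongo import MongoClient

    # Hypothetical connection string for a sharded 5.1 cluster.
    client = MongoClient("mongodb+srv://cluster0.example.mongodb.net")
    db = client["iot"]

    # Create a time series collection (available since 5.0).
    db.create_collection(
        "sensor_readings",
        timeseries={"timeField": "ts", "metaField": "meta", "granularity": "minutes"},
    )

    # New in 5.1: shard the collection. The shard key uses the metadata field
    # (optionally followed by the time field) so readings can be co-located
    # with the data producers that write them.
    client.admin.command("enableSharding", "iot")
    client.admin.command(
        "shardCollection",
        "iot.sensor_readings",
        key={"meta.region": 1, "ts": 1},
    )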

More developer velocity

It is common for time series data to be uneven; for example, a sensor goes offline and several readings are missed. But in order to perform analytics and ensure correct results, the data needs to be continuous. With densification you can now handle missing data gracefully and build time series apps and analytics faster, putting less burden on the developer. Time series collections now also support delete operations. While most time series applications are append-only, users need to be able to invoke their right to erasure, so we are giving developers an easy way to comply with modern data privacy regulations.
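
As an illustration, here is a minimal sketch of gap-filling with the new $densify aggregation stage, reusing the hypothetical sensor_readings collection from the sketch above:

    from pymongo import MongoClient

    db = MongoClient("mongodb+srv://cluster0.example.mongodb.net")["iot"]  # hypothetical

    # Fill hourly gaps for one sensor with $densify, creating placeholder
    # documents for each missing timestamp in the observed range.
    pipeline = [
        {"$match": {"meta.sensorId": "sensor-42"}},
        {"$densify": {
            "field": "ts",
            "partitionByFields": ["meta.sensorId"],
            "range": {"step": 1, "unit": "hour", "bounds": "full"},
        }},
    ]
    for reading in db.sensor_readings.aggregate(pipeline):
        print(reading)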

Complete data lifecycle

From medical sensors to market data fluctuations, time series workloads can mean hundreds of millions of data points per day. You need to process these massive volumes fast, distill valuable insights, and then retain the full data set, possibly for years, for regulatory purposes - all without incurring skyrocketing costs and data movement complexity. With Atlas Online Archive support for time series, now available in preview, you can do exactly that, seamlessly and economically managing your entire time series data lifecycle. Simply define your own archiving policy, and Atlas handles all data movement for you by tiering aged time series data out of your database into lower-cost, fully managed cloud object storage. Rather than deleting anything, you can retain all your time series data, preserving the ability to query it at any time alongside your live data for long-term trend analytics and machine learning, or for compliance purposes. Support for online archiving is available for MongoDB 5.0 and above.

Broader platform support for Time Series Data

Our native time series capabilities are supported across the entire MongoDB application data platform, making it easy to work with time series data in any context. You can now create time series collections directly from Atlas Data Explorer, MongoDB Compass, or MongoDB for VS Code. With support for date binning, date filtering options, and value comparison, Atlas Charts lets you create graphs and dashboards from any Atlas time series collection, easily share insights, and embed visualizations into your applications for a rich user experience.

Richer and More Flexible Analytics and Full-Text Search

Many developers start out with MongoDB for their operational use cases, and then expand to leverage our platform's versatility in powering analytics and search as well. MongoDB 5.1 includes new features and enhancements that make it easier to unlock insights from your data and improve user experience.

Cross-shard joins and graph traversals

For most transactional and operational workloads, the document data model largely eliminates the need to join data from different collections. This is because related data can be embedded in sub-documents and arrays within a single, richly structured document – following the principle that what is accessed together is often best stored together.

However, analytical applications can sometimes require joins – for example, bringing together customers and orders from separate collections. Through the $lookup aggregation pipeline stage, you can have the database join collections for you. The $graphLookup stage gives you the ability to traverse related data, performing “friend-of-friend” type queries to uncover patterns and surface previously unidentified connections in your data.

In MongoDB 5.1, you can now use $lookup and $graphLookup to combine and analyze data that is distributed across shards, which was not previously possible. Our design gives you even more precision in your code by enabling you to target individual shards as needed. However, you don’t need to understand sharding, or even know your collection is sharded, to run these queries: there is no new syntax for developers to learn.
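
As a minimal, hypothetical sketch of such a join from the Python driver (the collection and field names are assumptions; the syntax is identical whether or not the collections are sharded):

    from pymongo import MongoClient

    db = MongoClient("mongodb+srv://cluster0.example.mongodb.net")["shop"]  # hypothetical

    # Join each customer to their orders with $lookup; in 5.1 this also works
    # when customers and orders are sharded collections.
    pipeline = [
        {"$lookup": {
            "from": "orders",
            "localField": "_id",
            "foreignField": "customer_id",
            "as": "orders",
        }},
        {"$match": {"orders": {"$ne": []}}},  # keep only customers with at least one order
    ]
    for customer in db.customers.aggregate(pipeline):
        print(customer["_id"], len(customer["orders"]))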

Materializing results for operational analytics

The $merge and $out aggregation stages can be used to write the results of an aggregation pipeline in order to create a new collection or create/update an on-demand materialized view. These stages enable users to reduce processing overhead by reading pre-computed results instead of re-running the aggregation each time, and by writing only incremental results when the aggregation results change.
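
For illustration, here is a hypothetical sketch that aggregates daily order totals and uses $merge to maintain an on-demand materialized view; re-running the pipeline upserts updated results into the view rather than recreating the whole collection:

    from pymongo import MongoClient

    db = MongoClient("mongodb+srv://cluster0.example.mongodb.net")["shop"]  # hypothetical

    # Summarize orders per day and upsert the results into a materialized view.
    daily_totals_pipeline = [
        {"$group": {
            "_id": {"$dateTrunc": {"date": "$orderDate", "unit": "day"}},
            "total": {"$sum": "$amount"},
            "orders": {"$sum": 1},
        }},
        {"$merge": {
            "into": "daily_order_totals",
            "on": "_id",
            "whenMatched": "replace",
            "whenNotMatched": "insert",
        }},
    ]
    db.orders.aggregate(daily_totals_pipeline)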

Users often want to run resource-intensive analytical queries on secondary nodes in order to avoid performance impacts on the primary — but since only primaries can serve writes, aggregations including $out or $merge could not previously run on a secondary node.

Soon, such pipelines will be able to run their query execution work on a secondary node and then automatically direct any writes to the primary. This allows you to offload computationally expensive analytics work to secondary nodes while still being able to materialize the results of that work. This capability will be accessible via drivers in their upcoming releases.
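
Once that driver support lands, directing the materialization pipeline from the earlier sketch at a secondary might look like the following; the names are hypothetical and the exact mechanism will depend on the driver release:

    from pymongo import MongoClient, ReadPreference

    db = MongoClient("mongodb+srv://cluster0.example.mongodb.net")["shop"]  # hypothetical

    # Run the aggregation reads on a secondary; the $merge writes are routed to the primary.
    orders_on_secondary = db.get_collection("orders", read_preference=ReadPreference.SECONDARY)
    orders_on_secondary.aggregate(daily_totals_pipeline)  # pipeline from the sketch above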

Full-Text Search Facets: now in public preview

Faceted search allows users to filter and quickly navigate search results by categories and see the total number of results per category for at-a-glance statistics. With our new facet operator, facet and count operations are pushed down into Atlas Search’s embedded Lucene index and processed locally, taking advantage of 20+ years of Lucene optimizations. This makes facet and count queries for workloads such as ecommerce product catalogs and content libraries run up to 100x faster. Learn more from our Atlas Search facets blog post.
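
As a hypothetical sketch, a faceted query against an ecommerce catalog can be expressed with the $searchMeta stage and the facet collector; the index name, field paths, and query below are assumptions, and the category field is assumed to be indexed as a facetable string type:

    from pymongo import MongoClient

    products = MongoClient("mongodb+srv://cluster0.example.mongodb.net")["shop"]["products"]  # hypothetical

    # Ask Atlas Search for per-category counts of products matching "running shoes".
    facet_query = [
        {"$searchMeta": {
            "index": "default",
            "facet": {
                "operator": {"text": {"query": "running shoes", "path": "name"}},
                "facets": {
                    "by_category": {"type": "string", "path": "category", "numBuckets": 10},
                },
            },
        }},
    ]
    print(list(products.aggregate(facet_query)))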

New and Enhanced Security Options

End-to-end encryption for confidential computing

Extending beyond cloud provider Key Management Services (KMS), MongoDB’s unique Client-Side Field Level Encryption will support any KMIP-compliant KMS. This functionality is being released in new versions of drivers that will be available soon.

Client-Side FLE delivers some of the strongest privacy and security controls available anywhere today. By using the MongoDB drivers to encrypt the most sensitive fields in your documents before they leave the application, you can do three things that are not possible with in-flight or at-rest encryption alone:

  1. Protect data while it is in-use, in the memory of your active database instance. The database never sees plaintext, but data remains queryable.

  2. Make data unreadable to anyone running the database for you, or who has access to the underlying database infrastructure — this includes MongoDB SREs running the Atlas services as well as cloud provider personnel.

  3. Simplify the process of enforcing right to erasure (sometimes called right to be forgotten) mandates in modern privacy regulations such as the GDPR or the CCPA. This is because you simply destroy the key encrypting a user’s PII, and their data is rendered unreadable and unrecoverable — in-memory, at-rest, in backups, and in logs.
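
To make the new KMIP option concrete, here is a minimal, hypothetical sketch of explicit client-side encryption with the Python driver, assuming a driver release that includes KMIP support; the KMS endpoint, TLS files, and namespaces are placeholders:

    from pymongo import MongoClient
    from pymongo.encryption import Algorithm, ClientEncryption
    from bson.codec_options import CodecOptions
    from bson.binary import UuidRepresentation

    client = MongoClient()  # requires the driver's field level encryption dependencies

    # Point the driver at a KMIP-compliant key management service.
    kms_providers = {"kmip": {"endpoint": "kmip.example.com:5696"}}
    client_encryption = ClientEncryption(
        kms_providers,
        "encryption.__keyVault",  # key vault collection holding the data keys
        client,
        CodecOptions(uuid_representation=UuidRepresentation.STANDARD),
        kms_tls_options={"kmip": {"tlsCAFile": "ca.pem", "tlsCertificateKeyFile": "client.pem"}},
    )

    # Create a data key in the KMS, then encrypt a sensitive field in the client
    # so the database only ever sees ciphertext.
    key_id = client_encryption.create_data_key("kmip")
    encrypted_ssn = client_encryption.encrypt(
        "123-45-6789",
        Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic,
        key_id=key_id,
    )
    client["hr"]["employees"].insert_one({"name": "Ada", "ssn": encrypted_ssn})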

Google Cloud Private Service Connect

We’ve also added a new network security option to MongoDB Atlas with the availability of Google Cloud Private Service Connect (PSC). Private Service Connect allows you to create private and secure connections from your Google Cloud networks to MongoDB Atlas. It creates service endpoints in your VPCs that provide private connectivity and policy enforcement, allowing you to easily control network security in one place.

Along with VPC Peering, Google Cloud PSC makes it easy to connect your applications and services in Google Cloud to Atlas.

Platform Resilience

MongoDB 5.1 continues to build out controls for reliability and availability with the following enhancements:

  • We've made a number of changes to WiredTiger internals that improve backups, including minimizing the number of checkpoints pinned while a backup cursor is open and improving the handling of backup cursors that are open for long periods. These improvements will reduce both the operational overhead and storage consumption on the replica set member from which the backup is taken. This improvement is available for backups taken from MongoDB Atlas and from self-hosted deployments controlled by Ops Manager or Cloud Manager, and has been backported to MongoDB 4.2 and above.

  • In addition to enhancements affecting backups, WiredTiger checkpointing and locking have been improved to enhance performance when MongoDB is managing many concurrently active collections in a single instance. This is especially useful to multi-tenant applications built on MongoDB.

  • We'll also be adding improvements in upcoming versions of our drivers that provide mongos controls to mitigate connection storms in sharded clusters, especially during failover events. These include preferentially connecting to nodes that have existing idle connections that can be reused, improving the matching of connection pool sizing across replica set members, limiting the rate of new connections, and adding a mechanism to limit the number of mongos servers used when connecting to sharded clusters via SRV records (see the sketch after this list).
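
A minimal sketch of what some of these knobs could look like from the Python driver, assuming a release that exposes them (the connection string is a placeholder):

    from pymongo import MongoClient

    client = MongoClient(
        "mongodb+srv://cluster0.example.mongodb.net",  # hypothetical SRV connection string
        maxPoolSize=50,   # keep pool sizing consistent across application instances
        minPoolSize=5,
        srvMaxHosts=3,    # cap how many mongos hosts are selected from the SRV record
    )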

Improved Productivity for C# Developers

Making it easier for developers to query and manipulate data is at the core of our mission as the modern application data platform. For C# developers, the LINQ API serves as the main gateway between the language and the database. In MongoDB 5.1 we are improving developer productivity for our C# community with a completely redesigned LINQ interface that lets developers write all of their MongoDB queries, as well as build sophisticated aggregation pipelines, natively in C#.

Getting Started with MongoDB 5.1

You can learn more about all of the new features and enhancements in MongoDB 5.0 and 5.1 from our Guide to What’s New.

MongoDB 5.1 is available now. If you are running Atlas Serverless instances or have opted in to receive Rapid Releases in your dedicated Atlas cluster, then your deployment will be automatically updated to 5.1 starting today. For a short period after upgrade, the Feature Compatibility Version (FCV) will be set to 5.0; certain 5.1 features will not be available until we increment the FCV. MongoDB 5.1 is also available as a Development Release for evaluation purposes only from the MongoDB Download Center. Consistent with our new release cadence announced last year, the functionality available in 5.1 and the subsequent Rapid Releases will all roll up into MongoDB 6.0, our next Major Release scheduled for delivery in 2022.

I really look forward to hearing what you think about MongoDB 5.1, and can’t wait to tell you what’s new in the 5.2 Rapid Release scheduled for next quarter.

Safe Harbor Statement

The development, release, and timing of any features or functionality described for our products remain at our sole discretion. This information is merely intended to outline our general product direction; it should not be relied on in making a purchasing decision, nor is it a commitment, promise, or legal obligation to deliver any material, code, or functionality.