Joyce, a Decentralized Approach to Foster Business Agility

Raffaele Camanzo

Despite all of the tools and methodologies that have arisen in the last few years, many companies, particularly those that have been in the market for decades, struggle when it comes to leveraging their operational data to build new digital products and services. According to research and surveys conducted by McKinsey over the last few years, the success rate of digital transformations is consistently low, with fewer than 30% succeeding at improving their company’s performance. There are a lot of reasons for this, but most of them can be summarized in a sentence: A digital transformation is first an organizational and cultural change, and only then a technological shift.

The question is not whether digital transformation is a good thing, nor whether moving to the cloud is a good choice. Companies need a digital transformation (badly, in some cases) and yes, the pros of moving to the cloud usually outweigh the cons.

So, let’s dig deeper and analyze three of the main problems companies face when they go on this journey.

Digital product development

Products by nature are customer-driven, but companies run their businesses on multiple back-end systems that are instead purpose-driven. Unless you run a very small business, different people with different objectives have ownership of such products and systems. Given this context, what happens when a company wants to launch a new digital product at speed? The back-end systems (CRMs, e-commerce, ERP, etc.) hold the data the product needs to bring to the customer. Some systems are SaaS, some are legacy, and perhaps others are custom applications the company built back in the day, when it disrupted the market with innovative solutions: the perfect recipe for integration hell.

The product manager needs to coordinate and negotiate multiple change requests with the systems’ owners whilst trying to convince them to fit these requests into their backlogs in time to meet the deadline. Things get even worse when the new product relies on the computational power of the source systems: if those systems cannot handle the additional traffic, both the product and the core services will be affected.

Third-party integration

“Everybody wants the change, (almost) nobody wants to change.” In this ever-growing digital world, partnering with third parties (whether they are clients or service providers) is crucial, but everyone who has tried to do so knows how challenging this is: non-standard interfaces, CSV files over FTP with fancy update rules, security issues… The list of unwanted things can grow indefinitely.

SaaS everywhere

The Software-as-a-Service model is extremely popular, and getting the service you want without worrying about the underlying infrastructure gives freedom and speed of adoption. But what happens when a big company relies on multiple SaaS products to run its business? Sooner or later, it experiences loss of control and rising costs in keeping a consistent view of the big picture: it has to deal with each SaaS product’s internal representation of its own data, multiple views of the same domain concept, and unplanned expenses to export, interpret, and integrate data coming from different sources in different formats.

Putting it all together

All the issues above fall into a well-known category of information technology. They are integration problems, and over the years, a lot of vendors have promised a definitive solution. Today, you can consider low-code/no-code platforms with hundreds of ready-made connectors and modern graphical interfaces. Problem solved, right? Well, not really. Low-code integration platforms simplify implementation, and they are really good at it, but in doing so they oversimplify the real challenge: creating and maintaining, over time, a consistent set of APIs shaped around business value, and preventing those interfaces from leaking internal complexities to the rest of the company. That is something that has to be defined and maintained through architectural choices and the proper skills, all of it completely hidden behind the selling points of such platforms.

There are two different ways to solve integration problems:

  • Centralized, using adapters. In this case, the logic is pushed to the central orchestration component, with integration managed through a set of adapters. This is the rather old-school SOA approach, the one on which the majority of market integration platforms are built.

  • Decentralized, pushing the logic to the edges and giving autonomous teams the freedom to define both the boundaries and the APIs that a domain must expose to deliver business value. This more modern approach has arisen alongside microservices and, in the analytical world, the concept of the data mesh.

The former gives speed at the start and the illusion of reducing the number of choices and skills needed to manage the problem, but in the long run it inevitably accumulates technical debt: lacking the necessary degrees of freedom, you lose the ability to evolve the integration points over time, the same limitation that drove the transition from SOA to microservices architectures. The latter requires the relevant skills, vision, and ability to execute, but it gives immediate results and lets you flexibly manage the evolution of the enterprise architecture over time.

Old problems, new solutions

At Sourcesense, over the last 20 years, we have partnered on hundreds of projects to bring agility, speed, and new open-source technology to our customers. Many times through the years we have faced the integration challenges above, and yes, we tried to solve them with the technology available at the time: we built integration solutions on SOA (when it was the best of breed) and worked with many of the integration platforms on the market. We struggled with the issues and limitations of the integration landscape, and we listened to our customers’ needs and to where their expectations fell short. The rise of agile methodologies, cloud computing, and new techniques, technologies, and architectural styles has given an unprecedented boost to software evolution and its ability to support business needs, so we embraced the new wave and now have growing experience in solving these problems with these tools.

Along the way, we noticed a recurring pattern: whenever we encountered integration problems, data hubs proved effective as components of the enterprise architecture for solving them. So we built one of our own: Joyce.

Data hubs

This is a relatively new term and refers to software platforms that collect data from different sources with the main purpose of distribution and sharing. Since this definition is broad and vague, let’s add a few key elements that matter and help define the contours of our implementation. Collecting data from different sources can bring three major benefits:

  • Computational decoupling from the sources. Pulling (or pushing) the data out of the originating systems means that client applications and services interact with the hub and not directly with the sources, preventing the source systems from being slowed down by the additional traffic.

  • Catalog and discoverability. If data is collected correctly, this leads to the creation of a catalog, allowing people inside the organization to search, discover, and use the data inside the hub.

  • Security. The main purpose of the hub is distribution and sharing, which immediately puts the focus on access control and security hardening. A single access point simplifies the overall security around the data because it significantly reduces the number of systems the clients have to interact with to gather the data they need.

Joyce, how it works

The cornerstone concept of Joyce is the schema. It allows you to shape the ingested data and how this data will be made available to client services. Using the same declarative approach made popular by Kubernetes, the schemas describe the expected result and the platform performs the actions to make it happen.

Schemas are standard JSON Schema files, stored and classified in a catalog. Their definitions fall into three categories (a sketch of the underlying mechanics follows the list):

  • Input – how to gather and shape the source data. We leverage the Kafka Connect framework to provide ready-made connectors for a wide variety of sources. The ingested data can be filtered, formatted, and enriched with transformation handlers (domain-specific extensions of JSON schema).

  • Model – allows you to create new aggregates from the data stored in the platform. This feature gives you the freedom to model the data the way client services need it.

  • Export – bulk data export capability. An export can be defined as any query run against the existing data, with an optional temporal filter.
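Joyce’s own schema syntax is not reproduced here, so the following is only a minimal sketch of the JSON Schema foundation these definitions share: a plain validator enforcing the declared shape of an ingested record. All field names are hypothetical, and the transformation-handler extensions are omitted.

```python
# A minimal sketch of the JSON Schema mechanics behind an "input" definition.
# Field names are hypothetical; Joyce's actual schema syntax and its
# transformation-handler extensions are not shown here.
from jsonschema import ValidationError, validate

customer_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "required": ["customerId", "email"],
    "properties": {
        "customerId": {"type": "string"},
        "email": {"type": "string"},
        "fullName": {"type": "string"},
    },
}

record = {"customerId": "c-42", "email": "jane@example.com", "fullName": "Jane Doe"}

try:
    validate(instance=record, schema=customer_schema)  # raises on a mismatch
    print("record conforms to the declared shape")
except ValidationError as err:
    print(f"rejected: {err.message}")
```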

Input and model data is made available to all the client services with the proper authorization grants through auto-generated REST and GraphQL APIs. It is also possible to subscribe to a dedicated topic if an event-driven approach is more suitable for the use case.
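As a usage illustration, a client service could fetch an aggregate with a plain HTTP call. The base URL, resource path, and token below are assumptions made for the example, not Joyce’s documented endpoint layout.

```python
# Hypothetical client call against an auto-generated REST endpoint.
# The host, path, and token are illustrative placeholders.
import requests

JOYCE_BASE_URL = "https://joyce.example.com/api"  # assumed deployment URL
TOKEN = "redacted-access-token"  # a token carrying the proper authorization grants

response = requests.get(
    f"{JOYCE_BASE_URL}/rest/customer/c-42",  # assumed resource path
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=10,
)
response.raise_for_status()
print(response.json())  # the aggregate shaped by the corresponding model schema
```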

MongoDB: the key to a flexible model and performance at scale

We heavily rely on MongoDB. Thanks to its flexibility, we can easily map any data structure the user defines to collect the data. Half of the schema definition is basically the definition of a MongoDB schema. (We also auto-generate one schema per collection to guarantee data integrity.)
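To make that concrete, here is a small sketch of the native MongoDB mechanism such a per-collection schema can rely on: a $jsonSchema validator attached at collection creation, which makes the database itself reject non-conforming documents. The connection string, collection, and field names are illustrative, not Joyce’s internals.

```python
# Illustrative use of MongoDB's native $jsonSchema validation: the collection
# itself rejects documents that do not match the declared shape.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumed connection string
db = client["joyce_demo"]

db.create_collection(
    "customers",
    validator={
        "$jsonSchema": {
            "bsonType": "object",
            "required": ["customerId", "email"],
            "properties": {
                "customerId": {"bsonType": "string"},
                "email": {"bsonType": "string"},
            },
        }
    },
)

db.customers.insert_one({"customerId": "c-42", "email": "jane@example.com"})  # accepted
# db.customers.insert_one({"email": "no-id@example.com"})  # would be rejected
```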

Joyce runs in a Kubernetes cluster and all its services are inherently stateless to exploit the full potential of horizontal scaling. The architecture is based on the CQRS pattern. This means that writes and reads are completely decoupled and can scale independently to meet the unique needs of the production environment. MongoDB is also the backing database of the API layer, so we can keep the promise of low latency, high throughput, and continuous availability across all the components of the stack.
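For readers unfamiliar with CQRS, this toy sketch shows the idea with in-memory stand-ins (not Joyce’s actual components): the write path only appends events, a projector updates a separate read model, and queries are served entirely from that read model, which is why each side can scale on its own.

```python
# Toy CQRS sketch with in-memory stand-ins for the ingestion topic and the
# MongoDB-backed read store; it only illustrates why the two sides decouple.
from collections import deque
from typing import Optional

event_log = deque()  # write side: stand-in for the ingestion topic
read_model = {}      # read side: stand-in for the MongoDB-backed API store

def handle_command(customer_id: str, email: str) -> None:
    """Write path: append an event; never touches the read store."""
    event_log.append({"type": "CustomerUpserted", "id": customer_id, "email": email})

def project() -> None:
    """Projector: drains events into the read model, typically asynchronously."""
    while event_log:
        event = event_log.popleft()
        read_model[event["id"]] = {"email": event["email"]}

def query(customer_id: str) -> Optional[dict]:
    """Read path: served entirely from the read model."""
    return read_model.get(customer_id)

handle_command("c-42", "jane@example.com")
project()
print(query("c-42"))  # {'email': 'jane@example.com'}
```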

The platform is available as a fully managed PaaS on the three major cloud providers (AWS, Azure, GCP), but if needed, it can be installed on an existing infrastructure (in the cloud or on premises).

Final considerations

There are many challenges leaders must face for a successful digital transformation. They need to guide their organizations along a process that involves changes on many levels. The exponential growth of technological solutions in the last few years adds more complexity and confusion. The evolution of organizational models and methodologies points in the direction of shared responsibility, people empowerment, and autonomous teams with a light and effective central governance. The same evolution also permeates novel approaches to enterprise architecture, like the data mesh.

Unfortunately, there’s no silver bullet, just the right choices for the given context. Despite all the marketing and hype around this or that one-stop solution to all of your digital transformation needs, a long-term, successful shift needs guidance, competence, and empowerment. We built Joyce with the aim of reducing the burden of repetitive tasks and boilerplate code, getting results faster, and picking the low-hanging fruit, without trying to replace the architectural thinking necessary to properly define the current state and the evolution of our customers’ enterprise architectures. If you’re struggling with the problems listed at the beginning of this article, you should give Joyce a try.