Why is software integration such a mess?

Andras Gerlits
5 min read · Oct 26, 2023

Although this article was written from a neutral perspective, I must preface it by saying that our company (omniledger.io) specialises in providing software and consulting services for distributed data platforms.

With the advent of “Software as a Service” (SaaS) products, companies have the option of not having to build their own solutions to everything. We run services on different computers, in different data-centres, where these solutions know nothing about the others around them. The flip side of our new cloud-based status quo is the increased need for software integration.

Take “Customer Relationship Management” (CRM) software for example. CRM is very useful for (*duh*) managing customer relationships, but no CRM software serves all the needs of the whole company by itself. In any setup, we would need a number of other systems running alongside it. Yet, if you count the companies offering various SaaS CRM solutions, I suspect you’ll easily reach a hundred before you get bored of counting.

This tells us that most companies rely on their ability to integrate software they have no direct control over in order to operate. If you use 5 different SaaS providers, any one of them going down could mean no more “business as usual”. Worse, since you chose the SaaS products yourself, integrating them will be your problem alone: these solutions will certainly not understand the specifics of your business domain.

Enter microservices

SaaS usually means exposed APIs. These APIs provide entry-points where you can read/write data from/to their system. To do this, we create small, simple applications (called adapters) which translate our company’s data into the format each API expects, and call it a day.
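To make the idea concrete, here is a minimal adapter sketch in Python. The internal field names, the CRM payload schema and the endpoint URL are all made up for illustration; a real SaaS API defines its own schema, authentication and error handling.

```python
# A minimal adapter sketch. Field names and the endpoint are illustrative only.
import json
import urllib.request

def to_crm_payload(order: dict) -> dict:
    """Translate our internal order record into the (hypothetical) CRM format."""
    return {
        "customerId": order["customer_id"],
        "lineItems": [{"sku": order["sku"], "qty": order["quantity"]}],
        "currency": order.get("currency", "EUR"),
    }

def push_order(order: dict, endpoint: str = "https://crm.example.com/api/orders") -> None:
    """Send the translated record to the SaaS API."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(to_crm_payload(order)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()  # a real adapter would check the status and retry on failure
```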

Not so fast. The data we’re communicating now “lives” in both their SaaS software (often many of them at once) and our platform. Each platform makes decisions based on the last information it received, which can be a problem unless these things are tightly controlled. Say you place an order for a sprocket. You hit the order button, but a second later realise that you needed two. You go back to the form, update the number of ordered items and go on your way. Since these systems rarely execute all required operations before they tell you that your order was successful, you now have two requests in the system, racing each other towards all the affected subsystems: the first for an order of one, the second for an order of two. Unless the system was designed to deal with this issue, some sub-systems can receive the requests in the wrong order. They can process the second request first and the first one later, leaving you perhaps footing the bill for two but receiving only one product.
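Here is a toy simulation of that race, under the simplifying assumption that the downstream subsystem just applies whatever message arrives last. All names are illustrative.

```python
# Two updates to the same order race each other; a subsystem that blindly
# applies the latest arrival ends up keeping the stale value.
subsystem_state = {}  # order_id -> quantity as seen by one downstream system

def apply_update(order_id: str, quantity: int) -> None:
    """Overwrite with whatever message arrived last (last-write-wins)."""
    subsystem_state[order_id] = quantity

# The user first orders 1 sprocket, then corrects it to 2 ...
updates = [("order-42", 1), ("order-42", 2)]

# ... but the messages reach this subsystem in the opposite order.
for order_id, quantity in reversed(updates):
    apply_update(order_id, quantity)

print(subsystem_state)  # {'order-42': 1} -- the correction was silently lost
```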

In well-behaved systems, anomalies like this are managed through some ordering mechanism, but finding all the places where data can potentially contradict itself is not easy. Collectively, these kinds of issues are called “data consistency” problems, and they are a serious thorn in the side of microservices-based projects.
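One common ordering mechanism, sketched below under the assumption that every update carries a monotonically increasing sequence number: the subsystem drops anything older than what it has already processed. This handles the specific anomaly above, but it is not a general answer to consistency across systems.

```python
# Sequence-number guard: ignore messages older than the last one applied.
subsystem_state = {}  # order_id -> (sequence, quantity)

def apply_update(order_id: str, sequence: int, quantity: int) -> bool:
    """Apply the update only if it is newer than the last one we processed."""
    last_seen, _ = subsystem_state.get(order_id, (-1, None))
    if sequence <= last_seen:
        return False  # stale message: drop it (or park it for reconciliation)
    subsystem_state[order_id] = (sequence, quantity)
    return True

apply_update("order-42", 2, 2)   # the correction arrives first ...
apply_update("order-42", 1, 1)   # ... so the original is rejected as stale
print(subsystem_state)           # {'order-42': (2, 2)}
```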

I give a rundown of these complexities in another article, but suffice it to say that issues like this have haunted microservices ever since their inception. In fact, this is usually the issue cited as the main argument against working with microservices, since it makes the software complex (and hence expensive to run and develop) and introduces hard-to-find bugs, impacting stability.

According to research (PDF), it’s humanly impossible to exhaustively reason about the correctness of such systems. To put it differently: these issues will never go away, even if their frequency varies over time as ad-hoc fixes address specific problems when they come to light.

(…) we conducted a systematic literature review of representative articles reporting the adoption of microservices, we analyzed a set of popular open-source microservice applications, and we conducted an online survey to cross-validate the findings of the previous steps with the perceptions and experiences of over 120 experienced practitioners and researchers.

Through this process, we were able to categorize the state of practice of data management in microservices and observe several foundational challenges that cannot be solved by software engineering practices alone (…)

Orchestration

The established alternative to the above is to maintain a central piece of software which instructs each sub-system to execute its steps in some pre-determined order. This solution has many names; we’ll call it middleware. The middleware runs business-oriented processes which implement the services required by the user. These are familiar operations to the business, such as “receive stock”, “fulfill order” or “on-board new employee”. We call them workflows.

In the sprocket example above, when the order is placed, the middleware would (see the sketch after this list):

  1. tell the stock-handling software that the client ordered an item,
  2. the accounting software that we sold some stock for some money, and
  3. the logistics software to create a delivery plan.
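A bare-bones sketch of such an orchestrator, with stand-in clients for the three sub-systems. A real middleware would add retries, timeouts, idempotency keys and persistent workflow state; everything here is illustrative.

```python
# Minimal orchestration sketch for the "fulfill order" workflow above.
class StockClient:
    def reserve(self, order): print(f"stock: reserving {order['quantity']} x {order['sku']}")

class AccountingClient:
    def record_sale(self, order): print(f"accounting: booking sale for {order['order_id']}")

class LogisticsClient:
    def plan_delivery(self, order): print(f"logistics: planning delivery for {order['order_id']}")

def fulfill_order(order, stock, accounting, logistics):
    """Run the workflow steps in a fixed, pre-determined order."""
    stock.reserve(order)            # 1. tell stock-handling about the order
    accounting.record_sale(order)   # 2. tell accounting we sold some stock
    logistics.plan_delivery(order)  # 3. tell logistics to create a delivery plan

fulfill_order(
    {"order_id": "order-42", "sku": "sprocket", "quantity": 2},
    StockClient(), AccountingClient(), LogisticsClient(),
)
```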

Workflows are usually independent of each other and do their own checks. We can now make sure that each business operation runs safely on its own, but that leaves open the question of what happens when:

  • there are many such processes running in parallel (isolation)
  • one workflow contradicts another (consistency)
  • a sub-system reports an error partway through the operation (atomicity; one common mitigation is sketched after this list)
  • the central solution itself fails
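For the atomicity question specifically, one common (if imperfect) mitigation is saga-style compensation: if a later step fails, the middleware undoes the steps that already completed. The sketch below assumes each step comes with a hand-written compensating action, which is rarely as tidy in practice; the article does not prescribe this approach, it is only one way middleware tries to cope.

```python
# Saga-style compensation sketch: undo completed steps when a later one fails.
def run_with_compensation(steps):
    """steps: list of (do, undo) callables executed in order."""
    done = []
    try:
        for do, undo in steps:
            do()
            done.append(undo)
    except Exception as err:
        print(f"step failed ({err}); compensating in reverse order")
        for undo in reversed(done):
            undo()

def accounting_step():
    # Stand-in for a sub-system call that fails mid-workflow.
    raise RuntimeError("accounting system is down")

run_with_compensation([
    (lambda: print("stock reserved"),    lambda: print("stock released")),
    (accounting_step,                    lambda: print("sale reversed")),
    (lambda: print("delivery planned"),  lambda: print("delivery cancelled")),
])
# Output: stock reserved / step failed (...) / stock released
```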

We can already see how moving towards this solution is not the silver bullet we hoped it would be. It solves some of the issues present in microservices, opens up new ones and leaves others unaddressed.

The problems of isolation, consistency and atomicity were also present with microservices, but a new problem we now need to deal with is a single, central point of failure. With microservices, there is no single central “authority” instructing the others, which means there’s no single system that impacts all the others if it goes down. We usually build middleware with extra redundancy (both on the hardware and the software level), but careful planning only goes so far: software bugs, internet infrastructure outages and maintenance errors mean that these systems sometimes crash anyway.

ACID? In my database?

This is not the first time computer science has come across the problem of guaranteeing the safe exchange of data between processes. In fact, databases have provided various levels of support for these issues since the 70s. By today, the so-called consistency guarantees provided by databases have become industry standards. The first three letters of the ACID acronym stand for the terms we already referred to above: Atomicity, Consistency and Isolation. The D, which stands for Durability, is (although generally important) not central to our current analysis.
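As a concrete (if tiny) illustration of the A in ACID, here is a self-contained example using Python’s built-in sqlite3 module: when the second write fails, the database rolls back the first one too, so the two tables never disagree. The table names are made up for the example.

```python
# Atomicity in a single database: either both rows are written, or neither is.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, quantity INTEGER)")
conn.execute("CREATE TABLE ledger (order_id TEXT, amount INTEGER NOT NULL)")

try:
    with conn:  # commit on success, roll back everything on failure
        conn.execute("INSERT INTO orders VALUES ('order-42', 2)")
        conn.execute("INSERT INTO ledger VALUES ('order-42', NULL)")  # violates NOT NULL
except sqlite3.IntegrityError as err:
    print(f"transaction rolled back: {err}")

# The orders row was rolled back along with the failed ledger insert.
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone())  # (0,)
```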

It’s widely accepted that databases provide a safe way for processes to communicate data with each other. In that case, why not extend them to also talk to other databases with the same safety guarantees?

This is exactly the service we invented, patented and built. We provide a way to extend the data-safety guarantees of isolated database instances to your entire integration platform. We can retrofit this over your existing solution and promise unparalleled stability and resilience, while relying only on your developers’ ability to use their own database, which also results in a significant drop in complexity for them.

Visit our website, or if you’re not yet convinced, read our (increasingly technical) articles on how our solution can help you stabilise your application and tame its complexity.


Andras Gerlits

Writing about distributed consistency. Also founded a company called omniledger.io that helps others with distributed consistency.