Rejuvenating Enterprise Integration with Ballerina

Mar 15, 2017

In this post I will discuss the current state of the Enterprise Integration landscape, the characteristics of future integration middleware, and how Ballerina, a new programming language designed and optimized for integration, empowers the future of Enterprise Integration middleware.

State of Enterprise Integration Technologies Landscape

Brownfield enterprises need to integrate their existing software applications, services, systems and data to form new software solutions that realize business functionality. The task of plumbing these applications, services, systems and data together is known as Enterprise Integration. Although Enterprise Integration is not new, the increasing adoption of Cloud, Mobile, APIs and IoT, along with the convergence of Data and Application integration technologies, has made it a hot topic again in the Enterprise Architecture landscape.
Also, with the popularity of architectural paradigms such as Microservices and Container Architecture, the role of Enterprise Integration middleware in modern Enterprise Architecture is going through some drastic changes.

Conventional Centralized Integration Middleware — SOA and ESB

Almost all the integration middleware solutions out there are based on the concept of a central integration bus (a.k.a. Enterprise Service Bus, or ESB) that can connect anything with anything. The central integration bus knows how to communicate with these heterogeneous systems and services, and it acts as the communication channel between them.

Figure 1.1: Centralized integration middleware/ESB in action.

The central integration bus contains the routing logic, implementations of various Enterprise Integration Patterns (EIPs), connectors to various applications (on-premise and cloud) and systems, and even the business logic of some interactions. This has made the ESB/integration bus the heart of the software solution of any brownfield enterprise: all the enterprise integration scenarios related to the business are built upon it.

Can Conventional Integration Middleware survive?

Conventional integration middleware is facing some severe challenges owing to the increasing adoption of Microservices and Container Architecture. While microservices architecture literally rules out the ESB/central integration middleware, almost all the existing integration middleware is not container friendly either (in terms of startup time and memory footprint). In addition, almost all the existing integration middleware was designed several decades back and is not well suited to modern integration requirements. With the drastic changes in Enterprise Integration requirements and the proliferation of APIs, SaaS applications, fine-grained services, IoT and so on, most conventional integration solutions fall apart when it comes to realizing those new integration use cases.

Microservices and ESB/Integration Middleware — Myths and Facts

Microservices architecture tries to eliminate the use of an ESB or any dedicated integration middleware layer. This comes from Martin Fowler’s ‘Smart endpoints and dumb pipes’ concept of Microservice Architecture. It means that the entire routing logic or business logic that resides at the central ESB layer has to be segregated and distributed among the (smart) clients and (micro) services (figure 1.2). The communication channel becomes a dumb channel that only connects the clients and services.

Figure 1.2 — From ESB to ‘Smart endpoints and dumb pipes’ (source : YOW! 2016 — Microservices by Martin Fowler)

When you try to implement ‘smart endpoints and dumb pipes’ in practice, it’s very likely that you will end up with point-to-point connectivity between your services and consumers (figure 1.3). Unfortunately, this is the same point-to-point mess that the ESB was originally designed to solve. It becomes even worse with Microservice Architecture, because you will experience a drastic growth in the number of services that you want to integrate.
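To make that growth concrete, here is a small Python sketch (not from any of the systems discussed; the function names are illustrative). It assumes the worst case, in which every service may need to talk to every other service, and compares the number of direct connections with the number needed when each service connects only to a central bus or gateway.

```python
def point_to_point_links(n: int) -> int:
    """Worst case: each of the n services connects directly to every other one."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Each of the n services connects only to a central bus or gateway."""
    return n


for n in (5, 20, 100):
    print(f"{n:>3} services: {point_to_point_links(n):>5} point-to-point links, "
          f"{hub_links(n):>3} hub links")
```

Under this assumption the point-to-point count grows quadratically (4950 links for 100 services), while the hub count grows linearly. Real systems rarely connect every pair, but the trend is the same.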

Figure 1.3 — ‘Smart-endpoint and dumb pipes’ in practice (source : Microservices in Practice)

Also, it’s very unlikely that you will have to communicate with these sleek microservices only. When you have to communicate with a proprietary/legacy system (e.g. SAP) or a complex SaaS app/web API (such as Salesforce), you will surely face implementation nightmares.
So, it’s really interesting to take a closer look at some popular microservices implementations out there and see how they have implemented this ‘smart endpoints and dumb pipes’ concept.

Netflix

Netflix is probably the most popular and successful microservices implementation that you have heard of. Netflix exposes its internal service functionality through the Netflix API layer. They explain the functionality of the Netflix API as follows.

The Netflix API is the “front door” to the Netflix ecosystem of microservices. As requests come from devices, the API provides the logic of composing calls to all services that are required to construct a response. It gathers whatever information it needs from the backend services, in whatever order needed, formats and filters the data as necessary, and returns the response. So, at its core, the Netflix API is an orchestration service that exposes coarse grained APIs by composing fined grained functionality provided by the microservices.
Netflix Tech Blog

Figure 1.4. : Netflix API Gateway layer (source : Netflix Tech Blog)

As shown in figure 1.4, the API layer is responsible for orchestration between microservices, and the implementation is built on top of Java (and RxJava in particular).
So, it’s quite obvious that, unlike the ‘smart endpoints and dumb pipes’ theory, a significant portion of the routing/orchestration logic resides in the API gateway layer, which sits between the microservices and the client applications.
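The kind of orchestration described in the quote can be sketched in a few lines. The following Python example is only an illustration and not Netflix's implementation: the two backend services, their payloads and the `home_screen` endpoint are all hypothetical, and the network calls are simulated in-process. It shows a coarse-grained API that calls fine-grained services concurrently, then filters and formats the combined result.

```python
import asyncio


# Hypothetical fine-grained backend services, simulated in-process.
async def get_user(user_id: str) -> dict:
    await asyncio.sleep(0.01)  # stand-in for a network call
    return {"id": user_id, "name": "Alice", "internal_flags": ["x"]}


async def get_recommendations(user_id: str) -> list:
    await asyncio.sleep(0.01)  # stand-in for a network call
    return [{"title": "Movie A", "score": 0.9}, {"title": "Movie B", "score": 0.4}]


async def home_screen(user_id: str) -> dict:
    """Coarse-grained API: compose fine-grained calls, then filter and format."""
    user, recs = await asyncio.gather(get_user(user_id), get_recommendations(user_id))
    return {
        "greeting": f"Welcome back, {user['name']}",
        # Drop low-scoring items and internal fields before responding.
        "titles": [r["title"] for r in recs if r["score"] >= 0.5],
    }


if __name__ == "__main__":
    print(asyncio.run(home_screen("u1")))
```

Even in this toy version, the composition, filtering and formatting logic lives in the layer between the clients and the services rather than in the endpoints themselves.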

Uber

Uber is another popular microservices implementation; they have also broken their monolithic application into thousands of microservices. Similar to Netflix, they use ‘Edge Services’, which are exposed to the external client/mobile applications, and the service orchestration logic is baked into the edge services.

Figure 1.5: Edge Services contains the orchestration logic (source : InfoQ)

The edge services are primarily implemented on top of Node.js.

PayPal

PayPal also follows a similar pattern in its microservices implementation. The API façade layer exposes PayPal business functionality to various internal and external client applications (figure 1.6). The orchestration logic resides in the API façade layer and is implemented using Groovy.

Figure 1.6: API Façade layer contains the orchestration logic. (source : InfoQ)

From these microservices implementations, it’s quite obvious that none of these organizations follows the ‘smart endpoints and dumb pipes’ principle. Rather, having an orchestration layer between the microservices and their consumers is more practical. Although an ESB or centralized integration middleware could serve these orchestration needs, owing to their monolithic nature, we can’t bring them back by any means!

That’s why we need to re-think the existing integration middleware architecture and come up with a new integration middleware that can cater to the modern Enterprise Integration needs.

Ballerina

Ballerina is a new programming language that is designed and optimized for integration. Ballerina revolutionizes the way you model integration scenarios with its graphical and textual syntax which is built on top of the sequence diagram metaphor. It is fully container native and 100% open source.

Why Ballerina?

At WSO2, we have been working in the integration space for over a decade and have been actively involved in the development of WSO2 ESB, which is one of the premier open source ESBs out there. Like all the other ESBs, WSO2 ESB is intended and designed to be used as centralized integration middleware, and the aforementioned state-of-the-art integration requirements are pretty hard to achieve with such conventional integration middleware.
So, moving forward, we had to completely rethink integration middleware architecture in light of these bleeding-edge architectural paradigms. In that context, we had two key motivations behind building Ballerina.

Sequence Diagrams to model Integration scenarios

We often find that when we have to explain a complex integration scenario to our customers, we end up using a sequence diagram (I recall several instances where we had to explain some of the complex execution flows of WSO2 ESB with sequence diagrams). Unlike data flow diagrams, sequence diagrams are extremely powerful and scalable when it comes to complex orchestrations and multi-party interactions. This is the main motivation for using sequence diagrams to represent integration scenarios graphically in Ballerina. We also designed a textual representation of the same scenarios, with parity between the textual and graphical syntaxes.

Lightweight, high-performance and container native Integration engine

At WSO2, we have always wanted to build integration middleware solutions that perform really well, and our users/customers also push us to the very edge when it comes to performance. While boosting throughput and reducing latency remained top priorities for the next-generation integration middleware solution that we were building, we also had a few new challenges.

Container architecture has remarkably changed the way we develop, deploy and scale software applications. Hence we needed to make sure that our new integration middleware solution is fully container native, which means that the integration runtime must start up within a few seconds and consume few resources (memory footprint, CPU consumption). Therefore, a super-fast, container-native runtime became a key objective of the runtime design and implementation.

Ballerina Concepts

It’s time to learn the basic concepts of Ballerina. There are quite a few resources for learning Ballerina concepts, so I’m not going to reiterate them in detail in this post (I would highly recommend that you go through the series of blog posts on Ballerina by Sanjiva Weerawarana and follow http://ballerinalang.org/).

Let me try to provide a concise explanation of all the main constructs.

Figure 1.7 : Ballerina components overview

Service and Resource

In Ballerina, we introduced the ‘service’ concept as the interface of any externally exposed integration logic. A service comprises a homogeneous collection of network-accessible entry points called ‘resources’. A service must be bound to a network protocol through annotations so that it can bind to the respective server connector.

A ‘resource’ contains a set of statements that are executed sequentially. These statements form the integration logic of any integration scenario that you build with Ballerina.

For example, in the scenario shown in figures 1.8 and 1.9, we have a service defined for an ‘Ecommerce’ application, and it comprises a collection of resources: ‘productInfo’, ‘productMgt’, ‘ordersInfo’, ‘ordersMgt’ and so on. By default, a given resource is executed by a worker thread (the default worker), and you can control the execution by spawning new threads if you prefer (discussed in the Workers section below).

In the Ballerina graphical notation we represent a resource as a grouped set of lifelines and their interactions. A service is denoted as a collection of such resources (figure 1.8).

Figure 1.8. : Ballerina composer graphical view

Figure 1.9. : Ballerina composer textual view

Server and Client Connectors

The service interface is protocol agnostic, which means you can bind any protocol-specific ‘Server Connector’ to a service to receive messages over that network protocol. Server connectors are plugged into the service/resource interfaces via annotations. For instance, a service/resource can receive messages from the HTTP server connector, and that is configured and controlled via annotations defined at the service and resource level (e.g. http:BasePath, http:Path).
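For readers who have not seen annotation-driven binding before, the idea can be approximated in Python with decorators. This is an analogy only and not Ballerina syntax: the `service` and `resource` decorators, the `ROUTES` table and the `dispatch` function are hypothetical names that stand in for the base-path annotation, the path annotation and the HTTP server connector respectively.

```python
from typing import Callable, Dict, Tuple

# Routing table that a hypothetical HTTP server connector would consult.
ROUTES: Dict[Tuple[str, str], Callable[[dict], dict]] = {}


def resource(method: str, path: str):
    """Mark a method as a resource (plays the role of a path annotation)."""
    def mark(func):
        func._resource = (method, path)
        return func
    return mark


def service(base_path: str):
    """Register every marked method of the class under base_path."""
    def register(cls):
        instance = cls()
        for name in dir(instance):
            handler = getattr(instance, name)
            meta = getattr(handler, "_resource", None)
            if meta is not None:
                method, path = meta
                ROUTES[(method, base_path + path)] = handler
        return cls
    return register


@service("/ecommerce")
class Ecommerce:
    @resource("GET", "/products")
    def product_info(self, request: dict) -> dict:
        return {"status": 200, "body": ["laptop", "phone"]}

    @resource("POST", "/orders")
    def orders_mgt(self, request: dict) -> dict:
        return {"status": 201, "body": {"order": request["body"]}}


def dispatch(method: str, path: str, request: dict) -> dict:
    """What a server connector does after it receives a message."""
    handler = ROUTES.get((method, path))
    if handler is None:
        return {"status": 404, "body": None}
    return handler(request)
```

The resource bodies contain no protocol handling at all; swapping the decorators and the dispatcher for another protocol would leave the integration logic untouched, which is the point of a protocol-agnostic service interface.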

A client connector represents an external system that is accessible through the network, such as an HTTP service, a database, or a SaaS application such as Salesforce. Ballerina comes with client connectors for these disparate systems. In the sequence diagram, we denote each external system that is accessed through a client connector (one connector instance per external system) as a lifeline. Each client connector comes with a set of ‘actions’, and those actions are invoked from the worker lifelines.

For instance, in figure 1.8, the ‘productService’ HTTP service is represented as a lifeline in the sequence diagram, and it is accessed via the HTTP client connector of Ballerina. The actions of the HTTP client connector are the various HTTP operations such as ‘get’, ‘post’ and ‘put’.

Functions

Ballerina uses functions to represent reusable integration scenarios/code. If you want to reuse a given portion of Ballerina code, you model that portion as a Ballerina function, which can be invoked from a resource or from another function. As with functions in general-purpose languages, Ballerina functions can have multiple input arguments, and they can have multiple return values too.

In the Ballerina graphical representation, a function is denoted as yet another sequence diagram, which comprises interactions between lifelines.

‘Main’ is a special type of function that always runs first in a given Ballerina program. Main is often used when you don’t want to expose a given integration scenario as a service but rather want to run the scenario as an executable program.

Workers

A worker is a thread of execution that can be programmed independently. By default, a resource or a function contains a default worker, in which you sequentially define the Ballerina programming logic.

Within a worker, there can be statements that are executed sequentially as well as action invocations that interact with connectors.

State of the art Integration with Ballerina

Let’s discuss some of the key differentiators of Ballerina and why it is at the forefront of becoming the most comprehensive futuristic integration middleware.

Light-weight container native runtime — Micro-integrations/Integration Microservices.

As we have discussed earlier, conventional monolithic integration solutions are not compatible with Container or Microservice architecture. That’s where the concept of ‘micro-integration’ (integration microservices) comes into the picture.

A micro-integration is an integration architecture pattern in which you build a specific integration scenario, then deploy and run that scenario independently on top of a lightweight integration framework.
A given micro-integration can integrate microservices, APIs, and on-premise or cloud applications.
Since you can deploy each integration scenario as an independent runtime, you can also scale them independently (as in centralized integration middleware, you can still opt to run multiple scenarios in a given runtime).

From the ground up Ballerina is designed and developed to cater to the micro-integration needs.

A Ballerina-based micro-integration/integration microservice (e.g. a RESTful HTTP service) can be started in less than one second with a very low memory footprint. None of the existing integration middleware is capable of achieving that yet.
Regardless of the scenario you want to integrate, you can leverage Ballerina to implement the respective micro-integration. For example, figure 1.10 illustrates several micro-integration scenarios. In scenario 1, Ballerina is used to implement a micro-integration that orchestrates several microservices. Similarly, traditional integration scenarios as well as hybrids of microservices and monolithic systems can be implemented with Ballerina. Unlike with centralized integration middleware, all these integration scenarios are totally independent and can be developed, deployed and scaled independently.

Figure 1.10 : Micro-integration with Ballerina

Ballerina supports different styles of integration scenarios. As illustrated in figure 1.11:

Integration Service: Ballerina can be used to integrate some systems/services and expose the integration as a ‘service’. This is more or less a virtual/composite service that primarily focuses on the integration aspect rather than on core business logic.
Runnable Integrations: A given integration scenario may not be exposed as a service; instead, you run that scenario through an invoker. The invocation can be periodic or just a programmatic invocation of the integration scenario (similar to running a script). Ballerina allows you to run such integration scenarios by implementing them in the Ballerina main mode (similar to the main function of programming languages).
Integration service as a managed API: A Ballerina service can be easily converted to an API by including API Gateway related functionality such as security and throttling. This will be an extension plugged into the service invocation logic of Ballerina (not yet supported in the 0.8 release). Since a Ballerina service can be modeled through OpenAPI/Swagger, followed by the implementation logic through a sequence-diagram-like design, this will vastly improve the entire API design and implementation procedure.

Revolutionized Graphical and textual representation

With most conventional integration middleware, you can model a given integration scenario either graphically or textually. In current integration middleware, the graphical representation is based on a data-flow model, while the textual representation is modeled as a DSL (Domain Specific Language).

Data-flow-based graphical modeling doesn’t really simplify the representation of a complex integration scenario. It becomes really clumsy with integration scenarios that involve many services and systems (especially with microservices, where you will have to integrate/orchestrate dozens or hundreds of services).
For the textual representation of integration, most vendors use a Domain Specific Language (either an external or an internal DSL). Camel, Mule and WSO2 ESB/Synapse are popular implementations of DSL-based integration middleware. Although the original intention of using a high-level DSL is to simplify the implementation, when it comes to real-world integration scenarios, the DSLs are often not fluent and powerful enough to model them. The absence of key general-purpose programming language constructs has forced integration middleware vendors to come up with ‘expression languages’ (e.g. the Simple Expression Language of Camel), which are a kind of disconnected set of statements that you execute from the DSL.
As we have seen in the microservices implementations discussed earlier, partly because the existing DSL-based approach is not powerful enough, most organizations use general-purpose languages and platforms such as Java, Node.js and Groovy to implement modern integration logic (e.g. Netflix’s and PayPal’s approach of implementing service orchestration at the API gateway level).
Using general-purpose languages that are not optimized or designed for integration is overkill and an implementation nightmare. You need to take care of all the low-level details (such as connection and thread handling), and the integration scenarios that you implement with these languages are virtually impossible to represent graphically.

So, that’s why the next generation integration middleware needs to have a powerful graphical and textual programming language.

Ballerina uses an innovative textual and graphical programming syntax based on the sequence diagram metaphor for modeling integration scenarios. The graphical and textual syntaxes maintain parity and are meant to be 100% interchangeable.
Ballerina’s sequence diagram metaphor makes it easier to model an integration, as well as to understand an existing integration scenario (even for a non-technical person, a sequence diagram is quite easy to understand).
Ballerina offers a wide range of built-in connectors for various network protocols and for cloud APIs such as Twitter and Facebook (and in the future we plan to offer connectors for enterprise applications such as SAP and SaaS applications such as Salesforce). Hence, Ballerina encapsulates all the low-level integration logic (such as HTTP connection and protocol handling on the server and client side). This drastically reduces the time you spend on integrating multiple systems and services (compare this approach with using general-purpose languages such as Java or Node.js to implement integration or orchestration logic with various libraries).
Rather than spending time on writing code to plumb low-level integration points, Ballerina developers can focus on the actual business aspects of integration scenarios.

Orchestrations, Choreography and Multi-party interactions

Service orchestrations are getting more and more complex with the proliferation of fine-grained services. With such diverse integration use cases, we need to cater to scenarios that involve multi-party interactions, possibly running in parallel. The traditional graphical data flow models fall apart when representing these kinds of scenarios.

Sequence-diagram-based graphical modeling is quite scalable when it comes to modeling complex integration scenarios with multi-party interactions. For example, complex service orchestration logic is much simpler to illustrate using the sequence-diagram-based graphical representation of Ballerina than with the conventional data flow model.
Complex multi-party interactions with parallel (functional parallelism) and asynchronous execution can be easily represented with the Ballerina graphical syntax.
Asynchronous computations/integrations: In Ballerina, a worker (say, the default worker of a resource) can invoke a new worker (a parallel thread of execution), and the default worker can continue with its own work (conceptually this resembles the Java Future construct). At a given point, a worker (in this case, the default worker) can wait for and retrieve the result of the asynchronous computation/integration.
Fork-join: Ballerina offers a fork construct that allows you to replicate a message to any number of parallel workers and have them operate independently on their copies of the message. The join part of the fork statement allows you to define how the caller of fork waits for the parallel workers to complete. (Note: Standard sequence diagram notation has the ‘par’ notation to represent parallel execution. However, we didn’t use that in the Ballerina graphical syntax; rather, a parallel lifeline denotes a worker that runs in parallel with the other workers.)
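The two concurrency styles above map onto constructs that most general-purpose languages already have. The following Python sketch uses `concurrent.futures` as a stand-in. It is not Ballerina code, and the `enrich`, `audit` and `fork_join` names are made up for illustration. The first part mirrors the asynchronous worker (start it, keep working, collect the result later); the second mirrors fork-join with a "wait for all" join condition, where each worker receives its own copy of the message.

```python
import copy
from concurrent.futures import ALL_COMPLETED, ThreadPoolExecutor, wait


def enrich(message: dict) -> dict:
    message["enriched"] = True
    return message


def audit(message: dict) -> dict:
    message["audited"] = True
    return message


def fork_join(message: dict, workers) -> list:
    """Give each worker its own copy of the message, then join on all of them."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(w, copy.deepcopy(message)) for w in workers]
        wait(futures, return_when=ALL_COMPLETED)  # the "join all" condition
        return [f.result() for f in futures]


if __name__ == "__main__":
    order = {"id": 42}

    # Asynchronous invocation: start a worker, keep going, collect the result later.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(enrich, copy.deepcopy(order))
        local_result = order["id"] + 1    # the default worker continues its own work
        remote_result = pending.result()  # wait for the asynchronous result

    print(local_result, remote_result)
    print(fork_join(order, [enrich, audit]))
    print(order)  # unchanged, because the workers operated on copies
```

Note how much of this code is plumbing (executors, futures, explicit copies). That plumbing is what a language-level worker and fork construct is meant to hide.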

Support diverse integration requirements

As we have discussed earlier, Ballerina offers a wide range of server and client connectors that you can use to integrate with cloud services, data sources, network protocols, and proprietary and legacy applications.
(Connectors for protocols such as HTTP, HTTP2, WebSockets, JMS and File will be available in the 1.0 version as fully production-ready connectors, along with cloud connectors such as Salesforce, Twitter, Facebook, LinkedIn and so on.)
The Ballerina language itself offers various integration capabilities that you often find in standard Enterprise Integration Patterns (EIPs), and it also comes with Data Mapping/Type Mapping constructs (with both graphical and textual representations).
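As a reminder of what these capabilities look like in practice, here is a minimal Python sketch of two of them: a data mapping step that converts a backend-specific payload into a canonical type, followed by a content-based router (one of the classic EIPs). The `Order` type, the field names, the threshold and the destination names are all assumptions made up for this example; none of them come from Ballerina.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    amount: float
    region: str


def map_order(raw: dict) -> Order:
    """Data mapping: convert a backend-specific payload into a canonical type."""
    return Order(
        order_id=str(raw["id"]),
        amount=float(raw["total"]),
        region=raw["country"].upper(),
    )


def route(order: Order) -> str:
    """Content-based router: choose a destination from the message content."""
    if order.amount >= 1000:
        return "manual-review"
    return "eu-fulfilment" if order.region in {"DE", "FR"} else "global-fulfilment"


if __name__ == "__main__":
    incoming = {"id": 7, "total": "1500", "country": "de"}
    order = map_order(incoming)
    print(order, "->", route(order))
```

In a language designed for integration, both steps would be first-class constructs with a graphical counterpart rather than hand-written helper functions.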

Performance

Modern integration requirements often demand extremely low-latency messaging and high throughput from the integration middleware to withstand the ever-increasing growth of traffic. Also, with the popularity of Container architecture, the performance metrics of modern integration middleware have shifted to a totally different dimension. While throughput and latency are still very important metrics, startup time, memory footprint and CPU utilization/load average (during execution) have become extremely critical too.

From the ground up, Ballerina is built to achieve the optimum performance on various fronts.

The Ballerina runtime is extremely lightweight and fully container native, with around one second of startup time and a very low memory footprint.
The server connectors leverage cutting-edge technologies from the ground up to achieve high-performance network communication. For example, the HTTP transport, which is based on Netty, Disruptor and a pass-through message architecture, gives you maximum throughput with minimal latency (more than 5x faster than almost all the conventional integration middleware solutions out there).

Streamlined design, development, testing and deployment life cycle

Forrester recently defined Enterprise Integration as follows:

“Technology for developing, maintaining, testing, deploying, and governing interfaces between applications, machines, or databases” (source : Forrester TechRadar™: Integration Technologies 2015).

This definition clearly emphasizes the importance of not just developing integrations but also the other aspects of the life cycle of an integration scenario.
Conventional integration middleware more or less focused only on the development aspect, while maintenance, testing, deployment, version control and so on were expected to be handled outside the integration middleware.
Conventional integration middleware fosters a configuration-over-code approach to designing integrations. In practice, however, most integration scenarios end up in a clunky state with a humongous set of configurations. Moreover, because the approach is configuration based, most software development life cycle concepts cannot be applied directly to those configurations.
Therefore, as Forrester highlights in its Enterprise Integration definition, next-generation integration middleware has to cater to the entire development, testing, automation, deployment and troubleshooting life cycle.
Ballerina tries to overcome the shortcomings of the configuration-based integration development life cycle in the following ways.

Since Ballerina is a fully fledged programming language, it offers all the software development life cycle management features that you find in a programming language.
Packaging, sharing and executing

— A Ballerina program can consist of a number of Ballerina files, which you can organize into packages. You can also import external packages into your Ballerina program. Collections of Ballerina code can be packaged as a library so that the resulting package can be shared. The Ballerina repository is a collection of Ballerina libraries. You can also create a self-contained Ballerina archive for a Ballerina service (.bsz) or for a Ballerina main (.bmz).

— Each of these Ballerina archives can be executed through Ballerina (e.g: ballerina run service (filename | packagename | servicearchive)+ )

Ballerina composer

— Ballerina composer is the graphical modeling tool, which allows you to build an integration scenario using the sequence diagram based graphical representation.

— From the composer you can model, run, debug any integration scenario.
Ballerina tools also offer IDE plugins for IntelliJ IDEA, Atom, Vim, VS Code etc.

Testerina and Docerina

— Testerina is the test framework built for the Ballerina language. The key objective here is to unit test the integration logic that you write with Ballerina. Testerina also includes a mocking/emulating capability with which you can emulate the backend systems and services for your test scenarios.

— Docerina is the API documentation generator tool of the Ballerina language. With Docerina, you can document your Ballerina programs using the annotations dedicated for documentation.
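The mocking idea behind Testerina is the same one used in unit testing elsewhere: replace the real backend with an emulated one so that the integration logic can be tested in isolation. The sketch below shows that idea with Python's `unittest.mock`. It does not use the Testerina API, and the `product_summary` function and the shape of the backend response are hypothetical.

```python
from unittest.mock import Mock


def product_summary(product_id: str, backend) -> dict:
    """Integration logic under test: call a backend and reshape its response."""
    response = backend.get(f"/products/{product_id}")
    return {"id": product_id, "name": response["name"], "in_stock": response["qty"] > 0}


def test_product_summary_with_mocked_backend():
    # Emulate the backend instead of calling a real service.
    backend = Mock()
    backend.get.return_value = {"name": "laptop", "qty": 3}

    assert product_summary("p1", backend) == {"id": "p1", "name": "laptop", "in_stock": True}
    backend.get.assert_called_once_with("/products/p1")


test_product_summary_with_mocked_backend()
```

The test exercises only the mapping logic and checks that the backend was called with the expected path, which is the kind of check you would want for each resource in a micro-integration.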

Docker

— Ballerina offers native support for creating a Docker image out of a Ballerina archive (using ‘ballerina docker’).
Hence you can model any micro-integration scenario as a Ballerina archive, then create and run it as a Docker image.

Conclusion

Conventional integration middleware needs to change drastically and should morph into decentralized integration middleware that supports futuristic integration requirements. The practical application of Microservice Architecture still requires integration middleware, but only decentralized integration solutions fit into that architecture.

Ballerina is a huge step towards redefining next-generation decentralized integration middleware. The graphical and textual parity of the Ballerina language, which is based on the sequence diagram metaphor, will revolutionize the way you model integrations. Within the next couple of months, Ballerina 1.0 GA will be available as a fully fledged, production-ready programming language for building integrations.