Building a serverless ingestion pipeline to decouple a front-office application from the back-office.
We recently had to build a front-office responsive web application, exposing back-office data to the end-customer. It seemed like a simple web-based application that could be built with our favorite(?) single-page application framework.
However, looking a bit deeper, some non-functional requirements popped up that we needed to take into account:
Taking the above non-functional requirements into account made us move towards an autonomous bubble style architecture. The autonomous bubble is a pattern described in domain-driven design; the original article is located here. In short, it allows building a new system based on the domains of other systems, while achieving as much isolation as possible from those other (often legacy) systems.
This means that our front-office application will have its own persistent storage for the data to be shown to the end-customers. Data should be available in that persistent storage whenever a customer requests it; we want to avoid having to request data from the back-office systems in real time.
The solution proposed in the autonomous bubble pattern is the so-called synchronizing anti-corruption layer. This component takes on the responsibility of synchronizing the data between the back-office systems and the front-office application. This synchronization is typically done by detecting changes in the back-office systems, which then need to be translated and inserted into the front-office application. The more frequently we can run this synchronization, the faster modifications in the back-office systems become visible in the front-office application.
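The translation step is where the anti-corruption happens: legacy naming and formatting stay on the back-office side of the boundary. A minimal sketch of such a mapping, with entirely hypothetical field names (`CUST_NR`, `ADDR_LINE_1`, ... stand in for whatever the legacy system exposes):

```python
# Sketch of the translation step in the anti-corruption layer.
# All field names below are illustrative, not the real back-office schema.

def translate_customer(backoffice_record: dict) -> dict:
    """Map a raw back-office customer record onto the front-office model,
    shielding the front office from legacy naming and formatting quirks."""
    return {
        "customerId": str(backoffice_record["CUST_NR"]).strip(),
        "name": backoffice_record["CUST_NAME"].title(),
        "address": ", ".join(
            part for part in (
                backoffice_record.get("ADDR_LINE_1"),
                backoffice_record.get("ADDR_LINE_2"),
            ) if part
        ),
    }
```

Keeping this mapping a pure function makes it trivial to unit-test, which pays off as the legacy schema evolves.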
In short, this pattern enables a simple architecture for the front-office application: a single page application with a (java) backend exposing REST services for fetching the data stored in a database. The technical complexity of integrating front-office with back-office is isolated in the synchronizing anti-corruption layer.
Setting up a synchronizing anti-corruption layer is quite complex, both from a development perspective and from an operational perspective. This automatically triggers the reflex of looking at products, components, … that we could (re)use to make this simpler (a.k.a. build vs. buy).
AWS has a couple of serverless PaaS solutions that seem to fit the purpose. Specifically, we were looking at:
Amazon API Gateway is a fully managed service that makes it easy to create, publish, maintain, monitor and secure APIs. We can use this service to expose the REST APIs of the different back-office systems behind our own domain (avoiding CORS issues when accessed from the front-office application). We can secure those APIs using API keys and protect the back-office applications from excessive load using throttling. We could even add caching if we wanted to.
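The API key and throttling setup can be expressed declaratively. A CloudFormation sketch (resource names are illustrative, and the rate limits are placeholders to be tuned to what the back office can handle):

```yaml
# Illustrative CloudFormation fragment: an API key plus a usage plan
# that throttles calls to the proxied back-office API.
BackOfficeUsagePlan:
  Type: AWS::ApiGateway::UsagePlan
  Properties:
    ApiStages:
      - ApiId: !Ref BackOfficeProxyApi
        Stage: prod
    Throttle:
      RateLimit: 50      # steady-state requests per second
      BurstLimit: 100    # short burst allowance
FrontOfficeApiKey:
  Type: AWS::ApiGateway::ApiKey
  Properties:
    Enabled: true
BackOfficeUsagePlanKey:
  Type: AWS::ApiGateway::UsagePlanKey
  Properties:
    KeyId: !Ref FrontOfficeApiKey
    KeyType: API_KEY
    UsagePlanId: !Ref BackOfficeUsagePlan
```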
With AWS Lambda we can run code without having to provision or manage servers. We can leverage this service both for reading the data from the back-office systems and for writing the data to the front-office system. Using AWS Lambda provides us with a serverless, monitored, auto-scaled and pay-per-use solution, letting us focus on the actual source code needed to read and write the data. AWS Lambda supports a multitude of triggers for running your code, amongst others cron-like schedule expressions that let us poll for changes in the back-office systems at regular intervals.
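The scheduled "pull changes" function can stay small if the AWS plumbing is kept behind thin interfaces. A sketch of the core loop, with the back-office client, stream writer and checkpoint store injected (their interfaces are assumptions for illustration, which also keeps the logic testable without AWS):

```python
# Sketch of the scheduled "pull changes" lambda body (triggered by a
# cron-like rule). Collaborator interfaces are illustrative assumptions.

def pull_changes(backoffice, stream, checkpoint_store):
    """Fetch back-office changes since the last sync and push them downstream."""
    since = checkpoint_store.get_last_synced()     # e.g. an ISO-8601 timestamp
    changes = backoffice.changes_since(since)      # poll the back-office REST API
    for change in changes:
        stream.put(change)                         # forward onto the data stream
    if changes:
        # Only advance the high-water mark once everything was pushed.
        checkpoint_store.set_last_synced(changes[-1]["modifiedAt"])
    return len(changes)
```

The actual Lambda handler would merely wire real boto3-backed implementations into this function.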
Amazon Kinesis is a massively scalable and durable real-time data streaming service. Typical use cases are handling log data, handling event data and enabling analytics. In our case, modifications in the back-office systems act as events that need to be propagated towards the front office. By pushing modifications onto a data stream, we can trigger the AWS Lambda functions responsible for pushing these changes to the front-office system. We can use the number of shards of the Kinesis data stream to control the concurrency of the consuming lambdas.
We need to keep track of the latest changes that have been synced. This will enable us to resume synchronization on the next invocation of our ‘pull changes’ lambda. For this purpose, we use Amazon DynamoDB, which is very easy to integrate with AWS Lambda: simply applying the correct permissions and using the AWS SDK in your lambda will do the trick.
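The checkpoint itself boils down to reading and writing a single high-water mark item. A sketch (table and attribute names are illustrative); the `table` argument follows the boto3 DynamoDB `Table` interface (`get_item`/`put_item`), so a stub can stand in during tests:

```python
# Illustrative checkpoint item: one row per sync job, keyed by a sync id.
SYNC_KEY = {"syncId": "backoffice-customers"}

def get_last_synced(table, default="1970-01-01T00:00:00Z"):
    """Read the timestamp of the last successfully synced change,
    falling back to a default on the very first run."""
    item = table.get_item(Key=SYNC_KEY).get("Item")
    return item["lastSynced"] if item else default

def set_last_synced(table, timestamp):
    """Record the new high-water mark after a successful sync run."""
    table.put_item(Item={**SYNC_KEY, "lastSynced": timestamp})
```

In the real lambda, `table` would be `boto3.resource("dynamodb").Table(...)` with the matching IAM permissions applied.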
In order to configure monitoring, log aggregation and alerting we used Amazon CloudWatch. All aforementioned AWS services integrate automatically with CloudWatch, making it easy and intuitive to set this up. With the ability to define custom dashboards and alerts (e-mail and SMS), we had all the functionality needed to manage this solution.
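As one concrete example, an alarm on the sync lambda's error metric can be declared alongside the rest of the stack. A CloudFormation sketch (function and topic names are illustrative):

```yaml
# Illustrative fragment: alert an SNS topic (wired to e-mail/SMS)
# whenever the "pull changes" lambda reports an error.
PullChangesErrorAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    Namespace: AWS/Lambda
    MetricName: Errors
    Dimensions:
      - Name: FunctionName
        Value: !Ref PullChangesFunction
    Statistic: Sum
    Period: 300
    EvaluationPeriods: 1
    Threshold: 1
    ComparisonOperator: GreaterThanOrEqualToThreshold
    AlarmActions:
      - !Ref SyncAlertTopic
```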
Using AWS services, we were able to simplify the solution and mostly focus on the complexities specific to our problem domain. The actual front-office application has a simple design and was therefore easy to build, while the synchronizing anti-corruption layer was reduced to creating some lambda functions and configuring a couple of AWS services. All AWS services are designed with operations and manageability in mind, which we could use to our advantage to build a monitoring solution quickly and efficiently.
Want to know more? Feel free to contact us, always happy to discuss.