Choreography and Orchestration using AWS Serverless
Easy and practical example that shows when to use choreography and orchestration as modes of interaction in a microservices architecture, running on AWS Serverless
an AWS Lambda function routinely polls an API for the loadshedding stage
another Lambda function pulls the schedule for that stage
all the data is stored in DynamoDB
another Lambda function then routinely sends out Telegram notifications based on the stage and the schedule
EventBridge rules schedule the functions to run at specific times
Altogether, it looks like this:
With my other bots (based on this sample), I’ve shown that AWS Serverless is the best place to run a bot, due to low cost and simplicity. So when building this loadshedding bot, I decided to avoid the Lambda monolith - a fat function containing all the above logic in a single function - and chose to split the functionality over 3 separate functions, essentially each its own microservice, albeit with a shared DynamoDB table. This way, I could have specific EventBridge rules to schedule the functions to run at the intervals I needed, and logging and debugging were easier with simpler functions. Deployment was really easy using AWS SAM - all it takes is a sam build && sam deploy each time I need to make a change.
Now to the point I really want to discuss: how would I get these different Lambda functions to coordinate and work together? How would the schedule function know when the loadshedding stage had changed (which could happen a few times a day), and how would the notification function know when the stage, the schedule, or both had changed, in order to send out a new Telegram notification? I also didn’t want the different functions to be constantly polling the DynamoDB table, as that would just increase costs for both Lambda and DynamoDB. There were two options available: choreography and orchestration. For my initial use-case, I was going after simplicity, so choreography made more sense. Later on, for a different use-case, I needed all the functions to run in a specific order, so I used orchestration. Let’s see how choreography and orchestration can be achieved on AWS.
Choreography
In choreography, every service works independently. There are no hard dependencies between them, and they are loosely coupled only through shared events. Each service listens for events that it’s interested in and does its own thing. This follows the event-driven paradigm. And since Lambda itself is inherently event-driven, the choreography approach has become very popular in the serverless community.
When the loadshedding stage changed, the schedule function needed to be aware, and then the notification function needed to run. I was after simplicity after all, and since each function was updating DynamoDB, I used DynamoDB Streams to invoke the other Lambda functions, i.e. an update to DynamoDB emits an event to Lambda. With this, the Lambda functions don’t need to poll DynamoDB: a function is invoked only when an update is made. So any of the functions can update the loadshedding stage and schedule, and the other functions will then be notified of this change and process it. And with AWS SAM, integrating DynamoDB Streams with Lambda is really easy.
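To make the stream-driven flow concrete, here is a minimal sketch of what a function on the receiving end of a DynamoDB stream might look like. It is not the bot's actual code: the `stage` attribute name and the helper names are hypothetical, and it assumes the stream is configured to include both old and new images of each item.

```python
from typing import Any, Dict, List, Optional


def _stage_from_image(image: Optional[Dict[str, Any]]) -> Optional[int]:
    """Read the (hypothetical) numeric 'stage' attribute from a stream image."""
    if not image or "stage" not in image:
        return None
    # Stream images use DynamoDB's typed JSON, e.g. {"stage": {"N": "4"}}.
    return int(image["stage"]["N"])


def changed_stages(event: Dict[str, Any]) -> List[int]:
    """Return the new stage for every record where the stage actually changed."""
    changes = []
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # ignore deletions
        data = record.get("dynamodb", {})
        old = _stage_from_image(data.get("OldImage"))
        new = _stage_from_image(data.get("NewImage"))
        if new is not None and new != old:
            changes.append(new)
    return changes


def handler(event, context):
    """Lambda entry point, invoked with a batch of stream records."""
    stages = changed_stages(event)
    for stage in stages:
        # Placeholder for the real work, e.g. fetching the schedule for this stage.
        print(f"Stage changed to {stage}")
    return {"changed": stages}
```

Comparing the old and new images means the function only reacts when the stage really changed, not on every write to the table.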
I could have used SQS as a queue to capture all of these events, or EventBridge as an event bus, but using DynamoDB Streams was just the easiest in this case. Either way, I managed to send events between the different functions, and they acted on them when required. And for the most part, it worked really well. There were a few times that the upstream loadshedding API was down, which would have required me to write some custom retry logic. Instead, I simply relied on the EventBridge schedule to call it again later.
Then in the last few days, I realised I needed a new capability: the electricity supply company was providing loadshedding updates on Twitter due to emergency failures, and I wanted the ability to invoke all the existing functions on an ad hoc basis, but in a specific order: get the latest loadshedding stage, then get the schedule for that stage, and then post a notification. If I simply re-used the existing architecture, I would be invoking one Lambda function from another, which is generally frowned upon. I would also need to build that custom retry logic to cater for any failures. This made me realise that for this specific use-case, I needed to orchestrate and coordinate the different functions to run in the order required.
Orchestration
In orchestration, there is a controller (the ‘orchestrator’) that controls the interaction between services. It dictates the control flow of the business logic and is responsible for making sure that everything happens on cue. This follows the request-response paradigm.
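As a rough illustration of the idea, the sketch below shows a controller in plain Python that runs three steps in a fixed order and retries each one. The function names (`with_retries`, `orchestrate`) and the step callables are hypothetical stand-ins, not part of the bot; the sketch only shows the control flow that an orchestrator owns.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(step: Callable[[], T], attempts: int = 3, delay_seconds: float = 0.0) -> T:
    """Call step until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(delay_seconds)
    raise ValueError("attempts must be at least 1")


def orchestrate(get_stage, get_schedule, notify, attempts: int = 3):
    """Run the three steps strictly in order, passing results along."""
    stage = with_retries(get_stage, attempts)
    schedule = with_retries(lambda: get_schedule(stage), attempts)
    return with_retries(lambda: notify(stage, schedule), attempts)
```

The individual steps know nothing about each other; only the controller knows the order and the retry policy.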
From Telegram itself, using a Telegram command, I wanted users to be able to instruct the bot to pull the latest loadshedding info. It would then call the stage API, retry if required, then call the schedule API, retry if required, then send the Telegram notification. To orchestrate all of this, I used AWS Step Functions, which allowed me to build a serverless workflow that takes care of retries without custom code. I used the Workflow Studio to visually design this workflow using drag and drop:
and then exported the JSON definition into the AWS SAM template for deployment to AWS. Now I can make sure each function runs in order, with retries, with no custom code and no changes to the existing Lambda functions.
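For a feel of what such a definition contains, the sketch below builds a comparable three-step state machine in the Amazon States Language and prints it as JSON. It is not the definition exported from Workflow Studio: the state names, retry values and placeholder ARNs are assumptions, and Workflow Studio's own export may invoke Lambda through a different resource form.

```python
import json
from typing import Any, Dict, Optional


def task_state(function_arn: str, next_state: Optional[str] = None) -> Dict[str, Any]:
    """Build a Task state that calls a Lambda function and retries on any error."""
    state: Dict[str, Any] = {
        "Type": "Task",
        "Resource": function_arn,
        "Retry": [
            {
                "ErrorEquals": ["States.ALL"],  # retry on every error type
                "IntervalSeconds": 5,           # assumed value
                "MaxAttempts": 3,               # assumed value
                "BackoffRate": 2.0,             # assumed value
            }
        ],
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True  # last step in the workflow
    return state


def build_definition(stage_arn: str, schedule_arn: str, notify_arn: str) -> Dict[str, Any]:
    """Chain the three functions: stage, then schedule, then notification."""
    return {
        "Comment": "Ad hoc loadshedding update (illustrative sketch)",
        "StartAt": "GetStage",
        "States": {
            "GetStage": task_state(stage_arn, "GetSchedule"),
            "GetSchedule": task_state(schedule_arn, "SendNotification"),
            "SendNotification": task_state(notify_arn),
        },
    }


if __name__ == "__main__":
    # Placeholder ARNs; substitute the real function ARNs.
    definition = build_definition(
        "arn:aws:lambda:REGION:ACCOUNT_ID:function:get-stage",
        "arn:aws:lambda:REGION:ACCOUNT_ID:function:get-schedule",
        "arn:aws:lambda:REGION:ACCOUNT_ID:function:send-notification",
    )
    print(json.dumps(definition, indent=2))
```

The ordering lives in the `Next` fields and the retry policy in the `Retry` blocks, which is why the Lambda functions themselves need no changes.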
Hopefully this easy but practical example showed when to use choreography and orchestration as modes of interaction in a microservices architecture.