September 6, 2024

AWS is the Linux of Cloud

I’ve been a Linux user for about 25 years. I first used it at university in the early 2000s, because the Computer Science labs ran Red Hat Linux. Ever since then, it’s been my default OS, my go-to tool whenever I needed to get something done. The terminal, or command line (CLI), had everything you needed to get things done. Even before smartphones, it was perhaps the first “there’s an app for that”: you could always find some tool or command that did what you needed. And that’s where the power of Linux really shines, together with open source: there are commands and tools for data processing, networking, security, databases, websites - you name it, Linux has it. Once you knew the basics - that cron schedules something to run, perhaps a bash script, which maybe uses grep and awk to filter some data that you then save to MySQL or email out - the world really was your oyster. And there wasn’t just one or two commands for a given requirement; there were usually many different commands and ways to complete a task.
I now primarily use macOS, and frequently use Ubuntu, as well as Raspbian on Raspberry Pi. I’ve also used BSD and Solaris in the past. So when I say Linux, I’m talking about the operating system as a whole, including the kernel, apps and tools. I’m not limiting this discussion to any specific Linux distribution, nor to Linux only, as it could equally apply to any UNIX OS. As long as it has a CLI with the usual array of built-in commands like grep, awk, etc., you’re good to go.

I started using AWS only a few years ago, with just a basic EC2 instance to run this blog. Over time, I’ve been exposed to more AWS services, and the more I used them, the more I got this nagging feeling that it was all strangely familiar. Like how easy it was to run a piece of code (Lambda function) on a schedule (EventBridge Scheduler), or transform some data (Glue), or search the logs (CloudWatch Logs) for something that happened. And really how easy it was to combine multiple AWS services to solve some niche problem.

There are many (better) authors, like Matt Asay, who write about AWS contributions to open source. But what I am talking about here is how using AWS feels much like the power of Linux, in that there are many different ways to complete a task: by using an AWS service or solution, or just by combining different AWS services. In the rest of this post, I am going to compare a few of the well-known Linux commands and use cases, and show you how they can be accomplished on AWS. This post may be useful to people in sysadmin or DevOps roles who are trying to figure out how to be as efficient on AWS as they are on Linux.

cron: Amazon EventBridge Scheduler

Cron on Linux is awesome, allowing you to schedule something to happen at some time. EventBridge Scheduler is a serverless scheduler that allows you to create, run, and manage tasks from one central, managed service. With EventBridge Scheduler, you can create schedules using cron and rate expressions for recurring patterns, or configure one-time invocations. Where EventBridge is really cool is centralization. On Linux, each user has their own crontab, so if you were looking for a particular cron job, you would need to look at root’s, then at every other user’s. With EventBridge it’s all in one central place, and it even has a GUI to manage it. You could combine EventBridge with AWS Batch to achieve even more.
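To make the comparison concrete, here is a minimal Python sketch of turning a classic five-field crontab schedule into the six-field `cron(...)` expression that EventBridge Scheduler expects. The function name `crontab_to_eventbridge` is made up for illustration, and it only covers simple cases: EventBridge adds a year field, requires a `?` in either day-of-month or day-of-week, and numbers weekdays differently from Linux cron, so this sketch rejects numeric weekdays rather than guess. It does not validate step or range syntax in the other fields.

```python
def crontab_to_eventbridge(entry: str) -> str:
    """Convert a simple 5-field crontab schedule to an EventBridge cron() expression."""
    fields = entry.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields: minute hour day-of-month month day-of-week")
    minute, hour, dom, month, dow = fields

    # Linux cron counts Sunday as 0 (or 7); EventBridge counts Sunday as 1.
    # Rather than silently shift numbers, only accept names such as MON-FRI.
    if any(ch.isdigit() for ch in dow):
        raise ValueError("use weekday names (e.g. MON-FRI), not numbers")

    # EventBridge needs '?' in one of day-of-month / day-of-week.
    if dow == "*":
        dow = "?"
    elif dom == "*":
        dom = "?"
    else:
        raise ValueError("cannot set both day-of-month and day-of-week")

    # The trailing '*' is the extra year field.
    return f"cron({minute} {hour} {dom} {month} {dow} *)"


if __name__ == "__main__":
    print(crontab_to_eventbridge("0 2 * * *"))         # nightly at 02:00
    print(crontab_to_eventbridge("30 9 * * MON-FRI"))  # weekdays at 09:30
```

The resulting string is what you would paste into the schedule expression field when creating a schedule.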

shell scripting: AWS Lambda, Glue

Bash (or zsh, or whatever shell you’re using) is crazy powerful. You can include Python, Perl, PHP, or other languages in this category as well. They integrate natively with Linux, letting you check for files, run commands, whatever. Shell scripts are often used as glue code, to get some data into a DB, or process a new file, etc. And that’s why Lambda feels like a bash script to me. Bash itself can run on Lambda through a custom runtime. Lambda allows you to natively call into AWS and do whatever needs to be done. E.g. you can automatically process new files in an S3 bucket, or do something whenever an EC2 instance starts or stops. It can automate almost anything in AWS.
AWS Glue makes it easy to write or autogenerate extract, transform, and load (ETL) scripts, in addition to testing and running them.
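As a rough illustration of the "process new files in an S3 bucket" case, here is a minimal sketch of a Python Lambda handler that reads an S3 "object created" notification. The handler name, the bucket and key in the sample event, and the return shape are all placeholders of mine; a real function would do its actual work where the comment indicates.

```python
import urllib.parse


def handler(event, context):
    """Entry point that Lambda invokes with an S3 event notification."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the event (spaces become '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Real work (parse the file, load it into a database, ...) would go here.
        processed.append(f"s3://{bucket}/{key}")
    return {"processed": processed}


if __name__ == "__main__":
    # Hypothetical event, trimmed to the fields the handler reads.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-bucket"},
                    "object": {"key": "incoming/daily+report.csv"}}}
        ]
    }
    print(handler(sample_event, None))
```

Running the file locally with the sample event is a quick way to check the parsing before wiring the function to a bucket.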

pipes: Step Functions, EventBridge Pipes

Pipes are a key component of Unix. Pipes pass text streams from one process to the next. This allows users to execute a series of commands where the output of one command becomes the input of the next. AWS Step Functions is a workflow-focused take on Unix pipes, and it includes a visual workflow builder. Being an Amazon product, Step Functions integrates with most AWS services, including support for calling around 10,000 of the platform’s API endpoints.
EventBridge Pipes allows you to create point-to-point integrations between event producers and consumers with optional transformation, filtering, and enrichment steps.
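To show how the pipe analogy carries over, here is a small Python sketch that chains a list of Lambda functions into a linear Step Functions definition in Amazon States Language, where each Task state's output becomes the next state's input by default. The helper `build_pipeline`, the state names, and the ARNs (including the account ID) are hypothetical; this is one possible shape, not the only way to model a pipeline.

```python
import json


def build_pipeline(steps):
    """Chain (state_name, lambda_arn) pairs into a linear state machine definition."""
    if not steps:
        raise ValueError("need at least one step")
    states = {}
    for i, (name, arn) in enumerate(steps):
        state = {"Type": "Task", "Resource": arn}
        if i + 1 < len(steps):
            # Like '|': hand this state's output to the next state.
            state["Next"] = steps[i + 1][0]
        else:
            # The last command in the pipeline ends the execution.
            state["End"] = True
        states[name] = state
    return {
        "Comment": "Rough equivalent of: grep | sort | uniq",
        "StartAt": steps[0][0],
        "States": states,
    }


if __name__ == "__main__":
    # Placeholder ARNs; substitute your own functions.
    prefix = "arn:aws:lambda:us-east-1:123456789012:function:"
    definition = build_pipeline([
        ("Grep", prefix + "grep-errors"),
        ("Sort", prefix + "sort-lines"),
        ("Uniq", prefix + "dedupe-lines"),
    ])
    print(json.dumps(definition, indent=2))
```

The printed JSON is the kind of document you would supply as the state machine definition.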

logs: CloudTrail and CloudWatch Logs

Operating system logs provide a wealth of diagnostic information about your computers, and Linux is no exception. Everything from kernel events to user actions is logged by Linux, allowing you to see almost any action performed on your servers. I like to think CloudTrail is like syslog, where actions taken by a user, role, or an AWS service are recorded as events in CloudTrail. Based on an event recorded in CloudTrail, you can trigger a Lambda function.
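One way to wire that up (an assumption on my part, since there are several routes) is an EventBridge rule whose event pattern matches CloudTrail API-call events and targets a Lambda function. The sketch below shows such a pattern for EC2 `TerminateInstances` calls, plus a deliberately simplified local matcher, `matches`, that only understands exact-value lists and nested objects. The matcher is my own toy helper for illustration, not EventBridge's real matching engine.

```python
import json

# Event pattern: EC2 TerminateInstances calls recorded by CloudTrail.
PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["TerminateInstances"],
    },
}


def matches(pattern, event):
    """Toy matcher: every pattern key must exist and hold one of the listed values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested object: recurse into the matching part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical event, trimmed to the fields the pattern inspects.
    sample = {
        "source": "aws.ec2",
        "detail-type": "AWS API Call via CloudTrail",
        "detail": {"eventSource": "ec2.amazonaws.com",
                   "eventName": "TerminateInstances"},
    }
    print(json.dumps(PATTERN, indent=2))
    print(matches(PATTERN, sample))
```

It is the cloud version of tailing syslog and reacting when a particular line shows up.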
For application logs, kinda like logs in /var/log/, CloudWatch Logs enables you to centralize the logs from all of your systems, applications, and AWS services that you use, in a single, highly scalable service. You can then easily view them, search them for specific error codes or patterns, filter them based on specific fields, or archive them securely for future analysis. CloudWatch Logs enables you to see all of your logs, regardless of their source, as a single and consistent flow of events ordered by time.

Mapping other common Linux use cases to AWS

Where you used to host an FTP server on Linux, use AWS Transfer Family on top of S3 instead, which can then invoke a Lambda function on each file upload.
Instead of hosting a static website on a Linux web server, use S3. Instead of running Ghost on Linux, see all the options for running Ghost on AWS.
Instead of using Ansible or Chef to automate and patch your Linux fleet, use AWS Systems Manager.

The open source way of building software

In the world of software development, there are two distinct approaches that have shaped the industry - the "cathedral" and the "bazaar" models. These two models, as outlined in the influential essay by Eric S. Raymond, offer a fascinating contrast in how software is built and delivered. According to Raymond, the bazaar model is a more effective approach for complex software projects, as it taps into the creativity and problem-solving abilities of a larger community. In my simple-minded way, I like to think that Amazon’s Two-Pizza Team structure is closer to the bazaar style, the way the Linux kernel was built, because these small teams have complete independence in how they build and operate. And as in the open source and Linux world, that may lead to many different services that can do similar things, which is a good thing in my book. Because like Linux, AWS provides building blocks, allowing you to organise them in ways that solve your unique challenges.