This is a proposal I wrote earlier this year, to the powers that be, to motivate for Hybrid Cloud Infrastructure to serve multiple applications in multiple regions across Africa.
Our current infrastructure – including hardware: servers, storage, networking and software: OS, virtualisation, tools - together with the applications that run on it, allows us to serve our customers. In order for us to be flexible and efficient, and scale to the demands of our customers, this infrastructure needs to meet the requirements of the service and applications, and needs to be located appropriately, according to the location it is serving.
What this post will show is that, due to our current Infrastructure limitations, this is becoming increasingly difficult, and is hindering our ability to serve our customers and their increasing desire for digital services. More often than not, our ability to roll out new services is hampered by a lack of infrastructure that is flexible and agile. Cloud is a solution that provides Infrastructure that is an enabler for us. This document puts together an overview of the current situation, makes a business case for, and paves a roadmap to, using the Cloud for Group-wide Platforms.
We will start with a background to how Infrastructure has evolved over time, and where the market is
Then we list the needs and requirements for modernised Infrastructure, provide details on solutions, and make proposals for the way forward. There are a few planned Platform roll-outs in 2018, and it is clear that we cannot rely on the existing Infrastructure. In order to have quick deployments and service roll-out, and reduced costs through automation, we require modernised Infrastructure that is standardised across the landscape, can scale easily, is automated, and is integrated into public Cloud.
The key requirement is for a Hybrid Cloud, with integration to public Cloud. The proposal includes these key elements of a Managed Hybrid Private Cloud
Cloud is an extremely large topic, and can cover many use-cases and areas of key interest and importance:
• SaaS, like Office365, that is primarily for internal/staff consumption
• Network-based, that is generally discussed with NFV and SDN.
• Ad-Hoc Cloud services, that any department in any region could use. This will include discussions around internal governance regarding the use of these services.
o Lack of floor space or power in the Data Center
o Lack of network ports and equipment
o Lack of sufficient network bandwidth
o The most difficult is the integration into the network. When we later talk about virtualisation, we will see that it does assist to reduce issues to some extent, but then still suffers from the same issues of network integration, and manual installation and provisioning of OS and applications.
• Mixed hardware landscape: Hardware from multiple vendors is used across different services. This results in:
o Support agreements with each vendor to maintain and support their hardware
o Lack of consolidation of Opex, and less ability to pool resources (spare parts, etc)
o Inconsistency: some applications may not be approved to run on certain hardware
• Less buying power, as we can't consolidate and buy in bulk.
• Lack of standards across Operations
• Vendor lock-in: Vendors typically say that their application will only run on their hardware.
o This also means that when it comes time to expand or decommission, our options are limited to compatibility
• Little or no sharing of resources:
o servers are running idle
o Complete DR sites, fully paid for, sitting idle, and sometimes never used throughout their life-cycle.
• Very low rate of virtualisation. The timeline for deployment is very long due to the huge amount of manual work required, which is caused by it being non-virtualised.
Before we present the proposal, let us step back and see how Infrastructure has evolved and progressed over time
In order for us to efficiently deliver on these, and to modernise the existing portfolio of services, we require infrastructure that is:
• quick to provision,
• standardised across the Regions – we simply cannot progress if we have to go to each Region, and request VMs. It will be all different, and take too long
• Automated – most things should be template driven, choosing from readily available services, and automating all custom jobs
• Flexible – supports different types of workloads
We are not an infrastructure company – we do not have the time and skills to build and maintain Cloud infrastructure – rather we need flexible infrastructure that will allow us to focus on our core business. In other words, reduce the undifferentiated heavy lifting associated with managing infrastructure and focus on the activities associated with building the products and services that our brand is known for.
Virtualisation is a key ingredient of modern Infrastructure. It builds on the basic premise that not all Infrastructure capacity is used all the time by applications, and thus, by aggregating and pooling the capacity, much more work can be done on the same number of servers. By over-subscribing the allocation of computing and storage resources, we can run more workloads on the same amount of Infrastructure.
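To make the over-subscription idea concrete, here is a minimal arithmetic sketch. The host count, core count, VM size and the 3:1 vCPU-to-core ratio are all hypothetical numbers chosen for illustration, not measurements from our environment.

```python
def vm_capacity(hosts, cores_per_host, vcpus_per_vm, oversubscription_ratio):
    """Number of VMs a host pool can carry at a given vCPU-to-core ratio."""
    schedulable_vcpus = hosts * cores_per_host * oversubscription_ratio
    return int(schedulable_vcpus // vcpus_per_vm)


# Hypothetical pool: 10 hosts with 32 cores each, VMs sized at 4 vCPUs.
without_oversubscription = vm_capacity(10, 32, 4, oversubscription_ratio=1.0)
with_oversubscription = vm_capacity(10, 32, 4, oversubscription_ratio=3.0)

print(without_oversubscription, with_oversubscription)  # 80 240
```

The right ratio depends on how idle the workloads really are, so it should be treated as a tunable assumption rather than a fixed rule.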
There is quite a lot of overhead that goes into installing new hardware, that involves procurement, finding sufficient rackspace, power and network, installing into the data center, integration into the network, and then the mundane task of installing and configuring the hardware, OS and application. Virtualisation removes the need to install dedicated hardware per application.
Once the base Hypervisors are installed, only VMs need to be created per application. So just by having virtualized infrastructure, we have saved the time, cost and effort required to install dedicated hardware. However, once the VMs are created, the OS and application must still be installed. This too is quite a burdensome and time-consuming process that requires dedicated teams to install the application and manage the OS. That is why it is vital to use pre-built templates and automation to efficiently configure new VMs. This means that setting up new platforms and applications, especially Group-wide platforms that use the same application installed in multiple locations, can and should be automated to ensure that they are consistently installed in the same way.
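As a rough illustration of what "template driven" could mean for a Group-wide platform, the sketch below stamps out one VM specification per region from a single base template. The template fields, the region codes and the `render_vm_specs` helper are all hypothetical. A real deployment would hand these specs to whatever provisioning tool is in use.

```python
from copy import deepcopy

# Hypothetical base template shared by every region.
BASE_TEMPLATE = {
    "os_image": "linux-base",
    "vcpus": 4,
    "ram_gb": 16,
    "packages": ["monitoring-agent"],
}


def render_vm_specs(template, app_name, regions, overrides=None):
    """Return one VM spec per region, identical except for explicit overrides."""
    overrides = overrides or {}
    specs = []
    for region in regions:
        spec = deepcopy(template)          # never mutate the shared template
        spec.update(overrides.get(region, {}))
        spec["name"] = f"{app_name}-{region}"
        spec["region"] = region
        specs.append(spec)
    return specs


specs = render_vm_specs(BASE_TEMPLATE, "billing", ["za", "ke"], {"ke": {"ram_gb": 32}})
for spec in specs:
    print(spec["name"], spec["os_image"], spec["ram_gb"])
```

Because every region starts from the same template, any difference between sites has to be declared explicitly as an override, which is what keeps the installations consistent.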
Cloud platforms come with significant tooling and automation tools that simplify and manage the layers above virtualisation.
Cloud services are readily available to anybody on the internet, to individuals and corporates alike. They require very little financial investment, with consumption on a pay-as-you-go model, and no tie-in to long-term contracts. Thus its appeal is very broad, and because it's so easy to use, consume and pay for, it has already entered our company, knowingly or unknowingly. We are already using some Cloud SaaS like Office365 to serve email and collaboration tools to all staff. In some cases, we have seen it being used by Marketing departments, via vendors that have certain applications that are cloud hosted.
There are very likely to be other pockets within our company using Cloud-based services, either directly, or indirectly via vendors. Thus Cloud has brought shadow IT into the mainstream: while enabling quick time to market and flexibility, it also raises questions around visibility, governance and data loss.
Some of our partners are already using Cloud, where their solutions are completely Cloud based, which opens the same questions as above regarding data and governance. What's strange as well is that, by using Cloud, they have far more agility and scale, which cannot be matched by the infrastructure that we have.
I am seeing a growing number of customers that, once they gain significant experience running hybrid applications, adopt a cloud-first policy. This encouraging paradigm reverses the burden of proof. Rather than having to prove how a project can be implemented in the cloud, the organization must prove why it can’t. This is often coupled with a predisposition to use hosted SaaS solutions for common corporate back office functions, such as HRIS, e-mail, and collaboration systems.
This brings us now to the definition of Cloud. So far, we have established the effort and time that go into managing physical, un-virtualised servers. We have seen the value of adding virtualisation. However, virtualisation is not enough, as it means that the VM and layers above still need to be managed.
However, the key thing to understand is that virtualisation is not Cloud. Cloud computing is the delivery of servers, storage, databases, networking, software, and analytics, over a network. Cloud computing is more than just technology, it’s a mindset shift on how to deploy and manage infrastructure and applications.
There are three types of Cloud deployments:
The biggest CSPs are
• Google – GCP:
• Amazon – AWS:
• Microsoft – Azure:
• IBM – Bluemix
• Oracle
• EOH
Each of the Cloud deployments can offer the different types of Cloud services described previously: IaaS, PaaS and SaaS
Let's be clear about what is under scope of this document: Software-as-a-Service (SaaS) refers to applications that are hosted in the cloud, where end users only interact with them by using the service. Examples of SaaS are Gmail, Office365, Salesforce, etc. We do not play in this space. We need to build and configure applications that are integrated into the rest of our eco-system, and are therefore mainly interested in IaaS and even more in PaaS.
Let's take an example of how a particular service can be seen across the layers:
Microsoft Exchange, AD, Sharepoint are basic but critical services, and here are the options at each layer:
• SaaS: With Office365, you can purchase Azure’s Exchange and AD capabilities.
o It auto-scales for the number of users, is available globally, updates are automated and hidden to users
o Additional capabilities: It comes with a range of other capabilities: Teams for chat/collaboration, Planner for task planning, etc.
o Management: No Operations team is required (besides basic Admin).
o SLA: There is a guarantee of 99.x% uptime for everything.
• PaaS: An Enterprise can purchase VMs, and using the pre-provisioned templates, it can come installed with Exchange and AD.
o Additional capabilities: It won't include other capabilities like Teams and Planner
o Management: An Exchange operations team will be required to manage the software, but the OS and layers below are auto-managed.
o SLA: The VMs are guaranteed to be up, but not the applications.
• IaaS: An Enterprise can purchase VMs, and then will need to install and manage Exchange and AD.
o Additional capabilities
o Management: The full life-cycle of Operations teams will be required: Exchange, OS teams, etc. Updates and upgrades will have to be tested and executed.
o SLA: The VMs are guaranteed to be up, but not the applications.
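The Exchange example above can be condensed into a small lookup of who manages which layer under each model. This is only a sketch of the comparison as described here: the three layer names and the dictionary structure are my own simplification, not a vendor definition.

```python
# Simplified stack, top to bottom (hypothetical layer names).
LAYERS = ["application", "operating system", "virtual machine"]

# Layers the Enterprise itself must operate under each service model.
CUSTOMER_MANAGED = {
    "SaaS": [],                                   # only basic admin remains
    "PaaS": ["application"],                      # e.g. an Exchange operations team
    "IaaS": ["application", "operating system"],  # full life-cycle of operations
}

# Whether the provider's uptime guarantee extends to the application itself.
SLA_COVERS_APPLICATION = {"SaaS": True, "PaaS": False, "IaaS": False}


def provider_managed(model):
    """Layers left to the cloud provider for the given service model."""
    return [layer for layer in LAYERS if layer not in CUSTOMER_MANAGED[model]]


for model in ("SaaS", "PaaS", "IaaS"):
    print(model, "provider manages:", provider_managed(model),
          "| application SLA:", SLA_COVERS_APPLICATION[model])
```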
Therefore, IaaS and particularly PaaS are what is relevant here. A lot of people initially think Cloud, and particularly IaaS, is “just a VM in a data center somewhere”. However, Cloud is much more than this, as it adds automation, tooling and systems to manage, scale and integrate, that an onsite VMWare or HyperV deployment won't have. A true Cloud platform allows for applications to be spun up with the click of a button. GCP, AWS and Azure each have over 90 services, including AI, Machine Learning, and clustered databases that can just be provisioned from a web browser.
The real power of Public Cloud is very easily seen: it is open to anyone on the internet, and its billing model is simple and cheap. This has allowed a new wave of companies to launch services to millions of users, which previously could only have been done with large amounts of capital. The top benefits of public cloud computing are:
• Cost: eliminates the capital expense of buying hardware
• Speed: Most cloud computing services are provided self service and on demand, so even vast amounts of computing resources can be provisioned in minutes, typically with just a few mouse clicks, giving businesses a lot of flexibility and taking the pressure off capacity planning.
• Global scale: The benefits of cloud computing services include the ability to scale elastically, delivering the right amount of IT resources at the right location
• Productivity: On-site datacenters typically require a lot of “racking and stacking”—hardware set up, software patching and other time-consuming IT management chores. Cloud computing removes the need for many of these tasks, so IT teams can spend time on achieving more important business goals.
• Performance: The biggest cloud computing services run on a worldwide network of secure datacenters, which are regularly upgraded to the latest generation of fast and efficient computing hardware. This offers several benefits over a single corporate datacenter, including reduced network latency for applications and greater economies of scale.
• Reliability: Cloud computing makes data backup, disaster recovery and business continuity easier and less expensive, because data can be mirrored at multiple redundant sites on the cloud provider’s network.
It's very important that we realise that the big Cloud providers (GCP, AWS, Azure, IBM) are hyper-scale Clouds, and that it is virtually impossible to compete with their speed of innovation, and the depth and breadth of services that they have built. They are reshaping the services market, radically changing IT spending patterns within enterprises, and causing major disruptions among infrastructure technology vendors. What takes dedicated teams of people in most Enterprises to build (infrastructure, databases, and platforms) can be easily purchased online and provisioned within minutes, to launch services with enough capacity to serve millions of customers.
There are at least four factors that are working toward accelerating this transition:
Here are some quick examples of new services and business models that Cloud has enabled:
• When Pokemon Go launched, users took up the service so fast, that it went over 10 times the worst-case forecast. Because it was based on GCP, they managed to spin up tens of thousands of cores, and it was the largest container deployment ever. Pokémon GO has broken all previous records of popularity of any mobile game or app for that matter on both Apple’s App Store and Google Play market:
o Total number of downloads – 100 million (by August 8th, Google Play market only)
o Total revenue – $268 million (by August 12th)
o Average revenue per DAU – $0.25
o Percentage of iOS users that do in-app purchases – 80%
o Daily Active Users – 20+ million
While it makes sense for us to utilise public Cloud, for its obvious benefits, there are a few reasons why it may not be currently feasible for certain use cases:
• Regulatory – may affect how data can flow
• Data Protection (POPI) – customers may need to allow us to store their data in public cloud
• Regional integrations to local systems may not be practical
• Network links, latency and redundant routes may not be favourable for all Regions
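The constraints above can be read as a simple placement check per workload. The sketch below is one hypothetical way to encode them. The `Workload` fields and the "any blocker means private" rule are assumptions made for illustration, not an agreed policy.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    regulator_allows_public: bool      # regulatory limits on how data can flow
    customer_consent_for_public: bool  # data protection (e.g. POPI) consent
    needs_local_integration: bool      # tightly coupled to regional systems
    network_link_adequate: bool        # links, latency and redundant routes


def placement(workload):
    """Return ('public' or 'private', list of reasons blocking public cloud)."""
    blockers = []
    if not workload.regulator_allows_public:
        blockers.append("regulatory")
    if not workload.customer_consent_for_public:
        blockers.append("data protection")
    if workload.needs_local_integration:
        blockers.append("local integration")
    if not workload.network_link_adequate:
        blockers.append("network")
    return ("private" if blockers else "public", blockers)


portal = Workload("self-service portal", True, True, False, True)
billing = Workload("regional billing", False, True, True, True)
print(placement(portal))
print(placement(billing))
```

Returning the list of blockers, rather than just a yes or no, makes it visible which restriction would need to change before a workload could move.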
According to the 2017 State of the Cloud Survey, Hybrid Cloud is the Preferred Enterprise Strategy, with:
• 85% of enterprises having a multi-cloud strategy,
• 41% of workloads run in public cloud, and 38% run in private cloud
• Azure increases market penetration, with 34% using Azure
Hybrid Cloud allows data and application portability between multiple cloud infrastructures, either public or private. It allows us to run certain applications and workloads on public cloud, and others in our private cloud, balancing both environments based on our unique needs. Hybrid cloud is about placing workloads in the best environment and managing costs. It’s not necessarily all about public or private cloud, but how to orchestrate both.
When it comes to hybrid cloud, there are a few options:
• Retain our existing legacy infrastructure, and using tools that allow integration and networking into the cloud, put new apps into public cloud.
• Build infrastructure internally
The proposal therefore is to leverage off the power and functionality of public Cloud, but localized to our considerations, by building a Hybrid Cloud internally.
Building a hybrid cloud will allow us to use the capabilities of cloud, but in an internal environment that is regulatory and data compliant, while also allowing us to easily move to true public cloud.
It's important to note the intent here: we don’t want to compete with the big Cloud providers. Rather, a hybrid cloud that is integrated and aligned with public cloud provides a highway to move to public cloud.
Also, by doing hybrid cloud across the Regions, it future-proofs our investment:
• Having a single consolidated view of infrastructure, VMs and services deployed across the Regions
• Having a single consolidated view of VMs and services across public clouds
• Having a single view of costs of public clouds
• Aligned to standards of public cloud
• Easy and mapped out upgrade paths
• Prevents vendor lock-in
• Allows capabilities of containers and other newer services
Installing a private cloud internally will provide us with agile infrastructure, but will not necessarily pave a way to use public Cloud. Private cloud could also limit us to the common ‘virtualisation is cloud’ misunderstanding, which lacks the PaaS, automation and tooling layers. Our target is to leverage the benefits of public cloud first, and only use other alternatives due to regulatory and other restrictions. Hybrid cloud provides this capability.
At this point, it will be useful to show the distinction between the different types of infrastructure, which are the core building blocks of Cloud.
Up to now, most enterprises have used physical infrastructure that is installed locally in their own data centers, and comprises:
• Compute: rack-mounted servers, or chassis with blades, that provide CPU and RAM
• Storage: SAN and NAS based storage, typically on fiber channel storage network
• Network: LAN networking equipment, that includes switches, firewalls and load-balancers
The above requires skilled and dedicated teams to install and maintain. The hardware normally has a lifespan of 5 years, after which it needs to be replaced. Migrating live production services to new hardware is non-trivial and is measured in time-spans of months to years.
The OS, DB and application layers also need to be maintained.
If you add the virtualisation layer to the above, it includes the installation and maintenance of the hypervisor. When building your own virtualized platforms, perhaps using VMWare or Hyper-V, compatibility between the hypervisor and underlying hardware is very important. Building such platforms takes between a few months and a year.
Because each enterprise was building their own platforms, each consisting of the same physical and software layers, but according to different standards, it was difficult to upgrade and bug-fix, due to the complexities that each layer added.
Companies like Nutanix, VCE and NetApp therefore decided to build and offer these as standard blocks, built in the factory, and configured to cater for each client. Converged solutions, like vBlock and Flexpod, are built in the vendor's factory, fully configured and tested, and delivered to clients' data centers, where they only need to be powered up and plugged into the network before they can run live production workloads. This cuts build times from months down to a mere few weeks, all standardised and fully supported.
There are two types of converged infrastructure:
• Converged Infrastructure (CI): like what has been shown in Section 6.1, it has dedicated layers for compute, storage and networking. These different layers have been well integrated, and there is a management layer to control all the layers for centralised provisioning and management.
• Hyper-converged infrastructure (HCI): combines compute and storage into the servers, using high-density servers that contain storage. This simplifies the storage and related fibre channel networking layers, which enables easier upgrading for life-cycle management. HCI is less expensive than converged, due to it not having dedicated storage devices and a storage network. Gartner ranks Nutanix as the leader of the HCI quadrant:
Hyper-converged is the preferred method for building infrastructure.
Network connectivity in each Region is a key requirement for cloud. When we utilise public cloud, we consume it over internet links to the public cloud data centers. The availability, capacity, redundancy and latency of network links between Regions and public cloud providers is very important, and determines whether it's possible to use it for critical enterprise applications.
Using the Cloud is a journey, not a destination. Because of the intrinsic nature of cloud, and how easy it is to use, it's easy to start, and then progress as we learn and develop. Based on that, there are a few architectural types for cloud adoption that can be used as a rough guideline on how to progress on the cloud journey:
This section will identify the Cloud providers that offer hybrid offerings.
What also needs to be included in the comparison is how structured each is, which indicates whether it's a turnkey solution or a solution that requires effort to be built.
Let's start with the big public cloud providers:
Google Cloud Platform (GCP) is one of the most mature and full service public clouds. GCP does not have a pure-play on-premise hybrid offering, but together with Nutanix, offers the ability to move applications between Nutanix on-premise and GCP. This enables Nutanix customers to natively extend their datacenter environment into GCP.
Amazon Web Services (AWS) does not offer a pure-play hybrid offering. It favours point 1 in the hybrid cloud architectural styles, where you put certain loads on AWS, and connect it together.
Microsoft offers Azure Stack, which you can run internally, and which is a replica of public Azure. It allows workloads to be moved easily between hybrid and public Azure.
Bluemix Dedicated is a private cloud that includes a catalog of services that are made available specifically to meet the needs of an enterprise, including some additional services that are syndicated from the Bluemix public cloud. Bluemix Dedicated is a single-tenant Bluemix environment built on SoftLayer with the same level of security standards as the public platform.
Oracle Cloud at Customer is the on-premise version of its public cloud.
Offers both a public and hybrid cloud model
EOH has a public cloud, and offers brokerage services across all other public clouds, but does not have a hybrid on-premise offering.
As with any major technology, adoption sometimes requires a change in the consumer. Cloud is a technology mind-shift that requires a different way of thinking in order to take full advantage of its capabilities.
In order to be successful, we have to adapt our thinking and processes on these key concerns:
Cloud is more than just virtualisation. Automation is a requirement to take full advantage of cloud computing. Once we automate our infrastructure, we can take full benefit of auto-scaling, which is one of the most fundamentally new features that cloud computing has to offer.
Agile and DevOps are not mere frameworks, but are culturally different ways to think compared to waterfall and traditional infrastructure. By fully embracing Agile and DevOps, we will greatly reduce time to market for new services and features, which will be auto-provisioned on cloud.
Typically, CAPEX is used in building new capabilities, and is seen as an investment, which we can depreciate on our books. OPEX is seen as a burden, and does not grow our capabilities. But one of the key benefits of the Cloud is that huge capital intensive projects are not required, because cloud works on the pay-as-you-go model, where you only pay for what you consume. With auto-scaling, you only add capacity when you need it, and you don’t need to pay for it upfront when it's not used.
We will need to adjust our Financial KPIs to take full advantage of cloud's consumption model.
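A small worked example may help show why the consumption model changes the numbers. It compares capacity sized upfront for the peak hour against capacity that follows demand hour by hour. The demand figures are invented for illustration, and the comparison is in server-hours only, with no pricing assumed.

```python
def peak_sized_server_hours(hourly_demand):
    """Capacity bought upfront must cover the busiest hour, for every hour."""
    return max(hourly_demand) * len(hourly_demand)


def consumed_server_hours(hourly_demand):
    """With auto-scaling, capacity follows demand, so only usage is paid for."""
    return sum(hourly_demand)


# Hypothetical demand: servers needed in each of six consecutive hours.
demand = [2, 2, 3, 10, 4, 2]

upfront = peak_sized_server_hours(demand)   # 10 servers * 6 hours
consumed = consumed_server_hours(demand)
print(upfront, consumed, f"{consumed / upfront:.0%} utilisation of peak-sized capacity")
```

In this made-up case only 23 of the 60 provisioned server-hours are actually used, which is the gap that a pay-as-you-go model avoids paying for.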
The cloud will always be more secure than any internal enterprise system. Because of the huge amount of skill and resources that cloud providers have put into it, and the huge uptake of cloud in the world, they have made sure that their systems are secure, always up to date, protected from threats like DDoS, and compliant with all major regulations.
We need to ensure that our partners and application vendors keep up with the cloud, and are cloud ready and cloud agnostic.
Often, vendors tell us that:
• They don’t support virtualisation
• They support virtualisation, but only on their hardware
• They support virtualisation, but only on a specific Hypervisor
• They say they are Cloud-enabled, but what they mean is that they support virtualisation
Internet Solutions is one of the largest ISPs in SA. They provide, amongst others, Cloud IaaS to their customers, with some PaaS and SaaS (Hosted VoIP, DRaaS) in 5 data centers across SA. In 2015, they realised that enterprise customers were using them for their production services due to location and latency, but chose to use the large CSPs (AWS, GCP, Azure) for development, testing and other ad hoc services. They realised that they could not compete on cost, due to the large scale that the CSPs operate at, and that customers would slowly but eventually move over to the large CSPs.
They developed an in-house cloud portal, Skylight, that can provision cloud services to their in-house VMWare and to the CSPs. By holding the billing relationship with the customers, yet allowing them the flexibility of the CSPs, they can retain customers for longer. This cloud services broker and aggregator model is very powerful, as it combines the different CSPs into one unified interface, and may allow for migration of services across the different clouds.
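To illustrate the broker and aggregator idea in general terms, here is a minimal sketch of one interface in front of several providers, with the broker keeping the usage record that a single bill would be based on. This is not how Skylight is implemented. All class and method names are hypothetical, and the providers are in-memory stand-ins rather than real cloud APIs.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Common interface every backend (in-house or public) must implement."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def provision_vm(self, vm_name):
        """Create a VM and return its resource identifier."""


class InMemoryProvider(CloudProvider):
    """Stand-in backend that only records requests; no real API is called."""

    def __init__(self, name):
        super().__init__(name)
        self.vms = []

    def provision_vm(self, vm_name):
        self.vms.append(vm_name)
        return f"{self.name}/{vm_name}"


class CloudBroker:
    """Single entry point that routes requests and tracks usage per customer."""

    def __init__(self, providers):
        self._providers = {p.name: p for p in providers}
        self.usage = {}  # customer -> resource ids, the basis for one bill

    def provision(self, customer, provider_name, vm_name):
        resource_id = self._providers[provider_name].provision_vm(vm_name)
        self.usage.setdefault(customer, []).append(resource_id)
        return resource_id


broker = CloudBroker([InMemoryProvider("in-house"), InMemoryProvider("public-csp")])
broker.provision("acme", "in-house", "prod-db")
broker.provision("acme", "public-csp", "test-web")
print(broker.usage["acme"])
```

The customer deals with one interface and one usage record, while the broker remains free to add or swap the providers behind it.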
Some interesting links I came across: