2021/09/12

Don't treat people like serverless functions.

When I were knee high to a grasshopper we didn't have all this new-fangled cloud infrastructure and we certainly didn't have the concept of serverless computing. How can you compute without a computer?

But before my time (and I'm not that old!) computers were people. People like Sally. Actual humans sitting in offices with bits of paper, pencils and tables of logarithms and trigonometric functions. Adding, subtracting, scribbling down results, checking and verifying. Whether a human being or a hunk of metal, you need a computer to compute.

Well, most of the time the thing you're computing is important enough that you can't afford for it not to be computed. It's a bad thing if you need to calculate wage-packets and can't do it, whether it's because Sally's off sick and can't compute today or the server has gone down because it's run out of disk space. You're going to have a riot on your hands come Friday afternoon when the pubs open...

Which brings us to the concept of redundancy. 

Rather than relying on Sally alone we need to ensure we have someone else around who can also compute when she's out sick. Equally - in the world of tin - we need backups in case our primary fails: disks, networks, servers, power, cooling, data centers and so on.

This is an expensive habit.

Almost every system is important enough to warrant some degree of redundancy. The more critical a system becomes, the greater the degree of redundancy required and the more time architects spend worrying about the impact of component failures: how long it takes to recover, how much data could be lost, acceptable error rates and so on.
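To get a feel for why redundancy is worth paying for, here's a rough sketch of how availability improves as you add independent replicas (the 99% figure is purely illustrative, not from any real system):

```python
# Rough sketch: availability of n redundant replicas, each independently
# available with probability a (the 99% figure below is illustrative).
def availability(a: float, n: int) -> float:
    """Probability that at least one of n independent replicas is up."""
    return 1 - (1 - a) ** n

print(availability(0.99, 1))  # 0.99     -> roughly 3.7 days of downtime a year
print(availability(0.99, 2))  # 0.9999   -> roughly 53 minutes a year
print(availability(0.99, 3))  # 0.999999 -> roughly 32 seconds a year
```

Each extra replica buys a lot of availability, but it's kit that sits mostly idle - hence the expensive habit.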

In the bad old days we would literally have a standby server for every primary. Boom! Double the costs right there.

Just imagine if we needed to employ Jane as well as Sally just to cope with the days Sally was sick. Of course we'd not do this. We need to ensure the function can be picked up by someone else, but that someone could be part of a team with a degree of redundancy built in (or even someone whose main job is something else - Bob from accounting or Sally's manager, for example). Perhaps we can work out that if we need seven computers we'd best hire ten to cover the days some are sick...
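As a back-of-the-envelope sketch of that "hire ten to get seven" arithmetic (the 90% attendance rate is my assumption, purely for illustration):

```python
# Back-of-the-envelope sketch: how many computers do we hire so that at
# least seven turn up on a given day? The 90% attendance rate is an assumption.
from math import comb

def p_at_least(needed: int, hired: int, p_in: float) -> float:
    """Probability that at least `needed` of `hired` people are in,
    each independently present with probability `p_in`."""
    return sum(
        comb(hired, k) * p_in ** k * (1 - p_in) ** (hired - k)
        for k in range(needed, hired + 1)
    )

for hired in range(7, 11):
    print(hired, round(p_at_least(7, hired, 0.9), 3))
# Hiring exactly 7 gives a full complement only ~48% of days;
# hiring 10 gets you over 98% - hence hiring ten to get seven.
```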

Having a level of redundancy in the organisation provides the flexibility to handle outages and failures. Besides, people can't work at 100% capacity. They will burn out, productivity will fall, they'll hate you for it and will leave as soon as a better opportunity turns up.

Anyway, back in the world of tin, along came virtualisation and we could host multiple virtual machines (VMs) on one physical host in much the same way one person can turn their hand to multiple tasks. This was great ('ish) as it reduced the number of physical servers significantly and saved on hardware, power, space, CO2 emissions and consequently dollars. We still need some degree of redundancy in case a host node goes down or a VM fails, but it's much better than before.

How much better?

Well, most systems aren't Google or Netflix. I know, surprising huh?

Most systems don't need to support thousands of transactions per second. Mostly it's less than 1 tps. Yup, one! And often it's a lot less than one... a few hundred transactions a day is typical of many systems. Me and my Casio fx-85GT Plus can handle that!
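A quick sanity check of that claim (the 300 a day is just an illustrative figure):

```python
# Sanity check: a few hundred transactions a day is nowhere near 1 tps.
transactions_per_day = 300            # illustrative figure
seconds_per_day = 24 * 60 * 60        # 86,400
print(transactions_per_day / seconds_per_day)  # ~0.0035 tps
```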

So we can stuff a lot of VMs onto each physical host - perhaps 50 VMs running across 3 physical hosts in a cluster - whilst still maintaining enough redundancy to ensure availability. Make that tin sweat!

Suffice to say, if we treated Sally like the tin, she would not be impressed.

VMs are still pretty hefty though. Each VM runs its own copy of the operating system and associated processes, which makes it pretty big (GBs of RAM for a VM compared with perhaps a few MB for an application process). We've had multi-tasking operating systems for decades now and there's little reason we can't run multiple application processes on the same server. Other than it's a really bad idea.

Developers make lots of assumptions about the environment they're running in - which libraries and versions are available, what file paths they can use and so on - and a lot of these things aren't compatible with each other. They're also really bad at security and act like peace-loving, drug-infused hippies... "hey, why would anyone else read my files, man?". Add to this that bugs happen (they always will), resulting in unstable or runaway processes crashing or consuming all the available resources, and it's a recipe for disaster.

Running multiple disparate application processes on the same server is a bad idea. Or was...

Now along comes containerization, providing a degree of isolation between processes within a server to prevent one hippie process treading on another hippie's toes. And we know how exposed hippies' toes are, don't we?

This can give us an order-of-magnitude increase in processes on a host, so we're now up to 500 containers across our 3-node cluster of computers. Nice.

Sally on the other hand is seriously pissed.

But we still have to manage a bunch of physical servers underpinning our applications. Whether VMs or containers, there's a bunch of power hungry, raging hot physical computers burning away in the background. And in the case of human computers, really angry overworked ones.

Then came serverless.

Forget the server. You pay per use - for that few hundred transactions a day - leveraging services provided by cloud providers. Everything from databases to messaging to raw compute can be provided as a pay-per-use service without needing to worry about the server or redundancy.

Erm, well, except for the cloud service provider - who worries about it a lot - and your architect, who still needs to worry about the non-functional as well as functional characteristics of all those services we end up consuming (especially the economics of serverless if that one tps turns into many thousands...).

Of course serverless isn't really server-less and there's always a bit of tin somewhere. We're really talking about building applications out of services (like AWS Lambda or Google Cloud Functions). Those services are carefully managed by cloud providers who supply all the necessary redundancy to support the resiliency, availability and scalability you'd expect.
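To make that concrete, here's a minimal sketch of the kind of function we're talking about - an AWS Lambda handler in Python. The wage-packet example and field names are purely illustrative; the point is that there's no server for you to provision, patch or make redundant, and you're billed per invocation:

```python
# Minimal AWS Lambda handler sketch (the wage-packet fields are illustrative).
# No server to provision or keep redundant: the platform invokes this per
# request and bills per invocation.
def lambda_handler(event, context):
    hours = event.get("hours_worked", 0)
    rate = event.get("hourly_rate", 0)
    return {
        "statusCode": 200,
        "body": {"wage_packet": round(hours * rate, 2)},
    }
```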

But what about Sally? Does she still have a job?

Sadly no. Sally has now been moved into the gig economy on a zero-hours contract and works on a pay-per-use basis. She doesn't get any guaranteed work, an hourly rate or sickness benefits. Please don't treat people like serverless functions.
