I’ve been going through my bookcase; on orders from a higher-being, to weed out old, redundant books and make way for… well, I’m not entirely sure what, but anyway, it’s not been very successful.
I came across an old copy of Release It! by Michael T. Nygard and started flicking through, chuckling occasionally as memories (good and bad) surfaced. It’s an excellent book but made me stop and think when I came across a note reading:
There’s nothing fundamentally wrong with this, other than that it chimes with a problem I’m currently facing and I don’t like any of the usual solutions.
Sessions either reside in some sort of stateful pool (persistent database, session-management server, replicated memory etc.) or, more commonly, exist stand-alone within each node of a cluster. In either case load-balancing is needed to route requests to the home node where the session exists (delays in replication mean you can’t go to just any node, even when a stateful pool is used). Such load-balancing is performed by a network load-balancer, reverse proxy, web-server (mod_proxy, the WebSphere plugin etc.) or application server and can use numerous different algorithms: IP-based routing, round-robin, least-connections etc.
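To make the problem concrete, here’s a minimal sketch (all names are mine, not from the book) of the sticky-routing idea: hash the session id onto a fixed node list so the same session always lands on the same node.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster members

def home_node(session_id: str) -> str:
    """Route every request for the same session to its home node."""
    digest = hashlib.sha256(session_id.encode()).digest()
    return NODES[int.from_bytes(digest[:4], "big") % len(NODES)]

# The same session id always routes to the same node...
assert home_node("abc123") == home_node("abc123")
# ...but if a node dies and the list changes, most sessions re-home
# elsewhere and their server-side state is lost -- the reliability problem.
```

The same shape applies whether the hashing happens in a network load-balancer, a reverse proxy or the application server itself.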
So my solution now needs some sort of load-balancer – more components, joy! But even worse, it’s creating havoc with reliability. Each time a node fails I lose all the sessions on that server (unless I plump for a session-management server, which I need like a hole in the head). And nodes fail all the time… (think cloud, auto-scaling and hundreds of nodes).
So now I’m going to kind-of break that treasured piece of advice from Michael and create larger cookies (more likely request parameters), including in them some ever-so-slightly-sensitive details which I really shouldn’t. I should point out this isn’t as criminal as it sounds.
Firstly, the data really isn’t that sensitive. It’s essentially routing information that needs to be remembered between requests – not my credit-card details.
Secondly, it’s still very small – a few bytes or so – and I’d probably not worry too much until it gets to around 2K+ (some profiling required here, I suspect).
Thirdly, there are other ways to protect the data – notably encryption and hashing. If I don’t want the client to be able to read it, I’ll encrypt it. If I don’t mind the client reading the data but want to make sure it hasn’t been tampered with, I’ll use an HMAC instead. A JSON Web Token-like format should work well in most cases.
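The HMAC case can be sketched in a few lines of stdlib Python. This is a JWT-*like* token, not a real JWT (a proper JWT adds a header with the algorithm, and you’d normally reach for a library); it just illustrates the tamper-proofing idea, and the secret is a placeholder.

```python
import base64, hashlib, hmac, json

SECRET = b"server-side-secret"  # hypothetical shared key, never sent to clients

def b64(data: bytes) -> bytes:
    # base64url without padding, as JWTs use
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def sign(claims: dict) -> str:
    """Serialise the claims and append an HMAC-SHA256 tag."""
    payload = b64(json.dumps(claims, sort_keys=True).encode())
    tag = b64(hmac.new(SECRET, payload, hashlib.sha256).digest())
    return (payload + b"." + tag).decode()

def verify(token: str) -> dict:
    """Recompute the tag; reject the token if it doesn't match."""
    payload, tag = token.encode().split(b".")
    expected = b64(hmac.new(SECRET, payload, hashlib.sha256).digest())
    if not hmac.compare_digest(tag, expected):
        raise ValueError("token has been tampered with")
    padded = payload + b"=" * (-len(payload) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))

token = sign({"route": "eu-west", "user": "42"})
assert verify(token)["route"] == "eu-west"
```

Note the client can still *read* the claims here – they’re only encoded, not encrypted – which is fine for routing information but not for anything genuinely secret.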
Now I need no session on the back-end servers at all; instead I decrypt (or verify the HMAC on) and decode a token on each request. If a node fails I don’t care (much), as any other node can handle the same request, and my load-balancing can be as dumb as I wish.
I’ve sacrificed performance for reliability – both in terms of computational effort server-side and in terms of network payload – and simplified the overall topology to boot. CPU cycles are getting pretty cheap now though, and this pattern should scale both horizontally and vertically – time for some testing… The network penalty isn’t so cheap, but it should still be acceptable, and if I avoid using cookies for the token (which the browser would attach to every request regardless) I can at least avoid paying that cost on every single request.
It also means that in a network of micro-services, so long as each service propagates these tokens, the rather thorny routing problem in that sort of environment virtually disappears.
I do, though, now have a key-management problem. Somewhere, somehow, I need to store the keys securely whilst distributing them to every node in the cluster… oh, and don’t mention key-rotation…
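Key-rotation at least has a well-worn mitigation: tag every token with a key id (the “kid” idea from JWT headers) so that tokens signed with a retired key still verify during a grace period while new tokens use the current key. A hedged sketch, with a made-up in-memory key ring (real distribution and storage are still the hard part):

```python
import hashlib, hmac

KEYS = {  # hypothetical key ring; in reality fetched from secure storage
    "2024-01": b"old-secret",       # retired, kept for a grace period
    "2024-06": b"current-secret",   # used for all new tokens
}
CURRENT_KID = "2024-06"

def sign(payload: bytes) -> tuple[str, bytes]:
    """Sign with the current key and return (kid, tag) alongside it."""
    tag = hmac.new(KEYS[CURRENT_KID], payload, hashlib.sha256).digest()
    return CURRENT_KID, tag

def verify(kid: str, payload: bytes, tag: bytes) -> bool:
    """Verify against whichever key the token was signed with."""
    key = KEYS.get(kid)
    if key is None:  # unknown kid: key rotated out long ago
        return False
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

kid, tag = sign(b"route=eu-west")
assert verify(kid, b"route=eu-west", tag)
assert not verify("2023-01", b"route=eu-west", tag)  # fully retired key
```

Dropping a key from the ring then invalidates every token it signed, which doubles as a crude revocation mechanism.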