
Session Abolition

I've been going through my bookcase, on orders from a higher being, to weed out old, redundant books and make way for... well, I'm not entirely sure what. Anyway, it's not been very successful.

I came across an old copy of Release It! by Michael T. Nygard and started flicking through, chuckling occasionally as memories (good and bad) surfaced. It's an excellent book, but it made me stop and think when I came across a note reading:
Serve small cookies
Use cookies for identifiers, not entire objects. Keep session data on the server, where it can't be altered by a malicious client.

There's nothing fundamentally wrong with this, except that it chimes with a problem I'm currently facing and I don't like any of the usual solutions.

Sessions either reside in some sort of stateful pool (a persistent database, a session management server, replicated memory, etc.) or, more commonly, exist stand-alone within each node of a cluster. In either case, load-balancing is needed to route requests to the home node where the session exists (delays in replication mean you can't go to just any node, even when a stateful pool is used). Such load-balancing is performed by a network load-balancer, reverse proxy, web server (mod_proxy, WebSphere plugin, etc.) or application server, and can work using numerous different algorithms: IP-based routing, round-robin, least-connections, etc.

So in my solution I now need some sort of load-balancer - more components, joy! But even worse, it's creating havoc with reliability. Each time a node fails I lose all sessions on that server (unless I plump for a session-management server, which I need like a hole in the head). And nodes fail all the time... (think cloud, autoscaling and hundreds of nodes).

So now I'm going to kind-of break that treasured piece of advice from Michael and create larger cookies (more likely request parameters) and include in them some ever-so-slightly-sensitive details which I really shouldn't. I should point out this isn't as criminal as it sounds.

Firstly the data really isn't that sensitive. It's essentially routing information that needs to be remembered between requests - not my credit card details.

Secondly it's still very small - a few bytes or so but I'd probably not worry too much until it gets to around 2K+ (some profiling required here I suspect).

Thirdly, there are other ways to protect the data - notably encryption and hashing. If I don't want the client to be able to read it then I'll encrypt it. If I don't mind the client reading the data but want to make sure it's not been tampered with, I'll use an HMAC instead. A JSON Web Token-like format should work well in most cases.

Now I can have no session on the back-end servers at all, but instead need to decrypt (or verify the hash) and decode a token on each request. If a node fails I don't care (much), as any other node can handle the same request and my load balancing can be as dumb as I wish.
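To make the "verify the hash and decode a token on each request" step concrete, here is a minimal sketch of the HMAC-only variant (client can read the data but not alter it). It is an illustration rather than the design I've settled on: the names `issue_token`, `read_token` and `SECRET_KEY`, the `payload.tag` layout and the choice of HMAC-SHA256 are all assumptions, and the hard-coded key is for demonstration only. It uses only the Python standard library.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical demo key; a real deployment would load this from a secret store.
SECRET_KEY = b"demo-key-not-for-production"


def _b64encode(raw: bytes) -> str:
    # URL-safe base64 without padding, so the token can travel as a request parameter.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64decode(text: str) -> bytes:
    # Restore the padding that _b64encode stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(state: dict, key: bytes = SECRET_KEY) -> str:
    """Serialise the per-client state and append an HMAC-SHA256 tag."""
    body = json.dumps(state, sort_keys=True, separators=(",", ":"))
    payload = _b64encode(body.encode("utf-8"))
    tag = hmac.new(key, payload.encode("ascii"), hashlib.sha256).digest()
    return payload + "." + _b64encode(tag)


def read_token(token: str, key: bytes = SECRET_KEY) -> dict:
    """Verify the tag and return the state; raise ValueError if anything is off."""
    try:
        payload, tag = token.split(".")
        supplied = _b64decode(tag)
        expected = hmac.new(key, payload.encode("ascii"), hashlib.sha256).digest()
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    # Constant-time comparison avoids leaking how much of the tag matched.
    if not hmac.compare_digest(supplied, expected):
        raise ValueError("token failed integrity check")
    return json.loads(_b64decode(payload))
```

Any node holding the key can call `read_token` on an incoming request, which is what lets the load balancer stay dumb. Note the payload here is only encoded, not encrypted, so this covers the tamper-proofing case and not the "client must not read it" case.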

I've sacrificed performance for reliability - both in terms of computational effort server side and in terms of network payload - and simplified the overall topology to boot. CPU cycles are getting pretty cheap now though, and this pattern should scale horizontally and vertically - time for some testing... The network penalty isn't so cheap but again should be acceptable, and if I avoid using "cookies" for the token (which the browser sends with every request) then I can at least avoid paying that cost on every single request.

It also means that in a network of micro-services, so long as each service propagates these tokens around, the thornier routing problem in this sort of environment virtually disappears.

I do though now have a key management problem. Somewhere, somehow I need to store the keys securely whilst distributing them to every node in the cluster... oh and don't mention key-rotation...
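One common way to take some of the sting out of rotation is to tag each token with the ID of the key that signed it, so nodes can keep verifying old tokens while new ones are issued under a newer key. The sketch below assumes exactly that and nothing more: the `KEYS` ring, the `k1`/`k2` identifiers and the `key_id.payload.tag` layout are hypothetical, and it doesn't touch the harder problem of getting the keys onto every node securely.

```python
import hashlib
import hmac

# Hypothetical key ring: every node holds the current key plus any keys
# still needed to verify tokens issued before the last rotation.
KEYS = {"k1": b"old-demo-key", "k2": b"new-demo-key"}
CURRENT_KEY_ID = "k2"


def _tag(key_id: str, payload: str) -> str:
    # The key ID is covered by the tag so it cannot be swapped by the client.
    body = f"{key_id}.{payload}".encode("utf-8")
    return hmac.new(KEYS[key_id], body, hashlib.sha256).hexdigest()


def sign(payload: str, key_id: str = CURRENT_KEY_ID) -> str:
    """Issue a token of the form key_id.payload.tag using the given key."""
    return f"{key_id}.{payload}.{_tag(key_id, payload)}"


def verify(token: str) -> str:
    """Return the payload if the tag checks out under the key the token names."""
    key_id, sep1, rest = token.partition(".")
    payload, sep2, tag = rest.rpartition(".")
    if not sep1 or not sep2 or key_id not in KEYS:
        raise ValueError("malformed token or unknown key ID")
    expected = _tag(key_id, payload)
    # Compare as bytes so a non-ASCII tag is rejected instead of raising TypeError.
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("ascii")):
        raise ValueError("token failed integrity check")
    return payload
```

Retiring a key is then a matter of removing it from `KEYS` once the tokens signed with it are no longer expected to be in circulation.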
