nonfunctionalarchitect.com: June 2014

2014/06/29

Government IT Spend

The way that governments are run they'll be a huge amount of duplication and waste in this lot. Situation is even worse when you consider that historically it's mostly proprietary software with n year support contracts for stuff that's rarely used (but hits the headlines when it is). Not at all surprising.

The future for government IT is open-source and cloud based.

Bypassing BT's DNS Service

I suffered from BTs failure yesterday which knocked out many sites though thankfully it didn't seem to affect nonfunctionalarchitect.com - phew! What a relief huh?

Anyway, BT has now apologised for the incident and is investigating root-cause. Well, feeling lost and detached from reality without full and proper access to the net (internet access should be a human right) I naturally did my own investigating which included the obligatory reboots to no avail (my Mac, wife's PC, home-hub) - and you know they'll make you redo these steps if you have to call support...

Some sites could be pinged, some couldn't (could not resolve host) which points at a DNS issue. Bypassing BTs DNS isn't that easy though as they have a transparent DNS service in place which means you can't just add Googles free DNS servers to your list (8.8.8.8 and 8.8.4.4 if you're interested). Doing this in my case simply resulted in an error message saying that BT's Parental Controls were on a prevented me using another DNS service. Turning parental controls off stopped the error message but didn't help me resolve names because the transparent DNS service remains intercepting any requests.

I could only think of two methods to bypass BT's DNS service:

1. Use a VPN.

This will still rely on BT's network but prevents them from intercepting anything since it's all secure in a warm and cosy encrypted VPN tunnel. The only problem here is finding a VPN end-point to connect to first - I have one, but its to allow me remote access to my house which in turns relies on BT. Doh!

2. Use TOR (The Onion Ring) and Privoxy.

This prevents DNS lookups from the browser (hence use of Privoxy) and all requests are sent over the TOR network and may surface anywhere in the world (preferably somewhere not using BT's DNS service though I have little control over this). It's not the fastest solution but it works. Fortunately I had an old VM with TOR and Privoxy installed and configured so with a few tweaks to this (listen on 0.0.0.0 (all addresses) rather than 127.0.0.1 (localhost only)) I could configure all the machines in the house to use this VM as a proxy service and bingo! We were back online and didn't have to risk talking to each other anymore - phew!

TOR is awesome and useful for accessing sites which may be blocked by your service provider, your government or for some other legal issue (such as why the really cool but generally inaccessible BBC Future site is blocked from fee paying British residents). It's also useful if you want to test stuff from somewhere else in the world over what feels like a wet piece of string for a network.

Resiliency worries needs to be considered before you have failure. In this instance you need to have a VM (or physical machine) pre-configured and ready for such an emergency (and don't call 999, they won't be able to help... ). Smug mode on!

2014/06/28

Excremental Form

We often think we know what good design is; whether it be system, code or graphic design, and it's a good thing that we strive for perfection.

Perfection though is subjective, comes at a cost and is ultimately unachievable. We must embrace the kludges, hacks, work-arounds and other compromises and like the Greek idiom; "whoever is not Greek is barbarian", we should be damn proud of being that little bit barbaric even if we continue to admire the Greeks.

The question is not whether the design is good but whether the compromises are justified, sound and fit for purpose. Even shit can have good and bad form.

2014/06/20

Eyes on the road

What's it got to do with non-functionals? Hell, I dunno... That you're a single point of failure perhaps?

https://www.youtube.com/watch?v=JHixeIr_6BM

Chaos Monkey

I've had a number of discussions in the past about how we should be testing failover and recovery procedures on a regular basis - to make sure they work and everyone knows what to do so you're not caught out when it happens for real (which will be at the worst possible moment). Scheduling these tests, even in production, is (or should be) possible at some convenient('ish) time. If you think it isn't then you've already got a resiliency problem (you're out when some component fails) as well as a maintenance problem.

I've also talked (ok, muttered) about how a healthy injection of randomness can actually improve stability, resilience and flexibility. Something covered by Nassim Taleb in his book Antifragile.

Anyway, beat to the punch again, Netflix developed a tool called Chaos Monkey (aka Simian Army) a few years back with randomly kills elements of the infrastructure to help identify weak points. Well worth checking out on codinghorror.com.

For the record... I'm not advocating that you use Chaos Monkey in production... Just that it's a good way to test the resiliency of your environment and identify potential failure points. You should be testing procedures in production in a more structured manner.

2014/06/18

Telco CDNs & Monopolies

Telco CDNs (Content Distribution Networks) are provided by telcos by embedding content caching infrastructure deep in the network close to the end-user (just before the last kM of copper wire). The result is improved streaming to end-users and significantly less load on both the content providers servers and the telcos wider network. It's a win-win-win for everyone.

Telcos charge content providers for this service. If the telcos network has a limited client base then perhaps there's not much point in the content provider paying them to cache the content since it'll not reach many end-users. If the telco is a state run (or previously state run) monopoly telco then if you want to make sure your content is delivered in the best quality you'll pay (if you can). The telco could thus be accused of abuse if they are seen to be using a monopoly position to drive ever higher profits through leveraging this sort of technology. It can also be considered an abuse of net neutrality principles by essentially prioritising (biasing) content. Worse still if it's state run then you'll wonder if it's 1984 all over again (the fashion was truly awful!).

Technically I think the idea of telco CDNs is pretty neat and efficient (storage capacity is cheap compared to network capacity). I'd also not want to add directly to the cost of my internet connection to fund the infrastructure to support this so am pleased if someone else is prepared to pay.

Ultimately though we all pay of course and you could argue that this model at least attempts to ensure users of high volume services such as NetFlix pay rather than everyone. However, as with net neutrality concerns in general I wonder when the first public outcry will come... when we discover a telco is prioritising it's own video streaming service over a competitors? when we find the government has been using such methods to intentionally drop "undesirable" content? or when we can't watch East-Enders in HD because the BBC hasn't paid their bill recently?

2014/06/17

Resilient WebSphere Session Management

I've been promising myself that I'll write this short piece sometime and since the football today has been a little sluggish I thought I take timeout from the world cup and get on with it... (you know it won't be short either..).

Creating applications than can scale horizontally is; in theory, pretty simple. Processing must be parallelizable such that the work can be split amongst all member processors and servers in a cluster. Map-reduce is a common pattern implemented to achieve this. Another; even more common, pattern is the simple request-response mechanism of the web. It may not sound like it since each request is typically independent from each other, but from a servers perspective it is arguably an example of parallel processing. Map-reduce handles pre-requisites by breaking jobs down into separate map and reduce tasks (fork and join) and chaining multiple map-reduce jobs. The web implements it's own natural scheduling of requests which must be performed in sequence as a consequence of the wet-ware interacting at a snails pace with the UI. In this case any state needing to be retained between requests is typically held in sessions - in-memory on the server.

Resiliency though is a different issue than scalability.

In map-reduce, if a server fails then the processing task can be restarted on another node. They'll be some repeat work performed as the results of the in-flight task will have been lost (and maybe more) but computers don't much mind doing repetitive tasks and will quite willingly get on with it without much grumbling (ignoring the question of "free will" in computing for the moment).

Humans do mind repeating themselves though (I've wanted to measure my reluctance to repeat tasks over time since I think it's got progressively worse in recent years...).

So how do you not lose a users session state if a server goes down?

Firstly, you're likely going to piss someone off. They'll be some request in mid flight the second the server does down unless you're in maintenance mode and are quiescing the server cleanly. Of course you could not bother with server session state at all and track all data through cookies running back and forth over the network. This isn't very good - lot's of network traffic and not very secure if you need to hold anything the user (or Eve) shouldn't see, or if you're concerned about someone spoofing requests. Sometimes it's viable though...

But really you want a way for the server to handle such failures for you... and with WebSphere Application Server (WAS) there's a few options (see how long it takes me to get to the point!).

==== SCROLL TO HERE IF YOU WANT TO SKIP THE RATTLING ====

The WAS plugin should always be used in front of WAS. The plugin will route requests to the correct downstream app-server based on a clone id tagged on to the end of the session id cookie (JSESSIONID). If the target server is not available (plugin cannot open a connection to the server) then another will be tried. It also means that whatever http server (Apache, IIS, IHS) a request lands on it will be routed to the correct WAS server where the session is held in memory. It's quite configurable for problem determination; on the fly, so well worth becoming friends with.

When the request finally lands on the WAS server then you've essentially three options for how you manage sessions for resiliency.

Local Sessions - Do nothing and all sessions will be held in memory on the local server. In this instance, if the server goes down, you'll lose the session and users will have to login again and repeat any work they've done to date which is held in session (and note; as above, users don't like repeating themselves).

Database persistent sessions - Configure a JDBC source and WAS can store changes to the session in a database (make sure all your objects are serializable). The implementation has several options to optimize for performance over safety and the like but at the end of the day you're writing session information to a database - it can have a significant performance impact and adds another pre-requisite dependency (i.e. a supported, available and resilient database). Requests hitting the original server will find session data available in-memory already. Requests hitting another server will incur a database round trip to fetch session state. As a one-off hit it's tolerable but to avoid repeated DB hits you still want to use the plugin.

Memory to memory replication - Here changes to user sessions are replicated;in the background, between all servers in a cluster. In theory any server could serve requests and the plugin can be ignored but in practice you'll still want requests to go back to the origin to increase the likelihood that the server has the correct state as even memory-memory replication can take some (small) time. There are two modes this can operate in, peer-to-peer (normal) and client-server (where a server operates as a dedicated session state server).

My preference is for peer-to-peer memory-to-memory replication due to performance and cost factors (no additional database required which would also need to be resilient, no dedicated session state server). Details of how you can setup this up are in the WAS Admin Redbook.

Finally, you should always keep the amount of data stored in session objects to a minimum (<4kB) and all objects need to be serializable if you want to replicate or store sessions in a database. Don't store the complete results of a cursor in session for quick access - repeat the query and return only the results you want (using paging to skip through) - and don't store things like database connections in session, it won't work, at least, not for long...

2014/06/12

Cloud Computing Patterns

Found a website today on Cloud Computing Patterns. Bit of a teaser to buy the book really since there's not a great deal of detail on the site itself. Still, a useful inventory of cloud patterns and some nice diagrams showing how things like workload elasticity works and the service models; IaaS, PaaS and SaaS, fit together.

2014/06/05

Hitler uses Git

Hilarious...

https://www.youtube.com/watch?v=CDeG4S-mJts&feature=kp

2014/06/04

Scaling on a budget

Pre-cloud era. You have a decision to make. Do you define your capacity and performance requirements in the belief that you'll build the next top 1000 web-site in the world or start out with the view that you'll likely build a dud which will be lucky to get more than a handful of visits each day?

If the former then you'll need to build your own data-centres (redundant globally distributed data-centres). If the latter then you may as well climb into your grave before you start. But most likely you'll go for something in the middle, or rather at the lower end, something which you can afford.

The problem comes when your site becomes popular. Worse still, when that popularity is temporary. In most cases you'll suffer something like a slashdot effect for a day or so which will knock you out temporarily but could trash your image permanently. If you started at the higher end then your problems have probably become terminal (at least financially) already.

It's a dilemma that every new web-site needs to address.

Post-cloud era. You have a choice - IaaS or PaaS? If you go with infrastructure then you can possibly scale out horizontally by adding more servers when needed. This though is relatively slow to provision* since you need to spin up a new server, install your applications and components, add it to the cluster, configure load-balancing, DNS resiliency and so on. Vertical scaling may be quicker but provides limited additional headroom. And this assumes you designed the application to scale in the first place - if you didn't then chances are probably 1 in 10 that you'll get lucky. On the up side, the IaaS solution gives you the flexibility to do-your-own-thing and your existing legacy applications have a good chance they can be made to run in the cloud this way (everything is relative of course).

If you go with PaaS then you're leveraging (in theory) a platform which has been designed to scale but which constrains your solution design in doing so. Your existing applications have little chance they'll run off-the-shelf (actually, no chance at all really) though if you're lucky some of your libraries may (may!) work depending on compatibility (Google App Engine for Java, Microsoft Azure for .NET for example). The transition is more painful with PaaS but where you gain is in highly elastic scalability at low cost because it's designed into the framework.

IaaS is great (this site runs on it), is flexible with minimal constraints, low cost and can be provisioned quickly (compared to the pre-cloud world).

PaaS provides a more limited set of capabilities at a low price point and constrains how applications can be built so that they scale and co-host with other users applications (introducing multi-tenancy issues).

A mix of these options probably provides the best solution overall depending on individual component requirements and other NFRs (security for example).

Anyway, it traverses the rats maze of my mind today due to relevance in the news... Many Government web-sites have pitiful visitor numbers until they get slashdotted or are placed at #1 on the BBC website - something which happens quite regularly though most of the time the sites get very little traffic - peaky. Todays victim is the Get Safe Online site which collapsed under load - probably as result of the BBC advertising it. For such sites perhaps PaaS is the way forward.

* I can't really believe I'm calling IaaS "slow" given provisioning can be measured in the minutes and hours when previously you'd be talking days, weeks and likely months...

Linux! Champion of Big Data

Big data solutions based on distributed databases such as MongoDB (and Hadoop and others) rely on have very many nodes running in parallel to provide resiliency, performance and scalability.

This is a step up from the "cluster of 2-nodes" model (primary & failover) used for many legacy SQL installations. Such is simply not big enough to support resiliency with the sort of distributed database model NoSQL solutions provide (even if it could scale). For example, you'll need a minimum of x3 nodes just to allow the election of a primary to work in a replicated cluster and more for sharding using MongoDB.

Of course there's a reason why you've chosen a NoSQL solution in the first place - scale - and the choice of horizontal v vertical scaling at these sizes makes sense. This is all good news for Linux since an increase in the number of nodes has costs associated with it which I will likely dictate that Linux will become the OS of choice for such solutions instead of Windows or other UNIX OS's. Commodity hardware will likely be the same for all OS's (bar UNIX's) so the differentiator will be the OS (on price at least).

Of course, if your volumes are low then you can always stick with a SQL database - tried, tested and actually pretty damn good and suited to most problems out there. In many cases SQL should be the default. NoSQL if you're forced to by capacity requirements...

MongoDB Write Concern Performance

MongoDB is a popular NoSQL database which scales to very significant volumes through sharding and can provide resiliency through replication sets. MongoDB doesn't support the sort or transaction isolation that you might get with a more traditional database (read committed, dirty reads etc.) and works at the document level as an atomic transaction (it's either inserted/updated, or it's not) - you cannot have a transaction spanning multiple documents.

What MongoDB does provide is called "Write-Concern" which provides some assurance over whether the transaction was safely written or not.

You can store a document and request "acknowledgement" (or not), whether the document was replicated to any replica-sets (for resiliency), whether the document was written to the transaction log etc. There's a very good article on the details of Write-Concern over on the MongoDB site. Clearly the performance will vary depending on the options chosen and the Java driver supports a wide range of these:

ACKNOWLEDGED

ERRORS_IGNORED

FSYNCED

FSYNC_SAFE

JOURNALED

JOURNAL_SAFE

MAJORITY

NONE

NORMAL

REPLICAS_SAFE

REPLICA_ACKNOWLEDGED

SAFE

UNACKNOWLEDGED

So for a performance comparison I fired up a small 3 node MongoDB cluster (2 database servers, 1 arbitrator) and ran a script to store 100 documents in the database using the various methods available to see what the difference is. The database was cleaned down each time (to zero - so overall is very small).

**WARNING: Performance testing is highly dependent upon the environment in which it is run. These results are based on a dev/test environment running x3 guests on the same host node and may not be representative for you and only exist to provide a comparison. **

The results for all modes are shown below and shows x3 relatively distinct clusters.

[caption id="attachment_163" align="alignnone" width="480"]

Write Concern - All Modes[/caption]

Note: The initial run in all cases incurs a start up cost and hence appears slower than normal. This dissipates quickly though and performance can be seem to improve after this first run.

The slowest of these are FSYNCED, FSYNC_SAFE, JOURNALED and JOURNALED_SAFE (with JOURNALED_SAFE being the slowest).

[caption id="attachment_166" align="alignnone" width="480"]

Write-Concern Cluster 3 - Slowest[/caption]

These options all require the data to be written to disk which explains why they are significantly slower than other options though the contended nature of the test environment likely makes the results appear worse than they would be in a production environment. FSYNC modes are mainly useful for backups and the like so shouldn't be used in code. JOURNALED modes depend on the journal commit interval (default 30 or 100ms) as well as the performance of your disks. Interestingly JOURNAL_SAFE is the supposedly the same as JOURNALED so seems a little odd that I can see a relatively significant reduction in performance consistently.

The second cluster improves performance significantly (from 3.5s overall to 500ms). This group covers the MAJORITY, REPLICAS_SAFE and REPLICAS_ACKNOWLEDGED options.

[caption id="attachment_165" align="alignnone" width="480"]

Write-Concern Cluster 2 - Mid[/caption]

These options are all related to data replication to secondary nodes. REPLICA_ACKNOWLEDGED waits for x2 servers to have stored the data whilst MAJORITY waits for the majority to have stored and in this test since there are only x2 database servers it's unsurprising that the results are similar. As the number of database servers increases then MAJORITY may be safer than REPLICA_ACKNOWLEDGED but will suffer some performance degradation. This though isn't a linearly scaled performance drop since replication will generally occur in parallel. REPLICA_SAFE is supposedly the same as REPLICA_ACKNOWLEDGED and in this instance the results seem to back this up.

The fastest options cover everything else; ACKNOWLEDGED, SAFE, NORMAL, NONE, ERRORS_IGNORED and UNACKNOWLEDGED.

[caption id="attachment_164" align="alignnone" width="480"]

Write-Concern Cluster 1 - Fastest[/caption]

In theory I was expecting SAFE and ACKNOWLEDGED to be similar with NORMAL, NONE, ERRORS_IGNORED and UNACKNOWLEDGED quicker still since this last set shouldn't wait for any acknowledgement from the server - once written to socket then assume all ok. However, the code I used was an older library I developed some time back which returns the object ID once stored. Since this has to read some data back, some sort of acknowledgement is implicit and so unsurprisingly they all perform similarly.

ERRORS_IGNORED and NONE are deprecated and shouldn't be used anymore whilst NORMAL seems an odd name as the default for MonoDB itself is ACKNOWLEDGED!?

In summary. For raw speed ACKNOWLEDGED should do though if you want fire-and-forget then specific code and UNACKNOWLEDGED should be faster still. A performance drop will occur if you want the assurance that the data has been replicated to another server via REPLICA_ACKNOWLEDGED and this will depend on your network performance and locations so is worth testing for your specific needs. Finally, if you want to know it's gone to disk then it's slower still with the JOURNALED option, especially if you've contention on the disks as I do. For the truly paranoid there should be a REPLICA_JOURNALED option which would confirm both replicated and journaled.

Finally, if you insist on a replica acknowledging as well then it needs to be online and your code may hang if a replica is not available. If you've lots of replicas then this may be acceptable but if you've only 1 (as in this test case) then it's bad enough to bring the application down immediately.

Google Maps Usability

Is it me or is the recently revamped Google Maps not as usable as the old one? They've maxed out the map; which is nice, but the menu items are now so obscure you'd not know where they are without flipping endlessly around the map (it's worse on the Android version IMHO).

For example, I search for a location, find it, right click, select "directions to here" and it provides directions from that location to itself!? Well ok, I can kind of understand why but it's obviously not what I meant... So I right click on home and it says "What's here?", not "directions from here". To do that I need to type (wtf!) my start location into the drop-down on the left of the screen. Not unusable just irritating...

Plus the search menu overlaps the map which can be irritating on smaller screens and seems to slide up-and-down at will as you drag around the map - again, irritating... (but still pretty much the best map service out there...).

2014/06/02

WebSockets at The Red Lion

More of a functional one this... I'm working on a personal project at the moment related to some automated diagramming tools I developed some time back. This calls for inter browser communication (between browsers running on different clients) for which web-sockets are the logical choice in these delightful¹ HTML5 days. I've developed a prototype to test out web-sockets in Java using the same connection for both commands (messages between client and server) and for inter browsers comms (messages between clients).

It's only a prototype to test that messaging works but if you head over to The Red Lion² you may be able to join me online for a virtual pint.

Developing web-socket applications in Java is relatively straightforward though the examples online tend to forget about thread blocking by sending messages to clients in sequence (one client blocking could cause delays to other users). The code I've developed avoids this via use of Java concurrency libraries which is fine except that it violates JEE rules by spawning it's own threads. It's currently limited to a single server and doesn't support a clustered environment which would require either a persistent datastore (less than desirable) or server-server comms (which I'd prefer). I did some test on the LAN to pump messages pretty quickly through the tiny Atom server I have running and it seemed to hold up to the task. Interestingly Chrome started to lock up though IE, FF and Safari seemed to cope. I should try and get some stats on this.

As a prototype there is no authentication (feel free to lie) and no logging of message content (you'll have to trust me on that one or check out the code).

In case you're interested, the code is on github.com, no warranties (as ever).

1 If you've developed in a pre-HTML5 world then you'll know what I mean... If you're still having to support non HTML5 compliant browsers then.. sorry :(
2 The Red Lion is the most common pub name in the UK.

nonfunctionalarchitect.com