I don't necessarily think it's a bad thing that your offline and online identity are intertwined - at the naive level that the Zuckerberg marketing machine operates it sounds fair enough - and, ultimately, it's true: most of us are the same person online as offline, physically if not behaviourally.

However, one of the reasons the internet is so liberating is precisely because you can maintain a number of alter-egos; you too can be a warrior at the weekend! It also forces the innate prejudices we have to be put to one side due to the historic interaction limitations that existed on the net - that a 15-year-old geek can stand as an authority on something online where they'd be laughed off stage offline is evidence of this.

Zuckerberg and those of his ilk are riding on the back of this wave. They were born into a time where anonymity online was the norm and created the likes of fb to capture this herd of anonymous sheep desperate for somewhere to mingle and conjoin with friends and other like-minded folk. But they've gradually lifted the shroud of privacy and pushed our online and offline selves together, not for any philosophical ideology but to drive bigger and bigger profits by selling this data on for advertising.

What's worse, they're destroying the historical limits of the physical world (as it was) where a rant in a pub, a one-night stand or an off-the-cuff comment on the state of your boss's hygiene could be forgotten in short order and in any case was unlikely to reach the ears of any more than a few dozen people. Now, any transgression, no matter how minor, is likely to be recorded for the next 100 years and available for anyone - at a price.

It's a sad day when the reason the net has been so successful for mankind (in part) is so easily being eroded away without any significant objection being raised. It's worse though when who we are as human beings is being abused for the sake of profit with no political will to stand against it. There will be a back-lash at some point, the only question is how much we are prepared to lose in the meantime.

Rant triggered by Guardian podcast - Founder of 4Chan Chris Poole, the 'anti-Zuckerberg'.



Chief Muppet

Is it me or has there been a radical explosion in the title "Chief" recently? CEO, CFO, CIO, CTO, I can get this (kind of)... But isn't the point that such a role is, well, the chief? Like the president being commander-in-chief?

So we dilute the office (Executive, Financial, Information, Technology...) and you're "chief" of your office... But ok, whatever, you need a little viagra to stimulate these guys.

We then have chief architect (period) which, I admit, in some cases was a role I had some respect for. But now I'm seeing chief-architect-of-xxx (where xxx is some random project spawned the morning after a particularly heavy drinking session). You're not the chief, you're a muppet for believing the title has any bearing on your status. The only effect that title has is to make the CEO's feet go cold when he realises his veil of authority is slowly eroding away, and for the minions you supposedly lead to think you're a bit of a dick.

So I've decided to bypass this faux "chief" thing and skip it, going straight for... Master-of-the-Universe!

Now I'm just waiting for someone to title themselves "God of Pocket Calculators!"... and the circle is complete, as Dylan said, everybody "Gotta serve somebody" (and yes, I know it's a Willie Nelson cover!).

"They may call you Doctor or they may call you Chief ... But you're gonna have to serve somebody"...



Welcome to the Future!

The security police are starting to crawl out of their bunkers once more to shout and scream about the mess that has been left behind by the enthusiastic but naive avant-garde of technology - the Internet of Things. Steven M. (I'm sure he has a longer surname somewhere) has a post over at LinkedIn on this titled Should I care about the Internet of Things?

Depending on your viewpoint, advocates of IoT are either leading us into a bright new shiny future where everything talks to everything else in a glossy-white plastic world where robots beep-beep their way to satisfy your every whim, fridges restock themselves before you know you've run out of milk and cars drive themselves 6 inches apart on highways made of strawberry milkshake, or, a world where killer robots are roaming free, terrorists are randomly restocking your fridge with tins of baked-beans (I really don't like baked beans) and cars spontaneously crash of their own volition, polluting the otherwise natural beauty that is the strawberry milkshake highway (which is a given fact the future holds for us!). I may have got some of that muddled up...

Anyway, as with Steven M. also, analogies with the biological world abound (and here I'll no doubt get another complaint from my brother-in-law) and I have to say I'm kind of attracted to the idea of self-organising, self-replicating (in a virtual sense at least) systems that learn and are ready to do our bidding. It will though expose us to new attack vectors we've barely begun to dream up - shit will happen my friends!

Personally I think we'll nail the sort of security issues that plague us today - code injection attacks, poor data encryption and access control policies etc. - but we're just not thinking about future issues in the right way; mainly because futurologists are full of shit (generally at least).

So if we take the biological analogy a few steps too far then what sort of risks will we face? Positive feedback leading to runaway processes? Sophisticated virtual bacteria roaming the net, consuming all resources in their path? Today it's all too easy to spin up a few hundred machines in the cloud to initiate a DoS attack - brutal but often effective and with few options to remedy - but this stuff will become so elaborate and complex in its integration as to become something of a wonder in itself. And the less we say about it becoming conscious the better (even though I reckon that's where we're headed...). Like I said, futurologists are full of shit...

Should we care about IoT? Definitely! It's cool stuff, it's going to happen but it's not going to be as awesome as the sales guy says. The hype is over and it's time to suck-up some negativity and let reality sink in.

And mark my word! The day will come when you have to book an appointment with the doc because your house has come down with some nasty bug, is vomiting sewage into the hallway, spraying water all over the kitchen and it's 45°C inside as the central heating goes into overdrive... Welcome to the future - as shit tomorrow as it is today!


Computation v Security: Encryption and Hashing

As an aside on the previous post on computation and security requirements I thought I'd add a note on an obvious omission: encryption and hashing...

Tricks like encryption and hashing aren't really applicable to computation security requirements even though they are computations themselves. Encryption and hashing are more applicable to transport (connections) and storage (data). It's nigh-on impossible to do any computation on encrypted data so you generally need it in plain-text form (homomorphic encryption aside since it's not really market ready yet).

Hashing is a useful tool in so many cases but is increasingly becoming overused. The compute power available today, especially in the cloud, means it's relatively easy for someone to create a lookup database of all words in hashed form. This can then be used to identify user passwords, for example. You can salt the hash to make it more distinct but this then means you need to manage the salt, and likely change it from time to time for the same reason you change encryption keys. Forcing longer and more complex passwords helps (well, maybe, that's another debate) but with the compute power available in 5 years' time it may well be pointless and alternative forms of identification will be needed (if they aren't already).
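To make the lookup-database point concrete, here's a rough sketch in Python. The word list and function names are made up for illustration, and the use of PBKDF2 with a random per-password salt is my choice for the example rather than the only way to salt a hash.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def unsalted_hash(password: str) -> str:
    """Plain SHA-256 of the password: the same input always gives the same hash."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def build_lookup(words):
    """Precompute hash -> word for a word list (the 'lookup database')."""
    return {unsalted_hash(word): word for word in words}


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 200_000) -> Tuple[bytes, bytes]:
    """Salted, deliberately slow hash. Returns (salt, digest); both get stored."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 200_000) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

With the unsalted version, `build_lookup(["password", "letmein"])` reverses any stored hash of those words in a single dictionary lookup. With the salted version the same password produces a different digest each time, so a precomputed table is of no use and the attacker has to grind through guesses per user.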

Using hashes to obscure data such as postcodes or dates is even less worthy as the number of hashes you need to create is limited and can be computed in seconds on a modern computer. Date-of-birth for example is limited to say 100 years * 365 days (roughly 36,500) worth of hashes. A particularly determined attacker could even look at the distribution of these hashes to determine that it's date data even if it's not labelled as such.
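The date-of-birth case is easy to demonstrate. This sketch assumes the dates were hashed as unsalted SHA-256 of an ISO formatted string (YYYY-MM-DD); that format, and the function names, are my assumptions for illustration.

```python
import hashlib
from datetime import date, timedelta


def hash_value(text: str) -> str:
    """Unsalted SHA-256, standing in for whatever hash was used to 'obscure' the data."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_date_table(start: date, end: date) -> dict:
    """Hash every date from start to end inclusive and map hash -> date."""
    table = {}
    day = start
    while day <= end:
        table[hash_value(day.isoformat())] = day
        day += timedelta(days=1)
    return table


def recover_date(hashed: str, table: dict):
    """Return the original date for a hash, or None if it is outside the table."""
    return table.get(hashed)
```

Building the table for 1925 to 2024 means hashing a few tens of thousands of strings, which is nothing for a modern machine. After that, `recover_date(hash_value("1980-07-04"), table)` hands back the date straight away.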

Encryption and hashing are useful for data transfer and persistence but, whilst they're clearly computational tasks themselves, they're not generally requirements for computational components currently.

Computation v Security

Last month I wrote a piece about computation, data and connections with a view to starting to list out some considerations for each of these with respect to non-functionals... This is part one on computation and security.

In terms of computation, we're talking about code that does stuff: the stuff that performs the logical processing on the data and uses those connections.

From a security perspective we're primarily concerned with access control. Conceptually this is a question of who is allowed to do what, where, when and how.

  • Who - in essence covering authentication and identification. The who may be human or system. There are many ways to authenticate users and I would strongly advise you use an off-the-shelf component. Most application servers (JEE or .NET) will have built in ways to authenticate users against LDAP or AD etc. These will have been tested for security (penetration testing) and will be more secure than any home-grown solution.

  • What - authorisation to execute the functionality provided (e.g. create-order, send-email, press-the-big-red-button). Again, there are lots of standardised ways to check authorisation in JEE and .NET application servers which should be used. Manual/custom code checks such as "isUserInRole()" can be bypassed if someone gets access to the code (a notable issue with client side JavaScript for example).

  • Where - more access-control - where the code is located. Note that wherever this is we need to question how much you trust that location. Does it require physical security? Is it acceptable to run in the user's browser (e.g. JavaScript)? Should it be in a DMZ or a more tightly controlled security zone? If we assume that it gets compromised*, then what? Does this allow unauthorised access?

  • When - there's a touch point here with availability requirements. A lot of code runs 24x7 but many batch-jobs run on a daily, weekly or monthly schedule. Should these be allowed to run outside of predefined windows and what is the impact if they do?

  • How - this is the code itself which is protected through language choice, frameworks, access to source-control systems, code-reviews and secure development practices. Scripted languages suffer from the potential to be hijacked rather easily; compiled code is harder to subvert since it's unlikely the source-code and compilers are available on the production system (at least they shouldn't be!).
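Purely to illustrate the "what" and "when" questions above, here's a small Python sketch of a server-side check. The roles, actions, job name and time window are all hypothetical, and in practice you'd lean on the authorisation built into your application server rather than roll your own like this.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Principal:
    """An already authenticated caller (human or system) and its roles."""
    name: str
    roles: frozenset


# Hypothetical policy: which roles may perform which action.
PERMISSIONS = {
    "create-order": {"sales", "admin"},
    "press-the-big-red-button": {"admin"},
}

# Hypothetical schedule: the window in which each batch job may run.
BATCH_WINDOWS = {
    "nightly-settlement": (time(1, 0), time(3, 0)),
}


def is_authorised(principal: Principal, action: str) -> bool:
    """'What': deny by default, allow only if a role is explicitly listed."""
    allowed_roles = PERMISSIONS.get(action, set())
    return bool(allowed_roles & principal.roles)


def in_window(job: str, now: datetime) -> bool:
    """'When': a job may only run inside its predefined window."""
    window = BATCH_WINDOWS.get(job)
    if window is None:
        return False  # unknown jobs never run
    start, end = window
    return start <= now.time() < end
```

The main design choice is deny-by-default: an action or job that nobody thought to list is refused rather than waved through. The checks also live on the server, since anything running in the user's browser can be edited by the user.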

Secure Development

Secure development is a minefield and new vulnerabilities are found all the time. Keep an eye on the OWASP top ten for the most common issues.

Things like SQL injection (or XML**, JS or anything else injection) attacks are normally top of the list. These result from code which is simply concatenating strings together, some of which are supplied by the end user, to form a SQL statement which is thrown at a database. Submitting something like "; drop database;" into a request may not be the best thing. Perhaps worse still an attacker can use these methods to query the database structure and retrieve or modify data they should not be allowed access to.

A common fix for this, in the case of SQL, is to use bind variables and prepared statements (which can also show performance improvements). This will result in the dodgy data being treated as a plain value within the query/insert, where it may look a little odd in the database but should do less damage. You should also scan any parameters for suspect characters since this avoids storing such nonsense, but in itself this isn't a great solution as it leaves you at the mercy of your developers and any failings they have (and don't we all).
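The difference between concatenation and bind variables is easiest to see side by side. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the table, column and function names are invented for the example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_unsafe(conn: sqlite3.Connection, name: str):
    """Vulnerable: user input is concatenated straight into the SQL text."""
    sql = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_safe(conn: sqlite3.Connection, name: str):
    """Bind variable: the input is passed as a value and never parsed as SQL."""
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()
```

Feed both functions the input `x' OR '1'='1`. The unsafe version ends up running `WHERE name = 'x' OR '1'='1'` and returns every user's secret. The safe version simply looks for a user with that rather odd name and finds nobody.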

Other secure development concerns are:

  • Data validation. Check every parameter is of the correct type, expected range and form (e.g. see above re SQL injection). This often means code at both ends of a web-page (i.e. in JavaScript for user-friendly solutions, and in the server code in case the JS has been hacked). You may choose to trust code within a container as this should have been checked at compile time, but anything at two ends of a connection should be checked, and with interpreted code you may want to be paranoid if you don't have good control over the environment.

  • Escaping strings. XSS attacks work by allowing attackers to insert JS and HTML into someone else's browser through your website. The impact can vary from being a minor nuisance to allowing someone to capture personal data or submit requests on behalf of the user unwittingly. Escaping strings will result in the JS/HTML appearing as simple text on the page. If you do want to allow some HTML then you'll need more elaborate parsing.

  • Avoiding buffer overflows. Many modern languages such as Java and C# help you avoid these through their own memory management, though you need to ensure you keep these frameworks patched and up-to-date. If you code in lower level languages and do your own memory management then you may want to consider the impact if access is granted to memory unintentionally. Validating data and ranges (above) also helps avoid some of these issues.

  • Trapping exceptions and dealing with these effectively, including those that should never happen. I'm a fan of having code which simply throws a complete hissy-fit in the event these things happen. Just bomb out in as brutal a way as is acceptable for simplicity's sake, as it's usually symptomatic of something more serious, though you need to ensure you log it.

  • Log events and exceptions, record the state as much as is reasonable for later analysis (don't log passwords!) and don't expose this data to the end user. A full stack trace in the browser may be useful during development but just exposes the inner workings of your system to hackers - a useful way to identify potential weak spots.

  • Audit. Where you want to prove traceability then record audit logs of who did what, when, and where. When you get attacked this will help trace back to an IP address, a machine or a user who initiated it. There is then the question of where audit logs are kept and ensuring these can't be subverted; otherwise, if you need to use them in a court of law at some stage, you'll not be able to demonstrate that the logs themselves haven't been tampered with (i.e. non-repudiation).
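As a quick illustration of the data validation and string escaping points in the list above, here's a minimal server-side sketch in Python. The parameter (an order quantity), its limits and the function names are hypothetical; the escaping uses the standard library's html.escape.

```python
import html


def parse_quantity(raw: str, minimum: int = 1, maximum: int = 100) -> int:
    """Validate type and range of a request parameter before using it."""
    if not isinstance(raw, str):
        raise ValueError("quantity must be supplied as text")
    try:
        value = int(raw)
    except ValueError:
        raise ValueError("quantity must be a whole number") from None
    if not minimum <= value <= maximum:
        raise ValueError(f"quantity must be between {minimum} and {maximum}")
    return value


def render_comment(comment: str) -> str:
    """Escape user-supplied text so any markup shows up as plain text on the page."""
    return "<p>" + html.escape(comment, quote=True) + "</p>"
```

`parse_quantity("5")` gives back the integer 5, while `"5; drop database;"` or `"100000"` is rejected before it gets anywhere near the database. `render_comment("<script>alert(1)</script>")` produces `<p>&lt;script&gt;alert(1)&lt;/script&gt;</p>`, which the browser displays as text rather than running.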


There is a very (very very very) strong "buy" argument for security. Most commercial security solutions will be paranoid. They'll have been tested and attacked aggressively to ensure they are secure and should have some sort of formal security accreditation. The supplier should also be responsive to new attack vectors and aware of those which your developers never thought of, such as filtering out TRACK/TRACE HTTP methods, scanning requests for code injection, limiting data volumes to avoid buffer-overflows etc. If you create your own security solution then your only real security is security through obscurity. With a commercial option you'll get better security, adherence to standards and another layer of security between your code and attackers which can act as a safety net in the event of issues elsewhere.

There are lots of options in terms of frameworks, reverse proxies, intrusion detection, bastion servers, firewalls, anti-virus, directory services etc. which all help to complete the shield and provide protection.

Ultimately, security requirements in relation to computation are concerned with access-control, identification, authentication, authorisation, auditing, and the adoption of good secure development practices.

* I'm always amazed at the reaction of many developers to the risk of something being compromised. They often assume it won't happen and come up with all sorts of shallow arguments such as "it won't happen, the box is safe under my desk". The point of asking these questions is to step through the argument and consider "what if"? What if a malicious colleague got access to the box? What if the box was stolen from the office (I have seen this happen more than once)? etc. If the rationale is sound and/or the impact of it happening is low then perhaps it's acceptable; if not then something needs to change.

** There was a case I heard of regarding a piece of XSL which would send a transformer into overdrive creating some XML which was many, many GBs in size. This was apparently a very small piece of code but the result would be a collapse of the server. It is quite likely some smart cookie out there will work out a way to use whatever language you're using against you at some time.

Voyaging dwarves riding phantom eagles

It's been said before... the only two difficult things in computing are naming things and cache invalidation... or naming things and som...