2014/07/05

Data Currency and Exploding Bunnies

There is such a thing as data currency - i.e. how current and up to date the data is. On the web, stale data is a social disease which deservedly leads to isolation and the irritating and distant tut-tutting that goes with it - much like zits on a greasy teenager. We're all culpable (me in particular, given the last time I updated the mighty stellarmap.com) but I expect more from The Guardian. So... tut-tut to The Guardian for promoting UK Gov Data from 2010 on their Data home-page today. There was me thinking I'd happen across some interesting nuggets, only to find old, stale and consequently misleading data.

Of course I'm not helping by providing a bunch of links to stale content myself but it's time more attention was paid to data currency by net publishers. Perhaps then this wouldn't be such a big problem.

Screen grab below for the hell of it...

[Screen grab: gdata]

From a non-functional perspective, an important criterion for data is how current it needs to be. Various caching strategies may, or may not, be viable depending on how critical this is. If you're buying stock then you want the correct price now; if you're browsing the news then perhaps anything from today is sufficient. This affects how deep into your system the transactional tentacles reach and the resources that need to be committed to address this. However, the issue on The Guardian site more likely relates to the algorithms that promote data and how these are either insensitive to the element of time, dealing with data of low velocity (which it isn't in this case), or sensitive (and who wouldn't be?) to the internet-scale viral effect that causes excessive temporary popularity. Perhaps content providers need to start saying, "ok, we know it's a funny video of a cat stuck in a washing machine but that was so 2005 and welcome to 2015, so here's a fully interactive 3d experience of a bunny playing with grenades instead"...
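To make that a bit more concrete, here's a rough sketch in Python of the two ideas above: a freshness budget that differs by kind of data, and a popularity score that fades with age. None of this is how The Guardian (or anyone else) actually does it. The names MAX_AGE, is_fresh and decayed_score, the budgets and the 30-day half-life are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness budgets per kind of data (illustrative values only).
MAX_AGE = {
    "stock_price": timedelta(seconds=1),  # you want the price now
    "news": timedelta(days=1),            # today is probably good enough
}


def is_fresh(kind, fetched_at, now):
    """True if a cached item of this kind is still current enough to serve."""
    return now - fetched_at <= MAX_AGE[kind]


def decayed_score(views, published_at, now, half_life_days=30.0):
    """Popularity discounted by age: the score halves every half_life_days."""
    age_days = (now - published_at).total_seconds() / 86400.0
    return views * 0.5 ** (age_days / half_life_days)


now = datetime(2014, 7, 5, tzinfo=timezone.utc)

# A four-year-old viral hit versus a modest item from a few days ago.
old_hit = decayed_score(1_000_000, datetime(2010, 7, 5, tzinfo=timezone.utc), now)
new_item = decayed_score(5_000, datetime(2014, 7, 1, tzinfo=timezone.utc), now)
```

With a half-life like this, the modest recent item outranks the old viral one, which is the whole point. The right half-life presumably depends on the velocity of the data: census figures can age gracefully, whereas cat videos apparently cannot.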

