It seems to me that we are seeing an increasing number of issues such as this one reported by the Guardian. A lost transaction results in a credit default being recorded against an individual, with the result that they cannot obtain a mortgage to buy a house. Small error for the company, huge impact for the individual.
The company admitted that despite the request being submitted on their website they did not receive the request!? So either the user pressed submit then walked away without noting the response was something other than "all ok!" or the response was "all ok!" and the company failed to process the request correctly.
If the former then, well, user error for being a muppet... As end users we all need to accept some responsibility and check that we get the feedback we expect.
For the latter, there are several reasons why subsequent processing could have failed: poor transaction management, so the request never gets committed; poor process management, so the request drops into some dead queue never to be dealt with (either through incompetence or through malicious intent); or a system failure and the need to roll back, with resulting data loss.
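To make the first of those failure modes concrete, here is a minimal sketch of only reporting "all ok!" once the request has actually been committed. It uses Python's built-in `sqlite3` module purely for illustration; the `requests` table, the `submit_request` function and the response strings are my own assumptions, not anything known about the company's system.

```python
import sqlite3


def submit_request(conn: sqlite3.Connection, customer_id: str, payload: str) -> str:
    """Store a request and report success only after the commit has succeeded."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "INSERT INTO requests (customer_id, payload) VALUES (?, ?)",
                (customer_id, payload),
            )
    except sqlite3.Error:
        # Nothing was stored, so tell the user rather than pretend otherwise.
        return "failed, please try again"
    return "all ok!"


# Hypothetical schema, in memory, just for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (customer_id TEXT NOT NULL, payload TEXT NOT NULL)"
)
```

The point of the design is simply that the response the user sees is derived from the outcome of the commit, not from the fact that the form was posted.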
With the growth in IT over the past couple of decades there are bound to be some quality issues as a result of ever shorter deadlines and ever tighter budgets. Time and effort need to be spent exploring the hypothetical space of what could go wrong, so that at least some conscious awareness and acceptance of the risks is achieved. Provided that has happened, I'm usually quite happy to be overruled by the customer on the basis of cost and time pressures.
However, it's often not expensive to put in place some logging and monitoring - and in this case there must have been something for the company to admit the request had been submitted. Web logs, application logs, database logs etc. are all valuable sources of information when doing problem determination (PD). You do, though, need to spend at least some time reviewing and auditing these so you can identify issues and deal with them accordingly.
I remember one case where a code change was rushed out which never actually committed any transaction. Fortunately we had the safety net of some judicious logging which allowed us to recover and replay the transactions. WARNING: It worked here but this isn't always a good idea!
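As a rough sketch of what that kind of safety net might look like (this is not the code we actually used; the log format and the names `log_request` and `replay_missing` are invented for illustration), each incoming request is written to an append-only log before processing, and anything that never reached the database can be replayed later:

```python
import json
from typing import Callable


def log_request(log_lines: list[str], request_id: str, payload: dict) -> None:
    """Append the raw request to the log before any processing is attempted."""
    log_lines.append(json.dumps({"id": request_id, "payload": payload}))


def replay_missing(
    log_lines: list[str],
    committed_ids: set[str],
    handler: Callable[[dict], None],
) -> list[str]:
    """Re-run the handler for every logged request that was never committed."""
    replayed = []
    for line in log_lines:
        entry = json.loads(line)
        if entry["id"] in committed_ids:
            continue  # already in the database, replaying would duplicate it
        handler(entry["payload"])
        committed_ids.add(entry["id"])
        replayed.append(entry["id"])
    return replayed
```

A list stands in for the log file here to keep the example self-contained. The reason for the warning is visible in the sketch: replay is only safe if you can reliably tell which requests were already committed, and if re-running the handler later has no side effects that depend on when it runs.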
In general though, logging and monitoring are a very good idea. In some cases system defects will be identified, in others, transient issues will be found which may require further work to deal with them temporarily. Whatever the underlying issue it's important to incorporate feedback and quality controls into the design of systems to identify problems before they become disasters. At a rudimentary level logging can help with this but you need to close the feedback loop through active monitoring with processes in place to deal with incidents when they arise. It really shouldn't just be something that we only do when the customer complains.
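Closing the loop can be as simple as a scheduled check that compares what was accepted with what was actually processed. The following is a minimal sketch under assumed data shapes (a mapping of request id to acceptance time, and a set of processed ids); the function name `find_stuck_requests` is hypothetical:

```python
from datetime import datetime, timedelta


def find_stuck_requests(
    accepted: dict[str, datetime],
    processed: set[str],
    now: datetime,
    max_age: timedelta,
) -> list[str]:
    """Return ids of requests accepted more than max_age ago but never processed."""
    return sorted(
        request_id
        for request_id, accepted_at in accepted.items()
        if request_id not in processed and now - accepted_at > max_age
    )
```

Anything this returns should raise an incident for someone to look at, rather than sit in a report nobody reads.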
I don't know the detail of what happened in this case. It could have been user error, or we could applaud the company for having logging in place, or they could just have got lucky. In any case, we need feedback on how systems and processes are performing in order to deal with issues when they arise, and to improve quality, and indeed the business value of the system itself, through continuous improvement.