Spreadsheet risk

I’m currently writing a course for a client to educate their finance staff about some of the risks involved in using spreadsheets.

The key organisation in this area is Eusprig. I have spoken there a couple of times, and their web site holds a wealth of information and links onward to most of the important work in the area.

The most widely referenced author is Ray Panko, and I was ratching around his web site here for material for the course. One thing jumped out: virtually all of the most recent studies found material errors in 100% of the spreadsheets tested.

He discusses 3 types of (quantitative) errors:

  1. Mechanical – like pointing to the wrong cell
  2. Logical – developer has flawed reasoning
  3. Omission – something important got left out

What is interesting about this list (to me, anyway!) is that a spreadsheet auditing tool can only really help with the first type; mine, which is more of a risk assessment tool than an auditor, is over at www.xlanalyst.net. And the suggestion seems to be that this type isn’t actually that common.
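
To make that concrete, here is a rough sketch in VBA (nothing to do with XLAnalyst’s actual code, and deliberately simplistic) of the kind of mechanical check a tool can automate: flagging formulas that refer to blank cells, which is often a sign that someone has pointed at the wrong cell.

    Sub FlagBlankPrecedents()
        ' Walk every formula on the active sheet and report formulas that
        ' reference blank cells - a common symptom of a mechanical (wrong-cell) error.
        ' Assumes the sheet has at least one formula; DirectPrecedents only sees
        ' same-sheet references.
        Dim cell As Range, precs As Range, prec As Range
        Dim report As String

        For Each cell In ActiveSheet.UsedRange.SpecialCells(xlCellTypeFormulas)
            Set precs = Nothing
            On Error Resume Next            ' DirectPrecedents errors if there are none
            Set precs = cell.DirectPrecedents
            On Error GoTo 0
            If Not precs Is Nothing Then
                For Each prec In precs
                    If IsEmpty(prec.Value) Then
                        report = report & cell.Address(False, False) & _
                                 " refers to blank cell " & prec.Address(False, False) & vbNewLine
                    End If
                Next prec
            End If
        Next cell

        If Len(report) > 0 Then
            MsgBox report, vbExclamation, "Possible mechanical errors"
        Else
            MsgBox "No blank-cell references found", vbInformation
        End If
    End Sub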

Omission errors are notoriously difficult to discover, throughout all software development and probably beyond. After all, nothing that is actually there is wrong; the problem is what is missing.

Logic errors often have a domain-specific aspect, such that, if you have limited knowledge of the spreadsheet’s problem domain, you have a severely limited chance of discovering a logical error.

This leads me to a couple of thoughts:

  • The first, slightly cynical one: I wonder how these researchers managed to hit 100% error rates.
  • The second is that good, easy-to-follow designs and high build quality are key factors in reducing error rates in commercial spreadsheets.

I do a lot of maintenance and support, often of work that was written by other people. The stuff with a poor design breaks more often, costs an awful lot more to fix when it’s broken, and costs a lot more to modify for changing business needs.

I do believe, therefore, that the phrase ‘quality is free’ is true when applied to the lifetime of an important spreadsheet.

cheers

Simon

10 Responses to “Spreadsheet risk”

  1. Dennis Wallentin Says:

    Simon,

    We can agree that there exist a great number of errors, although I also question the 100% figure!

    When You do the maintenance work, do You notice any differences between solutions created in-house and solutions created by external consultants?

    In my experience, #2 in the list can be reduced by using a written specification. But this is rarely done in internal projects, and only larger corporates use it with external consultants.

    As for #1, some of it can be reduced via test subprocesses. Again, this is done more frequently in projects where external consultants are involved.

    The list should also include one additional item, and that is poor data quality. The phrase “garbage in, garbage out” is still valid…

    Kind regards,
    Dennis

  2. Ross Says:

    I remember when I first joined Eusprig there was a discussion about an Excel model used to plan a Nuclear power plant.

    People were like, no way, you could never use Excel to do that, it’s not robust enough. I made two points:
    1. It was a model, not a control programme.
    2. Any software, whether Excel, VBA, or C, is just as prone to errors as anything else.

    You may not have the ability to control as many of the errors with VBA/VB as you do with C, but you could make the same argument for C versus machine language, and machine language versus the hardware beneath it. More importantly, stuff developed with C follows a much tighter development framework: it’s tested more thoroughly (alpha, beta, black box, white box, etc.) and it takes much longer too. If spreadsheets were developed like this then they could stand alongside C! That’s the point. The problem is not with the spreadsheet itself; the three errors above could stand for any code. The real problem is with the way spreadsheets are used. Moreover, spreadsheet risk has become a big area of interest in recent years because of the value of the items they now process, and of course Enron!

    Risk is reduced when the decision is made to develop a spreadsheet properly and give it the support it needs to be a quality tool. The major problem is that spreadsheets are seen as quick, simple and easy, so the expectations are too high. Managers don’t think about errors, they think about outputs, and developers/power users are keen to try and meet that expectation. The problem is that when the error is discovered it’s often too late!

  3. Simon Murphy Says:

    Dennis you are right about data quality being an issue.
    In terms of spreadsheet quality, most of what I see is poor, internal or external, and you are right, it is a lack of any sort of process that is a major cause. I always have a design review of the hand-drawn, paper-based design, whether I am the developer, mentor or customer.

    Ross, I made a similar point last year, I think, when I sneaked a mini rant into my presentation. The issue is not technical, it’s cultural, based on the unrealistic expectations of the customer and the would-be developer. There are some specific issues that make spreadsheets more risky than other tools (e.g. they are designed to be flexible, not robust).
    It’s good that Eusprig is focused on spreadsheets, but we mustn’t forget the other tools and issues.
    cheers
    Simon

  4. Marcus Says:

    The other risk I’ve seen a lot of is versionitis. A business user wants to make their own changes to a spreadsheet, so they’ll copy the file and make their amendments; this could be anything from business logic to the colour scheme. Pretty soon there are multiple versions of this spreadsheet model floating around and no-one knows which is the ‘one version of the truth’. I don’t know which risk bucket you’d want to categorise this under.

    My approach to counter this is not to use spreadsheet files explicitly but Excel add-ins (COM and native), with the underlying data stored in a relational database.

    > The primary add-in creates a menu and copies down the remaining add-ins (business logic, UserForms) on demand from a file server, ensuring each person is using the most recent version each time they start Excel.

    > Updates are easy – simply copy an updated add-in file to the server. The next time a user starts Excel they’ll have the most recent version.

    > As the data is in one location it’s easier to maintain one version of the truth. It’s also easier to control who has access to view or alter the data by comparing their logon id against a privileges table.
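
    In outline the loader add-in does something like this at startup (the share path and file names below are invented for the example; a real version would also handle add-ins that are already loaded):

        Sub RefreshAddIns()
            ' Copy each business add-in down from the file server, then load the
            ' local copy, so every user gets the latest version at each Excel start.
            Dim serverPath As String, localPath As String
            Dim names As Variant, i As Long

            serverPath = "\\fileserver\excel\addins\"       ' hypothetical share
            localPath = Environ$("APPDATA") & "\OurAddIns\"
            names = Array("BusinessLogic.xla", "ReportForms.xla")

            If Dir(localPath, vbDirectory) = "" Then MkDir localPath

            For i = LBound(names) To UBound(names)
                FileCopy serverPath & names(i), localPath & names(i)   ' always take the server copy
                Workbooks.Open localPath & names(i)                    ' .xla opens hidden, as an add-in
            Next i
        End Sub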

    How has anyone else countered (user instigated) versionitis?

  5. Simon Murphy Says:

    Spot on Marcus
    Reconciling these multiple versions must be one of the biggest drains on those orgs that have not invested in some decent data stores (and some that have).
    Your approach sounds good. I just proposed something very similar to a client, more to do with managing the versions of VBA code in ‘smart’ spreadsheets than cell-based numbers, but the exact same principle.

    One of the biggest problems is the lack of any decent diffing tools that let you instantly tell which parts of a workbook (and its code) are identical and which aren’t. I always end up writing my own rough-and-ready thing.
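
    Something like this rough-and-ready comparison is what I mean (the workbook and sheet names are just for the example; a real diff also needs to catch cells that only exist in the second version):

        Sub DiffFormulas()
            ' List cells whose formulas differ between the same-named sheet
            ' in two open workbooks, written to the Immediate window.
            Dim wsA As Worksheet, wsB As Worksheet
            Dim cell As Range, other As Range

            Set wsA = Workbooks("Budget_v1.xls").Worksheets("Model")
            Set wsB = Workbooks("Budget_v2.xls").Worksheets("Model")

            For Each cell In wsA.UsedRange
                Set other = wsB.Range(cell.Address)
                If cell.Formula <> other.Formula Then
                    Debug.Print cell.Address(False, False) & ": " & _
                                cell.Formula & "   <>   " & other.Formula
                End If
            Next cell
        End Sub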
    Anyone else see this versionitis as a big issue?
    cheers
    Simon

  6. Bob Phillips Says:

    Interesting thought Simon, but 100%? I admit that it is hard to think of a single spreadsheet that I have worked on in which I haven’t found errors of some sort (including those that I originally constructed), but I am sceptical that it is 100%. Could it be that only very complex models are being assessed, hence the chances of being 100% correct are slight?

    As a small aside, in my days as a systems programmer, we once got a patch for a system routine from ICL. When we opened up the code to apply this patch, we saw that the original routine was 5 lines of code. Five lines, and a bug!

    Anyway, I raised a thread over at JMT on this, where I said

    …In Excel 2003 and before, there are/were 16,777,216 cells. In a ‘typical’ business-critical application, 1% of these cells, or 167,772 cells, might have formulae. Allowing a ‘conservative’ estimate that there are 0.1% errors in these formulae, this could mean 167 errors.

    That is bad enough, but now we have Excel 2007, and we know how it will be (ab)used. Excel 2007 has 17,179,869,184 cells which, using the same arithmetic, gives 171,798,691 formula cells, and possibly 171,798 cells with errors.

    I know that most spreadsheets will not go anywhere near the limits, but I am always amazed at what I see in current Excel, so it is possible.

    The auditing implications here are enormous. Is there anyone here involved in spreadsheet auditing? What are the implications of Sarbanes-Oxley on 2007, or vice-versa? What about Basel II?

    If it is anywhere near 100%, and spreadsheets will get more complex, I think I might just wet myself

    BTW, is this Ross our friend over at JMT?

  7. Marcus Says:

    Hi Bob et al; I’m involved in a Basel II project right now at a financial institution.

    The business uses a series of spreadsheets to collect, collate and calculate various aspects of their capital allocation requirements based on both external empirical data and internal data.

    External regulators have questioned the use of Excel to process such a critical calculation, not so much from an ‘is Excel capable’ perspective but more over the security and process controls that can be implemented using Excel (very few).

    There is no issue with collecting, collating and calculating on a more robust platform (an RDBMS) and using Excel for the presentation of results. Most of the number crunching the business does could be done quite easily with SQL in a relational store, given the volume of data they are working with and the complex (as in fiddly) formulae being used. When calculation is set to Manual and the user has to press F9 whenever they want to update the results, you know you’ve outgrown lookup formulae using Indirect, Match, VLookup, etc.
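
    The pattern I mean looks roughly like this (the connection string, table, column and sheet names are invented for the example):

        Sub LoadCapitalSummary()
            ' Let the database do the aggregation and use Excel purely to present
            ' the result set - no chains of lookup formulae needed.
            Dim cn As Object, rs As Object

            Set cn = CreateObject("ADODB.Connection")
            cn.Open "Provider=SQLOLEDB;Data Source=RISKDB;" & _
                    "Initial Catalog=BaselII;Integrated Security=SSPI;"

            Set rs = CreateObject("ADODB.Recordset")
            rs.Open "SELECT BusinessUnit, SUM(Exposure) AS TotalExposure " & _
                    "FROM CapitalPositions GROUP BY BusinessUnit", cn

            With Worksheets("Summary")
                .Cells.Clear
                .Range("A1").Value = "Business Unit"
                .Range("B1").Value = "Total Exposure"
                .Range("A2").CopyFromRecordset rs   ' dump the pre-aggregated rows
            End With

            rs.Close
            cn.Close
        End Sub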

    On a previous SOX project (at another B&F), the issue was the same: not so much ‘can Excel calculate and present what we need’ but ‘can Excel enforce the security and process controls required’, so that we can have confidence in the integrity of the data.

  8. Simon Murphy Says:

    Bob
    Yes, a scary thought indeed. Ray Panko’s research seems to indicate error rates of around 0.5% to 5% of cells (it’s not always clear which ‘cells’: all, formulas, etc.), so your estimate may not be far off. The problem with spreadsheets is that each formula in each cell needs checking or controlling somehow. With a programming language you define the function once and then pass in parameters; in a spreadsheet, each cell is a formula definition.
    (was the pants thing in excitement for this new revenue opportunity, or fear of the horrors you are going to have to deal with?)
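
    A trivial example of the difference (the commission bands are made-up figures):

        ' In code the rule is defined once, so there is a single place to audit.
        Public Function CommissionDue(ByVal sales As Double) As Double
            If sales > 100000 Then
                CommissionDue = sales * 0.05
            Else
                CommissionDue = sales * 0.03
            End If
        End Function

        ' On the sheet each cell is then just =CommissionDue(B2), =CommissionDue(B3), ...
        ' rather than its own copy of the banded IF formula, each needing its own check.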

    Marcus – sounds like you’re working in a growth area!
    I think Excel makes a great prototyping tool, but it really doesn’t support the kind of controls you need for important ongoing processing, in my view anyway. I agree about manual calc; I take that as a sign of defeat.
    cheers
    Simon

  9. Don price Says:

    Hi Simon,

    I’ve just started looking at your posts, and very good they are too, a good mix of tech, info and opinion, although some are way beyond me!

    Re the 100% error rate, my paper to EuSpRIG last year (Spreadsheet Risk – A new direction for HMRC) gave the error rate that we find in our audits of tax spreadsheets as about 14%.

    As far as I am concerned, a spreadsheet is designed to download/input figures to create a figure. As long as the bottom line figure is accurate, I am happy. I don’t care if there are errors of design, format, logic, as long as they don’t affect the bottom line. Many of the spreadsheets that I see (most of them) have errors, or areas to query, which do not affect the end result. These may be pointed out to the user, but I am really only concerned with material errors, which tend to be mechanical or domain errors.

    Just thought I would throw that in.

  10. Florian Says:

    Hi,
    I found your blog via Google by accident and have to admit that you’ve a really interesting blog :-)
    Just saved your feed in my reader, have a nice day :)
