Archive for February, 2007

User defined functions

Tuesday, 27th February, 2007

I’m still on the look out for spreadsheet horror stories for Eusprig this year, but in the meantime…

This is the final post on developer features for the time being. I think this one may be a big issue though.
User defined functions.
In mainstream programming you define a function once and use it potentially many times, so for example

bool AssertApproxEqual(double firstValue, double secondValue, double materialityLevel){
  return (fabs(firstValue - secondValue) < materialityLevel);
  //fabs is the float(/double) version of abs - absolute value

and then in various places you call it. This is great as you can thoroughly test the function with appropriate data, and then be assured that it will work right, as long as the parameters are valid.

In native spreadsheets we don’t really have this functionality easily accessible. And I think we probably should.
In spreadsheeting each use of a function is actually a definition, or redefinition. This creates many comprehension and auditing headaches. If I use =SUM(D10:D20), (or =SUM(R[-11]C:R[-1])) to sum sales, and then copy it across for each period, then to check the spreadsheet is correct I need to make sure that those formulas remain the same. If the formula is very long or complex then this becomes significantly harder. In big spreadsheets I personally often end up coding the formula population in VBA so I can be sure they are consistent.

We do have some options:
You can put formulas in names, I quite like this but I think its so unusual as to be unfair on potential maintainers, and users possibly. There is also a performance hit as names use array evaluation rules. It is possible to pass parameters to names but it is messy.

You can write worksheet functions in VBA, I never do this, the performance hit is just too great for me. It also gives clients a bigger maintenance challenge as now they need people with VBA skills as well as spreadsheet skills. (The answer to this of course is to get a maintenance agreement from a member of the soon to be announced spreadsheet support network thingy. or codematic of course) I also think VBA is a bit of a sledgehammer to crack a nut in this case, its power creating unnecessary security worries.

You can use XLM, I do do this, although I accept XLM skills are pretty rare these days. I do it because the performance is good and distribution is easy. There are plenty of downsides, scary ‘unstoppable macro’ warnings, no ability to sign code are 2 obvious ones. Even if you have no XLM experience, and forgetting for a minute how old and deprecated XLM is, just look at this, ignore the first 2 lines and see if you can guess what it does. (note too my spangly new Excel 2007)


XLLs, these make sense for general functions, but not no so much for stuff that relates to one file only, as they must be distributed separately. These are actually pretty easy to write these days as there are some good tools available to help (at a price) eg TurboExcel (soon to change name??) and XLL+. Writing them from scratch isn’t that hard if you know a bit of C.

There are also assorted other ways using automation add-ins with VB6 or .net .

Dermot Balson demonstrates an interesting approach here. He uses data tables, a table of data, and a complex calculation to define a function once and use it many times. There was a VBA version at one time too. Access has a similar feature when you create a calculated column, or a calc in a query – defined once, used many.

I would like to see this custom function facility properly integrated into spreadsheet applications. The key point being accessible to ‘normal’ users. Defining and using functions is what spreadsheeting is all about, its odd that you currently need to be a power user to do it effectively.

You certainly should not need additional tools like a C/C++ compiler for xlls, or Visual Basic (no longer sold), or Visual Studio or other .net IDE for automation add-ins. Deployment should be a doddle too, no registration, no local admin rights required etc. I know there are lots of negatives associated with distributing multiple copies, but it seems to have worked for spreadsheets (so far anyway!). It would be nice if such UDFs did not trigger macro warnings, maybe by limiting their functionality. I’m open on protecting UDF’s from prying eyes, personal preference is probably not to bother though.

Easy migration would be the icing on the cake, imagine modelling the function in Excel, click a button, and deploy some sort of compiled component (dll probably) to your web server, that doesn’t even know what Excel is. TurboExcel does this (I think), and XLL+ can host xlls without needing Excel I think, but as I don’t currently have a working version of either (hint!) I can’t check.

I was thinking of creating a homemade version with a UDF called (imaginatively) UDF. It would be used like this:

=UDF(“SumSales”, param1, param2, param3…)
SumSales would refer to a table of workings a bit like Dermot describes. These would be defined on a special new sort of ‘function definition’ sheet. The definition would include a decent description, lots of info about the parameters, including expected data type and error values (so it would never need to be wrapped in ISERROR, or IF(ISERROR for us old school types (although PED has an IFERROR xll, and I have an XLM version above)
I’m sure this could be built into spreadsheets natively avoiding the need for the ugly UDF piece. Actually I don’t think current Excel XLM macro sheets are that far off. Not sure if Ooc or Gnumeric have something similar. Here is a rough example:

Here are some links (corrected – thanks Rob) to some other peoples thoughts in this area, if you know of others please leave a link in a comment.
Some (Microsoft funded, I think) research 1, 2
A Microsoft patent in this area
Useful names for googling: Simon Peyton-Jones, Alan Blackwell
Cortes and Hansen a simplified spreadsheet showing how user defined functions may be integrated, and plenty of onward interestng links.

I have heard this mentioned many times as an important feature in taking spreadsheeting to the ‘next level’ whatever that is. What do you think?, is it vital to have some form of function definition separate from its usage? or is the repeated combined definition and use a key part of what makes spreadsheets spreadsheets? Do we have enough options already? Quite a few people have been less than enthusiastic about other suggestions, mainly on the basis they would add complexity, is this the same? or would it reduce complexity? (Depends on the user I guess).



Eusprig 2007

Monday, 26th February, 2007

The theme for this years conference is Enterprise Spreadsheet Management: A Necessary Evil

I would like to submit a management paper from a developers perspective for consideration. I have a few horror stories I can draw on, but I was wondering if any of you folks would like to contribute too?

I’m looking for funny/scary real life situations you have seen involving spreadsheet management (or mismanagement). Or any strong views you have based on experience in the trenches. Heres an example (which needs shortening):

At one company the whole monthly management reporting for 120 business units was driven by an Excel macro one of the consolidation team had recorded/cobbled together. This was a closely guarded treasure and outside interference was not welcome, even though this imposed a significant burden on its owner each period end, especially as it failed most months. As this was an 8 hour process that ran overnight, any failure meant all financial reporting was delayed at least a day. It also meant a long night at work for its owner, as well as a requirement to fit holidays (and sick time) around the reporting cycles. Such key man dependencies are completely standard, and very real. One unanticipated absence instantly caused a 24 hour delay as cover staff battled to understand the contorted logic. The usual failure on the first run put reporting back another 24 hours.

I’m thinking more around the management of the on-going use of a spreadsheet, rather than the initial development, but I do have some other questions:

  1. What is an average build time to life time (for me 3 months build, 1-3 year life, but wide variation)?

  2. Examples of impact of build quality on lifetime maintenance (me: one 2 hr job took 3 days because s/s was so bad)

  3. Examples of using wrong versions of spreadsheets (me: reported wrong year as ‘prior year’ for 4 months, due to wrong link)

  4. Examples of attempts to remove spreadsheets altogether (me : Oracle Financial Analyzer implementation that eventually died, nice idea wrong culture)

  5. Examples, with main benefits, of migrating from s/s to something else (me: moving to Essbase with all the power and security is my favourite, although Access or SQL Server are more common)

  6. And finally what do you think is the single most important issue facing enterprises with loads of spreadsheets scattered around their network? And how would you resolve it.

Don’t feel like you have to answer all (or any), don’t betray any confidences or break any NDAs. But if you can help put the commercial developers view across I would be most grateful. How grateful? Well you’ll get a mention in the acknowledgments section, or maybe as a reference. If you want to remain anonymous feel free to email me. I’ll put the paper up for comment as soon as its done. Its a 2,000 word limit so I may have to edit any wordy contributions. And of course the paper may not get accepted.

Oh, and we don’t have long, the deadline is the 28th Feb! (thats this Wednesday so contributions by end of Tuesday (latest of your time and UK time) please).




Structured data sheets

Thursday, 22nd February, 2007

This is what got me thinking about combining the best of Access and Excel the other day.


I think I would like to see some new sheet types. A couple that immediately come to mind are function definition sheets (to be covered soon) and structured data sheets.
A structured data sheet would end up pretty similar to an Access table probably (possibly a not very well structured one (ie not normalised)). Heres a thought about that- if users dont understand normalisation and data theory, why not develop tools that hide that stuff from them, whilst still allowing them to express themselves effectively? (But of course make it available for people who know what they are doing and need full control).

Anyway, structured data sheets… A lot of the models I work with have big, repetitive tables, often with vlookups or index/matches pulling specific items out for use elsewhere, perhaps summaries. The data is structured to an extent, but there is no way to tell Excel (or your spreadsheet of choice).

If you could say values only here, formulas only there, you could remove a whole class of potential errors. If you could create efficient indexes, look ups could be faster. Perhaps you could use SQL (or something a bit more user friendly – whatever that is), maybe removing the 3 argument lookup/unsorted list errors. Perhaps you could define the functions you need just once, and apply them in many places, removing potentially inconsistent formula type errors. I don’t think it could give us world peace, but it might be handy all the same?

Heres a rough prototype, but I can already see possible changes  – including losing the row and column headers.


Do you have any other suggestions for new developer features?
Harlan already suggested SQL.Request (how would you implement connections Harlan?), It seems I already nicked Ross’s combining Excel and Access idea (you do have prior art proof of that tm right Ross?) Has anyone got any others they would like to share? Improv anyone?


Accel or Excess?

Tuesday, 20th February, 2007

Should Microsoft combine Excel and Access into 1 product?

The more I think about it the more it makes sense. This is another developer feature post btw – I think I would like Access in my Excel as a feature please. No offence any Access fans reading this- feel free to think of having Excel as a feature in your Access (would that be a step forward??).

You could choose a forms based structured data project (ie Access) or a free form unstructured project (Excel), or something in-between (the new bit).

Heres some reasons why I think it might work

  • Complex reports are hard work in Access, easy(er) in Excel.
  • Excel doesn’t really do relational data, Access/Jet does
  • I reckon there is a lot of commonality already underneath
  • Some big horrible workbooks could be neatly replaced with well structured data and calculations that are defined once and used many times.
  • Have you got any others?

Whats in it for MS?

  • Boost credibility of Excel as a development platform
  • Reassure those users who wonder what the impacts of SQL Express are on Access long term future
  • Internal cost savings?
  • Create a best of breed desktop analysis/business intelligence application (I would leave the precise description and title to Microsoft’s wonderful marketing department of course)
  • Demonstrate a massive commitment to reducing spreadsheet hell.
  • Others?

I would have preferred closer database integration than the new 1 million row limit in 2007. In fact I suggested they reduce rows to a couple of thousand and improve db integration as soon as I heard.

What do you think? and why? would you buy it? would you use it?



Spreadsheet development feature?

Monday, 19th February, 2007

This is the first of a series of posts I’m planning about trying to apply some of the generally accepted ideas from mainstream software development to spreadsheeting. I’m a little torn about this because I only really see 2 valid uses for spreadsheets in responsible organisations:

  1. For prototyping complex systems, to gain enough understanding to build in something more production focused.
  2. As a thin layer between managed reporting repositories (eg Essbase and Analysis Services) and the analyst. 

Spreadsheets are superb at these two things, and that makes them inappropriate for most production systems. My Eusprig paper from 2005 covered this so I’m not going to rehash it here. But I would hate to see any new feature introduced that reduced spreadsheets ability to deliver in these 2 areas in the future.

Data hiding

It is generally accepted in mainstream software development that the concept of data hiding (David Parnas), is fundamental for any significant project.

This isn’t trying to be sneaky or patronising, its all about reducing system fragility.

If a developer locks clients out of certain parts of their code they are then free to change those bits without breaking all clients. This is a good thing. If one of your colleagues has ever changed a workbook that you linked to, breaking your model, you’ll know just what I mean.

If you’ve ever moved VBA between Excel versions you’ll know about this. It just works, even though the Excel team have clearly reworked stuff in the background. (Any upgrade issues I have had have always been down to poor VBA code).

The key issue is to separate the interface from the implementation. In simple terms that means in VBA that any public subs or functions should not change their return type or parameters (number or type) once they have been called elsewhere. You are of course free to change the internals of the procedure, for example if you find a better way. Classic example replacing VBA based finds with range.find which is massively faster.

In the actual spreadsheet grid there is no way to protect your intermediate workings. Anyone anywhere can link to any cell in your workbook. Unless you password protect it, then they can’t get in at all. What if you wanted to let people get at the results but know you wanted to change the intermediate calculations. There is no way currently, names go some of the way.

What about a new formatting or protection option – No Link. This would allow developers to prevent people linking to unstable, or inappropriate values.

You could define a section of the worksheet as ‘Exportable’ or something, or define other bits as ‘Private’ or ‘no link’ or something. Maybe 2 levels, exclude external workbook links, and exclude links from other sheets in the same workbooks. Net result, you steer clients away from bad stuff towards good stuff. Here is an example, the grey area is ‘Private’. What do you think?

no link screen

[Note the neat customised toolbar btw – try that in Excel 2007!] (actually looking at it, I have never used the undo button (ctrl z), or the sum(alt =), or the help(F1), or that funny auditing one(?) – room for further improvements)

Back to the post

I haven’t built anything to do this, I’m not even sure I could, not without wrapping each formula to check its parameters.

I’m going to run a series of posts on possible developer enhancements, and the most popular (that are possible) may make it into a free community toolset, that is in the pipeline.

Other features I want to cover are:

  • defining formulas (UDFs)
  • structured worksheets
  • some new sheet types
  • automated build.

What do you think of spreadsheet based data hiding? would you use it? does it go against the grain of spreadsheet development? Are there other things you think would be much more useful?



Office 2007 and Excel 2007

Friday, 16th February, 2007

I did promise myself I would not use this blog to moan about Office 2007. But Jon Peltier brought it up over on DDOE and I couldn’t resist. And really my only issue is the ribbon, or rather Microsofts deliberate (and unnecessary) decision not to provide a compatibility mode. In fact technically in using 2003, I am using 2007 in ‘compatibility mode’, according to Jensen Harris.

I just think they asked the wrong question and so the ribbon (which is little more than a fat clumsy toolbar after all, don’t let the new name mislead you) is not the right answer.

Their question seems to have been: is the ribbon better than menus and toolbars? and maybe it is, I neither know nor care, that question is soooo irrelevant.
What they should have asked (IMO) is : is the ribbon better than menus and toolbars for all those people who have been using menus and toolbars for years, and have menus and toolbars in almost every one of the other applications they use? probably not.

In the UK we drive on the left, most other places drive on the right. Would it be better if we drove on the right too? or maybe everyone else moved over to the left? Probably so in theory, but does it make sense to convert?? no probably not, just like the ribbon.

Automatic cars are easier to drive than manual right? not if you’re used to a manual they aren’t (unless that’s just me (kangarooing down the road, unclear of what to do with my spare foot)).

To be frank I find Microsofts willingness to throw away my investment in learning their product quite alarming, especially as it is becoming a bit of a trend – anyone for classic VB?

One final thing; I think ‘Office’ is ruining Excel. Most people I meet are heavy users of only one (or maybe 2) office products, this senseless conformism is diluting the power of the individual applications. I’d love to see Excel (and maybe Access) broken free from the rest of office (except for these 2 everything else is a glorified typewriter anyway). Instead of the current 7 (seven!) 2007 versions they could have 2: Office WP – for word processing (word, powerpoint, outlook and all that other new stuff) and Office IW for information workers (Excel and Access). what do you reckon?

Don’t get me wrong, I think there are some great new features in Excel, for all types of user (maybe not much for developers). For me though the pros of the new features are outweighed by the cons of the ribbon interface. Whats incredible is most people seem to be going through the same process: good=new features, bad=ribbon. I havent seen anyone I respect put the ribbon in the good category [Edit: but I havent actively searched for positive comment either] , its just for some the pros outweigh the downside of the ribbon. I could have a skewed view of course [Edit: There are after all lots of places I do not participate].

I suppose I should be looking at Office 2007 as a real opportunity, unfortunately I’m a bit too cynical for that, luckily I think my clients are too.
What about you?



Wednesday, 14th February, 2007

Time for something contentious.
I’m going to combine documentation and commenting code, hopefully you’ll see why. Here I am talking about Exel/VBA based business tools only. This will almost certainly not apply if you are using a different technology to solve a different issue.
Heres a question – do you like documenting your work? If so why?, If not why not?
Do you think you are a conscientious developer?
Me – no I don’t like it, because I think its generally pointless. Much bigger benefits can be had by focusing on other areas in my experience. And yes I think I am conscientious, thats why I don’t like wasting clients money on low value work like keeping unused documentation up to date.

Beginners don’t comment/document because they don’t realise they need to.
Intermediates do comment and document (a bit) because they recognise they need to.
Advanced developers neither comment nor document because they dont need to.

Beginner/Intermediate developers (and we all go through this whenever we work on an unfamiliar project, or with an unfamiliar technology) often develop as a series of challenges and battles, and are just relieved when something finally seems to work. Their work often lacks design clarity and often includes workarounds and compromises due to lack of knowledge and time pressures. Beginners don’t realise someone (possibly them in 2 years time) will have to come back to this work to change it for new business requirements, so dont document/comment.

Intermediates have had their fingers burnt looking after old work – possibly their own possibly someone elses. They provide comments and documentation because the work needs it.

Advanced devs bring all their experience and resources to bear, they are confident they can implement whatever they design. So they focus on a clear design, they consider how all the components of the system will interact to get the job done, they revise and optimise and simplify on paper. When they do finally fire up a computer its to copy and paste proven components from previous projects, and to provide some glue code and project specific stuff. Advanced developers work tirelessly to remove the need to document or comment.

In code, short well named routines that do 1 thing well do not need further comment.
In worksheets well named single purpose sheets with a good name and a clear title should make their purpose clear. in cells short clear functions that only rely on several other cells, and have a clear description in a cell nearby do not need further description.

I think building with the mind set that ‘I’ll leave this monster, and just explain it in the documentation’ whether concious or unconcious is not a high quality approach. Building whilst thinking, ‘this needs to be totally self explanatory, how can I make it more simple to follow?’, in my opinion will lead to a higher quality easier to manage product.

The only documentation that I think is worthwhile is details of the underlying business assumptions that drive the model. Anything that describes technical aspects of the system probably represents a failure, to either design the system well, or to target the documentation properly.

Agree? disagree?



Latest software development read

Tuesday, 13th February, 2007

Just finished a great book – My Job Went to India

Not as xenophobic as it sounds, in fact not at all. Another excellent book by the pragmatic programmers.

Basically its a discussion of software development in the current climate. The main focus is on the current outsourcing fashion, what that means for western developers and how to compete with someone with a similar skill set on paper who costs 1/10 of what you cost.

My favorite bit was early on:
Imagine you set up a new company to build a single product. The whole companies future depends on the success of this product, if it fails the company goes bust.
What would you do in that situation?
Would you do some market research to make sure there was a viable market for the product?
Would you check out the competition?
Would you invest pretty carefully to make sure you got the product right?

As a developer your skill set is your only product.
Are you giving it the focus it deserves?

I have always had a very strong focus on skills, but this analogy really seemed to drive the point home.

Another very valid point he raises is learning a totally new language, not necessarily to use it, but more for the additional insight it will give you in applying you usual toolset.

This is like top class sports performers, who often train their opposite side (eg orthodox boxers training southpaw). Not because they expect to use it, but for the insight.

[This doesn’t always work – I broke my right arm in a motorbike crash, so had to learn to write left handed, my right handed hand writing is still rubbish]

I’m thinking of having another look at Smalltalk or maybe ruby, what about you (and dont be trying to pass off looking at from VBA as some kind of mind expanding leap)?



Spreadsheet auditing (2 of many)

Saturday, 10th February, 2007

When I am looking at other peoples spreadsheets (well and my own actually) I like to use the audit arrows to see the dependents and precedents. I have used a few tools that create a separate chart of a dependency tree, but personally I like to see it all in context.

Trace dependencies arrow screen shot

I have a couple of rough tools in this area that I use myself but are way too rough to release (this goes back to a discussion on DDOE about proportion of effort to get the key things working, compared to the effort to polish a product to a standard that be won’t be an embarrassment on release).  For XLAnalyst I think the core code that was useful to me took about 1/20th of the effort involved in the final released version.

Going forward I’m thinking I’ll just release the most useful bits here as source code, with a bit of discussion, and if people express enough interest in a proper packaged supported version then that may follow. If other people have code or ideas or other skills or services (anyone *like* documentation?) they might like to contribute to an open source VBA spreadsheet quality/audit type tool then please leave a comment.

When I am clicking around a worksheet I often like the arrows to follow me. If you stick the following 2 lines in the Workbook_SheetSelectionChange event of ThisWorkbook, that will happen for that workbook.

Target.ShowDependents ' or ShowPrecedents (or both)

sh and target are parameters passed into that event for you.

If you want that to work universally, then you need to wrap it up in a class to catch the events at application level. I’ll post that later with a proper workbook you can download and use. If anyone wants it?

A couple of issues around this

  1. It breaks the clipboard, I think I ended up turning it off during build and only using it during review
  2. ShowPrecedents/Dependents will show all (in the current workbook – doesn’t really work for external links (but youre not using those anyway right?)
  3. The VBA range.Dependents or directDependents does not include off sheet references, making it somewhere between useless and dangerous.
  4. None of this stuff covers conditional formatting or validation type dependencies, just cell formulas.

What are you favourite tools/techniques for sense checking formulas?



Some spreadsheet quality resources

Saturday, 10th February, 2007

I just thought I would call out a few of the resources that people may find useful on the codematic site.

I expect many future posts may be partly based on this stuff, so you could read it and never come back, but then you would miss out on the comments from fellow professionals which are the most valuable bit.

In no particular order

Eusprig 2005 I presented on some of the issues that make spreadsheets specifically a bit of a challenge as development tools
The pdf of the paper is here
The powerpoints are here

Eusprig 2006 I presented on how I review spreadsheets, as part of potential maintenance agreements. This was a fairly practical paper, with several tools discussed
The pdf of the paper is here
The powerpoints are here

Eusprig 2007 I’ll be discussing in a later post

In 2005 I presented at a community developer event at Microsoft, the topic was working with Excel using Visual Studio (2005 RC1 I think it was)
It discussed the main options around VSTO, COM add-ins shimming etc, probably slightly out of date as MS are moving pretty fast in this area. (I’m not rushing to update all the .net stuff, I’ll be leaving it for probably the next 12 months until things mature a little).
The powerpoints are here

In 2005 I spoke at the first US Excel User conference about Spreadsheet design concepts, mainly to do with breaking complex models up into sensible chunks, and keeping related things together, and unrelated stuff apart.
It was a great event by the way, totally recommended.
The powerpoints are here

In 2006 I spoke at the first UK Excel User conference
one session was on spreadsheet design (again) and is pretty similar to the one above, but with some better examples
The powerpoints are here

The second session was on converting xlas to VB6 COM add-ins. This gives some pretty good background, but in fact the process is pretty trivial if the xla is well thought out
The powerpoints are here
The base VB6 Excel COM add-in project is here

The third presentation was from a 3 hour workshop on VBA best practice, it was a fairly hands on session so the slides only skim the surface. There are some supporting screen shots here.
The powerpoints are here

I’m helping to organise the 2007 UK EUC and would be very interested to hear any suggestions for venue, content or presenters. Anyone fancy Edinburgh?

Find out more about the Excel User Conferences here