I’m still on the look out for spreadsheet horror stories for Eusprig this year, but in the meantime…
This is the final post on developer features for the time being. I think this one may be a big issue though.
User defined functions.
In mainstream programming you define a function once and use it potentially many times, so for example
bool AssertApproxEqual(double firstValue, double secondValue, double materialityLevel){
return (fabs(firstValue - secondValue) < materialityLevel);
//fabs is the float(/double) version of abs - absolute value
}
and then in various places you call it. This is great as you can thoroughly test the function with appropriate data, and then be assured that it will work right, as long as the parameters are valid.
In native spreadsheets we don’t really have this functionality easily accessible. And I think we probably should.
In spreadsheeting each use of a function is actually a definition, or redefinition. This creates many comprehension and auditing headaches. If I use =SUM(D10:D20), (or =SUM(R[-11]C:R[-1])) to sum sales, and then copy it across for each period, then to check the spreadsheet is correct I need to make sure that those formulas remain the same. If the formula is very long or complex then this becomes significantly harder. In big spreadsheets I personally often end up coding the formula population in VBA so I can be sure they are consistent.
We do have some options:
You can put formulas in names, I quite like this but I think its so unusual as to be unfair on potential maintainers, and users possibly. There is also a performance hit as names use array evaluation rules. It is possible to pass parameters to names but it is messy.
You can write worksheet functions in VBA, I never do this, the performance hit is just too great for me. It also gives clients a bigger maintenance challenge as now they need people with VBA skills as well as spreadsheet skills. (The answer to this of course is to get a maintenance agreement from a member of the soon to be announced spreadsheet support network thingy. or codematic of course) I also think VBA is a bit of a sledgehammer to crack a nut in this case, its power creating unnecessary security worries.
You can use XLM, I do do this, although I accept XLM skills are pretty rare these days. I do it because the performance is good and distribution is easy. There are plenty of downsides, scary ‘unstoppable macro’ warnings, no ability to sign code are 2 obvious ones. Even if you have no XLM experience, and forgetting for a minute how old and deprecated XLM is, just look at this, ignore the first 2 lines and see if you can guess what it does. (note too my spangly new Excel 2007)
XLLs, these make sense for general functions, but not no so much for stuff that relates to one file only, as they must be distributed separately. These are actually pretty easy to write these days as there are some good tools available to help (at a price) eg TurboExcel (soon to change name??) and XLL+. Writing them from scratch isn’t that hard if you know a bit of C.
There are also assorted other ways using automation add-ins with VB6 or .net .
Dermot Balson demonstrates an interesting approach here. He uses data tables, a table of data, and a complex calculation to define a function once and use it many times. There was a VBA version at one time too. Access has a similar feature when you create a calculated column, or a calc in a query – defined once, used many.
I would like to see this custom function facility properly integrated into spreadsheet applications. The key point being accessible to ‘normal’ users. Defining and using functions is what spreadsheeting is all about, its odd that you currently need to be a power user to do it effectively.
You certainly should not need additional tools like a C/C++ compiler for xlls, or Visual Basic (no longer sold), or Visual Studio or other .net IDE for automation add-ins. Deployment should be a doddle too, no registration, no local admin rights required etc. I know there are lots of negatives associated with distributing multiple copies, but it seems to have worked for spreadsheets (so far anyway!). It would be nice if such UDFs did not trigger macro warnings, maybe by limiting their functionality. I’m open on protecting UDF’s from prying eyes, personal preference is probably not to bother though.
Easy migration would be the icing on the cake, imagine modelling the function in Excel, click a button, and deploy some sort of compiled component (dll probably) to your web server, that doesn’t even know what Excel is. TurboExcel does this (I think), and XLL+ can host xlls without needing Excel I think, but as I don’t currently have a working version of either (hint!) I can’t check.
I was thinking of creating a homemade version with a UDF called (imaginatively) UDF. It would be used like this:
=UDF(“SumSales”, param1, param2, param3…)
SumSales would refer to a table of workings a bit like Dermot describes. These would be defined on a special new sort of ‘function definition’ sheet. The definition would include a decent description, lots of info about the parameters, including expected data type and error values (so it would never need to be wrapped in ISERROR, or IF(ISERROR for us old school types (although PED has an IFERROR xll, and I have an XLM version above)
I’m sure this could be built into spreadsheets natively avoiding the need for the ugly UDF piece. Actually I don’t think current Excel XLM macro sheets are that far off. Not sure if Ooc or Gnumeric have something similar. Here is a rough example:
Here are some links (corrected – thanks Rob) to some other peoples thoughts in this area, if you know of others please leave a link in a comment.
Some (Microsoft funded, I think) research 1, 2
A Microsoft patent in this area
Useful names for googling: Simon Peyton-Jones, Alan Blackwell
Cortes and Hansen a simplified spreadsheet showing how user defined functions may be integrated, and plenty of onward interestng links.
I have heard this mentioned many times as an important feature in taking spreadsheeting to the ‘next level’ whatever that is. What do you think?, is it vital to have some form of function definition separate from its usage? or is the repeated combined definition and use a key part of what makes spreadsheets spreadsheets? Do we have enough options already? Quite a few people have been less than enthusiastic about other suggestions, mainly on the basis they would add complexity, is this the same? or would it reduce complexity? (Depends on the user I guess).
Cheers
Simon