Archive for March, 2007

Quiet week warning

Sunday, 18th March, 2007

Things could go quiet for a few days this week. I’m in Seattle and I’m not sure what internet access I’ll have. I had meant to sort out a couple of posts in advance, but it hasn’t been that sort of week.

Feel free to carry on without me, I’ll catch up when I can

possible topics:

  • Do we still need office productivity suites?
  • Does techological influence within a company increase with technical competence?
  • Is there a single feature that if added to (or removed from ;-) ) the next Excel would make it a compelling upgrade?
  • And what would it be?

cheers

Simon

The spreadsheet disadvantage

Friday, 16th March, 2007

So we’ve looked at some of the benefits, and there were some excellent comments.

But what is the single biggest downside to spreadsheet usage?

I was thinking the management of the uncontrolled duplication and ‘versionitis’ (good description Marcus), but actually I’m not sure that is unique to spreadsheets. I think it is most severe in spreadsheets, but I’ve seen that with Word docs too (‘which version of the spec are you working too? oh no, you should be using the one thats on my personal drive!’).

I reckon the biggest disadvantage of spreadsheet use is it encourages people to tangle up the data and a specific view of the data (eg a report). I think it was the Smalltalk community that introduced the MVC model (model, view controller), which seems pretty fundamental these days. MFC still uses the MVC approach, and I’m sure plenty of other dev tools do too. Access does for example (in fact all the database products I can think of do), you have tables for storing data, and queries or reports for seeing different views.

This lack of separation of data and view seems to me to be a barrier stopping many people making massive improvements in their use of data. I certainly think it explains why so many spreadsheet users can’t understand Access, even after training. I think it also explains why people often struggle with pivot tables.

I totally accept this feature is also one of the benefits, and is a contributory factor in the development speed associated with spreadsheeting. As with most things, sensible use will maximise the benefits whilst minimising the negative impacts.

What do you think about the separation issue? what do you think is the biggest spreadsheet downside?

Cheers

Simon

The spreadsheet advantage

Wednesday, 14th March, 2007

What are the one or 2 most compelling reasons for you choosing to develop (part of) your system in a spreadsheet?

And what do you see as the biggest down side?

For me: probably the biggest (cultural) issue is that business users understand (or think they do) spreadsheets. They are certainly more comfortable with them than SQL or any of the previous supposed End User developer tools. I’d be interested in any research as to whether this is an intrinsic thing, or just because people have more training/experience in spreadsheets. I am undecided as to whether this is a valid reason, or one we should be challenging with training/coaching/education etc. What do you think?

These are my top 2 most important features of spreadsheet development:

  • Ultra fast development time
  • Simple deployment

What are yours?

I’ll have to have a think about what the biggest downsides are – feel free to start the ball rolling. Mine would be something around easy to mis-manage probably.

Cheers Simon

COM components, VBA and performance – part 3

Tuesday, 13th March, 2007

I am going to shut up about this soon, I promise.

This whole performance thing was driven by being told many times to use early binding rather than late, as EB is faster.

So I did some fairly significant testing, and yes early binding is faster. But language choice can have just as big an impact, yet I don’t hear people saying use ATL it’s faster than VB (which it is).

How much you care about performance depends what you are doing of course. And if performance is critical then a spreadsheet may not be the best choice to start with.

Having a reasonable knowledge of the facts means that should I hit performance bottlenecks in the future, I have some ideas of where to start. Moving one or two key classes or functions to a different language could solve 90% of performance problems.

Professional Excel Development by Bullen, Bovey and Green includes an extremely useful performance monitoring tool. Using something like this is critical to ensure you are optimising the right thing. There is no point writing a function in assembler to get ultimate performance if its only called a few times. Generally targeting those areas that get called most often will get the best performance boost for your effort.

Would I still use late binding? Yes I would, in some cases I would take the performance hit to gain the easier deployment.

Will I move all my VB stuff to  (C++)? No, but I may consider moving some core elements.

Would I prioritise code performance over code clarity? In very limited cases, in well defined snippets yes I would.

What about you?

cheers

Simon

COM components, VBA and performance – part 2

Monday, 12th March, 2007

So part one was interesting, but actually not quite what I had set off to look at.

As Rob pointed out I was creating, using and destroying the objects each iteration. So that gives a reasonable view of the lifetime cost, assuming you create, call 5 functions and destroy. If you are doing anything different then that version of the test would not be overly helpful.

So I dropped everything and re-did the tests (hopefully properly – let me know!) I split out the cost to create and the cost to call into separate columns. I added a new section on the same page, with a link to a new set of test projects. I also included an xll version this time. Don’t bother using it from a worksheet cell as it doesn’t return a value so it errors or returns 0.

The new tests even things up quite a bit, C++ is still well faster, but overall early bound will probably be better than late bound, whatever the language. But note its still twice as fast to create a C++ object late bound as a VB one early. So with lots of creation and limited function calls, C++ late bound would be better than VB early bound.

VB6 did prove to be faster than VBA in the calling part, which I have never seen in use, maybe I don’t do complicated enough calcs. Mind you the VBA benefitted from being in a module rather than a class, stucturing it the same as the VB version adds 2 seconds of object creation/deletion overhead. I knew there was a reason I avoided excessive Object Orientation in VBA.

The thing I really wanted to test was the cost to call, early v late, and it seems late bound is about 20 times slower to call. So you pay twice for using late binding, once at object creation and then again at every function call.

Interestingly though I did some tests on the performance of individual functions (actually the same functionality) and found that SpeedTest1 was twice as fast as SpeedTest10, for late bound components, I didn’t see a difference with the early bound version. This would suggest if you are using late binding you should get your most used functions to the top of the function table (roughly write them first in the class in VB).

Have a play with the second set of test routines and let me know how you go. And of course if you can see any more weaknesses in my testing let me know. I find these results a bit more reasonable, but don’t that that put you off challenging my approach, I’d love to hear of a better approach to testing.

Also as a matter of interest is anyone getting client pressure/encouragement/interest to move away from VB6?

No one I speak to seems overly worried.

cheers

Simon

COM components, VBA and performance

Monday, 12th March, 2007

The last post looked at some of the other issues, this one is looking at performance.
Specifically performance of the interface and trying to ignore execution speed difference effects. Execution speed is definitely important, but that is for another test.
I used a simple routine that passes 2 longs, 1 by val and 1 by ref, it then sets the value of second to the first.
The reason I did this was to preserve the function return value for success/failure as per COM standards.

I used 5 similar functions so there would be a worthwhile function table, and called each one in turn. Increasing to 10 functions (but keeping 5 calls) added about 10% to the time of the late bound VB version. Which function was called has an impact, presumably depending on its position in the COM function table.

I did it on my rickety old laptop running Visual Studio 6.0 and Excel 2003.
I did a few different test projects, heres the results:
I had to put them on the codematic site to get a decent layout.

The C++ projects were done with ATL, MFC only really does late binding.
What is interesting is that late binding to an ATL component is still much faster than early binding to a VB dll.
So the next time someone tells you early binding is faster than late binding, be sure to tell them language has a bigger impact.
No real surprise that the native win32 version was 10 times faster than COM. Thats the price you pay for simplicity and robustness.
Its worth noting the VBA performance – its 20x faster than an early bound VB6 dll. The actual functions are going to have to be doing some significant work to overcome that calling overhead.

[Edit – Note this was a test to create an object, call some (5) methods on it and destroy it (well leave VBA to, a bit sloppy I know), a more useful test would be to break out the 2 components of time – 1. create/destroy the objects, 2. call their methods. Expect a post covering that tomorrow, and thanks for pointing it out Rob. The mix of creating/destroying objects versus calling methods on that object will have a significant effect on performance.]

I didn’t test exe servers as this was purely a performance test. They might catch up in some SETI like massively parallel grid computation, but for Excel devs, fairly safe to assume their performance would not be near the top.

So if performance is important the choices from best to worst (excluding keeping things in VBA):
win 32 is the fastest
setting a reference to an ATL COM object is next
late binding to an ATL is much slower
setting a reference to a VB activex dll is worse again
late binding to a VB component is over 100 times slower than win 32

Of course to be sure the results are valid for whatever you may be doing, you would need to test the specifics. (ie don’t sue me if you spend 6 months rewriting you VB dll as a native win 32 DLL only to find its 10x slower. Do tell us though – I’ll send you a lollipop for your trouble.
The 2 COM dlls, the win 32 one and the 5 workbooks are linked to from that page above, let me know how you get on. (The COM ones need regsvr32-ing)
If anyone fancies doing the .net equivalent let me know. I am assuming .net would be worse again as there are more interfaces to pass through, anyone want to prove me wrong?

If anyone can see a fault with my reasoning, or gets different results let me know that too. I am a little surprised at the results so suggestions as to why they might be right/wrong are very welcome.
Cheers

Simon

Early binding in VBA

Saturday, 10th March, 2007

Early binding (dim x as Areferenced.ObjectType) is generally considered better than late binding (dim x as object; x= CreateObject(“AnUnreferenced.ObjectType”)

Here are the main reasons why:

  • you get intellisense (usually)
  • you get much compile time checking
  • its faster than late binding

(do you have any others?)

There is one down side which people often skip over:
if the correct version of the referenced component isn’t registered on the client pc, VBA will fail to compile let alone run, with the classic “Cannot find project or library” error.
How many times have you had to talk a user through opening the VBIDE and checking Refererences for #Missing ones?
This is the main reason we are advised to develop on the oldest version we intend to support. Most components seem to upgrade to a newer version fine, very few seem to down grade to work with an older version.

I wrote a reference checker workbook to help diagnose these problems.
On the dev machine you point it at the workbook/add-in that is not working right for the client.
It registers all the references that that wb/add-in uses. Save it and send to the users having problems.
They open it click the button save it and send it back. Now it shows details of the deve refs and the users refs and highlights any missing ones.
Its free and its here, feedback very welcome.

My next post will look at bit more closely at the performance issues.

cheers

Simon

Spreadsheet Hell

Wednesday, 7th March, 2007

Here is a paper I have been working on: spreadsheet hell (pdf (with minor edits))

Its about some of the issues we face as spreadsheet developers and users. Its more focused towards the ongoing maintenance issues rather than actual development.

It’s been submitted to Eusprig for consideration (over an hour before the deadline!), should find out by the end of the month if its accepted or not.

Any comments welcome, either here or by email. 

I’ve tried to mention everyone who helped, thank you!

If you think I’ve missed you (and you weren’t being anonymous) drop me a note, or leave a comment.

cheers

Simon

Excel xlls and the C API

Monday, 5th March, 2007

My list of promised but not delivered items on this blog is getting longer and longer, but here is another…

I’ve had a request for some introductory training in writing xlls in C/C++. Doing that face to face may prove to be unrealistic, so I’m thinking of producing some stuff here for people to have a go with and discuss etc.

Is that something people would be interested in? Leave a comment if so, and I’ll raise it up my priority list (I’m thinking over the next month or so).

I’m thinking of tackling it from an Excel VBA dev perspective rather than C/C++, so I’d mainly be covering just how to get your code into worksheet functions – you’d have to sort out your own C/C++ code. I’ll use Visual Studio 6.0, not because I’m in some Northern time warp, but because it has no .net noise. At some point I should move to a newer IDE and revel in the productivity improvements, one day. If you don’t have an IDE then I think you should be able to download Visual Studio C++ Express for free, get the platform SDK too and you should be sorted. Another option is http://www.bloodshed.net/ (Dev C++)

To get you started there are some book reviews here.

Key xll book summaries:

Steve Dalton – Excel Add-in development in C/C++, thorough coverage, excellent content, most helpful if you are happy with C++, some fair barriers for pure VBA folks. Comes with useful framework code to wrap up the C API.

Microsoft Excel 97 Developers kit (hard to get hold of now) – covers use of their C based framework, makes sense from a VBA POV

Bullen, Bovey and Green Professional Excel Development – has an excellent chapter on xlls, should make sense to VBA developers.

Some open source/free-ish xll frameworks:

http://xlw.sourceforge.net/ (Started by Jérôme Lecomte, recent updates(C++))

 http://sourceforge.net/projects/xll/ (Keith Lewis xll framework (C++))

http://support.microsoft.com/kb/q152152/ (latest version of MS xll framework stuff, C rather than C++)

To get a sense of what bits of Excel you can play with via xlls you should get hold of MacroFun, this is the original Excel 4 XLM help file. These are the functions and features that are exposed via the C API. 

You will also need the C header files these should come with the MS framework above, but if not http://support.microsoft.com/kb/178474.

They’re the main ones I can think of off the top of my head. If there’s anything I have missed be sure and leave a comment. Other than that I’m sure google (or your favorite other searh engine (anyone recommend others?)) can help.

Xlls get a significant revamp in Excel 2007 and are the stated add-in technology of choice, so this isn’t going to go away anytime soon.

Cheers

Simon

Microsoft Office Developer Advisory Council

Sunday, 4th March, 2007

Good news for devs working in the Microsoft Office space. MS have formed an advisory council to focus specifically on the developer story for the next release of Office (codename Office 14 apparently). I can’t really tell you exactly whats involved as

  • a. I don’t know
  • and b. if I did the details would probably be under NDA 

But I can say its a long term thing (think multiple year), its focused on the developer perspective, and I’ll be involved on the Excel side. If anyone else reading this is involved then either leave a comment or drop me an email. Especially if you are off to Seattle in a few weeks (and fancy a beer). From a blog p.o.v. don’t be expecting lots of juicy info (well any info actually), everything will be under NDA.

Hopefully I have made some of my views fairly clear through my previous posts, if you have anything you would like me to pass on to the product teams (other than the ribbon is bobbins) then let me know. Obviously I’m not promising anything, but I am keen to get the Excel/VBA developer view across, so be sure to tell me what you think. It is a public blog anyway of course, so you never know who is reading ;-).

I think this connecting with the community is an excellent move by MS, and should benefit us all, what about you?

cheers

Simon

ps if you’re not from the North of England, you may not have heard of Frank Sidebottom (Francis Siddyb’tem) and his signature song – ‘the Robbins are bobbins’, bobbins does not mean good, and can be used in polite company.