Archive for February, 2010

Microsoft operating profit by division

Monday, 22nd February, 2010

I saw this chart here.

MSFT operating profit by division

Ignoring for a nano second the dreadful line/area approach instead of the far easier to interpret column style, a couple of things jump out at me.

Didn’t Windows, and Microsoft get a up-tick from the Win 7 release?

And secondarily, isn’t Office absolutely flat over the same period? Even though Sharepoint has allegedly become a billion dollar product over the period (revenue not profit), and Great Plains was integrated into MDB in 2006, not sure if thats in the chart or just before – if its in it didn’t have a profit impact did it?

Look at March 2007 the official start of the Oww starts now campaign for Vista of Office 2007.

I don’t think you could look at the chart and say Office client has been a runaway success, to me considering the success of the server tools side of Office, it looks like Office 2007 client has lost impact.

It won’t come as news to some of you that I think the ribbon is the all time most expensive mistake Microsoft has made so far. I wouldn’t be so blinkered to say this chart categorically proves it, but it certainly doesn’t show ‘our fastest selling ever’ either.

I suppose I could dig out the underlying numbers to see a bit more clearly, in particular I would have expected a bit of a drop when they stopped selling 2003, although perhaps the heavy influence of EA and SA enterprise licensing hides the retail story.

Anyway, how do you interpret this? to support your preconceived idea or disprove it?

cheers

Simon

xlls With ExcelDNA

Thursday, 18th February, 2010

Excel DNA is a free open source tool to expose .net UDFs to Excel via the xll interface.

The code is here, the blog is here. The forums and discussions are here.

It has been my destiny to try ExcelDNA for a very long time, 2 or more years I think it’s been on my todo list. The primary blocker was the need to have .net 2.0, and me keeping my main dev machine at 1.1 for client compatibility reasons. Anyway, new year, new machine, new approach. I’m on Office 2007 and Visual Studio 2008. So Excel DNA is suddenly on my radar.

Excel DNA has 2 main modes, text file mode and .net assembly mode.

In the first one of these Excel DNA is effectively 3 components [EDIT: My bad – as Mike and Ross and Govert have pointed out it is 2 Components because you don’t need to distribute first one]

  • The ExcelDNA.Integration.dll assembly to glue the next two bits together
  • The ExcelDNA.xll caller
  • The UDF in a text file.

To work with ExcelDNA in text mode you only need notepad, or a decent text editor (Notepad++ for example). Visual studio might be handy, but is not essential.

You put the 3 2 components in the same folder, rename the .xll to the same name as the text file. Give the text file a .dna extension, double click the .xll and enable macros and your functions from the text file are there. The code in the text file can be written in any .net language. The demos have VB, C# and F#.


Once you open the xll ExcelDNA registers all the functions it finds in the .dna file. Note there is no direct file lock on the .dll. This means you can code and refresh the .dll without restarting Excel – this can be really handy. I guess if you go and change the parameters or return type you’ll be in for some fun, but for changes to the implementation this is a real time and frustration saver.

In the .net assembly mode there are 4 3 components, the 3 2 from above and the .net assembly. But now instead of the .dna text file containing the function, it just contains a pointer to the .dll assembly, that contains a compiled class library. These can be marked threadsafe, as can the text files I think. This approach makes a big difference, in my limited testing a threadsafe .dll calculates in about a quarter of the time of the text file approach, and this puts it about twice as slow as raw excel or a C xll, close to 10 times faster than VBA.

If intellectual property concerns are an issue then you need to go the .net dll approach, and trust in a .net obfuscater.

For the ease of use, the fact you don’t need Visual Studio, or if you do use it to get the fastest performance, you only need a free Express edition, Excel DNA is extremely impressive.

To create a faster than VBA udf you just need to copy your VBA out to a text file, add a couple of flags, make sure its valid VB.net, change the file name and the .xll name, open the xll and you are away. If you have .net code to expose to Excel just compile it and point the text file to it. My results were a little erratic with the class library I was using, so I’ll need to check that again. As the whole thing is open source I can always trace things right through to understand what the issues are. Bear in mind it is still under development…

[Edit: Multi-Threading could be unreliable in certain situations in 0.21, this is fixed in 0.22]

Deployment is straightforward too, there are several files (3 or 4 2 or 3) that need to end up in the same folder, .net 2.0 or later is needed. But there is no registry messing required, no formal install.

ExcelDNA also covers command based stuff, but I didn’t test that for this review as my focus for now is mainly on UDFs.

ExcelDNA is targeted towards Excel VBA devs who want better udf performance, and it delivers that well. I think its something that deserves a bit of investigation for many Excel VBA devs because it is so accessible.

In summary:

  • Simple deployment
  • Good performance
  • Simple development
  • Simple to expose existing .net resources to Excel
  • Free and open source
  • Also has command features, that I did not test yet.
  • Choice of .net language

On the downside:

  • Still under development
  • Multiple file approach may not suit all situations
  • Lifetime management might be a pain, as with most add-ins

Are you using ExcelDNA?

What are your experiences?

cheers

Simon

Excel xlls with tools

Wednesday, 17th February, 2010

Next I’m going to look at some of the tools that are available for writing xlls.

These are the ones I’m going to cover:

  • XLL+ from Planatech
  • Add-in Express
  • ExcelDNA
  • Possibly one or two others time/trials permitting.

They each have different strengths and weaknesses and so each will be more suitable for different groups of people, and different jobs. And thats the bit I want to cover. So I’m not especially going to give blow by blow instructions as covered in each products help file. the intention is to give a flavour of the effort involved and the returns for that effort.

Hopefully we already know the basic requirements for (UDF) xlls

  1. a function that takes some parameters and returns a value, or set of values
  2. a way to tell Excel about it so it can be called from a cell.
  3. Thats it.

If you think there is a tool that I have missed then let me know. I’m kind of writing this from the perspective of an Excel dev who want to get into fast worksheet functions. A few of the tools I have tried seems to be targeting a different group so I may not mention them.

Yeah I know this is out of order, not sure how come I missed this before posting the XLL+ review. ho hum.

cheers

Simon

Xlls with Planatech XLL+

Wednesday, 17th February, 2010

A 30 day trial is available here.

Pricing is here, (about 1,100 USD/700GBP for a single dev license).

XLL+ is a C++ based Visual Studio Add-in for writing xlls. That means you need VS Professional not an Express version. It’s also C++, if that moves it outside your interest area, I would say read on don’t skip the rest of this – perhaps there are features that would make it worth your while to invest in C++.

XLL+ adds a new xll project type.

As you are setting up the project you need to consider which advanced features you may want to include

I’ll discuss some of these below, the defaults are fine for most simple xll projects, but you may need to make changes here for some of the more powerful features. This is a review rather than a tutorial though, so I won’t be covering all. The product comes with good help files, and they can be downloaded separately if you want to check then out before downloading the tool.

In one of these XLL+ projects you can use tools>>New XLL+ Function to get a simple wizard to guide you through creating a udf (or command).

There are some incredibly powerful features here just exposed by a single check box.

Cache results, for example easily allows your function to hold a local copy of its results to save an expensive database round trip for example. The wizard generates the code to check the cache first, only calling your function if needed. I guess you have to decide how the cache gets invalidated.

There are others like asynchronous functions, this will be a new integral part of Excel 2010 and the 2010 SDK, but with XLL+ you can use it in every version of Excel. With an async function the calc sets it off, and its return value is initially ##NO DATA## or something, then at some future point when your function has finished calculating it calls back into Excel to update that value to be the real result. This is very valuable for slow functions.

Once you get through the wizards and drop into the actual code (C++ as mentioned before) the most obvious thing is that XLL+ has generated two versions of your function: a pre 2007 one, and a 2007 one. This saves a major pain if you are targeting multiple Excel versions. It has also sorted all the registration issues to get your function correctly registered in each version on Excel. This can be changed via the wizard for example if you want to add arguments or change data types. Helpfully it wraps it all up in exception handlers too.

XLL+ then provides a bunch of useful classes to interact with the Excel host, some of these are not available in UDFs, but XLL+ can also generate commands (the equivalent of the Sub in VBA). In many cases manipulating the Excel OM through the C API is staggeringly fast compared to the VBA equiv. For example the worksheet password remover here (written without XLL+) is an order of magnitude faster than a VBA or VB6, or .net version that uses the COM OM. I recently did some styles stuff in XLL+, again 10-100 times faster than the VSTO equivalent. XLL+ doesn’t give full coverage, but it allows raw calls to the underlying SDK for anything that is not common enough to be included.

At the end of the day though you are going to need to write your udf in C++, unless….

You decide to make use of the ‘new-in-the-latest-version’ feature to wrap .net functions and pass them through the xll interface. For an enterprise with an existing codebase in C# being able to expose those through the fast xll interface instead of slow automation functions is a massive benefit. Deployment is easy too, just put the xll and the .net assembly dll it wraps in the same folder and you are done, no registry, not path hassles, it just works. There are limitations around the data types the .net component exposes and whether they can be converted effectively to types the xll interface can pass. The end result is a little slower than a pure C/C++ xll, because there is still some marshalling from native to managed, but its much faster than the automation approach because there is no COM layer in the way.

This would also work if you code your udf in VB.net of course

XLL+ also provides logging out of the box so you can keep track of whats going on, and there are basic performance profiling features too. And as it is all written in C++ you can extend as you see fit with your own, or someone elses C++. The other big (and somewhat advanced) feature is handles, XLL+ makes it very simple to create a large object in the xll memory then make various bits of that available to Excel cells.

I would say XLL+ is certainly targeted more at the advanced end of Excel developments, its not really a hobbyist’s toy. If you can see the benefits to the work you do then you really need to set aside something like a week to get to grips with the product and start to understand how you can best get it to work for you.

In the past I have just fired it up and hacked together a UDF that does what I want, roughly. These last few weeks though I have spent much more time reading the help files and trying the samples, even those that do not seem relevant to my current projects. I now have a good sense of what the product can do and how best to make it do it.

I compared XLL+ performance to one of my hand written C UDFs and was shocked to see XLL+ was faster. What’s this I thought? all that clunky C++ faster than lean mean C? then I realised XLL+ had registered its version as thread safe and I hadn’t with mine (yep thats right, I’m in E2007 at the moment). Once I fixed that they were identical, except the XLL+ one was easier to write (and a much bigger file: 500kb v 48kb).

For me killer features are

  • Simple initial deployment
  • Excellent performance
  • Easy to develop compared to raw xlls
  • Ability to wrap .net dlls and pass them through the xll interface for decent performance (way better then VBA for example)
  • Easy Excel version neutrality
  • Automatic help file generation

The biggest downsides are

  • Need strong C++ to get the most from the product (perhaps not if you purely wrap .net assemblies)
  • The way the Excel OM is exposed is quite a leap for a VBA dev.
  • Significant learning curve for Excel devs

In summary I think XLL+ is perhaps aimed more at enterprise type C++ devs needing to expose their powerful C++ libraries to Excel users, rather than Excel power users and developers. The latter will need to invest in some C++ to get the best return on their financial investment if they buy the product. (And good luck getting C++ training!) That said dedicated performance buffs should seriously consider making the investment.

Have you used/do you use XLL+? what do you think?

Next up (in a couple of days) I‘m going to take a look at the opposite end of the cost spectrum.

Cheers
Simon

ps I’m going to do a proper performance summary in a week or so once I have reviewed all the tools in my list.

Raw xlls

Tuesday, 16th February, 2010

When I say raw, what I mean is just with the Microsoft supplied SDK (and your choice of Visual Studio version (including Express versions)).

I’m not going to repeat all the stuff from here, so if this area is new to you then feel free to start over there.

If you are planning an xll that will simply take some basic parameters (strings, doubles, arrays etc) and return some simple data type then a raw xll may be just the thing. The biggest gotcha for a VBA dev is the lifetime of variables. A local variable in C lasts until the function it is defined in ends. Which means that if you return that variable as the result, by the time it gets back to Excel the address may not be valid. 99 times out of 100 you may get the expected result, every now and then you might get nonesense or an access violation crash. The quick fix prior to Excel 2007 was to declare the variable used as the return as static. Sadly that doesn’t fly in a multithreaded world.

The quick answer is if you want to take advantage of the potential 25% or so speed increase of MTC in 2007 then unless your C/C++ skillz are up to it you may want to look at tools to help, or a hybrid approach. The 2007 SDK does have some Thread Local Storage features, but thats quite a step up from a simple C function.

You can write the function as a normal native windows dll project in C or C++, and then you just need to register it with Excel.

You can do the registration part in VBA or XLM, and once its registered it can be called directly from cells. I covered REGISTER() here. Or you can embed that code in the xll using the xlAutoOpen routine to call Excel4 with xlfRegister.

The biggest ‘correction’ I get when talking to clients about xlls is they assume when I say C or C++ that I mean C#. But xlls are native code, and Microsoft has been unable and/or unwilling to create a compiler from C# to native code.

If your add-in will be calling back into the Excel OM, for example to get cell colours, or some other property, then you need to be sure your XLM skillz are up to the job. For example if you want to know the address of the last cell you need to check MacroFun.hlp to work out what number to pass GET.DOCUMENT (which becomes xlfGetDocument in C). Here again some of the commercial tools may help, as might a hybrid approach.

Raw xlls are a single file (usually) so initial deployment is easy, lifetime management is not so easy as the xll will be locked whenever its loaded into Excel meaning updates are not simple to deploy.This is like xlas which is why many enterprises use an add-in loader. xlls need no installation and the file can be anywhere on the local machine or the network, uninstallation involves either deleting the file or removing the link in the \Excel\Addins part of the registry.

If you are targeting multiple versions of Excel, then there is additional overhead in lining up data types to version, but only really if you want to benefit from the improvements in 2007 and beyond. A xll targeting Excel 2003 will work fine in 2007, no recompilation is necessary, and if the base function is threadsafe you can just register it as threadsafe in 2007 to participate in MTC.

There are no hidden references in xlls, if the function name is used in a cell and Excel finds it in its registered functions list it calls it (if not, you just get a #NAME? error). Unlike VBA UDFs in xlas where the references can get very tangled and file path sensitive.

So in summary raw xlls make sense when

  • you have reasonable C/C++ skills
  • you understand Excels C API
  • The function only deals with basic data types
  • simple initial deployment is useful
  • future updates will be rare
  • you don’t want to buy a commercial tool.

The write/test/correct cycle is more involved than VBA because you have to unregister your dll from Excel to release the file lock. The unregister functionality has never worked so the easiest is to shut down Excel to edit the xll.

There are tons of C++ resources around and its very much a living language, both within the Microsoft world, and outside. C++ gets a big revamp in VS2010.

I have a base project here that separates out the actual function, and all the registration mischief. Hopefully that split makes it simple to see the different elements. I am contemplating making the tool I use to generate those sort of projects available commercially, I’d welcome any views on that…

Anything to add? or disagree with? (xlls with tools coming next)

Cheers

Simon

Managing Excel Extensions/Customisations

Tuesday, 16th February, 2010

Great question about what I mean by managing an add-in. What I mean is how do you maintain appropriate control over its whole lifetime.

If you consider the lifetime of the app (well if you did you would be in the smallest of small minorities!):

  1. You need to develop an understanding of what is needed
  2. come up with some idea on how best to implement the conflicting requirements
  3. build it
  4. test it
  5. Get it out to the users
  6. maintain it’s operational integrity
  7. make corrections, updates and enhancements
  8. facilitate appropriate removal

I guess I’m talking about 3 onwards, or perhaps 2, maybe 1 if you consider prototyping.

These are partly to do with the actual technology, and also partly to do with the availability and accessibility of code, components, resources etc. So for example the unit test frameworks that target .net are probably a little more mature than those that target VBA.

For C# there is the .net framework itself, and the .net MVP contributions. For C++ there is std and boost and all the open source stuff, for VBA there is all the MVP contributions and a ton of VB6 stuff.

You would have to be pretty blinkered to claim one tech is better than all the others. They each have their place and the most appropriate will depend on many factors, some of which are above, others may be personal to the developer or the client. Existing body of knowledge and skillset are a couple of these that can be very influential, and sometimes in unexpected ways.

I’ll point out the highlights and lowlights that I know about as we go through

Cheers

Simon

Excel Customisation/Extension summary

Monday, 15th February, 2010

Just a quick post to summarise the main techs for extending and customising Excel. This is a precursor to a few posts coming in the next couple of weeks so I don’t have to drip feed it.

There are two main types of Excel customisation

  1. Additional Worksheet functions.
  2. Additional Commands

There is an additional type which is an automation app where your code grabs a running excel or starts a new one, but eventually it will either add commands or functions or just manage the way the user interacts, perhaps with one of them godawful dictator apps. Where the developer removes all the useful features from the clients Excel and hopes to be able to reinstate them when their app closes. I’m not going to talk here about Automation applications, not because they are useless or anything, just they are not as common as the other types. I guess you could argue for other types of customisations that remove functionality, but I won’t be covering them either.

I’m going to look at the first one of those, Worksheet functions over the next week or so. This is some investigation I have been meaning to do since last summer, so its not as delayed as many other things on my todo list.

You can write User Defined Functions (UDFs) in a range of technologies:

  1. VBA – easy to write, easy to deploy, hard to manage, limited protection, stable, slow, pretty much unusable IMO
  2. .net – automation add-in, harder to write, harder to deploy (needs registry changes), hard to manage, reasonable protection, no idea on stability, probably not as good as VBA, much slower than VBA, even using VS2008 and Excel 2007. These are completely unusable IMO.
  3. xlls – harder to write, easy to deploy, not so easy to manage, good protection (fully compiled), pretty stable if written well, native Excel levels of performance (Ie fast)
  4. Hybrid – some combination of the above.
  5. XLM – yeah baby! XLM can be shockingly faster than VBA, or slower, its about as easy to write assuming an equivalent skill level, but those skills are much rarer. There is also a hint of XLMs imminent demise in Office 15 (2012?)
  6. There are probably some minor ones I missed – let us know in a comment.

I’m not going to cover VBA, partly because its covered to death all over the internet, but also because personally I would have to be in a serious serious mess before I would consider using a VBA UDF because their performance is so poor.

Automation add-ins I’m not going to cover because I assume its some sort of sick joke from Microsoft.

I’ve done XLM before so that basically leaves xlls and hybrids, so thats whats coming up this week. I have had a few commercial tools on trial and have requests outstanding for others.

There are 2 basic types of UDF

  1. Those that call back into the Excel Object Model somewhere (perhaps to get the background colour for example)
  2. Those that just work on the arguments passed in as values

This becomes quite an important distinction when using anything outside of VBA/XLM.

First up raw xlls tomorrow, or the next day.

cheers

Simon

Spreadsheet Barbie 2011?

Sunday, 14th February, 2010

I spotted a link to here from el Reg. Seemingly people have voted for a computer engineer version of Barbie.

If it sells maybe they will vote for a spreadsheet Barbie for 2011?

I wonder if the iPhone app store halo effect might start to (or already be starting to) bathe the IT profession in a slightly more positive light?

I can’t imagine that, if it happens, to filter down (up?) to the humble spreadsheet user/developer, can you?

How would you dress/accessorise a spreadsheet Barbie? (gotta be a #REF!/#VALUE! teeshirt at least, unless she is an independent (then she would just be in her pyjamas and slippers all day right??))

Do you think the image of IT workers is improving?

I have noticed that a great many films now have some hacker geek in who can perform miracles (eg Italian Job 2, National Treasure), perhaps we are getting cooler as a profession?? Of course I’m not sure how being able to hack the traffic grid matches up to getting your 4 line array monsta-formula to work?

cheers

Simon

VSTO

Thursday, 11th February, 2010

In the olden days we used to call VSTO visto.

Then MS started talking about vista and it all got a bit scary – if someone said that vist_ is rubbish, no one knew if the were referring to their unusable version 0.9 office customisation technology, or their unusable version 0.9 operating system.

So the word came from on high that henceforth (2003?2004?) the office customisation tech shall be known as V-S-T-O, V.S.T.O. – whatever but not anything with a i in it.

It always makes me think of this:

ha ha ha

(I have always hated that song)

wait for the d, delirious, i, instantaneous, s, source control, c, curly brackets, o, object oriented

shame there is no m for memory leaky

cheers

Simon

The Better Spreadsheet Fallacy

Wednesday, 10th February, 2010

There are, and always have been a stack of ‘better than spreadsheet’ applications trying to encourage spreadsheeters away from their big grids. Improv, I’m looking at you.

Many of the live spreadsheets I see are ugly, damned ugly. Most of them would be far better in almost any other tech,( even an abacus for some of them!).

And I think from this mess of live spreadsheets, people think that a better tool will result in better applications to replace theses ugly pugly indirect offset nightmares.

Thats a fallacy though. Heres why:

The spreadsheet is the journey not the destination. When you look at a live spreadsheet you are not seeing what the developer initially intended, you are seeing all the twists and turns, scope creep, blind alleys and good days and bad days of the development lifecycle. And more often than not you see the shortcuts the developer had to make to meet the (perhaps constantly changing) timescales.

The main strengths of spreadsheets is their ability to support the developer laying out their ideas, and developing them over time. This is why I think they are a great (unmatched?) prototyping tool. What’s left after it goes live is often suboptimal, which is why I’m a big fan of migrating stabilised spreadsheets to a more production ready technology. But I have yet to see a tools that supports that freeform brain dump-organise-enhance-use cycle as well as a spreadsheet.

I think the opportunity is in migrating spreadsheets to technologies that better support live use, things like databases to organise data into an easy to maintain structure, and code to allow the full suite of dev tools like source control, static analysis, code metrics etc. I am convinced the spreadsheet step needs to be in there in a great many cases.

What do you think?

cheers

Simon