J-Source

New Excel feature takes data analysis to new levels

Reporters who use Microsoft Excel to crunch government data have something to cheer about with the release of the beta version of Office 2010. A new add-in to Excel brings Excel’s ease of use to the analysis of huge datasets. Last week I talked about the imminent arrival of the Office 2010 public beta. Well,…

Reporters who use Microsoft Excel to crunch government data have something to cheer about with the release of the beta version of Office 2010. A new add-in to Excel brings Excel’s ease of use to the analysis of huge datasets.

Last week I talked about the imminent arrival of the Office 2010 public beta. Well, it’s out now and some of the best news for journalists is a quite remarkable enhancement to Excel called PowerPivot.

I won’t get into the technical underpinnings; that’s the stuff of computer magazines.

But what it does is just cool.

Excel has some powerful analysis features, particularly pivot tables, which allow users to create summaries of their data. For example, given a spreadsheet of government contracts, one can summarize how much each government contractor received each year.

But traditional pivot tables are limited by Excel’s row limit of a million rows in the latest versions, and about 65,000 in older versions. More contracts than that and you were either out of luck, or forced to move to a more expensive (and complicated) database program.

PowerPivot helps blow away some of those limitations.

For starters, it allows analysis of external databases of almost limitless size. They can be delimited text files, Microsoft Access database files, Oracle database files etc. Using an ODBC connection—a way for a database to read files from other database systems—it can use many more formats.

Once the data is brought into Excel, you can create a pivot table or pivot chart to summarize, in theory, tens of millions of rows. Quite remarkable.

[node:ad]

And there’s more.

PowerPivot also brings the table-joining power of a relational database to Excel. So now you can create pivot tables and charts from multiple tables with common fields. The old example of government contracts and political donations is always useful for explaining this. By joining on the common field (name of donor/contractor) you can compare total contracts to total money donated and see if there are hints of political favouritism.

And if this isn’t enough, you can also use PowerPivot to join the data in Excel worksheets either to other worksheets, or to external datasets.

I tried it out on some smallish datasets, and it did everything it claims to do.

PowerPivot isn’t part of the basic Excel program, probably in part because it requires significantly more computer power to run it than Excel does. You need at least a couple of gigs of memory, plus a more modern processor. That said, almost any computer you buy with Windows Vista or Windows 7 installed is going to meet those requirements.

You can get more detail and download the add-in from the PowerPivot website. The Office 2010 beta is available here.

I’d love to hear comments back from users who give it a try. Meantime, I’ll talk more about some of the new features of Office 2010 in future posts.