Apps Development

03. Discovering the Limits

Limits of Excel as Data Tool

People do use tools not specifically designed for the task sometimes when the best tool for the task is not available or — in my case — not entirely to be trusted. I’m sure my distrust of proprietary software is slowly becoming apparent — perhaps too much so — and I might have to edit these pages to clear all that stuff out, eventually. In any case, this is why I used an Excel spreadsheet for database use. Well, one reason, anyway: we don’t have access to Access at work.

But now, we’re going to discover that using a flat-file database (which is essentially what a spreadsheet is) has its limits. The second page of data we copied to sheet is a list of our transactions. On this sheet, that first column — the transaction numbers — is made up of unique numbers. On the first sheet, that column has duplicates, since one transaction can involve a purchase of more than one item.

In order to easily manage a one-to-many record set, we’ll need a relational database. Yes, it can be done in VBA code, but here is where another shortcoming of using Excel versus using a RDBS (relation database system) comes to light: SQL. SQL stands for ‘Structured Query Language’. Excel doesn’t support it: most RDBSes do. And when you’re doing reporting, SQL makes your life SO much easier!

So, we’re going to make a change, but it will be bigger than it first appears. For reasons that will become increasingly apparent to anyone reading these pages, it is of utmost importance to me that I own freely what I will have created. So, I wish to create this in software — and in an OS — that supports that freedom. I will not be bound by some EULA that can dictate — at some point down the track, as Instagram users have discovered to their horror — what is to be done with what I create. Here are the proposed steps to take:

  • Chose a libre Operating System, i.e., some flavour of GNU/Linux
  • Continue development in the same essential Language — BASIC — in GNU/Linux
  • Use a stand-alone data container, again, something libre — I chose SQLite

For OS, I run Linux Mint 13 Maya 64-bit. On the issue of ‘free’ versus ‘libre‘ I have addressed it thus: I have donated to the Linux Mint project and will continue to do so. Whilst they are not strictly GNU(freedom)-compliant in that they do include some non-free software in their distro, at this point it is a significant step away from the malware that is Windows. Nothing reports back to the mother ship.
For software, I have purchased the dual-platform version of Q7Basic and the 3-platform licence for PureBasic, neither of which I’ve been using at all. Instead, I’ve been developing in Gambas which is free. To this effort I have donated as well.

All this to say: I have no intention of perpetuating the notion that I’m promoting free software as to mean ‘no-cost software’. That is not the meaning of ‘free’ in this context, which is why I use the term ‘freedom’ or ‘libre‘ to differentiate and clarify that I’m espousing a cause, not being a cheapskate.

The fewest get that.

Preparing the Data

After the date conversion has been done to both spreadsheets — Itemised Purchases and Transactions — it’s time to get them ready for export to csv. So:

  1. Copy these sheets each to a new workbook
  2. Remove the columns that do nothing: the ones with ‘Reset’, ‘Complete’ or ‘View Order’ and the old, pre-conversion dates.
  3. Check there are no commas anywhere in any of the fields, which will stuff up the import to SQLite – replace any commas with a dash ( – )
  4. Save to .csv, comma delimited.

Then, in the Transactions ‘table’ (which is what they’re going to be) I’ll have 3 fields:

  • PurchaseOrderNumber (unique values)
  • Amount
  • DateOfPurchase

…and in the ItemisedPurchases table I have 3 fields as well:

  • PurchaseOrderNumber (NOT unique values… there are some duplicates)
  • ItemName
  • DateOfPurchase – removed

The Date of Purchase field is actually redundant in the Itemised Purchases table, so I’m going to remove it.