WSPR – Weak Signal Propagation Reporter: how to handle its large CSV files

If you landed on this page you must have tried, with no joy, to crunch the data contained in the .csv files downloadable from the WSPRnet website. The problem is that Excel is unable to manage data volumes in excess of about one million lines. What follows is a workaround of this limitation that will enable you to open and manage this large amount of data in Excel. I used Excel 2016 to test my method and I am not sure of what the equivalent steps are on older versions (or newer versions, if you are reading this article a long time after I published it).

The wspr compressed .csv file for May 2016 is very large and in excess of 500MB; once decompressed, it becomes about 2.4GB in volume and – sit down before you read this – about 29.6 million lines! I opened large files in the past, at the time when Excel would only be able to deal with 65k lines, however nothing as large as this file.

The way around this problem is to create a query linked to the .csv file, this way you can handle more then the 1 million or so lines that Excel is able to load in a worksheet – however I confess that until I tried I wasn’t sure that it would work with nearly 30 million lines. This is how to proceed:

  1. Open a new blank document in Excel
  2. Click on the Data tab and then open the New Query option
  3. A drop-down menu will appear, pick “From File” and “From CSV”
  4. Locate your file and press Import.
  5. When the new window opens, select the Load drop-down and pick “Load To…”
  6. This will lead you to yet another window. Here you are going to select “Only Create Connection” and tick the box “Add this data to the Data Model”.
  7. This will set your computer to work. You’ll need a beefy machine to do this in reasonable time, I’ve done this on an i7 6700HQ CPU laptop with an SSD and 16GB RAM, it has done the job in about 5 minutes. You can go and brew a coffee at this point but remember to make sure that the cooling fans of your computer work well.
  8. When done, go to Power Pivot at the top of your tabs. If you don’t have Power Pivot at the top, press the Manage Data Model button under Data and you’ll end up with something similar to a spreadsheet with filtering options.

The rest is just ordinary data management. You can filter, pivot and use some other basic spreadsheet functionalities. Once you filtered the lines that you need, you can copy and paste in an Excel sheet. In my example, I extracted all the entries for reporter K9AN which, unsurprisingly, returned about 500k entries. This “small” filtered amount of data (and I doubt that any average user will approach K9AN’s volume) can now be easily handled in Excel.

I hope you will find this useful.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s