Importing large data set into Easy Digital Downloads

Importing large data set into Easy Digital Downloads

I manage a client’s WordPress installation which delivers digital documents using Easy Digital Downloads as the primary plugin.

Recently they asked me to facilitate sharing documents bi-directionally with partners, distributors and subscribers.

This included adding a large document library of over 2,000 documents initially, and a small number later, incrementally.

Experimenting with EDD’s own Import routine demonstrated that it was (admittedly) “pretty useless”.

And our original design web design was flawed in that the original data import had been done by taking separate fields and merging them into an HTML table that populated the Description field. (There had been no requirement for data sharing in the original specification.)

So my plan was to move this data out, and then back into separate fields, enabling future data imports and exports in a “normal” way.

The first thing I did was to use the Advanced Custom Fields Plugin to extend the EDD posts with additional fields. This was pretty straightforward. Indeed, why the original developer and his “data guru” didn’t use this approach originally, is beyond me…

But how to get and populate these new ACF fields with the Data?

After a long wait for such, in Jan 2016, EDD 2.5 was released with a new, simple, and much more effective Data Export module. That meant I could get the Downloads data into a Spreadsheet to manipulate it. (But not using Excel, rather LibreOffice Calc – see Excel UTF-8 gotcha, here.)

Getting the right info out of the HTML Table data, required fairly complex formula: string searching, splitting and length deducing and also needing to first remove surplus HTML tags. (How it got up to three <strong> tags is also beyond me…)

For anyone interested, this is an example of how to get a formula to find variable length Data, in this case after the emboldened word “Form:from the data table:

=IFERROR(MID(F2,FIND("<strong>Form:</strong>",F2)+23,FIND("</td>",F2,FIND("<strong>Form:</strong>",F2))-23-FIND("<strong>Form:</strong>",F2)),"")

So I then had the right data, but how to get it back into the Downloads? My contracted “Data Programmer” had gone AWOL, so it was down to me.

Research kept me coming back to the Plugin WP All Import. And it supported both Custom Fields (in the paid Pro version) and ACF (with a paid Add-on). I tried the “test me” site, but it crashed as soon as I Activated the EDD plugin. <sigh> But it looked as though it should work. So I paid my $138 (Pro plus the ACF Add-on) on the basis of a Guaranteed Money Back, if it didn’t work.

After a lot of experimenting on the staging server set up to do exactly that, I had worked out that adding data back into the ACF custom fields was more straightforward than I’d originally thought, and that I didn’t even need the ACF fields add-on (and I got my $38 refunded, without any problem). As the EDD data is all referenced by the Key field Post_ID, getting the separated data back into the new ACF fields in the right record was a drag and drop operation.

WP All Import Drag & Drop csv data into field

WP All Import Drag & Drop csv data into field

So that was the add data to my new ACFields sorted.

What about bringing a whole new data set in? I’d received (and cleaned, rectified missing accented characters etc.) my external data file.

And that was working as above except for one, but critically important part – the EDD Download files and Location.

The 1st thing was getting my files onto the Server – I wasn’t going to use the Media Library routine to load up 2,000 plus docs. But once they’re FTPd onto the server, they still need to be registered in the Library. EDD used to recommend the simple Plugin Add to Server to remedy this but that doesn’t (and according to its author, never will) handle large volumes of files.

Instead, I used a more sophisticated Plugin Media from FTP. This is pretty straightforward, even if it takes time to scan the /uploads directory tree. One thing to watch out for, is to uncheck the Organise into month and Year folders option in the Plugin’s settings – otherwise it will move your files (and they’ll be in the wrong place for the next step).

And that brings us to bringing in where EDD will look for its files. The field it uses is edd_download_files which WPAllImport shows you in Custom Fields . The problem arises because that value is an Array, because it can have multiple values (you can have more than one file per Download page), which itself contains a two part array – [file] and [name].

If you use WPAllImport to Auto-Detect, you’ll see it renders a “serialized” data value. Looking something like

Key 0 and Value a:2:{s:4:”name”;s:15:”myfile.pdf”;s:4:”file”;s:70:”http://ind-tech.com/wp-content/uploads/edd/myfile.pdf”;}

Where myfile.pdf is the first record in the edd_download_field table

Initially, I didn’t understand what that was. If I used it, each record was populated (so that was something) but each one had the same value of that first record. The WPAllImport log said

edd_download_field updated with `a:1:{i:0;a:2:{s:4:”name”;s:15:”myfile.pdf”;s:4:”file”;s:70:”http://ind-tech.com/wp-content/uploads/edd/myfile.pdf”;}}` for post `Post Name

I tried adding [file] and [name] as additional lines, and got various combinations of similar output, but nothing that would render an array within an array.

This issue arises, I found, because MySQL cannot store data in “real” arrays. (Wow!?) Instead it has to serialize and unserialize array data when storing and retrieving such an array. So the vales starting “a:2” are the serialized data.

As far as I could find, WPAllImport doesn’t give you an option to handle Arrays within Arrays. That means you have to create an array field value yourself, and import that.

This means you have to have create an intermediate spreadsheet: source_data.csv >> intermediate.ods >> import_for_edd.csv

And the value you have to create has to calculate for each record, a) the value of the file name, b) the value of the path+filename, and c) the number of characters in each of those. For my spreadsheet, this meant have a formula like this:

=CONCATENATE("a:2:{s:4:""name"";s:",LEN(C2),":",CHAR(34),C2,CHAR(34),";s:4:""file"";s:",LEN(E2),":",CHAR(34),E2,CHAR(34),";}")

which produced the requisite output for each row:
a:2:{s:4:”name”;s:15:”myfile1.pdf”;s:4:”file”;s:70:”http://ind-tech.com/wp-content/uploads/edd/myfile1.pdf”;}
a:2:{s:4:”name”;s:15:”myfile2.pdf”;s:4:”file”;s:70:”http://ind-tech.com/wp-content/uploads/edd/myfile2.pdf”;} and so on…
and when the import was run it created the “unserializable” Value e.g:
a:1:{i:0;a:2:{s:4:”name”;s:15:”myfile1.pdf”;s:4:”file”;s:70:”http://ind-tech.com/wp-content/uploads/edd/myfile1.pdf”;}}
for each post (Download)

(It has to be quite complicated, as you need e.g. to have quotation marks (CHAR(34)) in the right places. In my spreadsheet , column C contained the myfile.pdf name and column D contained the path+filename.)

And that, after a lot of trial and working out pathname errors, worked! All files correctly locatable and downloadable – phew!

Hope that helps if you’re trying to do the same thing…

2 Comments

  • Santhosh says:

    23rd April 2017 at 6:48 pm

    Hi, thanks for the tip.
    If I add a price like “20.00” to the CSV and import via wp all-import to EDD, it works.
    However, for a variable price like “Monthly: 10.00 | Yearly: 120.00” to the CSV is not working. edd_price takes only the first numerical, i.e. “10.00”.
    What could be the reason? Do you have a fix for this? You help will be much appreciated.

    • admin says:

      23rd April 2017 at 8:25 pm

      I’d have to examine this in more detail (and am really pushed for time at the moment). WP All Import has lots of options including the FOREACH (for multiple values in the same field).

      I’d be surprised if one of those advance options didn’t hold the answer…

Leave a reply

Your email is never published nor shared. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.