To any and all: this is my GSoC 2010 WordPress project proposal. If you’re one of my (few) readers, feel free to have a look and even leave a comment if you’d like. If you’re one of the GSoC WordPress representatives, I welcome any and all feedback you might have on how to improve my application.
My goal is to augment the current import/export functionality in WordPress in the following ways:
- Alter the current import architecture, such that for each blog type, a plugin can be installed to enable the WordPress installation to import from and export to blogs of that type.
- Augment the current export functionality to include an over-the-wire option, where content is sent via XML-RPC to the specified blog, provided the user has installed the plugin for that blog type.
- Make the core importer/exporter robust enough to handle failures and resume as soon as possible.
Obviously there are a lot of details in each of these items, so I’ll try to address them all.
Import: Currently, WordPress does not use plugins to support importing from different blog types. Rather, the importers are all included in the core. I propose removing the currently supported blogs from the importer core and making each one a plugin to a new core import interface (obviously the WordPress importer would be installed by default, possibly even included in the core install). Many of the importers already use remote procedure calls to pull content from those blogs, so that functionality would be moved from its current location to a plugin whose calls would mesh with the core importer’s API. The granularity of having a plugin for every blog type would allow users to specify precisely what content from the remote blogs they wish to import to their local WordPress blog.
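To make the plugin idea a bit more concrete, here is a minimal sketch of what a core import interface with per-blog-type plugins could look like. It is written in Python purely for illustration (WordPress itself is PHP), and every name in it (`register_importer`, `run_import`, `wordpress_importer`) is hypothetical rather than part of any existing WordPress API.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical plugin signature: takes a source location, yields post dicts.
Importer = Callable[[str], Iterable[dict]]

# Hypothetical core registry mapping a blog type to its importer plugin.
_importers: Dict[str, Importer] = {}


def register_importer(blog_type: str, importer: Importer) -> None:
    """Called by a plugin to make its blog type available to the core."""
    _importers[blog_type] = importer


def available_blog_types() -> List[str]:
    """Blog types the user can currently import from."""
    return sorted(_importers)


def run_import(blog_type: str, source: str) -> List[dict]:
    """Core entry point: delegate to whichever plugin handles blog_type."""
    if blog_type not in _importers:
        raise LookupError(f"no importer plugin installed for {blog_type!r}")
    return list(_importers[blog_type](source))


def wordpress_importer(source: str) -> Iterable[dict]:
    """Stand-in for the default importer; a real one would fetch content."""
    yield {"title": f"Post from {source}", "status": "publish"}


# The default plugin registers itself; others would do the same on install.
register_importer("wordpress", wordpress_importer)
```

The point of the sketch is only that the core knows nothing about any particular blog type: it looks up whichever plugin was registered and fails clearly when none is installed.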
Export: The only method by which WordPress currently allows content export is in the form of an XML file. I propose adding an option to the current export core to send content over-the-wire, via XML-RPC. This, like importing, would allow the user to export content to blogs based on what blog-type plugins they have installed. Options would be available to the user to decide whether they want to export everything, or just published posts, or pages, or drafts, or even pick individual items to export. This would also allow the interface to reflect exactly what sort of content the remote blog is compatible with.
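As a rough illustration of the selection-plus-XML-RPC idea, the sketch below filters items by status or by individually picked IDs and then serializes one item as an XML-RPC call. It assumes the remote blog accepts `metaWeblog.newPost` from the MetaWeblog API, which is my assumption for the example and not something this proposal commits to. The helper names `select_items` and `build_request` and the item fields are hypothetical, and nothing is actually sent over the network.

```python
import xmlrpc.client


def select_items(items, statuses=None, ids=None):
    """Pick what to export: everything, by status, and/or by individual ID."""
    if statuses is None and ids is None:
        return list(items)
    return [
        item for item in items
        if (statuses and item["status"] in statuses)
        or (ids and item["id"] in ids)
    ]


def build_request(item, blog_id, username, password):
    """Serialize one item as a metaWeblog.newPost call (assumed target API)."""
    struct = {"title": item["title"], "description": item["body"]}
    publish = item["status"] == "publish"
    return xmlrpc.client.dumps(
        (blog_id, username, password, struct, publish),
        methodname="metaWeblog.newPost",
    )


items = [
    {"id": 1, "title": "Hello", "body": "First post", "status": "publish"},
    {"id": 2, "title": "Notes", "body": "Unfinished", "status": "draft"},
]
published_only = select_items(items, statuses={"publish"})
request_xml = build_request(published_only[0], "1", "user", "secret")
```

In a real exporter, the plugin for the target blog type would decide which method to call and which fields the remote blog understands.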
Performance: In both cases, the core would need to handle large amounts of data robustly, as well as unplanned failures and user actions. Right now, the WordPress core handles this well by resuming an import automatically if the user navigates away or the connection is terminated; this behavior would be kept in the new import core and carried over to the new export functionality. I’d also like to investigate batching these processes, or even handing them off to a cron job. Furthermore, I’d like to investigate the WordPress XML-RPC library itself to see whether any optimizations can be made, as previous work I’ve done has shown that exporting a few thousand entries at once (even just the titles and bodies) can take quite a while. If possible, I’d love to work with Joseph Scott on the XML-RPC investigations to find out whether any of that is plausible, as well as with anyone else on the WordPress team who has more experience with XML-RPC than I do…which is likely everyone.
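One simple way to get both batching and resumability is to send items in fixed-size batches and record a checkpoint after each batch succeeds. The following is a sketch of that approach under my own assumptions, not a description of how WordPress currently resumes imports. The function `export_in_batches`, the `send` callback, and the `checkpoint` dict are all hypothetical; a real implementation would persist the checkpoint somewhere durable, such as the database.

```python
from typing import Callable, List, Sequence


def export_in_batches(
    items: Sequence[dict],
    send: Callable[[List[dict]], None],
    checkpoint: dict,
    batch_size: int = 50,
) -> None:
    """Send items in batches, resuming from the last recorded position."""
    start = checkpoint.get("next_index", 0)
    for i in range(start, len(items), batch_size):
        batch = list(items[i:i + batch_size])
        send(batch)  # may raise if the connection drops
        # Only advance the checkpoint once the batch has gone through,
        # so a failure never skips items on the next run.
        checkpoint["next_index"] = i + len(batch)
```

If `send` raises partway through, calling the function again with the same `checkpoint` picks up at the first batch that was not confirmed, which is also the shape a cron-driven job would need.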
I hadn’t included this in my initial idea, but given that several other potential GSoC-ers mentioned it in their proposals, I thought I would add specific consideration of transferring binary data, e.g. images and videos. According to the XML-RPC spec and the implementation used by WordPress, this is certainly possible over XML-RPC, but it would also incur overhead that could easily push transfer time well beyond linear in the number of items being sent. At the very least, I’d like to make this user-configurable: perhaps an interface that highlights the entries containing multimedia content (specifically, content stored locally within the WordPress installation) along with its size, allowing the user to pick and choose which items are sent and which are kept. Again, this could also possibly be optimized using batch processing and/or cron jobs to make the transfer as invisible to the user as possible.
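To give a feel for the size side of that overhead: XML-RPC carries binary data as base64, which turns every 3 raw bytes into 4 characters. The sketch below estimates the encoded size of each attachment so it could be shown to the user before they choose what to send. The names `base64_size` and `media_report` are hypothetical, and the sketch only covers the encoding growth; it does not measure parsing or memory costs on either end, which may well matter more.

```python
import base64


def base64_size(raw_bytes: int) -> int:
    """Length of the base64 encoding (with padding) of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def media_report(attachments):
    """Largest first: (name, raw size, estimated encoded size) per attachment."""
    ordered = sorted(attachments, key=lambda a: a[1], reverse=True)
    return [(name, size, base64_size(size)) for name, size in ordered]


# Sanity check of the estimate against the standard library encoder.
sample = b"\0" * 1000
assert len(base64.b64encode(sample)) == base64_size(len(sample))
```

A listing like the one `media_report` returns is roughly what I have in mind for letting the user decide which media are worth sending over the wire.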
All my qualifications, other commitments, and reasons for wanting this opportunity (which I really, really do, by the way) are in my GSoC application that I’ll be submitting shortly after posting this. I hope you’ll give this strong consideration, and as always I’m more than happy to answer any questions, clarify anything that isn’t clear, and address any concerns you might have. Plus, comments give me the illusion of having a readership on this blog 🙂 Thank you!