GSoC 2010 WordPress proposal: Plugin to mirror blog content via XML-RPC

Yep, I’m submitting a second proposal to WordPress 🙂

Overview

This project was born out of a personal need: I had a WordPress install running locally on my laptop and wanted to mirror the notes I took there – from one-sentence thoughts to entire stories – on a permanent, stationary blog elsewhere on the web (the progression of implementations can be found on my personal projects wiki). The result was a plugin that saves the settings of the remote WordPress blog and automatically mirrors posts to it whenever an internet connection is present. If there is no connection, I can mirror (or un-mirror) posts manually via a form at a later time.

The basic implementation is already functional as a WordPress plugin. However, there are numerous bugs and scalability issues to address, and within the setting of GSoC there is room to add significantly more robust functionality.
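
To make the "mirror now if connected, otherwise hold it for later" behaviour concrete, here is a minimal sketch. It is written in Python purely for illustration (the actual plugin is a WordPress plugin, and this is not its code). `metaWeblog.newPost` is a method that the WordPress XML-RPC endpoint exposes; everything else (`make_sender`, `mirror_or_queue`, the `pending` list, the shape of the `post` dict) is a hypothetical name chosen for the example.

```python
import xmlrpc.client
from typing import Callable, Optional


def make_sender(endpoint: str, username: str, password: str,
                blog_id: int = 1) -> Callable[[dict], str]:
    """Build a function that publishes one post on the remote blog.

    `endpoint` is assumed to be the remote blog's xmlrpc.php URL.
    """
    proxy = xmlrpc.client.ServerProxy(endpoint)

    def send(post: dict) -> str:
        # metaWeblog.newPost takes a struct with "title" and "description".
        content = {"title": post["title"], "description": post["body"]}
        return str(proxy.metaWeblog.newPost(
            blog_id, username, password, content, True))

    return send


def mirror_or_queue(post: dict, send: Callable[[dict], str],
                    pending: list) -> Optional[str]:
    """Try to mirror `post`; if the network is unavailable, queue it.

    Returns the remote post ID on success, or None if the post was queued.
    An xmlrpc.client.Fault (e.g. bad credentials) is deliberately not
    caught: retrying later would not fix it.
    """
    try:
        return send(post)
    except (OSError, xmlrpc.client.ProtocolError):
        # No connection (or a transport-level failure): keep for later.
        pending.append(post)
        return None
```

Passing the `send` function in as a parameter is a design choice for the sketch: it keeps the queueing logic testable without a live blog on the other end.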

Description

These are the main points of focus I want to hit in this project.

  • Support all WordPress content types: Beyond posts, the plugin will also allow the user to mirror comments, pages, and even drafts.
  • Process backgrounding: This covers two aspects of the plugin. First, if a user manually selects a large amount of content to mirror (perhaps thousands of items), the plugin will use AJAX or simply batch the work and run it in the background. The process will not only be invisible to the user; if the user navigates away or the process is suddenly terminated, it will also resume automatically and invisibly until it has completed (similar to the current import implementation in the WordPress core). Second, if the user has configured the plugin to mirror content automatically as it becomes available and no internet connection exists, the pending items will be queued until a connection can be established, at which point the process will again run in the background, invisible to the user (perhaps via a cron job).
  • Configurability: The user will be able to specify which types of content they would like mirrored automatically at the time of creation (even newly posted comments can be mirrored automatically), or to turn automatic mirroring off entirely and pick and choose content manually at a later time. Perhaps even a “sliding window” option could be offered – to mirror only a certain window of time (or number of items), automatically un-mirroring content once it falls outside that window.
  • Multi-blog support: The current implementation supports only other WordPress blogs. I would like to put the full capabilities of the WordPress XML-RPC implementation to use by adding support for mirroring posts on other blogging platforms, insofar as they support the sort of content that WordPress does (I would need to work out specifically which content items in WordPress translate to which content items on other blog types).
  • Robust error handling and recovery: This was partially addressed in “Process backgrounding” but it’s important enough to make explicit. Since there is already (and likely will remain) an option to “Mirror everything at once”, this process can take a while and may be extremely error-prone as a result. Furthermore, this plugin was designed specifically with intermittent network connections in mind, so there is never any guarantee that a process can be run, or that it can be finished once it begins. For that reason, process batching will be supported, along with an approach similar to the current WordPress importer’s error handling, where the process is resumed later if interrupted. I would also like to work with the XML-RPC gurus directly to identify any and all significant points of failure within the protocol’s implementation, as I have encountered strange errors under stranger conditions in this regard, and these need to be accounted for as well.
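
The batching and resume-after-interruption idea from the list above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the plugin's code and not how the WordPress importer is implemented: it assumes each content item has a stable local ID, that `send` raises `OSError` when the connection drops, and that progress can be kept in a small JSON checkpoint file. The names `mirror_batch`, `load_done`, `save_done` and the `checkpoint` path are all hypothetical.

```python
import json
import os
import tempfile
from typing import Callable, List, Set, Tuple


def load_done(checkpoint: str) -> Set[int]:
    """Read the IDs of items already mirrored (empty set on first run)."""
    if not os.path.exists(checkpoint):
        return set()
    with open(checkpoint, encoding="utf-8") as fh:
        return set(json.load(fh))


def save_done(checkpoint: str, done: Set[int]) -> None:
    """Write the checkpoint to a temp file, then rename it into place,
    so a crash in the middle of a write cannot corrupt the checkpoint."""
    directory = os.path.dirname(os.path.abspath(checkpoint))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(sorted(done), fh)
    os.replace(tmp, checkpoint)


def mirror_batch(items: List[Tuple[int, str]],
                 send: Callable[[str], None],
                 checkpoint: str,
                 batch_size: int = 50) -> int:
    """Mirror up to `batch_size` items that are not yet in the checkpoint.

    Stops at the first connection failure; the next call picks up where
    this one left off. Returns the number of items mirrored in this run.
    """
    done = load_done(checkpoint)
    todo = [(item_id, payload) for item_id, payload in items
            if item_id not in done]
    sent = 0
    for item_id, payload in todo[:batch_size]:
        try:
            send(payload)
        except OSError:
            break  # connection lost: leave the rest for a later run
        done.add(item_id)
        save_done(checkpoint, done)  # record progress after every item
        sent += 1
    return sent
```

A scheduled job could call `mirror_batch` repeatedly until it returns 0. Note that if the process dies after `send` succeeds but before the checkpoint is saved, that one item would be sent twice on resume, so a real implementation would also need a way to detect duplicates on the remote side.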

Time permitting (big if), I would like to look into the possibility of adding some sort of encryption to the XML-RPC implementation outside of requiring HTTPS. But that’s just a “would be nice”.

The biggest risk I see to this project – aside from ensuring proper testing and coming up with good translation paradigms between different blog types – is the final item: handling errors correctly. Over my current plugin’s development history, I have frequently corrupted the database of one or the other of my two blogs when something didn’t mesh quite right between them, largely due to the complete lack of any coherent error-handling strategy (though it is worth pointing out that this happened far more often with early iterations, before it was actually a plugin, than with later ones). All points of failure must be identified and accounted for, along with as many user actions as possible. I would love to work directly with the WordPress devs on this one as well, as I’m sure they can fill me in on some of the most bizarre user actions imaginable that still need to be kept in mind.

Conclusion

This would be a fantastic opportunity to bring the plugin I wrote up to standards. It certainly works as is (and I use it every day), but it has the potential to do quite a bit more. Please feel welcome to leave feedback, and as with my previous proposal, I hope you’ll give this one strong consideration 🙂 (all listings of qualifications and previous experience are in my official GSoC application, which was already submitted but without a link to this entry, so I do apologize for blanking on that part).

About Shannon Quinn

Oh hai!
This entry was posted in GSoC, Programming.

4 Responses to GSoC 2010 WordPress proposal: Plugin to mirror blog content via XML-RPC

  1. Colin says:

    Yeah dude that’d be awesome

  2. eksith says:

    Speaking of multiple blogs…

    WordPress has the import/export features which can really be exploited here. I think it can already tell which posts have been imported and which remain, so an autosync utility can benefit a lot in a push/pull situation involving newly created posts, drafts or comments etc…

    If there is some way to turn the manual import/export functionality from an XML file into pure XML-RPC, that’s a giant first step toward reusing some of the existing WP functionality.

    • magsol says:

      Yeah, I’ve toyed with the idea of a two-way system (having plugins installed on both ends) and that could do some really cool stuff. But yeah, just making the conversion from XML files to pure XML-RPC, even one-sided, would really be worthwhile 🙂 Hopefully WordPress agrees with you!
