Google Summer of Code: WordPress Proposal

Import/Export Tuning and Stress Testing

As mentioned in the abstract, import/export (i/e) functionality is core to any blog, and can make or break the user’s experience, possibly even before it has begun.  The WordPress functionality in this regard is extremely robust, but there is always room for improvement.

Stress testing and documentation will receive as much attention as coding.  Both are always important, but because this project improves functionality that already exists, stress testing is essential for establishing a starting point for the coding work, and documentation will be crucial for developers who make further improvements later on.

The first step in improving this functionality will be to design a robust, multithreaded load tester that can be distributed over a network of machines, not only to simulate real-world Internet traffic but also to push the i/e functionality to its limits.  This will identify and isolate any weaknesses in the i/e design, and will serve as the starting point for all of the other improvements.  The load tester will consist of an automated client that mimics a human user requesting to import and export WordPress blog content.  The client will choose from a set of files with varying levels of format compliance (for testing malformed input), and will throttle its data transfer to different speeds to determine how WordPress reacts to low-bandwidth transfers.  The multithreaded architecture of the client will allow multiple asynchronous requests, which both mimics real Internet traffic and stresses the i/e functionality.  The client will keep detailed statistics on the files requested, transfer sizes, transmission time, response time, and server responses.
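As a rough illustration of the client described above, here is a minimal sketch in Python (chosen only for brevity; the proposal does not commit to a language for the tester).  Every name in it is hypothetical, including `RequestStats`, `run_load_test`, and the `send` callable, which stands in for whatever actually uploads data to a WordPress import endpoint.  The sketch also collapses transmission time and response time into a single elapsed figure.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class RequestStats:
    """Measurements recorded for one simulated import/export request."""
    file_name: str
    bytes_sent: int
    elapsed_seconds: float
    status: int


def throttled_chunks(payload, chunk_size, bytes_per_second):
    """Yield the payload in chunks, sleeping after each one to cap the average rate."""
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        yield chunk
        time.sleep(len(chunk) / bytes_per_second)


def run_request(file_name, payload, send, chunk_size, bytes_per_second):
    """Push one payload through `send` and record how it went.

    `send` is an assumed callable: it consumes an iterable of byte chunks
    and returns an integer status code (for example an HTTP status).
    """
    sent = 0

    def counted():
        # Count bytes as the sender actually consumes them.
        nonlocal sent
        for chunk in throttled_chunks(payload, chunk_size, bytes_per_second):
            sent += len(chunk)
            yield chunk

    started = time.perf_counter()
    status = send(counted())
    elapsed = time.perf_counter() - started
    return RequestStats(file_name, sent, elapsed, status)


def run_load_test(files, send, workers=4, chunk_size=1024, bytes_per_second=1_000_000.0):
    """Run one request per file concurrently; `files` maps file name to bytes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(run_request, name, data, send, chunk_size, bytes_per_second)
            for name, data in files.items()
        ]
        # Results come back in the same order as `files`.
        return [future.result() for future in futures]
```

Passing `send` in as a parameter keeps the throttling and bookkeeping testable without a live server; a real client would plug in a function that streams the chunks over HTTP.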

Upon completion of the initial load testing, a list will be created that details any weaknesses in the core i/e functionality.  This will be the starting point for improvement, though work will begin at the same time on adding new functionality to the export feature.  It will be possible to export a WordPress blog in new file formats, most notably CSV.  Furthermore, options will be added that give the user more control over which data to export, such as all entries without comments, post data without titles, a single month of data, or the entire blog.  All of these options will be tested extensively with the automated client to ensure that processing is efficient and data transfer is effective.  Unit test suites will be written continuously to ensure that all new functionality works as expected, and regression tests will keep the existing code base consistent with the new features.
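To make the export options concrete, the following is a small sketch of a CSV exporter with the kinds of switches listed above.  It is written in Python for illustration only (the real feature would live in WordPress's PHP code), and the post layout, the function name `export_posts_csv`, and the choice to join comments with " | " are all assumptions, not part of any existing WordPress API.

```python
import csv
import io
from datetime import date


def export_posts_csv(posts, include_titles=True, include_comments=True, month=None):
    """Serialize posts to CSV text.

    `posts` is a list of dicts with assumed keys 'title', 'date'
    (a datetime.date), 'content', and 'comments' (a list of strings).
    `month` is an optional (year, month) tuple restricting the export.
    """
    fields = ["date", "content"]
    if include_titles:
        fields.insert(0, "title")
    if include_comments:
        fields.append("comments")

    out = io.StringIO()
    # extrasaction="ignore" silently drops keys that were switched off.
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for post in posts:
        if month is not None and (post["date"].year, post["date"].month) != month:
            continue
        row = dict(post)
        row["date"] = post["date"].isoformat()
        # Assumption: comments are flattened into one cell for simplicity.
        row["comments"] = " | ".join(post.get("comments", []))
        writer.writerow(row)
    return out.getvalue()
```

For example, `export_posts_csv(posts, include_comments=False, month=(2008, 3))` would produce only the March 2008 posts with the columns `title,date,content`.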

Once the export functionality has been completed and tested, import functionality that mirrors it will be implemented.  Import will support every file type that export can produce, and will need additional robustness to handle malformed or incomplete export files, as well as valid import files that contain only a small portion of the information, such as a single post or a series of posts without titles or comments.  Once again, thorough testing with the load testers in a distributed environment will check that the new features work as planned under varying file sizes, bandwidth usage, and request rates.

New parsing and content generation tools will have to be added to support the different file types, and these algorithms will have to be optimized to stand up to stress testing.  The current i/e system will be used as a base, and efficient use of PHP’s regular expressions, together with a grammar for the CSV format, will aid in parsing the files.  Incomplete or malformed import files will have to be handled gracefully.  Furthermore, a minimum baseline for expected input will have to be defined, and the export functionality will have to produce files that meet it.
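One way to picture "graceful handling" and a "minimum baseline" is the sketch below.  It is in Python and leans on the standard `csv` module rather than the PHP regular expressions and CSV grammar the project itself would use, so it illustrates the behavior, not the planned implementation.  The required columns in `REQUIRED_FIELDS` and the function name `import_posts_csv` are assumptions made for the example.

```python
import csv
import io

# Assumed minimum baseline: every imported post needs these columns.
REQUIRED_FIELDS = ("date", "content")


def import_posts_csv(text):
    """Parse CSV text into posts, collecting errors instead of aborting.

    Returns a (posts, errors) tuple. Rows that fall below the baseline
    are skipped and reported with their line number.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_FIELDS if name not in header]
    if missing:
        return [], ["header is missing required column(s): " + ", ".join(missing)]

    posts, errors = [], []
    for row in reader:
        line = reader.line_num
        if None in row:
            # DictReader stores surplus cells under the key None.
            errors.append(f"line {line}: more cells than header columns")
            continue
        # Short rows get None for absent cells; treat blank as absent too.
        if any(not (row[name] or "").strip() for name in REQUIRED_FIELDS):
            errors.append(f"line {line}: missing required value")
            continue
        # Optional columns (such as a title) may be empty.
        posts.append({key: (value or "") for key, value in row.items()})
    return posts, errors
```

Returning the errors alongside the successfully parsed posts lets the importer keep whatever is usable from a damaged file and tell the user exactly which lines were dropped.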

Documentation, as mentioned previously, is as essential as the code itself. For each line of code there will be at least one comment, and detailed function explanations as well as overall file introductions will accompany all the source code.  PHPDoc or an analogous documentation generator will be used to create professional and maintainable documentation for future use.

All of these features are designed to be streamlined and efficient, in line with the philosophy of the WordPress core, which is where they are planned to reside.  Rigorous testing will guarantee that they do not hamper the overall performance of WordPress, and residing in the core will bring a further performance gain, since the features will have immediate access to every part of the application.


About Shannon Quinn

Oh hai!
This entry was posted in GSoC, Programming.
