ORBIT/Development
This page brings together a number of resources describing the development of the ORBIT wiki as a work in progress. It is not comprehensive, but is intended as an illustrative guide to some of the issues we've faced, particularly with a view that our learning might prove useful to other OER and MediaWiki projects.
Google Docs
Resource Tracking
We used Google's ability to 'scrape' tables to extract information from our wiki and manipulate it in a spreadsheet...
Table Scraping
As the project progressed, the wiki became more complete and the 'status' levels of resources more complex: some resources needed longer to gain permissions, others were considered strong enough to go up on the wiki but would, time permitting, benefit from some editing, and others were considered finalised (in so far as that's ever true on a wiki!). At this later stage a decision was made to embed as much of the data from Google Docs into the wiki tables as possible. There were a few reasons for this, including:
- To maintain a clear - and public - record of provenance, the reasoning behind metadata assignment, and resource progression
- To make it clearer to anyone navigating the wiki - particularly editors - what stage resources were at, and what would be needed to 'finalise' resources
- To allow for an automated check between our Google Docs spreadsheet and the data on the wiki, with a view to automating updates of the spreadsheet. This was done using Google's 'scraper' function.
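The automated check in the last point amounts to comparing two sets of (resource, status) pairs and reporting where they disagree. The following is a minimal Python sketch of that idea, not the project's actual method: the function name `compare_status`, the resource numbers, and the status values are all made up for illustration.

```python
def compare_status(sheet_rows, wiki_rows):
    """Return {resource: (sheet_value, wiki_value)} for every disagreement.

    Each argument is an iterable of (resource_number, status) pairs.
    None marks a resource that is missing from one of the two sources.
    """
    sheet = dict(sheet_rows)
    wiki = dict(wiki_rows)
    differences = {}
    for key in sorted(set(sheet) | set(wiki)):
        if sheet.get(key) != wiki.get(key):
            differences[key] = (sheet.get(key), wiki.get(key))
    return differences


# Hypothetical data: resource numbers and 'final' flags are invented.
sheet_rows = [("R001", "yes"), ("R002", "no"), ("R003", "no")]
wiki_rows = [("R001", "yes"), ("R002", "yes"), ("R004", "no")]

differences = compare_status(sheet_rows, wiki_rows)
for resource, (in_sheet, on_wiki) in differences.items():
    print(resource, "spreadsheet:", in_sheet, "wiki:", on_wiki)
```

Reporting differences rather than overwriting either side leaves the decision about which source is correct to an editor.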
On the wiki we set up a number of queries of the following form, specifying the category and the information from that category to appear in the columns:
{{#ask: [[Category:ToolInfo]]| ?resourcenumber| ?final| format=table | limit=200 }}
Within Google Docs, a small formula can retrieve these tables, for example:
- =importhtml("http://orbit.educ.cam.ac.uk/wiki/User:Bjoern/resourceoverview","table",3)
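For readers who want to do the same retrieval outside a spreadsheet, the sketch below picks the n-th table out of a page's HTML using only the Python standard library. It is an illustration under stated assumptions, not how Google's function works internally: the class and function names are hypothetical, the sample HTML and its values are invented, and the parser assumes well-formed, non-nested tables with closing tags.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table, in document order
        self._row = None   # row currently being filled, if any
        self._cell = None  # text fragments of the cell currently open

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
            self.tables[-1].append(self._row)
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            self._row = None


def nth_table(html, index):
    """Return table number `index` (counting from 1) as rows of strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables[index - 1]


# Invented sample page with two tables; the second mimics a query result.
sample = """
<table><tr><th>Navigation</th></tr></table>
<table>
<tr><th>Resourcenumber</th><th>Final</th></tr>
<tr><td>R001</td><td>yes</td></tr>
</table>
"""
rows = nth_table(sample, 2)
print(rows)
```

To run this against a live page, the HTML string could be fetched with `urllib.request.urlopen(url).read().decode("utf-8")` and passed to `nth_table` with the same table number used in the spreadsheet formula.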
Report Writing
We also used Google Docs to compile, collaborate on, and export our final project report.
While the wiki provides facilities for editing and semantically marking up documents - and this could include a project report - it is not well suited to synchronous collaborative authoring. For this reason, after an initial draft was created in a desktop-based office suite, it was uploaded to Google Docs for further expansion, editing, and commenting.
PDF and Resource Pages
While PDFs live up to their name as a Portable Document Format, providing resources only as PDF files, whether by other providers or by us, was deemed problematic. We sought to provide export options - including PDF - alongside more flexible, easily remixable and editable formats.
Within the Wiki, a decision was required regarding whether resources should be provided as:
- .doc (or similar)
- html (not editable once uploaded, but more flexible formatting than wikitext)
- wikitext
or a combination?
In general, we sought to provide a wikitext version, and a .doc version for all activities, with the ability to export pages to PDF provided - for example - through the 'book creator' function. The 'book creator' was used to collate resources for our own coursebook, but could also be used by readers who wished to collect their own resources for a customised book.
However, in order to provide resources in these formats, some openly licensed resources needed to be converted from PDF (an issue Simon Knight discussed in a blog here). While many tools can convert basic PDFs, including the open source LibreOffice suite and Google Docs, larger and more complicated PDFs are harder to convert in a way that preserves formatting and reduces the time required for manual post-conversion editing. The Nitro PDF converter (free to use online) was found to be the most successful at the time of conversion (summer 2012), although the Zamzar conversion suite (free to use online) was also very successful. However, even those programs frequently: converted table frames and text boxes as images (making them harder to edit); converted headers, footers, and some images into 'backgrounds' on Word documents; failed to convert bullets and numbered lists/headings properly; and created paragraphs with line breaks between each line, as opposed to maintaining the continuous text flow. These are well-known problems with PDF, a format that was not intended for conversion to and from other formats. They are, however, problematic for Creative Commons projects - particularly those which seek to facilitate reuse and remixing.
The issue in this case is how we can release files in such a way that they can be disassembled and reassembled in various formats, mixes, and versions. PDF is not well equipped for this role. There is a related technical issue concerning the tracking of Creative Commons content (e.g., our resources) once it is "out in the wild" - when/if it is appropriated for use on other sites (again, Simon Knight discusses this in a blog here). PDFs - particularly if they have embedded images which link to an original on the author's website - can be used for this purpose, and make it particularly easy to track content in so far as PDFs cannot be disassembled, so:
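One of the clean-up steps mentioned above, rejoining paragraphs that a converter has broken at every line, can be partly automated. The Python sketch below is a minimal illustration and was not part of the project's workflow: the function name `reflow` is hypothetical, and it assumes that a blank line marks a real paragraph boundary in the converted text.

```python
import re


def reflow(text):
    """Join hard-wrapped lines into paragraphs.

    Assumes paragraphs are separated by one or more blank lines; every
    other line break is treated as a conversion artefact and replaced
    by a single space.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = []
    for paragraph in paragraphs:
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        joined.append(" ".join(lines))
    return "\n\n".join(joined)


# Invented sample of converted text with a break after every line.
converted = "The activity begins\nwith a short\ndiscussion.\n\nPupils then work\nin pairs."
print(reflow(converted))
```

This deliberately ignores bullets, numbered lists, and words hyphenated across lines, which would be merged incorrectly and still need manual editing.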
- They are less likely to be uploaded elsewhere, and more likely to remain as links to the original website and
- Authors only need to track one document, not multiple sections of a document, some of which may have been versioned for particular purposes (for example, translation into another language).
However, these elements of content use are things we should be seeking to encourage! It is thus important to consider, as an author, why you might want to track content, and how that can be done in a way that best serves the primary aim of the resources - in our case, to provide flexible open resources for interactive teaching.
Attribution, Reuse, Remixing
PDFs
Templates for Attribution
Transclusion
Dynamic display of sections - semantic media wiki and issues with displaying page sections
Semantic Media Wiki
Structure
Issues
Books
- Our books
- Book creator
- Divs
- Embedded templates
- Heading numbers
- Section transclusion problem