Latest revision as of 12:57, 11 November 2012

DRAFT - This page is not finished! It is here because we like to collaborate on content transparently, and give everybody a chance to comment as content is being developed! So feel free to browse and comment, but bear in mind that this content is evolving!

This page brings together a number of resources describing the development of the ORBIT wiki as a work in progress. It is not comprehensive, but is intended as an illustrative guide to some of the issues we've faced, particularly with a view that our learning might prove useful to other OER and MediaWiki projects.

Google Docs

Resource Tracking

We used Google's ability to 'scrape' tables to extract information from our wiki and manipulate it in a spreadsheet.

With very little on the wiki initially, our main objective was to track incoming resources without 'swamping' the wiki with incomplete (or wireframed) pages. At this stage we therefore used Google Spreadsheets to keep track of resources, their status, provenance, and the other information shown in the Resource Info table on every resource's page.

Table Scraping

As the project progressed, the wiki became more complete and the 'status' levels of resources more complex: some resources needed longer to gain permissions, others were considered strong enough to go up on the wiki but would, time permitting, benefit from some editing, and others were considered finalised (in so far as that's ever true on a wiki!). At this later stage we decided to embed as much of the data from Google Docs as possible in the wiki tables, for several reasons:

To maintain a clear, public record of provenance, of the reasoning behind metadata assignment, and of resource progression
To make it clearer to anyone navigating the wiki, particularly editors, what stage resources were at and what would be needed to 'finalise' them
To allow an automated check between our Google Docs spreadsheet and the data on the wiki, with a view to automating updates of the spreadsheet. This was done using Google's 'scraper' function.
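The automated check in the last point amounts to comparing two tables keyed by resource number. The following is a minimal Python sketch of that idea, not the project's actual tooling: the function name `compare_status` and the two-column row layout (resource number, final status, mirroring the `resourcenumber` and `final` properties used in the wiki queries) are assumptions made for illustration.

```python
def compare_status(sheet_rows, wiki_rows):
    """Report resources whose 'final' value differs between the two sources.

    Each argument is an iterable of (resourcenumber, final) pairs; this
    layout is an assumption for the sketch. The result maps a resource
    number to (spreadsheet value, wiki value), with None where the
    resource is missing on one side.
    """
    sheet = dict(sheet_rows)
    wiki = dict(wiki_rows)
    report = {}
    for number in sorted(set(sheet) | set(wiki)):
        sheet_value = sheet.get(number)
        wiki_value = wiki.get(number)
        if sheet_value != wiki_value:
            report[number] = (sheet_value, wiki_value)
    return report


if __name__ == "__main__":
    # Hypothetical data, purely for illustration.
    spreadsheet = [("1", "yes"), ("2", "no"), ("3", "yes")]
    wiki = [("1", "yes"), ("2", "yes"), ("4", "no")]
    for number, (s, w) in compare_status(spreadsheet, wiki).items():
        print(f"resource {number}: spreadsheet={s!r} wiki={w!r}")
```

Reporting differences rather than overwriting either side keeps a human in the loop, which seems appropriate while statuses are still being agreed.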

On the wiki we set up a number of queries of the following form, specifying the category and the information from that category to appear in the columns:

{{#ask: [[Category:ToolInfo]]| ?resourcenumber| ?final| format=table | limit=200 }}

Within Google Spreadsheets, a short formula can retrieve these tables, for example:

=importhtml("http://orbit.educ.cam.ac.uk/wiki/User:Bjoern/resourceoverview","table",3)

This formula imports the third table from the page http://orbit.educ.cam.ac.uk/wiki/User:Bjoern/resourceoverview into the sheet of the Google spreadsheet in which it is entered. From this, we could cross-reference results and manipulate our data more freely.
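For readers who want the same kind of table extraction outside a spreadsheet, the idea can be sketched in Python using only the standard library. This is an illustrative sketch, not a reimplementation of Google's function: the names `TableExtractor` and `nth_table` are hypothetical, every `<table>` start tag is counted in document order (including nested ones), and rows are assumed to have explicit closing tags.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the rows of the n-th <table> (1-based) in an HTML document."""

    def __init__(self, index):
        super().__init__()
        self.index = index  # 1-based position of the wanted table
        self.seen = 0       # number of <table> start tags seen so far
        self.depth = 0      # nesting depth inside the wanted table
        self.rows = []      # completed rows of the wanted table
        self.row = None     # row currently being built
        self.cell = None    # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.seen += 1
            if self.depth:                 # a table nested in the wanted one
                self.depth += 1
            elif self.seen == self.index:  # the wanted table starts here
                self.depth = 1
        elif self.depth == 1:
            if tag == "tr":
                self.row = []
            elif tag in ("td", "th") and self.row is not None:
                self.cell = []

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "table":
            self.depth -= 1
        elif self.depth == 1:
            if tag in ("td", "th") and self.cell is not None:
                self.row.append("".join(self.cell).strip())
                self.cell = None
            elif tag == "tr" and self.row is not None:
                self.rows.append(self.row)
                self.row = None

    def handle_data(self, data):
        # Only keep text that sits directly in a cell of the wanted table.
        if self.depth == 1 and self.cell is not None:
            self.cell.append(data)


def nth_table(html, index):
    """Return the n-th table of an HTML string as a list of rows of cell text."""
    parser = TableExtractor(index)
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    # Hypothetical page content, standing in for a fetched wiki page.
    page = (
        "<table><tr><td>a</td></tr></table>"
        "<table><tr><th>Resource</th><th>Final</th></tr>"
        "<tr><td>12</td><td>yes</td></tr></table>"
    )
    print(nth_table(page, 2))
```

To use it against a live page, the HTML would first need to be fetched (for example with `urllib.request.urlopen`) and decoded to a string. That step is left out here so the sketch runs without network access.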

Report Writing

We also used Google Docs to compile, collaborate on, and export our final project report.

While the wiki provides facilities for editing and semantically marking up documents, which could include a project report, it is not well suited to synchronous collaborative authoring. For this reason, after an initial draft was created in a desktop-based office suite, it was uploaded to Google Docs for further expansion, editing, and commenting.

PDF and Resource Pages

While PDFs live up to their name as a Portable Document Format, we considered it problematic that other providers, and we ourselves, offered only PDF files. We sought to provide export options, including PDF, alongside more flexible, easily remixable and editable formats.

Within the Wiki, a decision was required regarding whether resources should be provided as:

PDF
.doc (or similar)
HTML (not editable once uploaded, but more flexible formatting than wikitext)
wikitext

or a combination?

In general, we sought to provide a wikitext version and a .doc version of all activities, with the ability to export pages to PDF provided, for example, through the 'book creator' function. The 'book creator' was used to collate resources for our own coursebook, but could also be used by readers who wished to collect their own resources into a customised book.

However, in order to provide resources in these formats, some openly licensed resources needed to be converted from PDF (an issue Simon Knight discussed in a blog post at http://www.nominettrust.org.uk/knowledge-centre/blogs/creative-commons-open-government-licensing-and-pdfs). While many tools can convert basic PDFs, including the open source LibreOffice suite and Google Docs, larger and more complicated PDFs are harder to convert in a way that preserves formatting and keeps manual post-conversion editing to a minimum. At the time of conversion (summer 2012), the Nitro PDF converter (http://www.pdftoword.com/, free to use online) was found to be the most successful, although the Zamzar conversion suite (http://www.zamzar.com, free to use online) was also very successful. However, even those programs frequently: converted table frames and text boxes to images (making them harder to edit); converted headers, footers, and some images into 'backgrounds' in Word documents; failed to convert bullets and numbered lists or headings properly; and created paragraphs with a line break after each line, rather than maintaining continuous text flow. These are well-known problems, and PDF was never intended as a format to convert to and from. They are nevertheless problematic for Creative Commons projects, particularly those which seek to facilitate reuse and remixing.
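The last of these problems, a line break after every line, is one that can often be repaired semi-automatically once the text has been extracted. The following Python sketch is one possible approach, not something the project itself used: the function name `unwrap_paragraphs` is hypothetical, and it assumes that genuine paragraph boundaries survive conversion as blank lines, which will not hold for every converted document.

```python
import re


def unwrap_paragraphs(text):
    """Join hard-wrapped lines back into continuous paragraphs.

    Assumption: paragraphs are separated by at least one blank line, and
    every other line break is an artefact of the conversion.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = []
    for paragraph in paragraphs:
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        joined.append(" ".join(lines))
    return "\n\n".join(joined)


if __name__ == "__main__":
    # Hypothetical converter output, purely for illustration.
    converted = "Open resources\nshould be easy\nto remix.\n\nSecond\nparagraph."
    print(unwrap_paragraphs(converted))
```

Lists, headings, and tables would still need manual attention, since joining their lines would destroy structure rather than restore it.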

The issue in this case is how we can release files in such a way that they can be disassembled and reassembled in various formats, mixes, and versions. PDF is not well equipped for this role. A related technical issue is the tracking of Creative Commons content (e.g., our resources) once it is "out in the wild", when or if it is appropriated for use on other sites (again, Simon Knight discusses this in a blog post at http://www.nominettrust.org.uk/knowledge-centre/blogs/measuring-impact-tracking-open-content-wild). PDFs, particularly if they have embedded images which link to an original on the author's website, can be used for this purpose. They make content particularly easy to track because they cannot readily be disassembled, so:

They are less likely to be uploaded elsewhere, and more likely to remain as links to the original website.
Authors only need to track one document, not multiple sections of a document, some of which may have been versioned for particular purposes (for example, translation into another language).

However, re-uploading, excerpting, and versioning are exactly the kinds of use we should be seeking to encourage! It is thus important to consider, as an author, why you might want to track content, and how that can be done in a way that best serves the primary aim of the resources: in our case, to provide flexible open resources for interactive teaching.

Attribution, Reuse, Remixing

PDFs

Templates for Attribution

Transclusion

Dynamic display of sections - semantic media wiki and issues with displaying page sections

Semantic Media Wiki

Structure

Issues

Books

A key output of the project was a coursebook, containing materials on pedagogy, professional development, classroom teaching resources, and teaching tools in sets of chapters. The book was successfully created, although limitations of the book creator affected some of our decisions.

Our books
Book creator
Divs
Embedded templates
Heading numbers
Section transclusion problem

@@ Line 2: / Line 2: @@
 This page brings together a number of resources describing the development of the ORBIT wiki as a work in progress.  It is ''not'' comprehensive, but is intended as an illustrative guide to some of the issues we've faced, particularly with a view that our learning might prove useful to other OER and MediaWiki projects.
-=[[Google Docs]]=
+=Google Docs=
+==Resource Tracking==
+<div class="toccolours mw-collapsible mw-collapsed">
+We used google's ability to 'scrape' tables to extract information from our Wiki, and manipulate it in a spreadsheet...
+<div class="mw-collapsible-content">With very little on the Wiki initially, our main objective was to track incoming resources without 'swamping' the wiki with incomplete (or wireframed) pages.  As a result, at this stage we used Google Spreadsheets to keep track of resources, their status, provenance, and other information which appears in the Resource Info table which appears on every resource's page.
+==Table Scraping==
+As the project progressed, the wiki became more complete, and the 'status' levels of resources more complex - with some resources requiring longer to gain permissions, others considered strong enough to go up on the wiki but - if time - would benefit from some editing, and others considered finalised (in so far as that's ever true on a Wiki!). At this later stage a decision was made to try and embed as much of the data from google docs into the wiki tables as possible.  This was for a few reasons including
+# To maintain a clear - and public - record of provenance, reasoning behind meta-data assigning, and resource progression
+# To make it clearer to anyone navigating the wiki - particularly editors - what stage resources were at, and what would be needed to 'finalise' resources
+# To allow for an automated check between our google docs spreadsheet, and data on the wiki, with a view to automating updates of the google spreadsheet. This was done using google's 'scraper' function.
+On the Wiki we setup a number of queries of the following form, specifying the category, and information from that category to appear in the columns
+<nowiki>:{{#ask: [[Category:ToolInfo]]| ?resourcenumber| ?final| format=table | limit=200 }}</nowiki>
+Within google, a small formula can retrieve these tables, for example
+:=importhtml("http://orbit.educ.cam.ac.uk/wiki/User:Bjoern/resourceoverview","table",3)
+imports the 3rd table from the page http://orbit.educ.cam.ac.uk/wiki/User:Bjoern/resourceoverview to the sheet it is inserted on in google spreadsheet.  From this, we could cross-reference results and manipulate our data more freely.</div>
+</div>
 ==Report Writing==
+<div class="toccolours mw-collapsible mw-collapsed">
+We also used Google Docs to compile, collaborate on, and export our project final report
+<div class="mw-collapsible-content">
+While the Wiki provides facilities for editing, and semantically marking up documents - and this could include a project report - it is not well suited to synchronous collaborative authoring.  For this reason, after an initial draft was created in a desktop based office suite, it was uploaded to google docs for further expansion, editing, and commenting.
+</div>
+</div>
-=[[PDF and Resource Pages]]=
+=PDF and Resource Pages=
-==Options for resource upload==
+<div class="toccolours mw-collapsible mw-collapsed">
-==PDF issues==
+While PDFs live up to their name as a Portable Document Format, the provision of ''only'' PDF files by other providers, and us, was deemed problematic. We sought to provide export options - including in PDF - alongside more flexible, easily remixable and editable formats.
+<div class="mw-collapsible-content">
+Within the Wiki, a decision was required regarding whether resources should be provided as:
+*PDF
+*.doc (or similar)
+*html (not editable once uploaded, but more flexible formatting than wikitext)
+*wikitext
+or a combination?
+In general, we sought to provide a wikitext version, and a .doc version for all activities, with the ability to export pages to PDF provided - for example - through the 'book creator' function.  The 'book creator' was used to collate resources for our own coursebook, but could also be used by readers who wished to collect their own resources for a customised book.
+However, in order to provide resources in these formats, some - openly licensed - resources needed to be converted from PDF (an issue [[User:SimonKnight|Simon Knight]] discussed in a blog [http://www.nominettrust.org.uk/knowledge-centre/blogs/creative-commons-open-government-licensing-and-pdfs here]). While many tools can convert basic PDFs, including the Open Source Libre Office suite, and Google Docs, larger and more complicated PDFs are more challenging to convert in a way that preserves formatting, and reduces the time required for manual post-conversion-editing.  The [http://www.pdftoword.com/ Nitro PDF converter] (free to use online) was at the time of conversion (summer 2012) found to be the most successful, although the [http://www.zamzar.com Zamzar] conversion suite (free to use online)  was also very successful.  However, even those programs frequently: converted table frames and text boxes as images (making them harder to edit); converted headers, footers, and some images into 'backgrounds' on word documents; failed to convert bullets and numbered lists/headings properly; and created paragraphs with line breaks between each line, as opposed to maintaining the continuous text flow.  These are well known problems with PDF, and PDFs were not intended for conversion to and from the format, they are however problematic for creative commons projects - particularly those which seek to facilitate reuse, and remixing.
+The issue in this case is how we can release files in such a way that they can be disassembled, and reassembled in various formats, mixes, and versions.  PDF is not well equipped for this role.  There is a related technical issue here related to the tracking of Creative Commons content (e.g., our resources) once they are "out in the wild" - when/if they are appropriated for use on other sites (again, [[User:SimonKnight|Simon Knight]] discusses this in a blog [http://www.nominettrust.org.uk/knowledge-centre/blogs/measuring-impact-tracking-open-content-wild here]). PDFs - particularly if they have embedded images which link to an original on the authors website - can be used for this purpose, and make it particularly easy to track content in so far as PDFs cannot be disassembled so
+# They are less likely to be uploaded elsewhere, and more likely to remain as links to the original website and
+# Authors only need to track one document, not multiple sections of a document, some of which may have been versioned for particular purposes (for example, translation into another language).
+However, these elements of content use are things we should be seeking to encourage! It is thus important to consider as an author why you might want to track, and how that can be done to best maximise the primary aim of the resources - in our case, to provide flexible open resources for interactive teaching.
+</div>
+</div>
 =Attribution, Reuse, Remixing=
@@ Line 22: / Line 62: @@
 =Books=
-==Issues with templates and book creator==
+<div class="toccolours mw-collapsible mw-collapsed">A key output of the project was a coursebook, containing materials on pedagogy, professional development, classroom teaching resources, and teaching tools in sets of chapters. The book was successfully created, although there were limitations to the creator which affected some of our decisions.
+<div class="mw-collapsible-content">
+*Our books
+*Book creator
+*Divs
+*Embedded templates
+*Heading numbers
+*Section transclusion problem
+</div>
+</div>


ORBIT/Development: Difference between revisions
