Over the past few months CETIS and Jorum have been discussing approaches to bulk deposit to support the projects in the UKOER programme as they deposit or represent their OERs in Jorum. Based on feedback from projects gathered through our technical reviews of projects, we’ve investigated approaches which might work for the programme.
One option we have investigated is the use of RSS. Gareth Waller from Jorum produced a set of feed requirements and a discussion paper suggesting possible issues with the use of RSS. A number of projects have trialled their feeds and provided feedback on Lorna’s blog post introducing Gareth’s paper and outlining the issues. The Xpert project has also produced a briefing paper looking at issues around RSS-based deposit. (Considerations and evaluations of the development of distributed repositories when using RSS aggregation as a submission protocol. By Pat Lockley, The University of Nottingham http://webapps.nottingham.ac.uk/elgg/xpert/files/-1/803/xpert+metadata+final.pdf )
Many thanks to Gareth and Laura from Jorum and everyone else who’s contributed to the discussion thus far. This post summarises that discussion and comments on the other options that have been suggested.
Please note this conversation is shaped by the constraints of the programme. The discussion below focuses on the relation of a single OER project producing a feed or feeds of resources to contribute to Jorum. Issues of how Jorum addresses and combines data feeds from different projects and provides standardised data are a separate discussion.
Although submission to a repository isn’t the primary purpose of RSS, it does have functionality and features that may make it suitable for such a purpose. The investigation of RSS as an option for submitting content to Jorum began with the observations:
- Many of the diverse platforms used across the programme are capable of producing RSS
- RSS is the preferred format for most current OER discovery services (aggregators). For example,
- OCWC have produced recommendations for an application profile of RSS feeds for OERs http://wiki.ocwconsortium.org/index.php?title=RSS_feeds
- iTunes U uses RSS, and a number of institutions are making material available through iTunes U
Jorum produced an outline of their minimum requirements for feed-based ingest and a briefing paper summarizing their current take on issues around RSS for deposit.
Feed format and content
Jorum’s current requirements are:
- RSS version 2.0 feed
- At least one element belonging to one of the following namespace directly under the channel element. Metadata for all items must be represented in elements belonging to this namespace.
- Licence information on each item (in the relevant metadata element). This must contain a v2 Eng & Wales CC licence url e.g.
- DC : rights e.g. Licensed under a Creative Commons Attribution – NonCommercial-ShareAlike 2.0 Licence – see http://creativecommons.org/licenses/by-nc-sa/2.0/uk/
- IMSMD : rights/description/langstring
- LOM: rights/description/string”
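As an illustration only, a project could run a rough pre-check of its own feed against the requirements listed above before offering it for deposit. The sketch below is not Jorum's validator. It assumes the metadata is expressed in Dublin Core (the `http://purl.org/dc/elements/1.1/` namespace), and the licence pattern is an assumption modelled on the example URL above. The function name `check_feed` and the sample feed are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
# Assumed pattern, modelled on the example licence URL in the requirements.
CC_UK_V2 = re.compile(r"https?://creativecommons\.org/licenses/[a-z-]+/2\.0/uk/?")


def check_feed(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    if root.tag != "rss" or root.get("version") != "2.0":
        problems.append("not an RSS 2.0 feed")
    channel = root.find("channel")
    if channel is None:
        return problems + ["no channel element"]

    # At least one DC element must sit directly under the channel element.
    if not any(child.tag.startswith("{" + DC_NS + "}") for child in channel):
        problems.append("no DC element directly under channel")

    # Every item needs licence information in dc:rights.
    for n, item in enumerate(channel.findall("item"), start=1):
        rights = item.findtext("{" + DC_NS + "}rights", default="")
        if not CC_UK_V2.search(rights):
            problems.append(f"item {n}: no v2 England & Wales CC licence URL in dc:rights")
    return problems


# Hypothetical feed used only to exercise the checks.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example OER feed</title>
    <dc:publisher>Example project</dc:publisher>
    <item>
      <title>Example resource</title>
      <link>http://example.org/oer/1</link>
      <dc:rights>Licensed under http://creativecommons.org/licenses/by-nc-sa/2.0/uk/</dc:rights>
    </item>
  </channel>
</rss>"""

print(check_feed(SAMPLE))  # []
```

A feed using IMSMD or LOM rights elements would need the equivalent namespace and element path substituted for the Dublin Core ones assumed here.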
Jorum currently processes the feed as follows:
- “The feed is *not* continually polled for new content. […] The current functionality simply reads the feed when it is deposited and all the items are created in DSpace. It’s a snapshot in time of that RSS feed. If you add in the same feed again, it will store duplicates.
- The physical data of a resource in the feed is not stored in JorumOpen. A link is simply created pointing to the resource as indicated by the RSS feed (the “link” element).”
- “The feed MUST be valid XML – if the XML coming back isn’t valid in the first place then we cannot process it (neither can any validator, XML reader etc). ”
- “Items within a feed are not auto classified within Jorum. In other words, every item in a feed is stored within a single collection as chosen by the admin user i.e. a top level JACS or LearnDirect classification. Having individual feed for each classification such as the OpenLearn model would ensure that items are classified correctly as these feeds can be deposited separately.”
Possible issues about the use of feeds
In his paper Gareth raises a number of issues and questions including the following:
- RSS items need to contain the unique id of the OER
- It’s not yet clear how to tell from the feed if an OER has changed or been deleted
- Feeds should not contain the whole repository contents
- OERs might fall outside arbitrary limits on feed size (for example, a feed of the 50 most recent items that is polled every day will miss resources if more than 50 are added between polls)
- The rich metadata held in the platform producing the RSS may be reduced to a subset of the available fields by the feed creation process or the feed consumption process.
- Feed deposit needs to make assumptions about licensing
- Current exploration of feed deposit relates only to harvesting metadata, not to harvesting resources.
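To make the first two issues concrete, here is a minimal sketch of how a consumer might compare two snapshots of the same feed, assuming every item carries a stable RSS `guid`. This is one possible approach, not something Jorum has said it does. The function names `snapshot` and `compare` are hypothetical, and the feed is assumed to be well-formed RSS with a `channel` element.

```python
import hashlib
import xml.etree.ElementTree as ET


def snapshot(xml_text):
    """Map each item's guid to a hash of its serialised content."""
    channel = ET.fromstring(xml_text).find("channel")
    items = {}
    for item in channel.findall("item"):
        guid = item.findtext("guid")
        if guid is None:
            continue  # without a unique id the item cannot be tracked
        # Any difference in the serialised item, including whitespace,
        # produces a different hash.
        items[guid.strip()] = hashlib.sha256(ET.tostring(item)).hexdigest()
    return items


def compare(old, new):
    """Classify guids as new, changed, or missing between two snapshots."""
    return {
        "new": sorted(set(new) - set(old)),
        "changed": sorted(g for g in new if g in old and new[g] != old[g]),
        "missing": sorted(set(old) - set(new)),
    }
```

Note that "missing" is not the same as deleted: if a feed only lists the most recent items, an OER that drops off the end of the feed looks identical to one that has been removed, which is exactly the ambiguity raised in the list above.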
Part 2 of this post will look at the community responses to this proposal and look at emerging issues.