RAPID Archive for long-term rich-media archiving and editorial sharing

RAPID Archive is the long-term archive component of RAPID Browser, archiving news packages and making use of a Lucene indexer for improved performance and scalability of use.

 

RAPID Archive is built on MySQL and a Lucene indexer, which give it high performance and scalability.

The primary use of RAPID Archive is to enable a publisher or news organisation to maximise the return from its content and rights, and especially from the value that it has added.

The archive does this by enabling editors to share content over time, across different locations, and among different organisational units.
“Sharing” means that editors can reuse material in its existing form or develop new content products from it.

At Independent News and Media, for example:
  • editors go back to the archive to pull a story from a daily for updating or re-use in a weekly magazine;
  • journalists in New Zealand take Robert Fisk articles published in the London Independent for use in their location;
  • the syndication department provides paid-for feeds direct from the archive to organisations including agents, newspaper customers and online hosts like Lexis-Nexis.

To make this easy, users can search and browse content and see it either in context (e.g. how it looked when published on the web or in print, in different versions, with “related items”) or on its own. Once editors have found something they want, they can apply actions to it, for example downloading it in the format that will make their job easiest.

What content?

Variety. News organisations develop stories, news, features in different formats – newspapers, supplements, web sites, mobile services, yearbooks, and more.

Flexibility. Indeed, the excitement about “convergent” newsrooms reflects a growing understanding that news publishers need to serve users who have their own view of how and when they receive content and how they interact with it.

Consequently, to do its job well, the archive needs to manage evolving varieties of content and different kinds of metadata. Deciding which content and which metadata to manage is not a one-off, static decision.

Integrity. Moreover, editors themselves increasingly participate in the creation and delivery of different products, taking editorial decisions about breaking news for the web or features in a special print supplement.

To serve these editorial users well going forward, the archive cannot be just a “print archive” or a “web archive” or an “SMS archive”. Rather, it needs to aggregate and integrate print content, web content, mobile content, and future content as formats evolve – in a word, it’s an archive for editorial, whatever the format.

What representation?

RAPID Archive captures the value in the different formats, so its content schemas need to be flexible as well as efficient. For this reason, RAPID Archive uses XML as its “canonical” representation and allows for evolution in the formats it represents.

The archive is implemented on an SQL engine with Lucene-based text indexing (as is the RAPID Browser engine).
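
As a rough illustration of that split between a relational store and a text index, the sketch below keeps stories in a SQL table and finds them through a small in-memory inverted index. It is not RAPID Archive's implementation: SQLite stands in for the SQL engine, a plain dictionary stands in for Lucene, and the table, column and function names are all invented for the example.

```python
import re
import sqlite3
from collections import defaultdict

# Hypothetical schema: SQLite stands in for the archive's SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stories (id INTEGER PRIMARY KEY, headline TEXT, body TEXT)"
)

# Toy inverted index (term -> set of story ids); a stand-in for Lucene.
index = defaultdict(set)


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def add_story(headline, body):
    """Store the story in SQL, then index every term it contains."""
    cur = conn.execute(
        "INSERT INTO stories (headline, body) VALUES (?, ?)", (headline, body)
    )
    story_id = cur.lastrowid
    for term in tokenize(headline + " " + body):
        index[term].add(story_id)
    return story_id


def search(query):
    """Return (id, headline) rows for stories containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # The index answers "which stories?"; SQL then supplies the records.
    ids = set.intersection(*(index.get(t, set()) for t in terms))
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT id, headline FROM stories WHERE id IN ({placeholders}) ORDER BY id",
        sorted(ids),
    ).fetchall()


first = add_story("Budget vote delayed", "Parliament postponed the budget vote.")
second = add_story("Harbour festival opens", "Crowds gathered at the harbour.")
print(search("budget vote"))
```

The point of the arrangement is that the index only has to answer which stories match, while the database remains the source of the full records and their metadata.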

Basing it on an XML foundation enabled us to build on the work done by standards organisations like W3C, IPTC and Ifra. In the first implementations, NITF, NewsML as well as XMP and RDF play a large part. This gives us a language with which to describe news in the way that the industry requires.
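
To give a feel for what working with such a representation involves, here is a minimal sketch that reads a few fields from an NITF-style document using Python's standard library. The sample fragment is hand-written for illustration and uses only a handful of NITF elements; the function name and the choice of fields are assumptions, not part of RAPID Archive.

```python
import xml.etree.ElementTree as ET

# A hand-written, NITF-style fragment (illustrative, not taken from the archive).
SAMPLE = """\
<nitf>
  <head>
    <title>Budget vote delayed</title>
  </head>
  <body>
    <body.head>
      <hedline><hl1>Budget vote delayed</hl1></hedline>
      <byline>By A. Reporter</byline>
    </body.head>
    <body.content>
      <p>Parliament postponed the vote.</p>
      <p>A new date has not been set.</p>
    </body.content>
  </body>
</nitf>
"""


def extract_story(xml_text):
    """Pull a few common fields out of an NITF-style document."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("head/title", default=""),
        "headline": root.findtext("body/body.head/hedline/hl1", default=""),
        "byline": root.findtext("body/body.head/byline", default=""),
        "paragraphs": [
            p.text or "" for p in root.findall("body/body.content/p")
        ],
    }


story = extract_story(SAMPLE)
print(story["headline"], len(story["paragraphs"]))
```

Because the fields are addressed by element path rather than by position, new elements or customer-specific metadata can be added to the XML without breaking readers that do not know about them.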

It also ensures we can communicate with the movers in the industry about new developments, and move in time to new standards rather than idiosyncratic ways of handling the semantic web, Web 2.0, rights commons, etc. It provides future-proofing for us and our clients.

In this way, we believe that RAPID Archive can accommodate different needs (and different, customer-specific metadata) and harmonise them in one framework: an editorial archive with integrity.

RAPID Archive is designed to serve the needs of librarians as well as editors and the public.

Librarians'/archivists' perspective

  • Simple, fast access to classification information and other metadata, both to see it and to validate and/or edit it
  • Rich Boolean and other search operators
  • "Empowerment" - the ability to maintain or modify classification schemes, indexes, frequently used queries, and to do so without having to rely on additional software development
  • Ability to easily edit keyword lists and the various classification thesauri.
  • Ability to link a news item to a personality, institution and other profiles.
  • Librarians' workflow for enhancing content.
  • Availability of tools for acquiring page and web content into the archive with the minimum of effort.

Newspaper editors' perspective

  • Editorial users are usually in a great hurry, so stories must come to the screen very quickly
  • All advanced search routines need hiding behind simple menus
  • Ability to narrow searches down using simple menus a must, but don't give them too many options
  • Sometimes it's very important for editors to know they are seeing everything on a subject, so this must always be an option - breadth rather than precision
  • Never more than two or maximum three clicks away from real data
  • Latest results always most relevant
  • Availability of stemming-based search for Arabic, English and other languages.
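
Two of the points above, stemming search and showing the latest results first, can be sketched together. The code below is only a toy: the suffix stripper is a crude stand-in for a real stemmer (it handles a few English endings and nothing for Arabic), and the sample records, names and dates are invented for the example.

```python
from datetime import date

# Hypothetical sample records: (publication date, headline).
STORIES = [
    (date(2006, 3, 1), "Minister resigns over budget"),
    (date(2006, 5, 9), "Resignation rocks cabinet"),
    (date(2006, 4, 2), "Voters back budgets reform"),
]

# Longer suffixes come first so that "ation" is tried before "s".
SUFFIXES = ("ation", "ing", "es", "ed", "s")


def crude_stem(word):
    """Strip one common English suffix; a toy stand-in for a real stemmer."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(query):
    """Return headlines matching every stemmed query term, newest first."""
    wanted = {crude_stem(w) for w in query.split()}
    if not wanted:
        return []
    hits = [
        (published, headline)
        for published, headline in STORIES
        if wanted <= {crude_stem(w) for w in headline.split()}
    ]
    # Editors want the latest material at the top of the list.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [headline for _, headline in hits]


print(search("resigning"))
print(search("budget"))
```

Stemming both the query and the indexed text is what lets a search for "resigning" find "resigns" and "Resignation"; a production system would rely on a proper language-specific analyzer rather than a suffix list.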

For in-depth knowledge of our approach to rich-media archiving, please download the RAPID Archive White Paper (PDF, 2.1 MB), or write to marketing@knowledgeview.co.uk to schedule an online trial of RAPID Archive.

KnowledgeView

KnowledgeView Ltd develops easy-to-use newsroom software for media companies and enterprises that need to acquire, share or publish news to multiple platforms. Over 5000 journalists and information professionals, and 40 media companies, use KnowledgeView’s software worldwide.