Semantic Web and data model

The project is part of our move towards open data. It follows the approach defined by the W3C under the names “semantic web” and “linked data”.
Find out more about how the semantic web and linked data are used at the BnF.

The aim is to structure resources so that machines can reuse them more easily. The project uses data created in various formats, such as Intermarc for the catalogue of printed books, XML-EAD for archival inventories and Dublin Core for the digital library.

This data is automatically gathered, modelled and enriched, then published in the RDF semantic web language. The result is available on the website in several RDF syntaxes: RDF-XML, RDF-N3 and RDF-NT.
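
To give a feel for the simplest of these syntaxes, the sketch below writes a single statement as one N-Triples line. The subject URI and the label are made-up placeholders, not records taken from the website; the helper names are hypothetical, and only the RDFS label property is a real vocabulary term.

```python
# Minimal sketch: serialise one RDF statement as an N-Triples line.
# The subject URI and the label are hypothetical placeholders.

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text: str) -> str:
    """Escape the characters that must be escaped inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriple(subject: str, predicate: str, literal: str, lang: str = "") -> str:
    """Return '<subject> <predicate> "literal"@lang .' (language tag optional)."""
    tag = f"@{lang}" if lang else ""
    return f'<{subject}> <{predicate}> "{escape_literal(literal)}"{tag} .'


line = to_ntriple("http://example.org/resource/1", RDFS_LABEL, "Example author", "en")
print(line)
```

Each line of an RDF-NT file is one such self-contained statement, which is why this syntax is convenient for line-by-line processing of large dumps.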

Part of the data is matched with external value vocabularies for languages and nationalities and for subjects, and with DCMI Type for document types. It is also matched with data sets that are identified in CKAN: DBpedia and VIAF. The pages for RAMEAU subject headings are matched with other thesauri, either from libraries (LCSH, DnB, BNE) or more specialized (Agrovoc, Geonames, Thesaurus W).

Understanding the data model

What the Bibliothèque nationale de France provides

  • URIs for resources: every resource has a permanent identifier, assigned through the ARK system, which is the way to locate any resource of the library.
  • a display of data in RDF as “linked open data”: available for every page and for the whole database
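
As an illustration of how such identifiers can be handled, the sketch below splits an ARK into its naming authority number and its name, assuming the general shape `ark:/NAAN/Name`. The function and type names are hypothetical, and the example identifier is a placeholder rather than a real record; 12148 is used because it is the authority number that appears in the library's ARKs.

```python
from typing import NamedTuple


class Ark(NamedTuple):
    naan: str  # naming authority number (the institution)
    name: str  # identifier of the resource within that authority


def parse_ark(identifier: str) -> Ark:
    """Extract the NAAN and name from a bare ARK or from a URL that contains one."""
    prefix = "ark:/"
    start = identifier.find(prefix)
    if start == -1:
        raise ValueError(f"no ARK found in {identifier!r}")
    naan, sep, name = identifier[start + len(prefix):].partition("/")
    if not naan or not sep or not name:
        raise ValueError(f"incomplete ARK in {identifier!r}")
    return Ark(naan, name)


# Placeholder URL: the host and the name part are invented for the example.
example = parse_ark("http://example.org/ark:/12148/placeholder123")
print(example.naan, example.name)
```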

How to retrieve data

  • by using FTP, with the following connection settings:

    o host:
      port: 21

    o login: databnf
      password: databnf

  • by using HTTP and the following URL to retrieve dumps, according to the type of resource on the website
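
The connection details listed above (port 21 is the standard FTP control port) could be used from a script along the following lines. This is only a sketch built on Python's standard `ftplib`: the host name is not given here, so it is left as a parameter to be supplied by the caller, and the function name is hypothetical.

```python
from ftplib import FTP

# Values taken from the connection details above; the host is deliberately
# not filled in and must be passed by the caller.
FTP_PORT = 21
FTP_LOGIN = "databnf"
FTP_PASSWORD = "databnf"


def list_remote_files(
    host: str,
    port: int = FTP_PORT,
    user: str = FTP_LOGIN,
    password: str = FTP_PASSWORD,
) -> list[str]:
    """Connect, log in and return the file names in the server's start directory."""
    with FTP() as ftp:
        ftp.connect(host, port, timeout=30)
        ftp.login(user, password)
        return ftp.nlst()


# Usage (requires network access and the actual host name):
# print(list_remote_files("<host>"))
```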

    You can consult the licence that governs the use of our data.

    The software used: CubicWeb

    CubicWeb is an open source platform for developing semantic web applications. It is published under the LGPL licence.

    Within the project, this software is used to:

    • Extract and integrate data from heterogeneous sources and in various formats (CSV, MARC, Dublin Core, EAD-XML, RDF, …),
    • Merge, match and gather them in an SQL database,
    • Generate pages in any format, in this case: HTML, JSON, RDF-XML or PDF.
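
The three steps above can be sketched with nothing but Python's standard library. This is not CubicWeb code and does not use its API; it is a toy illustration of the extract, merge and generate idea, with invented records, an invented table layout and hypothetical function names.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources describing the same entities in different formats.
CSV_SOURCE = "id,name\n1,Example Author\n2,Another Author\n"
EXTRA_SOURCE = [{"id": "1", "language": "fre"}]


def load(conn: sqlite3.Connection) -> None:
    """Extract both sources and merge them into one SQL table, matching on id."""
    conn.execute(
        "CREATE TABLE author (id TEXT PRIMARY KEY, name TEXT, language TEXT)"
    )
    for row in csv.DictReader(io.StringIO(CSV_SOURCE)):
        conn.execute(
            "INSERT INTO author (id, name) VALUES (?, ?)", (row["id"], row["name"])
        )
    for extra in EXTRA_SOURCE:
        conn.execute(
            "UPDATE author SET language = ? WHERE id = ?",
            (extra["language"], extra["id"]),
        )


def render_json(conn: sqlite3.Connection) -> str:
    """Generate one output format (JSON) from the merged table."""
    rows = conn.execute(
        "SELECT id, name, language FROM author ORDER BY id"
    ).fetchall()
    return json.dumps(
        [{"id": i, "name": n, "language": lang} for i, n, lang in rows]
    )


conn = sqlite3.connect(":memory:")
load(conn)
page = render_json(conn)
print(page)
```

Keeping the merged data in one store and treating each output format as a separate rendering step is what allows the same records to be served as HTML, JSON, RDF-XML or PDF.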

    CubicWeb is based on the query language RQL (Relation Query Language), which is similar to the W3C's SPARQL, and on the Python language.

    In 2013, CubicWeb won the Dataconnexions award, organized by Etalab, a body attached to the French Prime Minister's office whose objective is to encourage public open data initiatives.