Retrieving data.bnf.fr data

The data.bnf.fr project is part of our move towards open data. It follows the approach defined by the W3C for the “semantic web”, also known as “linked data”.
Find out more about how semantic web and linked data are used in the BnF.

The aim is to structure resources so that machines can reuse them more easily. The data.bnf.fr project uses data created in various formats, such as Intermarc for the catalogue of printed books, XML-EAD for archival inventories and Dublin Core for the digital library.

This data is automatically gathered, modelled and enriched, then published in RDF, the semantic web language. The result is available on the website in several RDF syntaxes: RDF-XML, RDF-N3 and RDF-NT.
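
Of the three syntaxes, RDF-NT (N-Triples) is line-based: each line holds one subject, predicate and object, terminated by a full stop. The sketch below shows how such lines could be read and summarised with the Python standard library only. It is a deliberately simplified reader, not a full N-Triples parser, and the function names and the example.org sample triples are assumptions made for illustration. For real work an RDF library such as rdflib is likely a better choice.

```python
import re
from collections import Counter

# Simplified pattern: subject is an IRI or blank node, predicate is an IRI,
# object is everything up to the final " ." (IRI, blank node or literal).
_TRIPLE = re.compile(r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.$')


def parse_ntriples(lines):
    """Yield (subject, predicate, object) strings from N-Triples lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _TRIPLE.match(line)
        if match is None:
            continue  # skip lines this simplified pattern does not understand
        yield match.group(1), match.group(2), match.group(3)


def count_predicates(lines):
    """Count how often each predicate occurs, e.g. to get a feel for a dump."""
    return Counter(predicate for _, predicate, _ in parse_ntriples(lines))


# Hypothetical sample data, not taken from data.bnf.fr.
SAMPLE = [
    '<http://example.org/a> <http://xmlns.com/foaf/0.1/name> "Alice"@en .',
    '# a comment line',
    '',
    '<http://example.org/a> <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/b> .',
]
```

Because `parse_ntriples` accepts any iterable of lines, an open file object can be passed directly, so a large file is processed line by line rather than loaded into memory.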

Part of the data is matched with external value vocabularies: id.loc.gov for languages and nationalities, dewey.info for subjects, DCMI Type for document types. It is also matched with data sets that are identified by CKAN: DBpedia and VIAF. The pages for RAMEAU subject headings are matched with other thesauri, either from libraries (LCSH, DnB, BNE) or more specialized (Agrovoc, Geonames, Thesaurus W).

Understanding data.bnf.fr data model

What the Bibliothèque nationale de France provides

  • URIs for resources: all resources have permanent identifiers, assigned through the ARK scheme, which is the way to find all resources of the library.
  • a display of data in RDF as “linked open data”: available for every page and for the whole database

How to retrieve data.bnf.fr data

  • by using FTP:
    o host: echanges.bnf.fr
      port: 21
    o login: databnf
      password: databnf

  • by using HTTP and the following URLs to retrieve dumps, according to the type of resource:
  • complete rdf dump (rdf/xml)
  • RAMEAU subjects rdf dumps (xml)
  • RAMEAU subjects rdf dumps (n3)
  • RAMEAU subjects rdf dumps (nt)
  • documents-about rdf dump (rdf/xml)
  • documents-about rdf dump (n3)
  • documents-about rdf dump (nt)
  • authors rdf dump (rdf/xml)
  • authors rdf dump (n3)
  • authors rdf dump (nt)
  • organizations rdf dump (rdf/xml)
  • organizations rdf dump (n3)
  • organizations rdf dump (nt)
  • contributions rdf dump (rdf/xml)
  • contributions rdf dump (n3)
  • contributions rdf dump (nt)
  • works rdf dump (xml)
  • works rdf dump (n3)
  • works rdf dump (nt)
  • editions rdf dump (rdf/xml)
  • editions rdf dump (n3)
  • editions rdf dump (nt)
  • studies rdf dump (rdf/xml)
  • studies rdf dump (n3)
  • studies rdf dump (nt)
  • places rdf dump (rdf/xml)
  • places rdf dump (n3)
  • places rdf dump (nt)
  • dates rdf dump (rdf/xml)
  • dates rdf dump (n3)
  • dates rdf dump (nt)
  • performances rdf dump (rdf/xml)
  • performances rdf dump (n3)
  • performances rdf dump (nt)
  • serials rdf dump (rdf/xml)
  • serials rdf dump (n3)
  • serials rdf dump (nt)
  • external links rdf dump (rdf/xml)
  • external links rdf dump (n3)
  • external links rdf dump (nt)
  • geographical codes rdf dump (rdf/xml)
  • geographical codes rdf dump (n3)
  • geographical codes rdf dump (nt)
  • musical style codes rdf dump (rdf/xml)
  • musical style codes rdf dump (n3)
  • musical style codes rdf dump (nt)
  • role codes rdf dump (rdf/xml)
  • role codes rdf dump (n3)
  • role codes rdf dump (nt)
  • You can consult the licence to use our data.
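
The host, port, login and password given above (echanges.bnf.fr, 21, databnf, databnf) match an ordinary FTP session, so access could be scripted roughly as follows with Python's standard `ftplib`. This is a minimal sketch under assumptions: the function names and the `ftp_factory` parameter are introduced here for illustration, and nothing is assumed about the directory layout or file names on the server, which is why the first function only lists what is there.

```python
from ftplib import FTP

# Connection details as given in the retrieval instructions.
HOST = "echanges.bnf.fr"
PORT = 21
USER = "databnf"
PASSWORD = "databnf"


def list_remote_files(ftp_factory=FTP, timeout=30):
    """Log in and return the names found in the login directory."""
    # ftp_factory is a hypothetical hook so the client class can be swapped out.
    with ftp_factory() as ftp:
        ftp.connect(HOST, PORT, timeout=timeout)
        ftp.login(USER, PASSWORD)
        return ftp.nlst()


def download_file(remote_name, local_path, ftp_factory=FTP, timeout=30):
    """Copy one remote file to local_path in binary mode."""
    with ftp_factory() as ftp:
        ftp.connect(HOST, PORT, timeout=timeout)
        ftp.login(USER, PASSWORD)
        with open(local_path, "wb") as out:
            # retrbinary streams the file in blocks to the callback.
            ftp.retrbinary(f"RETR {remote_name}", out.write)
```

Calling `list_remote_files()` with no arguments attempts a real connection. A name returned by it can then be passed to `download_file` together with a local destination path.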

    The software used: CubicWeb

    CubicWeb is an open source platform for developing semantic web applications. It is published under the LGPL licence.

    Within the project, this software is used to:

    • Extract and integrate data from heterogeneous sources and in various formats (CSV, MARC, Dublin Core, EAD-XML, RDF, …),
    • Merge, match and gather them in an SQL database,
    • Generate pages in any format, in this case: HTML, JSON, RDF-XML or PDF.

    It is based on the query language RQL (Relation Query Language), which is similar to the W3C's SPARQL, and on the Python language.

    In 2013, CubicWeb won the Dataconnexions award, organized by Etalab, a body attached to the French Prime Minister's office whose objective is to encourage open public data initiatives.