The main goals of data.bnf.fr are:
The data.bnf.fr project aims to make the National Library of France's data more useful on the Web. This data is of various kinds; in particular, it describes and identifies the documents held at the BnF, as well as the people or organizations that created them. The site gathers resources from the Bibliothèque nationale de France, together with external resources, on pages devoted to authors, works, themes, places, dates and periodicals. These pages connect the various contents, links and services that the institution provides on the Web, which for technical reasons are scattered across several BnF applications. The project is also part of the BnF's opening up to the Web of data and its adoption of Semantic Web standards.
Launched in July 2011, data.bnf.fr continues to evolve and grow.
Data on data.bnf.fr is available under the French Open License, which is notably used by data.gouv.fr. Reuse and reproduction of the RDF data is free and open to any use, including commercial use. An attribution statement is required.
More on this: Terms of use for data.bnf.fr
Data.bnf.fr is thus firmly part of the open data movement. Driven by civic actors and governments, open public data aims to make available non-personal data that is collected or produced by public organizations and raises neither privacy nor security concerns. The opening of public data is part of a national policy: it entered French legislation through the transposition of the 2003 European Directive (Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the re-use of public sector information) in Ordinance n°2005-650 of June 6, 2005 on freedom of access to administrative documents and the re-use of public information.
The main concerns of this policy are democratic and economic: on the one hand, to make public action more transparent and efficient and to rationalize the creation of public data by sharing it; on the other hand, to develop economic activity by making information available for re-use, whether commercial or not.
These purposes are congruent with the missions of the Bibliothèque nationale de France, namely "to provide access to the collections for most people, while respecting the secrets protected by law, in accordance with the legislation on intellectual property and in a way that is compatible with the preservation of these collections," and to allow "remote consultation using the most modern technologies for data delivery" (art. R341-2 of the Heritage Code).
The aim is therefore to share with citizens the benefits of the work libraries do to identify and describe the collections they hold, including digital collections. It is a way to improve the circulation and reuse of BnF data by making it interoperable, which gives it a larger audience on the Web.
Data.bnf.fr displays high-quality structured data.
The HTML pages of data.bnf.fr are generated automatically from data and identifiers found in several BnF databases: BnF general catalog, BnF archives and manuscripts, Gallica. They are produced by automated workflows that use Semantic Web technologies.
As legal deposit of documents published in France is mandatory, the collections available on authors and works are very comprehensive and reflect the diversity of French cultural production. Several million rights-free documents are digitized and freely accessible in Gallica.
The authority records are the root of the site's pages: the "person and organization authorities" for author pages, the "title authorities" for work pages, and the "RAMEAU authorities" (the subject indexing language used at BnF) for subject pages.
As of 2021, data.bnf.fr has almost complete coverage of the good-quality catalog records, including over 2 million authors.
Data.bnf.fr extracts, transforms, and aggregates data from separate databases produced in different formats into a common database in order to link them together and make them interoperable.
Its pages are indexed by search engines, which otherwise do not reference the data and metadata held in the BnF's non-indexable databases, and they point to digitized documents.
To achieve this, data.bnf.fr relies on several components:
Authority records are the core of the data structure: information from different sources that relates to the same author, work, or theme is aggregated on a single page.
The Author pages gather all bibliographic records containing a link to the author's identifier.
Work pages collect all records containing both a link to the author's identifier and a link to the work identifier. When no such link exists, a simple alignment mechanism based on string comparison is used.
The Theme pages aggregate information about a given theme (the different ways of naming it, preferred and rejected forms, at BnF and at other institutions, according to several vocabularies) and the works about this theme.
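The way a Work page collects its records can be sketched as follows. This is only an illustration under stated assumptions: the record layout (`author_ids`, `work_id`, `title`), the identifiers, and the normalization rules are hypothetical, and the actual comparison used by data.bnf.fr is not described here.

```python
import unicodedata


def normalize(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(kept.split())


def records_for_work(records, author_id, work_id, work_title):
    """Collect the records of one work.

    A record is attached through its identifier links when it has a work
    link; otherwise it falls back to comparing normalized title strings.
    """
    matched = []
    for rec in records:
        if author_id not in rec.get("author_ids", []):
            continue  # a work page only looks at records of its author
        if rec.get("work_id") is not None:
            if rec["work_id"] == work_id:
                matched.append(rec)
        elif normalize(rec.get("title", "")) == normalize(work_title):
            matched.append(rec)
    return matched


# Hypothetical sample records (identifiers are made up for the example).
SAMPLE = [
    {"id": 1, "author_ids": ["A1"], "work_id": "W1", "title": "Le Voyage"},
    {"id": 2, "author_ids": ["A1"], "work_id": None, "title": "LE VOYAGE."},
    {"id": 3, "author_ids": ["A1"], "work_id": "W2", "title": "Carnets"},
    {"id": 4, "author_ids": ["A2"], "work_id": None, "title": "Le Voyage"},
]

found = records_for_work(SAMPLE, "A1", "W1", "Le Voyage")
print([rec["id"] for rec in found])
```

Record 2 has no work link and is matched on its title alone, while record 4 is ignored because it belongs to another author.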
Also found in data.bnf.fr:
Place pages, built from two distinct types of records (Rameau on the one hand, Department of Maps and Plans on the other), which are gradually being merged into single pages that provide, in particular, geographical coordinates.
Date pages, which display the relationships between works, organizations, authors, documents, etc. and a given date.
Performance pages, which collect the related bibliographic records.
Serial pages, constructed from the bibliographic records of periodicals, which provide brief information about the title and, where applicable, related authors.
Data.bnf.fr makes it possible to experiment with a new way of structuring information, no longer centered on the document but on the author's work. However, the work each document relates to is rarely described in the catalog (fewer than 8% of the documents). Doing this manually for the 12 million records of the catalog would take 45 years, at a rate of 2 minutes per document. A national program, the Bibliographic Transition, is nevertheless underway to put this new model into practice (by adopting IFLA-LRM, the Library Reference Model).
The BnF is therefore experimenting with a semi-automatic process that generates the description of each work from the information describing its successive editions. The first corpus processed covers 20th century printed works.
For each author, the titles of their publications are retrieved and clustered by similarity. For each cluster, a program then computes the work-related information from what is found in the documents (alternative title forms, translation titles, date of first publication, other authors). The results of these computations are put online at data.bnf.fr to evaluate the relevance of the process.
These results are also exposed to the scrutiny of users, who are invited to give feedback.
As the issues may have several origins (source data, clustering criteria, etc.), the BnF cannot guarantee that they will be corrected quickly. It can, however, add them to the corrections to be made when these works are loaded into the general catalog.
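The 45-year estimate can be checked with a few lines of arithmetic. The record count and the 2 minutes per document come from the text; reading the result as uninterrupted, round-the-clock work is an assumption made here because it is the reading under which the numbers agree.

```python
RECORDS = 12_000_000        # catalog records, from the text
MINUTES_PER_RECORD = 2      # manual processing rate, from the text

total_minutes = RECORDS * MINUTES_PER_RECORD
total_hours = total_minutes / 60

# Assumption: work continues 24 hours a day, 365 days a year.
years_nonstop = total_hours / (24 * 365)

print(f"{total_hours:,.0f} hours, about {years_nonstop:.1f} years nonstop")
```

Under ordinary working hours the same workload would spread over a much longer period, which is the argument for a semi-automatic process.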
You can also participate in this major project and help us improve the reliability of the data by reporting the errors you identify in these autogenerated works: data[at]bnf.fr.
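The clustering of an author's titles and the computation of work-level information described above can be sketched as follows. This is a minimal illustration, not the BnF's program: the similarity measure (`difflib.SequenceMatcher`), the 0.85 threshold, the record fields, and the sample data are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """True when two titles are close enough, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_records(records, threshold: float = 0.85):
    """Greedy clustering: a record joins the first cluster whose first
    title is similar to its own, otherwise it starts a new cluster."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if similar(rec["title"], cluster[0]["title"], threshold):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


def describe_work(cluster):
    """Derive work-level information from the documents of one cluster."""
    return {
        "title_forms": sorted({rec["title"] for rec in cluster}),
        "first_published": min(rec["year"] for rec in cluster),
    }


# Hypothetical records for a single author.
RECORDS = [
    {"title": "Le Voyage", "year": 1950},
    {"title": "Carnets", "year": 1961},
    {"title": "Le voyage.", "year": 1932},
    {"title": "Carnets.", "year": 1975},
]

works = [describe_work(c) for c in cluster_records(RECORDS)]
for work in works:
    print(work)
```

A greedy pass like this depends on the order of the records and on the threshold, which is one reason the generated works need to be evaluated and corrected through user feedback.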
Data.bnf.fr fits into the Web by providing links that redirect the user to resources inside or outside the BnF.
There are several types of links:
Data.bnf.fr uses data produced in various formats, including Intermarc for book catalogs, XML-EAD for archive inventories and manuscripts, and Dublin Core for the digital library.
These data are restructured, aggregated, enriched by automatic processing, and published according to the W3C recommendation for the Semantic Web, RDF. The result is available on this site, in several RDF syntaxes: RDF-XML, RDF-N3, and RDF-NT.
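To give an idea of what publishing in RDF means in practice, the sketch below writes two statements in the N-Triples syntax (the "RDF-NT" option above), where each line is one subject, predicate, object statement. The subject IRI under `example.org` and the literal values are hypothetical, and the two properties (Dublin Core Terms `title`, FOAF `name`) are chosen for illustration only; they do not describe the actual data.bnf.fr data model.

```python
def nt_literal(value: str, lang: str = "") -> str:
    """Format a plain string as an N-Triples literal, escaping the
    backslash, double quote, and line break characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    suffix = f"@{lang}" if lang else ""
    return f'"{escaped}"{suffix}'


def nt_triple(subject: str, predicate: str, obj: str) -> str:
    """Format one statement; obj must already be formatted, either as a
    literal or as an IRI between angle brackets."""
    return f"<{subject}> <{predicate}> {obj} ."


# Hypothetical resource and values.
WORK = "http://example.org/work/1"
lines = [
    nt_triple(WORK, "http://purl.org/dc/terms/title", nt_literal("Le Voyage", "fr")),
    nt_triple(WORK, "http://xmlns.com/foaf/0.1/name", nt_literal('An "example" name')),
]
print("\n".join(lines))
```

In real use an RDF library would handle serialization and the other syntaxes; the hand-written formatter only shows the shape of the output.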
More about this: Semantic Web and Data Model
General considerations on BnF ARKs
The BnF assigns identifiers in the ARK 12148 (Bibliothèque nationale de France) domain according to the following principles.
Resource mutability
The mutability of resources present in data.bnf.fr and identified by ARKs is defined as follows.
Addressing authority
Data.bnf.fr addressing authority manages the following generic service qualifiers:
Availability
The services (except SPARQL) and data of data.bnf.fr are accessible 24 hours a day, 7 days a week. Temporary unavailability may nevertheless occur because of internal service issues, and it is not always foreseeable.
For more information, on the BnF website: The ARK (Archival Resource Key) identifier.
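As an illustration of how such identifiers are structured, the sketch below splits an ARK into its naming authority number (12148 for the BnF, as stated above), its name, and an optional qualifier. The general `ark:/NAAN/Name[Qualifier]` shape follows the ARK scheme; the regular expression is a simplification, and the name `cb00000000x` and the qualifier `/example` are hypothetical values, not real BnF identifiers or services.

```python
import re

# Simplified pattern: a qualifier, when present, starts with "/" or ".".
ARK_PATTERN = re.compile(
    r"ark:/(?P<naan>\d+)/(?P<name>[^/.?#]+)(?P<qualifier>[/.][^?#]*)?"
)

BNF_NAAN = "12148"


def parse_ark(text: str):
    """Return the parts of the first ARK found in text, or None."""
    match = ARK_PATTERN.search(text)
    if match is None:
        return None
    return {
        "naan": match["naan"],
        "name": match["name"],
        "qualifier": match["qualifier"] or "",
    }


def is_bnf_ark(text: str) -> bool:
    """True when text contains an ARK assigned under the BnF's NAAN."""
    parts = parse_ark(text)
    return parts is not None and parts["naan"] == BNF_NAAN


print(parse_ark("https://data.bnf.fr/ark:/12148/cb00000000x"))
print(parse_ark("ark:/12148/cb00000000x/example"))
```

Because the ARK is embedded in the URL rather than tied to a host name, the same parsing applies whichever BnF application serves the resource.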
The major project developments are summarized on this page.
Data.bnf.fr and Gallica were awarded the Stanford Prize for Innovation in Research Libraries (SPIRL).
FOUCHER Tiphaine, « Le web de données en pratique : data.bnf.fr », video co-produced by the BnF and the Cnfpt.
LEVOIN Xavier, 2021. Data.bnf.fr : améliorer la découvrabilité des contenus culturels sur le web, Archimag, n°341, p. 28-29.
LAPÔTRE Raphaëlle, 2018. Data.bnf.fr as a sandbox for FRBRization: automated work creation in data.bnf.fr, SWIB18 : https://youtu.be/-cabjegojNw.
LAPÔTRE Raphaëlle, 2017. Library Metadata on the Web: the Example of data.bnf.fr, JLIS.it 8, 3, p. 58-70. Doi: 10.4403/jlis.it-12402.
BERMES Emmanuelle, 2016. Vers de nouveaux catalogues. Paris : Cercle de la librairie.
BERMES Emmanuelle, BOULET Vincent, LECLAIRE Céline, 2016. Améliorer l’accès aux données des bibliothèques sur le web : l’exemple de data.bnf.fr. IFLA World Library and Information Congress : http://library.ifla.org/1447/1/081-bermes-fr.pdf.
BERMES Emmanuelle, 2014. Les bibliothèques sur le Web. Les catalogues au défi du Web (session 2) : http://video.cnfpt.fr/conferences-1/les-catalogues-au-defi-du-web-les-bibliotheques-sur-le-web.
BERMES Emmanuelle, with the collaboration of Antoine Isaac and Gautier Poupeau, 2013. Le Web sémantique en bibliothèque. Paris : Cercle de la librairie.