Open data at the Swedish National Heritage Board
LARS LUNDQVIST, HEAD OF INFORMATION DEVELOPMENT, THE NATIONAL HERITAGE BOARD SWEDEN
For a number of years, the National Heritage Board in Sweden has worked intently to develop a common infrastructure for Swedish digitised cultural heritage, so that it can be searched, shared, and reused on contemporary media terms. Lars recounts the coordinated efforts of the National Heritage Board to ensure a sustainable infrastructure and consistent open licensing of data across the vast Swedish heritage landscape.
New conditions for heritage management
The conditions for and potential of information management and communication have been changed radically by the introduction of new media and new technology. The web enables individuals and institutions to share information and collaborate on a global scale. This challenges local perspectives and prerogatives and affects traditional business models. What was relevant in a pre-web, pre-digital era has in many cases become obsolete.
These changes have influenced strategic thinking at the Swedish National Heritage Board (NHB). New media are not merely new mass-media channels; they call for new ways of thinking. The institution’s traditional role as broadcaster, which expects users to visit the institution on a 6-day/6-hour basis, is challenged by a constantly accessible communication network that encompasses a wide range of information resources and experts in many fields. Digitization has changed our users’ expectations and behavior, and together these changes have created an urgent need to adapt in order to stay relevant.
In this paper I will briefly outline some actions we have taken at the NHB. The focus will be on open data and the significance of licensing metadata. The overall objective is to facilitate use and reuse of digital information for research, community planning, education, creativity, and the cultural and creative industries.
Open heritage and the semantic web
One of the NHB’s tasks is to provide the Swedish historic preservation sector with information on ancient monuments and historic buildings as well as infrastructure for museums. Since 2008, the NHB has been working under a government mandate to manage and develop the web service Swedish Open Cultural Heritage (SOCH)*1 – an infrastructure mainly for historic preservation and museum domains. SOCH aggregates information from over 20 institutions (5.1 million objects as of September 2013), thereby enabling searches across institutional boundaries. SOCH’s aim is to streamline information searches and, by serving as an open resource for application developers, to stimulate application development. The NHB and SOCH also act as a national aggregator for Europeana.
SOCH provides the basis for the NHB’s work with linked open data. At the NHB we believe that the semantic web can be a remedy for fragmented cultural heritage resources, and we aim to replace current unstructured information resources that inhibit search and usability with an infrastructure built on semantic principles.*2
Digitization spoils us
It isn’t easy to comprehend where the digitization of society will take us. Development is rapid. It is difficult to predict what will become the norm and which technologies have the longevity to be implemented in an agency such as the NHB. However, simply sitting and waiting for the future is not an option, because the future arrives every day. The NHB has existed for some 380 years and has so far been able to adjust to societal changes over the centuries.
So here we are, online, taking for granted unprecedented access to a vast quantity of information and services, 24/7. We are all becoming more impatient and, in a sense, spoilt. Tasks that 10 years ago took us days or weeks to accomplish are now completed in minutes or hours. We experience this change every day and may scarcely be aware of it anymore. It happens in our everyday life as well as at work.
The nature of digital information
The nature of digital information on the web differs radically from analog information in a mass media context. This is not always fully understood within public sector institutions, and this sometimes creates a mismatch when it comes to information management. Information managers need to understand how digital information can “act” on the web:
- The moment you publish information on the web, you lose control of it.
- On the web, borders are irrelevant, whether they are political-administrative or institutional.
- Digital social networks are becoming the primary platforms for diffusion of ideas and opinions.
- People look for information on the web in order to solve a problem. It is of secondary importance which institution manages the information.
“If it’s not findable, who cares about it?”
The web provides enormous amounts of information. The number of institutions sharing their collections is growing, as is the number of collection items. The abundance of sites may not pose problems for experts who are very familiar with the institutional landscape and seek a clearly delineated range of information. But for those who don’t know which institution holds what information, things get very complicated.
This is where the semantic web comes into play. Its strength lies in the fact that it makes heritage information accessible, visible, and findable, and allows it to be linked to related data.*3
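The idea of linking heritage information to related data can be illustrated with a small sketch. It assumes that each institution publishes statements as subject, predicate, object triples that point to a shared identifier. Every URI, predicate name, and function name below is invented for illustration and does not describe SOCH or any real vocabulary.

```python
# Hypothetical triples (subject, predicate, object) from three institutions.
# All URIs and predicate names are invented for this example.
TRIPLES = [
    ("http://example.org/museum-a/object/12", "depicts",
     "http://example.org/place/church-1"),
    ("http://example.org/archive-b/photo/88", "depicts",
     "http://example.org/place/church-1"),
    ("http://example.org/register-c/building/5", "sameAs",
     "http://example.org/place/church-1"),
    ("http://example.org/museum-a/object/40", "depicts",
     "http://example.org/place/wall-2"),
]


def related_to(resource, triples):
    """Return every subject that points at `resource`,
    regardless of which institution published the statement."""
    return sorted(s for s, _, o in triples if o == resource)


# A user who starts from one place finds material held by all three
# institutions without knowing in advance who holds what.
hits = related_to("http://example.org/place/church-1", TRIPLES)
for uri in hits:
    print(uri)
```

Because the link targets a shared identifier rather than an institution's own catalogue, the user does not need to know the institutional landscape to find the material.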
In the long run, this may be the key to maintaining relevance in a digital era. As described in a tweet I saw a few years ago:
“If your content is not interoperable, it’s not findable. If it’s not findable, who cares about it?”
Knowledge is elsewhere
Openness is not just about distributing information. It is also a matter of being present in order to interact and cooperate with the people who want to follow you. Ideally, openness allows you to work together with members of the community.
This is important when it comes to developing content in our databases. We have to realize that the true experts on cultural heritage and historic preservation are not necessarily working at the NHB, and that we will never be able to maximize quality and completeness of our content without help from an external community.
Another crucial question, and one that is more relevant for this paper, is: How will we ever be able to provide all of our different target groups, such as researchers, heritage managers, exhibition producers, and teachers, with information and services? It is quite obvious that the NHB alone will never be able to meet all of society’s needs, because we will never be able to build services or applications to support all of our target groups. First, we must admit that developing applications is not one of our strengths at the NHB. Furthermore, our knowledge of existing target groups and their needs is limited, and we can’t predict the emergence of new ones in the future.
We must also consider our role as managers of information. Is it our business as information managers to restrict reuse of publicly funded data? Is it our mission to control and direct how individuals use digital cultural heritage? In my opinion it is ideologically problematic not to release digitized cultural heritage – cultural heritage is a common property and concern.
A new agenda: Open data
The NHB is pursuing a strategy that will allow us to release raw data so as to make it as reusable as possible. Because we hope to support more than the needs of the historic preservation community, we aim to ensure that the information can also be used for other purposes by other user groups. Our role will be focused on quality issues and the development of the NHB information system to better support work within the cultural heritage sector and beyond. In short, this might be expressed as follows:
Digitization at the NHB shall make it as easy as possible, for as many as possible, to use and reuse cultural heritage information. The ultimate goal is to enable people and institutions to share content beyond the boundaries of applications and websites.
The obvious way to make our data accessible in this way is to work within the guidelines that define open data, which according to Wikipedia can be defined as:
“the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. (...) [T]he term ‘open data’ [is] gaining popularity with the (...) launch of open-data government initiatives.”*4
The Public Sector Information (PSI) directive, which aims to remove barriers that hinder the re-use of public sector information throughout the EU, also supports the idea of implementing “openness.” It points out that all agencies must make their data accessible for reuse.*5
There are good reasons for spending fewer resources on institutional, domain-specific silos or “portals”. Instead, more effort should be put into licensing data and developing an infrastructure that enables the efficient distribution and unrestricted use of data. The goal is to stimulate stakeholders like researchers, municipality planners, the tourism industry and many others to link to and access remote information in their own systems and applications via technical interfaces.
How to communicate “openness”?
There are clear indications that public sector institutions are becoming more generous in allowing re-use of digital material. This is mostly communicated on an institution’s website, often conditionally, with phrases such as: “Feel free to use the image but you must describe how you will use it”, or “You may use the image, but you are not allowed to make derivative works”. This model might work when users need information from only one or a few institutions. But this doesn’t work if information is compiled from many sources, for instance in cross-search services like SOCH and Europeana. How will users and application developers learn how the material can be re-used?
To solve this problem, terms for reuse must be formalized in a machine-readable format. Creative Commons licensing, for example, is one model that allows a copyright holder to set terms for reuse of protected works.
Creative Commons (CC) licenses are used when a copyright holder wants to give people the right to share, use, and build upon a copyrighted work. Importantly, CC licenses can be made machine-readable. CC licenses can be applied to all works falling under copyright, including books, plays, movies, music, articles, photographs, blogs, and websites. Creative Commons licenses are not to be used for works whose protection has expired, e.g. works in the public domain.*6
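What machine-readable terms make possible in a cross-search setting can be shown with a minimal sketch. It assumes that each aggregated record carries the URI of its Creative Commons license. The records, field names, and function names are hypothetical and do not describe the actual SOCH or Europeana data models. The only fact relied upon is that CC license URIs encode the license elements (by, nc, nd, sa) in their path.

```python
from urllib.parse import urlparse

# Hypothetical records as an aggregator might receive them from
# several institutions; ids, titles and field names are invented.
RECORDS = [
    {"id": "inst-a/1", "title": "Rune stone photograph",
     "license": "http://creativecommons.org/licenses/by/3.0/"},
    {"id": "inst-b/7", "title": "Church interior",
     "license": "http://creativecommons.org/licenses/by-nd/3.0/"},
    {"id": "inst-c/3", "title": "Excavation plan",
     "license": "http://creativecommons.org/licenses/by-nc-sa/3.0/"},
    {"id": "inst-d/9", "title": "Farmhouse drawing",
     "license": None},  # terms only stated in prose on a website
]


def license_elements(uri):
    """Return the CC license elements in a license URI, e.g. {'by', 'nd'}.
    Returns None when the URI is missing or is not a CC license URI."""
    if not uri:
        return None
    parsed = urlparse(uri)
    parts = [p for p in parsed.path.split("/") if p]
    if (parsed.netloc != "creativecommons.org"
            or len(parts) < 2 or parts[0] != "licenses"):
        return None
    return set(parts[1].split("-"))


def allows_derivatives(record):
    """True only when a CC license is present and lacks the 'nd' element."""
    elements = license_elements(record["license"])
    return elements is not None and "nd" not in elements


def allows_commercial_use(record):
    """True only when a CC license is present and lacks the 'nc' element."""
    elements = license_elements(record["license"])
    return elements is not None and "nc" not in elements


remixable = [r["id"] for r in RECORDS if allows_derivatives(r)]
commercial = [r["id"] for r in RECORDS if allows_commercial_use(r)]
print("May be adapted:", remixable)
print("May be used commercially:", commercial)
```

The record without a license URI is excluded from both lists. Software cannot interpret conditions that exist only as prose on an institution’s website, so a cautious application has to treat such material as not reusable.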
Working with digitization is not just about replacing old tools with new ones. The change goes deeper than that. Digitization has brought about new behaviors and new expectations within society and, in some cases, new mandates from government. Each institution must explore the possibilities that digitization offers, and design an appropriate strategy. In the case of the NHB, making its store of information and knowledge available via the semantic web both plays to its strengths and goes a long way toward fulfilling its stated mission.
Acknowledgment: Special thanks to Leslie Spitz-Edson for text improvements.
*1 en.wikipedia.org/wiki/Swedish_Open_Cultural_Heritage , accessed 31 March 2013.
*2 en.wikipedia.org/wiki/Semantic_web , accessed 31 March 2013.
*3 An interesting vision to explore is the idea of a Cultural Commons, in the way Europeana describes it: pro.europeana.eu/web/guest/cultural-commons , accessed 5 April 2013. Read more on this topic in Jill Cousins’ article p. 132.
*4 en.wikipedia.org/wiki/Open_data , accessed 31 March 2013.
*5 ec.europa.eu/digital-agenda/en/open-data-0 , accessed 31 March 2013.
*6 Read more on Creative Commons in Martin von Haller Grønbæk’s article p. 141, and at creativecommons.org/ and en.wikipedia.org/wiki/Creative_Commons_licenses , accessed 31 March 2013.