Marcus Ebner: “In a globalized science world it can be very tiring if you don’t speak the same ‘language’ or can’t read someone else’s data.”

May 26, 2011

Marcus Ebner is a geologist working as a domain expert in knowledge management in the Department of Geoinformation at the Geological Survey of Austria.

The Geological Survey of Austria is a public sector research institution affiliated with the Austrian Federal Ministry for Science and Research and, as such, is the premier advisory body to the Austrian Government on geosciences. Its core program covers a wide range of geoscience activities, such as geoscientific mapping, basic research, environmental monitoring (including natural resources and water management), and the maintenance of extensive databases and archives.

The PoolParty team had the chance to talk with Mr. Ebner about the role of SKOS and other standards in the context of the INSPIRE Directive and other scenarios for data harmonisation.

1. What is the purpose of your thesaurus project?

Our thesaurus project is motivated by the increasing need for a uniform description (i.e. the semantics behind the language) of our geospatial data products, which should enhance their value and reusability for our stakeholders. The main driving force behind this project is the INSPIRE Directive of the European Parliament and Council, which is intended to establish an infrastructure for spatial information in Europe to support Community environmental policies, and policies or activities which may have an impact on the environment. As a public authority, the Geological Survey of Austria is legally required to implement this directive for the data themes geology and mineral resources. To implement this regulation we need an agreed standard (both in terms of technical interoperability and a semantic framework for a knowledge organisation system) to build upon.

A second very important motivation for our thesaurus project is that we want to encourage our (domain) experts to contribute to structured knowledge at an organizational level. This would allow us to organize our knowledge management in a structured fashion.

2. Why did you choose thesauri to organize your information? What kind of problems are you able to solve with this approach?

Thesauri strike just the right balance between overly simple glossaries and overly complicated ontologies; they allow for the semantic richness we were looking for. In terms of information science, thesauri are controlled vocabularies whose concepts are linked by semantic relations. These semantic relations are the key aspect for us because they allow information retrieval that goes well beyond ordinary full-text search. Thesauri are still quite simple to build and understand, and far less rigorous than ontologies, which build on complex description logic. By attributing our geospatial datasets with concepts from controlled vocabularies we are able to generate data that will be compliant with the INSPIRE Directive. Additionally, we will significantly increase the semantic value of our data sets, which will then allow much easier knowledge retrieval for in-house and external users.
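To illustrate how semantic relations can take retrieval beyond full-text search, here is a minimal Python sketch. The concept labels, the `NARROWER` table and the helper names are hypothetical and are not taken from the Survey's thesauri; the idea is simply that a search for a broad concept also finds documents that mention only narrower concepts.

```python
from collections import deque

# Hypothetical mini-thesaurus: each concept maps to its narrower concepts
# (the inverse direction of skos:broader). Labels are illustrative only.
NARROWER = {
    "sedimentary rock": ["limestone", "sandstone"],
    "limestone": ["oolitic limestone"],
    "sandstone": [],
    "oolitic limestone": [],
}


def expand_query(term, narrower=NARROWER):
    """Return the term plus all transitively narrower concepts (breadth-first)."""
    seen = [term]
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in narrower.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


def search(documents, term):
    """Return documents that mention the term or any narrower concept."""
    terms = expand_query(term)
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]


docs = ["Outcrop of oolitic limestone near Vienna", "Granite quarry survey"]
print(search(docs, "sedimentary rock"))
```

A plain full-text search for "sedimentary rock" would miss the first document, whereas the expanded query finds it through the narrower-concept relation.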

3. What role do SKOS and/or Linked Data play in achieving your goals?

SKOS (Simple Knowledge Organization System), as the name implies, is a W3C standard that is exactly what we were looking for: simple enough to be easily implemented, but complex enough to map thesauri, and geared towards (semantic) web use. Publishing thesauri to the Linked Open Data (LOD) cloud, i.e. using HTTP URIs as IDs for concepts, allows users to look things up with their browsers. We believe that if something is as simple as looking things up in Wikipedia, with no expert knowledge or software necessary, we have lowered the barriers for users tremendously. By giving easy access to our data we hope to motivate more customers to use our services. In addition, international geoscience communities recently decided to use SKOS to encode their thesauri and publish them as linked data. Linking our thesauri with other thesauri in the LOD cloud should therefore be as simple as clicking links when browsing the web.
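As a rough illustration of what "HTTP URIs as IDs for concepts" can look like, the following Python sketch writes a single SKOS concept in Turtle syntax. The `example.org` URIs, the labels and the `concept_to_turtle` helper are assumptions made for illustration; only the SKOS namespace and the `skos:Concept`, `skos:prefLabel` and `skos:broader` terms come from the W3C standard.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def concept_to_turtle(uri, pref_labels, broader=None):
    """Serialize one concept as Turtle.

    pref_labels maps a language tag to a label. This sketch assumes labels
    contain no double quotes or backslashes, so it does no escaping.
    """
    props = [
        f'    skos:prefLabel "{label}"@{lang}'
        for lang, label in sorted(pref_labels.items())
    ]
    if broader:
        props.append(f"    skos:broader <{broader}>")
    lines = [
        f"@prefix skos: <{SKOS_NS}> .",
        "",
        f"<{uri}> a skos:Concept ;",
        " ;\n".join(props) + " .",
    ]
    return "\n".join(lines)


# Hypothetical concept URIs; a real thesaurus would use its own domain.
print(concept_to_turtle(
    "http://example.org/lithology/limestone",
    {"en": "limestone", "de": "Kalkstein"},
    broader="http://example.org/lithology/sedimentary-rock",
))
```

Because the identifier is an ordinary web address, a browser or another thesaurus can refer to the concept simply by linking to it.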

4. What are the most important values you generate for your stakeholders?

For users at an institutional level, the most important aspect is probably that an agreed controlled vocabulary is developed during this project that can be used as a reference work. It is probably quite similar for external users, who are asking for a standard controlled vocabulary that can be referred to. In addition, our data should then be readable by non-experts too.

5. What are the most important arguments to use Semantic Web standards and Linked Data, especially in the geo domain?

One of the most important arguments is that the entire science community feels the need for agreed standards. I just returned from one of the biggest geoscience meetings, the European Geosciences Union meeting held in Vienna, and there were plenty of thematic sessions discussing the need for common standards to facilitate scientific collaboration and easy data access. Especially in a globalized science world, working with researchers from abroad can soon become very tiring if you don’t speak the same “language” or can’t read someone else’s data.

Virtually all communication and information is nowadays transferred via the web, so it was quite natural for us to select a standard like SKOS that is built for the web. One of the best examples of this need is probably the INSPIRE Directive of the European Parliament. To be able to develop environmental laws at a European level that account equally for, say, the water situation in the Vienna basin and in the Ebro basin in Spain, the available geospatial data from EU member states need an agreed technical and semantic standard.

6. What kind of applications can be built or have been built on top of your thesauri?

We have so far developed an application that gives users of GIS software within the Geological Survey of Austria access to the thesauri from inside their application, so that they can attribute the geospatial data with concepts from these controlled vocabularies. After the public launch of our project we want to make the effort to build some use cases for applications with both in-house and external users, e.g. for semantic web map applications.

7. Why did you choose PoolParty to manage your thesauri?

Like other SKOS editors, PoolParty facilitates the development and management of SKOS-based thesauri. The biggest difference from other products is probably that it comes with a set of out-of-the-box web services. The SPARQL endpoint, for example, gives machine-readable open access to the data and also serves as a bridgehead for developing new applications for our stakeholders. In addition, its open-access service interface allows external applications that we have not yet considered to be built on this service. This capability, which allowed us to publish our data in a standard open format right away, was probably the most important argument for us.
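For readers unfamiliar with SPARQL endpoints, the sketch below shows how an external application might prepare a request to one. The endpoint URL is a placeholder and the helper names are hypothetical; nothing here describes PoolParty's actual service addresses. It relies only on the standard SPARQL protocol convention of passing the query text in a `query` URL parameter, and it builds the request URL without sending it.

```python
from urllib.parse import urlencode

# Placeholder endpoint; a real service would publish its own address.
ENDPOINT = "https://example.org/sparql"


def build_label_query(search_text, lang="en", limit=10):
    """Build a SPARQL query for SKOS concepts whose preferred label contains the text."""
    # Escape backslashes and double quotes so the text is a valid string literal.
    escaped = search_text.lower().replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?label WHERE {\n"
        "  ?concept a skos:Concept ;\n"
        "           skos:prefLabel ?label .\n"
        f'  FILTER (lang(?label) = "{lang}" && '
        f'CONTAINS(LCASE(STR(?label)), "{escaped}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint, query):
    """Return the GET URL that carries the URL-encoded query."""
    return endpoint + "?" + urlencode({"query": query})


query = build_label_query("limestone")
print(query)
print(build_request_url(ENDPOINT, query))
```

An application would then fetch this URL with any HTTP client and ask for a machine-readable result format such as SPARQL JSON results.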

8. How do you manage to get your thesauri used, how are you going to build an “eco-system” around your work?

This is hard to answer given the current status of our project. But we have developed a twofold strategy: (i) at an organizational level we have appointed a thesaurus editorial team to develop a controlled vocabulary that meets our domain experts’ approval, and (ii) for external users we plan to develop a reference application to demonstrate how our thesaurus service (in the form of a public SPARQL endpoint) can be used in other applications.

9. Do you plan to publish your thesauri, or parts of them, on the LOD cloud? Under which licenses?

Yes, we will probably provide most of our controlled vocabularies under a “Creative Commons ShareAlike” licence, in line with major nodes in the LOD cloud.

10. What are your future plans and next steps?

The next big step is the public launch of our thesaurus service in the coming months. To accomplish this we still have to do some tedious work in developing and expanding the thematic thesauri. At the same time we want to make the effort to develop and present some completed use cases, probably in the form of a semantic web map service application, so that our potential users will be able to understand how they can benefit from this new service.

A related presentation about “Thesaurus Management and INSPIRE” was given by Martin Schiegl (Geological Survey of Austria) at the OGD Conference 2011.