Semantic Data Management

SemData@VLDB Workshop

Workshop on Semantic Data Management (SemData)
September 17, 2010
At the 36th International Conference on Very Large Data Bases
Singapore: 13 - 17 Sept 2010, Grand Copthorne Waterfront Hotel

Workshop introduction and objectives

The Semantic Web represents the next-generation Web of Data, where information is published and interlinked in order to facilitate the exploitation of its structure and meaning by both humans and machines. Semantic Web applications require database management systems for handling structured data, taking into consideration the models used to represent semantics. To foster the realization of the Semantic Web, the World Wide Web Consortium (W3C) developed a set of metadata models, ontology models, and query languages. Today, most Semantic Web repositories are database engines that store data represented in RDF, support SPARQL queries, and can interpret schemata and ontologies represented in RDFS and OWL. We are thus at the point where the adoption of semantic technologies is growing. However, these technologies often appear immature, and tend to be too expensive or risky to deploy in real business settings. Solid data management layer concepts, architectures, and tools are important to everyone in the semantic ecosystem, and creating them requires a strong community with a critical mass of involvement.
Semantic data management refers to a range of techniques for the manipulation and usage of data based on its meaning. It enables sustainable solutions for a range of IT environments where the usage of today's mainstream technology is either inefficient or entirely infeasible: enterprise data integration, life science research, data sharing in SaaS architectures, and querying linked data on the Web. In a nutshell, semantic data management fosters the economy of knowledge, facilitating more comprehensive usage of larger-scale and more complex datasets at lower cost.
The goal of the SemData workshop is to provide a platform for the discussion and investigation of various aspects related to semantic databases and data management in the large. Many of the semantic data management challenges culminate in the need for scalable, high-performance database solutions for semantic data, a building block that lags well behind comparable non-semantic technologies. For semantic technologies to capture the targeted market share, technological progress must allow semantic repositories to reach near performance parity with some of the best RDBMS solutions, without giving up the advantages of higher query expressivity compared to basic key-value stores, or higher schema flexibility compared to the relational model. It is time that users no longer have to pay a heavy price, in terms of longer run times or more expensive equipment, to profit from the flexibility of the generic physical model underlying the graph-based structures of RDF. We also recognize that more flexibility will always carry some burden. Hence, the goal is to minimize the drawbacks and maximize the advantages of RDF-oriented semantic repositories.

Topics and applications

The SemData workshop seeks transdisciplinary expert discussions on issues such as semantic repositories, their virtualization and distribution, and interoperability with related database solutions such as relational, XML, or graph databases. We thus welcome original academic and industry papers or project descriptions that propose innovative approaches for semantic data management in the large, with a particular focus on semantic database solutions, including their virtualization and distribution.
The topics of interest of this workshop include but are not limited to:

  • semantic repositories and databases: storage facilities for semantic artifacts, RDF repositories, reasoning-supported data management infrastructures, database schemas optimized for semantic data, indexing structures, storage density and performance improvements
  • distribution, interoperability, and benchmarking: "classical" semantic storage subjects such as distributed repositories (data partitioning, replication, and federation), interoperability and integration with RDBMS, and performance evaluation and benchmarking
  • virtualized semantic repositories: identification and composition of (fragments of) datasets in a manner that abstracts applications from the specific setup of the data management service (e.g. local vs. remote, and distribution)
  • semantic data bus: a communication layer bridging the gap between the data layer and the application layer
  • embedded data processing: "move the processing close to the data" mechanisms, allowing application-specific data processing to be performed within the semantic repository, e.g. stored procedures and engine extension APIs
  • adaptive indexing and multi-modal retrieval: strategies for dynamic materialization tailored to specific data and query patterns; indexing structures for specific types of data and queries (full-text search, co-occurrence, concordance, temporal, spatial)


09.00 to 09.30 Workshop Opening / Presentation of SemData Initiative

09.30 to 10.30 Research Papers

  • Spyros Kotoulas and Jacopo Urbani. SPARQL Query Answering on a Shared-Nothing Architecture
  • Kuldeep B.R Reddy and Sreenivasa P. Kumar. Optimizing SPARQL queries over the Web of Linked Data

10.30 to 11.00 Coffee Break

11.00 to 12.30 Industry Position Papers

  • Orri Erling (OpenLink Software/Virtuoso). Directions and Challenges for Semdata
  • Atanas Kiryakov, Barry Bishop, Damyan Ognyanoff, Ivan Peikov, Zdravko Tashev, Ruslan Velkov (Ontotext/OWLIM). The Features of BigOWLIM that Enabled the BBC’s World Cup Website
  • Jans Aasman (Franz Inc/AllegroGraph). New Developments in AllegroGraph

12.30 to 14.00 Lunch (Level 2 Kiwi Lounge)

14.00 to 15.00 Research Papers

  • Robert Binna, Wolfgang Gassler, Eva Zangerle, Dominic Pacher and Günther Specht. SpiderStore: Exploiting Main Memory for Efficient RDF Graph Representation and Fast Querying
  • Thanh Tran and Günter Ladwig. Structure Index for RDF Data

15.00 to 15.30 Presentation Wrap-Up and Question Collection

15.30 to 16.00 Coffee Break

16.00 to 17.30 Q&A and Panel


The workshop proceedings are published at CEUR-WS:


Karl Aberer
Distributed Information Systems Laboratory LSIR
Ecole Polytechnique Federale de Lausanne, Switzerland

Reto Krummenacher
Semantic Technology Institute STI
University of Innsbruck, Austria

Atanas Kiryakov
Ontotext AD, Sofia, Bulgaria

Rajaraman Kanagasabai
Data Mining Department
Institute for Infocomm Research, Singapore

Technical Program Committee

Andreas Harth, KIT
Bryan Thompson, Systap
Carlos Pedrinaci, Open University
Dieter Fensel, STI Innsbruck
Martin Kersten, CWI
Elena Simperl, KIT
Fabrice Huet, INRIA - University of Nice
Kavitha Srinivas, IBM Watson
Massimo Paolucci, DoCoMo Lab Europe
Michael Witbrock, Cycorp
Orri Erling, OpenLink Software
Peter Haase, fluid Operations
Spyros Kotoulas, VU Amsterdam
Thanh Tran, KIT
Zoltan Miklos, EPFL
Frank van Harmelen, VU Amsterdam
Grigoris Antoniou, FORTH
Christian Bizer, FU Berlin
Andy Seaborne, Talis
Steve Harris, Garlik
Axel Polleres, DERI Galway
Matthias Wagner, DoCoMo Lab Europe
Tom Heath, Talis
Xavier Lopez, Oracle
Takahira Yamaguchi, Keio University
Manfred Hauswirth, DERI Galway
Panagiotis Karras, National University of Singapore
Sebastian Link, Victoria University of Wellington
Sherif Sakr, University of New South Wales
Stefano Ceri, Politecnico di Milano
Jans Aasman, Franz Inc.


Phone: +43 (0)512 5076452
Fax: +43 (0)512 507 94906452