Reasoning & Querying over Large Knowledge Graphs

Axel Polleres & Sebastian Rudolph

Knowledge Graphs are typically graphs containing information about “interesting” entities and their relationships in particular domains. This includes instance data, but also schema information. Semantic Web standards such as RDF(S) and OWL serve as standardised formats for modelling, publishing and exchanging knowledge graphs. In fact, Linked Data can be viewed as a large network of interconnected Knowledge Graphs. In this tutorial, we will learn how to perform reasoning over Knowledge Graphs published as RDF/Linked Data and how to query them. Based on examples, we will introduce central reasoning and querying techniques such as:

  • Rule-based Materialisation using RDFS
  • Tableau-based Reasoning using OWL
  • Querying with SPARQL
  • Reasoning by Query Rewriting
  • Querying and Reasoning over contextualised graph data (Reification and Property Graphs)

Finally, time allowing, we will ask ourselves to what extent these techniques work over Linked Data that is published in a distributed fashion, and which challenges and open problems are involved, including federated queries, dealing with broken links, and the authority of links.
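To make the first technique in the list above more concrete, here is a minimal sketch of rule-based materialisation in Python. It is not taken from the tutorial material: it assumes triples are plain (subject, predicate, object) tuples, covers only four of the RDFS entailment rules (rdfs2, rdfs3, rdfs9, rdfs11), and uses made-up `ex:` names. The function `materialise` is a hypothetical helper that applies the rules until no new triples appear.

```python
# Illustrative sketch only: a naive fixpoint computation over a set of triples.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def materialise(triples):
    """Return the input triples plus everything derivable by rdfs2/3/9/11."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if p == SUBCLASS:
                    # rdfs11: subClassOf is transitive
                    if p2 == SUBCLASS and s2 == o:
                        new.add((s, SUBCLASS, o2))
                    # rdfs9: an instance of a subclass is an instance of the superclass
                    if p2 == TYPE and o2 == s:
                        new.add((s2, TYPE, o))
                elif p == DOMAIN and p2 == s:
                    # rdfs2: the subject of a property is typed with its domain
                    new.add((s2, TYPE, o))
                elif p == RANGE and p2 == s:
                    # rdfs3: the object of a property is typed with its range
                    new.add((o2, TYPE, o))
        if new <= closure:  # fixpoint reached, nothing new was derived
            return closure
        closure |= new


# Hypothetical example data (all ex: names are invented for illustration).
graph = {
    ("ex:Professor", SUBCLASS, "ex:Academic"),
    ("ex:Academic", SUBCLASS, "ex:Person"),
    ("ex:teaches", DOMAIN, "ex:Professor"),
    ("ex:alice", "ex:teaches", "ex:kg101"),
}
inferred = materialise(graph) - graph
```

With this data, `inferred` contains the type statements for `ex:alice` that follow from the domain of `ex:teaches` and the class hierarchy. The nested loop is quadratic per round, which is acceptable for a toy example; production reasoners use indexing and semi-naive evaluation instead.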

Machine Learning, Embeddings for Large Knowledge Graphs

Claudia d’Amato & Heiko Paulheim

Machine Learning methods are used for the refinement of Knowledge Graphs (i.e., completing missing knowledge or marking mistakes), as well as for using Knowledge Graphs as background knowledge in data-intensive tasks. There are two main families of approaches for machine learning with knowledge graphs: a) symbolic; b) numeric or sub-symbolic. Symbolic approaches use symbols to represent the entities and relationships of a problem domain (observations/examples) and infer generalisations of the examples that provide new insights into the data and that are, ideally, readily interpretable by humans. Sub-symbolic approaches typically adopt feature vector (propositional) representations; they lack the ability to provide interpretable models, but they usually scale well even to large data collections. The talk will provide an overview of the main characteristics of the two families of approaches and will focus on embeddings (mappings from discrete objects, e.g., words, to vectors of real numbers) as a useful way of encoding symbol-based representations as feature vectors, so that scalable numeric approaches can be applied to them. The talk will also highlight current research questions and possible future directions in the context of Knowledge Graphs.
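As a rough illustration of how an embedding can support knowledge graph completion, the sketch below scores triples in the style of TransE, a well-known translation-based model. TransE is chosen here only as an example; the abstract does not name a specific model. The vectors are set by hand rather than learned, and `transe_score` and `rank_tails` are hypothetical helper names.

```python
import math

# Hand-set 2-dimensional vectors; a real system would learn these from data.
entity_vec = {
    "Vienna": [1.0, 0.0],
    "Austria": [1.0, 1.0],
    "Berlin": [3.0, 0.0],
    "Germany": [3.0, 1.0],
}
relation_vec = {"capitalOf": [0.0, 1.0]}


def transe_score(head, relation, tail):
    """Negative Euclidean distance between head + relation and tail.

    Scores closer to zero mean the triple is considered more plausible.
    """
    h, r, t = entity_vec[head], relation_vec[relation], entity_vec[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank all other entities as candidate tails, most plausible first."""
    candidates = [e for e in entity_vec if e != head]
    return sorted(
        candidates,
        key=lambda tail: transe_score(head, relation, tail),
        reverse=True,
    )


ranking = rank_tails("Vienna", "capitalOf")
```

Here the entity ranked first for `("Vienna", "capitalOf", ?)` is the one whose vector lies closest to `Vienna + capitalOf`. The same ranking idea can be used to flag suspicious existing triples, namely those with unusually low scores.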

Distributed Decentralised FAIR Knowledge Graphs

John Domingue & Michel Dumontier

The FAIR (Findable, Accessible, Interoperable, Reusable) principles posit specific objectives to maximize the discoverability and reusability of shared data and services. We will explore what the concept of FAIR means for knowledge graphs in accordance with the summer school theme of preservation and evolution. In particular, how do we create persistent identifiers for a knowledge graph? Can we be assured that this identifier will help retrieve the same knowledge graph a few years later?
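One way to make the last question testable is to derive part of the identifier from the graph's content, so that a later retrieval can be checked against it. The sketch below is an assumption-laden illustration, not the presenters' approach: it assumes the graph is serialised as N-Triples lines and contains no blank nodes (graphs with blank nodes need a proper canonicalisation algorithm first). `graph_fingerprint` is a hypothetical helper, and the example triples are invented.

```python
import hashlib


def graph_fingerprint(ntriples_lines):
    """Return a SHA-256 hex digest that depends only on the set of triples.

    Assumes one triple per line and no blank nodes; duplicate lines and
    line order do not affect the result.
    """
    canonical = sorted({line.strip() for line in ntriples_lines if line.strip()})
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


# Invented example triples in N-Triples syntax.
published = [
    '<http://example.org/alice> <http://example.org/knows> <http://example.org/bob> .',
    '<http://example.org/bob> <http://example.org/name> "Bob" .',
]
# The same triples retrieved later, in a different order.
retrieved = list(reversed(published))
# A copy in which one literal has changed.
altered = [published[0], '<http://example.org/bob> <http://example.org/name> "Robert" .']

same = graph_fingerprint(published) == graph_fingerprint(retrieved)
changed = graph_fingerprint(published) != graph_fingerprint(altered)
```

A fingerprint of this kind only verifies that the retrieved graph is unchanged; it does not by itself guarantee that the identifier still resolves, which remains a matter of infrastructure and governance.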

We will expand the above notions to take decentralisation into account. In a recent article, Sir Tim Berners-Lee said that the web is ‘anti-human’ because of the over-centralisation of data, often private, held by a few large companies. We will outline how decentralised technologies such as blockchains and Sir Tim Berners-Lee’s Solid platform lead us to a future where data is owned, controlled and managed in a more equitable fashion.

The last part of the presentation will add to FAIR ‘TRADE’ (TRustable, Autonomous, DistributEd) – notions that can support the preservation and evolution of FAIR knowledge graphs in a trusted decentralised manner.

Event Timeslots (3)

Tuesday 2nd
KR Querying Reasoning For Large Knowledge Graphs

Tuesday 2nd
Machine Learning, embeddings for Large Knowledge Graphs

Wednesday 3rd
Distributed Decentralised FAIR Knowledge Graphs