STLab at ESWC 2017
STLab will be present at the 14th edition of ESWC, which will take place from May 28th to June 1st, 2017 in Portoroz, Slovenia.
Below are all the STLabers' presentations:
May 28th
9:30-10:30
Valentina Presutti
Frame-based sentiment analysis
Invited talk at the Third International Workshop at ESWC on Emotions, Modality, Sentiment Analysis and the Semantic Web
Abstract: Frames are at the basis of our way to contextualise and interpret natural language. The interpretation of affective communications depends on the specific context, just like any other type of verbal (or textual) communication. In this talk, Valentina Presutti will present a method for performing sentiment analysis of opinions, characterised by its frame-oriented and semantic nature. The method relies on frame-oriented knowledge graphs, an ontology for opinions, and a resource of frame roles specially created for the sentiment domain.
9:50
Diego Reforgiato Recupero
An Approach for Discovering and Exploring Semantic Relationships between Genes
Workshop: Semantic Web solutions for large-scale biomedical data analytics (SeWeBMeDA-2017)
Abstract: This paper presents an approach for extracting, integrating and mining the annotations from a large corpus of gene summaries. It includes: i) a method for extracting annotations from several ontologies, mapping them into concepts and evaluating the semantic relatedness of genes, ii) the definition of a NoSQL graph database that leverages a loosely structured and multifaceted organization of data for storing concepts and their relationships, and iii) a mechanism to support the customized exploration of stored information. A prototype with a user-friendly interface fully enables users to visualize all concepts of their interest and to take advantage of their visualization for formulating biomedical hypotheses and discovering new knowledge.
11:00
Diego Reforgiato Recupero
Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Document Clustering
Workshop: Semantic Web solutions for large-scale biomedical data analytics (SeWeBMeDA-2017)
Abstract: Nowadays, there are plenty of text documents in different domains whose unstructured content makes them hard to analyze automatically. In the medical domain, this problem is even more pronounced and is attracting more and more attention. Medical reports may contain relevant information that can be employed, among many useful applications, to build predictive systems able to classify new medical cases, thus supporting physicians in taking more correct and reliable actions about diagnosis and care. Inferring information for comparing unstructured data and evaluating similarities between various resources is generally hard and time-consuming. In this work we show how it is possible to cluster medical reports, based on features detected by using two emerging tools, IBM Watson and Framester, from a collection of text documents. Experiments and results have proved the quality of the resulting clusterings and the key role that these services can play.
11:50 – 12:20
Andrea Giovanni Nuzzolese
ScholarlyData
Workshop: Scientometrics
Abstract: ScholarlyData is the new and currently the largest reference linked dataset of the Semantic Web community about papers, people, organisations, and events related to its academic conferences. Originally started from the Semantic Web Dog Food (SWDF), it addresses multiple issues on scholarly data representation and maintenance by (i) adopting a novel data model, (ii) establishing an open source workflow to support the addition of new data from the community and (iii) adopting an entity deduplication methodology. The novel data model consists of a new self-contained ontology, called the conference-ontology, which exploits good ontology design practices and Ontology Design Patterns (ODP) and is aligned to other relevant ontologies in the scholarly domain. The workflow is implemented in an open source tool called cLODg, which supports the production of metadata for conferences and scholarly data in nearly one click. Finally, the entity deduplication methodology relies on blocking techniques to narrow down a list of candidate duplicate URI pairs and exploits supervised classification methods to identify candidate duplicates.
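To make the idea of blocking more concrete, here is a minimal sketch in Python. It is not the ScholarlyData implementation: the blocking key (the lowercased last token of a label), the function names and the example URIs are all assumptions chosen for illustration. It only shows how grouping entities by a cheap key narrows down the candidate URI pairs that a classifier would later have to examine.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(label: str) -> str:
    """Hypothetical blocking key: the lowercased last token of the label."""
    tokens = label.lower().split()
    return tokens[-1] if tokens else ""


def candidate_pairs(entities: dict[str, str]) -> list[tuple[str, str]]:
    """Return the URI pairs that fall into the same block.

    `entities` maps a URI to a human-readable label (assumed input shape).
    """
    blocks = defaultdict(list)
    for uri, label in entities.items():
        blocks[blocking_key(label)].append(uri)

    pairs = []
    for uris in blocks.values():
        # Compare URIs only within a block, never across the whole dataset.
        pairs.extend(combinations(sorted(uris), 2))
    return sorted(pairs)


# Illustrative data only; these URIs are made up.
people = {
    "ex:person/a-nuzzolese": "Andrea Nuzzolese",
    "ex:person/andrea-giovanni-nuzzolese": "Andrea Giovanni Nuzzolese",
    "ex:person/v-presutti": "Valentina Presutti",
}
print(candidate_pairs(people))
```

With three entities there are three possible pairs, but only the two URIs sharing a block are kept as a candidate pair. A supervised classifier, as mentioned in the abstract, would then decide whether each remaining pair is a true duplicate.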
12:00
Diego Reforgiato Recupero
Bearish – Bullish Sentiment Analysis on Financial Microblogs
Workshop: Semantic Sentiment Analysis Workshop
Abstract: User-generated data in blogs and social networks has recently become a valuable resource for sentiment analysis in the financial domain since it has been shown to be extremely significant to marketing research companies and public opinion organizations. In this paper a fine-grained approach is proposed to predict a real-valued sentiment score. We use several feature sets consisting of lexical features, semantic features and a combination of lexical and semantic features. To evaluate our approach a dataset of microblog messages is used. Since our dataset includes confidence scores of real numbers within the [0-1] range, we compare the performance of two learning methods: Random Forest and SVR. We test the results of the training model boosted by semantics against classification results obtained by n-grams. Our results indicate that our approach achieves an accuracy of more than 72% in some cases.
14:30
Diego Reforgiato Recupero
MORE SENSE: MOvie REviews SENtiment analysis boosted with SEmantics
Workshop: Semantic Sentiment Analysis Workshop
Abstract: Sentiment analysis is becoming one of the most active areas in Natural Language Processing nowadays. Its importance coincides with the growth of social media and the open space they create for expressing opinions and emotions via reviews, forum discussions, microblogs, Twitter and social networks. Most of the existing approaches to sentiment analysis rely mainly on the presence of affect words that explicitly reflect sentiment. However, these approaches are semantically weak, that is, they do not take into account the semantics of words when detecting their sentiment in text. Only recently have a few approaches (e.g. sentic computing) started investigating this direction. Following this trend, this paper investigates the role of semantics in sentiment analysis of movie reviews. To this end, frame semantics and lexical resources such as BabelNet are employed to extract semantic features from movie reviews that lead to more accurate sentiment analysis models. Experiments are conducted with different types of semantic information by assessing their impact on a movie reviews dataset. A 10-fold cross-validation shows that the F1 measure increases slightly when using semantics in sentiment analysis in social media. Results show that the proposed approach, which considers word semantics for sentiment analysis, is a promising direction.
May 29th
10:00-10:15
Alessandro Russo
Knowledge-driven Support for Reminiscence on Companion Robots
Presentation at the 1st International Workshop on Application of Semantic Web technologies in Robotics (AnSWeR)
Abstract: “We present our work towards the development of an application for personalized reminiscence therapy in people with dementia.
The reminiscence process aims at recalling personal memories by combining user-specific knowledge, dialogue-based human-robot interaction and multimedia content.
The application is part of a robotic software framework for companion robots, investigated in the EU MARIO project and under evaluation in different dementia care settings.”
10:15 – 10:30
Andrea Giovanni Nuzzolese
A Knowledge Management System for Assistive Robotics
Workshop: Application of Semantic Web technologies in Robotics – AnSWeR
Abstract: In this paper we demonstrate how to use an ontology network in order to fill the gap between knowledge and robots’ abilities. The demonstration is focused on two components of the knowledge management system of MARIO robots. These components are the MARIO Ontology Network (i.e. MON), which organises knowledge in MARIO, and an Object-RDF mapper, called Lizard, which dynamically generates APIs on top of the MON to enable the interaction between software components that implement robots’ abilities and the MON itself.
11:50 – 12:05
Luigi Asprino
Autonomous Comprehensive Geriatric Assessment
Workshop: Application of Semantic Web technologies in Robotics – AnSWeR
Abstract: In this paper we present MARIO’s CGA module for the Kompaï platform, which aims at autonomously performing and managing the execution of specific tests required in the CGA process. The application relies on the CGA ontology, which is part of the MARIO Ontology Network (MON), as a support to the execution of the assessment process and a reference model for storing test information.
May 30th
11:45 – 12:15
Diego Reforgiato Recupero
Presentation at the Semantic Sentiment Analysis Challenge
Abstract: The Semantic Sentiment Analysis Challenge looks for systems that can transform unstructured textual information to structured machine processable data in any domain by using recent advances in natural language processing, sentiment analysis and semantic web. By relying on large semantic knowledge bases, Semantic Web best practices and techniques, and new lexical resources, semantic sentiment analysis steps away from blind use of keywords and simple statistical analysis based on syntactical rules, and instead relies on the implicit, semantic features associated with natural language concepts. The challenge, now at its fourth edition, has been successfully executed during ESWC and encourages researchers and industries to apply their systems for several tasks within the domain of sentiment analysis.
14:00 – 14:30
Andrea Giovanni Nuzzolese
Entity Deduplication on ScholarlyData
Main Conference: Track Linked Data
Abstract: ScholarlyData is the new and currently the largest reference linked dataset of the Semantic Web community about papers, people, organisations, and events related to its academic conferences. Originally started from the Semantic Web Dog Food (SWDF), it addressed multiple issues on data representation and maintenance by (i) adopting a novel data model and (ii) establishing an open source workflow to support the addition of new data from the community. Nevertheless, the major issue with the current dataset is the presence of multiple URIs for the same entities, typically in persons and organisations. In this work we: (i) perform entity deduplication on the whole dataset, using supervised classification methods; (ii) devise a protocol to choose the most representative URI for an entity and deprecate duplicated ones, while ensuring backward compatibility for them; (iii) incorporate the automatic deduplication step in the general workflow to reduce the creation of duplicate URIs when adding new data. Our early experiment focused on the person and organisation URIs, and results show significant improvement over state-of-the-art solutions. We managed to consolidate, on the entire dataset, over 100 and 800 pairs of duplicate person and organisation URIs and their associated triples (over 1,800 and 5,000) respectively, hence significantly improving the overall quality and connectivity of the data graph. We believe that this work, integrated into the ScholarlyData data publishing workflow, serves as a major step towards the creation of clean, high-quality scholarly linked data on the Semantic Web.
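As a rough illustration of step (ii), the sketch below groups pairs of duplicate URIs into clusters with a union-find structure and maps every URI to one representative. The abstract does not describe the actual selection protocol, so the rule used here (keep the lexicographically smallest URI) is purely a placeholder assumption, as are the function name and the example URIs.

```python
def consolidate(pairs):
    """Map every URI appearing in `pairs` to a representative URI.

    `pairs` is an iterable of (uri_a, uri_b) tuples already judged to be
    duplicates. Assumed rule: the lexicographically smallest URI in each
    cluster is kept as the representative.
    """
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep  # the smaller root stays the representative

    return {uri: find(uri) for uri in parent}


# Illustrative data only; these URIs are made up.
duplicates = [
    ("ex:org/cnr-istc", "ex:org/istc-cnr"),
    ("ex:org/istc-cnr", "ex:org/istc"),
]
mapping = consolidate(duplicates)
print(mapping)
```

Keeping the full old-to-representative mapping, rather than simply deleting the duplicate URIs, is one way a deprecated URI could still be resolved to its representative, in the spirit of the backward compatibility mentioned in the abstract.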