During the Data Mining and Opinion Dynamics workshop organised by the University of Paris-Diderot, the Penelope platform was presented to the stakeholders of the Odycceus Project, including Agence France Presse (AFP), the Sciences Po Medialab, and the Institut National de l’Audiovisuel (INA). New Penelope components, including Twitter, Reddit, 4chan and 8chan data acquisition tools, have been announced!
A new Penelope component, called Semantic Frame Extractor, is now online!
Frame semantics is commonly used as a methodology for representing the meaning of linguistic utterances. While semantic frames have successfully been formalised on a large scale, it is still a major challenge to automatically extract them from raw text. This Penelope component overcomes this challenge by using precision language processing techniques. Concretely, the component takes a sentence (or a list of texts) and a frame of interest (e.g. ‘Causation’) as input and returns all instances of this frame, and its frame elements, that occur in the sentence (or list of texts). The language processing part of the semantic frame extractor has been developed within the Fluid Construction Grammar (FCG) framework.
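To make the input and output described here more concrete, the following is a minimal sketch of what one extracted frame instance could look like as a data structure. The class name `FrameInstance`, its fields, and the helper `describe` are hypothetical and chosen for illustration only; the example sentence is annotated by hand and is not output produced by the component. The frame element names `Cause` and `Effect` follow the FrameNet ‘Causation’ frame.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FrameInstance:
    """One occurrence of a semantic frame in a sentence (illustrative shape only)."""
    frame: str                      # name of the frame of interest, e.g. "Causation"
    evoking_element: str            # the word that evokes the frame in the sentence
    elements: Dict[str, str] = field(default_factory=dict)  # frame element -> text span


def describe(inst: FrameInstance) -> str:
    """Render a frame instance as a single readable line."""
    parts = ", ".join(f"{role}={text!r}" for role, text in inst.elements.items())
    return f"{inst.frame} [{inst.evoking_element}]: {parts}"


# Hand-annotated example, NOT produced by the Semantic Frame Extractor.
sentence = "Smoking causes cancer."
instance = FrameInstance(
    frame="Causation",
    evoking_element="causes",
    elements={"Cause": "Smoking", "Effect": "cancer"},
)

print(describe(instance))
```

A sentence containing several causal expressions would simply yield several such instances, one per occurrence of the frame.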
The OpenAPI specification of the component is available at https://app.swaggerhub.com/apis/EHAI/Semantic-Frame-Extractor-API/1.0.0. Like all components, it can be used from any programming language or via Penelope interfaces such as the Penelope Workbench.
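As a rough illustration of calling the component from a programming language, the sketch below prepares an HTTP request in Python using only the standard library. The base URL, the route `/extract-frames`, and the JSON field names `utterance` and `frames` are assumptions made for this example; the authoritative server address, routes and schemas are those given in the OpenAPI specification linked above. The code only builds the request and does not send it.

```python
import json
import urllib.request

# ASSUMPTION: placeholder address, not the real server. Take the actual
# base URL and route from the OpenAPI specification.
BASE_URL = "https://example.org/semantic-frame-extractor"


def build_extraction_request(sentence: str, frame: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request asking for one frame in one sentence."""
    # ASSUMPTION: the field names "utterance" and "frames" are hypothetical.
    payload = {"utterance": sentence, "frames": [frame]}
    return urllib.request.Request(
        url=BASE_URL + "/extract-frames",          # ASSUMPTION: hypothetical route
        data=json.dumps(payload).encode("utf-8"),  # JSON body as bytes
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_extraction_request("Smoking causes cancer.", "Causation")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Once the real address and schema have been filled in, the prepared request can be passed to `urllib.request.urlopen` and the response body decoded with `json.loads`.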
Paul Van Eecke and Katrien Beuls (EHAI – VUB) gave an invited talk at the Computational Linguistics Colloquium of the Abteilung für Computerlinguistik of the HHU Düsseldorf. During this talk, they demonstrated how their Fluid Construction Grammar-based semantic frame extractor, which is available as a Penelope component, can be used to analyse causal structures in online newspaper media.