CORDIS - EU research results

Cross-Lingual Embeddings for Less-Represented Languages in European News Media

CORDIS provides links to public deliverables and publications of HORIZON projects.

Links to deliverables and publications of FP7 projects, as well as links to some specific result types such as datasets and software, are dynamically retrieved from OpenAIRE.

Deliverables

Final context-dependent and dynamic embeddings technology (T1.2)

Context-aware cross-lingual embeddings which will enable improved understanding of short texts such as user comments in the context of an emerging comment thread and the news story being commented on (report and source code) (T1.2).

Initial cross-lingual and multilingual embeddings technology (T1.1)

Initial embeddings and transformations between a selection of all targeted languages (Estonian, Finnish, Swedish, Latvian, Lithuanian, Croatian, Slovene, English, Russian) (report and source code) (T1.1)

Initial cross-lingual semantic enrichment technology (T2.1)

Initial approach to named entity (NE) extraction and disambiguation and event detection, covering multiple domains and languages (report and source code) (T2.1).

Datasets, benchmarks and evaluation metrics for cross-lingual content analysis (T4.4)

Gathering and preprocessing training and testing data (Estonian, Latvian, Lithuanian, Russian, Croatian, Finnish and English) provided by the media partners (report and dataset) (T4.4).

Initial deep network architecture (T1.3)

Deep neural networks will be adapted to morphologically rich languages by using character-level inputs and additional information on morphology (suffixes, prefixes, separately trained POS tags) (report and source code) (T1.3).

Interim report on ethics and responsible science and journalism (T6.5)

Interim report on ethics and responsible science and journalism, with analysis of news production and new tool development (T6.5).

Final evaluation report on cross-lingual user generated content filtering and analysis technology (T3.4)

Producing datasets for evaluation and development of algorithms (T3.4).

Final dynamic multilingual news generation technology (T5.2)

Development of a novel method for automatically organising news articles to be maximally informative to the assumed reader (report and source code) (T5.2).

Final cross-lingual news viewpoints identification technology (T4.3)

Development of methods for detecting viewpoints and sentiments based on media sources (report and source code) (T4.3).

Final real-time multilingual news linking technology (T4.1)

Development of tools for linking news stories across languages based on their topics and contents (report and source code) (T4.1).

Final evaluation report on cross-lingual content analysis technology (T4.4)

All tools developed in WP4 will be evaluated using the produced datasets and manual user evaluation (T4.4).

Final report on ethics and responsible science and journalism (T6.5)

Final report on ethics and responsible science and journalism (T6.5).

Initial interpretability and visualisation technology (T1.4)

Initial approaches to explanation of deep learning models by adaptation of perturbation-based explanation methods based on coalitional game theory to text classification, and initial development of visual tools for explaining the classification process (report and source code) (T1.4).

Final technology for multilingual and self-explainable news generation (T5.1)

Based on the analysis of newsrooms (WP6), the NLG technology will be adapted for the requirements of news generation. The task will develop mechanisms for (i) determining what is interesting or important in the given data and deciding what to report, and for (ii) rendering that information in an accurate manner (iii) in multiple languages (report and source code) (T5.1).

Final evaluation report on cross-lingual embedding technology (T1.5)

Report on evaluation of the cross-lingual and multilingual embeddings on public datasets and challenges (T1.5).

Initial context-dependent and dynamic embeddings technology (T1.2)

Context-aware cross-lingual embeddings which will enable improved understanding of short texts such as user comments in the context of an emerging comment thread and the news story being commented on (report and source code) (T1.2).

Report on user needs and challenges for news media industry (T6.1)

Initial report on identification and analysis of the needs of different stakeholders in the news media industry. We will arrange a workshop to identify in detail the challenges that are specific to the operations of the different media partners and to prepare specifications documentation (T6.1).

Recommendations on avoiding gender and other biases (T6.4)

The means to avoid and detect gender and other biases in news media content creation will be developed in T6.4. This deliverable will propose recommendations for avoiding gender bias (T6.4).

Final interpretability and visualisation technology (T1.4)

Adaptation of the three most popular perturbation-based explanation methods based on coalitional game theory (IME, LIME and SHAP) to be suitable for text classification, and development of visualisation techniques where different explanatory lexical units in the source texts (words, n-grams, sentences) are visualized (report and source code) (T1.4).

Initial cross-lingual context and opinion analysis technology (T3.1)

Report on initial developed technology for a range of user comment analyses, including topic modelling, conversation structure and context modelling, sentiment, stance and opinion detection and effect and information spread measurement (report and source code) (T3.1).

Final report on gender bias in content creation (T6.4)

Final report on gender bias in content creation (T6.4).

Reusable EMBEDDIA components available through the ClowdFlows web interface (T7.4)

Developed tools and procedures will be incorporated as widgets and made available beyond the media context to assure reusability and repeatability of experiments (report and source code) (T7.4).

Initial multilingual news linking technology (T4.1)

Development of initial tools for linking news stories across languages based on their topics and contents (report and source code) (T4.1).

Initial keyword extraction techniques (T2.2)

Initial keyword extraction by application of statistical approaches (based on heuristics), machine learning approaches, as well as graph-based approaches (report and source code) (T2.2).

Final cross-lingual news summarisation and visualisation technology (T4.2)

Development of textual and visual language-independent multi-document news summarisation (report and source code) (T4.2).

Initial dynamic news generation technology (T5.2)

Development of a novel method for automatically organising news articles, considering the domain of the article, effects of time and news repetition (report and source code) (T5.2).

Refined analysis of news media partners’ needs and challenges (T6.1)

Refined report of news media partners’ needs and challenges and their analysis with regard to the state of the art in NLP for news media (T6.1).

Final cross-lingual and multilingual embeddings technology (T1.1)

Embeddings and transformations between all targeted languages, including Estonian, Finnish, Swedish, Latvian, Lithuanian, Croatian and Slovene, as well as English and Russian (report and source code) (T1.1).

Report generator from multilingual comments (T3.3)

Report on developed and implemented methods for generating human-readable reports in multiple languages from the outputs of the methods developed in T3.1 and T3.2 (report and source code) (T3.3).

Datasets, benchmarks and evaluation metrics for cross-lingual user generated content filtering and analysis (T3.4)

Evaluation and development of algorithms requires relevant, annotated, and multilingual datasets (report and dataset) (T3.4).

Final evaluation report on advanced cross-lingual NLP technology (T2.4)

Final report on existing evaluation datasets and benchmarks for NER, NEL and event detection (for instance, ACE, Meantime and TAC KBP’s Entity Discovery and Linking tasks) (report and dataset) (T2.4).

Final deep network architecture (T1.3)

Deep neural networks will be adapted to morphologically rich languages by using character-level inputs and additional information on morphology (suffixes, prefixes, separately trained POS tags) (report and source code) (T1.3).

Multilingual language generation approach (T2.3)

Incorporating hybrid techniques in the architecture, to take advantage of the robustness of machine learning techniques and transparency of rule-based techniques. Adaptation of the context-aware word-embeddings developed in T1.2 to improve fluency and variability in the generated texts (report and source code) (T2.3).

Final multilingual keyword extraction techniques (T2.2)

Application and further development of statistical approaches (based on heuristics), machine learning approaches, as well as graph-based approaches (report and source code) (T2.2).

Initial news generation technology (T5.1)

Based on the analysis of newsrooms (WP6), the NLG technology will be adapted for the requirements of news generation. The task will develop mechanisms for (i) determining what is interesting or important in the given data and deciding what to report, and for (ii) rendering that information in an accurate manner (iii) in multiple languages (report and source code) (T5.1).

Final report on EMBEDDIA Assistant platform evaluation (T6.3)

Final report on EMBEDDIA Assistant platform evaluation by media partners (T6.3).

Platform requirements documentation and platform design (T6.2)

The EMBEDDIA Toolkit will incorporate different tools and resources developed in WP1–WP5 and on top of it build the EMBEDDIA Media Assistant platform. The platform will be built as a series of base microservices, functional microservices and task oriented APIs. This deliverable will report on platform requirements and platform design (T6.2).

Final cross-lingual comment filtering technology (T3.2)

Final report on developed tools for automatic flagging or filtering of user comments, specifically targeted at the use cases defined by end user partners in WP6, e.g., detection of hate speech and political trolling, attempts to elicit extreme reactions and influence others’ opinions (report and source code) (T3.2).

Initial cross-lingual news viewpoints identification technology (T4.3)

Initial approaches for detecting viewpoints and sentiments based on media sources (report and source code) (T4.3).

Final cross-lingual semantic enrichment technology (T2.1)

Generalization of approaches to multiple domains and languages, large scale corpora, and integrating cross-lingual embeddings (report and source code) (T2.1).

Creative multilingual technology for news and headline generation (T5.3)

We will make the generated texts more varied and colourful by generating creative expressions, especially in headlines (report and source code) (T5.3).

Final cross-lingual context and opinion analysis technology (T3.1)

Final report on developed technology for a range of user comment analyses, including topic modelling, conversation structure and context modelling, sentiment, stance and opinion detection and effect and information spread measurement (report and source code) (T3.1).

Datasets, benchmarks and evaluation metrics for advanced cross-lingual NLP technology (T2.4)

Report on existing evaluation datasets and benchmarks for NER, NEL and event detection (for instance, ACE, Meantime and TAC KBP’s Entity Discovery and Linking tasks) (report and dataset) (T2.4).

Initial cross-lingual comment filtering technology (T3.2)

Report on developed tools for automatic flagging or filtering of user comments, specifically targeted at the use cases defined by end user partners in WP6, e.g., detection of hate speech and political trolling, attempts to elicit extreme reactions and influence others’ opinions (report and source code) (T3.2).

Datasets, benchmarks and evaluation metrics for multilingual text generation (T5.4)

Texts (news stories) and structured datasets from which news can be generated will be collected from news partners, and a methodology for evaluation will be defined (report and datasets) (T5.4).

Selected EMBEDDIA components in ClowdFlows (T7.4)

Initial selection of tools and procedures incorporated as widgets in the web-based platform ClowdFlows to make them available beyond the media context and assure reusability and repeatability of experiments (report and source code) (T7.4).

Initial cross-lingual news summarisation and visualisation technology (T4.2)

Development of textual and visual language-independent multi-document news summarisation (report and source code) (T4.2).

Final evaluation report on multilingual text generation technology (T5.4)

Final evaluation report on multilingual text generation technology (T5.4).

Datasets, benchmarks and evaluation metrics for cross-lingual word embeddings (T1.5)

A repository of training and evaluation data, stored in a dedicated GitHub repository (report and datasets) (T1.5).

Final EMBEDDIA Media Assistant platform, packaged in docker container (T6.2)

Final EMBEDDIA Media Assistant platform incorporating different tools and resources, packaged in a Docker container (report and source code) (T6.2).

Project website and social media accounts (T7.1)

A project website, which will function both as a project dissemination tool and as an access point to the technical outcomes produced by the project, and social media accounts/pages on relevant social networks will be created (T7.1).

Publications

To BAN or Not to BAN: Bayesian Attention Networks for Reliable Hate Speech Detection

Authors: Kristian Miok, Blaž Škrlj, Daniela Zaharie, Marko Robnik-Šikonja
Published in: Cognitive Computation, 2021, ISSN 1866-9956
Publisher: Springer Verlag
DOI: 10.1007/s12559-021-09826-9

Cross-lingual alignments of ELMo contextual embeddings

Authors: Ulčar, Matej; Robnik-Šikonja, Marko
Published in: Neural Computing and Applications, Issue 3, 2022, ISSN 0941-0643
Publisher: Springer Verlag
DOI: 10.1007/s00521-022-07164-x

NeSyChair: Automatic Conference Scheduling Combining Neuro-Symbolic Representations and Constrained Clustering

Authors: Škvorc, Tadej; Lavrač, Nada; Robnik-Šikonja, Marko
Published in: IEEE Access, Issue 10, 2022, ISSN 2169-3536
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/ACCESS.2022.3144932

autoBOT: evolving neuro-symbolic representations for explainable low resource text classification

Authors: Blaž Škrlj, Matej Martinc, Nada Lavrač, Senja Pollak
Published in: Machine Learning, 2021, ISSN 0885-6125
Publisher: Kluwer Academic Publishers
DOI: 10.1007/s10994-021-05968-x

MICE: Mining Idioms with Contextual Embeddings

Authors: Škvorc, Tadej; Gantar, Polona; Robnik-Šikonja, Marko
Published in: Knowledge-Based Systems, Issue 237, 2022, ISSN 0950-7051
Publisher: Elsevier BV
DOI: 10.1016/j.knosys.2021.107606

Zero-Shot Learning for Cross-Lingual News Sentiment Classification

Authors: Andraž Pelicon, Marko Pranjić, Dragana Miljković, Blaž Škrlj, Senja Pollak
Published in: Applied Sciences, Issue 10/17, 2020, Page(s) 5993, ISSN 2076-3417
Publisher: MDPI
DOI: 10.3390/app10175993

Supervised and Unsupervised Neural Approaches to Text Readability

Authors: Matej Martinc; Senja Pollak; Marko Robnik-Šikonja
Published in: Computational Linguistics, Issue 47.1, 2021, Page(s) 141-179, ISSN 0891-2017
Publisher: MIT Press
DOI: 10.1162/coli_a_00398

Nazaj v prihodnost: avtomatizacija in preobrazba novinarske epistemologije

Authors: Igor Vobič, Marko Robnik Šikonja, Monika Kalin Golob
Published in: Javnost - The Public, Issue 26/sup1, 2019, Page(s) S41-S61, ISSN 1318-3222
Publisher: European Institute for Communication and Culture
DOI: 10.1080/13183222.2019.1696600

What makes a reporter human? A Research Agenda for Augmented Journalism

Authors: Lindén, Carl-Gustav
Published in: Questions de communication, 2020, ISSN 2259-8901
Publisher: Presses universitaires de Lorraine
DOI: 10.4000/questionsdecommunication.23301

Cross-lingual Transfer of Sentiment Classifiers

Authors: Robnik-Šikonja, Marko; Reba, Kristjan; Mozetič, Igor
Published in: Slovenščina 2.0, Issue 9(1), 2021, Page(s) 1-25, ISSN 2335-2736
Publisher: Ljubljana University Press, Faculty of Arts
DOI: 10.4312/slo2.0.2021.1.1-25

Completability vs (In)completeness

Authors: Eleni Gregoromichelaki, Gregory James Mills, Christine Howes, Arash Eshghi, Stergios Chatzikyriakidis, Matthew Purver, Ruth Kempson, Ronnie Cann, Patrick G. T. Healey
Published in: Acta Linguistica Hafniensia, Issue 52/2, 2020, Page(s) 260-284, ISSN 0374-0463
Publisher: Nordisk Sprog- og Kulturforlag
DOI: 10.1080/03740463.2020.1795549

TNT-KID: Transformer-based neural tagger for keyword identification

Authors: Matej Martinc, Blaž Škrlj, Senja Pollak
Published in: Natural Language Engineering, 2021, Page(s) 1-40, ISSN 1351-3249
Publisher: Cambridge University Press
DOI: 10.1017/s1351324921000127

Investigating cross-lingual training for offensive language detection

Authors: Andraž Pelicon, Ravi Shekhar, Blaž Škrlj, Matthew Purver, Senja Pollak
Published in: PeerJ Computer Science, Issue 7, 2021, Page(s) e559, ISSN 2376-5992
Publisher: PeerJ Publishing
DOI: 10.7717/peerj-cs.559

Journalistic Passion as Commodity: A Managerial Perspective

Authors: Carl-Gustav Lindén; Katja Lehtisaari; Mikko Grönlund; Mikko Villi
Published in: Journalism Studies, Issue 22(12), 2021, Page(s) 1701-1719, ISSN 1461-670X
Publisher: Routledge
DOI: 10.1080/1461670x.2021.1911672

Re-Representing Metaphor: Modeling Metaphor Perception Using Dynamically Contextual Distributional Semantics

Authors: Stephen McGregor, Kat Agres, Karolina Rataj, Matthew Purver, Geraint Wiggins
Published in: Frontiers in Psychology, Issue 10, 2019, ISSN 1664-1078
Publisher: Frontiers Research Foundation
DOI: 10.3389/fpsyg.2019.00765

Towards Robust Text Classification with Semantics-Aware Recurrent Neural Architecture

Authors: Blaž Škrlj, Jan Kralj, Nada Lavrač, Senja Pollak
Published in: Machine Learning and Knowledge Extraction, Issue 1/2, 2019, Page(s) 575-589, ISSN 2504-4990
Publisher: MDPI AG
DOI: 10.3390/make1020034

Predicting Slovene Text Complexity Using Readability Measures

Authors: Tadej Škvorc, Simon Krek, Senja Pollak, Špela Arhar Holdt, Marko Robnik-Šikonja
Published in: In Contributions to Contemporary History, 2019, ISSN 2463-7807
Publisher: OJS/PKP

Combining n-grams and deep convolutional features for language variety classification

Authors: Matej Martinc, Senja Pollak
Published in: Natural Language Engineering, Issue 25/5, 2019, Page(s) 607-632, ISSN 1351-3249
Publisher: Cambridge University Press
DOI: 10.1017/S1351324919000299

TermEnsembler

Authors: Andraž Repar, Vid Podpečan, Anže Vavpetič, Nada Lavrač, Senja Pollak
Published in: Terminology, Issue 25/1, 2019, Page(s) 93-120, ISSN 0929-9971
Publisher: John Benjamins Publishing Company
DOI: 10.1075/term.00029.rep

Reproduction, replication, analysis and adaptation of a term alignment approach

Authors: Andraž Repar, Matej Martinc, Senja Pollak
Published in: Language Resources and Evaluation, 2019, ISSN 1574-020X
Publisher: Springer Verlag
DOI: 10.1007/s10579-019-09477-1

‘Our task is to demystify fears’: Analysing newsroom management of automation in journalism

Authors: Marko Milosavljević, Igor Vobič
Published in: Journalism, 2019, Page(s) 146488491986159, ISSN 1464-8849
Publisher: SAGE Publications
DOI: 10.1177/1464884919861598

Methods and visualization tools for the analysis of medical, political and scientific concepts in Genealogies of Knowledge

Authors: Saturnino Luz, Shane Sheehan
Published in: Palgrave Communications, Issue 6/1, 2020, ISSN 2055-1045
Publisher: Humanities and Social Sciences Communications
DOI: 10.1057/s41599-020-0423-6

Exploring the Relations Between Net Benefits of IT Projects and CIOs’ Perception of Quality of Software Development Disciplines

Authors: Damjan Vavpotič, Marko Robnik-Šikonja, Tomaž Hovelja
Published in: Business & Information Systems Engineering, 2019, ISSN 2363-7005
Publisher: Springer Gabler
DOI: 10.1007/s12599-019-00612-4

Data Journalism as a Service: Digital Native Data Journalism Expertise and Product Development

Authors: Ester Appelgren, Carl-Gustav Lindén
Published in: Media and Communication, Issue 8/2, 2020, Page(s) 62, ISSN 2183-2439
Publisher: Cogitatio
DOI: 10.17645/mac.v8i2.2757

How Furiously Can Colorless Green Ideas Sleep? Sentence Acceptability in Context

Authors: Jey Han Lau, Carlos Armendariz, Shalom Lappin, Matthew Purver, Chang Shu
Published in: Transactions of the Association for Computational Linguistics, Issue 8, 2020, Page(s) 296-310, ISSN 2307-387X
Publisher: The MIT Press
DOI: 10.1162/tacl_a_00315

Computational generation of slogans

Authors: Khalid Alnajjar, Hannu Toivonen
Published in: Natural Language Engineering, 2020, Page(s) 1-33, ISSN 1351-3249
Publisher: Cambridge University Press
DOI: 10.1017/S1351324920000236

In the Name of the Right to be Forgotten: New Legal and Policy Issues and Practices regarding Unpublishing Requests in Slovenian Online News Media

Authors: Marko Milosavljević, Melita Poler, Rok Čeferin
Published in: Digital Journalism, 2020, Page(s) 1-17, ISSN 2167-0811
Publisher: Taylor & Francis
DOI: 10.1080/21670811.2020.1747942

(Mis)Information Operations: An Integrated Perspective

Authors: Cinelli, Matteo; Conti, Mauro; Finos, Livio; Grisolia, Francesco; Kralj Novak, Petra; Peruzzi, Antonio; Tesconi, Maurizio; Zollo, Fabia; Quattrociocchi, Walter
Published in: Journal of Information Warfare, Issue 18(3), 2020, ISSN 1445-3312
Publisher: Mt. Eliza : Teamlink Australia

A Multilingual Study of Multi-Sentence Compression using Word Vertex-Labeled Graphs and Integer Linear Programming

Authors: Linhares Pontes, Elvys; Huet, Stéphane; Torres Moreno, Juan Manuel; Gouveia da Silva, Thiago; Carneiro Linhares, Andréa
Published in: Computación y Sistemas, Issue 24(2), 2020, ISSN 1405-5546
Publisher: Centro de Investigacion en Computacion (CIC) del Instituto Politecnico Nacional (IPN)

Automated Journalism as a Source of and a Diagnostic Device for Bias in Reporting

Authors: Leo Leppänen, Hanna Tuulonen, Stefanie Sirén-Heikel
Published in: Media and Communication, Issue 8/3, 2020, Page(s) 39, ISSN 2183-2439
Publisher: Cogitatio
DOI: 10.17645/mac.v8i3.3022

tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification

Authors: Blaž Škrlj, Matej Martinc, Jan Kralj, Nada Lavrač, Senja Pollak
Published in: Computer Speech & Language, Issue 65, 2021, Page(s) 101104, ISSN 0885-2308
Publisher: Academic Press
DOI: 10.1016/j.csl.2020.101104

Knowledge Graph informed Fake News Classification via Heterogeneous Representation Ensembles

Authors: Koloski, Boshko; Stepišnik-Perdih, Timen; Robnik-Šikonja, Marko; Pollak, Senja; Škrlj, Blaž
Published in: Neurocomputing journal, 2022, ISSN 0925-2312
Publisher: Elsevier BV
DOI: 10.1016/j.neucom.2022.01.096

Cross-lingual transfer of abstractive summarizer to less-resource language

Authors: Aleš Žagar, Marko Robnik-Šikonja
Published in: Journal of Intelligent Information Systems, 2021, ISSN 0925-9902
Publisher: Kluwer Academic Publishers
DOI: 10.1007/s10844-021-00663-8

Bisociative Literature-Based Discovery: Lessons Learned and New Word Embedding Approach

Authors: Nada Lavrač, Matej Martinc, Senja Pollak, Maruša Pompe Novak, Bojan Cestnik
Published in: New Generation Computing, Issue 38/4, 2020, Page(s) 773-800, ISSN 0288-3635
Publisher: Springer Verlag
DOI: 10.1007/s00354-020-00108-w

Propositionalization and embeddings: two sides of the same coin

Authors: Nada Lavrač; Blaž Škrlj; Marko Robnik-Šikonja
Published in: Machine Learning, Issue 109, 2020, ISSN 0885-6125
Publisher: Kluwer Academic Publishers
DOI: 10.1007/s10994-020-05890-8

Automating News Comment Moderation with Limited Resources: Benchmarking in Croatian and Estonian

Authors: Shekhar, Ravi; Pranjić, Marko; Pollak, Senja; Pelicon, Andraž; Purver, Matthew
Published in: Journal for Language Technology and Computational Linguistics, Issue 2, 2020, Page(s) 49-79, ISSN 2190-6858
Publisher: German Society for Computational Linguistics and Language Technology (GSCL)
DOI: 10.5281/zenodo.4032371

Enhancing deep neural networks with morphological information

Authors: Klemen, Matej; Krsnik, Luka; Robnik-Šikonja, Marko
Published in: Natural Language Engineering, 2022, ISSN 1351-3249
Publisher: Cambridge University Press
DOI: 10.1017/S1351324922000080

Slovene and Croatian word embeddings in terms of gender occupational analogies

Authors: Matej Ulčar, Anka Supej, Marko Robnik-Šikonja, Senja Pollak
Published in: Slovenščina 2.0: empirical, applied and interdisciplinary research, Issue 9/1, 2021, Page(s) 26-59, ISSN 2335-2736
Publisher: Ljubljana University Press, Faculty of Arts
DOI: 10.4312/slo2.0.2021.1.26-59

MELHISSA: A Multilingual Entity Linking Architecture for Historical Press Articles

Authors: Linhares Pontes, Elvys; Cabrera-Diego, Luis Adrian; Moreno, Jose G.; Boros, Emanuela; Hamdi, Ahmed; Doucet, Antoine; Sidere, Nicolas; Coustaty, Mickael
Published in: International Journal on Digital Libraries, 2021, ISSN 1432-1300
Publisher: Springer
DOI: 10.1007/s00799-021-00319-6

Recycling a genre for news automation

Authors: Lauri Haapanen, Leo Leppänen
Published in: AILA Review, Issue 33, 2020, Page(s) 67-85, ISSN 1461-0213
Publisher: John Benjamins Publishing Company
DOI: 10.1075/aila.00030.haa

Incremental Composition in Distributional Semantics

Authors: Matthew Purver, Mehrnoosh Sadrzadeh, Ruth Kempson, Gijs Wijnholds, Julian Hough
Published in: Journal of Logic, Language and Information, Issue 30/2, 2021, Page(s) 379-406, ISSN 0925-8531
Publisher: Kluwer Academic Publishers
DOI: 10.1007/s10849-021-09337-8

Kratt: Developing an Automatic Subject Indexing Tool for the National Library of Estonia

Authors: Asula, Marit; Makke, Jane; Freienthal, Linda; Kuulmets, Hele-Andra; Sirel, Raul
Published in: Cataloging & Classification Quarterly, Issue 59:8, 2021, Page(s) 775-793, ISSN 0163-9374
Publisher: Haworth Press Inc.
DOI: 10.1080/01639374.2021.1998283

SNoRe: Scalable Unsupervised Learning of Symbolic Node Representations

Authors: Sebastian Meznar, Nada Lavrac, Blaz Skrlj
Published in: IEEE Access, Issue 8, 2020, Page(s) 212568-212588, ISSN 2169-3536
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/access.2020.3039541

Token-Level Multilingual Epidemic Dataset for Event Extraction

Authors: Stephen Mutuvi, Emanuela Boros, Antoine Doucet, Gaël Lejeune, Adam Jatowt, Moses Odeo
Published in: Linking Theory and Practice of Digital Libraries - 25th International Conference on Theory and Practice of Digital Libraries, TPDL 2021, Virtual Event, September 13–17, 2021, Proceedings, Issue 12866, 2021, Page(s) 55-59, ISBN 978-3-030-86323-4
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-86324-1_6

Entity Linking for Historical Documents: Challenges and Solutions

Authors: Elvys Linhares Pontes, Luis Adrián Cabrera-Diego, Jose G. Moreno, Emanuela Boros, Ahmed Hamdi, Nicolas Sidère, Mickaël Coustaty, Antoine Doucet
Published in: Digital Libraries at Times of Massive Societal Transition - 22nd International Conference on Asia-Pacific Digital Libraries, ICADL 2020, Kyoto, Japan, November 30 – December 1, 2020, Proceedings, Issue 12504, 2020, Page(s) 215-231, ISBN 978-3-030-64451-2
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-64452-9_19

Prioritization of COVID-19-Related Literature via Unsupervised Keyphrase Extraction and Document Representation Learning

Authors: Blaž Škrlj, Marko Jukič, Nika Eržen, Senja Pollak, Nada Lavrač
Published in: Discovery Science - 24th International Conference, DS 2021, Halifax, NS, Canada, October 11–13, 2021, Proceedings, Issue 12986, 2021, Page(s) 204-217, ISBN 978-3-030-88941-8
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-88942-5_16

Identification of COVID-19 Related Fake News via Neural Stacking

Authors: Boshko Koloski, Timen Stepišnik-Perdih, Senja Pollak, Blaž Škrlj
Published in: Combating Online Hostile Posts in Regional Languages during Emergency Situation - First International Workshop, CONSTRAINT 2021, Collocated with AAAI 2021, Virtual Event, February 8, 2021, Revised Selected Papers, Issue 1402, 2021, Page(s) 177-188, ISBN 978-3-030-73695-8
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-73696-5_17

FinEst BERT and CroSloEngual BERT - Less Is More in Multilingual Models

Authors: Matej Ulčar, Marko Robnik-Šikonja
Published in: Text, Speech, and Dialogue - 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8–11, 2020, Proceedings, Issue 12284, 2020, Page(s) 104-111, ISBN 978-3-030-58322-4
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-58323-1_11

RaKUn: Rank-based Keyword Extraction via Unsupervised Learning and Meta Vertex Aggregation

Authors: Blaž Škrlj, Andraž Repar, Senja Pollak
Published in: Statistical Language and Speech Processing - 7th International Conference, SLSP 2019, Ljubljana, Slovenia, October 14–16, 2019, Proceedings, Issue 11816, 2019, Page(s) 311-323, ISBN 978-3-030-31371-5
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-31372-2_26

Language Comparison via Network Topology (öffnet in neuem Fenster)

Autoren: Blaž Škrlj, Senja Pollak
Veröffentlicht in: Statistical Language and Speech Processing - 7th International Conference, SLSP 2019, Ljubljana, Slovenia, October 14–16, 2019, Proceedings, Ausgabe 11816, 2019, Seite(n) 112-123, ISBN 978-3-030-31371-5
Herausgeber: Springer International Publishing
DOI: 10.1007/978-3-030-31372-2_10

Prediction Uncertainty Estimation for Hate Speech Classification (öffnet in neuem Fenster)

Autoren: Kristian Miok, Dong Nguyen-Doan, Blaž Škrlj, Daniela Zaharie, Marko Robnik-Šikonja
Veröffentlicht in: Statistical Language and Speech Processing - 7th International Conference, SLSP 2019, Ljubljana, Slovenia, October 14–16, 2019, Proceedings, Ausgabe 11816, 2019, Seite(n) 286-298, ISBN 978-3-030-31371-5
Herausgeber: Springer International Publishing
DOI: 10.1007/978-3-030-31372-2_24

Symbolic Graph Embedding Using Frequent Pattern Mining (öffnet in neuem Fenster)

Autoren: Blaž Škrlj, Nada Lavrač, Jan Kralj
Veröffentlicht in: Discovery Science - 22nd International Conference, DS 2019, Split, Croatia, October 28–30, 2019, Proceedings, Ausgabe 11828, 2019, Seite(n) 261-275, ISBN 978-3-030-33777-3
Herausgeber: Springer International Publishing
DOI: 10.1007/978-3-030-33778-0_21

EMBEDDIA Tools, Datasets and Challenges: Resources and Hackathon Contributions (öffnet in neuem Fenster)

Autoren: Pollak, Senja; Robnik-Šikonja, Marko; Purver, Matthew; Boggia, Michele; Shekhar, Ravi; Pranjić, Marko; Salmela, Salla; Krustok, Ivar; Paju, Tarmo; Linden, Carl-Gustav; Leppänen, Leo; Zosa, Elaine; Ulčar, Matej; Freienthal, Linda; Traat, Silver; Cabrera-Diego, Luis Adrián; Martinc, Matej; Lavrač, Nada; Škrlj, Blaž; Žnidaršič, Martin; Pelicon, Andraž; Koloski, Boshko; Podpečan, Vid; Kra
Veröffentlicht in: Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730464

EMBEDDIA hackathon report: Automatic sentiment and viewpoint analysis of Slovenian news corpus on the topic of LGBTIQ+ (öffnet in neuem Fenster)

Autoren: Martinc, Matej; Perger, Nina; Pelicon, Andraž; Ulčar, Matej; Vezovnik, Andreja; Pollak, Senja
Veröffentlicht in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730336

Exploring Neural Language Models via Analysis of Local and Global Self-Attention Spaces (öffnet in neuem Fenster)

Autoren: Škrlj, Blaž; Sheehan, Shane; Eržen, Nika; Robnik-Šikonja, Marko; Luz, Saturnino; Pollak, Senja
Veröffentlicht in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730396

Grammatical Profiling for Semantic Change Detection

Autoren: Giulianelli, Mario; Kutuzov, Andrey; Pivovarova, Lidia
Veröffentlicht in: In the Proceedings of the 25th Conference on Computational Natural Language Learning (CoNLL 2021), 2021, Seite(n) 423-434
Herausgeber: ACL

Cross-lingual Transfer of Twitter Sentiment Models Using a Common Vector Space (öffnet in neuem Fenster)

Autoren: Robnik-Šikonja, Marko; Reba, Kristijan; Mozetič, Igor
Veröffentlicht in: In Proceedings of the Conference on Language Technologies and Digital Humanities, JTDH2020, 2020, Seite(n) 87-92
Herausgeber: Institute of Contemporary History
DOI: 10.5281/zenodo.4059725

When a Computer Cracks a Joke: Automated Generation of Humorous Headlines

Autoren: Alnajjar, Khalid; Hämäläinen, Mika
Veröffentlicht in: In the Proceedings of the 12th International Conference on Computational Creativity (ICCC21), 2021, ISBN 978-989-54160-3-5
Herausgeber: Association for Computational Creativity

Knowledge graph aware text classification (öffnet in neuem Fenster)

Autoren: Petrželková, Nela; Škrlj, Blaž; Lavrač, Nada
Veröffentlicht in: In Proceedings of the 23rd International Multiconference – IS2020, 2020
Herausgeber: Jožef Stefan Institute
DOI: 10.5281/zenodo.4072961

Relation Classification via Relation Validation (öffnet in neuem Fenster)

Autoren: Moreno, Jose G.; Doucet, Antoine; Grau, Brigitte
Veröffentlicht in: Proceedings of the 6th Workshop on Semantic Deep Learning (SemDeep-6), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730492

Simple ways to improve NER in every language using markup (öffnet in neuem Fenster)

Autoren: Cabrera-Diego, Luis Adrián; Moreno, Jose G.; Doucet, Antoine
Veröffentlicht in: In Proceedings of ECIR 2021, 2021
Herausgeber: CEUR Workshops
DOI: 10.5281/zenodo.4680998

A bilingual approach to specialised adjectives through word embeddings in the karstology domain (öffnet in neuem Fenster)

Autoren: Grčić Simeunović, Larisa; Martinc, Matej; Vintar, Špela
Veröffentlicht in: In Proceedings of TOTH 2020, 2020
Herausgeber: Université Savoie Mont Blanc
DOI: 10.5281/zenodo.6435390

Using contextual and cross-lingual word embeddings to improve variety in template-based NLG for automated journalism (öffnet in neuem Fenster)

Autoren: Rämö, Miia; Leppänen, Leo
Veröffentlicht in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730334

Know your Neighbors: Efficient Author Profiling via Follower Tweets (öffnet in neuem Fenster)

Autoren: Koloski, Boško; Pollak, Senja; Škrlj, Blaž
Veröffentlicht in: Notebook for PAN at CLEF 2020, 2020
Herausgeber: CEUR-WS.org
DOI: 10.5281/zenodo.4059641

Corpus KAS 2.0: Cleaner and with New Datasets

Autoren: Žagar, Aleš; Kavaš, Matic; Robnik-Šikonja, Marko
Veröffentlicht in: In Proceedings of the 24th International Multiconference – IS2021 (Slovenian Conference on Artificial Intelligence), 2021
Herausgeber: Jožef Stefan Institute

Atténuer les erreurs de numérisation dans la reconnaissance d'entités nommées pour les documents historiques (öffnet in neuem Fenster)

Autoren: Emanuela Boros; Ahmed Hamdi; Elvys Linhares Pontes; Luis Adrián Cabrera-Diego; Jose G. Moreno; Nicolas Sidère; Antoine Doucet
Veröffentlicht in: Ausgabe 29, 2021
Herausgeber: l’Association Francophone de Recherche d’Information et Applications ARIA
DOI: 10.5281/zenodo.4734435

Automated Hate Speech Target Identification

Autoren: Pelicon, Andraž; Škrlj, Blaž; Kralj Novak, Petra
Veröffentlicht in: In Proceedings of the 24th International Multiconference – IS2021 (Slovenian Conference on Artificial Intelligence), 2021
Herausgeber: Jožef Stefan Institute

Slav-NER: the 3rd Cross-lingual Challenge on Recognition, Normalization, Classification, and Linking of Named Entities across Slavic languages (öffnet in neuem Fenster)

Autoren: Piskorski et al
Veröffentlicht in: In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing in conjunction with EACL2021, 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730512

Bayesian Methods for Semi-supervised Text Annotation

Autoren: Miok, Kristian; Pirš, Gregor; Robnik-Šikonja, Marko
Veröffentlicht in: In Proceedings of the 14th Linguistic Annotation Workshop Co-located with COLING 2020, Ausgabe 2, 2020
Herausgeber: Association for Computational Linguistics

Training dataset and dictionary sizes matter in BERT models: the case of Baltic languages

Autoren: Ulčar, Matej; Robnik-Šikonja, Marko
Veröffentlicht in: In the Proceedings of the 10th International Conference on Analysis of Images, Social Networks and Texts (AIST 2021), 2021
Herausgeber: Springer

Preliminary experimentation with combinations and extensions of forward-looking sentence detection wordlists

Autoren: Štihec, Jan; Pollak, Senja; Žnidaršič, Martin
Veröffentlicht in: In Proceedings of the 3rd financial narrative processing workshop, 2021
Herausgeber: Association for Computational Linguistics

Bayesian BERT for Trustful Hate Speech Detection

Autoren: Miok, Kristian; Škrlj, Blaž; Zaharie, Daniela; Robnik-Šikonja, Marko
Veröffentlicht in: ICML 2020 Workshop on Uncertainty & Robustness in Deep Learning, 2021
Herausgeber: ICML UDL

Underreporting of errors in NLG output, and what to do about it

Autoren: van Miltenburg, Emiel; Clinciu, Miruna; Dušek, Ondrej; Gkatzia, Dimitra; Inglis, Stephanie; Leppänen, Leo; Mahamood, Saad; Manning, Emma; Schoch, Stephanie; Thomson, Craig; Wen, Luou
Veröffentlicht in: In the Proceedings of the 14th International Conference on Natural Language Generation, 2021
Herausgeber: Association for Computational Linguistics

Primerjava slovenskih besednih vektorskih vložitev z vidika spola na analogijah poklicev

Autoren: Supej, Anka; Ulčar, Matej; Robnik-Šikonja, Marko; Pollak, Senja
Veröffentlicht in: In the Proceedings of the Conference on Language Technologies and Digital Humanities (JTDH 2021), 2021, Seite(n) 93-100
Herausgeber: Slovensko društvo za jezikovne tehnologije

Simple discovery of COVID IS WAR Metaphors Using Word Embeddings

Autoren: Brglez, Mojca; Pollak, Senja; Vintar, Špela
Veröffentlicht in: In Proceedings of the 24th International Multiconference – IS2021 (SiKDD), 2021
Herausgeber: Jožef Stefan Institute

COVID-19 v slovenskih spletnih medijih: analiza s pomočjo računalniške obdelave jezika

Autoren: Pollak, Senja; Martinc, Matej; Pelicon, Andraž; Ulčar, Matej; Vezovnik, Andreja
Veröffentlicht in: Pandemična družba: slovensko sociološko srečanje, 2021
Herausgeber: Slovenska sociološka družba

Visual Topic Modelling for NewsImage Task at MediaEval 2021 (öffnet in neuem Fenster)

Autoren: Pivovarova, Lidia; Zosa, Elaine
Veröffentlicht in: MediaEval 2021 Multimedia Benchmark Workshop: Working Notes Proceedings of the MediaEval 2021 Workshop, 2021
Herausgeber: MediaEval Multimedia Benchmark
DOI: 10.5281/zenodo.6384719

TeMoTopic: Temporal Mosaic Visualisation of Topic Distribution, Keywords, and Context (öffnet in neuem Fenster)

Autoren: Sheehan, Shane; Luz, Saturnino; Masoodian, Masood
Veröffentlicht in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730388

Robust Named Entity Recognition and Linking on Historical Multilingual Documents (öffnet in neuem Fenster)

Autoren: Boros, Emanuela; Linhares Pontes, Elvys; Cabrera-Diego, Luis Adrián; Hamdi, Ahmed; Moreno, Jose G.; Sidère, Nicolas; Doucet, Antoine
Veröffentlicht in: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum (CLEF-HIPE 2020), 2020
Herausgeber: http://ceur-ws.org/
DOI: 10.5281/zenodo.4059652

Impact Analysis of Document Digitization on Event Extraction (öffnet in neuem Fenster)

Autoren: Nguyen, Nhu Khoa; Boroş, Emanuela; Lejeune, Gaël; Doucet, Antoine
Veröffentlicht in: In 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020) co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AI* IA 2020), 2020
Herausgeber: CEUR Workshop Proceedings
DOI: 10.5281/zenodo.4680744

Using a Frustratingly Easy Domain and Tagset Adaptation for Creating Slavic Named Entity Recognition Systems (öffnet in neuem Fenster)

Autoren: Cabrera-Diego, Luis Adrián; Moreno, Jose G.; Doucet, Antoine
Veröffentlicht in: In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing in conjunction with EACL2021, 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730478

Topic modelling discourse dynamics in historical newspapers

Autoren: Marjanen, Jani; Zosa, Elaine; Hengchen, Simon; Pivovarova, Lidia; Tolonen, Mikko
Veröffentlicht in: In Post-Proceedings of the DHN2020 Conference: the 5th conference on Digital Humanities in the Nordic Countries, 2021
Herausgeber: CEUR Workshop Proceedings (CEUR-WS.org)

BERT meets Shapley: Extending SHAP Explanations to Transformer-based Classifiers (öffnet in neuem Fenster)

Autoren: Kokalj, Enja; Škrlj, Blaž; Lavrač, Nada; Pollak, Senja; Robnik-Šikonja, Marko
Veröffentlicht in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730384

Multilingual Epidemic Event Extraction (öffnet in neuem Fenster)

Autoren: Mutuvi, Stephen; Boros, Emanuela; Doucet, Antoine; Lejeune, Gaël; Jatowt, Adam; Odeo, Moses
Veröffentlicht in: In the Proceedings of ICADL 2021, 2021, ISBN 978-3-030-91669-8
Herausgeber: Springer
DOI: 10.1007/978-3-030-91669-5_12

Transformer-based Methods for Recognizing Ultra Fine-grained Entities (RUFES) (öffnet in neuem Fenster)

Autoren: Boroş, Emanuela; Doucet, Antoine
Veröffentlicht in: In Proceedings of the Thirteenth Text Analysis Conference (TAC 2020), 2021
Herausgeber: NIST USA
DOI: 10.5281/zenodo.4681008

SloBERTa: Slovene monolingual large pretrained masked language model

Autoren: Ulčar, Matej; Robnik-Šikonja, Marko
Veröffentlicht in: In Proceedings of the 24th International Multiconference – IS2021 (SiKDD), 2021
Herausgeber: Jožef Stefan Institute

Alleviating Digitization Errors in Named Entity Recognition for Historical Documents (öffnet in neuem Fenster)

Autoren: Emanuela Boros, Ahmed Hamdi, Elvys Linhares Pontes, Luis Adrián Cabrera-Diego, Jose G. Moreno, Nicolas Sidère, Antoine Doucet
Veröffentlicht in: Proceedings of the 24th Conference on Computational Natural Language Learning, 2020, Seite(n) 431-441
Herausgeber: Association for Computational Linguistics
DOI: 10.18653/v1/2020.conll-1.35

Not All Comments Are Equal: Insights into Comment Moderation from a Topic-aware Model

Autoren: Zosa, Elaine; Shekhar, Ravi; Karan, Mladen; Purver, Matthew
Veröffentlicht in: In the Proceedings of the Conference on Recent Advances in Natural Language Processing (RANLP 2021), 2021
Herausgeber: ACL

TLR at the NTCIR-15 FinNum-2 Task: Improving Text Classifiers for Numeral Attachment in Financial Social Data (öffnet in neuem Fenster)

Autoren: Moreno, Jose G.; Boros, Emanuela; Doucet, Antoine
Veröffentlicht in: In Proceedings of the 15th NTCIR Conference on Evaluation of Information Access Technologies, Ausgabe 2, 2020
Herausgeber: Association for Computing Machinery
DOI: 10.5281/zenodo.4680695

Multilingual Detection of Fake News Spreaders via Sparse Matrix Factorization (öffnet in neuem Fenster)

Autoren: Koloski, Boško; Pollak, Senja; Škrlj, Blaž
Veröffentlicht in: Notebook for PAN at CLEF 2020, 2020
Herausgeber: http://ceur-ws.org/
DOI: 10.5281/zenodo.4059635

Unsupervised Approach to Cross-Lingual User Comments Summarization (öffnet in neuem Fenster)

Autoren: Žagar, Aleš; Robnik-Šikonja, Marko
Veröffentlicht in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730327

Semantic Reasoning from Model-Agnostic Explanations (öffnet in neuem Fenster)

Autoren: Stepišnik-Perdih, Timen; Lavrač, Nada; Škrlj, Blaž
Veröffentlicht in: In the Proceedings of the 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI), 2021, ISBN 978-1-7281-8053-3
Herausgeber: IEEE
DOI: 10.1109/sami50585.2021.9378668

Linking Named Entities across Languages using Multilingual Word Embeddings (öffnet in neuem Fenster)

Autoren: Elvys Linhares Pontes, Jose G. Moreno, Antoine Doucet
Veröffentlicht in: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, 2020, Seite(n) 329-332, ISBN 9781450375856
Herausgeber: ACM
DOI: 10.1145/3383583.3398597

Discovery Team at SemEval-2020 Task 1: Context-sensitive Embeddings not Always Better Than Static for Semantic Change Detection (öffnet in neuem Fenster)

Autoren: Martinc, Matej; Montariol, Syrielle; Zosa, Elaine; Pivovarova, Lidia
Veröffentlicht in: In Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval 2020), 2020, Seite(n) 67-73
Herausgeber: International Committee on Computational Linguistics
DOI: 10.5281/zenodo.4681022

Creative Language Generation in a Society of Engagement and Reflection (öffnet in neuem Fenster)

Autoren: Wright, George A.; Purver, Matthew
Veröffentlicht in: In Proceedings of the Eleventh International Conference on Computational Creativity (ICCC2020), 2020
Herausgeber: Association for Computational Creativity (ACC)
DOI: 10.5281/zenodo.4680484

A Review of Cross-Domain Text-to-SQL Models (öffnet in neuem Fenster)

Autoren: Gan, Yujian; Purver, Matthew; Woodward, John
Veröffentlicht in: In the Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop, 2020, Seite(n) 108-115
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4699229

Event Detection with Entity Markers

Autoren: Boros, Emanuela; Moreno, Jose G.; Doucet, Antoine
Veröffentlicht in: In the Proceedings of the 43rd European Conference on Information Retrieval (ECIR 2021), 2021
Herausgeber: Springer

Parsing Text in a Workspace for Language Generation

Autoren: Wright, George A.; Purver, Matthew
Veröffentlicht in: In the Proceedings of the 2021 Society for Text & Discourse Annual Conference, 2021
Herausgeber: Easychair

Zero-shot cross-lingual content filtering: offensive language and hate speech detection (öffnet in neuem Fenster)

Autoren: Pelicon, Andraž; Shekhar, Ravi; Martinc, Matej; Škrlj, Blaž; Pollak, Senja; Purver, Matthew
Veröffentlicht in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730308

Intérêt des modèles de caractères pour la détection d’événements

Autoren: Boros, Emanuela; Besançon, Romaric; Ferret, Olivier; Grau, Brigitte
Veröffentlicht in: In Proceedings of TALN 2021, 2021
Herausgeber: HAL-LIST

Embeddia at SemEval-2019 Task 6: Detecting hate with neural network and transfer learning approaches

Autoren: Andraž Pelicon, Matej Martinc, and Petra Kralj Novak
Veröffentlicht in: Proceedings of The 13th International Workshop on Semantic Evaluation (SemEval), 2019
Herausgeber: SemEval

Generating Data using Monte Carlo Dropout

Autoren: Kristian Miok, Dong Nguyen-Doan, Daniela Zaharie, and Marko Robnik-Šikonja
Veröffentlicht in: IEEE 15th International Conference on Intelligent Computer Communication and Processing (ICCP 2019), 2019
Herausgeber: IEEE

Detecting Depression with Word-Level Multimodal Fusion (öffnet in neuem Fenster)

Autoren: Morteza Rohanian, Julian Hough, Matthew Purver
Veröffentlicht in: Interspeech 2019, 2019, Seite(n) 1443-1447
Herausgeber: ISCA
DOI: 10.21437/interspeech.2019-2283

Clustering Ideological Terms in Historical Newspaper Data with Diachronic Word Embeddings

Autoren: Jani Marjanen, Lidia Pivovarova, Elaine Zosa, and Jussi Kurunmäki
Veröffentlicht in: Proceedings of the 5th International Workshop on Computational History, 2019
Herausgeber: Aachen : R. Piskac c/o Redaktion Sun SITE, Informatik V, RWTH Aachen

Karst exploration: Extracting terms and definitions from karst

Autoren: Senja Pollak, Andraž Repar, Matej Martinc, and Vid Podpečan
Veröffentlicht in: Proceedings of the 6th biennial conference on electronic lexicography, eLex 2019, 2019
Herausgeber: Presses Universitaires de Louvain

Who is hot and who is not? Profiling celebs on Twitter

Autoren: Martinc, Matej; Škrlj, Blaž; Pollak, Senja
Veröffentlicht in: Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum, Ausgabe 6, 2019
Herausgeber: Aachen : R. Piskac c/o Redaktion Sun SITE, Informatik V, RWTH Aachen

Fake or Not: Distinguishing Between Bots, Males and Females

Autoren: Martinc, Matej; Škrlj, Blaž; Pollak, Senja
Veröffentlicht in: Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum, Ausgabe 2, 2019
Herausgeber: Aachen : R. Piskac c/o Redaktion Sun SITE, Informatik V, RWTH Aachen

Pooled LSTM for Dutch cross-genre gender classification

Autoren: Matej Martinc, Senja Pollak
Veröffentlicht in: Proceedings of the Shared Task on Cross-Genre Gender Detection in Dutch at the Computational Linguistics in the Netherlands (CLIN 2019) conference, 2019
Herausgeber: Aachen : R. Piskac c/o Redaktion Sun SITE, Informatik V, RWTH Aachen

Methods for Generating Colourful and Factual Multilingual News Headlines

Autoren: Alnajjar, Khalid; Leppänen, Leo; Toivonen, Hannu
Veröffentlicht in: In Proceedings of the 10th International Conference on Computational Creativity (ICCC 2019), Ausgabe 1, 2019, Seite(n) 258-265, ISBN 978-989-54160-1-1
Herausgeber: Association for Computational Creativity (ACC)

TLR at BSNLP2019: A Multilingual Named Entity Recognition System (öffnet in neuem Fenster)

Autoren: Jose G. Moreno, Elvys Linhares Pontes, Mickael Coustaty, Antoine Doucet
Veröffentlicht in: Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing, 2019, Seite(n) 83-88
Herausgeber: Association for Computational Linguistics
DOI: 10.18653/v1/w19-3711

Clustering Ideological Terms in Historical Newspaper Data with Diachronic Word Embeddings (öffnet in neuem Fenster)

Autoren: Jani Marjanen; Lidia Pivovarova; Elaine Zosa; Jussi Kurunmäki
Veröffentlicht in: HistoInformatics 2019: International Workshop on Computational History 2019, 2019
Herausgeber: CEUR-WS.org
DOI: 10.5281/zenodo.3689467

A Corpus Study on Questions, Responses and Misunderstanding Signals in Conversations with Alzheimer's Patients (öffnet in neuem Fenster)

Autoren: Shamila Nasreen; Matthew Purver; Julian Hough
Veröffentlicht in: Proceedings of the 23rd Workshop on the Semantics and Pragmatics of Dialogue, Ausgabe 13, 2019
Herausgeber: SEMDIAL
DOI: 10.5281/zenodo.3689456

Word Clustering for Historical Newspapers Analysis (öffnet in neuem Fenster)

Autoren: Pivovarova, Lidia; Marjanen, Jani; Zosa, Elaine
Veröffentlicht in: Proceedings of the Workshop on Language Technology for Digital Historical Archives in conjunction with RANLP-2019, 2019, Seite(n) 3-10
Herausgeber: INCOMA Ltd.
DOI: 10.5281/zenodo.3402940

TeMoCo: A Visualization Tool for Temporal Analysis of Multi-party Dialogues in Clinical Settings (öffnet in neuem Fenster)

Autoren: Shane Sheehan, Pierre Albert, Saturnino Luz, Masood Masoodian
Veröffentlicht in: 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), 2019, Seite(n) 690-695, ISBN 978-1-7281-2286-1
Herausgeber: IEEE
DOI: 10.1109/CBMS.2019.00140

Gender, language, and society: word embeddings as a reflection of social inequalities in linguistic corpora (öffnet in neuem Fenster)

Autoren: Supej, Anka; Plahuta, Marko; Purver, Matthew; Mathioudakis, Michael; Pollak, Senja
Veröffentlicht in: In Znanost in družbe prihodnosti, Slovensko sociološko srečanje [Annual meeting of the Slovenian Sociological Association: Science and future societies], 2019
Herausgeber: Slovensko sociološko društvo
DOI: 10.5281/zenodo.3894466

No Time Like the Present: Methods for Generating Colourful and Factual Multilingual News Headlines

Autoren: Alnajjar, Khalid; Leppänen, Leo; Toivonen, Hannu
Veröffentlicht in: Proceedings of the 10th International Conference on Computational Creativity (ICCC2019), 2019
Herausgeber: Association for Computational Creativity

Multiple Imputation for Biomedical Data using Monte Carlo Dropout Autoencoders (öffnet in neuem Fenster)

Autoren: Kristian Miok, Dong Nguyen-Doan, Marko Robnik-Šikonja, Daniela Zaharie
Veröffentlicht in: 2019 E-Health and Bioengineering Conference (EHB), 2019, Seite(n) 1-4, ISBN 978-1-7281-2603-6
Herausgeber: IEEE
DOI: 10.1109/EHB47216.2019.8969940

High Quality ELMo Embeddings for Seven Less-Resourced Languages (öffnet in neuem Fenster)

Autoren: Ulčar, Matej; Robnik-Šikonja, Marko
Veröffentlicht in: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 2020, Seite(n) 4731–4738
Herausgeber: The European Language Resources Association (ELRA)
DOI: 10.5281/zenodo.3894535

Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift (öffnet in neuem Fenster)

Autoren: Martinc, Matej; Kralj Novak, Petra; Pollak, Senja
Veröffentlicht in: Proceedings of the 12th Language Resources and Evaluation Conference (LREC2020), 2020, Seite(n) 4811-4819
Herausgeber: The European Language Resources Association (ELRA)
DOI: 10.5281/zenodo.3894557

Multilingual Culture-Independent Word Analogy Datasets (öffnet in neuem Fenster)

Autoren: Ulčar, Matej; Vaik, Kristiina; Lindström, Jessica; Dailidėnaitė, Milda; Robnik-Šikonja, Marko
Veröffentlicht in: Proceedings of the 12th Language Resources and Evaluation Conference (LREC2020), Ausgabe 1, 2020, Seite(n) 4074-4080
Herausgeber: The European Language Resources Association (ELRA)
DOI: 10.5281/zenodo.3894553

Dataset for Temporal Analysis of English-French Cognates (öffnet in neuem Fenster)

Autoren: Frossard, Esteban; Coustaty, Mickael; Doucet, Antoine; Jatowt, Adam; Hengchen, Simon
Veröffentlicht in: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 2020, Seite(n) 855-859
Herausgeber: The European Language Resources Association (ELRA)
DOI: 10.5281/zenodo.3693651

A Dataset for Multi-lingual Epidemiological Event Extraction (öffnet in neuem Fenster)

Autoren: Mutuvi, Stephen; Doucet, Antoine; Lejeune, Gael; Odeo, Moses
Veröffentlicht in: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 2020, Seite(n) 4139–4144
Herausgeber: The European Language Resources Association (ELRA)
DOI: 10.5281/zenodo.3709626

CoSimLex: A Resource for Evaluating Graded Word Similarity in Context (öffnet in neuem Fenster)

Autoren: Carlos Santos Armendariz; Matthew Purver; Matej Ulčar; Senja Pollak; Nikola Ljubešić; Marko Robnik-Šikonja; Mark Granroth-Wilding; Kristiina Vaik
Veröffentlicht in: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 2020, Seite(n) 5878–5886
Herausgeber: The European Language Resources Association (ELRA)
DOI: 10.5281/zenodo.3894565

Text Visualization for the Support of Lexicography-Based Scholarly Work (öffnet in neuem Fenster)

Autoren: Sheehan, Shane; Luz, Saturnino
Veröffentlicht in: Proceedings of the 6th biennial conference on electronic lexicography, eLex 2019, 2019, Seite(n) 694-725
Herausgeber: Lexical Computing CZ s.r.o., Brno, Czech Republic
DOI: 10.5281/zenodo.3894619

Mining semantic relations from comparable corpora through intersections of word embeddings. (öffnet in neuem Fenster)

Autoren: Vintar, Špela; Grčić Simeunović, Larisa; Martinc, Matej; Pollak, Senja; Stepišnik, Uroš
Veröffentlicht in: Proceedings of the LREC 2020 13th Workshop on Building and Using Comparable Corpora, 2020, Seite(n) 29-34
Herausgeber: European Language Resources Association
DOI: 10.5281/zenodo.3894635

Interaction Patterns in Conversations with Alzheimer's Patients (öffnet in neuem Fenster)

Autoren: Nasreen, Shamila; Purver, Matthew; Hough, Julian
Veröffentlicht in: Poster presentation at the 7th International Conference on Statistical Language and Speech Processing. Ljubljana, Slovenia, 2019
Herausgeber: Springer
DOI: 10.5281/zenodo.3894637

Multilingual Dynamic Topic Model (öffnet in neuem Fenster)

Autoren: Elaine Zosa, Mark Granroth-Wilding
Veröffentlicht in: Proceedings - Natural Language Processing in a Deep Learning World, 2019, Seite(n) 1388-1396, ISBN 978-954-452-056-4
Herausgeber: Incoma Ltd., Shoumen, Bulgaria
DOI: 10.26615/978-954-452-056-4_159

The NetViz terminology visualization tool and the use cases in karstology domain modeling (öffnet in neuem Fenster)

Autoren: Pollak, Senja; Podpečan, Vid; Miljkovic, Dragana; Stepišnik, Uroš; Vintar, Špela
Veröffentlicht in: Proceedings of the 6th International Workshop on Computational Terminology (COMPUTERM 2020), 2020, Seite(n) 55-61
Herausgeber: European Language Resources Association (ELRA)
DOI: 10.5281/zenodo.3894686

Communities of related terms in Karst terminology co-occurrence network (öffnet in neuem Fenster)

Autoren: Miljkovic, Dragana; Kralj, Jan; Stepišnik, Uroš; Pollak, Senja
Veröffentlicht in: Proceedings of the 6th biennial conference on electronic lexicography, eLex 2019, 2019, Seite(n) 357-373
Herausgeber: Lexical Computing CZ s.r.o., Brno, Czech Republic
DOI: 10.5281/zenodo.3894684

A Comparison of Unsupervised Methods for Ad hoc Cross-Lingual Document Retrieval (öffnet in neuem Fenster)

Autoren: Zosa, Elaine; Granroth-Wilding, Mark; Pivovarova, Lidia
Veröffentlicht in: Proceedings of the Cross-Language Search and Summarization of Text and Speech Workshop, 2020, Seite(n) 32-37
Herausgeber: European Language Resources Association (ELRA)
DOI: 10.5281/zenodo.3898384

Capturing Evolution in Word Usage: Just Add More Clusters? (öffnet in neuem Fenster)

Autoren: Matej Martinc, Syrielle Montariol, Elaine Zosa, Lidia Pivovarova
Veröffentlicht in: Companion Proceedings of the Web Conference 2020, 2020, Seite(n) 343-349, ISBN 978-1-4503-7024-0
Herausgeber: ACM
DOI: 10.1145/3366424.3382186

Evaluating the Robustness of Embedding-based Topic Models to OCR Noise (öffnet in neuem Fenster)

Autoren: Zosa, Elaine; Mutuvi, Stephen; Granroth-Wilding, Mark; Doucet, Antoine
Veröffentlicht in: In the Proceedings of the 23rd International Conference on Asia-Pacific Digital Libraries (ICADL 2021), 2021
Herausgeber: Springer
DOI: 10.1007/978-3-030-91669-5_30

Evaluating Natural Language Descriptions Generated in a Workspace-Based Architecture

Autoren: Wright, George A.; Purver, Matthew
Veröffentlicht in: In the Proceedings of the 12th International Conference on Computational Creativity, ICCC2021, 2021
Herausgeber: Association for Computational Creativity

Multi-label classification of COVID-19-related articles with an autoML approach

Autoren: Tavchioski, Ilija; Koloski, Boshko; Škrlj, Blaž; Pollak, Senja
Veröffentlicht in: In Proceedings of the BioCreative VII Challenge Evaluation Workshop, 2021, Seite(n) 295-299, ISBN 978-0-578-32368-8
Herausgeber: Biocreative

L3i_LBPAM at the FinSim-2 task: Learning Financial Semantic Similarities with Siamese Transformers (öffnet in neuem Fenster)

Autoren: Nhu Khoa Nguyen; Emanuela Boros; Gaël Lejeune; Antoine Doucet; Thierry Delahaut
Veröffentlicht in: Ausgabe 30, 2021
Herausgeber: IW3C2
DOI: 10.5281/zenodo.4734321

CTLR@WiC-TSV: Target Sense Verification using Marked Inputs and Pre-trained Models (öffnet in neuem Fenster)

Autoren: Moreno, Jose G.; Linhares Pontes, Elvys; Dias, Gaël
Veröffentlicht in: In 6th Workshop on Semantic Deep Learning (SemDeep-6) associated to 29th International Joint Conference on Artificial Intelligence and 17th Pacific Rim International Conference on Artificial Intelligence (IJCAI-PRICAI 2020), Ausgabe 2, 2021
Herausgeber: International Joint Conferences on Artificial Intelligence
DOI: 10.5281/zenodo.4680720

Exploratory analysis of news sentiment using subgroup discovery (öffnet in neuem Fenster)

Autoren: Valmarska, Anita; Cabrera-Diego, Luis Adrián; Linhares Pontes, Elvys; Pollak, Senja
Veröffentlicht in: In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing in conjunction with EACL2021, 2021
Herausgeber: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730472

COVID-19 Therapy Target Discovery with Context-Aware Literature Mining

Authors: Martinc, Matej; Škrlj, Blaž; Pirkmajer, Sergej; Lavrač, Nada; Cestnik, Bojan; Marzidovšek, Martin; Pollak, Senja
Published in: In Proceedings of the 23rd International Conference on Discovery Science (DS 2020), 2020, Page(s) 109-123
Publisher: Springer International Publishing
DOI: 10.5281/zenodo.4306020

A Baseline Document Planning Method for Automated Journalism

Authors: Leppänen, Leo; Toivonen, Hannu
Published in: In the Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), 2021
Publisher: Association for Computational Linguistics

Multilingual Epidemiological Text Classification: A Comparative Study

Authors: Mutuvi, Stephen; Boros, Emanuela; Doucet, Antoine; Lejeune, Gaël; Jatowt, Adam; Odeo, Moses
Published in: Proceedings of the 28th International Conference on Computational Linguistics, Issue 44, 2020
Publisher: International Committee on Computational Linguistics
DOI: 10.5281/zenodo.4476039

Multi-Modal Fusion with Gating Using Audio, Lexical and Disfluency Features for Alzheimer’s Dementia Recognition from Spontaneous Speech

Authors: Morteza Rohanian, Julian Hough, Matthew Purver
Published in: Interspeech 2020, 2020, Page(s) 2187-2191
Publisher: ISCA
DOI: 10.21437/interspeech.2020-2721

Temporal Mental Health Dynamics on Social Media

Authors: Tom Tabak, Matthew Purver
Published in: Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, 2020
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/2020.nlpcovid19-2.7

Extending Neural Keyword Extraction with TF-IDF tagset matching

Authors: Koloski, Boshko; Pollak, Senja; Škrlj, Blaž; Martinc, Matej
Published in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Publisher: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730354

The Importance of Character-Level Information in an Event Detection Model

Authors: Boros, Emanuela; Besançon, Romaric; Ferret, Olivier; Grau, Brigitte
Published in: In Proceedings of NLDB 2021, 2021
Publisher: Springer

Benchmarks for Unsupervised Discourse Change Detection

Authors: Duong, Quan; Pivovarova, Lidia; Zosa, Elaine
Published in: In the Proceedings of the Histoinformatics workshop 2021, 2021
Publisher: CEUR

Three-part diachronic semantic change dataset for Russian

Authors: Andrey Kutuzov, Lidia Pivovarova
Published in: Proceedings of the 2nd International Workshop on Computational Approaches to Historical Language Change 2021, 2021, Page(s) 7-13
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/2021.lchange-1.2

SemEval2020 Task 3: Graded Word Similarity in Context

Authors: Armendariz, Carlos Santos; Purver, Matthew; Pollak, Senja; Ljubešić, Nikola; Ulčar, Matej; Robnik-Šikonja, Marko; Vulić, Ivan; Pilehvar, Mohammad Taher
Published in: In Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval 2020), 2020, Page(s) 36-49
Publisher: International Committee for Computational Linguistics
DOI: 10.5281/zenodo.4309679

Hybrid Tagger – An Industry-driven Solution for Extreme Multi-label Text Classification

Authors: Vaik, Kristiina; Asula, Marit; Sirel, Raul
Published in: In Proceedings of the LREC2020 Industry Track, 2020, Page(s) 26-30
Publisher: The European Language Resources Association (ELRA)

A Baseline Document Planning Method for Automated Journalism

Authors: Leppänen, Leo; Toivonen, Hannu
Published in: In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021), 2021
Publisher: Linköping University Electronic Press, Sweden

TeMoCo-Doc - A visualization for supporting temporal and contextual analysis of dialogues and associated documents

Authors: Shane Sheehan, Saturnino Luz, Pierre Albert, Masood Masoodian
Published in: Proceedings of the International Conference on Advanced Visual Interfaces, 2020, Page(s) 1-3, ISBN 9781450375351
Publisher: ACM
DOI: 10.1145/3399715.3399956

Named Entity Recognition Architecture Combining Contextual and Global Features

Authors: Tran Thi Hong, Hahn; Doucet, Antoine; Sidere, Nicolas; Moreno, Jose G.; Pollak, Senja
Published in: In the Proceedings of the 23rd International Conference on Asia-Pacific Digital Libraries (ICADL 2021), 2021
Publisher: Springer
DOI: 10.1007/978-3-030-91669-5_21

Aligning Estonian and Russian news industry keywords with the help of subtitle translations and an environmental thesaurus

Authors: Repar, Andraž; Shumakov, Andrej
Published in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Publisher: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730392

Zaznavanje sentimenta v novicah z globokimi nevronskimi mrežami

Authors: Arhar Holdt, Špela; Pollak, Senja; Robnik-Šikonja, Marko; Krek, Simon
Published in: In Proceedings of the Conference on Language Technologies and Digital Humanities, JTDH2020, 2020, Page(s) 10-15
Publisher: Institute of Contemporary History
DOI: 10.5281/zenodo.4059729

Étude comparative de méthodes de classification multilingue appliquées à l'épidémiologie

Authors: Stephen Mutuvi; Emanuela Boros; Antoine Doucet; Gaël Lejeune; Adam Jatowt; Moses Odeo
Published in: Issue 29, 2021
Publisher: l’Association Francophone de Recherche d’Information et Applications (ARIA)
DOI: 10.5281/zenodo.4734472

Word-embedding based bilingual terminology alignment

Authors: Repar, Andraž; Martinc, Matej; Ulčar, Matej; Pollak, Senja
Published in: In Proceedings of eLex 2021 (eLex2021), 2021
Publisher: Brno: Lexical Computing CZ, s.r.o.

Investigating the Semantic Wave in Tutorial Dialogues: An Annotation Scheme and Corpus Study on Analogy Components

Authors: Del-Bosque-Trevino, Jorge; Hough, Julian; Purver, Matthew
Published in: In Proceedings of the 24th SemDial Workshop on the Semantics and Pragmatics of Dialogue (SemDial), 2020
Publisher: SEMDIAL

Interesting cross-border news discovery using cross-lingual article linking and document similarity

Authors: Koloski, Boshko; Zosa, Elaine; Stepišnik-Perdih, Timen; Škrlj, Blaž; Paju, Tarmo; Pollak, Senja
Published in: In the Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation (EACL2021), 2021
Publisher: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730369

An evaluation of BERT and Doc2Vec model on the IPTC Subject Codes prediction dataset

Authors: Pranjić, Marko; Robnik-Šikonja, Marko; Pollak, Senja
Published in: In Proceedings of the 24th International Multiconference – IS2021 (SiKDD), 2021
Publisher: Jožef Stefan Institute

Evaluation of related news recommendations using document similarity methods

Authors: Pranjić, Marko; Podpečan, Vid; Robnik-Šikonja, Marko; Pollak, Senja
Published in: In Proceedings of the Conference on Language Technologies and Digital Humanities, JTDH2020, 2020, Page(s) 81-86
Publisher: Institute of Contemporary History
DOI: 10.5281/zenodo.4059710

Dimenzija spola v slovenskih vektorskih vložitvah besed: primerjava modelov prek analogij poklicev

Authors: Supej, Anka; Ulčar, Matej; Robnik-Šikonja, Marko; Pollak, Senja
Published in: In Proceedings of the Joint Conference on Digital Libraries (JCDL 2020), 2020, Page(s) 93-100
Publisher: Institute of Contemporary History
DOI: 10.5281/zenodo.4059700

Mitigating Gender Bias in Word Embeddings using Explicit Gender Free Corpus

Authors: Hargrave, David
Published in: Master's thesis, School of Electronic Engineering and Computer Science, Queen Mary University of London, 2021
Publisher: Queen Mary University of London

Silicon Valley och makten över medierna [Silicon Valley and the power over media]

Authors: Carl-Gustav Linden
Published in: Issue 1, 2020
Publisher: Nordicom
DOI: 10.48335/9789188855350

Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation

Authors: Toivonen, Hannu; Boggia, Michele
Published in: 2021, ISBN 978-1-954085-13-8
Publisher: Association for Computational Linguistics
DOI: 10.5281/zenodo.4730375
