PUBLICATIONS

The list of all my scientific publications in journals and peer-reviewed conferences.

2020

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen and Goran Glavaš. Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity. Proceedings of the 28th International Conference on Computational Linguistics (COLING). to appear, Online, 2020.

Goran Glavaš*, Mladen Karan* and Ivan Vulić*. XHate-999: Analyzing and Detecting Abusive Language Across Domains and Languages. Proceedings of the 28th International Conference on Computational Linguistics (COLING). to appear, Online, 2020. (*equal contribution)

Robert Litschko, Ivan Vulić, Željko Agić and Goran Glavaš. Towards Instance-Level Parser Selection for Cross-Lingual Transfer of Dependency Parsers. Proceedings of the 28th International Conference on Computational Linguistics (COLING). to appear, Online, 2020.

Goran Glavaš, Ivan Vulić, Anna Korhonen and Simone Paolo Ponzetto. SemEval-2020 Task 2: Predicting Multilingual and Cross-Lingual (Graded) Lexical Entailment. Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval 2020). to appear, Online, 2020.

Anne Lauscher, Olga Majewska, Leonardo F. R. Ribeiro, Iryna Gurevych, Nikolai Rozanov and Goran Glavaš. Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers. Proceedings of the First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (DeeLIO). to appear, Online, 2020.

Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš. From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). to appear, Online, 2020.

Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš and Anna Korhonen. Probing Pretrained Language Models for Lexical Semantics. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). to appear, Online, 2020.

Edoardo Maria Ponti*, Goran Glavaš*, Olga Majewska, Qianchu Liu, Ivan Vulić, and Anna Korhonen. XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). to appear, Online, 2020. (*equal contribution)

Anne Lauscher, Rafik Takieddin, Simone Paolo Ponzetto and Goran Glavaš. AraWEAT: Multidimensional Analysis of Biases in Arabic Word Embeddings. Proceedings of the Fifth Arabic Natural Language Processing Workshop (WANLP 2020). to appear, Online, 2020.

Ralph Peeters, Christian Bizer, and Goran Glavaš. Intermediate Training of BERT for Product Matching. Proceedings of the 2nd International Workshop on Challenges and Experiences from Data Integration to Knowledge Graphs (DI2KG), held in conjunction with VLDB 2020. to appear, Online, 2020.

Ivan Vulić, Anna Korhonen, and Goran Glavaš. Improving Bilingual Lexicon Induction with Unsupervised Post-Processing of Monolingual Word Vector Spaces. Proceedings of the 5th Workshop on Representation Learning for NLP (RepL4NLP-2020). 45-54, Online, 2020. Best Paper Award.

Goran Glavaš and Ivan Vulić. Non-Linear Instance-Based Cross-Lingual Mapping for Non-Isomorphic Embedding Spaces. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). 7548-7555, Online, 2020.

Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West and Steffen Eger. On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). 1656-1671, Online, 2020.

Mladen Karan, Ivan Vulić, Anna Korhonen and Goran Glavaš. Classification-Based Self-Learning for Weakly Supervised Bilingual Lexicon Induction. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). 6915-6922, Online, 2020.

Ivan Krsnik, Goran Glavaš, Marina Krsnik, Damir Miletić, Ivan Štajduhar. Automatic Annotation of Narrative Radiology Reports. Diagnostics, Volume 10, Issue 4, 196-210, 2020.

Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto and Ivan Vulić. A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces. Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), New York, 2020.

Goran Glavaš and Swapna Somasundaran. Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation. Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), New York, 2020.

2019

Ivan Vulić, Goran Glavaš, Roi Reichart and Anna Korhonen. Do We Really Need Fully Unsupervised Cross-Lingual Embeddings? Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). 4406-4417, Hong Kong, 2019.

Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Roi Reichart and Anna Korhonen. Cross-lingual Semantic Specialization via Lexical Relation Induction. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). 2206-2217, Hong Kong, 2019.

Fabian David Schmidt, Markus Dietsche, Simone Paolo Ponzetto, and Goran Glavaš. SEAGLE: A Platform for Comparative Evaluation of Semantic Encoders for Information Retrieval. Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations. pages 199-204, Hong Kong, 2019.

Taha Tobaili, Miriam Fernandez, Harith Alani, Goran Glavaš, Sanaa Sharafeddine and Hazem Hajj. SenZi: A Sentiment Analysis Lexicon for the Latinised Arabic Arabizi. Proceedings of the Conference on Recent Advances in Natural Language Processing (RANLP), pages 1203-1211, Varna, 2019.

Aishwarya Kamath, Jonas Pfeiffer, Edoardo Maria Ponti, Goran Glavaš and Ivan Vulić. Specializing Distributional Vectors of All Words for Lexical Entailment. Proceedings of the 4th Workshop on Representation Learning for Natural Language Processing (RepL4NLP), pages 72-83, Florence, 2019. Best Paper Award.

Goran Glavaš, Robert Litschko, Sebastian Ruder and Ivan Vulić. How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pages 710-721, Florence, 2019.

Goran Glavaš and Ivan Vulić. Generalized Tuning of Distributional Word Vectors for Monolingual and Cross-Lingual Lexical Entailment. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pages 4824-4830, Florence, 2019.

Ivan Vulić, Simone Paolo Ponzetto and Goran Glavaš. Multilingual and Cross-Lingual Graded Lexical Entailment. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pages 4963-4974, Florence, 2019.

Bilal Ghanem, Goran Glavaš, Anastasia Giachanou, Simone Paolo Ponzetto, Paolo Rosso and Francisco Rangel. UPV-UMA at CheckThat! Lab: Verifying Arabic Claims using Cross Lingual Approach. In Proceedings of the Conference and Labs of the Evaluation Forum (CLEF), to appear, Lugano, 2019.

Anne Lauscher and Goran Glavaš. Are We Consistently Biased? Multidimensional Analysis of Biases in Distributional Word Vectors. In Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM), pages 85-91, Minneapolis, 2019.

Robert Litschko, Goran Glavaš, Ivan Vulić and Laura Dietz. Evaluating Resource-Lean Cross-Lingual Embedding Models in Unsupervised Retrieval. In Proceedings of the 42nd Annual International Conference on Research and Development in Information Retrieval (SIGIR), to appear, Paris, 2019.

Goran Glavaš and Ivan Vulić. Zero-Shot Language Transfer for Cross-Lingual Sentence Retrieval with the Bidirectional Attention Model. In the Proceedings of the 41st European Conference on Information Retrieval (ECIR), pages 523-538, Cologne, 2019.

2018

Anne Lauscher, Goran Glavaš, Kai Eckert, and Simone Paolo Ponzetto. Investigating the Role of Argumentation in the Rhetorical Analysis of Scientific Publications with Neural Multi-Task Learning Model. In the Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 3326-3338, Bruxelles, 2018.

Edoardo Ponti, Ivan Vulić, Goran Glavaš, Nikola Mrkšić, and Anna Korhonen. Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization. In the Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 282-293, Bruxelles, 2018.

Anne Lauscher, Goran Glavaš, and Kai Eckert. ArguminSci: A Tool for Analyzing Argumentation and Rhetorical Aspects in Scientific Writing. In the Proceedings of the Fifth Workshop on Argument Mining (ArgMining), pages 22-28, Bruxelles, 2018.

Anne Lauscher, Goran Glavaš, and Simone Paolo Ponzetto. An Argument-Annotated Corpus of Scientific Publications. In the Proceedings of the Fifth Workshop on Argument Mining (ArgMining), pages 40-46, Bruxelles, 2018.

Goran Glavaš and Ivan Vulić. Explicit Retrofitting of Distributional Word Vectors. In the Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL), pages 34-45, Melbourne, 2018.

Robert Litschko, Goran Glavaš, Simone Paolo Ponzetto, and Ivan Vulić. Unsupervised Cross-Lingual Information Retrieval using Monolingual Data Only. In the Proceedings of the 41st Annual International Conference on Research and Development in Information Retrieval (SIGIR), pages 1253-1256, Ann Arbor, 2018.

Goran Glavaš and Ivan Vulić. Discriminating between Lexico-Semantic Relations with the Specialization Tensor Model. In the Proceedings of the Sixteenth Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pages 181-187, New Orleans, 2018.

Ivan Vulić and Goran Glavaš. Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources. In the Sixteenth Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pages 516-527, New Orleans, 2018.

Federico Nanni, Goran Glavaš, Simone Paolo Ponzetto, Sara Tonelli, Nicolo Conti et al. Findings from the Hackathon on Understanding Euroscepticism Through the Lens of Textual Data. In ParlaCLARIN@LREC2018 -- Workshop on Creating and Using Parliamentary Corpora, pages 59-66, Miyazaki, 2018.

Goran Glavaš, Marc Franco-Salvador, Simone Paolo Ponzetto and Paolo Rosso. A Resource-Light Method for Cross-Lingual Semantic Textual Similarity. Knowledge-Based Systems, 2018, Vol. 143, pages 1-9. DOI: https://doi.org/10.1016/j.knosys.2017.11.041

2017

Goran Glavaš, Ivan Vulić, Simone Paolo Ponzetto. If Sentences Could See: Investigating Visual Information for Semantic Textual Similarity. Proceedings of the 12th International Conference on Computational Semantics (IWCS), Montpellier, 2017.

Goran Glavaš, Simone Paolo Ponzetto. Dual Tensor Model for Detecting Asymmetric Lexico-Semantic Relations. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1757-1767, Copenhagen, 2017.

Maria Pia di Buono, Jan Šnajder, Bojana Dalbelo Bašić, Goran Glavaš, Martin Tutek and Nataša Milić-Frayling. Predicting News Values from Headline Text and Emotions. Proceedings of the Natural Language Processing Meets Journalism Workshop (NLPMJ), pages 1-6, Copenhagen, 2017.

Anne Lauscher, Goran Glavaš, Kai Eckert. Citation-Based Summarization of Scientific Articles Using Semantic Textual Similarity. Proceedings of the Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL), Tokyo, 2017.

Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto, Kai Eckert. Investigating Convolutional Networks and Domain-Specific Embeddings for Semantic Classification of Citations. Proceedings of the 6th Workshop on Mining Scientific Publications, Toronto, 2017.

Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto. Cross-Lingual Classification of Topics in Political Texts. Proceedings of the Second Workshop on NLP and Computational Social Science (NLP+CSS), pages 42-46, Vancouver, 2017.

Sanja Štajner, Goran Glavaš. Leveraging event-based semantics for automated text simplification. Expert Systems with Applications. Vol. 82, Issue 1, 2017, pages 383-395.

Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto. Unsupervised Cross-Lingual Scaling of Political Texts. Proceedings of the Conference of the European Chapter of Association for Computational Linguistics (EACL), pages 688-693, Valencia, 2017.

Patrick Klein, Simone Paolo Ponzetto, Goran Glavaš. Improving Neural Knowledge Base Completion with Cross-Lingual Projections. Proceedings of the Conference of the European Chapter of Association for Computational Linguistics (EACL), pages 516-522, Valencia, 2017.

Maria Pia di Buono, Martin Tutek, Jan Šnajder, Goran Glavaš, Bojana Dalbelo Bašić, Nataša Milić-Frayling. Two Layers of Annotation for Representing Event Mentions in News Stories. Proceedings of the 11th Linguistic Annotation Workshop, pages 82-90, Valencia, 2017.

Sanja Štajner, Goran Glavaš, Simone Paolo Ponzetto, Heiner Stuckenschmidt. Domain Adaptation for Automatic Detection of Speculative Sentences. Proceedings of IEEE International Conference on Semantic Computing (ICSC). pages 164-171, San Diego, 2017.

2016

Martin Tutek, Goran Glavaš, Jan Šnajder, Nataša Milić-Frayling, Bojana Dalbelo Bašić. Detecting and Ranking Conceptual Links between Texts Using a Knowledge Base. In Proceedings of the 25th Conference on Information and Knowledge Management (CIKM 2016), pages 2077-2080. Indianapolis, 2016.

Goran Glavaš, Federico Nanni, Simone Paolo Ponzetto. Unsupervised Text Segmentation Using Semantic Relatedness Graphs. In Proceedings of Fifth Joint Conference on Lexical and Computational Semantics (*SEM 2016), pages 125-130. Berlin, 2016.

Federico Nanni, Laura Dietz, Stefano Faralli, Goran Glavaš, Simone Paolo Ponzetto. Capturing Interdisciplinarity in Academic Abstracts. D-Lib Magazine, Volume 22, Number 9/10.

Mladen Karan, Jan Šnajder, Daniela Širinić, Goran Glavaš. Analysis of Policy Agendas: Lessons Learned from Automatic Topic Classification of Croatian Political Texts. In Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2016), pages 12-21. Berlin, 2016.

Jenny Copara, Jose Ochoa, Camilo Thorne, Goran Glavaš. Spanish NER with Word Representations and Conditional Random Fields. In Proceedings of the Sixth Named Entity Workshop (NEWS 2016), pages 34-40. Berlin, 2016.

Federico Nanni, Cäcilia Zirn, Goran Glavaš, Jason Eichorst, Simone Paolo Ponzetto. TopFish: Topic-Based Analysis of Political Position in US Electoral Campaigns. In Proceedings of the International Conference on the Advances in Computational Analysis of Political Text (PolText 2016). Dubrovnik, 2016.

Cäcilia Zirn, Goran Glavaš, Federico Nanni, Jason Eichorst, Heiner Stuckenschmidt. Classifying Topics and Detecting Topic Shifts in Political Manifestos. In Proceedings of the International Conference on the Advances in Computational Analysis of Political Text (PolText 2016). Dubrovnik, 2016.

Krešimir Baksa, Dino Dolović, Goran Glavaš, Jan Šnajder. Tagging Named Entities in Croatian Tweets. Slovenščina 2.0. Vol 4. (Issue 1). 2016. pages 20-41.

2015

Goran Glavaš, Jan Šnajder. Resolving Entity Coreference in Croatian with a Constrained Mention-Pair Model. Proceedings of the Fifth Workshop on Balto-Slavic Natural Language Processing (BSNLP 2015), pages 17-23. Hissar, 2015.

Goran Glavaš, Jan Šnajder. Construction and Evaluation of Event Graphs. In Natural Language Engineering, Volume 21, Number 4, pages 607-652. Cambridge University Press.

Goran Glavaš, Sanja Štajner. Simplifying Lexical Simplification: Do We Need Simplified Corpora? Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL 2015), pages 63-68. Beijing, 2015.

Goran Glavaš. TakeLab: Medical Information Extraction and Linking with MINERAL. Proceedings of the Seventh Workshop on Semantic Evaluations (SemEval 2015), pages 389-393. Denver, 2015.

Mladen Karan, Goran Glavaš, Jan Šnajder, Bojana Dalbelo Bašić, Ivan Vulić, Marie Francine Moens. TKLBLIIR: Detecting Twitter Paraphrases with TweetingJay. In Proceedings of the Seventh Workshop on Semantic Evaluations (SemEval 2015), pages 70-74. Denver, 2015.

2014

Goran Glavaš, Jan Šnajder. Constructing Coherent Event Hierarchies from News Stories. Proceedings of the 9th Workshop on Graph-based Methods for Natural Language Processing (TextGraphs-9, EMNLP), Doha, pages 34-38. 2014.

Goran Glavaš, Jan Šnajder, Parisa Kordjamshidi, Marie-Francine Moens. HiEvents: A Corpus for Extracting Event Hierarchies from News Stories. Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC '14), pages 3678-3783, Reykjavik, 2014.

Goran Glavaš, Jan Šnajder. Event Graphs for Information Retrieval and Multi-Document Summarization. Expert Systems with Applications. Vol. 41, 2014, pages 6904-6916.

Luka Skukan, Goran Glavaš, Jan Šnajder. HeidelTime.Hr: Extracting and Normalizing Temporal Expressions in Croatian. Proceedings of the Ninth Language Technologies Conference, Information Society (IS-JT), pages 99-103, Ljubljana, 2014.

Siniša Biđin, Jan Šnajder, Goran Glavaš. Predicting Croatian Phrase Sentiment Using a Deep Matrix-Vector Model. Proceedings of the Ninth Language Technologies Conference, Information Society (IS-JT 2014), pages 95-98, Ljubljana, 2014.

Krešimir Baksa, Dino Dolović, Goran Glavaš, Jan Šnajder. Named Entity Recognition in Croatian Tweets. Proceedings of the Ninth Language Technologies Conference, Information Society (IS-JT 2014), pages 85-89, Ljubljana, 2014.

2013

Mladen Karan, Goran Glavaš, Frane Šarić, Jan Šnajder, Bojana Dalbelo Bašić. CroNER: Recognizing Named Entities in Croatian Using Conditional Random Fields, Informatics, Vol. 37, December 2013, pages 165-172.

Goran Glavaš, Jan Šnajder. Event-Centered Information Retrieval Using Kernels on Event Graphs. Proceedings of the 8th Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-8), pages 1–5, Seattle. 2013.

Goran Glavaš, Sanja Štajner. Event-Centered Simplification of News Stories. Proceedings of Student Research Workshop of the Conference on Recent Advances in Natural Language Processing (RANLP), pages 71–78, Hissar, 2013. Best paper award.

Goran Glavaš, Jan Šnajder. Recognizing Identical Events with Graph Kernels. Proceedings of 51st Annual Meeting of the Association for Computational Linguistics (ACL), pages 797–803, Sofia, 2013.

Goran Glavaš, Damir Korenčić, Jan Šnajder. Aspect-Oriented Opinion Mining from User Reviews in Croatian. Proceedings of the 4th Workshop on Balto-Slavonic Natural Language Processing (BSNLP), pages 18–22, Sofia, 2013.

Goran Glavaš, Jan Šnajder. Exploring Coreference Uncertainty of Generically Extracted Event Mentions. Proceedings of the Conference on Computational Linguistics and Intelligent Text Processing (CICLing), pages 408–422, Samos, 2013.

2012

Goran Glavaš, Jan Šnajder, Bojana Dalbelo Bašić. Are You for Real? Learning Event Factuality in Croatian Texts. Proceedings of the Conference on Data Mining and Data Warehouses (SiKDD), pages 181-184, Ljubljana, 2012.

Goran Glavaš, Mladen Karan, Frane Šarić, Jan Šnajder, Jure Mijić, Artur Šilić, Bojana Dalbelo Bašić. Cro-NER: A State-of-the-Art Named Entity Recognition and Classification for Croatian. Proceedings of the 8th Language Technologies Conference (IS-LTC), pages 73–78, Ljubljana, 2012.

Mladen Marović, Jan Šnajder, Goran Glavaš. Event and Temporal Relation Extraction from Croatian Newspaper Texts. Proceedings of the 8th Language Technologies Conference (IS-LTC), pages 141–146, Ljubljana, 2012.

Goran Glavaš, Jan Šnajder, Bojana Dalbelo Bašić. Semi-Supervised Acquisition of Croatian Sentiment Lexicon. Proceedings of the 15th International Conference on Text, Speech, and Dialogue (TSD), pages 166–173, Pilsen, 2012.

Goran Glavaš, Krešimir Fertalj, Jan Šnajder. From Requirements to Code: Syntax-Based Requirements Analysis for Data-Driven Application Development. Proceedings of the 17th International Conference on Applications of Natural Language Processing to Information Systems (NLDB), pages 339–344, Groningen, 2012.

Frane Šarić, Goran Glavaš, Mladen Karan, Jan Šnajder, Bojana Dalbelo Bašić. TakeLab Systems for Measuring Semantic Text Similarity, Proceedings of the First Joint Conference on Lexical and Computational Semantics (*SEM & SemEval), pages 441–448, Montreal, 2012.

Goran Glavaš, Jan Šnajder, Bojana Dalbelo Bašić. Experiments on Hybrid Corpus-Based Sentiment Lexicon Acquisition. Proceedings of the Workshop on Innovative Hybrid Approaches to Processing Textual Data, pages 1–9, Avignon, 2012.

2011

Goran Glavaš, Krešimir Fertalj. Solving the Class Responsibility Assignment Problem Using Metaheuristic Approach. Journal of Computing and Information Technology. Vol. 19 (Issue 4), 2011, pages 275-283.

Goran Glavaš, Krešimir Fertalj. Metaheuristic Approach to Class Responsibility Assignment Problem. Proceedings of the 33rd International Conference on Information Technology Interfaces (ITI), pages 591-596, Cavtat, 2011.

2010

Anđelo Martinović, Goran Glavaš, Matko Juribašić, Davor Sutić, Zoran Kalafatić. Real-Time Detection and Recognition of Traffic Signs. Proceedings of the 33rd International Convention MIPRO (2010), pages 247-252, Opatija, 2010.
