Browsing by Author "Ros, Salvador"

Publication
A bridge too far for artificial intelligence?: Automatic classification of stanzas in Spanish poetry (John Wiley and Sons Inc, 2022)
Pérez Pozo, Álvaro; Rosa, Javier de la; Ros, Salvador; González Blanco, Elena; Hernández, Laura; Sisto, Mirella de; Horizon 2020 Framework Programme; European Research Council; https://ror.org/02jjdwm75

The rise in artificial intelligence and natural language processing techniques has increased considerably in the last few decades. Historically, the focus has been primarily on texts expressed in prose form, leaving mostly aside figurative or poetic expressions of language due to their rich semantics and syntactic complexity. The creation and analysis of poetry have been commonly carried out by hand, with a few computer-assisted approaches. In the Spanish context, the promise of machine learning is starting to pan out in specific tasks such as metrical annotation and syllabification. However, there is a task that remains unexplored and underdeveloped: stanza classification. This classification of the inner structures of verses upon which a poem is built is an especially relevant task for poetry studies since it complements the structural information of a poem. In this work, we analyzed different computational approaches to stanza classification in the Spanish poetic tradition. These approaches show that this task continues to be hard for computer systems, both those based on classical machine learning approaches and those based on statistical language models, which cannot compete with traditional computational paradigms based on the knowledge of experts. © 2021 The Authors.
Journal of the Association for Information Science and Technology published by Wiley Periodicals LLC on behalf of Association for Information Science and Technology.

Publication
Automated Metric Analysis of Spanish Poetry: Two Complementary Approaches (Institute of Electrical and Electronics Engineers Inc., 2021)
Marco, Guillermo; Rosa, Javier de la; Gonzalo, Julio; Ros, Salvador; González Blanco, Elena; Horizon 2020 Framework Programme; European Commission; https://ror.org/02jjdwm75

The automatic metric analysis (commonly referred to as scansion) of Spanish poetry is not a trivial problem since it combines the nuances of the language, the different poetic traditions related to melodic patterns, and the personal stylistic preferences and intentions of the author. In this paper, we explore two alternative algorithmic approaches tailored to different application scenarios. The first approach, Rantanplan, is a rule-based method that consists of four Natural Language Processing modules that work together to perform scansion and other related analysis: Part of Speech tagging, syllabification, stress assignment, and metrical adjustment. The second approach, Jumper, explores the possibility of performing scansion without syllabification, with a twofold purpose: to minimize the errors propagated in different parts of the linguistic processing pipeline (including the syllabification step), and to improve the efficiency of the process. Both systems outperform the state of the art and provide either a more informative solution (suitable, for instance, for teaching purposes) or more efficient processing (when a correct scansion is all the linguistic knowledge required, as in scholarly philological studies). The combined use of both systems turns out to provide a practical tool to clean up manual annotation errors in corpora.
© 2013 IEEE.

Publication
Transformers analyzing poetry: multilingual metrical pattern prediction with transfomer-based language models (Springer Science and Business Media Deutschland GmbH, 2023)
Rosa, Javier de la; Pérez Pozo, Álvaro; Sisto, Mirella de; Hernández, Laura; Díaz, Aitor; Ros, Salvador; González Blanco, Elena; Horizon 2020 Framework Programme; European Commission; https://ror.org/02jjdwm75

The splitting of words into stressed and unstressed syllables is the foundation for the scansion of poetry, a process that aims at determining the metrical pattern of a line of verse within a poem. Intricate language rules and their exceptions, as well as poetic licenses exerted by the authors, make calculating these patterns a nontrivial task. Some rhetorical devices shrink the metrical length, while others might extend it. This opens the door for interpretation and further complicates the creation of automated scansion algorithms useful for automatically analyzing corpora in a distant-reading fashion. In this paper, we compare the automated metrical pattern identification systems available for Spanish, English, and German against fine-tuned monolingual and multilingual language models trained on the same task. Despite being initially conceived as models suitable for semantic tasks, our results suggest that transformer-based models retain enough structural information to perform reasonably well for Spanish in a monolingual setting, and outperform both for English and German when using a model trained on the three languages, showing evidence of the benefits of cross-lingual transfer between the languages. © 2021, The Author(s).