In this article we argue that corpus linguistics is a powerful methodology that has only recently begun to explore languages other than English, such as Spanish. At the same time, researchers developing automated tools to analyze Spanish and other languages face common challenges, all the more so when the texts are multimodal. Here we explore key research problems in corpus linguistics for the Spanish language, identify emerging niches, and highlight issues in the automatic description of multimodal texts. We will not, however, enter into the discussion of the status of corpus linguistics, the debate between corpus-based and corpus-driven approaches (Tognini-Bonelli, 2001), the difference between light and strong corpus linguistics (Thompson & Hunston, 2006), or the distinctions between corpus linguistics research and discourse analysis (Biber, Connor, & Upton, 2007; Parodi, 2008). For a review of these distinctions, we refer the reader to Parodi (2009). In short, we discuss two research challenges for cross-linguistic corpus analyses of multimodal texts. The first concerns non-English corpora, specifically Spanish. The second concerns overcoming the monopoly of verbal language by taking on the automatic analysis of multimodal texts.
Research Challenges for Corpus Cross-linguistics and Multimodal Texts.
Information Design Journal, 18(1), 69-73.