CoMeCaYo: Corpus Mediático de Canarias en YouTube (1.0)

abstract

  • CoMeCaYo (Corpus Mediático de Canarias en YouTube) is a comprehensive corpus of Spanish-language YouTube videos from Canarian media outlets, with timestamped transcriptions. The dataset was developed as part of project A09 "On the interplay between register and socio-geographic variation in Canarian Spanish" within the Collaborative Research Centre 1412 "REGISTER" (Register: Language Users' Knowledge of Situational-Functional Variation), led by Prof. Dr. Miriam Bouzouita at Humboldt-Universität zu Berlin. The corpus spans over 15 years of video content from major Canarian television channels, radio stations, and digital media platforms. It is a resource for linguistic research, natural language processing, and computational linguistics, in particular for studies of the evolution of Spanish and of regional media discourse patterns.