Potsdam Commentary Corpus

abstract

  • The Potsdam Commentary Corpus (PCC) is a corpus of 220 German newspaper commentaries (2,900 sentences, 44,000 tokens) taken from the online issues of the Märkische Allgemeine Zeitung (MAZ subcorpus) and Tagesspiegel (ProCon subcorpus). It is annotated with several types of linguistic information.

    [Bourgonje & Stede 2020] Bourgonje, Peter and Stede, Manfred (2020). The Potsdam Commentary Corpus 2.2: Extending Annotations for Shallow Discourse Parsing. Proc. of the Language Resources and Evaluation Conference (LREC), Marseille.