A Corpus of Indefinite Uses
The Corpus of Indefinite Uses is an output of the project "Indefinites and beyond. Evolutionary pragmatics and typological semantics". It makes available data collected and annotated in the course of a cross-linguistic synchronic and diachronic corpus study of indefinite expressions.
The corpus contains data for the following languages and forms:
Synchronic
Diachronic
The indefinites have been annotated with the functions in an extended version of Haspelmath’s (1997) semantic map proposed by Aguilar-Guevara et al. (2011). A description of the functions and the annotation procedure can be found in the Annotation Guidelines. Aloni et al. (2012) report results on inter-annotator agreement.
The corpus is searchable through an online web interface and is also available as raw data.
Full documentation describing the organization of the database and the search functionality, as well as highlights of key results, is available here.
The following publications are based on data included in the database:
@inproceedings{AloniEtAl2012,
  author    = {Aloni, Maria and van Cranenburgh, Andreas and Fernandez, Raquel and Sznajder, Marta},
  title     = {Building a Corpus of Indefinite Uses Annotated with Fine-grained Semantic Functions},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year      = {2012},
  publisher = {European Language Resources Association (ELRA)}
}
External References
This work was financially supported by the NWO.