SCOTS: Problems and Challenges

Wendy Anderson, SCOTS Project, University of Glasgow

David Beavan, SCOTS Project, University of Glasgow

The SCOTS project has posed challenges throughout its genesis and subsequent development. These stem from two facts: Scots and Scottish English are not clearly defined language varieties, and texts are simply not available in a full range of genres in Scots, a minority language. Other textual genres can be difficult to obtain and require public involvement. Part of our solution to the issue of representing linguistic variety has been to collect considerable quantities of sociolinguistic metadata with each text.

The heterogeneous data collected and the varied nature of SCOTS’ users have both posed computational challenges. In this presentation, we will discuss how the project has met such challenges, and the nature of those that remain live. We will end with a case study of our approach to the problem of maximising the value of the rich contextual data SCOTS holds.