Corpus Selection Criteria
The aim of developing the corpus was to create a set of clauses which include geospatial language.
For our purposes, geospatial language includes natural language expressions that describe
(1) the location of static objects, either absolutely (the mountain in the north) or, more commonly, relative to other objects (the mountain beside the bay);
(2) the movement of objects relative to static geographic features (she crossed the bridge), including people’s walking paths, bus and van routes, etc.
We further refine our criteria for selection of language to be included in the corpus as follows:
(1) With respect to geographical scale, the selection of expressions considered spatial was informed by Montello (1993), who divides psychological space qualitatively into four scale classes. These are: figural (space that is smaller than the body and perceived without locomotion, e.g., a table), vista (space larger than the body and perceived without locomotion, e.g., a room), environmental (space larger than the body and perceived through locomotion, e.g., a field), and geographical (space much larger than the body and perceived through locomotion but also pictorial/symbolic aids, e.g., the world). For the purposes of this corpus, only spatial expressions at the environmental and geographical scales were included in the selection. Spatial references at the figural or vista scale were therefore excluded.
(2) Language referring to sport or weather was excluded, as it is often specialised in nature. However, expressions that addressed locations, for example those describing natural disasters resulting from weather events (e.g., floods), were included where they used everyday rather than specialised language (as is common in news reports).
(3) Metaphorical use of space in language was excluded (e.g., his business is on the road to nowhere).
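The three refinements above can be read as a single inclusion rule. The following Python sketch is illustrative only and is not the procedure used to build the corpus: it assumes each candidate expression has already been annotated by hand, and the names Scale, Candidate and meets_criteria, as well as the annotation fields and the scale labels given to the example expressions, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Scale(Enum):
    """Montello's (1993) four scale classes of psychological space."""
    FIGURAL = "figural"
    VISTA = "vista"
    ENVIRONMENTAL = "environmental"
    GEOGRAPHICAL = "geographical"


# Criterion (1): only these two scale classes are retained.
INCLUDED_SCALES = {Scale.ENVIRONMENTAL, Scale.GEOGRAPHICAL}


@dataclass
class Candidate:
    """A candidate expression with hypothetical, manually assigned annotations."""
    text: str
    scale: Scale
    sport_or_weather: bool = False            # criterion (2)
    everyday_disaster_location: bool = False  # exception to criterion (2)
    metaphorical: bool = False                # criterion (3)


def meets_criteria(c: Candidate) -> bool:
    """Return True if the annotated expression satisfies all three criteria."""
    if c.scale not in INCLUDED_SCALES:
        return False
    if c.sport_or_weather and not c.everyday_disaster_location:
        return False
    if c.metaphorical:
        return False
    return True


# Example annotations (assumed for illustration, not taken from the corpus).
examples = [
    Candidate("the mountain beside the bay", Scale.GEOGRAPHICAL),
    Candidate("the cup on the table", Scale.FIGURAL),
    Candidate("floods cut off the road to the village", Scale.ENVIRONMENTAL,
              sport_or_weather=True, everyday_disaster_location=True),
    Candidate("his business is on the road to nowhere", Scale.ENVIRONMENTAL,
              metaphorical=True),
]

selected = [c.text for c in examples if meets_criteria(c)]
```

Under these assumed annotations, only the first and third expressions are selected: the second fails on scale and the fourth on metaphorical use.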