Folklore Excavations: Machine Learning And Historical GIS In A Folklore

Timothy R. Tangherlini
University of California, Los Angeles

 

Abstract

We present preliminary findings from applying statistical analysis in GIS to folklore data and from applying unsupervised machine learning to a large folklore corpus. For the GIS analysis, stories are linked to geographic places in two ways: through the places mentioned within a story, and through the place where the story was collected.
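The two kinds of story-to-place link can be illustrated with a small data model. This is only a sketch: the names `Place`, `Story`, and `story_points`, the placeholder identifiers, and the use of projected x/y coordinates are assumptions made for illustration, not the structure of the study's actual database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Place:
    name: str
    x: float  # easting in a projected coordinate system (assumed)
    y: float  # northing in a projected coordinate system (assumed)


@dataclass
class Story:
    story_id: str
    collected_at: Place                                   # where the story was recorded
    mentioned: list[Place] = field(default_factory=list)  # places named in the text


def story_points(story):
    """Yield (link_type, place) pairs, one per geographic link of a story."""
    yield ("collected", story.collected_at)
    for place in story.mentioned:
        yield ("mentioned", place)


# Hypothetical example: one story collected at one place and mentioning two others.
farm = Place("Place A", 1000.0, 2000.0)
mound = Place("Place B", 1500.0, 2600.0)
church = Place("Place C", 900.0, 2100.0)
story = Story("story-001", collected_at=farm, mentioned=[mound, church])
links = list(story_points(story))
```

Keeping the link type alongside each point lets the GIS layer for "places of collection" be separated from the layer for "places mentioned" when the data are mapped.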

The study corpus, collected in Denmark by Evald Tang Kristensen, spans the forty years from 1870 to 1910 and comprises 6,500 storytellers and 250,000 stories. This study is based on a much smaller subset of 340 storytellers and 1,000 stories. Several examples are given, both of the application of standard GIS tools to folklore "event" data and of the application of machine learning. Because each story carries geographic referents, we can, unlike many applications of machine learning to the "clustering" of Humanities data, project the data clusters onto historic maps in the GIS. Fortunately, the Danish cadastral survey has geo-referenced a series of high-resolution maps from 1880-1890. Subsequent statistical analysis of the characteristics of the story clusters projected onto these historical maps (e.g. Standard Deviational Ellipse) reveals interesting patterns in the relationship between stories and the environment, be it natural or man-made.
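As an illustration of the Standard Deviational Ellipse mentioned above, the following minimal sketch computes one common variant from the covariance of a cluster's point coordinates. It assumes projected x/y coordinates (not latitude/longitude), uses the population covariance, and reports the rotation counterclockwise from the x-axis. GIS packages differ in scaling and angle conventions, so this should not be read as the exact computation used in the study; the function name and sample coordinates are hypothetical.

```python
import math


def standard_deviational_ellipse(points):
    """Return (center, (major, minor), angle_deg) for a list of (x, y) pairs.

    The semi-axes are one standard deviation long; angle_deg is the
    orientation of the major axis, counterclockwise from the x-axis,
    in the range [0, 180).
    """
    n = len(points)
    if n == 0:
        raise ValueError("at least one point is required")

    # Mean center of the point cloud.
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Population variances and covariance of the coordinates.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n

    # Closed-form eigenvalues of the 2x2 covariance matrix.
    half_trace = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    major = math.sqrt(half_trace + spread)
    minor = math.sqrt(max(half_trace - spread, 0.0))  # guard against rounding

    # Orientation of the major axis.
    angle_deg = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy)) % 180.0
    return (mean_x, mean_y), (major, minor), angle_deg


# Hypothetical cluster of story locations in projected coordinates.
cluster = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
center, axes, angle = standard_deviational_ellipse(cluster)
```

Comparing the center, elongation, and orientation of the ellipses for different story clusters is one way to summarize how each cluster is distributed across the historical map.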