Thursday, December 13, 2012

Spatial Analysis of Surnames

Today I'd like to present the research of Dr. James Cheshire of UCL (University College London), where he works as a Lecturer in Advanced Spatial Analysis and Visualization.

He has spent the last few years investigating the geography of family names (also called surnames). He works with the team that assembled the UCL Department of Geography Worldnames Database, which contains the names and geographic locations of over 300 million people in nearly 30 countries (a few of these are yet to be added to the website). His research has focussed on the 152 million or so people in Europe for whom the team has data, all of which come from publicly available telephone directories or electoral rolls. He also had access to a historical dataset for Great Britain in the form of the 1881 census. He has tried to answer two questions:

1. Is it possible to approximately establish the origin of a surname based on its modern day geographic distribution?

2. Are particular surnames more likely to be found together and if so do they form distinct geographic regions?
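
To make the first question concrete, here is a minimal sketch of one simple way to flag where a surname is unusually concentrated: the location quotient, which compares a name's share of a region's population with its share overall. This is an illustration only and is not necessarily the method used in the research; the region labels, the surname "Exampleson" and all counts are invented for the example.

```python
from collections import Counter


def location_quotients(counts, surname):
    """Return {region: location quotient} for one surname.

    counts maps a region name to a Counter of surname -> number of people.
    A quotient above 1 means the surname is over-represented in that region
    relative to the whole dataset.
    """
    total_people = sum(sum(c.values()) for c in counts.values())
    total_name = sum(c[surname] for c in counts.values())
    if total_name == 0:
        raise ValueError(f"surname {surname!r} does not occur in the data")
    overall_share = total_name / total_people

    quotients = {}
    for region, c in counts.items():
        region_total = sum(c.values())
        if region_total == 0:
            continue  # skip empty regions to avoid dividing by zero
        quotients[region] = (c[surname] / region_total) / overall_share
    return quotients


def likely_heartland(counts, surname):
    """Region with the highest location quotient (a rough guess at the origin)."""
    quotients = location_quotients(counts, surname)
    return max(quotients, key=quotients.get)


# Invented toy data: three hypothetical regions and two surnames.
counts = {
    "A": Counter({"Exampleson": 40, "Smith": 960}),
    "B": Counter({"Exampleson": 10, "Smith": 9990}),
    "C": Counter({"Exampleson": 5, "Smith": 4995}),
}

lq = location_quotients(counts, "Exampleson")
heartland = likely_heartland(counts, "Exampleson")
```

In this toy data the surname is most concentrated in region "A", even though that region has the smallest population. A modern concentration is only a hint at a historical origin, since migration can move a name's centre of gravity.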

In the past, surname research has involved a lot of manual work to create a detailed history of a particular name. With so many surnames in the database, he had to find automated, computational ways to do this. The patterns he produces are much more generalised than those from manual work (he finds broad patterns rather than specific genealogical facts), but they provide useful context for population genetics, migration, historical geography and demography. If you want to find out more about this research, here are the titles of the papers he has published in academic journals:

The Surname Regions of Great Britain.

Creating a Regional Geography of Great Britain Through the Spatial Analysis of Surnames.

Identifying Spatial Concentrations of Surnames.

People of the British Isles: A Preliminary Analysis of Genotypes and Surnames in a UK Control Population.

Delineating Europe’s Cultural Regions: Population Structure and Surname Clustering.

The map on the left is from the last paper listed above and shows how the surname regions vary across Europe. The map on the right shows how confident he is in the regions, based on the number of times they emerge in the cluster analysis.
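
The idea of judging confidence by how often a region emerges can be sketched as follows. This is a hedged illustration, not the procedure from the paper: it assumes the clustering has been repeated several times and simply measures, for each pair of areas, the fraction of runs in which they land in the same cluster. The area names and cluster labels are invented.

```python
from itertools import combinations


def co_assignment(runs):
    """Fraction of clustering runs in which each pair of areas shares a cluster.

    runs is a list of dicts, one per run, mapping area -> cluster label.
    Labels only need to be consistent within a run, because pairs are
    compared rather than the label values themselves.
    """
    areas = sorted(runs[0])
    together = {pair: 0 for pair in combinations(areas, 2)}
    for labels in runs:
        for a, b in together:
            if labels[a] == labels[b]:
                together[(a, b)] += 1
    return {pair: n / len(runs) for pair, n in together.items()}


# Invented results of four hypothetical clustering runs over four areas.
runs = [
    {"A": 0, "B": 0, "C": 1, "D": 1},
    {"A": 1, "B": 1, "C": 0, "D": 0},
    {"A": 0, "B": 0, "C": 0, "D": 1},
    {"A": 2, "B": 2, "C": 1, "D": 1},
]

stability = co_assignment(runs)
```

Pairs with a value near 1 belong together in almost every run, so a region built from them can be drawn with more confidence; pairs near 0 almost never group together.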

