Sunday, March 10, 2013

Onomastics 2.0

Several new articles have appeared in the area of applied onomastics, specifically social network onomastics.


Folke Mitzlaff and Prof. Gerd Stumme
from the Knowledge and Data Engineering Group (KDE), University of Kassel,
Wilhelmshöher Allee 73, D-34121 Kassel, Germany

Onomastics 2.0
The Power of Social Co-Occurrences

Abstract. Onomastics is “the science or study of the origin and forms of proper names of persons or places.” Personal names in particular play an important role in daily life, as future parents all over the world face the task of finding a suitable given name for their child. This choice is influenced by different factors, such as the social context, language, cultural background and, in particular, personal taste.
With the rise of the Social Web and its applications, users increasingly interact digitally and participate in the creation of heterogeneous, distributed, collaborative data collections. These sources of data also reflect current and new naming trends as well as newly emerging interrelations among names.
The present work shows how basic approaches from the fields of social network analysis and information retrieval can be applied to discover relations among names, thus extending onomastics with data mining techniques. The considered approach starts by building co-occurrence graphs from Social Web data, one for given names and one for city names. As a main result, correlations are observed between semantically grounded similarities among names (e.g., geographical distance for city names) and structural, graph-based similarities.
The discovered relations among given names are the foundation of the Nameling, a search engine and academic research platform for given names which attracted more than 30,000 users within four months, underpinning the relevance of the proposed methodology.

Keywords: Inter-Network Correlations, Onomastics, Named Entities, Entity Relation Analysis, Given Names, Network Analysis, Vertex Similarity
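
To make the idea of a co-occurrence graph and a structural (vertex) similarity more concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy sentences, the set of names, the function names build_cooccurrence and cosine_similarity, and the choice of cosine similarity over weighted neighbourhoods are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_cooccurrence(sentences, names):
    """Count how often two known names appear in the same sentence.

    Returns a symmetric weighted adjacency map: weights[a][b] is the
    number of sentences that contain both a and b.
    """
    weights = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        # Naive tokenisation (an assumption): split on whitespace,
        # strip basic punctuation, lowercase.
        tokens = {tok.strip(".,;:!?").lower() for tok in sentence.split()}
        present = sorted(tokens & names)
        for a, b in combinations(present, 2):
            weights[a][b] += 1
            weights[b][a] += 1
    return weights


def cosine_similarity(weights, a, b):
    """Cosine similarity between the weighted neighbourhoods of a and b."""
    va, vb = weights.get(a, {}), weights.get(b, {})
    dot = sum(w * vb[n] for n, w in va.items() if n in vb)
    norm_a = sqrt(sum(w * w for w in va.values()))
    norm_b = sqrt(sum(w * w for w in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data, standing in for text from the Social Web.
NAMES = {"anna", "emma", "peter", "paul"}
SENTENCES = [
    "Anna and Emma met Peter.",
    "Emma and Peter left.",
    "Anna saw Paul.",
    "Emma saw Paul.",
]

graph = build_cooccurrence(SENTENCES, NAMES)
similarity = cosine_similarity(graph, "anna", "emma")
```

Two names count as similar here when they co-occur with the same other names at similar rates, even if they rarely co-occur with each other. In a real setting one would compare such a structural score against an external, semantically grounded measure, as the abstract does with geographical distance for city names.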

If you want to read it, follow

Recommending Given Names
Mining Relatedness of Given Names 
based on Data from the Social Web

All over the world, future parents are facing the task of finding a suitable given name for their child. This choice is influenced by different factors, such as the social context, language, cultural background and especially personal taste. Although this task is omnipresent, little research has been conducted on the analysis and application of interrelations among given names from a data mining perspective. The present work tackles the problem of recommending given names by first mining for inter-name relatedness in data from the Social Web. Based on these results, the name search engine “Nameling” was built, which attracted more than 35,000 users within less than six months, underpinning the relevance of the underlying recommendation task. The accruing usage data is then used for evaluating different state-of-the-art recommendation systems, as well as our new NameRank algorithm, which we adopted from our previous work on folksonomies and which yields the best results, considering the trade-off between prediction accuracy and runtime performance as well as its ability to generate personalized recommendations. We also show how the gathered inter-name relationships can be used for meaningful result diversification of PageRank-based recommendation systems. As all of the considered usage data is made publicly available, the present work establishes baseline results, encouraging other researchers to implement advanced recommendation systems for given names.

Start reading under:

Relatedness of Given Names

As a result of the author’s need for help in finding a given name for an unborn baby, the Nameling, a search engine for given names based on data from the “Social Web”, was born. Within less than six months, more than 35,000 users accessed Nameling with more than 300,000 search requests, underpinning the relevance of the underlying research questions.
The present work proposes a new approach for discovering relations among given names, based on co-occurrences within Wikipedia. In particular, the task of finding relevant names for a given search query is treated as a ranking task, and the performance of different measures of relatedness among given names is evaluated with respect to Nameling’s actual usage data. We will show that a modification of the PageRank algorithm overcomes limitations imposed by global network characteristics on preferential PageRank computations.
By publishing the considered usage data, the authors aim to stimulate the research community to develop advanced recommendation systems and to analyze the factors that influence the choice of a given name.
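
For readers unfamiliar with preferential (personalized) PageRank, the following Python sketch shows the standard baseline on a small weighted co-occurrence graph, with the random walk always restarting at the query name. It is not the modified algorithm announced in the abstract, whose details are not given here. The toy graph, the function name personalized_pagerank, the damping factor of 0.85 and the fixed iteration count are assumptions for illustration.

```python
def personalized_pagerank(graph, query, damping=0.85, iterations=50):
    """Power iteration for PageRank with all restarts going to `query`.

    `graph` maps each node to a dict {neighbour: edge weight}; every
    neighbour is assumed to be a key of `graph` as well.
    """
    nodes = list(graph)
    rank = {n: (1.0 if n == query else 0.0) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            total = sum(graph[n].values())
            if total == 0:
                # Node without outgoing weight: hand its mass back to the query.
                nxt[query] += damping * rank[n]
                continue
            for m, w in graph[n].items():
                # Follow an edge with probability proportional to its weight.
                nxt[m] += damping * rank[n] * w / total
        # Restart step: with probability 1 - damping, jump back to the query.
        nxt[query] += 1.0 - damping
        rank = nxt
    return rank


# Hypothetical co-occurrence weights between four given names.
TOY_GRAPH = {
    "anna": {"emma": 1, "peter": 1, "paul": 1},
    "emma": {"anna": 1, "peter": 2, "paul": 1},
    "peter": {"anna": 1, "emma": 2},
    "paul": {"anna": 1, "emma": 1},
}

scores = personalized_pagerank(TOY_GRAPH, "anna")
ranking = sorted(scores, key=scores.get, reverse=True)
```

Sorting the other names by score gives a query-specific ranking. In a graph of this kind, names with very many co-occurrences can still receive high scores for almost any query, which is one way global network structure can influence such a ranking.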

Keep reading here: