As a sample use case for the Gendre API, we’ve created an open-source Android application, available on GitHub, that enriches Android contacts with classic Title information (Mr., Ms., …) or the more iconic Gender (♀, ♂, ∅) and Heart (♥, ♤, ♢) symbols, inferred from the contact name.
To recognize the gender of a name, the statistical approach works really well, and we’ve deployed a free API with more than half a million unique names to deliver excellent results. But I have seen objections as to whether that statistical approach can work globally (“But I don’t think it is feasible to cover all the names in the world. – Ramesh”).
We think we can cover all the names in the world by combining different approaches, including sociolinguistics and machine learning, and we have a roadmap to do that.
For example, in most countries, the gender is ‘encoded’ in the first name (John, Isabel, …), but in other countries it is encoded in the last name (O. Sokolova is probably Slavic or of Slavic origin, and female).
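To make the last-name idea concrete, here is a minimal sketch in Java. It is not how the Gendre API works internally: the class name `SurnameGenderHint` and the suffix lists are assumptions made up for this illustration, and the lists are deliberately incomplete. The rule is only meaningful once a name is already believed to come from a culture with gendered surnames; applied blindly, it would mislabel names such as Martin or Medina.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the Gendre API or the GENDRE App.
final class SurnameGenderHint {

    enum Gender { FEMALE, MALE, UNKNOWN }

    // Assumed, incomplete suffix lists for transliterated Slavic surnames.
    private static final String[] FEMALE_SUFFIXES = { "ova", "eva", "ina", "skaya" };
    private static final String[] MALE_SUFFIXES = { "ov", "ev", "in", "sky" };

    // Guesses a gender from the surname ending; returns UNKNOWN when no suffix matches.
    static Gender guess(String lastName) {
        if (lastName == null) {
            return Gender.UNKNOWN;
        }
        String name = lastName.trim().toLowerCase(Locale.ROOT);
        // Female suffixes are checked first because they are the longer, more specific endings.
        for (String suffix : FEMALE_SUFFIXES) {
            if (name.endsWith(suffix)) {
                return Gender.FEMALE;
            }
        }
        for (String suffix : MALE_SUFFIXES) {
            if (name.endsWith(suffix)) {
                return Gender.MALE;
            }
        }
        return Gender.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(guess("Sokolova")); // prints FEMALE
        System.out.println(guess("Sokolov"));  // prints MALE
        System.out.println(guess("Smith"));    // prints UNKNOWN
    }
}
```

A rule like this could complement the statistical lookup when a first name is missing or reduced to an initial, as in “O. Sokolova”.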
Rare names (or invented names) are also difficult to classify using the statistical approach, but we can guess their likely gender by looking at whether they ‘sound’ male or female according to a particular culture (again, having the last name is critical to pin down a particular culture/locale).
The current GENDRE App features are:
- The gender prediction runs as a background service (every 10 seconds, 1 minute, 10 minutes or 1 hour)
- You can choose between three Title formats: Classic (Ms., Mr., M.), Gender (♀, ♂, ∅), Heart (♥, ♤, ♢)
- Your existing Title data (Mr., Dr. etc.) is not overwritten, unless you specifically request a wipe
- Once all contacts are genderized, the App shows a summary of how many Female / Male contacts were detected
- You can share this #funstat on Twitter, if you like
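The Title format choice and the no-overwrite rule from the list above can be sketched in a few lines of Java. This is an illustration, not the app's actual code: `TitleFormats`, `TitleFormat`, `titleFor` and `resolveTitle` are hypothetical names, and the sketch assumes each triple is ordered female, male, unknown. The symbols are written as Unicode escapes so the source stays plain ASCII.

```java
// Hypothetical types for illustration only; the real GENDRE App code may differ.
final class TitleFormats {

    enum Gender { FEMALE, MALE, UNKNOWN }

    enum TitleFormat {
        // Assumed order within each triple: female, male, unknown.
        CLASSIC("Ms.", "Mr.", "M."),
        GENDER("\u2640", "\u2642", "\u2205"), // female sign, male sign, empty set
        HEART("\u2665", "\u2664", "\u2662");  // heart, spade outline, diamond outline

        private final String female;
        private final String male;
        private final String unknown;

        TitleFormat(String female, String male, String unknown) {
            this.female = female;
            this.male = male;
            this.unknown = unknown;
        }

        // Returns the title string this format uses for the given gender.
        String titleFor(Gender gender) {
            switch (gender) {
                case FEMALE: return female;
                case MALE: return male;
                default: return unknown;
            }
        }
    }

    // Keeps a non-blank existing title unless the user explicitly asked for a wipe.
    static String resolveTitle(String existingTitle, Gender gender,
                               TitleFormat format, boolean wipe) {
        boolean hasTitle = existingTitle != null && !existingTitle.trim().isEmpty();
        if (hasTitle && !wipe) {
            return existingTitle;
        }
        return format.titleFor(gender);
    }

    public static void main(String[] args) {
        System.out.println(resolveTitle("Dr.", Gender.FEMALE, TitleFormat.CLASSIC, false)); // prints Dr.
        System.out.println(resolveTitle(null, Gender.FEMALE, TitleFormat.CLASSIC, false));  // prints Ms.
        System.out.println(resolveTitle("Dr.", Gender.MALE, TitleFormat.CLASSIC, true));    // prints Mr.
    }
}
```

Keeping the overwrite decision in one small function makes the “never touch existing data unless asked” rule easy to check in isolation.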
You can find the GENDRE App on GitHub (https://github.com/namsor/gendreapp). Feedback and open-source contributors are both welcome.