Thursday, April 10, 2014

U.S. baby names: Variations on a theme, girls named after months, and Britney

U.S. baby names: Variations on a theme, girls named after months, and Britney - Prooffreader



In my last two forays into the U.S. Social Security Administration baby names database, I explored the extent to which the data was skewed towards adults before social security numbers were introduced in 1935, a change that gave parents more incentive to register their children. It later occurred to me that there is an important, and verifiable, difference between the names of adults and babies: nicknames. While things have become less formal nowadays, in the '30s I'm guessing it was a rare baby who was called "Larry" from the get-go.

A stacked-area or stream graph gives a good view of the distribution of the name "William" and its major variants (it has plenty of other minor variants like "Willy" or "Willem" which were not included):


The pattern takes a little thought to interpret: there's a big dropoff in the adult nicknames "Will" and "Willie" around 1914; people born that year would have been 21-year-olds in 1935. A greater proportion of minors would officially identify themselves as "William". "Billy" sees a big increase from 1920 to 1932; were people emulating the popular radio star Billy Jones? I suspect what's going on is more subtle -- and chilling. If you read about World War I, there seem to be a lot of Billys who crop up, notably Billy Mitchell and Billy Bishop; the Google Ngram Viewer shows a spike in mentions of "Billy" in printed matter during World War I and a fall afterwards, the opposite of what we see above.

In 1935, adults who applied for a social security number had to have survived World War I. I think what we're seeing is Billys who were too young to have perished in the war aging, getting their first job and applying for a social security number. Billy wasn't unpopular before 1920; it's just that a lot of them died without leaving a record. Note that the proportions between name and nicknames stay relatively constant after 1932.

Let's cheer ourselves up a bit with another name, a girl's name this time (and thus less vulnerable to WWI), and one with two variants right off the bat: Rachel. The huge explosion of popularity of the common spelling in the 1970s makes it a little difficult to see some of the less common versions, so I've added a normalized graph, as well (with all forms of 'Rachel' adding up to 100%, so it's no longer a graph of name popularity, but of variant trends).
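For readers who want to try the normalization themselves, here is a minimal sketch of the idea in Python. It is not the code behind the graphs in this post; the function name `normalize_by_year`, the nested-dictionary layout and the counts are all made up for illustration. It simply divides each variant's count by the total across all variants for that year, so every year adds up to 100%.

```python
def normalize_by_year(counts):
    """Convert {variant: {year: births}} into {variant: {year: percent}}.

    Hypothetical helper for illustration; a variant with no entry for a
    given year is treated as having zero births that year.
    """
    # Collect every year that appears for any variant.
    years = sorted({year for per_year in counts.values() for year in per_year})

    # Total births across all variants, per year.
    totals = {
        year: sum(per_year.get(year, 0) for per_year in counts.values())
        for year in years
    }

    shares = {}
    for variant, per_year in counts.items():
        shares[variant] = {
            # Guard against a year with no births at all.
            year: (100.0 * per_year.get(year, 0) / totals[year]) if totals[year] else 0.0
            for year in years
        }
    return shares


# Made-up counts, NOT real SSA figures.
example = {
    "Rachel": {1900: 90, 1975: 900},
    "Rachael": {1900: 10, 1975: 75},
    "Rochelle": {1975: 25},
}
shares = normalize_by_year(example)
```

With these invented numbers, "Rachel" accounts for 90% of the total in both years even though its raw count is ten times larger in 1975, which is exactly the distinction between popularity and variant trend that the normalized graph is meant to show.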




Before 1900, the only variant, making up about 5% of the total, is "Rachael", and here's an interesting etymology: basically, parents liked the baroque feel of the name "Michael" and copied it. In Hebrew, the vowel before the "l" in Rachel and Michael is different; this isn't a transliteration, it's an adaptation for aesthetic reasons. The same thing starts to happen later with some girls being named "Racheal". (As an aside, I know someone named Micheal; he rues the irony that his parents wanted it to be easier to spell, but instead ensured everyone misspells his name. One time after a few too many libations I asked him if it was possible his parents were just poor spellers; he claims not.)

Once again we see a huge discontinuity in 1935 with the name Rochelle (which isn't etymologically linked to Rachel, but it's still pretty similar); my guess is this is an artifact of parents for the first time signing up young children before they had a chance to die of childhood illnesses, but that hypothesis would require further testing.

By the way, there are 79 versions of Rachel in the database (here's my first, failed attempt to graph them all); what you see here is the top 16, with all the rest lumped together as "Other", including Rchel (which is, I suspect, actually R'chel. Could be inspired by Hebrew, could be inspired by Klingon. I leave you to judge what, if any, trauma will occur to baby girls named Ratchel.)
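The "top 16 plus Other" grouping can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the code that produced the graph: the function name `lump_minor_variants`, its parameters and the totals are invented, and ties are broken alphabetically just to keep the result deterministic.

```python
def lump_minor_variants(totals, top_n=16, other_label="Other"):
    """Keep the top_n most common variants and sum the rest into one bucket.

    Hypothetical helper for illustration. `totals` maps each spelling to
    its overall count.
    """
    # Rank by count (descending), then by name so ties are deterministic.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))

    kept = dict(ranked[:top_n])
    remainder = sum(count for _, count in ranked[top_n:])

    # Only add the catch-all bucket if something actually fell into it.
    if remainder:
        kept[other_label] = remainder
    return kept


# Made-up totals, NOT real SSA figures.
example_totals = {
    "Rachel": 500,
    "Rachael": 60,
    "Rochelle": 40,
    "Racheal": 5,
    "Rchel": 1,
}
lumped = lump_minor_variants(example_totals, top_n=3)
```

Here `top_n=3` stands in for the 16 used in the graph; the two rarest spellings end up in a single "Other" entry, so the stacked areas still add up to the full total.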

Stream or stacked-area graphs are a good way of exploring certain patterns, such as girls named after months:

A few observations: (1) The only month missing is February. (2) Spider-Man's Aunt May is well named; when the comic started in the 1960s, the name was already old-fashioned. (3) Apparently there are a bunch of 30-year-olds named April whom I've never met. (4) January Jones's name is not as uncommon as I thought; statistically, at least one of them must be a good actor. (5) I would not have predicted that the first unusual month name would be "September", nor that it would have started in the 1950s. (6) Who names a girl "March"?!

I've gotten good feedback (read: Tumblr reposts) from my graphs of Heather and Sigourney, so I'll leave you with Britney:

