Typo Popularity Tracking with Google

Armed with a list of spelling errors and my old friend Google, I decided to see if I could find the most commonly misspelled word on the Web. If you can do better, leave a comment. (The number of results is in parentheses after each word or term.)

transexual (2860k)

didnt (1230k, via Matt)

doesnt (1080k, via Evan)

seperate (804k, via Bill)

calender (727k, via Graham)

definately (693k, via Shannon)

recieve (667k, via Matt)

offical (366k)

managment (359k)

goverment (317k)

commerical (277k)

Febuary (245k)

enviroment (242k)

occurence (186k)

commision (167k)

assocation (134k)

Cincinatti (70k)

milennium (32k)

Special mention: “could of” (166k results), “would of” (296k), “should of” (123k)

Can anybody find a misspelled word that’s more popular than its correct spelling? Update: We have a winner! Ewin found that “transexual” (2860k) is the more common (but incorrect) spelling of “transsexual” (1660k)!

Comments

    A short and extremely commonly-used word like “the” might be the most popular, but I’m not going to go through the 434,000 references to “teh” in Google and remove the invalid ones.

    According to Dictionary.com, thru is informally an English word.

    Heh. I should’ve checked when I got that results page, huh. (I don’t buy it, though — that’s just not a correct spelling, even if it isn’t an accidental misspelling….)

    Interestingly, also, referer, which Dictionary.com defines as a misspelling but does define (W3C standard, &c.), gets 502k.

    recieve gets 667k, a decent amount (though obviously not tops).

    And kewl, surprisingly, only gets 324k.

    I found two words where the misspellings are more common than their proper spellings, but it doesn’t count because the misspelled words have a distinctly different meaning than their proper counterparts: warez and appz.

    I always have trouble with maintenance; 52k for “maintenence” and 121k for “maintainance” say others do too…

    Ocurred (494k) always gets me. I have problems with any words that have double letters.

    Gah! Just seeing all these spelling errors makes me want to poke my eyes out. I bet I’d win if “teh” counts. (953k) Except that people spell it that way intentionally now, and it’s more of a typo than a spelling error.

    Coming in near the bottom of the list is “dosent”, with 52K. If you throw in two other misspellings of the same word (“doesent”, with 10.8K, and “dosnt”, with 16.4K), we’ve got a grand total of 79.2K misspellings of a pretty common word. Of course, if we really stretch the rules and add in the version without the contracting apostrophe, that comes in with 1,080,000 hits!
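The tally in the comment above can be checked with a minimal Python sketch. The counts (in thousands of hits) are the ones quoted in the comment; the variable names are purely illustrative.

```python
# Hit counts in thousands, as quoted in the comment above.
variant_hits_k = {"dosent": 52.0, "doesent": 10.8, "dosnt": 16.4}

# Total for the three misspellings, rounded to one decimal place
# to avoid floating-point noise in the sum.
total_k = round(sum(variant_hits_k.values()), 1)

# The apostrophe-less "doesnt" count, also from the comment.
doesnt_k = 1080.0

print(f"three variants combined: {total_k}K")
print(f"'doesnt' alone is about {doesnt_k / total_k:.0f} times larger")
```

The sum confirms the 79.2K figure, and shows how far the apostrophe-less form outstrips the other variants.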

    I’m not sure whether this would count, but ‘it’s’ and ‘its’ do tend to get mixed up quite a lot. That’s not so much incorrect spelling as it is confusion of context.

    (This seemed to be lurking in the back of my mind still….) I remembered you can restrict searches by language; restricted to English-language web pages, adress comes in at 483k. Similarly restricted, seperate comes in at 484k, at this moment, and calender gets 398k….

    No doubt there are many English-language web pages this search restriction rules out, that are written in English but aren’t specified as such in the header. But maybe it’s more interesting to include foreign-language misspellings as well (though not in the case of ‘adress’, which (of course) actually is spelled that way in other languages). And I wonder what figures would be like for common misspellings in other languages….

    Also, I dunno about the missing apostrophe thing. But there’s also theres, 703k; and dont, 12,700k.

    judgment is kind of iffy. so many people spell it judgement that it’s now part of popular (and acceptable, according to dictionary.com) lexicon.

    Probably difficult to track, but the number of times I read “loose” when the writer meant “lose” is astounding. Even worse is its longer form: “looser,” as in, “She’s kewl, but he’s a total looser.”

    acount gets 97k.

    ammount comes in with 49k.

    One I always seem to mess up, herf (as in A HERF=) only gets 1600. Either I’m the only one that does that, or lots of other people do, but we all notice the error, ’cause it doesn’t make a link.

    Here’s one for you: alright @ 1,380K. This is a commonly used non-word that should technically be spelled “all right,” as two words. “Already” and “all ready” have two separate meanings, but “all right” is correct while “alright” is just plain wrong.

    how about the ever popular becasue? You’d think that was obvious in print, but 89K sites would beg to differ…

    My all-time favorite net-typo: the misspelling of lesbian as “lesbain”. (~339,000 instances on Google)

    It sounds rather like a detective in a series of mystery novels. The Purloined Toolbelt: A Les Bain Mystery

    I suspect “referer” is that common because of the “HTTP_REFERER” attribute.

    “Millenium” produces over 1.6 million results; it’s a much more common misspelling than the one you tried. Result #2 is particularly noteworthy.

    Just to be pedantic, since this is all about the pedantry …

    Note that Google usually ignores punctuation in searches, as far as I can tell.

    Also, a “calender” is a machine used to smooth paper or other materials during the manufacturing process, so it’s an actual word. Not a frequently used one, to be sure, but it should account for some of the results on that search.

    verbal.

    One of the most common Internet misspellings I encounter is “lose” (or related words like “loser”) misspelled as “loose”. Since “loose” is also a common word (though a completely different one), it’s hard to find the frequency of this misspelling through a search, but it’s very widespread in discussion forums (newsgroups, mailing lists, etc.).

    “totaly” gets 225K.

    i’d have to second the comments about “loose” being used way too often instead of “lose”. i am not a prescriptivist, but this one grates on my nerves.

    Gonna (6,730K), wanna (5,060K), and gotta (3,670K) are pretty common. Not surprisingly, they appear together frequently (475K).

    Less common but more amusing: duude (1.3K), duuude (2.9K), duuuude (2.5K), duuuuude (2.3K), duuuuuude (1.2K), duuuuuuude (0.5K), etc.

    How about Virgina (240K), West Virgina (30K), and Washinton (62K)? Or Los Vegas (25K) and Las Angeles (8K)?

    Here’s one that I’ve always been amused by. Try this search: site:.gov “untied states”

    You’ll get 1290 results. Wow.

    “Nite” had more than its fair share…and, just for the sake of lovely irony, “libary” didn’t do too badly either.

    Try my favorite: “Untied States.” It will even pull up some government sites, as well. (I realize this is not a true spelling mistake, more of a typing mistake — but interesting nonetheless.)

    Interestingly, ‘mispelled’ only has 24,600 hits, but the very first one is a link to a list of commonly “mispelled” words! They’re awfully strange words to misspell, too. Here are some contributions, mostly from that list:

    I’m surprised no one tried “suprise” yet, which has 213k, and “suprised” brings in 220k. “embarassed” has 268k. “beutiful” has a remarkable 133k, considering I’ve never seen it before.

    The word “referer” has 588k hits, so large a number only because it was misspelled this way in the HTTP RFC and has since worked its way into a large number of technical documents. So maybe this doesn’t count.

    “tatoo” has 423k, but it seems this might be a correct spelling in some languages.

    Oops.. some of those which I thought had been missed were actually just not showing up when I was searching in the page. Doh!

    How about “your” mistakenly used for “you’re”, and their/there/they’re? I seem to see all of these quite a lot. Is there any way of finding them on Google or elsewhere?

    “Febuary” – the place where people spend 8% of their lives (197,000 hits)

    its always fun using referrer in english, and in some code, but then suddenly having to spell referrer referer-

    I just wish we could stick with referrs to 😉

    sorry – im being a mega smartass tonight.

    ‘Stoopid’ gets my vote (87K). Aaah, the irony. Admittedly, some use this spelling intentionally.

    my pet hate is people using apostrophes when there is no need for them…

    such as Your’s sincerely

    This occurs frequently on websites and in newspapers (and it’s disgraceful!). Send these people back to skool at once. 🙂

    And a misspelling more common than the correct spelling: warez at 12 million vs wares at 1.2 million.

    Granted the misspellings are probably intentional.

    How about past tense of “lead” as “lead?” The correct past tense is “led” as in Zeppelin. I guess the pronunciation of the metal “lead” (Pb) is confusing the issue. I don’t know how to set up a Google search that would resolve whether “lead” as past tense is more common than “led” on the Web, but I suspect it is.

    A few that should be on the list are:

    “web sight” vs “web site”

    “manuel” vs “manual” (a book)

    “slowle” vs “slowly”

    “sperate” vs “separate”

    “there” vs “their”

    “a” vs “an” vs “and”

    “I’am” is an annoying one, and what’s with “tonite”? Is that just laziness–save one letter?

    “then” for “than”

    I second the unnecessary use of apostrophes… it kills me when people try to look smart and end up showing you how dumb they really are.

    How ’bout “leverages” (211k) — a nonexistent verb that appears in almost every marketing document produced over the past 10 years.

    I realise some people will not like this, but I don’t mind “warez” or “prolly” so much, the former is a new word, in my opinion…oh…how about “pacific” used instead of “specific” I hear “pacific” almost exclusively and I’m not talking about the ocean either – (maybe people are just not aware it’s a separate word? )- or is that just here in the UK??

    The most common error that I come across is the use of the word “to” where it should be “too”.

    1. Surprised not to see “bookeeper” on the list.

    2. A couple of context-sensitive bugaboos of mine: “You” instead of “your” as in “send in you comments.”

    “That” instead of “than” as in “We got more that a hundred replies.”

    One I see a lot is monitor, misspelled “moniter”.

    Google gives 45,800 results.

    One of my pet peeves is people who think the possessive pronoun its is spelled it’s.

    You can’t simply google on “it’s” because you get the legitimate contraction of “it is.”

    However, you can google on a preposition with “it’s,” which is usually indicative of incorrect usage (the exception is usually the possessive form of the acronym IT=”Information Technology”).

    A partial list:

    “With it’s” gets 559K

    “Through it’s” gets 103K

    “By it’s” gets 164K

    “Of it’s” gets 768K

    “gets it’s” gets 23K

    This is only a sampling … if you add up these alone, you get 1617K misspellings, putting “it’s” at number two on the list behind “transexual!” A few examples follow:

    ZDNet Awards E-Book Systems with it’s Highest Ranking

    Vision of Atrias: A world as seen through it’s people

    You Can’t Judge a Town by it’s Railway Station – Wareham, England …

    InterAKT releases new versions of it’s MX products – Dreamweaver

    digital-sea.com Downloads : Where the net gets it’s software.

    PS … The Oxford English Dictionary lists “transexual … see: transsexual.”
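The preposition trick described in this comment can also be tried on text you already have locally. The following is a rough Python sketch, not a reproduction of how Google matches phrases: the lead-word list comes from the partial list above, the function name `count_suspect_its` is hypothetical, and a hit is only a likely error (the “IT’s” exception noted above still applies).

```python
import re
from collections import Counter

# Lead words taken from the partial list in the comment; extend as needed.
LEAD_WORDS = ["with", "through", "by", "of", "gets"]

# A lead word followed by "it's", with a straight or curly apostrophe.
PATTERN = re.compile(
    r"\b(" + "|".join(LEAD_WORDS) + r")\s+it['\u2019]s\b",
    re.IGNORECASE,
)

def count_suspect_its(text):
    """Count likely possessive "it's" errors, keyed by the preceding word."""
    return Counter(m.group(1).lower() for m in PATTERN.finditer(text))

# Sample text: result titles quoted in the comment, plus one clean sentence.
sample = "\n".join([
    "ZDNet Awards E-Book Systems with it's Highest Ranking",
    "Vision of Atrias: A world as seen through it's people",
    "You Can't Judge a Town by it's Railway Station",
    "Where the net gets it's software.",
    "It's raining, and the cat licked its paw.",
])
counts = count_suspect_its(sample)
print(counts)

# The hit counts reported in the comment, in thousands.
reported_k = {"with": 559, "through": 103, "by": 164, "of": 768, "gets": 23}
reported_total_k = sum(reported_k.values())
print(f"reported total: {reported_total_k}K")
```

The last sentence of the sample uses both forms correctly and is not flagged, which is the point of anchoring the search on a preceding word.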

    As to [quote] judgment is kind of iffy. so many people spell it judgement that it’s now part of popular (and acceptable, according to dictionary.com) lexicon.

    posted by brent on April 8, 2003 08:05 PM

    [/quote] if you check Black’s Law Dictionary you will find that judgement is the older word and has been in use for several hundred years.

    Did anyone notice on CNN just last evening (Mon. April 14), where they run supers that summarize what’s being discussed (as opposed to the crawl along the very bottom), someone typed in that chemical facilities recently uncovered in Iraq appeared to be “duel use”. About a half hour later it was repeated, but this time with the correct spelling. The first version was certainly ironic, though — kind of a Freudian typing slip.

    Ya’ll (for y’all): 239K

    cum: 4,230K, although some indeterminable number were for the Latin word meaning “with”, despite a site restriction to English language.

    In addition, there is perhaps inevitable confusion of phrases derived from the era when horses were commonplace: “free reign” (rein); “he bridaled at this” (bridled); “chomping at the bit” (champing). No way to count this that I can think of.

    Finally, to the site/sight confusion mentioned above I would add that “cite” is often mixed up with the first two.

    Now that orthography is widely sneered at as petty and pedantic we must expect more of this. Spelling is left to so-called spell checkers, although one immortal comment covers that:

    “The jargonny term spell checker is an affront to the English language–unless you really WANT to check for spells, curses and incantations.”

    404k : pasword for password

    310k : thankyou not thank you

    217k : absolutly for absolutely

    138k : rythm for rhythm

    132k : adminstration for administration

    132k : intrest for interest

    94k : rithm for rhythm

    82k : cancelation for cancellation

    45k : reciept for receipt

    45k : refferal for referral

    34k : proceedure for procedure

    Almost always a typo rather than a misspelling I should think and a particular bugbear of mine, but anyway “remeber” (for “remember”) weighs in at a creditable 166k

    Recently, a FOOD LION press release on the web had the President’s name spelled “BUSCH.” It was a NASCAR related release, and “BUSCH” is a great racing word!

    And there’s always CONGRADULATIONS with 40k hits on Google…

    My personal problem is using THEN instead of THE. (I’m a programmer, and am always typing “IF…THEN”) The spellchecker can’t even catch that one.

    Robert,

    Programmers don’t type IF…THEN if they’re programming in any respectable language. 😉

    I found a good one: “omelet” (122k) vs. “omelette” (145k). Close, close!

    I rather doubt that most of the appearances of “cum” were misspellings of the word “come,” considering, uh, some of the most popular Internet subject matter.

    I think “transexual” should be disqualified because it is the correct spelling, apparently, in Spanish and probably several other languages. Even if you search just in English, “transexual” outnumbers “transsexual,” however. It is worth noting that several of the pages that show up in a GS for “transexual” actually only contain the correct spelling; this is because a large number of people link to them with the incorrect spelling. So Google is really an imperfect tool for this sort of analysis.

    Not sure if anyone cares any more, but in reference to “teh” mention of lesbain way up top, my high school librarian was named Les Bain… rest assured, we all had fun with that one. 🙂

    “Forth Worth” is very common in print as well as on the Web. Someone with proper skills could find its frequency. It is repeated in the Dallas-Ft. Worth lawyer lists and appears to be std. in Spanish for DFW airport.

    Something about the 2nd word attracts imitation in the first, or perhaps it’s just the habitual following of “t” by “h.”

    A common one that’s slipped through the cracks so far is “developement” at 240K.

    I suppose this one is debatable. The genre is correctly called thrash metal, not trash metal (as a styles search on allmusic.com should reveal), though the alternate spelling seems to be widely accepted.

    jewlery (145K)

    perscription (83K)

    attornies (12K)

    I’m not sure if the following misspellings “count” because I’d guess that, in most cases, the misspellings were intentional. There are some big numbers though:

    fone (879K)

    quik (676K)

    skool (289K)

    micro$oft (166K)

    And as for the following, there aren’t a lot, but the fact that there are ANY is funny:

    mispelling (9K)

    mispellings (9K)

    neil,

    I did so much BASIC programming in the 80’s that my fingers still want to type an “N” after T-H-E. But these days I do use the “||” a lot more from scripting, and tend to type that in messages when I really mean THEN. Rambling on >> I need a cheat sheet just to switch gears from HTML, XML, VB, C++, SQL, and UNIX/Linux. (Is it END IF, IFEND or just ‘}’ ??) I love the ‘case’ … ‘esac’ syntax – that makes a lot more sense than most. I’m gonna use if…fi, while…elihw, loop…pool and read…dear (for writing) when I get around to writing the script language in MYOS. Look for it soon!

    You need two things: a word that is commonly misspelled, and a word that is common on the internet. Following the example of “transexual”, I tried “beastiality”. 809K results. Unfortunately, the correct spelling “bestiality” had more: 894K. I guess animal lovers are better spellers than she-male lovers.

    unfortunatly (130k)

    I always misspell that one.

    I went to look for other studies of commonly misspelled words and came across A Study of Some of the Most Commonly Misspelled Words. This was done on Usenet back in the Deja days.

    I also found a little quiz with some goodies on it:

    reccomend (198k)

    independant (483k)

    Oh, and the big momma? millenium, which is misspelled 1.7 million times, compared to the measly milennium, which garners only 32k hits. (Funny, some of the hits for this misspelling are targeted to find people who misspell. If you click on the Gates foundation, for example, they actually give you the right spelling after a quick redirection.)

    Great fun. 🙂

    how about “wierd” (correct = weird)… i just got 431,000 incorrect ones, but i didn’t check every single one!

    aparently (35k)

    misstake (6k)

    “loose” instead of “lose” and “looser” instead of “loser” are very frequent, but it’s hard to tell how many are actual misspellings.

    yah umm have u ever noticed that people with poor grammar always put “What is up?” instead of “What’s up?” because when they spell that it always comes out “Whatup”

    I begin to wonder how many extra misspellings were added to these counts from this conversation.

    “miniscule” hits 116K. Google does not question this, but for “majescule” suggests the possible substitution of “majuscule.”

    I searched on Google and found 354,000 hits for “realy,” a misspelling of “really.” Some of these might be last names…

    alot…2420k, doesn’t quite beat transexual, but seems to have second place locked up.

    Posted by Joe on April 18, 2003 09:59 AM:

    You need two things: a word that is commonly misspelled, and a word that is common on the internet. Following the example of “transexual”, I tried “beastiality”.

    Along those lines, “amature” gets a solid 3,000k, although “amateur” pulls in double that. oh well.

    Following up on Xian’s. . .

    masterbation 1,210,000

    masturbation 6,420,000

    1 out of every 6 people spells it wrong?
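The “1 out of every 6” estimate can be checked with a short Python sketch. It uses only the two counts quoted above and assumes, loosely, that each hit stands for one act of spelling the word; hit counts measure pages, not people, so treat the result as a rough gauge.

```python
# Hit counts quoted in the comment above.
misspelled_hits = 1_210_000
correct_hits = 6_420_000

# Share of all hits (either spelling) that use the misspelling.
misspelled_share = misspelled_hits / (misspelled_hits + correct_hits)

# Expressed as "one in N" hits.
one_in_n = 1 / misspelled_share

print(f"misspelled share: {misspelled_share:.1%}")
print(f"roughly 1 in {one_in_n:.1f} hits")
```

On these numbers the misspelling accounts for a little under 16% of hits, so “1 in 6” is a fair rounding.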

    My favorite misspelled phrase:

    “pubic library” 1460 (I work in one–er, a public library)

    A word that I see commonly misused is the word “further” when it should be “farther”. Many people do not know the difference.
