Within the shortcuts category, we classified slang words commonly found on the Web as insider words (e.g., a hottie is a very attractive and desirable person; fugly is something or someone that is extremely ugly). We classified words regularly shortened by removing at least one phoneme or two morphemes as abbreviations (e.g., as per Netlingo.com, prolly is an abbreviated form of probably) and compressions of several words into a single, phonetically spelled word (e.g., wanna for want to) as word combinations. We classified regular acronyms (e.g., lol for laughing out loud or bf for boyfriend) as acronyms. We classified substitution of a word or part of a word with a letter name (e.g., u for you) or a number (e.g., 2morrow for tomorrow) as alphabetic/number words. We classified basic phonetic spellings (e.g., wat for what) as phonetic new language words. We classified the use of lower case where letters should be capitalized (e.g., the first letter of a proper noun) as lower case. We did not count the first word on a line (see the example in the conversation window in Fig. 1) as a lower case error, even though each line generally corresponds to a sentence; thus, omission of upper case may be underrepresented in our counts. We classified omission of an apostrophe (e.g., thats for that’s) as a contraction. We did not score other punctuation omissions because we could not determine whether a line break in sending an instant message was being used to represent punctuation.
Within the pragmatic devices category, we classified the use of words to express emotion, such as representing laughter (e.g., hahaha) or repeating vowels to mirror pragmatic lengthening (e.g., whaaat to represent a drawn-out expression of surprise), as emotion words. We classified the use of acronyms to express emotion (e.g., lol) as emotion acronyms. We classified the use of upper case to represent emotion (e.g., WHAT to represent surprise) as upper case, and extraneous use of punctuation for emphasis (e.g., !!!!!) or as emoticons (e.g., 😎) as emotion punctuation.
Finally, we classified common letter typing errors (e.g., knwo for know) as typographical errors and apparently misspelled words (e.g., hungary for hungry) as misspellings. Neither of these categories is an example of new language, so we classified them separately as errors.
Some words received more than one classification; for example, from Table 1, the insider word hottie was in all uppercase letters and so was scored both as an insider word and as upper case. Other examples of multiple classifications include im, classified as lower case and as a contraction error, and lol, classified both as a shortcut acronym and as a pragmatic device emotion acronym. Approximately 3% of the words were scored in more than one category. The conversations were initially scored by the researcher to define the categories of new language.
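The multiple-classification rule can be illustrated with a minimal sketch in Python. The word lists, the function names classify and tally, and the lengthening rule (three or more repeats of a vowel) are hypothetical stand-ins chosen for illustration; the scoring in the study was done by hand, and its full word lists are not given here.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons built from the examples in the text;
# the study's actual word lists are not reproduced here.
INSIDER_WORDS = {"hottie", "fugly"}
ACRONYMS = {"lol", "bf", "g2g"}
EMOTION_ACRONYMS = {"lol"}
MISSING_APOSTROPHE = {"im", "thats", "dont"}
SHOULD_BE_CAPITALIZED = {"i", "im"}


def classify(token):
    """Return every category label that applies to a single token."""
    labels = set()
    lowered = token.lower()
    if lowered in INSIDER_WORDS:
        labels.add("insider word")
    if lowered in ACRONYMS:
        labels.add("acronym")
    if lowered in EMOTION_ACRONYMS:
        labels.add("emotion acronym")
    if lowered in MISSING_APOSTROPHE:
        labels.add("contraction")
    # Lower case where a capital letter is expected (e.g., the pronoun I).
    if token.islower() and lowered in SHOULD_BE_CAPITALIZED:
        labels.add("lower case")
    # Tokens of two or more letters typed entirely in capitals.
    if len(token) > 1 and token.isupper():
        labels.add("upper case")
    # Assumed rule: a vowel repeated three or more times marks lengthening.
    if re.search(r"([aeiou])\1{2,}", lowered):
        labels.add("emotion word")
    return labels


def tally(tokens):
    """Count category labels over a list of tokens."""
    counts = Counter()
    for token in tokens:
        counts.update(classify(token))
    return counts


if __name__ == "__main__":
    for word in ["HOTTIE", "im", "lol", "whaaat"]:
        print(word, sorted(classify(word)))
```

Because a token can carry two labels, the label counts returned by tally can sum to more than the number of tokens, which is why category means can exceed the overall total.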
New language use
Participants used an average instance of a new language (in their instant messages. The means among the different categories add up to greater than the overall total because a few words could be classified into two categories (e.g., im is classified as both lower case and a contraction error; WUT is classified as both phonetic and upper case).
Considering first the three general categories (shortcuts, pragmatic devices, and errors), we conducted a single factor repeated measures analysis of variance with category as the repeated measures factor and participants as the random factor. We found a statistically significant difference among these three general categories.
Participants most often used shortcuts in their instant messaging, followed by pragmatic devices to support their messages, followed by errors.
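For readers unfamiliar with the design, the F statistic of a single factor repeated measures analysis of variance can be sketched as follows. The function name repeated_measures_anova and the counts are hypothetical and are not data from the study; the sketch stops at the F ratio and its degrees of freedom and leaves out the p-value and post hoc tests.

```python
def repeated_measures_anova(scores):
    """One-way repeated measures ANOVA.

    scores[i][j] is the count for participant i in category j.
    Returns (F, df_categories, df_error).
    """
    n = len(scores)       # participants (random factor)
    k = len(scores[0])    # categories (repeated measures factor)
    grand = sum(sum(row) for row in scores) / (n * k)
    category_means = [sum(row[j] for row in scores) / n for j in range(k)]
    participant_means = [sum(row) / k for row in scores]

    ss_categories = n * sum((m - grand) ** 2 for m in category_means)
    ss_participants = k * sum((m - grand) ** 2 for m in participant_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    # Variation left after removing category and participant effects.
    ss_error = ss_total - ss_categories - ss_participants

    df_categories = k - 1
    df_error = (n - 1) * (k - 1)
    f_ratio = (ss_categories / df_categories) / (ss_error / df_error)
    return f_ratio, df_categories, df_error


if __name__ == "__main__":
    # Hypothetical counts per participant: shortcuts, pragmatic devices, errors.
    example = [
        [10, 4, 1],
        [12, 5, 2],
        [9, 6, 1],
        [11, 3, 2],
    ]
    print(repeated_measures_anova(example))
```

Treating participants as a factor removes stable between-person differences from the error term, which is what separates this design from a between-subjects analysis.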
We also analyzed use of new language in each of the three general categories.
We conducted a single factor repeated measures analysis of variance with the eight different shortcut categories as the repeated measures factor and participants as the random factor. We found a statistically significant difference among these eight categories. Post hoc tests demonstrated that use of lower case (e.g., i just got home) was the most common shortcut. Participants next most frequently used abbreviations (e.g., doin), acronyms (e.g., bf), letter/number words (e.g., u, r), and omission of contraction punctuation (e.g., dont) in their instant messaging. The remaining shortcuts of insider words (e.g., hottie), word combinations (e.g., gonna), and phonetic representations (e.g., wuz), all commonly cited in the news media as new language, were least commonly observed.
Participants used pragmatic devices less than half as often as they used shortcuts.
We conducted a single factor repeated measures analysis of variance with the four different categories of pragmatic devices as the repeated measures factor and participants as the random factor. This analysis was not statistically significant.
We also found few typographical errors and misspellings in our corpus. A paired samples t-test revealed no statistically significant difference between misspellings and typographical errors. Misspellings were generally phonetic misspellings of words that are not likely of high enough frequency to merit a shortcut of their own (e.g., actualy, awsome). Typographical errors included common right-left hand errors, such as knwo (knwo is such a common typographical error that some word processing programs will automatically correct it), letter reversals (e.g., carzy for crazy), and extra letters that are adjacent on the keyboard (e.g., whatds).
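The paired samples comparison can be sketched in the same way. The function name paired_t and the per-participant counts are hypothetical, chosen only to show the computation; the sketch returns the t statistic and its degrees of freedom and does not look up the p-value.

```python
import math


def paired_t(first, second):
    """Paired samples t statistic for two equal-length lists of scores.

    Returns (t, degrees_of_freedom).
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the within-participant differences.
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(variance / n)
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical counts per participant.
    misspellings = [2, 1, 3, 0, 2]
    typographical_errors = [1, 2, 2, 1, 1]
    print(paired_t(misspellings, typographical_errors))
```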
We found many individual differences in use of new language. For example, one participant expressed shock with upper case (WHAT), another used pragmatic lengthening (whaaaat), and another added extra punctuation (what!!!!!).
Similarly, one participant finished a conversation with a word combination (gotta go), another used an acronym (g2g), and another used a number word (got 2 go). These individual differences led to few correlations between different types of new language use; the exceptions were between abbreviations and use of word combinations, and between typographical errors and use of upper case to express emotion. These correlations make intuitive sense: both abbreviations (e.g., em for them or cause for because) and word combinations (e.g., gonna for going to or kinda for kind of) shorten common words and speed up typing.
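A correlation between two types of new language use can be computed from per-participant counts. The sketch below assumes a Pearson product-moment correlation, which the text does not name explicitly; the function name pearson_r and the counts are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of counts."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical counts per participant.
    abbreviations = [3, 5, 2, 8]
    word_combinations = [1, 2, 1, 4]
    print(pearson_r(abbreviations, word_combinations))
```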
Typing quickly can lead to errors. Some errors have become so common that they may be becoming new language words themselves. Surprisingly, we found few typographical errors or misspellings in our study, although misspellings were predicted by spelling ability. On the other hand, abbreviations, acronyms, letter or number words, and phonetic spellings are all shortcuts that minimize typographical errors and misspellings in the first place.
Our development of a taxonomy of shortcuts, pragmatic devices, and errors has several important uses, including comparing new language use across different social media, the acquisition of new language among young people, and