of unique words was tabulated [appendix B, table 6], it was found that they had fewer unique words
than the non-English translations. This leads to higher frequencies for the most common words and
a steeper drop-off as the word rank increases. The most obvious explanation for this is inflection.
Inflection is the modification of a word to encode grammatical information such as case, number,
tense and mood. English, as a weakly inflected language, depends heavily on word order, articles,
prepositions, pronouns and auxiliary verbs to convey the information in a sentence correctly.
On the other hand, heavily inflected languages (such as Quenya and Latin) usually indicate this
information by changing noun and verb endings. In an extreme example:
English: The dog gives a dog to a dog.
Latin: Canis canem cani dat.
Here, the different endings on ‘canis’ (dog) indicate the role of each dog: the ‘-is’ ending marks the
dog doing the action (nominative), the ‘-em’ ending marks the dog having the action done to it
(accusative), and the ‘-i’ ending marks the dog to which something is given (dative). Verbs inflect
similarly for person, number, mood, tense and aspect. Consequently, inflected languages use fewer
words of a greater variety to convey the same information.
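To make the tabulation concrete, a minimal sketch of the unique-word count and rank-frequency
drop-off is given below. The file names are hypothetical placeholders, and the regex tokenisation is
an assumption rather than the report's exact preprocessing.

    import re
    from collections import Counter

    def rank_frequencies(text):
        """Count word tokens and return their frequencies sorted by rank."""
        words = re.findall(r"\w+", text.lower())
        return sorted(Counter(words).values(), reverse=True)

    # english.txt and quenya.txt are hypothetical paths to the translations
    for name, path in [("English", "english.txt"), ("Quenya", "quenya.txt")]:
        with open(path, encoding="utf-8") as f:
            freqs = rank_frequencies(f.read())
        total = sum(freqs)
        print(f"{name}: {len(freqs)} unique words; "
              f"rank-1 word covers {freqs[0] / total:.1%} of all tokens")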
It is also worth noting that, as shown by the entropy table (figure 3) and box plot (figure 4),
heavily inflected languages appear to have lower Shannon entropies than weakly inflected ones.
However, more translations would need to be analysed before a confident conclusion could be drawn.
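As a point of reference, a word-level Shannon entropy of the kind reported in figure 3 can be
computed from unigram word counts as sketched below. The tokenisation and lower-casing are again
assumptions; the report's exact method may differ.

    import math
    import re
    from collections import Counter

    def word_entropy(text):
        """Shannon entropy H = -sum(p * log2(p)) over the word distribution."""
        words = re.findall(r"\w+", text.lower())
        counts = Counter(words)
        total = sum(counts.values())
        return -sum((c / total) * math.log2(c / total) for c in counts.values())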
6.2 Potential Extensions
The possible extensions of this project are many and varied. Ideally, comparisons would be performed
between the Neo-Quenya translation and at least 30 other natural languages, to allow for some
indication of statistical significance. A range of inflected and non-inflected languages, as well as
non-Indo-European ones, could be included to investigate the relationship between Quenya, the
languages it is based on (e.g. Latin, Finnish and Ancient Greek) (Fauskanger n.d.) and languages that
have no deliberate connection (e.g. Mandarin or Wiradjuri). Ideally, the entropies would be calculated
in such a way as to make strongly and weakly inflected languages comparable.
Whilst better than letter entropy, word entropy is far from a perfect measure of the entropy of a
language. Selecting the word as the token of measurement causes a direct split between heavily
and weakly inflected languages. Potentially, the ideal token for measuring entropy in language would
be the morpheme. This would make the token set the set of smallest units of meaning within a
language, allowing easier comparison between Quenya and a variety of languages.
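A sketch of the proposed morpheme-level measure is given below. Genuine morphological segmentation
would require a per-language analyser (or an unsupervised tool such as Morfessor); the
suffix-stripping rule here is a deliberately crude, hypothetical stand-in showing where such a
segmenter would slot in.

    import math
    import re
    from collections import Counter

    # Illustrative Latin-style endings only; a real analyser would replace this.
    SUFFIXES = ("ibus", "orum", "arum", "em", "is", "i")

    def segment(word):
        """Split a word into a stem and one known suffix, if any (toy rule)."""
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) > len(suffix) + 1:
                return [word[:-len(suffix)], "-" + suffix]
        return [word]

    def morpheme_entropy(text):
        """Shannon entropy over morpheme tokens rather than whole words."""
        tokens = [m for w in re.findall(r"\w+", text.lower()) for m in segment(w)]
        counts = Counter(tokens)
        total = sum(counts.values())
        return -sum((c / total) * math.log2(c / total) for c in counts.values())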