Greek letter frequency and entropy

To address this question, I downloaded a copy of the Greek New Testament from Project Gutenberg and ran the character frequency script from my previous post.
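The script from the earlier post is not reproduced here. As a rough illustration of the counting step, here is a minimal Python sketch. It is not the original script, and it rests on a few assumptions: accents and breathing marks are stripped, upper and lower case are merged, and final sigma (ς) is counted as σ. The names `greek_letter_counts` and `percent_frequencies` are hypothetical.

```python
import unicodedata
from collections import Counter

# The 24 letters of the Greek alphabet, lower case, without final sigma.
GREEK_LETTERS = set("αβγδεζηθικλμνξοπρστυφχψω")

def greek_letter_counts(text):
    """Count Greek letters, ignoring case and diacritics (an assumption)."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text).lower()
    counts = Counter()
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # skip accents, breathings, iota subscript
        if ch == "ς":
            ch = "σ"  # fold final sigma into sigma
        if ch in GREEK_LETTERS:
            counts[ch] += 1
    return counts

def percent_frequencies(counts):
    """Convert raw counts to percentages, most common letter first."""
    total = sum(counts.values())
    return {letter: 100 * n / total for letter, n in counts.most_common()}

if __name__ == "__main__":
    sample = "Ἐν ἀρχῇ ἦν ὁ λόγος"
    for letter, pct in percent_frequencies(greek_letter_counts(sample)).items():
        print(f"{letter} {pct:.2f}")
```

Whether to merge ς with σ, or to treat accented vowels separately, is a design choice; different choices would give a table with more rows and somewhat different percentages.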

Running the script on the New Testament text led to the following table of letters and percent frequencies.

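|--------+---------------|
| Letter | Frequency (%) |
|--------+---------------|
| α      |         13.10 |
| ο      |         10.44 |
| ι      |          9.76 |
| ε      |          9.38 |
| σ      |          7.72 |
| τ      |          6.29 |
| ν      |          4.82 |
| υ      |          4.52 |
| κ      |          3.77 |
| π      |          7.90 |
| ρ      |          3.42 |
| η      |          3.32 |
| μ      |          2.85 |
| λ      |          2.47 |
| ω      |          2.16 |
| γ      |          1.75 |
| δ      |          1.54 |
| θ      |          1.48 |
| χ      |          0.78 |
| φ      |          0.77 |
| β      |          0.75 |
| ζ      |          0.41 |
| ξ      |          0.40 |
| ψ      |          0.20 |
|--------+---------------|

From this I calculated the Shannon entropy of a Greek letter to be 4.045 bits.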
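For readers who want to check the arithmetic, the Shannon entropy of a letter distribution with probabilities p_i is the sum of -p_i log2(p_i) over all letters. The Python sketch below applies that formula to the New Testament percentages from the table above. The names `NT_PERCENT` and `shannon_entropy` are mine, and the function renormalizes its input on the assumption that rounded percentages may not sum to exactly 100.

```python
from math import log2

# Percent frequencies copied from the New Testament table above.
NT_PERCENT = {
    "α": 13.10, "ο": 10.44, "ι": 9.76, "ε": 9.38, "σ": 7.72, "τ": 6.29,
    "ν": 4.82, "υ": 4.52, "κ": 3.77, "π": 7.90, "ρ": 3.42, "η": 3.32,
    "μ": 2.85, "λ": 2.47, "ω": 2.16, "γ": 1.75, "δ": 1.54, "θ": 1.48,
    "χ": 0.78, "φ": 0.77, "β": 0.75, "ζ": 0.41, "ξ": 0.40, "ψ": 0.20,
}

def shannon_entropy(weights):
    """Entropy in bits of a distribution given as non-negative weights."""
    weights = [w for w in weights if w > 0]  # zero-weight terms contribute 0
    total = sum(weights)
    # Dividing by the total turns percentages (or raw counts) into probabilities.
    return -sum((w / total) * log2(w / total) for w in weights)

if __name__ == "__main__":
    print(f"{shannon_entropy(NT_PERCENT.values()):.3f} bits")
```

For comparison, a uniform distribution over 24 letters would have entropy log2(24), about 4.585 bits, so the uneven letter frequencies cost roughly half a bit per letter.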

Using English letter frequencies I found on Wikipedia, I calculated the corresponding entropy for English to be 3.915 bits.

So in this regard, the two languages are pretty similar.

By the way, the frequency table for ancient (Koine) Greek letters is something like the famous ETAOIN SHRDLU order for English.

The most common letters in Greek line up roughly with their English counterparts.

Update: Homer and Plato

I first wrote this post just looking at the New Testament, written in Koine Greek.

The table below includes the results from Homer’s Iliad and Plato’s Republic to get a sample of other ancient Greek sources.

|---+-------+-------+----------|
|   |    NT | Iliad | Republic |
|---+-------+-------+----------|
| α | 13.10 | 13.71 |    12.86 |
| β |  0.75 |  0.92 |     0.53 |
| γ |  1.75 |  1.82 |     1.18 |
| δ |  1.54 |  1.53 |     1.94 |
| ε |  9.38 |  8.10 |     8.34 |
| ζ |  0.41 |  0.43 |     0.36 |
| η |  3.32 |  2.94 |     4.01 |
| θ |  1.48 |  1.09 |     1.38 |
| ι |  9.76 |  8.82 |     9.86 |
| κ |  3.77 |  4.18 |     3.52 |
| λ |  2.47 |  2.75 |     2.97 |
| μ |  2.85 |  3.38 |     3.13 |
| ν |  4.82 |  5.72 |     8.95 |
| ξ |  0.40 |  0.54 |     0.36 |
| ο | 10.44 | 10.72 |    10.23 |
| π |  7.90 |  3.55 |     3.78 |
| ρ |  3.42 |  4.33 |     3.29 |
| σ |  7.72 |  7.88 |     6.61 |
| τ |  6.29 |  8.34 |     7.53 |
| υ |  4.52 |  4.27 |     4.54 |
| φ |  0.77 |  1.25 |     0.83 |
| χ |  0.78 |  1.44 |     1.02 |
| ψ |  0.20 |  0.16 |     0.13 |
| ω |  2.16 |  2.15 |     2.63 |
|---+-------+-------+----------|

The frequencies are very similar, and they lead to very similar entropy calculations: 4.08 bits for the Iliad and 4.05 bits for the Republic.
