Wednesday 12 August 2020

A little text analysis using SimHebrew instead of a Hebrew Keyboard

Here's the mapping of square text to the internal form I use in my software. (For a detailed introduction to SimHebrew see this post by Jonathan Orr-Stav.)

א  a
ב  b
ג  g
ד  d
ה  h
ו  v
ז  z
ח  k
ט  +
י  i
כ  c
ל  l
מ  m
נ  n
ס  s
ע  y
פ  p
צ  x
ק  q
ר  r
ש  w
ת  t

  • The first 7 are quite straightforward. (Except for /v/ which may map to v, o, or u, depending on the context - but there's more to this than meets the ear. Oh, and except for /a/, but see below.)
  • For the guttural het, a second h in the Hebrew alphabet, /k/ has been chosen. One normally uses k for kaf, but the graphic similarity of the mirror-image of kaf to c is too good to pass up. So k is free for this mapping to the more guttural het. 
  • For the extra 't' (tet) in the Hebrew alphabet, I substitute a + sign. That saves me from using an escape sequence as the display version of SimHebrew does.
  • Yod, the tenth consonant, is a clear mapping, but takes some getting used to where it might operate more like a consonant (as does the English /y/) than a vowel.
  • The next four revert to the familiar k-l-m-n sequence (written c-l-m-n here, with c for kaf), identically ordered in the Greek and Latin alphabets.
  • Samech, comfort, comes in the circle in place of O in the Latin alphabet. There must be a story here.
  • Ayin takes y for graphic similarity, but it takes some getting used to for internal aural clues to pronunciation.
  • Of the last 6 only tsade requires a note. Again, X, sometimes used for aleph, is also graphically similar to tsade. This takes getting used to as well for one's silent inner pronunciation, since X is ts.

I have avoided making any distinction of final letters internally, so /f/ and /c-cedilla/ play no part inside the program, but they are used as part of the external display in a book.
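
For readers who want to try the mapping, here is a minimal sketch in Python. It is an illustration only, not the code in my software: the names `SQUARE_TO_INTERNAL`, `FINAL_TO_BASE`, and `to_internal` are made up for this example, and folding the five final forms onto their base letters is an assumption that follows from making no internal distinction of final letters.

```python
# The 22 square letters and their internal codes, in alphabetical order,
# exactly as listed in the table above.
SQUARE_TO_INTERNAL = dict(zip("אבגדהוזחטיכלמנסעפצקרשת",
                              "abgdhvzk+iclmnsypxqrwt"))

# Assumption: final forms are folded onto their base letters first,
# because no distinction of final letters is kept internally.
FINAL_TO_BASE = {"ך": "כ", "ם": "מ", "ן": "נ", "ף": "פ", "ץ": "צ"}


def to_internal(text: str) -> str:
    """Convert square text to the internal Latin form.

    Characters outside the mapping (spaces, punctuation, vowel points)
    are passed through unchanged.
    """
    out = []
    for ch in text:
        ch = FINAL_TO_BASE.get(ch, ch)
        out.append(SQUARE_TO_INTERNAL.get(ch, ch))
    return "".join(out)


print(to_internal("בן"))     # bn
print(to_internal("ישראל"))  # iwral
```

Note that vav always comes out as v here; choosing between v, o, and u is a matter for the display form, not the internal one.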

Note that /a/ and /y/ are both gutturals. Note that /a/ and /i/ in English are often guttural. If you have followed any of my playful translations of the acrostic poems of the Bible in Lamentations (1-4), Proverbs (31), and Psalms (9-10, 25, 34, 37, 111-112, 119, 145), you will know that since aleph and ayin can carry any vowel, I have allowed any English guttural to play the role of aleph or ayin in these poems.

There are books to write on this mapping, but my purpose here is to try to show you some of the things I have done to analyse sequences of words in the Hebrew Bible using these letters for filters. It is much easier to select a pair of letters in Latin text when operating in English than it is to filter by Hebrew letters which must be translated back and forth between their square form and Unicode.

Here is a form I sketched to look at sequences of 6 consecutive Hebrew stems filtered by any combination of three consecutive stems. 

This filters the filters by the word /bn/, where we see that the three stems, al bn iwral, occur 69 times in the 39 books of Tanach. If you were interested in the stem /bn/ you could type it in as a filter and then click on any of the lines in the frequency column, and presto, all the verses in the selected range of books would appear. You could then note the places where your translation was inconsistent - and see if there was any excuse for your work.

Sample screen showing filtered usage of 6 consecutive stems in the Hebrew Bible
I have been thinking of what I have done and what I could do with it. I made this sketch a month or so ago and adjusted it to filter the data a bit as I was writing this post.
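
The idea behind the form can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the form's actual code: the function names are hypothetical, the stems are assumed to be already available as a plain list of internal-form strings in text order, and the sample list is toy data, not Tanach counts.

```python
from collections import Counter


def count_triples(stems, must_contain=None):
    """Count every run of three consecutive stems.

    If must_contain is given, keep only the triples that include that stem.
    """
    counts = Counter()
    for i in range(len(stems) - 2):
        triple = tuple(stems[i:i + 3])
        if must_contain is None or must_contain in triple:
            counts[triple] += 1
    return counts


def contains_run(window, run):
    """True if the stems in run occur consecutively inside window."""
    n = len(run)
    return any(tuple(window[i:i + n]) == tuple(run)
               for i in range(len(window) - n + 1))


def windows_with(stems, triple, size=6):
    """Return every run of `size` consecutive stems that contains the triple."""
    return [tuple(stems[i:i + size])
            for i in range(len(stems) - size + 1)
            if contains_run(stems[i:i + size], triple)]


# Toy data only, for illustration.
stems = "al bn iwral amr al bn iwral".split()

# Frequency column: triples that include the stem /bn/.
for triple, freq in count_triples(stems, must_contain="bn").most_common():
    print(" ".join(triple), freq)

# Clicking a line: the 6-stem sequences that contain the chosen triple.
for window in windows_with(stems, ("al", "bn", "iwral")):
    print(" ".join(window))
```

In the real form the selected sequences lead back to verses; that lookup depends on how the stems are stored and is left out here.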