Monday, 5 October 2020

Every jot and tittle - Sisyphus strikes again

It occurs to me that I could have entitled this whole series 'every jot and tittle'.

Genesis 3 to 10 is the current test extension. The first pass gave 56 discrepancies between my calculation of what the unpointed text should be and what it really is. That is 56 in 2,638 words, an error rate of about 2%. Sometimes there is a misalignment (extra words in the source, or two words combined into one), but not in this case. Sometimes the verse numbers differ, but not in this section. What surprised me a little in these chapters was the count of distinct discrepancies. I often hope there will be a bunch that will be fixed with one little tweak, and this time there are several bunches.

  • The first is tbh, תבה, the word for the barge that Noah built. 
    • (It may also be rendered as creel, the basket used by Miriam for Moses in Exodus 2:3, 5. The same word tbh is used for each. I have let the two stand as different words, and different again from ark, which I reserve for ארה arh.)
  • Back from the distraction of translation, tbh requires the rendering of the tsere. This rule will account for 26 of the discrepancies, leaving 30, so I am back down to a 1% error rate with one slight tweak.
  • Two more stems, xlh and yrh, have three instances each to be reviewed.
    • For xlh, hireq becomes i without conditions.
    • For yrh there is a rule that doubles an inner v. When v occurs with qamats, tsere, or patah, for some stems it will be doubled in the unpointed text. In the test data so far, about 50 stems follow one of these rules.
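
As an illustration only, stem-conditioned rules like the three above might be encoded as lookup tables keyed by stem. This is a minimal sketch, not the program described in this post: the names RENDER_AS_I, DOUBLE_V and unpoint are hypothetical, a pointed word is represented by hand-written (consonant, vowel name) pairs rather than real pointed Hebrew, and the v-doubling is applied to any v rather than only an inner one.

```python
# Hypothetical rule tables, keyed by stem. Only the stems named in the text
# are listed; the real rule set is larger and carries more conditions.
RENDER_AS_I = {
    "tbh": {"tsere"},   # tbh requires the rendering of the tsere
    "xlh": {"hireq"},   # for xlh, hireq becomes i without conditions
}
DOUBLE_V = {
    "yrh": {"qamats", "tsere", "patah"},  # v under these vowels is doubled
}


def unpoint(stem, pointed):
    """Build an unpointed spelling from (consonant, vowel_name) pairs.

    vowel_name is None when the consonant carries no vowel. Vowels are
    dropped unless a rule for this stem says otherwise.
    """
    as_i = RENDER_AS_I.get(stem, set())
    double_v = DOUBLE_V.get(stem, set())
    out = []
    for consonant, vowel in pointed:
        if consonant == "v" and vowel in double_v:
            out.append("vv")          # simplification: any v, not only an inner v
        else:
            out.append(consonant)
        if vowel in as_i:
            out.append("i")           # the vowel is written out as i
    return "".join(out)


# Hand-written stand-ins for pointed words (assumptions, not source data).
ark_like = [("t", "tsere"), ("b", "qamats"), ("h", None)]
print(unpoint("tbh", ark_like))   # tibh
print(unpoint("zzz", ark_like))   # tbh (no rule for this made-up stem)
```

Keeping the rules as data makes it easy to see which stems share a rule, at the cost of hiding the prefix and suffix conditions that many of the real rules depend on.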

Now it's a matter of going one by one. For example, ahl with a final holem-h drops the h. There are a few stems where final h's get dropped. Some are holem-h and some qamats-h. (Not initial h's as if there were a cockney pronunciation, but the final ones.)
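
A conditioned rule of this kind could be sketched as follows. The table DROP_FINAL_H and the function apply_final_h_rule are hypothetical names, only ahl is taken from the text, and the vowel before the final h is passed in by name as an assumption about how the program might see it.

```python
# Hypothetical table: stem -> vowels before a final h that cause the h to drop.
DROP_FINAL_H = {
    "ahl": {"holem"},   # from the text: ahl with a final holem-h drops the h
}


def apply_final_h_rule(stem, consonants, vowel_before_final_h):
    """Drop a final h only when this stem is listed for that vowel."""
    listed = DROP_FINAL_H.get(stem, set())
    if consonants.endswith("h") and vowel_before_final_h in listed:
        return consonants[:-1]
    return consonants


print(apply_final_h_rule("ahl", "ahlh", "holem"))    # ahl
print(apply_final_h_rule("ahl", "ahlh", "qamats"))   # ahlh (vowel not listed)
print(apply_final_h_rule("pxh", "pxth", "qamats"))   # pxth (stem not listed)
```

Because the rule fires only for listed stem and vowel pairs, a stem that keeps its final h needs no exception of its own.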

Below are the remaining discrepancies. I examined them one at a time to determine the adjustment required. I think my comments are understandable and they reveal inner workings.

It's not at all clear to me that I will find the simplest algorithm. It may well be that our brains are encoded by stem to just 'know' our language. We supply context to each stem without effort. It is a marvel, but not without its own idiosyncrasies. (Yes, English has stems and gadgets too, but it has more words, since English prepositions and pronouns are separate words, which is not the general case in Hebrew.)

It might drive you round the bend for me to explain every detail, but it occurs to me in this deliberate walk-through of the process that there are other possible ways of implementing the rules, and that the rule patterns as they stand will probably survive the remaining 55% of the data.

These are the types of question I entertain for every discrepancy.

  • is there an existing explicit rule?
  • is there an existing implicit or general rule that needs an exception?
  • is there a new rule?
  • am I reading the problem backwards (something I often do), i.e. is the program making the change when it should not?
  • is there a conflict or even a contradiction - two words that transform differently?
  • is this a spelling adjustment - does it agree with other instances?

Everything I have done to date is automatically retested as I put in new data. That keeps the stone moving up the hill. It only falls down again when I try to simplify the rules.
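
The automatic retest can be pictured as a small regression harness: every stored word is run through the rules again and compared with the known unpointed text. The sketch below is an assumption about the shape of such a check, not the actual program. Case, retest and identity are hypothetical names, and the identity function stands in for the real rule engine, with the source field holding the program's current calculated value for two rows of the table in this post.

```python
from typing import Callable, List, NamedTuple, Tuple


class Case(NamedTuple):
    reference: str   # chapter:verse(word number)
    source: str      # input handed to the rules (a stand-in in this sketch)
    expected: str    # what the unpointed text really is


def retest(cases: List[Case],
           transform: Callable[[str], str]) -> List[Tuple[str, str, str]]:
    """Rerun every stored case and return (reference, expected, calculated)
    for each one where the rules no longer reproduce the expected text."""
    failures = []
    for case in cases:
        calculated = transform(case.source)
        if calculated != case.expected:
            failures.append((case.reference, case.expected, calculated))
    return failures


def identity(form: str) -> str:
    """Stand-in for the rule engine: returns its input unchanged."""
    return form


cases = [
    Case("Genesis 8:2(6)", "vicla", "viicla"),
    Case("Genesis 4:14(2)", "grwt", "girwt"),
    Case("made-up passing case", "abc", "abc"),   # hypothetical
]

for reference, expected, calculated in retest(cases, identity):
    print(reference, "expected", expected, "got", calculated)
```

Run after every rule change, a check like this shows at once whether a simplification has sent the stone back down the hill.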

| Reference | stem | form | morph | sim | calculated | Hebrew | comment |
|---|---|---|---|---|---|---|---|
| Genesis 3:9(8) | aic | aich | aic\h | aiiç | aich | אַיֶּֽכָּה | yod-segol must become ii and the h disappears. This is unique pointing in the Bible. |
| Genesis 5:29(14) | arr | arrh | arr\h | airrh | arrh | אֵֽרְרָ֖הּ | tsere becomes i |
| Genesis 8:2(6) | cla | vicla | vi/cla | viicla | vicla | וַיִּכָּלֵ֥א | prefixed i must be doubled |
| Genesis 6:16(6) | clh | tclnh | t/cl\nh | tclnh | tclinh | תְּכַלֶ֣נָּה | segol does not become i - conflict? No - following an open syllable, the tsere is not realized. Following a closed syllable it is. |
| Genesis 3:21(6) | ctn | ctnvt | ctn\vt | cvtnvt | ctnot | כָּתְנ֥וֹת | qamats under the first consonant becomes o |
| Genesis 4:14(2) | grw | grwt | grw\t | girwt | grwt | גֵּרַ֨שְׁתָּ | tsere becomes i |
| Genesis 3:16(7) | hrh | vhrnc | vh/r\nc | vhrvnç | vhrnç | וְהֵֽרֹנֵ֔ךְ | holem must be observed |
| Genesis 3:10(10) | kba | vakba | va/kba | vaikba | vakba | וָאֵחָבֵֽא | tsere becomes i |
| Genesis 7:3(8) | kih | lkivt | l/ki\vt | lkiivt | lkiot | לְחַיּ֥וֹת | ii is missing - I don’t think this one is easy for the program to see. It's a suffix issue and kih is a very complex beast. |
| Genesis 10:8(6) | kil | hkl | h/kl | hkl | hkil | הֵחֵ֔ל | the first character of the stem followed by tsere should not become i. - a conflict? Turns out no, but must be distinguished from Obadiah 1:20, a unique pointing of the definite article with patah. |
| Genesis 4:4(7) | klb | vmklbhn | vm/klb\hn | vmklbihn | umklbhn | וּמֵֽחֶלְבֵהֶ֑ן | the second tsere should become an i, an artifact of the suffix. tsere under the third consonant is not a rule on its own. |
| Genesis 8:12(1) | kvl | viikl | vii/kl | viikl | vivkl | וַיִּיָּ֣חֶל | there are so many yods here, it must be ii but there is also an internal pattern giving 'iv' that requires an exception for kvl. |
| Genesis 9:11(10) | mim | mmi | mm\i | mmi | mimi | מִמֵּ֣י | This stem must not allow hireq as i (the default for stems containing a yod) |
| Genesis 3:16(15) | mwl | imwl | i/mwl | imwvl | imwl | יִמְשָׁל | qamats under the second consonant becomes o |
| Genesis 4:7(14) | mwl | tmwl | t/mwl | tmwvl | tmwl | תִּמְשָׁל | as above - in both cases the prefix and suffix conditions apply. |
| Genesis 3:13(11) | nwa | hwiani | h/wa\ni | hwiani | hiwiani | הִשִּׁיאַ֖נִי | prevent the prefix hi. nwa has instances when it is required and when it is not. Some stems are complex for various reasons. |
| Genesis 4:11(7) | pxh | pxth | px\th | pxth | pxt | פָּצְתָ֣ה | don’t drop the h - conflict? No, just a conditioned rule. |
| Genesis 9:21(1) | wth | viwt | vi/wt | viiwt | viwt | וַיֵּ֥שְׁתְּ | prefixed i must be doubled? But this is not the case for four other instances of the identical word. So this goes into my database of errors in the source text. |
| Genesis 4:4(8) | wyh | viwy | vi/wy | viiwy | viwy | וַיִּ֣שַׁע | prefixed i must be doubled |
| Genesis 10:19(13) | xbaim | vxbim | v/xb\im | vxbviim | uxboim | וּצְבֹיִ֖ם | double i for the name here and in Hosea, spelled differently in WLC, respelled in the unpointed text. |
| Genesis 6:16(15) | ywh | tywh | t/ywh | tywha | tywh | תַּֽעֲשֶֽׂהָ | add the final ha, conditions on prefix and suffix apply |
| Genesis 3:19(1) | zyh | bzyt | b/zy\t | bziyt | bzyt | בְּזֵעַ֤ת | tsere becomes i. The word zyh, sweat occurs only twice in the Bible. Perhaps I will do Ezekiel next to pick up the other instance. |
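
One way to look for bunches in a list like this is to diff each calculated form against the real unpointed form and tally the letters that differ. The sketch below uses Python's standard difflib for that; the function name missing_or_extra is hypothetical, and the rows are a subset copied from the sim and calculated values above. It is a rough aid for spotting patterns, not a substitute for reading each comment.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def missing_or_extra(calculated, expected):
    """Describe what must change in calculated to reach expected,
    e.g. '+i' (a missing i) or '-i' (an i that should not be there)."""
    parts = []
    matcher = SequenceMatcher(None, calculated, expected, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            parts.append("-" + calculated[i1:i2])
        if tag in ("insert", "replace"):
            parts.append("+" + expected[j1:j2])
    return " ".join(parts)


# (reference, stem, sim, calculated) for a subset of the rows above
rows = [
    ("Genesis 5:29(14)", "arr", "airrh", "arrh"),
    ("Genesis 8:2(6)", "cla", "viicla", "vicla"),
    ("Genesis 4:14(2)", "grw", "girwt", "grwt"),
    ("Genesis 3:10(10)", "kba", "vaikba", "vakba"),
    ("Genesis 3:16(15)", "mwl", "imwvl", "imwl"),
    ("Genesis 4:7(14)", "mwl", "tmwvl", "tmwl"),
    ("Genesis 4:4(8)", "wyh", "viiwy", "viwy"),
    ("Genesis 3:19(1)", "zyh", "bziyt", "bzyt"),
]

tally = Counter()
stems_by_fix = defaultdict(list)
for reference, stem, sim, calculated in rows:
    fix = missing_or_extra(calculated, sim)
    tally[fix] += 1
    stems_by_fix[fix].append(stem)

for fix, count in tally.most_common():
    print(fix, count, stems_by_fix[fix])
# +i 6 ['arr', 'cla', 'grw', 'kba', 'wyh', 'zyh']
# +v 2 ['mwl', 'mwl']
```

The same surface fix can hide different causes: among the rows with a missing i, some are a tsere that becomes i and others a prefixed i that must be doubled, so a tally like this only narrows down where to look.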

 



