Ḥaser, in the context of ktiv (spelling), means 'lacking, deficient', as opposed to malé, which means 'full'.
The former refers to the austere use of yods and vavs to indicate /i/ and /o/ or /u/ sounds: they are limited to where they are actually part of the word stem, and niqqud is relied on to dispel misunderstandings. The latter refers to the generous use of yods and vavs to indicate /i/, /o/ or /u/ (and, as will be noted below, sometimes qamats and patah, i.e. a).
Undotted Hebrew (both today and in the Second Temple era) tends to use malé, especially for secular purposes. I was wondering if it was possible to write a program that would analyse the WLC (a ḥaser text) and produce an undotted version (malé). I think it is.
It is very clear to me that information is lost in this conversion process, notably vowels and accents, but the result does simulate the ancient versions that had no vowels (though I would not rush to say that they had no accents). With that simulation plus a few extra matres lectionis (vavs and yods that aid the reader), we have a text that could be subject to text mining without the need to manage Unicode.
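To make the information loss concrete, here is a minimal sketch of the naive baseline: simply dropping every vowel point and accent. It is not the program described in this post. It assumes that "vowels and accents" can be approximated by the Unicode category `Mn` (nonspacing mark), and the function name `strip_points` is made up for illustration.

```python
import unicodedata

def strip_points(text: str) -> str:
    """Drop every nonspacing combining mark (vowel points and accents)."""
    # Assumption: category "Mn" covers the niqqud and cantillation marks.
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# bet, sheva, dagesh, resh, tsere, alef, shin, shin dot, hireq, yod, tav
pointed = "\u05d1\u05b0\u05bc\u05e8\u05b5\u05d0\u05e9\u05c1\u05b4\u05d9\u05ea"
print(strip_points(pointed))
```

Stripping alone leaves only the consonants that were already in the text, so the output is still ḥaser. The rules below exist to decide where a vowel should leave a yod or a vav behind.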
I am holding my breath about putting the SimHebrew text into the music. At present that will have to wait. But the exercise of programming 9 chapters of test data has proven very instructive so far. One thing it has taught me is how illogical I am at times, thinking one thing and coding another. This is common in programming, especially I imagine in old programmers, but that is anecdotal.
I am now going to try and 'explain' my rules in English.
There is an easy mapping from Unicode to Latin characters. For the sake of understanding I mapped the members of the non-grammatical team first. A straightforward replace. The Unicode values translate unambiguously:
1490 - g, 1491 - d, 1494 - z, 1495 - k, 1496 - +, 1505 - s,
1506 - y, 1508 - p, 1507 - f, 1510 - x, 1509 - x, 1511 - q, 1512 - r
Note I use + for tet (internally). It does have the odd use grammatically, but not for hitting runs, just keeping score.
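The straightforward replace for the non-grammatical team might look roughly like this in Python. The code points and Latin letters are the ones listed above. The names `NON_GRAMMATICAL` and `replace_non_grammatical` are hypothetical, and `str.translate` is just one convenient way to do it.

```python
# Code point -> Latin letter, copied from the list above.
NON_GRAMMATICAL = {
    1490: "g", 1491: "d", 1494: "z", 1495: "k", 1496: "+", 1505: "s",
    1506: "y", 1508: "p", 1507: "f", 1510: "x", 1509: "x", 1511: "q",
    1512: "r",
}

def replace_non_grammatical(text: str) -> str:
    """Replace the listed letters; everything else passes through untouched."""
    return text.translate(NON_GRAMMATICAL)

print(replace_non_grammatical(chr(1490) + chr(1491)))  # gimel, dalet
```

Vowel points, accents and the grammatical letters are left alone at this step, so the later rules can still see them.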
Then the grammatical letters:
1488 - a, 1489 - b, 1492 - h, 1493 - v, 1497 - i, 1499 - c, 1498 - ç,
1500 - l, 1502 - m, 1501 - m, 1504 - n, 1503 - n, 1513 - w, 1514 - t
These all have significant impacts on the placement and usage of i's and o's in the SimHebrew representation of the malé square text. Yod and vav have the most complex problems. There is an initial quick conversion for vav: vav+1460 becomes vi, vav+1466 becomes vo, and vav+any other vowel becomes vv. There is some nuance here, since these are not final decisions; they depend on other variables. The holam 1465 can follow many letters and has a number of rules, whereas 1466 is used only with vav and is generally fixed.
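A rough sketch of the quick vav conversion, under assumptions of my own: which points count as "other vowels" is a guess (tsere, segol, patah, qamats and qubuts here), holam 1465 is deliberately left alone because it has its own rules, and the name `quick_vav` is invented for the example.

```python
import re

VAV = chr(1493)
HIREQ = chr(1460)
HOLAM_FOR_VAV = chr(1466)
# Assumption: the "other vowels" are tsere, segol, patah, qamats, qubuts.
OTHER_VOWELS = "".join(chr(c) for c in (1461, 1462, 1463, 1464, 1467))

_VAV_PLUS_VOWEL = re.compile(f"{VAV}([{HIREQ}{HOLAM_FOR_VAV}{OTHER_VOWELS}])")

def quick_vav(text: str) -> str:
    """First-pass conversion of vav followed directly by a vowel point."""
    def convert(match):
        vowel = match.group(1)
        if vowel == HIREQ:
            return "vi"
        if vowel == HOLAM_FOR_VAV:
            return "vo"
        return "vv"
    return _VAV_PLUS_VOWEL.sub(convert, text)

print(quick_vav(VAV + HIREQ), quick_vav(VAV + HOLAM_FOR_VAV), quick_vav(VAV + chr(1463)))
```

A bare vav, or a vav followed by any mark outside the assumed set, passes through unchanged and is left for the later stem-by-stem rules.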
I also allow myself the generality of converting some common suffixes. It's a bit surprising, but it saves a lot of hunting later.
Pattern | Result |
t-1461 c-1462;m | ticm |
l-1461 c-1462;m | licm |
l-1461 h-1462;m | lihm |
n-1464 i-1460;m | ni*im |
t-1463 i-1460;m | ti*im |
b-1468-1463;i-1460;m | bi*im |
i-1468-1464;h | i*ih |
Some of the above may need restricting, e.g. there are 291 rows with the last combo and they might not all behave the same way in the rest of the WLC.
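The suffix table could be driven from data, along these lines. This is a sketch built on my reading of the notation: each pattern is taken as a run of Latin letters and Unicode code points, and a rule fires only at the very end of a word. Both are assumptions, and `SUFFIX_RULES` and `apply_suffix_rules` are hypothetical names.

```python
# Each rule: (sequence of Latin letters and code points, replacement).
SUFFIX_RULES = [
    (("t", 1461, "c", 1462, "m"), "ticm"),
    (("l", 1461, "c", 1462, "m"), "licm"),
    (("l", 1461, "h", 1462, "m"), "lihm"),
    (("n", 1464, "i", 1460, "m"), "ni*im"),
    (("t", 1463, "i", 1460, "m"), "ti*im"),
    (("b", 1468, 1463, "i", 1460, "m"), "bi*im"),
    (("i", 1468, 1464, "h"), "i*ih"),
]

def _as_text(items) -> str:
    """Turn a mixed sequence of letters and code points into a string."""
    return "".join(chr(x) if isinstance(x, int) else x for x in items)

def apply_suffix_rules(word: str) -> str:
    """Apply the first rule whose pattern ends the word (assumed restriction)."""
    for items, result in SUFFIX_RULES:
        suffix = _as_text(items)
        if word.endswith(suffix):
            return word[: -len(suffix)] + result
    return word

print(apply_suffix_rules("a" + _as_text(("l", 1461, "c", 1462, "m"))))
```

Keeping the rules in a list makes it easy to restrict or drop one later, for example if the 291 rows matching the last pattern turn out not to behave alike.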
This is the beginning, and I won't continue at this level of detail. I need to explain that the remaining rules, which are organized by stem, are processed in sequence. They allow one to see if a vowel in the text will cause a conversion to a mater lectionis. All the jots and tittles gradually disappear.
This is the matrix: (It will extend - and who knows, may become simpler if the rules appear to have patterns, particularly with respect to some diphthongs.)
It is similar to doing a program to deal with English lemmas. So many exceptions. I began my career as a programmer 54 years ago. I got the job because I could remember a host of three-character nonsense syllables. This program seems to be my bookend.
This table breaks down into two sections: getting to the o's and getting to the i's. The second has two parts: A. vowels that undergo strange transformations, and B. finally getting to the real i.
Rule abbreviated | Applies to stem (+ = ט) | Comment |
tsere vav | nvh | exceptions to vav+vowel becomes vv |
qamats qatan | ycrn nsy mlc krm ahl awih azn +rk krb kpry lcd pyl sll | render vowel a (Unicode 1464) as o |
qamats qatan afx | pqd acl | render a as o except for some affixes |
qamats qatan b | wmy | render a as o for prefix b |
allow o l | acl | allow holam with prefix l |
allow o b | ywh | allow holam with prefix b |
allow o v | ywh | allow holam with prefix v |
allow o e | ywh | allow holam with segol |
allow o sf | ywh | allow holam with some suffixes |
prevent o | ywh acl aph azn ch la mwh pry raw xan zat | prevent holam |
prevent o pref | amr | prevent holam with some common prefixes |
tsere hireq | rcc yrc rgn zrh wrq wrp kth kmr kln bar aph yvr ird pl+ wpl +vb | render tsere (1461) as i (some conditions) |
tsere hireqm | man | allow for some stems beginning with m but prevent i for single prefix mem with tsere |
tsere hireq t | rpa ywh | allow hireq from tsere for t |
patah hireq | rcc wnh mdi avli ybd pnh kq dbr yin al | ph becomes ii but not for prefixed vai |
patah hireqi | ild ird | יַ ip (1497-1463) becomes ii |
qamats u | pqd | הָת ht becomes hut -- specialized prefix |
qamats hireqv | wlv ikd | 'ָv' qamats-v (1464-1493) becomes iv |
qamats hireq | hih pnh yl | qamats i becomes ii except for suffix ' th nh ' |
qamats hireq pf | itr ivm | qamats i becomes ii - except for prefix 'vָi' |
prevent final i | kih | suppress rendering of final ai as ii |
allow init pi | lvn wby | allow initial patah-i as ii |
allow init pi exc | iwb lqk | allow initial patah-i as ii except for trailing u |
allow i | ww yzz rpa qxx nsy ywr nqb nkl lvn ktt abd acr am amn awh aw at azn bxr clm csh cpr cys dbr dmm gbr gll hlc hll hnm kmw kx lb lbb lqk mxa mla ml+ nba ntc npl ntn npx pla psl q+r qnh rgy rnn wqx wbr wck wvb +vb tmm xvh xih yl yll ycb yxb yxm | allow hireq (1460) to be realized -- too generous? Note that hireq is rendered as i when a stem contains an i anywhere. |
allow i t | qrb nwa ntn mxa ngp lcd lkm | allow hireq to be realized for prefix t |
allow i h | wlv wby rby lvi kll nqm nsc lkm | allow hireq to be realized for prefix h |
allow i i | tmm wmm lkm | allow hireq to be realized for prefix i |
allow i v | zvd wlk wlm tpw rmh nwa nqm nkm itr | allow hireq to be realized for prefix v |
allow i l | pnh | allow hireq to be realized for prefix l |
allow i c | xpkt | allow hireq to be realized for prefix c |
allow i m | zmm | allow hireq to be realized for prefix m |
prevent i | ymindb ymihvd ynqi xvriwdi wlmial pgyial wyir tnin sin sini sir lvi ymiwdi gdyni cnyni brik bin di riq kih acl abidn cid rib irmih hia yir ci mi kli nbia csdi ihvdi cwdi ict itr bli ikm ial id idy anci bnimn iwb cli bit iwral ivm hih ani | prevent hireq -- too restrictive? And what will happen with names? |
prevent i n | wbr rpa lkm ird mxa qnh csh | suppress hireq prefix n |
prevent i h | ixg ixt hlc rgy +tb q+r | suppress hireq prefix h |
prevent i m | dmm | suppress hireq prefix m |
prevent i c | dbr | suppress hireq prefix c |
prevent i a | ww ird | suppress tsere or hireq prefix aleph or h - may need refining |
prevent i io | ird | suppress hireq prefix yod vav |
prevent i v | qrb lqk dmm | suppress hireq prefix vav -- |
prevent i t | | suppress hireq for prefix t |
prevent i i | lqk qnh | suppress hireq for preterite/imperfect yod |
prevent i l | -xxx | suppress hireq for prefix l |
prevent i u | mla mxa | suppress hireq for suffix u |
allow dbl i | rcc wmm lqk ixt iin mdi bxr nplti yl yvr dmm igy wbr qnh kq ntn ml+ pla irw yll ixm gvi npl | allow initial double hireq |
prevent dbl i | wlk wlm xpkt wby rpa rby yzz ywr ww nqm nkl mni tmm mla mim kx lvn ill ild nptli ixg nwa cpr kmw npx gmlial gll amn dbr azn mah +vb aliab aliwmy alixvr ink amr at ail ain akiry akiyzr adryi iyd iwb aim psl ptiw abir xvh xih xivn wit bvw wvb aiw awh gbr anci lb cys brit ira bli idy nsc q+r pl+ | prevent double hireq |
except for | lkm | ensures leading ii for stem, no ii anywhere else. There is supposedly a rule that with a prefix, the i is not doubled. Sometimes... |
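One way to hold such a matrix in a program is an ordered list of rule names with their stems, so that the rules applying to a given stem can be looked up in table order. This is only a sketch with a handful of rows copied from the table; the names `RULES` and `rules_for_stem` are invented, and what each rule then does to the text is not shown.

```python
# A few rows from the matrix, in table order: (rule name, stems it applies to).
RULES = [
    ("tsere vav", {"nvh"}),
    ("qamats qatan afx", {"pqd", "acl"}),
    ("allow o l", {"acl"}),
    ("prevent o pref", {"amr"}),
    ("qamats u", {"pqd"}),
    ("allow i l", {"pnh"}),
]

def rules_for_stem(stem: str) -> list[str]:
    """Return the names of the rules that list this stem, in sequence."""
    return [name for name, stems in RULES if stem in stems]

print(rules_for_stem("acl"))
print(rules_for_stem("pqd"))
```

Because the list is ordered, an allow rule and a later prevent rule for the same stem are seen in the same sequence as in the table, which matters when one is meant to override the other.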