Sunday, 6 December 2020

Analysing the transformation from pointed to malé text

Now that I have my answer, I admit that I can't find the energy to analyse how I got there or to simplify the rules. Two examples in this post show the range:

  1. kih (life, etc) has 20 explicit rules in my program. 
  2. tab (longing) has none! 
OK, חיה is complex and demands simplification. תאב, on the other hand, is simply direct substitution of consonant for consonant, ignoring all vowels.

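| Root | Domain | Word | Count | Pointing | Gloss | References |
|---|---|---|---|---|---|---|
| tab | Love | ltabh | 1 | לְתַאֲבָ֑ה | for longing | Psa 119:20(3) |
| | | mtab | 1 | מְתָאֵ֤ב | am longing for | Amo 6:8(9) |
| | | mtabl | 2 | מִתְאַבֵּ֣ל | be longing | 1Sa 16:1(8) |
| | | | | מִתְאַבֵּ֖ל | he was longing | Ezr 10:6(21) |
| | | tabu | 2 | תֹאב֖וּ | you are amenable | Lev 26:21(6) |
| | | | | ־תֹּאב֖וּ | you are amenable | Isa 1:19(2) |
| | | tabti | 2 | תָּאַ֣בְתִּי | I have longed | Psa 119:40(2) |
| | | | | תָּאַ֣בְתִּי | I have longed to come | Psa 119:174(1) |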
Admittedly, some of the vowels are ignored because of specific rules:
  • Hireq is not rendered in a closed syllable, so m(i)t is rendered as mt (orange above).
  • A double i is not permitted,
  • but in any case, regarding the above two issues, tab has no yod and therefore hireq is not rendered by default (fuchsia above).
  • Vav with dagesh is always u, so there is no need to specify a rule or an exception for this word.
(Maybe there are not enough uses of this root to show where its complexity might lie.)
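A minimal sketch of that default behaviour for tab, in Python. It is not the program discussed in this post. The function name `default_male` is hypothetical, and the consonant map is inferred only from the transliterations that appear in this post (for example kih for חיה), so it is deliberately incomplete. It drops every vowel point and accent, substitutes consonant for consonant, and renders vav with dagesh as u.

```python
import unicodedata

# Consonant map in the post's transliteration. Assumption: only letters
# attested in this post's examples are listed; the full scheme is larger.
CONSONANTS = {
    "\u05d0": "a",  # alef
    "\u05d1": "b",  # bet
    "\u05d4": "h",  # he
    "\u05d5": "v",  # vav
    "\u05d7": "k",  # het
    "\u05d9": "i",  # yod
    "\u05dc": "l",  # lamed
    "\u05de": "m",  # mem
    "\u05ea": "t",  # tav
}
VAV = "\u05d5"
DAGESH = "\u05bc"
MAQAF = "\u05be"


def default_male(pointed):
    """Consonant-for-consonant substitution, ignoring vowels and accents."""
    out = []
    i = 0
    while i < len(pointed):
        ch = pointed[i]
        i += 1
        # Skip the maqaf and any combining mark not attached to a letter.
        if ch == MAQAF or unicodedata.category(ch) == "Mn":
            continue
        # Collect the points and accents attached to this letter.
        marks = []
        while i < len(pointed) and unicodedata.category(pointed[i]) == "Mn":
            marks.append(pointed[i])
            i += 1
        if ch == VAV and DAGESH in marks:
            out.append("u")  # vav with dagesh is always u
        else:
            out.append(CONSONANTS[ch])  # KeyError for letters not mapped here
    return "".join(out)
```

Called on the pointed forms in the table, this gives the words in the Word column: `default_male("\u05ea\u05b9\u05d0\u05d1\u0596\u05d5\u05bc")` (תֹאב֖וּ) returns `tabu`. None of the exceptions described for kih are handled.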

In contrast, for life, these are the 20 explicit mentions of kih in my program. Needless to say, it obeys or disobeys the implicit or general rules as it sees fit.

To start, kih is a named exception to 'my' generic rule about feminine plural suffixes: 
  1. vvi-dagesh-v-holem-t becomes vviiot, 
  2. hireq-i-dagesh-v-holem-t becomes iiot. 
  3. i-dagesh-v-holem-t becomes iiot.
  4. hireq-i-dagesh-holem-t becomes iiot.
  5. i-dagesh-qamats-h becomes iih.
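The five suffix rules above can be sketched as an ordered table of tail patterns. This is an illustration only: the token representation (a list of consonant and point names), the reading of "vvi" as the three tokens v, v, i, and the names `SUFFIX_RULES` and `apply_suffix_rule` are all assumptions, not the program's actual data structures.

```python
# Each rule is (tail pattern, replacement), tried in the listed order.
# Assumption: a word is a list of tokens named as in the rules above.
SUFFIX_RULES = [
    (("v", "v", "i", "dagesh", "v", "holem", "t"), "vviiot"),
    (("hireq", "i", "dagesh", "v", "holem", "t"), "iiot"),
    (("i", "dagesh", "v", "holem", "t"), "iiot"),
    (("hireq", "i", "dagesh", "holem", "t"), "iiot"),
    (("i", "dagesh", "qamats", "h"), "iih"),
]


def apply_suffix_rule(tokens):
    """Return (remaining tokens, replacement) for the first rule whose
    pattern matches the end of the word, or (tokens, None) if none does."""
    for pattern, replacement in SUFFIX_RULES:
        n = len(pattern)
        if len(tokens) >= n and tuple(tokens[-n:]) == pattern:
            return list(tokens[:-n]), replacement
    return list(tokens), None
```

In this sketch the order of the table matters: rule 3 is a tail of rules 1 and 2, so if it were tried first, the leading hireq (or vv) would be left behind in the remaining tokens instead of being consumed by the longer rule.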
You might well ask where the vv came from! It is not a normal pairing of letters in a pointed text. There are some places in the program where the order in which the rules are applied matters. I reduced this order dependence gradually over the first 8 months of experimentation, but in the last month, since my most recent dive into the program, I compromised on this rule and moved about half the vv processing into the first step of the transformation.

My guess is that there is a generic vv process (as intimated by the wiki article on this transformation) that I should fully implement first; then things in other places would simplify. In a 2500+ line program with several subordinate functions, you can see that if I implemented such a 'perfect' rule, the remainder of the program might still work (unlikely), but it would be full of unnecessary code. This is not a game I want to play at the moment, since I have my desired answer: a full list of Hebrew roots validated against two different databases, the WLC and the malé text.

When I began this process 14 years ago, I could find no such list. I did inquire a few times, though I never bought or even used free versions of other people's data because validation was an essential component of my mushroom hunt. I have some mushrooms in my collection that are offspring of others, but none of them is poisonous. (Metaphor, folks). I.e. there are some roots that are very similar and clearly one is derived from the other, but I have kept them separate, like xdiq and xdq. And I have some that are joined that some glossaries distinguish, like hlc and ilc. As I look at that data, my feeling is that the noun and verb forms of righteous should be treated as one root, not two. 

Getting back to kih - why so many explicit mentions in my program? 
  • It is also an exception to a common final i rule. Almost all stems transform qamats-final i or patah-final i to ii, but not kih and a half-dozen others.
  • kih allows a double i, along with about 10 other roots.
  • It transforms an internal tsere into ii, with some exceptions.
  • It transforms an initial or internal qamats-i into a double i.
  • It prevents hireq from instantiating as an i.
  • It does not generally transform holem to 'o', but there are exceptions.
  • v-qamats and v-patah become vv for kih, along with several other roots.
  • Under some conditions patah-i-patah becomes ii and hatef-patah-i-hireq becomes ii.
  • Under several more complex decisions, the middle i becomes ii.
  • Back to the feminine ending, patah-i-dagesh-v-holem-t becomes iiot,
  • and for prefix 'l' and suffix 'to', i-dagesh-holem becomes iio.
  • For kih I sometimes remove a trailing h,
  • and v-holem may become a final h.
  • tsere-a may become i.
  • A final qamats may become an h.
  • And a final h-qamats may become an ha and generate an interior ii.
Now do you see why I don't want to take on a simplification of this program? Its work is done. Perhaps I will find some easier transformation rules. But perhaps not. You can see kih in all its glory here. If you can derive a simpler set of rules for it, please let me know. Not for the sake of a program that will never be used again, but for the sake of reading Hebrew.


