Monday, 26 October 2015

So just how do you go about writing the Hebrew Bible in English?

What? Hasn't that already been done?

Yes, repeatedly, but every generation reads it again and makes new guesses. (Just slightly kidding you.)

I have wondered if I really should continue this project. But it's on its way and it's measurable. That's a key. You not only have to read and write; you also have to control the project. And project control is not insignificant.

What's the text?
Some time ago I did not have access to all the raw data in the form I wanted it in. Now I do. But leave off the problems of textual criticism. I am using the Leningrad codex, and I haven't time to question it much as to its differences from other sources of the text. But this I know as of today: I have 100% of the text for all chapters and books in my database (or, you might say, in a pile on my desk ready to process).

How do I measure where I am and what I've done so I know to forge ahead rather than backwards?

Where are you?
I have a 'where am I' list containing each book and my current bookmark. Every time I press update, the list remembers the date, chapter and verse, and calculates the % done. For example:

[Table: Last Changed | % Draft | % Avail]

You'll see that even though I completed the Psalms 3 years ago, they are still being questioned and may be changed. I was questioning my use of the gloss 'bar' related to סגּר. I am still questioning it, but for the moment I let it stand (though it may violate my concordance rule - but maybe it is a figurative homonym in English). This 'where am I' list is 39 entries long, with dates ranging from today back to August 26, the last time I looked at Hosea. No way I could remember this without help from the computer. (39 entries, 24 books? The Twelve count as 1 book, and Samuel, Kings, Chronicles, and Ezra-Nehemiah are each combined into one: 39-12-4+1 = 24.)
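A bookmark list of this kind can be sketched in a few lines of Python. This is only an illustration: the names `Bookmark` and `update`, the choice of verse counts as the measure of progress, and the sample figures are all assumptions, not the actual database design.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Bookmark:
    """One row of a hypothetical 'where am I' list."""
    book: str
    chapter: int          # current bookmark: chapter
    verse: int            # current bookmark: verse
    verses_done: int      # verses drafted so far (assumed measure of progress)
    verses_total: int     # verses in the whole book
    last_changed: date

    def percent_done(self) -> float:
        """Percentage of the book drafted, rounded to one decimal place."""
        return round(100.0 * self.verses_done / self.verses_total, 1)


def update(mark: Bookmark, chapter: int, verse: int,
           verses_done: int, today: date) -> Bookmark:
    """Pressing 'update': record the new position and the date of the change."""
    return Bookmark(mark.book, chapter, verse, verses_done,
                    mark.verses_total, today)


# Made-up figures for an imaginary 200-verse book.
mark = Bookmark("Example", 1, 1, 0, 200, date(2015, 8, 26))
mark = update(mark, 3, 4, 50, date(2015, 10, 26))
print(mark.book, mark.last_changed, mark.percent_done())
```

Keeping the date on every row is what makes it possible to see at a glance which books have gone longest without attention.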

How do you decide where to go next?
I began with the Psalms. They are poetry - short lines of densely packed ideas in parallels. Most people, I think, begin with prose and many begin at the beginning. I think I would have gotten stuck doing that. I am all over the place, translating in all books concurrently and playing with conceptual categories while I do it. I go to long books and when I get tired, I go to a short one. I work with the glosses and when I get tired I check out some categorization and classification of the roots of the words or I imagine how to extend my automated translation algorithm. That gets the computer to read ahead and make its best guesses at gloss, root, and semantic domain based on the words in sequence that I have already done. 
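The read-ahead idea can be illustrated with a small Python sketch. It is not the actual algorithm, only one plausible way to guess glosses from runs of consecutive words that have already been translated, preferring the longest known run. The function names, the limit of five words, and the transliterated tokens (stand-ins for Hebrew word forms) are all assumptions, and the sketch ignores root and semantic domain.

```python
from collections import Counter, defaultdict

MAX_N = 5  # longest run of consecutive words to try (assumed limit)


def build_index(done_verses):
    """Index every run of 1..MAX_N consecutive words in the finished verses.

    done_verses: list of verses, each a list of (word, gloss) pairs.
    Returns a mapping: run of words -> Counter of the gloss runs seen for it.
    """
    index = defaultdict(Counter)
    for verse in done_verses:
        for n in range(1, MAX_N + 1):
            for i in range(len(verse) - n + 1):
                words = tuple(w for w, _ in verse[i:i + n])
                glosses = tuple(g for _, g in verse[i:i + n])
                index[words][glosses] += 1
    return index


def guess(words, index):
    """Guess glosses for an untranslated verse.

    Returns a list of (gloss, n) pairs, one per word, where n is the length
    of the run that supplied the guess; unknown words give (None, 0).
    """
    result = []
    i = 0
    while i < len(words):
        # Try the longest run starting at position i first.
        for n in range(min(MAX_N, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in index:
                best = index[key].most_common(1)[0][0]
                result.extend((g, n) for g in best)
                i += n
                break
        else:
            result.append((None, 0))  # nothing known for this word
            i += 1
    return result


# Placeholder tokens standing in for Hebrew word forms.
done = [[("vayihyu", "and were"), ("kol-yemey", "all the days of")],
        [("vayamot", "and he died")]]
index = build_index(done)
print(guess(["vayihyu", "kol-yemey", "unseen", "vayamot"], index))
```

Recording the run length alongside each guess makes it easy to count how many guesses rest on one word, two consecutive words, and so on.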

How do you choose a gloss?
I'm working towards a very close translation of the Hebrew so that in principle, one could underlay the English to the (possibly) ancient music of the accents. I want the glosses to be concordant as far as possible. That means for me a Hebrew root may map to several different English glosses, but an English root (and related glosses) should map back to only one Hebrew root. This is of course impossible to some extent. But it does have some surprising results that change the way I read the text.
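A check for this rule could look something like the following Python sketch. It assumes the glosses are available as (Hebrew root, English gloss) pairs and compares gloss strings exactly, so it does not attempt to reduce English words to a common root. The function name and the placeholder roots are hypothetical.

```python
from collections import defaultdict


def concordance_violations(pairs):
    """Find English glosses that map back to more than one Hebrew root.

    pairs: iterable of (hebrew_root, english_gloss).
    Returns {gloss: sorted list of roots} for each offending gloss.
    A root with several glosses is allowed and is not reported.
    """
    roots_by_gloss = defaultdict(set)
    for root, gloss in pairs:
        roots_by_gloss[gloss].add(root)
    return {gloss: sorted(roots)
            for gloss, roots in roots_by_gloss.items()
            if len(roots) > 1}


# Placeholder roots; 'bar' is used for two of them, so it is reported.
pairs = [("root-A", "shut"), ("root-A", "bar"),
         ("root-B", "bar"), ("root-C", "day")]
print(concordance_violations(pairs))
```

Here root-A keeps both of its glosses without complaint; only the gloss shared between two roots is flagged for a decision.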

Here's a bit of the main screen
On this screen, I work on one verse at a time, word by word. I have a pattern of breaking the collected text (middle section above) at the atenach ^. It tells me something of the rhythm of the text. The table on the right holds all the words of the verse in their physical order on the manuscript. In this case, the upper right, reading top to bottom, shows how the words on the upper left are mapped to each English gloss:
and were (וַיִּהְיוּ֙ ) all the days of (כָּל־יְמֵ֣י) Methuselah (מְתוּשֶׁ֔לַח ) nine sixty (תֵּ֤שַׁע וְשִׁשִּׁים֙) years (שָׁנָ֔ה ) and nine hundred (וּתְשַׁ֥ע מֵא֖וֹת) --
and he died (וַיָּמֹֽת).

Obviously word by word is not English, but it does represent the mechanics of what I have done. One instance of 'years' is in this case not glossed, as shown by the double dash --.

Sometimes I write the middle first and then work out the word by word, and sometimes the other way round. It's the presence of the word by word that allows me to test concordance.

Below the text I have three tools: 
  1. A table showing where this stem (root) is used in the other books. Can be filtered by book. 
  2. A table in which I can search in English for other instances of a gloss. 
  3. And a tree from which I can select my semantic domain and subdomain. Stem, domain and subdomain will gradually evolve as I complete more and more of the text.
Oh - and there are two strings of characters that are quite revealing. One is the grammatical component - a work in progress. In the case of the selected word וַיִּהְיוּ֙ it is VYoYV*b1: Conj., Imp. or 3rd person, and the o is where the root (היה) was found with a few mater substitutions. (Not an easy algorithm - the *b1 tells me which final rule in the program decided the structure of the word.) The second is the sequence of the accents for the music: 'e qad,B z-q,C qad,z-q,f g# ^A e'. This sequence occurs exactly twice in the Bible (2).

Buttons allow me to get text from the site of the Westminster Leningrad Codex and there is another that will produce the music with the English overlaid (as shown above).

Above the selection you can see 7 additional screens that I have written: those that list the text and its verbal and semantic structures, either 1 chapter at a time or several together or two from any two books. These are the pages that let me report on the blog (via Word) what I have done.

And there are screens for domain analysis, for output of the music a book at a time, for experimental searches, and for data recovery - although I do much of the last two, and database administration, with Oracle developer tools. (I forgot to mention the DBA requirement but it's not too onerous.)

As a further control, every record in the tables has a time stamp so I know what has recently changed. I could exert much more control by auditing changes, but it is not needed for this application.

Can you maintain respect for the text when you take it apart like this?
The text is very respectable, but respect is not piety. Much of our reading of the text is focused on the wrong things. It is not all pious. It is not all serious. It is not all smooth. It is not all known. It is not all understood. It is not all good for you to imitate. There are people who are far more qualified to read it than I am but most people concentrate on just a small section of it. That's the nature of academic pressure. I don't have any such pressure. I have in a sense nothing to prove. Or perhaps I am searching for what I have to prove. Some people have a bias of course. Not me(!)

What strikes me most about the text is how well loved it has been by the Jewish tradition. They too must have had conflict and difference of opinion but the process of copying and maintaining this body of work must have been a serious act of devotion. That alone demands my respect even if the text itself describes a brutal fact that I could not respect or an anecdote that is anything but serious. 

But that's not quite right - what strikes me most is that I am in dialogue with the tradition and the authors and the text myself. I ask - in what sense does this have authority in me, for me, over me? 

It was made for me, not me for it, but we live together: it in its silence, unless I read or sing or hear it; and I in the prejudice of my own clothing, depending on how the text shakes me out of it and renews me.

Where are you again?
Based on the automated look-ahead that I wrote in the last three weeks, I still have 75,699 unknown words, 139,330 that the computer can guess based on a single word, 22,835 that it has guessed based on two consecutive words, 2,669 based on three consecutive words, 66 based on 4 and 30 based on 5. I have 64,093 words with a selected gloss that is not computer driven.

Published on this blog or in my book on the Psalms, there are 221 of 929 chapters or 23.8%. On a verse by verse basis, all 929 chapters have been touched and 17,154 of 23,143 verses have been touched, but when you look at the empty verses as collected, there remain 18,131 to do or 78.3%. On a word basis, 70.6% of the 304,646 words in the database remain undecided, though many are drafted with a possibility. At least that's what my numbers tell me today and they are more or less consistent. That means I have probably 3 to 4 years at semi-throttle to complete the task.
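The percentages can be re-derived from the counts quoted above. In this short Python check, the word figure assumes that 'undecided' means the unknown words plus those guessed from a single word; that reading is an assumption, though it does reproduce the quoted 70.6%.

```python
def pct(part, whole):
    """Percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


chapters_published, chapters_total = 221, 929
verses_remaining, verses_total = 18_131, 23_143
words_total = 304_646
unknown_words, single_word_guesses = 75_699, 139_330

print(pct(chapters_published, chapters_total))                 # 23.8
print(pct(verses_remaining, verses_total))                     # 78.3
# Assumption: undecided = unknown + guessed from a single word.
print(pct(unknown_words + single_word_guesses, words_total))   # 70.6
```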

The 75+ thousand that are unknown are a concern. I could probably have bought a database, but then I would not have had the challenge of attempting to derive root and grammar from the raw data, and I would have been accepting someone else's decisions - no deal! So the unknowns will reveal things, and the knowns that are also in error will also be revealed in due course.
