A TEI-Conformant version of the 1611 text of the Bible For full disclosure, see https://foxglove.hypotheses.org/524
The occurrence of the split word "Si rach" in the prologue to Ecclesiasticus prompts me to ask: how common is the problem?
How many other places might there be where a word contains a spurious space?
Detecting these will not be straightforward.
It might require generating a counted words list and then looking for such discrepancies.
It will require a keen eye to discern between unfamiliar 1611 spelling variations and real typos.
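A rough sketch of how the counted words list could be used to shortlist candidates. It flags adjacent word pairs whose concatenation is itself an attested word while at least one fragment occurs only once. The function name and the `min_joined` threshold are my own illustrative choices, and every hit would still need checking by eye against 1611 spelling.

```python
import re
from collections import Counter

def split_word_candidates(text, min_joined=2):
    """Return (left, right, joined_count) for adjacent tokens that look like
    one word broken by a spurious space."""
    tokens = re.findall(r"[^\W\d_]+", text)          # runs of letters only
    counts = Counter(t.lower() for t in tokens)
    found = []
    for left, right in zip(tokens, tokens[1:]):
        joined = (left + right).lower()
        rarest = min(counts[left.lower()], counts[right.lower()])
        # Require the joined form to be attested and one fragment to be a
        # one-off, otherwise common pairs such as "a way" swamp the list.
        if counts[joined] >= min_joined and rarest == 1:
            found.append((left, right, counts[joined]))
    return found
```

Run over the verse text, this would report `Si rach` as long as `Sirach` itself occurs at least twice elsewhere.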
Counting Hebrews as Pauline for the sake of this issue, it's evident that the "colophon" at the end of each of St Paul's epistles has been merely appended to the text of the last verse.
Sometimes even the separating space is missing, e.g. Romans 16:27:
<ab n="27">To God, onely wise, bee glorie through Iesus Christ, for euer. Amen.Written to the Romanes from Corinthus, and sent by Phebe seruant of the Church at Cenchrea.</ab>
Ideally, these non-canonical lines in the NT should be separated from the last verse.
In modern editions, these are usually set in a smaller typeface and most are preceded by a pilcrow sign.
It would be sensible to place each of these text items in a separate TEI element.
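A minimal sketch of how the split might be automated. It assumes the colophon is whatever follows the final "Amen." of the last verse, which holds for the Romans example above but would need checking for each epistle. Which TEI element should receive the colophon is left open here.

```python
import re

# Assumption: the verse proper ends at its last "Amen." and anything
# after that point is the colophon.
COLOPHON = re.compile(r"^(?P<verse>.*\bAmen\.)\s*(?P<colophon>\S.*)$", re.DOTALL)

def split_colophon(verse_text):
    """Return (verse, colophon); colophon is None when nothing follows Amen."""
    m = COLOPHON.match(verse_text)
    if m is None:
        return verse_text, None
    return m.group("verse"), m.group("colophon")
```

Because the match does not depend on whitespace after "Amen.", it also copes with the cases where the space is missing.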
Modern editions of the KJV use small-caps to style the divine name where it corresponds to the tetragrammaton and a few cognate words.
LORD gave only 182 matches.
Lord gave 8276 matches.
Compare similar searches for the SWORD KJVA module:
LORD has 6574 matches.
Lord has 1875 matches.
This is based on the diatheke output, which converts the internal OSIS markup to capitals.
cf. OSIS has a divineName element to identify such words, so that they can be suitably styled.
The totals differ only by 9, and are probably easily reconciled.
The issue here is that the HTML source text has not faithfully represented the printed black letter capitalisation of the divine name in a large proportion of cases, or so it seems.
NB. I've ignored possessives in this comparison, but the analysis could be readily extended.
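For anyone wanting to repeat the comparison, here is a small sketch of a case-sensitive whole-word count. The function name is illustrative. Possessives such as "LORDS" are separate tokens and so are ignored, as in the figures above.

```python
import re
from collections import Counter

def count_forms(text, forms=("LORD", "Lord")):
    """Case-sensitive whole-word counts for each requested form."""
    tally = Counter(re.findall(r"[A-Za-z]+", text))
    return {form: tally[form] for form in forms}

# Totals quoted above: this text versus the SWORD KJVA module.
tei_total = 182 + 8276      # 8458
sword_total = 6574 + 1875   # 8449
difference = tei_total - sword_total
```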
Are the two words containing the small letter thorn transcription errors in the HTML?
00191 þe
00013 þt
The letter thorn was no longer used in 1611.
This is what I think they represented.
<abbr expansion="the">yͤ</abbr>
<abbr expansion="that">yͭ</abbr>
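If that reading is right, the substitution could be scripted along these lines. This is only a sketch: it reuses the abbr markup suggested above as-is, and the combining superscript letters (U+0364 and U+036D) are my guess at the intended glyphs.

```python
import re

# Maps each thorn form to (expansion, y plus a combining superscript letter).
THORN_FORMS = {
    "þe": ("the", "y\u0364"),   # U+0364 COMBINING LATIN SMALL LETTER E
    "þt": ("that", "y\u036D"),  # U+036D COMBINING LATIN SMALL LETTER T
}

def encode_thorn(text):
    """Replace whole-word þe / þt with the abbr markup proposed above."""
    def repl(match):
        expansion, glyph = THORN_FORMS[match.group(0)]
        return f'<abbr expansion="{expansion}">{glyph}</abbr>'
    return re.sub(r"\b(?:þe|þt)\b", repl, text)
```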
Possibly due to a scripting bug, 21 of the XML files still contain the following line:
<ab n="KJV">2015 Copyright King James Bible Online | ..</ab>
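A sketch of a clean-up pass, assuming the stray line always sits on a line of its own and always has the n="KJV" attribute together with the copyright wording shown above.

```python
import re

# Matches one whole line holding the bogus <ab n="KJV"> copyright notice.
BOGUS = re.compile(
    r'^[ \t]*<ab n="KJV">.*?Copyright King James Bible Online.*?</ab>[ \t]*\n?',
    re.MULTILINE,
)

def strip_bogus(xml_text):
    """Return (cleaned_text, number_of_lines_removed)."""
    return BOGUS.subn("", xml_text)
```

Applied to each chapter file in turn, the second value of the result gives a count that can be checked against the 21 files mentioned above.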
I noticed that there is a mismatch in the number of left and right parentheses:
U+0028 ( 664 LEFT PARENTHESIS
U+0029 ) 662 RIGHT PARENTHESIS
At least 2 must be unpaired. Further analysis showed that there are only 2 such locations:
<ab n="1">Now the sonnes of Reuben the first borne of Israel, (for hee was the first borne, but, forasmuch as he defiled his fathers bed, his birthright was giuen vnto the sonnes of Ioseph the sonne of Israel: and the genealogie is not to be reckoned after the birthright.<note> Gen. 35. 22.and 49. 4.</note></ab>
<ab n="13">(For not the hearers of the Law are iust before God, but the doers of the Law shalbe iustified;</ab>
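For reference, a sketch of the kind of check that locates such cases. It is best run over a whole chapter rather than verse by verse, since a parenthesis may legitimately open in one verse and close in a later one. The function name is illustrative.

```python
def unmatched_parens(text):
    """Return (unclosed, stray): character offsets of '(' never closed and
    of ')' with no opener before it."""
    unclosed, stray = [], []
    for offset, ch in enumerate(text):
        if ch == "(":
            unclosed.append(offset)
        elif ch == ")":
            if unclosed:
                unclosed.pop()      # pairs with the most recent opener
            else:
                stray.append(offset)
    return unclosed, stray
```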
The respective locations are:
FIO. There are currently 11378 note elements in the work.
Of particular interest is that many of these contain cross-references.
The chaps folder contains a file called includes.xml that references only 1194 of the 1361 chapters.
It looks as though it was made during earlier work in progress and may since have been superseded by the file called KJV_1611.xml at the root level.
Is it still required for anything? @lb42
Facsimile page images for the KJV 1611 are also available at Original Bibles.
The images are in PDF format. Navigation is somewhat awkward.
Many other historic Bibles are available on the same site.
In view of the way GitHub throws a "wobbler" for a folder containing more than 1000 files,
Sorry, we had to truncate this directory to 1,000 files. 361 entries were omitted from the list.
I wonder if it's worth considering changing the folder/file structure to have a books folder with 79 folders, each named after a book and containing the set of chapter files for that book?
Yes, it's a big change, but it would mean that any visitor could more readily access the XML files for the books with names later in the alphabet than Matthew_19.xml.
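A sketch of how the move could be scripted, assuming every chapter file is named Book_NN.xml in the manner of Matthew_19.xml. The function names and the two-step plan/apply split are my own. Files that do not fit the pattern, such as includes.xml, are left where they are.

```python
import re
import shutil
from pathlib import Path

# Assumed naming pattern: <Book>_<chapter number>.xml
NAME = re.compile(r"^(?P<book>.+)_(?P<chap>\d+)\.xml$")

def plan_moves(chap_dir, books_dir):
    """List (source, destination) pairs without touching the file system."""
    moves = []
    for src in sorted(Path(chap_dir).glob("*.xml")):
        m = NAME.match(src.name)
        if m:
            moves.append((src, Path(books_dir) / m.group("book") / src.name))
    return moves

def apply_moves(moves):
    """Create one folder per book and move each chapter file into it."""
    for src, dst in moves:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
```

Planning first makes it possible to review the list before anything is moved. Any file that refers to the chapter files by path would of course need updating as well.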
These currently use the TEI head element.
Currently there is no XML attribute that would distinguish the 116 canonical Psalm titles from the noncanonical descriptions at the start of other chapters.
These canonical Psalm titles in the KJV correspond to those in the Hebrew text.
I might be mistaken, but based on my knowledge of the Blayney 1769 edition, the chapter labels in Psalms are probably in the form "PSAL. I." rather than "CHAP. I.".
cf. Yours are like:
<div xmlns="http://www.tei-c.org/ns/1.0" xml:id="CPsalms_01" type="chapter">
<pb facs="http://www.kingjamesbibleonline.org/1611-Bible-KJV/Psalms-Chapter-1-3.jpg"/>
<head>CHAP. I.</head>
<head>1 The happinesse of the godly. 4 The
vnhappinesse of the vngodly.</head>
The 16 chapter files Romans_##.xml seem to be devoid of verse text.
They simply have the div element and nothing inside it.
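A quick way to check for other chapters in the same state. This is a sketch assuming each chapter file is a single div in the TEI namespace with its verses in ab elements, as in the snippets quoted elsewhere in these issues.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

def has_verses(xml_text):
    """True if the chapter contains at least one TEI <ab> element."""
    root = ET.fromstring(xml_text)
    return next(root.iter(TEI + "ab"), None) is not None
```

Looping this over the chap folder and printing the names of files for which it returns False would list every empty chapter.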
FIO: The attached Zip file contains a counted words list for the verse text only, excluding notes.
merged.vpl.words.count.txt.zip
It may be of use for proof reading, etc.
Notes:
FIO. The attached text file is a character frequency count for the 1361 xml files concatenated from the chap folder (NB. Analysis now includes Romans, and excludes bogus copyright lines.)
merged.xml.character.frequency.txt
The XML entity &amp; occurs 2091 times. There are no other entities.
Of particular interest are the non-ASCII letters and characters:
U+00B6 ¶ 2,977 PILCROW SIGN
U+00C6 Æ 1 LATIN CAPITAL LETTER AE
U+00E6 æ 7 LATIN SMALL LETTER AE
U+00FE þ 204 LATIN SMALL LETTER THORN
U+0101 ā 5 LATIN SMALL LETTER A WITH MACRON
U+0113 ē 36 LATIN SMALL LETTER E WITH MACRON
U+014D ō 153 LATIN SMALL LETTER O WITH MACRON
U+016B ū 6 LATIN SMALL LETTER U WITH MACRON
Evidently the source web-site made no systematic attempt to use the following letter, which was in the original KJV of 1611.
U+017F ſ LATIN SMALL LETTER LONG S
Reverse engineering a fix for this discrepancy would not be a simple task.
Even so, the long s might only have been present in the translators' added words that were styled with Roman typeface; and also the chapter descriptions in head elements and the page titles in fw elements.
cf. The main text of the KJV was in blackletter typeface.
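For anyone wishing to reproduce the non-ASCII part of the character tally above, here is a minimal sketch. The function name and the layout of each row are illustrative and not necessarily how the attached file was produced.

```python
import unicodedata
from collections import Counter

def non_ascii_report(text):
    """Return (code point, character, count, Unicode name) rows for every
    non-ASCII character, in code point order."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    return [
        (f"U+{ord(ch):04X}", ch, n, unicodedata.name(ch, "UNKNOWN"))
        for ch, n in sorted(counts.items())
    ]
```

A long s (U+017F) would show up in such a report if the transcription had used it.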
Currently this pair of prologues is coded as part of verse 1,
<ab n="1">[A Prologue made by an vncertaine Authour.] This Iesus was the sonne of Sirach, and grarul-childe to Iesus of the same name with him; This man therefore liued in the latter times, after the people had bene led away captiue, and called home againe, and almost after all the Prophets. Now his grandfather Iesus (as he himselfe witnesseth) was a man of great diligence and wisedome among the Hebrewes, who did not onely gather the graue and short Sentences of wise men, that had bene before him, but himselfe also vttered some of his owne, full of much vnderstanding and wisedome. When as therefore the first Iesus died, leauing this booke almost perfected, Si rach his sonne receiuing it after him, left it to his owne sonne Iesus, who hauing gotten it into his hands, compiled it all orderly into one Volume, and called it Wisdome, Intituling it, both by his owne name, his fathers name, and his grandfathers, alluring the hearer by the very name of Wisedome, to haue a greater loue to the studie of this Booke. It conteineth therefore wise Sayings, darke Sentences, and Parables, and certaine particular ancient godly stories of men that pleased God. Also his Prayer and Song. Moreouer, what benefits God had vouchsafed his people, and what plagues he had heaped vpon their enemies. This Iesus did imitate Solomon, and was no lesse famous for Wisedome, and learning, both being indeed a man of great learning, and so reputed also. 
[The Prologue of the Wisdome of Jesus the sonne of Sirach] Whereas many and great things haue bene deliuered vnto vs by the Law and the Prophets, and by others that haue followed their steps, for the which things Israel ought to be commended for learning and Wisedome, and whereof not onely the Readers must needs become skilful themselues, but also they that desire to learne, be able to profit them which are without, both by speaking and writing : My grandfather Iesus, when he had much giuen himselfe to the reading of the Law, and the Prophets, and other Bookes of our fathers, and had gotten therein good iudgement, was drawen on also himselfe, to write something pertayning to learning and Wisedome, to the intent that those which are desirous to learne, and are addicted to these things, might profit much more in liuing according to the Law. Wherefore, let me intreat you to reade it with fauour and attention, and to pardon Vs, wherein wee may secme to come short of some words which we haue laboured to interprete. For the same things vttered in Hebrew, and translated into an other tongue, haue not the same force in them : and not onely these things, but the Law it selfe, and the Prophets, and the rest of the Bookes, haue no small difference, when they are spoken in their owne language. For in the eight and thirtieth yeere comming into Egypt, when Euergetes was King, and continuing there some time, I found a Booke of no small learning, therefore I thought it most necessary for mee, to bestow some diligence and trauaile to interprete it: Vsing great watchfulnesse, and skill in that space, to bring the Booke to an end, and set it foorth for them also, which in a strange countrey are willing to learne, being prepared before in maners to line after the Law. [1] All wisedome commeth from the Lord, and is with him for euer.<note> 1.Kings 3.9.</note></ab>
Ideally this should be separated into two TEI elements that should be placed before verse 1.
The first verse would then be merely:
<ab>All wisedome commeth from the Lord, and is with him for euer.<note> 1.Kings 3.9.</note></ab>
and the [1] could then be dispensed with.
Aside: Notice the split word in this prologue: "Si rach".
I wonder how many similar instances there might be with such a spurious space?
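On separating the prologues from verse 1: since each part starts with a label in square brackets, the text could be cut at those labels. This is a sketch assuming no other square brackets occur in the text. What TEI element each prologue should end up in is a separate question.

```python
import re

# One part = "[label]" followed by everything up to the next "[".
PART = re.compile(r"\[(?P<label>[^\]]+)\]\s*(?P<body>[^\[]*)")

def split_bracketed(ab_text):
    """Return a list of (label, body) pairs in document order."""
    return [
        (m.group("label").strip(), m.group("body").strip())
        for m in PART.finditer(ab_text)
    ]
```

For the element quoted above this would yield three parts, the last one labelled "1" and holding only the verse text with its note.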
Currently these have been coded as the English name of the Hebrew letter punctuated with a period and wrapped in square brackets at the start of the first verse in each of the 22 stanzas.
Ideally, these should have their own XML element, one that ought to precede the verse.
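A sketch of how the letter names could be lifted out of the verse text so that they can be placed in an element of their own. It assumes the form described above: a name made of ASCII letters, followed by a period, inside square brackets at the very start of the verse. The sample spelling in the usage note is hypothetical.

```python
import re

LETTER = re.compile(r"^\s*\[(?P<letter>[A-Za-z]+)\.\]\s*(?P<verse>.*)$", re.DOTALL)

def split_stanza_letter(verse_text):
    """Return (letter_name, verse); letter_name is None if no label leads."""
    m = LETTER.match(verse_text)
    if m is None:
        return None, verse_text
    return m.group("letter"), m.group("verse")
```

For example, `split_stanza_letter("[ALEPH.] Blessed are the vndefiled")` gives `("ALEPH", "Blessed are the vndefiled")`, and a verse without a leading label comes back unchanged.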