Freely-Given.org ESFM Bibles

More than you wanted to know about ESFM Bibles

Enhanced Standard Format Marker (ESFM) Bibles use an extension of the better-known USFM (Unified Standard Format Marker) 2.4 format. ESFM Bibles are usually created by a program such as our own Biblelator, or by Paratext, Bibledit, or even Toolbox or (horrors!) a text editor or word processor (cringe!). They consist of one file per book, encoded in UTF-8 Unicode and with a .esfm filename extension. They contain only the Bible text and a few extras like book titles and introductions. A lot of other necessary information (metadata) is not defined or included in the USFM or ESFM format, so the files alone do not provide enough data to typeset from directly.

Why ESFM?

First let me answer the question: why not XML? Part of the philosophy of this whole suite of programs (Bible Organisational System (BOS), Biblelator translation editor, Android CustomBible app, etc.) is to create Bible software that is hackable. For that, the file format must be relatively readable. I don't find XML easy to scan or read, and it's very difficult to extract information from. Besides that, if you want an XML Bible format, there's already the horrible OSIS and the more recent USX.

So ESFM is more user/hacker friendly than XML Bible file formats, but it's also more rigid than USFM in many ways, e.g., insisting on UTF-8 encoding, the .esfm file extension, etc. It also extends USFM 2.4 to cover some missing features, such as the ability to encode Strongs and other numbers, and makes internal and external links easier to encode.

ESFM Definition

ESFM is still being developed. It's based on the latest version of USFM, which at the time of writing is v2.4 dated June 26, 2013 (PDF), but it will be updated to become compatible with USFM v3 as soon as the new specifications are publicly released.

File headers

The file (which must be encoded in UTF-8) starts with the following three lines:
  \id JON - Open English Translation—Literal Version (OET-LV) v0.2.03
    id line contains 3-character Paratext book code followed by
    optional text and ending with the book version number
  \ide UTF-8
    ide must contain only UTF-8
  \rem ESFM v0.5 BBB
    rem (remark/comment) line contains the letters ESFM,
    the ESFM format version number (currently at v0.5)
    and the three-character BOS book code.
If any of the above are missing, the file may not be detected as an ESFM file. These lines are designed to remain compatible with Paratext and Bibledit while still being detectable by our Biblelator and other software.
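
The header check described above could be sketched in Python roughly as follows. This is only an illustration: the function name looks_like_esfm and the exact regular expressions are assumptions inferred from the three sample lines, not the detection code used by Biblelator or the BOS.

```python
import re

# Hypothetical patterns inferred from the three header lines shown above.
ID_RE = re.compile(r"^\\id ([A-Z0-9]{3})(?: .*)?$")
REM_RE = re.compile(r"^\\rem ESFM v(\d+(?:\.\d+)*) ([A-Z0-9]{3})$")


def looks_like_esfm(lines):
    """Return (paratext_code, esfm_version, bos_code), or None if the
    first three lines do not look like an ESFM header."""
    # Leading spaces are allowed in ESFM, so strip before matching.
    header = [line.strip() for line in lines[:3]]
    if len(header) < 3:
        return None
    id_match = ID_RE.match(header[0])
    rem_match = REM_RE.match(header[2])
    if id_match is None or rem_match is None:
        return None
    if header[1] != r"\ide UTF-8":
        return None
    return id_match.group(1), rem_match.group(1), rem_match.group(2)
```

For example, passing the lines `\id JON - Example v0.1`, `\ide UTF-8` and `\rem ESFM v0.5 BBB` (with BBB standing in for the real BOS book code, as in the sample above) would return the three extracted values, while a file missing the \rem line would return None.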

Restrictions over USFM 2.4

The following are the restrictions over USFM 2.4:

  1. Multiple newline markers (e.g., \p, \v) cannot occur on the same line.
  2. Leading spaces are allowed on lines to enable display of nesting for better human readability. These leading spaces are not publishable, i.e., they don't appear in formatted exports of the Scriptures.
  3. After any optional leading spaces, the first item on every line must be a newline marker, i.e., long lines are not allowed to wrap.
  4. Numbering is compulsory for numbered USFM 2.4 markers, i.e., \s1 is allowed but not \s.
  5. Deprecated USFM 2.4 markers (e.g., \ph1) may not be used.
  6. Chapter numbers are compulsory in books with verses (even in one-chapter books).
  7. Chapter numbers are NOT permitted in "books" without verses (such as glossaries).
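
As a rough illustration of the line-level restrictions (rules 1 to 5), here is a small Python sketch. The function name check_line and the three marker sets are assumptions made for this example: they hold only a few sample markers, not the full USFM 2.4 lists. Rules 6 and 7 need a whole-file view and are not covered.

```python
import re

# Assumption: small illustrative samples, NOT the complete USFM 2.4 lists.
NEWLINE_MARKERS = {"id", "ide", "rem", "h", "c", "v", "p", "m", "b", "r"}
NUMBERED_MARKERS = {"s", "q", "mt", "ms", "li", "pi"}
DEPRECATED_MARKERS = {"ph"}

# Marker name, optional number, optional closing asterisk.
MARKER_RE = re.compile(r"\\([a-z]+)(\d*)(\*?)")


def check_line(line):
    """Return a list of problems found on one line of an ESFM file."""
    problems = []
    text = line.lstrip(" ")  # rule 2: leading spaces are allowed
    if not text:
        return problems  # assumption: blank lines are simply ignored
    first = MARKER_RE.match(text)
    if first is None or first.group(3):
        # rule 3: every line must start with a newline marker
        return ["line does not start with a newline marker"]
    name, number = first.group(1), first.group(2)
    if name in DEPRECATED_MARKERS:
        problems.append(f"\\{name} is deprecated")  # rule 5
    elif name in NUMBERED_MARKERS and not number:
        problems.append(f"\\{name} must be numbered, e.g. \\{name}1")  # rule 4
    for later in MARKER_RE.finditer(text, first.end()):
        if later.group(3):
            continue  # closing markers such as \wj* are internal
        if later.group(1) in NEWLINE_MARKERS | NUMBERED_MARKERS:
            # rule 1: only one newline marker per line
            problems.append(
                f"second newline marker \\{later.group(1)} on the same line")
    return problems
```

A line such as `\p \v 1 ...` would be reported for rule 1, and `\s Heading` for rule 4, while an indented `\v 1 ...` line passes.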

Special characters

USFM 2.4 special characters are supported, plus some extras:

  1. A double forward slash // represents a line break within a field.
  2. A tilde ~ will be converted to a non-break space for USFM compatibility. ESFM prefers Unicode non-break spaces to be placed directly in the file.
  3. An underline character _ is used to join words for semantic reasons, usually indicating where one source language word requires two translated words, e.g., he_said. The underline characters may be separated where the word order must be changed, e.g., in such cases as I_have_ not _come where the Greek not preceded the verb.
  4. The equals sign = indicates tagged semantic information, usually linked to the previous word, e.g., he=PSimon indicates that the pronoun he refers to Simon. The end of the tag is marked by a space, end of line, or any punctuation character other than a forward slash, which is used to separate alternative possibilities (with the most likely one first). See here for more information on semantic tags.
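
To make the special characters above more concrete, here is a minimal Python sketch that handles the line break, the tilde and the semantic tags in one text field. The function name process_field is hypothetical, and the sketch assumes that a tag consists only of letters, digits, underscores and forward slashes. The underline word joiner is left untouched, because how it should be rendered is not defined here.

```python
import re

NBSP = "\u00a0"  # Unicode non-break space

# Assumption: a tag is made of word characters, with / between alternatives.
TAGGED_WORD_RE = re.compile(r"(\w+)=([\w/]+)")


def process_field(text):
    """Return (plain_text, tags) for the text of one ESFM field.

    tags is a list of (word, [alternatives]) pairs, with the most
    likely alternative first, as described above.
    """
    # Handle // first so a line break is never mistaken for tag slashes.
    text = text.replace("//", "\n").replace("~", NBSP)
    tags = [(match.group(1), match.group(2).split("/"))
            for match in TAGGED_WORD_RE.finditer(text)]
    plain_text = TAGGED_WORD_RE.sub(r"\1", text)
    return plain_text, tags
```

Given `Then he=PSimon said.`, this would return the plain text `Then he said.` together with the single tag pair for he and PSimon.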

Additional newline markers

In addition to the non-deprecated USFM 2.4 newline markers, the following are defined:

  1. \gl marks the start of a new glossary entry and contains only the keyword or phrase. An additional marker (or markers), e.g., \li will contain the actual definition. These should only appear in the GLO (glossary) "book".

Additional internal markers

In addition to the non-deprecated USFM 2.4 internal markers, the following are defined:

  1. \str .. \str* marks a Strongs number and should be immediately after the word it applies to.
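
A reader of such files might pull out the Strongs numbers with something like the following Python sketch. The function name extract_strongs is hypothetical, and the content format used in the example (H430, H1254) is only an assumption for illustration, since the content of the field is not specified above.

```python
import re

# A word immediately followed by \str ... \str*
STR_RE = re.compile(r"(\w+)\\str\s+([^\\]+?)\s*\\str\*")


def extract_strongs(text):
    """Return (text_without_str_fields, [(word, number), ...])."""
    numbers = [(match.group(1), match.group(2))
               for match in STR_RE.finditer(text)]
    # Keep the word itself and drop the \str ... \str* field after it.
    plain_text = STR_RE.sub(r"\1", text)
    return plain_text, numbers
```

For a verse fragment like `God\str H430\str* created\str H1254\str*`, this would leave the text `God created` and return each word paired with its number.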

USFM 3

USFM 3.0 is soon to be released (i.e., second half of 2016) by UBS/SIL. Since the new specification provides a way to encode Strongs numbers, the \str tag will no longer be required in ESFM.

Sample Files

Some handcrafted files can be viewed here (OET-LV).

Some exported ESFM files can be viewed here and (WEB) here.

Submitting ESFM Bible files to the Bible Drop Box

Paratext provides some additional information in a .ssf file, and this information can be used by the Bible Drop Box if included. However, even with the SSF file, ESFM Bibles don't yet specify things like the order in which to include the books (believe it or not, not all Christian traditions put Bible books in the same order), nor do they carefully define the short names of the books or the book abbreviations which will be used in section references and cross-references, etc. For this reason, the Bible Drop Box tries to make some intelligent guesses as to how to best format your data. In the future, we hope to find an easy way for you to specify some of this additional information. (Suggestions happily accepted.)

Depending on how the files were formed and what checks have been made on them, ESFM Bible files can easily contain MANY errors, e.g., fields like \v 7And Jesus said... (missing space after the verse number) or a cross-reference like Maat 3:7 (instead of Mat) or missing ESFM closing markers such as in Jesus said, \wj"Go." (missing \wj* and failure to use proper “quotation” characters). The Bible Drop Box software attempts to find and inform you of such errors and inconsistencies, at the same time trying to guess how to best handle them without just giving up. This may or may not be successful. (Your mileage may vary.) Of course, you will have the best result if you take the time to fix the issues in your files.
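
If you would like to catch a couple of these errors yourself before submitting, a simple check might look like the Python sketch below. It is not the Bible Drop Box code: the function name find_problems and the short list of character markers are assumptions for this example, and it only looks for a missing space after a verse number and for unbalanced closing markers on a single line.

```python
import re

# Assumption: a small sample of paired (character) markers to check.
CHARACTER_MARKERS = ("wj", "add", "nd", "str")

# A verse number directly followed by something other than
# whitespace, another digit or a hyphen, e.g. "\v 7And".
VERSE_NO_SPACE_RE = re.compile(r"\\v (\d+)(?=[^\s\d-])")


def find_problems(line):
    """Return a list of problem descriptions for one line."""
    problems = []
    for match in VERSE_NO_SPACE_RE.finditer(line):
        problems.append(
            f"missing space after verse number {match.group(1)}")
    for name in CHARACTER_MARKERS:
        # Opening marker: the name NOT followed by a letter, digit or *.
        opens = len(re.findall(r"\\" + name + r"(?![a-z0-9*])", line))
        closes = len(re.findall(r"\\" + name + r"\*", line))
        if opens != closes:
            problems.append(f"unbalanced \\{name} marker")
    return problems
```

Running this on the two faulty examples above, `\v 7And Jesus said...` and `Jesus said, \wj"Go."`, would report the missing space and the unbalanced \wj marker respectively.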

The Bible files (which must all have the .esfm file extension) should be combined into a zip archive (which must have the .zip file extension). Take careful note of which folder you are creating your zip file in, because you will need that information in order to submit the file.
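
If you prefer to script this step, the following Python sketch builds such an archive using the standard zipfile module. The function name zip_esfm_files is hypothetical, and storing the files flat (without subfolders) inside the archive is an assumption, not a documented requirement of the Bible Drop Box.

```python
import zipfile
from pathlib import Path


def zip_esfm_files(folder, zip_path):
    """Put every .esfm file from folder into a new zip archive.

    Returns the list of file names that were added.
    """
    esfm_files = sorted(Path(folder).glob("*.esfm"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in esfm_files:
            # Assumption: store each file flat, without its folder path.
            archive.write(path, arcname=path.name)
    return [path.name for path in esfm_files]
```

Calling `zip_esfm_files("MyBibleFolder", "MyBible.zip")` (both names are just examples) would create MyBible.zip in the current folder, which is then the location you need to remember when submitting.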