## Introduction to XML and TEI Part II: Critical Editing with TEI

Jeffrey C. Witt (Loyola University Maryland)

Encoding the Apparatus Criticus

The first thing we need to know is that a field-standard schema for critical editing is still in development.

The central challenge lies in replacing the visually and ambiguously encoded apparatus criticus with a semantically encoded database.

This primarily means we need to abstract from the way we eventually expect our apparatus to look. We must think in terms of data types.

What are the different types required to construct an apparatus?

  • Lemma
  • Reading
  • Sigla
  • Some kind of abbreviated description of the type of variation
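To make "thinking in terms of data types" concrete, the list above can be sketched as plain data structures, independent of any print layout. This is only an illustrative Python sketch: the class and field names (`Reading`, `ApparatusEntry`, `variation_type`) are assumptions made here, not names taken from TEI or from any project schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Reading:
    """One variant reading, with the sigla of the witnesses that attest it."""
    text: str
    witnesses: List[str]
    # Abbreviated description of the kind of variation (hypothetical field name).
    variation_type: Optional[str] = None


@dataclass
class ApparatusEntry:
    """A lemma together with the readings that vary from it."""
    lemma: str
    readings: List[Reading] = field(default_factory=list)


# Sample values: the lemma "fides" with the reading "fidem" in witness V.
entry = ApparatusEntry(lemma="fides", readings=[Reading("fidem", ["V"])])
print(entry.lemma, entry.readings[0].text, entry.readings[0].witnesses)
```

Nothing in these structures says how the entry will eventually look on the page, which is the point of the abstraction.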

TEI Guidelines for a Critical Apparatus

See: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/TC.html

The basic form of a TEI apparatus entry is as follows:

<app>
<lem>fides</lem>
<rdg wit="#V">fidem</rdg>
</app>

The @type attribute

There are obviously more kinds of variants than a simple variation in form.

There are omissions, corrections, additions, etc.

To deal with this variation, we are developing a taxonomy of "types" that you can add to your apparatus entry as follows.

<app>
<lem>fides</lem>
<rdg wit="#V" type="omission"/>
</app>

Or

<app>
<lem>fides</lem>
<rdg wit="#V" type="correction-substitution">
<subst>
<del>fidem</del>
<add>fides</add>
</subst>
</rdg>
</app>

Semantic encoding and new possibilities

Because we are encoding data types and not presentation forms...

And because our digital representation allows us to transcend the concerns of limited space...

We can add additional data to our apparatus criticus that is simply not possible in printed editions.

For example, we can add a note specific to each apparatus entry.

<app>
<lem>fides</lem>
<rdg wit="#V">fidem</rdg>
<note>This is a note about this specific apparatus entry. Here you could explain at length why you made this decision along with any other pertinent details related to this variant entry.</note>
</app>

With all this in place, we can render the critical apparatus as seen in this demo:
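As a rough illustration of that rendering step (not the demo itself), the sketch below turns one semantically encoded entry into a conventional print-style line. It rests on several assumptions: the function name `render_app`, the second witness `#S`, and the output conventions (`]` after the lemma, `om.` for an omission, ` : ` between readings) are hypothetical choices made for this example, and the fragment omits the TEI namespace that a full document would carry.

```python
import xml.etree.ElementTree as ET

# Sample entry; the witness #S and its omission are invented for illustration.
ENTRY = """
<app>
  <lem>fides</lem>
  <rdg wit="#V">fidem</rdg>
  <rdg wit="#S" type="omission"/>
  <note>Editorial note on this entry.</note>
</app>
"""


def render_app(app_xml: str) -> str:
    """Render one <app> element as a print-style apparatus line."""
    app = ET.fromstring(app_xml)
    lemma = (app.findtext("lem") or "").strip()
    parts = []
    for rdg in app.findall("rdg"):
        # @wit holds space-separated pointers such as "#V"; drop the "#".
        sigla = " ".join(ref.lstrip("#") for ref in rdg.get("wit", "").split())
        if rdg.get("type") == "omission":
            parts.append(f"om. {sigla}")
        else:
            text = "".join(rdg.itertext()).strip()
            parts.append(f"{text} {sigla}")
    return f"{lemma}] " + " : ".join(parts)


print(render_app(ENTRY))  # fides] fidem V : om. S
```

The `<note>` is ignored here; because it is a separate element, a different renderer could just as easily show it in a pop-up or a footnote without any change to the encoding.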

A note on the state of development

Note that while we can offer some guidelines about how to encode an apparatus semantically, these guidelines are very much a work in progress.

My goal and the goal of the groups I work with is to help provide guidelines that can standardize this progress as much as possible.

Our work is advanced by partnering with groups who are committed to semantic encoding.

As you encounter new use cases and encoding problems, we can develop an increasingly robust set of guidelines and best practices.

But the point is that this is a partnership, in which we develop together.

This means we will encounter encoding issues to which I cannot give you a direct answer.

But if you can log your encoding issue into our system, we can develop a feature request and work to support this or that data type.

Dealing with uncertainty

The @type=manual

<app>
<lem>fides</lem>
<rdg wit="#V" type="manual">fides] fides
<hi rend="italic">corr ex.</hi> interl. fidem</rdg>
</app>

Note: This is BAD Semantic encoding. It should not be encouraged. But it is a method we can use as we collect information about various use cases. Each @type=manual becomes an instance that alerts us to the need to create an abstract data type.

Feature Requests and Encoding Questions

Feature requests and encoding questions can be submitted here: https://github.com/lombardpress/lombardpress-schema/issues. All you need to do is create a new issue and describe your encoding problem. Ideally, add a screenshot of the encoding problem you are facing.

Every time you are tempted to create an @type=manual, you should create an issue asking if there is a proper type for the variant instance you are currently dealing with.
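One way to keep track of these cases is to list every reading in a file that still carries `type="manual"`, so that each one can be turned into an issue. The following Python sketch assumes a namespace-free fragment wrapped in a `<p>` element; the function name `find_manual_readings` and the wrapper are hypothetical, and the manual reading is the example shown earlier.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<p>
  <app><lem>fides</lem><rdg wit="#V">fidem</rdg></app>
  <app><lem>fides</lem><rdg wit="#V" type="manual">fides] fides
  <hi rend="italic">corr ex.</hi> interl. fidem</rdg></app>
</p>
"""


def find_manual_readings(xml_text: str):
    """Return (lemma, witnesses, reading text) for every type="manual" reading."""
    root = ET.fromstring(xml_text)
    found = []
    for app in root.iter("app"):
        lemma = (app.findtext("lem") or "").strip()
        for rdg in app.findall("rdg"):
            if rdg.get("type") == "manual":
                # Flatten nested markup and collapse runs of whitespace.
                text = " ".join("".join(rdg.itertext()).split())
                found.append((lemma, rdg.get("wit", ""), text))
    return found


for lemma, wit, text in find_manual_readings(SAMPLE):
    print(f"{lemma} ({wit}): {text}")
```

Each line of output is a candidate for a new issue describing the variant that lacks a proper type.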

The emerging form of organizational oversight for editorial schema

Planned Meetings

  • LombardPress Team Meeting, Barcelona, July 12th-14th
  • Digital Latin Library Meeting, Raleigh, NC, USA July 27th-28th
  • GRPL meeting, Boston, MA, USA August 1st
  • Sentences Commentary Meeting, Basel, Switzerland, August 15th-19th

The point is that a lot of work is currently ongoing, and there is a lot of energy behind it.

If your team commits to this approach, we can leverage your interest and you can have a major voice at the table.

For example, it would be ideal to meet on a yearly basis to develop our encoding standards. Next summer, it would be great to have a representative from the Petrus Hispanus project at each of these standards meetings.

Encoding the Apparatus Fontium

There are two major types of data in a critical edition that usually get linked to an apparatus fontium.

  • Quotations
  • References

Quotations

We encode quotations as follows:

<cit>
<quote>This is a quotation, perhaps from Augustine's City of God</quote>
<bibl>Augustine, City of God, III, 2, 5 (PL XX:XX)</bibl>
</cit>

References

We encode references as follows:

<cit>
<ref>Magister in libro I, distinctione 23</ref>
<bibl>Lombard, Sententiae, I, d. 23, c. 2 (Brady I:XX)</bibl>
</cit>

Encoding the Diplomatic or Documentary Edition

The emerging form of organizational oversight for schema of diplomatic editions

Creating diplomatic transcriptions alongside our critical texts allows us to record all kinds of detail about each witness that are normally squeezed out of print critical editions by the limits of printing space.

For example, we can record both medieval and normalized spellings:

<choice>
<orig>sicud</orig>
<reg>sicut</reg>
</choice>
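Because both spellings are recorded, the same transcription can yield either a diplomatic or a normalized text on request. The Python sketch below shows one possible way to do this; the helper name `flatten`, the surrounding words in the sample line, and the absence of a TEI namespace are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample line; only the <choice> comes from the example above.
LINE = "<p>et <choice><orig>sicud</orig><reg>sicut</reg></choice> dicit</p>"


def flatten(elem, prefer: str) -> str:
    """Return the text of elem, keeping only the `prefer` child of each <choice>."""
    pieces = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            chosen = child.find(prefer)
            if chosen is not None:
                pieces.append(flatten(chosen, prefer))
        else:
            pieces.append(flatten(child, prefer))
        # Text that follows the child element belongs to the parent.
        pieces.append(child.tail or "")
    return "".join(pieces)


paragraph = ET.fromstring(LINE)
print(flatten(paragraph, "orig"))  # et sicud dicit
print(flatten(paragraph, "reg"))   # et sicut dicit
```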

But that only begins to scratch the surface. We can record line breaks, marginal notes, and even medieval punctuation.

Creating diplomatic transcriptions opens up all kinds of new possibilities.

For example: on-demand collations

For example: transcription-image line alignment
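To give a sense of what an on-demand collation could involve, the sketch below compares the plain text of two transcriptions word by word using Python's standard `difflib` module. It is a deliberately naive illustration, not a description of any existing collation tool: the function name `collate` and the two sample strings are invented, and a real collation would also need to handle normalization, transpositions, and more than two witnesses.

```python
from difflib import SequenceMatcher


def collate(base: str, witness: str):
    """Return (operation, base words, witness words) for each point of difference."""
    a, b = base.split(), witness.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            variants.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return variants


# Invented sample texts standing in for two witness transcriptions.
base_text = "fides est substantia rerum"
witness_text = "fidem est rerum"

for tag, base_words, witness_words in collate(base_text, witness_text):
    print(tag, repr(base_words), repr(witness_words))
```

Here a `replace` corresponds to a variant reading and a `delete` to an omission in the second witness, which is the raw material from which typed apparatus entries could then be drafted.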