... | ... | @@ -26,6 +26,8 @@ Plays in this corpus most often have a small description regarding the time and |
|
|
|
|
|
These descriptions are generally simple, and our encoding is very simple so far. We keep the text but not its formatting. We do not tag place or time expressions, although doing so would be useful future work. See the examples below.

We keep capitalization (e.g. in headings) as in the play.

![simple set](../img/set-id-12.png)
|
|
|
|
|
|
```xml
... | ... | @@ -34,6 +36,9 @@ These descriptions are generally simple, and our encoding is very simple so far. |
<p>Zytt: Jetzt.</p>
</set>
```
|
|
|
### Initial encoding of complex set information

Used in 2020.

Some authors use more complex conventions (e.g. Camille Jost, who breaks up the setting information in a more detailed way). For these cases, we used `<p>` elements and kept the bold format used for headings (_ORT_, _ZYTT_ and so on; see the example below). We kept the format rather than using a `<head>` element because we were using a single `<set>` element, and each `<set>` only allows one `<head>`. We plan to improve this; the whole encoding of setting information may be made more semantic in the future.
|
|
|
|
... | ... | @@ -48,7 +53,23 @@ Some authors use more complex conventions (e.g. Camille Jost, who breaks up the |
</set>
```
|
|
|
|
|
|
The way we encode sets may be revised in the future and made more semantic, e.g. by using several `<set>` elements with a relevant `@type` value (_time_, _place_ and so on) instead of `<p>` elements, and by adding named-entity tagging.
|
|
|
### Current encoding of complex set information
|
|
|
|
|
|
Used from 2021 onwards. The earlier convention was revised to make it more semantic.

Several `<set>` elements are used, one per kind of setting information, instead of a single `<set>` holding `<p>` elements.

When the content is specifically about one type of information (time, place, scene description), an `@ana` attribute specifies this (note that `<set>` does not accept a `@type` attribute).
|
|
|
|
|
|
The current values of `@ana` are listed in the table below:

| `@ana` | Description |
| ------ | ------ |
| `performance-duration` | How long a performance of the play takes |
| `place` | Place in which the play takes place |
| `scene` | Scene information (decoration, character positions when the play starts, sounds ...), as long as it is part of the front. If it is part of the first act or scene, it is encoded as a `<stage>`, as per the TEI Guidelines |
| `time` | Time at which the play is set |
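
As a rough illustration of the convention described above, the fragment below sketches how setting information split into a place and a time part might look with one `<set>` per kind of information. It is not copied from the corpus: the _ORT_ and _ZYTT_ headings and the _Jetzt._ text are borrowed from the examples mentioned earlier, the place description is left as a placeholder comment, and writing the `@ana` values as bare keywords (rather than, say, pointers starting with `#`) is an assumption based on the table.

```xml
<!-- Illustrative fragment only, not taken from an encoded play -->
<front>
  <!-- One <set> per kind of information; @ana carries a value from the table -->
  <set ana="place">
    <head>ORT</head>
    <p><!-- place description as printed in the play --></p>
  </set>
  <set ana="time">
    <head>ZYTT</head>
    <p>Jetzt.</p>
  </set>
</front>
```

Because each `<set>` now holds a single kind of information, its heading can go in a `<head>` element instead of a formatted `<p>`.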
|
|
|
|
|
|
|
|
|
## Play summary
|
|
|
|
... | ... | @@ -58,4 +79,4 @@ A small number of plays have this, e.g. in the shape of an _Inhaltsangabe_ parag |
|
|
|
|
|
If the different types of information bear a **heading**, it is kept in a `<head>` element, as long as the TEI-All schema allows it (we have no custom schema so far).
|
|
|
|
|
|
When the **language** for the content of an element is not Alsatian, we add an `xml:lang` attribute with the [ISO 639-2 (B)](https://www.loc.gov/standards/iso639-2/php/code_list.php) language code as value. The language for front information is often German (`ger`) in this corpus.
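
A minimal sketch of the heading and language conventions combined, assuming a `<set>` in the front whose content is German. The heading and the German sentence are invented for illustration and do not come from a play in the corpus.

```xml
<!-- Illustrative fragment only; the German wording is invented -->
<set ana="place" xml:lang="ger">
  <!-- The printed heading is kept in <head> -->
  <head>Ort</head>
  <p>Eine Bauernstube.</p>
</set>
```

Here `xml:lang="ger"` marks the whole element as German, so it does not need to be repeated on the `<head>` and `<p>` children.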