Tei-Parsel 1.0

GitHub: https://github.com/michaelafalatkova/tei-parsel-1.0.git

Authors: Michaela Falátková and Jan Marek

Tei-Parser 1.0 is a Python application that transforms the content of a TEI document that follows the Lombard Press Schema 1.0.0 guidelines (https://lombardpress.org/schema/docs/diplomatic/) into a web presentation. The web presentation is a static display of the pages of the TEI document. Each page contains an image and its transcribed content.

How to Use

  1. Install Python (Miniconda): download it from https://docs.conda.io/en/latest/miniconda.html
  2. Set up a dedicated Python environment, or use the base one
  3. Clone (or download) the repository and cd into it
  4. Install the dependencies
  5. Run the program from the command line
  6. Copy the generated HTML file, together with the images and bootstrap folders, to the target location
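
Step 6 can also be scripted. The following is a minimal sketch, not part of the repository: the function name publish and the folder names in the usage note are assumptions for illustration.

```python
import shutil
from pathlib import Path


def publish(html_file, asset_dirs, target_dir):
    """Copy the generated HTML file and its asset folders into target_dir.

    Hypothetical helper; the names are illustrative, not taken from the project.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Copy the HTML file itself, keeping its file name.
    copied = [shutil.copy2(html_file, target / Path(html_file).name)]

    for folder in asset_dirs:
        src = Path(folder)
        # dirs_exist_ok (Python 3.8+) allows re-publishing over an earlier copy.
        copied.append(shutil.copytree(src, target / src.name, dirs_exist_ok=True))

    return [Path(p) for p in copied]
```

For example, publish("output-full.html", ["images", "bootstrap"], "site") would place everything under a site folder, assuming the images and bootstrap folders sit next to the generated file.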

Detailed Guide

Download and install Miniconda (https://docs.conda.io/en/latest/miniconda.html)

Open Anaconda Prompt (miniconda3); on Windows you can find it through the Start menu search

cd c:\tei-parser (or to the location where you downloaded this repository)

Python environment setup (Required for running the program)

pip install -r requirements.txt

Run from the CLI (from the repository root folder)

python run.py --input_file_path ./_example/Paris-Lat-9765.xml --output_file_path output-full.html
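
To convert several TEI files in one go, the command above can be wrapped in a small script. This is only a sketch: it assumes run.py handles one file per invocation, exactly as shown above, and that the script is started from the repository root. The helper names build_command and convert_all are made up for this example.

```python
import subprocess
import sys
from pathlib import Path


def build_command(xml_path, out_dir="."):
    """Build the run.py command line for one TEI file (hypothetical helper)."""
    xml_path = Path(xml_path)
    output = Path(out_dir) / (xml_path.stem + ".html")
    return [
        sys.executable, "run.py",
        "--input_file_path", str(xml_path),
        "--output_file_path", str(output),
    ]


def convert_all(folder, out_dir="."):
    """Run the converter once for every .xml file found in folder."""
    for xml_path in sorted(Path(folder).glob("*.xml")):
        # check=True stops at the first file that fails to convert.
        subprocess.run(build_command(xml_path, out_dir), check=True)
```

Calling convert_all("./_example") would then produce one HTML file per XML file, each named after its source document.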

Example Structure

Extracted Fields

Section | Element | Meaning | Example | Rendering
--------|---------|---------|---------|----------
teiHeader | title | title | | header, title
teiHeader | editionStmt/edition/date | date of translation | | header
text | pb | page break | <pb facs="" n="2v"/> | starts a new page; facs: link to an image; n: page label, shown above the page
text | p | paragraph | <p></p> | same as HTML p
text | lb | line break | <lb/> | not used
text | note | note | <note></note> | shown under the text; types do not matter; notes are numbered and the number is displayed in the text
text | quote | quote | <quote></quote> | italics (nothing else)
text | ref | reference | <ref><name>Horosii</name> narrat<title>historia</title><link>seznam.cz</link></ref> | link: <a href="seznam.cz"><u>Horosii narrat historia</u></a>
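
The mapping for the text elements listed above can be illustrated with a short sketch built on the standard library's xml.etree.ElementTree. It is not the code used by Tei-Parser. It assumes that pb elements are direct children of the element being rendered, that notes are numbered per page, and that the words inside ref are joined with single spaces. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape


def local(tag):
    """Drop an XML namespace such as '{http://www.tei-c.org/ns/1.0}'."""
    return tag.rsplit("}", 1)[-1]


def render(node, notes):
    """Render the text and children of node (not node itself or its tail)."""
    parts = [escape(node.text or "")]
    for child in node:
        parts.append(render_element(child, notes))
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def render_element(el, notes):
    name = local(el.tag)
    if name == "p":
        return "<p>" + render(el, notes) + "</p>"
    if name == "quote":
        return "<i>" + render(el, notes) + "</i>"
    if name == "note":
        # Collect the note for display under the text; leave its number inline.
        text = render(el, notes)
        notes.append(text)
        return "<sup>%d</sup>" % len(notes)
    if name == "ref":
        href, words = "", [el.text or ""]
        for child in el:
            if local(child.tag) == "link":
                href = (child.text or "").strip()
            else:
                words.append("".join(child.itertext()))
            words.append(child.tail or "")
        # Assumption: join the pieces with single spaces.
        label = " ".join(w.strip() for w in words if w.strip())
        return '<a href="%s"><u>%s</u></a>' % (escape(href), escape(label))
    if name == "lb":
        return ""  # line breaks are not used
    return render(el, notes)  # unknown elements: keep only their content


def render_pages(body):
    """Split the children of body into pages at each pb element."""
    pages, notes = [], []
    current = {"n": None, "facs": None, "html": "", "notes": notes}
    for child in body:
        if local(child.tag) == "pb":
            if current["html"]:
                pages.append(current)
            notes = []
            current = {"n": child.get("n"), "facs": child.get("facs"),
                       "html": "", "notes": notes}
        else:
            current["html"] += render_element(child, notes)
    pages.append(current)
    return pages
```

Each returned page carries the n label, the facs image link, the rendered HTML and the list of collected notes, which is enough to lay out one image next to its transcription.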