Abstract
These are the official datasets created for the Tibetan Manuscript Project Vienna (TMPV) in 2023 and 2024. They contain:
- OCR datasets (pairs of line images and line labels) created from the PageXML annotations
- PageXML (Transkribus) annotations in Unicode and Wylie
- PageXML layout annotations (lines, images, captions, margins) used for image segmentation training
- OCR models (PyTorch checkpoints and ONNX model files)
Original language | English |
---|---|
Media of output | Online |
Size | 1.3 GB |
DOIs | |
Publication status | Published - 29 Nov 2024 |
Austrian Fields of Science 2012
- 602050 Tibetan studies
Projects
- Himalayan Sutra Collections
Viehbeck, M., Silk, J., Helman-Ważny, A., Berounsky, D., Choekhortshang, N. W., Laine, B. & Tauscher, H.
1/09/22 → 31/08/26
Project: Research funding