What it does
The tool loads a pretrained YOLO model and predicts a bounding box around every region of the input image where text is found. It is based on DocLayout-YOLO, a document layout analysis model: https://github.com/opendatalab/DocLayout-YOLO. The pretrained YOLO weights can be downloaded from: https://huggingface.co/juliozhao/DocLayout-YOLO-DocLayNet-Docsynth300K_pretrained/tree/main
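The prediction step described above might look roughly like the sketch below. It is not necessarily how this tool is implemented. It assumes the `doclayout_yolo` Python package from the DocLayout-YOLO repository is installed and that its `YOLOv10` class and `predict` call behave as shown in that project's README. The function names `to_records` and `detect_text_boxes`, the default `imgsz`, `conf`, and `device` values, and the file paths in the usage comment are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def to_records(
    xyxy: Sequence[Sequence[float]],
    confs: Sequence[float],
    class_ids: Sequence[float],
    names: Dict[int, str],
) -> List[dict]:
    """Pair each predicted box with its confidence and class name."""
    records = []
    for box, conf, cls_id in zip(xyxy, confs, class_ids):
        x1, y1, x2, y2 = (float(v) for v in box)
        records.append(
            {
                "box": (x1, y1, x2, y2),  # corner coordinates in pixels
                "conf": float(conf),
                "label": names[int(cls_id)],
            }
        )
    return records


def detect_text_boxes(
    image_path: str,
    weights_path: str,
    imgsz: int = 1024,    # assumed value, taken from the upstream README example
    conf: float = 0.2,    # assumed confidence threshold
    device: str = "cpu",  # assumed; e.g. "cuda:0" when a GPU is available
) -> List[dict]:
    """Run the pretrained model on one image and return its boxes."""
    # Imported lazily so the helper above works without the package installed.
    from doclayout_yolo import YOLOv10

    model = YOLOv10(weights_path)
    results = model.predict(image_path, imgsz=imgsz, conf=conf, device=device)
    first = results[0]  # one result object per input image
    return to_records(
        first.boxes.xyxy.tolist(),
        first.boxes.conf.tolist(),
        first.boxes.cls.tolist(),
        first.names,
    )


# Example call (both paths are placeholders):
# boxes = detect_text_boxes("page.png", "path/to/downloaded_weights.pt")
# for b in boxes:
#     print(b["label"], b["conf"], b["box"])
```

Keeping the conversion in `to_records` separate from the model call means the output format can be checked without loading the weights.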