Releases: ocropus/hocr-tools

Add new script hocr-cut for cutting a page

02 Mar 15:47
  • Add new script hocr-cut for cutting a page #108
  • Add --savefile argument to hocr-pdf #125 #126
  • Reformat code according to PEP 8, plus various other cleanup and documentation work

See details: v1.2.0...v1.3.0

Add new script hocr-wordfreq + switch to argparse module

29 Mar 16:44
  • hocr-wordfreq: word frequency counter #93 #96 #98 #99 #100 #104
  • Switch to argparse module #82 #97
  • Remove numpy dependency, rewrite edit distance algorithm #88
  • Extend hocr-pdf to also work with lines #107
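
For readers curious what an edit distance without numpy can look like, here is a minimal sketch using plain Python lists. It is not the code from #88; the function name `edit_distance` and the row-by-row approach are assumptions chosen for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, using plain lists only.

    Illustrative sketch; not the implementation used in hocr-tools.
    """
    # previous[j] is the distance between the prefix of a handled so far
    # and b[:j]. Before the loop that prefix is empty, so the distance is j.
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        # Distance between a[:i] and the empty prefix of b is i deletions.
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]
```

Keeping only two rows of the distance table at a time avoids the need for a two-dimensional array, which is one way such a routine can drop an array library.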

See details: v1.1.1...v1.2.0

Fix hocr-combine, hocr-eval, hocr-lines and add more tests

23 Oct 07:00
  • Fix hocr-combine: Remove the call to importNode, which does not exist in etree and no longer seems necessary.
  • Fix hocr-eval: The get_text function in this file failed in Python 3; it now uses the same implementation as the other tools.
  • Fix hocr-lines: It was outputting byte strings in Python 3.
  • Add tests for hocr-combine, hocr-eval, hocr-eval-geom, hocr-lines

See details: v1.1.0...v1.1.1

Python 3 Compatibility

27 Sep 19:58

The hocr-tools are now compatible with Python 2 as well as Python 3!

  • Convert print statements to the Python 3 form and use from __future__ import print_function
  • Fix hocr-eval-lines, add tests
  • Start code cleaning according to PEP 8 coding styles
  • Add Dockerfile for consistent local testing
  • Load from filename, not stream

See details: v1.0.1...v1.1.0

Bugfixing in hocr-split, hocr-pdf, hocr-check

20 Sep 14:50

Fixed bugs

  • hocr-split: Duplicate content in <html> #58
  • hocr-pdf: ocr_line does not have to be a span (e.g. also a div is possible) #57
  • hocr-check: Fix containment checks and metadata checks, add tests #52 #61 #62

Ongoing work

  • Check handling of non-ASCII characters in hOCR files #53
  • Make hocr-tools fit for Python 3 #37

See details: v1.0.0...v1.0.1

Start releasing on GitHub and PyPI

01 Sep 15:54

We are now starting to release on GitHub and PyPI. Today's v1.0.0 marks the beginning of this activity. We have also retrospectively tagged some important older points in the history with version numbers starting with 0.

Fix rawtext

01 Sep 15:58

Unit tests, Travis

01 Sep 15:58

Switch to lxml, add Apache licence

01 Sep 15:58

Readme

01 Sep 15:57
