OCR for Puppy Linux
Posted: Sat 01 Oct 2011, 15:41
Here is a pet package called puppyocr that is based on the well-established Tesseract optical character recognition engine. The pet is 1.9 MB.
I have included a command-line interface wrapper written in C++ that makes the task of using OCR a bit friendlier than the raw Tesseract command-line interface.
To use it, install the pet and then type "puppyocr" (no quotes) in a terminal and simply follow the prompts there.
When prompted, type the name (including the extension) of a 'tif' file that is stored in your home folder.
You can use mtPaint or GIMP to create a 'tif' image file from a scanner.
When prompted for the name of the output file, you can use any name you like. The output file will be created in your home folder with a 'txt' extension added.
Edit: replaced with an updated version that checks for a suitable input file. If none is found, the program exits with a warning.
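For anyone curious what a wrapper like this does under the hood, here is a rough sketch in C++. It is not the actual puppyocr source: the function names (has_tif_extension, build_tesseract_command) and the exact form of the check are my own assumptions for illustration. The one thing it relies on from Tesseract itself is the basic command form "tesseract imagename outputbase", where Tesseract appends ".txt" to the output base, which is why you do not type the extension for the output file.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from puppyocr): true if the file name
// ends in ".tif" or ".tiff" and has at least one character before it.
bool has_tif_extension(const std::string& name) {
    auto ends_with = [&](const std::string& suffix) {
        return name.size() > suffix.size() &&
               name.compare(name.size() - suffix.size(),
                            suffix.size(), suffix) == 0;
    };
    return ends_with(".tif") || ends_with(".tiff");
}

// Hypothetical helper (not from puppyocr): builds the command line
//   tesseract "<home>/<input>" "<home>/<output_base>"
// Tesseract itself adds ".txt" to the output base name.
std::string build_tesseract_command(const std::string& home,
                                    const std::string& input,
                                    const std::string& output_base) {
    assert(!home.empty());  // the wrapper needs a home folder to work in
    return "tesseract \"" + home + "/" + input + "\" \"" +
           home + "/" + output_base + "\"";
}
```

A wrapper could call has_tif_extension on the name you typed, print a warning and exit if it returns false, and otherwise pass the string from build_tesseract_command to std::system. The quotes around each path are there so that file names with spaces still work.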
Enjoy!