(old)Puppy Linux Discussion Forum

Posted: **Sat 01 Oct 2011, 15:41**

Here is a pet package called puppyocr that is based on the well-established Tesseract optical character recognition engine. The pet is 1.9 MB.

I have included a command-line interface wrapper written in C++ that makes the task of using OCR a bit friendlier than the raw Tesseract command-line interface.

To use it, install the pet and then type "puppyocr" (no quotes) in a terminal and simply follow the prompts there.

When asked, type the name (including the extension) of a 'tif' file that is stored in your home folder.

You can use mtPaint or GIMP to create a 'tif' image file from a scanner.

When prompted for the name of the output file, you can use any name you like. This output file will be created in your home folder but with a 'txt' extension.
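For readers curious how the two answers relate to the raw Tesseract command line, here is a minimal sketch in Python. It is not the puppyocr source (the wrapper itself is written in C++); the function names and the `home` parameter are made up for illustration. It relies only on the classic `tesseract <image> <outputbase>` form, where Tesseract itself appends `.txt` to the output base.

```python
from pathlib import Path, PurePosixPath


def build_tesseract_command(image_name, output_name, home=None):
    """Return the argument list for a raw Tesseract call.

    image_name  -- name of the 'tif' file, including its extension
    output_name -- output name WITHOUT extension (Tesseract adds '.txt')
    home        -- folder holding both files; defaults to the home folder
    """
    # Puppy is a Linux system, so POSIX-style paths are assumed here.
    home = PurePosixPath(home if home is not None else Path.home())
    return ["tesseract", str(home / image_name), str(home / output_name)]


def expected_output_file(output_name, home=None):
    """Return the path of the text file Tesseract is expected to write."""
    home = PurePosixPath(home if home is not None else Path.home())
    return home / (output_name + ".txt")
```

To actually run it, the list could be passed to `subprocess.run(cmd, check=True)`, assuming the `tesseract` binary is on the PATH.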

Edit: replaced with an updated version that checks for a suitable input file. If none is found, the program exits with a warning.
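As an illustration of that kind of check, here is a rough Python sketch. What puppyocr actually treats as "suitable" is not spelled out above, so the rule used here (the file exists and ends in `.tif` or `.tiff`) is an assumption, and the function names are hypothetical.

```python
import sys
from pathlib import Path

# Assumed rule: only these extensions count as a suitable input image.
TIF_SUFFIXES = {".tif", ".tiff"}


def is_suitable_input(path):
    """True if path is an existing regular file with a tif extension."""
    path = Path(path)
    return path.is_file() and path.suffix.lower() in TIF_SUFFIXES


def require_input(path):
    """Return the path if it is suitable; otherwise exit with a warning."""
    if not is_suitable_input(path):
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(f"Warning: {path} is not a usable 'tif' file, exiting.")
    return Path(path)
```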

Enjoy!

Posted: **Mon 03 Oct 2011, 04:18**

Thanks! I was just thinking about using something like this earlier today.

I've downloaded it, and I'll give it a try sometime later this week!

Posted: **Sun 10 May 2015, 00:53**

An error occurred... Is the link dead?

Posted: **Sun 10 May 2015, 01:06**

That would be the forum attachment limit biting you, Pelo. When Flash and/or John Murga set the maximum attachment size to 256 KB, everything larger that had been attached previously was deleted.

Sorry, it's gone...

Posted: **Sun 10 May 2015, 06:26**

Newer version available here:
http://akita.scottjarvis.com/puppyocr-1.22.pet
And mirrored here:
http://smokey01.com/saintless/Fredx181/ ... r-1.22.pet
Maybe this will also help (Edit: No, the download links do not work there):
http://www.murga-linux.com/puppy/viewto ... 7f975e7829

Posted: **Sat 23 May 2015, 11:12**

The pet is stored in my toolbox.
With PuppyOCR I read old documents from 1800 to 1900 about the history of France. In spite of errors, 95 percent of the text is recognized. Don't ask for too much: fifteen lines at a time are enough, and a whole page does not work well. Often these documents were scanned from books and the edges are truncated.
PuppyOCR does what it can, and that is as much as the others do.


Thanks, saintless.