Hi All,
I am looking for software that can extract text from an image. Basically, the input is a scanned image and the output is the text in it. For example, I would scan a page from a book, give the jpg/png image to the software, and it would give me the text written on it. Is there such software on Puppy to try?
Sincerely,
Srinivas Nayak
Character recognizing software
[Precise 571 on AMD Athlon XP 2000+ with 512MB RAM]
[Fatdog 720 on Intel Pentium B960 with 4GB RAM]
[url]http://srinivas-nayak.blogspot.com/[/url]
Re: Character recognizing software
snayak wrote: I will scan a page from a book, I will give the jpg/png image to the software and it will give me the texts written on it. Is there such a software on puppy to try?
Hi snayak, the software you require is called "OCR" software - "Optical Character Recognition".
I have tried several OCR programmes in Puppy, but the one I had most success with is called Tesseract.
Forum member rcrsn51 also released a utility called pic2txt, which works with Tesseract.
Pic2txt (which is placed in the Graphics menu) was a component of peasy pdf (which is in the Document menu). (EDIT: it is probably a component of "peasyscan", not peasy pdf, but I also use peasypdf to extract text from some PDFs, so maybe that can also do some of what you require.)
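For anyone who prefers to call Tesseract directly rather than through pic2txt, here is a minimal Python sketch. It assumes the `tesseract` command-line program is installed and on the PATH, and it relies on the basic `tesseract imagename outputbase -l lang` form, which writes the recognised text to outputbase.txt. The function names are made up for illustration, and this is not a description of how pic2txt itself works.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for a basic Tesseract run.

    Tesseract appends ".txt" to output_base by itself.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def output_text_path(output_base):
    """Return the path of the text file Tesseract writes for output_base."""
    # Append rather than replace the suffix, so "page.scan" -> "page.scan.txt".
    return Path(str(output_base) + ".txt")


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on one image and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    image = Path(image_path)
    # Write the result next to the image, e.g. page.png -> page.txt.
    output_base = image.with_suffix("")
    subprocess.run(build_tesseract_command(image, output_base, lang), check=True)
    return output_text_path(output_base).read_text(encoding="utf-8")
```

Calling `image_to_text("page.png")` (a hypothetical file name) should return whatever text Tesseract manages to read from the scan. The `lang` value must match a language data file that is actually installed.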
It is critical to ensure that the scan uses the best resolution, so trial and error is needed to find the settings that work best on your equipment. Also, the image sometimes needs scaling for pic2txt to analyse the characters well.
Successful OCR is a combination of art and science. Patience is required to find the most reliable setup parameters.
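The scaling step can also be tried by hand. The sketch below is only an illustration and assumes ImageMagick's `convert` command is installed; it uses its `-resize` option with a percentage. The function names, file names and the 200% default are hypothetical, so treat the percentage as one more setting to find by trial and error.

```python
import subprocess


def build_scale_command(src, dst, percent=200):
    """Return the ImageMagick argument list that scales src by percent into dst."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return ["convert", str(src), "-resize", f"{percent}%", str(dst)]


def scale_image(src, dst, percent=200):
    """Scale an image with ImageMagick before handing it to the OCR program."""
    subprocess.run(build_scale_command(src, dst, percent), check=True)
```

For example, `scale_image("scan.jpg", "scan-big.png", 300)` would write an enlarged copy that can then be given to the OCR program in place of the original scan.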
I will post back if I can find the appropriate peasypdf, pic2txt and Tesseract threads.
EDIT : start with this post and the ones following it:
http://murga-linux.com/puppy/viewtopic. ... 756#462756