PeasyScan Image Scanner Program

rcrsn51 · #101 Post by **rcrsn51** » Fri 03 Nov 2017, 12:20

No reply after two days?

Argolance · #102 Post by **Argolance** » Fri 03 Nov 2017, 17:26

Hello,
Sorry! It happened while trying Puppy on a friend's laptop. I will see him tomorrow and then give you what you asked for!

Cordialement.

rcrsn51 · #103 Post by **rcrsn51** » Fri 03 Nov 2017, 17:30

I am going to guess that your friend has an old Epson scanner. See the main post on page 1 for special instructions about Epson.

Argolance · #104 Post by **Argolance** » Sun 05 Nov 2017, 15:27

Bonjour,

rcrsn51 wrote:I am going to guess that your friend has an old Epson scanner. See the main post on page 1 for special instructions about Epson.

You guessed right! The Epson Stylus CX5400 printer-scanner works after doing this:

rcrsn51 wrote:Open the file /etc/sane.d/dll.conf. Uncomment the line "epson" by removing the # symbol. Then comment out "epson2" by adding a #.

... and then using PeasyScan 2.7/2.12. (It was already working with xsane without any changes.)
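The quoted dll.conf change can also be made from a terminal. The following is only a sketch: the function name `use_old_epson_backend` is made up for illustration, and it assumes the two backend lines appear in the file exactly as `#epson` and `epson2`. It keeps a backup copy next to the file before rewriting it.

```shell
#!/bin/sh
# Hypothetical helper (not part of PeasyScan or SANE): enable the old
# "epson" backend and disable "epson2" in a SANE dll.conf file.
# Usage: use_old_epson_backend [path-to-dll.conf]
use_old_epson_backend() {
    conf="${1:-/etc/sane.d/dll.conf}"
    cp "$conf" "$conf.bak" || return 1    # keep a backup copy
    # First expression: uncomment a line that reads "#epson".
    # Second expression: comment out a line that reads "epson2".
    sed -e 's/^#[[:space:]]*epson[[:space:]]*$/epson/' \
        -e 's/^epson2[[:space:]]*$/#epson2/' \
        "$conf.bak" > "$conf"
}
```

Run it as root with no argument to edit /etc/sane.d/dll.conf, or pass the path of a copy to try it out first.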

Thanks.

Cordialement.

rcrsn51 · #105 Post by **rcrsn51** » Sun 05 Nov 2017, 15:34

Excellent. Just for fun, read the post here from 2010.

I have always wondered what happened to Béèm.

Argolance · #106 Post by **Argolance** » Mon 06 Nov 2017, 10:11

Bonjour,

rcrsn51 wrote:Peasyscan generates some large, temporary PNM image files in /root. They are deleted when the program terminates. Maybe they should be placed elsewhere.

Yes indeed! If the program does not end properly for any reason, the image will not be deleted and may cause problems with a small, nearly full pupsave. So, just out of curiosity, why /root/scan (by default)?

Cordialement.

rcrsn51 · #107 Post by **rcrsn51** » Mon 06 Nov 2017, 12:20

That was the original program in 2010. Now the large pnm file is stored in /tmp. Only the final png file is stored in /root after you save it.
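For anyone adapting the script, the concern about leftover temporary files can be handled with a shell trap. This is a generic sketch, not PeasyScan's actual code: `with_temp_pnm` and the `/tmp/peasyscan.XXXXXX` template are assumed names. The intermediate file lives in /tmp and is removed when the subshell exits, whether the wrapped command succeeded or failed.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a temporary file in /tmp as its
# last argument, and remove that file afterwards in every case.
with_temp_pnm() {
    (
        tmpfile=$(mktemp /tmp/peasyscan.XXXXXX) || exit 1
        # The EXIT trap fires when this subshell ends, normally or not.
        trap 'rm -f "$tmpfile"' EXIT
        # Turn common signals into a normal exit so the EXIT trap still runs.
        trap 'exit 1' INT TERM HUP
        "$@" "$tmpfile"
    )
}
```

A kill -9 or a power cut still leaves the file behind, but /tmp is normally not inside the pupsave, which is the point of the change described above.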

Argolance · #108 Post by **Argolance** » Mon 06 Nov 2017, 16:43

rcrsn51 wrote:That was the original program in 2010. Now the large pnm file is stored in /tmp. Only the final png file is stored in /root after you save it.

OK! Thanks.

Argolance · #109 Post by **Argolance** » Fri 10 Nov 2017, 10:32

Bonjour,
- When scanning an image for OCR, I noticed that PeasyScan searches for the "tessdata" directory inside /usr/share/, while it is (usually?) inside /usr/share/tesseract-ocr/. So the conversion into text is not done.

Code:

ls: cannot access /usr/share/tessdata/*.traineddata: No such file or directory
pnmtotiff: computing colormap...
pnmtotiff: Too many colors - proceeding to write a 24-bit RGB file.
pnmtotiff: If you want an 8-bit palette file, try doing a 'pnmquant 256'.
Error opening data file /usr/share/tesseract-ocr/tessdata/.traineddata.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language '.traineddata'
Tesseract couldn't load any languages!

If I make a symbolic link from /usr/share/tessdata to /usr/share/tesseract-ocr/tessdata, it runs well.
I consequently had a look at the PeasyScan script and changed the line:

Code:

LANGUAGE=$(basename $(ls -1 /usr/share/tessdata/*.traineddata | head -n1) .traineddata)

to:

Code:

LANGUAGE=$(basename $(ls -1 /usr/share/tesseract-ocr/tessdata/*.traineddata | head -n1) .traineddata)

And now all is OK!

- When scanning an image for PDF, the generated PDF file has no .pdf extension unless the extension is added to the name of the scanned image itself in the name field.
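One possible way to handle the missing extension in a script is to append it only when the name does not already end with it. This is an illustrative sketch; `ensure_ext` is a made-up helper, not something found in the PeasyScan script.

```shell
#!/bin/sh
# Hypothetical helper: print NAME with ".EXT" appended unless it is
# already there. Usage: ensure_ext NAME EXT
ensure_ext() {
    name=$1
    ext=$2
    case "$name" in
        *."$ext") printf '%s\n' "$name" ;;        # already has the extension
        *)        printf '%s\n' "$name.$ext" ;;   # append it
    esac
}
```

For example, `ensure_ext scan pdf` prints `scan.pdf`, and `ensure_ext scan.pdf pdf` leaves the name unchanged.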

Small suggestion: would it be possible to display the text file using defaulttextviewer at the end of the OCR process, and likewise the PDF file using defaultpdfviewer? I think that is what the user expects, rather than the image, which is not really wanted in this case.

Thinking this could be useful.

Cordialement.

rcrsn51 · #110 Post by **rcrsn51** » Fri 10 Nov 2017, 11:56

Argolance wrote:- Scanning image for OCR, I noticed that PeasyScan is searching for the "tessdata" directory inside /usr/share/ while it is (usually?) inside /usr/share/tesseract-ocr/. So the conversion into text is not done.

Where did you get your "tessdata" package? On page 1, I have given the instruction:

3. Copy the file xxx.traineddata to /usr/share/tessdata

Small suggestion: would it be possible to display the text file using defaulttextviewer at the end of the OCR process, and likewise the PDF file using defaultpdfviewer? I think that is what the user expects, rather than the image, which is not really wanted in this case.

Try this: Between lines 114 and 115, insert

Code:

 defaulttexteditor "$SAVEFILENAME"

I think that seeing the image is still useful. Its quality determines how well the OCR works.

Argolance · #111 Post by **Argolance** » Tue 14 Nov 2017, 12:13

Bonjour,

rcrsn51 wrote:Where did you get your "tessdata" package? On page 1, I have given the instruction:
3. Copy the file xxx.traineddata to /usr/share/tessdata

I simply installed Tesseract and all its required dependencies from the PPM...

Try this: Between lines 114 and 115, insert
Code:
 defaulttexteditor "$SAVEFILENAME"

That is what I did for my own use, and likewise for the PDF file; with some minor changes, it works quite well.

Now, as a simple user:

Something is a bit confusing:
- "Select the image format"? PDF and TXT are not "images" as such.
- "Name the scanned image as"? It is not only the scanned images that are named "scan", but also the *.png, *.jpg, *.pdf and *.txt files, so "scan" is only their base name?

In the basic recipe for using PeasyScan, you first mention:
1. Select the image format.
This has to be done first when automating scans, because everything is done in one pass. But I noticed that when scanning a single document, it is possible to select the image format (the "Output file type") after scanning, just before saving. So, from the same scanned image, it is possible to get the full range of image, PDF or text files (provided that the script is adapted for that; I did this for my own use too).
This can be useful.

Cordialement.

rcrsn51 · #112 Post by **rcrsn51** » Tue 14 Nov 2017, 13:09

When I added OCR to PeasyScan in 2010, I included my own Tesseract 3.00 package. It's still on page 1. Its default location is /usr/share/tessdata, so PeasyScan is written to look there.

If you get a Tesseract package elsewhere, I can't predict where the language files will be located. So you need to provide the link into /usr/share/tessdata.
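Providing that link could look like the sketch below. The function name `link_tessdata` is made up, and the default source path /usr/share/tesseract-ocr/tessdata is simply the location reported earlier in this thread; other packages may put the files somewhere else.

```shell
#!/bin/sh
# Hypothetical helper: link an existing tessdata directory to the place
# where PeasyScan looks. Usage: link_tessdata [SOURCE_DIR] [LINK_PATH]
link_tessdata() {
    src="${1:-/usr/share/tesseract-ocr/tessdata}"
    dst="${2:-/usr/share/tessdata}"
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    # Refuse to overwrite anything already at the link path,
    # including a dangling symlink.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "already exists: $dst" >&2
        return 1
    fi
    ln -s "$src" "$dst"
}
```

Run as root, `link_tessdata` with no arguments creates /usr/share/tessdata as a symlink to the PPM location.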

I agree that the phrases are confusing. The original versions of PeasyScan only saved to graphics files, so they made sense then.

How about "Select the output format" and "Name the output file as"?

Argolance · #113 Post by **Argolance** » Tue 14 Nov 2017, 16:29

rcrsn51 wrote:When I added OCR to PeasyScan in 2010, I included my own Tesseract 3.00 package. It's still on page 1. Its default location is /usr/share/tessdata, so PeasyScan is written to look there.

When I installed Tesseract, as is the case for all PPM users, Puppy didn't ask me where to copy the tessdata folder; it put it inside the /usr/share/tesseract-ocr directory, which seems to be the default/usual one. In any case, I think it would be appropriate for your script to take this into account and search this directory too.

If you get a Tesseract package elsewhere, I can't predict where the language files will be located. So you need to provide the link into /usr/share/tessdata.

It is not 'elsewhere', but very probably where most (ToOpPy) users will find the package and install it from.

I agree that the phrases are confusing. The original versions of PeasyScan only saved to graphics files, so they made sense then.

Your script, as always, is very interesting, and as simple and efficient as possible... but it looks a bit rough for my taste!

So I took the liberty, and had fun, giving it a smoother look! I only hope I have not impaired its functions!

How about "Select the output format" and "Name the output file as"?

See the pictures below; these are the choices I made...

Cordialement.

rcrsn51 · #114 Post by **rcrsn51** » Tue 14 Nov 2017, 20:30

I have updated my version to look for the language files in both places.
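The updated script itself is not shown in this thread. As a rough illustration only, a lookup over several candidate directories might be written like this; `find_tess_language` is a hypothetical name, and the two paths in the usage comment are the ones discussed above.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the first language file found in
# any of the given tessdata directories, without the .traineddata suffix.
find_tess_language() {
    for dir in "$@"; do
        for f in "$dir"/*.traineddata; do
            [ -f "$f" ] || continue    # the glob matched nothing here
            basename "$f" .traineddata
            return 0
        done
    done
    return 1                           # no language file anywhere
}

# Possible use in a script:
# LANGUAGE=$(find_tess_language /usr/share/tessdata /usr/share/tesseract-ocr/tessdata)
```

Unlike the `ls | head -n1` line quoted earlier, this returns a non-zero status when nothing is found, so the caller can stop before Tesseract is run with an empty language name.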

charlie6 · #115 Post by **charlie6** » Mon 29 Jan 2018, 17:12

Hi Bill,
hope you are doing well !

Using this much-appreciated PeasyScan version 2.12 on Dpup Stretch 7.5, saving to PDF reports:

Code:

# peasyscan
tiff2pdf: invalid option -- 'F'
LIBTIFF, Version 4.0.8
...
 -f: set PDF "Fit Window" user preference
...
tiff2pdf: invalid option -- 'F'
LIBTIFF, Version 4.0.8

Editing /usr/local/bin/peasyscan
and replacing -F with -f after the two instances of tiff2pdf fixes the issue.
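The same edit can be sketched with sed. `fix_tiff2pdf_flag` is a made-up name, and the function only touches lines that mention tiff2pdf. Note that the two options are not equivalent: the help text quoted above describes -f as setting the PDF "Fit Window" preference, while later in this thread -F is described as expanding the image to fill the PDF page. So this is only worth doing if your tiff2pdf really rejects -F.

```shell
#!/bin/sh
# Hypothetical helper: replace the -F option with -f on lines that call
# tiff2pdf. Usage: fix_tiff2pdf_flag [path-to-script]
fix_tiff2pdf_flag() {
    script="${1:-/usr/local/bin/peasyscan}"
    cp "$script" "$script.bak" || return 1    # keep a backup copy
    # Two expressions: -F followed by a space, and -F at the end of a line.
    sed -e '/tiff2pdf/ s/ -F / -f /g' \
        -e '/tiff2pdf/ s/ -F$/ -f/' \
        "$script.bak" > "$script"
}
```

The original script stays available as peasyscan.bak if the change needs to be undone.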

HTH
Charlie

rcrsn51 · #116 Post by **rcrsn51** » Mon 29 Jan 2018, 17:41

charlie6 wrote:
Code:
# peasyscan
tiff2pdf: invalid option -- 'F'
LIBTIFF, Version 4.0.8
...
 -f: set PDF "Fit Window" user preference
...
tiff2pdf: invalid option -- 'F'
LIBTIFF, Version 4.0.8
editing /usr/local/bin/peasyscan
and replacing -F by -f after the two instances of tiff2pdf fixes the issues.

Thanks. The developers of libtiff are always changing something.

I have added an Update to the main page that explains this issue.

charlie6 · #117 Post by **charlie6** » Wed 31 Jan 2018, 16:06

Hi,
just for the record ...

This might also help those who want a well-contrasted black & white scan, even from an original drawn with a common graphite pencil (hardness=HB2).

Everything that follows applies when using the «Auto» button in PeasyScan's GUI (see screenshot below), NOT when using the «Start» button.

reference man pages:
scanimage https://www.systutorials.com/docs/linux ... scanimage/
and
gamma4scanimage http://www.pkill.info/linux/man/1-gamma4scanimage/

Here is the content of a pixma160_4B&W_config.cfg file for the Canon MP160 scanner:
NB: the URI has to be determined by running scanimage in a terminal (see below).

export URI="pixma:04A91714_F30F67"
export SOURCE="flatbed"
export MODE="lineart"
export RESOLUTION="300"
export PAPER="A4"
export LANGUAGE="fra"
export OTHER="--custom-gamma=yes --gamma-table [0]1-[4095]255 --gamma=0.4"
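To see roughly what such a config amounts to on the command line, the variables can be mapped onto scanimage options by hand. This sketch only prints a command and does not run it. How PeasyScan itself combines these variables is an assumption here, `build_scan_command` is a made-up name, and PAPER and LANGUAGE are left out because the thread does not show how they are used.

```shell
#!/bin/sh
# Hypothetical helper: print a scanimage command line built from the
# variables of a config like the one above (URI, SOURCE, MODE,
# RESOLUTION, OTHER). It prints only and does not start a scan.
build_scan_command() {
    printf 'scanimage -d %s --source %s --mode %s --resolution %s %s\n' \
        "$URI" "$SOURCE" "$MODE" "$RESOLUTION" "$OTHER"
}

# Possible use, after loading the config into the current shell:
# . ./pixma160_4B\&W_config.cfg && build_scan_command
```

Printing the command first makes it easy to paste it into a terminal, add `>image.pnm`, and try different gamma settings without editing the config each time.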

Got the URI="pixma:04A91714_F30F67" by typing in terminal:

# scanimage -h

or better, the following, which reports all the scanimage options available for the current scanner:

# scanimage -A
All options specific to device `pixma:04A91714_F30F67':
  Scan mode:
    --resolution auto||75|150|300|600dpi [75]
        Sets the resolution of the scanned image.
    --mode auto|Color|Gray|Lineart [Color]
        Selects the scan mode (e.g., lineart, monochrome, or color).
    --source Flatbed [Flatbed]
        Selects the scan source (such as a document-feeder). Set source before
        mode and resolution. Resets mode and resolution to auto values.
    --button-controlled[=(yes|no)] [no]
        When enabled, scan process will not start immediately. To proceed,
        press "SCAN" button (for MP150) or "COLOR" button (for other models).
        To cancel, press "GRAY" button.
  Gamma:
    --custom-gamma[=(auto|yes|no)] [yes]
        Determines whether a builtin or a custom gamma-table should be used.
    --gamma-table auto|0..255,...
        Gamma-correction table. In color mode this option equally affects the
        red, green, and blue channels simultaneously (i.e., it is an intensity
        gamma table).
    --gamma auto|0.299988..5 [2.2]
        Changes intensity of midtones
  Geometry:
    -l auto|0..216.069mm [0]
        Top-left x position of scan area.
    -t auto|0..297.011mm [0]
        Top-left y position of scan area.
    -x auto|0..216.069mm [216.069]
        Width of scan-area.
    -y auto|0..297.011mm [297.011]
        Height of scan-area.
  Buttons:
    --button-update
        Update button state
    --button-1 <int> [0] [read-only]
        Button 1
    --button-2 <int> [0] [read-only]
        Button 2
    --original <int> [0] [read-only]
        Type of original to scan
    --target <int> [0] [read-only]
        Target operation type
    --scan-resolution <int> [0] [read-only]
        Scan resolution
  Extras:
    --threshold auto|0..100% (in steps of 1) [inactive]
        Select minimum-brightness to get a white point
    --threshold-curve auto|0..127 (in steps of 1) [inactive]
        Dynamic threshold curve, from light to dark, normally 50-65
#

--gamma=0.4: values less than 1 increase the contrast.
The scanner-specific minimum and maximum values are given above
(see «--gamma auto|0.299988..5»): 0.299988 and 5.

--gamma-table [0]1-[4095]255: this syntax defines a kind of
"XY gamma curve with 4095 points, X range 0->4095 and Y range 1->255"
(for details, read the links referenced above).

Just make some other cut-and-try trials, for example with the ranges [0]0-[4095]125 or [0]2-[4095]145, to see what happens. Look at the "image.pnm" output of the scanimage console command given below; this is faster for testing than editing PeasyScan's config.cfg file each time.

The value 4095 is specific to the pixma MP160 scanner. It is found by reading the error report of the following console command:

# scanimage --mode Gray --custom-gamma=yes --gamma-table [0]0-[99999]255 >image.pnm
scanimage: option --gamma-table: index 99999 out of range [0..4095]
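The upper index can also be pulled out of that error message automatically. This sketch assumes the message has the same wording as shown above; `gamma_table_max_index` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: read scanimage's "out of range" error text on
# stdin and print the upper bound of the allowed index range.
gamma_table_max_index() {
    sed -n 's/.*out of range \[0\.\.\([0-9][0-9]*\)].*/\1/p'
}

# Possible use with a scanner attached; the error goes to stderr, so it
# is redirected into the pipe while the image data is discarded:
# scanimage --mode Gray --custom-gamma=yes \
#     --gamma-table '[0]0-[99999]255' 2>&1 >/dev/null | gamma_table_max_index
```

The quotes around the gamma table in the usage comment keep the shell from treating the square brackets as a file name pattern.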

Note: the scanner-specific options under "Extras:" above would be specific to MODE="Color", and so inactive if MODE=lineart; I have not needed to investigate this so far.

MODE="gray" gives a continuous gray-scale image, whose size is larger than that of the lineart mode; this is also not investigated.

Have fun,

HTH
Charlie

rcrsn51 · #118 Post by **rcrsn51** » Sat 03 Feb 2018, 15:13

Version 2.13 posted above. See the Update note about optional compression when saving to PDF.

rcrsn51 · #119 Post by **rcrsn51** » Sat 03 Feb 2018, 17:46

@Charlie:

I looked at libtiff v4.0.8 and I cannot replicate your problem with the "-F" option.

It continues to work as advertised - it expands the image to fill the PDF page.

Bill

(old)Puppy Linux Discussion Forum

/usr/local/peasyscan/config4BLACK&WHITE.cfg; «Auto» key