Hey All:
Trying to change a directory of pdf files into txt files using the pdftotext utility installed with xpdf through the puppyfiles office section. I've tried
ls *.pdf | pdftotext
with no success... it just goes to the pdftotext help file. I've used pdftotext to convert single files successfully, so I know that part works. Do I need to use xargs or something?
Thanks,
-- Brandan
Convert folder of pdf to txt?
Branarchist,
If you save this script as /usr/bin/pdfs2txt, and give it executable permission with chmod +x /usr/bin/pdfs2txt, then it will recursively run pdftotext on any PDF files found (it checks for the %PDF header rather than the extension), including any with spaces in their names.
You can either run it with no arguments from inside the target directory, or pass it the target directory as its single argument. For example:
cd /xxx
pdfs2txt
or
pdfs2txt /xxx
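#!/bin/sh
# Recursively convert every PDF under a directory to text with pdftotext.
params=$#
if [ "$params" -eq 0 ]; then
    directory=$(pwd)
elif [ "$params" -eq 1 ]; then
    # Resolve the argument to an absolute path; stop if it is not a directory.
    cd "$1" || exit 1
    directory=$(pwd)
else
    echo "wrong number of arguments!" >&2
    exit 1
fi
for file in "$directory"/*
do
    if [ -d "$file" ]; then
        # Recurse by calling this script (it must be on the PATH) on the subdirectory.
        pdfs2txt "$file"
    elif [ -f "$file" ] && [ "$(head -c 4 "$file" 2>/dev/null)" = "%PDF" ]; then
        # Treat a file as a PDF if it starts with the %PDF magic bytes.
        filename=${file%.pdf}
        # Quote both names so spaces survive; the second argument is the output file.
        pdftotext -layout -raw -eol unix "$file" "$filename.txt"
    fi
done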
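
As for why the pipe did not work: pdftotext takes the file name as a command-line argument and does not read a list of names from standard input, so with nothing on its command line it just printed the help text. If you only need the PDFs directly inside one folder (no subfolders), a plain loop is enough and xargs is not required. This is only a minimal sketch: the function name convert_pdfs and the PDFTOTEXT override variable are my own hypothetical names, not part of xpdf, and it matches files by the .pdf extension rather than by header.

```shell
#!/bin/sh
# convert_pdfs DIR: run pdftotext once for each *.pdf directly inside DIR.
# DIR defaults to the current directory.
# PDFTOTEXT is a hypothetical override so a different command can be swapped in.
convert_pdfs() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # If nothing matched, the pattern is left unexpanded; skip it.
        [ -f "$pdf" ] || continue
        # First argument is the input PDF, second is the text file to write.
        "${PDFTOTEXT:-pdftotext}" "$pdf" "${pdf%.pdf}.txt"
    done
}
```

After defining the function (for example by pasting it into a terminal), convert_pdfs /xxx would convert the PDFs in /xxx, and convert_pdfs on its own would use the current directory. The quotes around "$pdf" are what keep names with spaces in one piece.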