Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
Convert folder of pdf to txt?
Branarchist

Joined: 27 Nov 2007
Posts: 7

PostPosted: Mon 03 Dec 2007, 19:54    Post subject:  Convert folder of pdf to txt?  

Hey All:

Trying to change a directory of pdf files into txt files using the pdftotext utility installed with xpdf through the puppyfiles office section. I've tried

ls *.pdf | pdftotext

with no success... it just goes to the pdftotext help file. I've used pdftotext to convert single files successfully, so I know that part works. Do I need to use xargs or something?

Thanks,
-- Brandan
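
A note on why the pipe fails: pdftotext takes its input file as a command-line argument and does not read file names from standard input, so piping ls into it leaves it with no arguments and it prints its usage text. xargs can turn the piped names into arguments. Below is a minimal sketch, not a tested recipe for Puppy: the function name pdfs_via_xargs is made up for illustration, and the -0 option is assumed to exist in the local xargs (GNU, BSD and most BusyBox builds have it). The -n 1 matters, because pdftotext treats a second file name as the output file, so handing it several PDFs in one call would not convert them all.

```shell
#!/bin/sh
# pdfs_via_xargs DIR  (hypothetical helper name)
# Convert every *.pdf directly inside DIR, one pdftotext call per file.
pdfs_via_xargs() {
    dir=${1:-.}
    # printf writes each matching name followed by a NUL byte, so names
    # with spaces survive the trip through the pipe.
    # xargs -0 splits on NUL; -n 1 passes exactly one name per call.
    # Caveat: if nothing matches, the literal pattern is passed along
    # and pdftotext will complain that it cannot open it.
    printf '%s\0' "$dir"/*.pdf | xargs -0 -n 1 pdftotext
}
```

Usage would be `pdfs_via_xargs /xxx`, or `pdfs_via_xargs` with no argument for the current directory.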
MU


Joined: 24 Aug 2005
Posts: 13642
Location: Karlsruhe, Germany

PostPosted: Mon 03 Dec 2007, 23:30    Post subject:  

ls *.pdf | while read a;do pdftotext "$a";done

I haven't tried it myself, but this is a common way to batch-process a list of files.
I often use it with programs other than pdftotext.
Mark
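
A variation on the loop above that skips ls altogether, as a minimal sketch: let the shell expand the pattern directly in a for loop, which keeps each file name intact even with spaces or backslashes in it. The function name convert_pdfs is made up for illustration. It relies on pdftotext's documented behaviour of writing NAME.txt next to NAME.pdf when no output file is given.

```shell
#!/bin/sh
# convert_pdfs DIR  (hypothetical helper name)
# Run pdftotext on every *.pdf directly inside DIR (default: current dir).
convert_pdfs() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # If nothing matched, the pattern is left unexpanded; skip it.
        [ -f "$pdf" ] || continue
        # One argument only: pdftotext derives the .txt name itself.
        pdftotext "$pdf"
    done
}
```

Usage would be `convert_pdfs /xxx`. This version does not descend into subdirectories.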
muggins

Joined: 20 Jan 2006
Posts: 6687
Location: lisbon

PostPosted: Tue 04 Dec 2007, 06:38    Post subject:  

Branarchist,

If you save this script as /usr/bin/pdfs2txt, and give it executable permission with chmod +x /usr/bin/pdfs2txt, then it will recursively run pdftotext on any .pdf files found, including any with spaces in their names.

You can either run it without any argument, in the target directory, or pass it the target directory as an argument. e.g.

cd /xxx
pdfs2txt

or

pdfs2txt /xxx

Code:

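#!/bin/sh
# pdfs2txt: recursively convert every PDF under a directory to text.
# Usage: pdfs2txt [directory]   (defaults to the current directory)

if [ "$#" -gt 1 ]; then
   echo "wrong number of arguments!" >&2
   exit 1
fi

# Move into the target directory and record its absolute path.
cd "${1:-.}" || exit 1
directory=$(pwd)

for file in "$directory"/*
do
   if [ -d "$file" ]; then
      # Recurse into subdirectories (assumes pdfs2txt is on the PATH).
      pdfs2txt "$file"
   elif [ -f "$file" ] && [ "$(head -c 4 "$file")" = "%PDF" ]; then
      # Identify PDFs by their "%PDF" signature rather than by extension.
      filename=${file%.pdf}
      # Quote both names so spaces survive, and name the output file
      # explicitly instead of redirecting stdout.
      pdftotext -layout -raw -eol unix "$file" "$filename.txt"
   fi
done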
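
For comparison, the recursive walk can also be left to find. This is only a sketch under some assumptions: the function name convert_tree is made up for illustration, and unlike the script above, which recognises PDFs by their "%PDF" signature, it matches on the .pdf extension only, so misnamed PDFs would be missed.

```shell
#!/bin/sh
# convert_tree DIR  (hypothetical helper name)
# Convert every *.pdf under DIR, including subdirectories.
convert_tree() {
    top=${1:-.}
    # -type f skips directories; -exec ... {} \; runs pdftotext once per
    # file and passes the path as a single argument, so spaces are safe.
    # With no output name given, pdftotext writes NAME.txt beside NAME.pdf.
    find "$top" -type f -name '*.pdf' -exec pdftotext -layout {} \;
}
```

Usage would be `convert_tree /xxx`, or `convert_tree` from inside the target directory.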