Puppy Linux Discussion Forum
All times are UTC - 4
Convert folder of pdf to txt?
Branarchist

Joined: 27 Nov 2007
Posts: 7

Posted: Mon 03 Dec 2007, 19:54    Subject: Convert folder of pdf to txt?

Hey All:

Trying to change a directory of pdf files into txt files using the pdftotext utility installed with xpdf through the puppyfiles office section. I've tried

ls *.pdf | pdftotext

with no success... it just goes to the pdftotext help file. I've used pdftotext to convert single files successfully, so I know that part works. Do I need to use xargs or something?

Thanks,
-- Brandan
MU


Joined: 24 Aug 2005
Posts: 13642
Location: Karlsruhe, Germany

Posted: Mon 03 Dec 2007, 23:30

ls *.pdf | while read a;do pdftotext "$a";done

I have not tried this myself, but it is a common way to batch-process a list of files.
I often use it with programs other than pdftotext.
Mark
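
The original attempt fails because pdftotext takes its input file as a command-line argument and does not read file names from a pipe. A minimal sketch of the same batch idea using a shell glob instead of ls is below. The function name convert_pdfs is made up for illustration; it assumes pdftotext is on the PATH.

```shell
#!/bin/sh
# Convert every *.pdf in a directory (default: the current one) to text.
# convert_pdfs is a hypothetical helper name, not part of xpdf.
convert_pdfs() {
    dir=${1:-.}
    for f in "$dir"/*.pdf; do
        # With no matches the pattern stays literal, so skip anything
        # that is not a regular file.
        [ -f "$f" ] || continue
        # Given only an input name, pdftotext writes a .txt file next to it.
        pdftotext "$f"
    done
}
```

After defining the function, run convert_pdfs with no argument inside the folder, or pass the folder as the argument. Quoting "$f" keeps file names with spaces intact.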
muggins

Joined: 20 Jan 2006
Posts: 6688
Location: lisbon

Posted: Tue 04 Dec 2007, 06:38

Branarchist,

If you save this script as /usr/bin/pdfs2txt, and give it executable permission with chmod +x /usr/bin/pdfs2txt, then it will recursively run pdftotext on any .pdf files found, including any with spaces in their names.

You can either run it without any argument, in the target directory, or pass it the target directory as an argument. e.g.

cd /xxx
pdfs2txt

or

pdfs2txt /xxx

Code:

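#!/bin/sh
# Recursively convert every PDF under a directory to text with pdftotext.
# Usage: pdfs2txt [directory]   (defaults to the current directory)

if [ "$#" -eq 0 ]; then
   directory=$(pwd)
elif [ "$#" -eq 1 ]; then
   # Resolve the argument to an absolute path; stop if it is not a directory.
   cd "$1" || exit 1
   directory=$(pwd)
else
   echo "wrong number of arguments!"
   exit 1
fi

for file in "$directory"/*
do
   if [ -d "$file" ]; then
      # Recurse; this relies on the script being installed as pdfs2txt on the PATH.
      pdfs2txt "$file"
   elif [ -f "$file" ] && [ "$(head -c 4 "$file")" = "%PDF" ]; then
      # Files are recognised by their %PDF signature, not by their extension.
      filename=${file%.pdf}
      # Quote both names so paths containing spaces survive.
      pdftotext -layout -raw -eol unix "$file" "$filename.txt"
   fi
done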
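
For comparison, the same recursive walk can be sketched with find. This is an alternative to the script above, not a replacement for it: it selects files by the .pdf extension rather than by the %PDF signature, so a PDF without the extension would be skipped. The function name convert_tree is made up for illustration, and pdftotext is assumed to be on the PATH.

```shell
#!/bin/sh
# Run pdftotext on every *.pdf below a directory (default: the current one).
# convert_tree is a hypothetical helper name.
convert_tree() {
    # -type f skips directories; -exec hands each path to pdftotext as a
    # single argument, so names containing spaces need no extra quoting.
    find "${1:-.}" -type f -name '*.pdf' -exec pdftotext -layout -eol unix {} \;
}
```

Called as convert_tree /xxx, it leaves each .txt file next to its PDF, since pdftotext picks the output name itself when only the input is given.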