Commandline script for converting .doc to .txt

toowoombalinux
Posts: 95
Joined: Tue 16 Feb 2010, 00:22

#1 Post by toowoombalinux »

G'day,
Very basic script that needs to be run from the same directory as the .doc files. Oh, and AbiWord needs to be installed.


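#!/bin/sh
# Pass 1: convert every .doc in the current directory to .abw
for FILENAME in *.doc
do
  [ -e "$FILENAME" ] || continue   # no .doc files matched: skip
  BASEFILENAME=$(basename "$FILENAME" .doc)
  # quotes keep filenames containing spaces in one piece
  abiword --to="$BASEFILENAME.abw" "$BASEFILENAME.doc"
done
# Pass 2: convert every .abw in the current directory to .txt
for FILENAME in *.abw
do
  [ -e "$FILENAME" ] || continue   # no .abw files matched: skip
  BASEFILENAME=$(basename "$FILENAME" .abw)
  abiword --to="$BASEFILENAME.txt" "$BASEFILENAME.abw"
done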

For a cleaner .txt file (which you might want to feed to a speech synthesizer), I found it better to convert to .abw first and then convert the .abw to .txt.
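
If you would rather not copy the script into each directory, here is a rough sketch of the same two-step conversion wrapped in a function that takes file paths as arguments. The name doc_to_txt is made up for illustration and is not part of AbiWord. It assumes abiword accepts --to=FILENAME the same way as in the script above, and it changes into the file's directory first so the commands match the original ones.

```shell
#!/bin/sh
# Sketch only: doc_to_txt is a hypothetical helper, not an AbiWord command.
# Converts one .doc to .txt via an intermediate .abw, as in the post above.
doc_to_txt() {
  dir=$(dirname "$1")
  base=$(basename "$1" .doc)
  (
    # Work inside the file's own directory, in a subshell so the
    # caller's working directory is left alone.
    cd "$dir" || exit 1
    abiword --to="$base.abw" "$base.doc" &&
    abiword --to="$base.txt" "$base.abw"
  )
}

# Convert each file named on the command line.
for f in "$@"
do
  doc_to_txt "$f"
done
```

Saved as, say, doc2txt.sh (also a made-up name), it could be run as: sh doc2txt.sh /path/to/report.doc "/other/path/my notes.doc". The intermediate .abw files are left in place, same as with the original script.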

Hope this helps.
Cheers
Martin
Toowoomba Linux Community
http://groups.google.com/group/toowoombalinux
Puppy Linux 301 - KDE 3.5.8

aragon
Posts: 1698
Joined: Mon 15 Oct 2007, 12:18
Location: Germany

#2 Post by aragon »

You might also want to look at

antiword: http://www.winfield.demon.nl/

and

docx2txt: http://docx2txt.sourceforge.net/
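
For comparison, a minimal sketch of the same batch job with antiword, assuming it is installed. antiword prints the text of a Word document to standard output, so the loop just redirects that into a .txt file. The function name antiword_dir is invented for this example.

```shell
#!/bin/sh
# Sketch only: antiword_dir is a hypothetical helper name.
# Converts every .doc in the given directory (default: current one).
antiword_dir() {
  for f in "${1:-.}"/*.doc
  do
    [ -e "$f" ] || continue          # no .doc files matched: skip
    antiword "$f" > "${f%.doc}.txt"  # antiword writes text to stdout
  done
}

antiword_dir "${1:-.}"
```

This handles .doc files only; for .docx files docx2txt is the one to try.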

aragon
