Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 

The time now is Tue 21 Oct 2014, 22:10
All times are UTC - 4
 Forum index » House Training » Users ( For the regulars )
Need 2 Bash scripts that compare, delete files
Moderators: Flash, Ian, JohnMurga
technosaurus


Joined: 18 May 2008
Posts: 4353

Posted: Sat 20 Dec 2008, 17:04    Subject: Need 2 Bash scripts that compare, delete files
Subtitle: for an ultimate puplet CD/DVD
 

I need to write 2 scripts to create the ultimate puplet CD/DVD

The first compares 2 directories and deletes all files in <dirA> that are "different" from <dirB>(md5 sum would be fine or - maybe just name & file size)

The second compares 2 directories and deletes all files in <dirA> that are "the same" as in <dirB>

What is the best way to do this? Any suggestions?

Here is what it is for:
Merge the zdrv_XXX.sfs (if applicable) with all shared files between all of the puplets (of same major version) using dir2sfs zdrv_XXX - this is from the first script

Use the "phome=" boot parameter to access pup_XXX.sfs files from the second script (dir2sfs of each folder)

_________________
Web Programming - Pet Packaging 100 & 101
Pizzasgood


Joined: 04 May 2005
Posts: 6270
Location: Knoxville, TN, USA

Posted: Sat 20 Dec 2008, 20:41

Try these. WARNING: These will not work with spaces in the paths. That could be fixed, but I'm lazy. ;) Even if you do fix them to handle spaces, they can still get messed up if a filename or directory has the string " and " in it (that's "and" with a space on each side), because that string is what the scripts use to split diff's output.

These assume that the first directory passed to it is the one you want to delete things from. So if you run them like this:
script <dirA> <dirB>
They'll delete the different or identical files from <dirA>.

Code:
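#!/bin/sh
# Deletes from $1 every file that is identical to its counterpart in $2.
# diff flags: -q report only whether files differ, -s also report identical files, -r recurse
FILES=$(diff -qsr "$1" "$2")
# keep the "Files A and B are identical" lines and extract the first path (A)
IDENTICALS="$(echo "$FILES" | grep 'are identical$' | sed 's/ are identical$//' | sed 's|^Files \(.*\) and .*|\1|')"
echo "Identical files"
echo "$IDENTICALS"
echo
# the list is split on whitespace here, which is why paths with spaces break
for i in $IDENTICALS; do rm -rf "$i"; done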

Code:
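#!/bin/sh
# Deletes from $1 every file that differs from, or has no counterpart in, $2.
FILES=$(diff -qsr "$1" "$2")
# "Only in DIR: NAME" becomes DIR/NAME; "Files A and B differ" becomes A;
# the final grep drops entries that start with $2, so nothing in $2 is deleted
DIFFERENTS="$(echo "$FILES" | grep -v 'are identical$' | sed 's|^Only in \(.*\): \(.*\)|\1/\2|' | sed 's|^Files \(.*\) and .* differ$|\1|' | grep -v "^$2")"
echo "Different files"
echo "$DIFFERENTS"
echo
# the list is split on whitespace here, which is why paths with spaces break
for i in $DIFFERENTS; do rm -rf "$i"; done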
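
For paths that do contain spaces (or the string " and "), one option is to skip parsing diff's output and compare each file directly with cmp. The following is only a rough sketch of that idea, not a drop-in replacement for the two scripts above: the function name prune_dir and its MODE argument are made up for illustration, and unlike the scripts above it only removes regular files, so directories that exist only in <dirA> are left behind, possibly empty.

```shell
#!/bin/sh
# prune_dir MODE DIR_A DIR_B   (hypothetical helper, for illustration only)
#   MODE=same      -> delete files in DIR_A that are byte-identical to DIR_B's copy
#   MODE=different -> delete files in DIR_A that differ from, or are missing in, DIR_B
# The body runs in a subshell, so the cd and the variables do not leak out.
prune_dir() (
    mode=$1
    case $mode in
        same|different) ;;
        *) echo "usage: prune_dir same|different DIR_A DIR_B" >&2; exit 2 ;;
    esac
    # Resolve DIR_B to an absolute path, because we cd into DIR_A below.
    dir_b=$(cd "$3" && pwd) || exit 1
    cd "$2" || exit 1
    # find hands the file names to the inner shell as arguments, so spaces
    # in names never go through word splitting.
    find . -type f -exec sh -c '
        mode=$1
        other=$2
        shift 2
        for f do
            if [ -f "$other/$f" ] && cmp -s "$f" "$other/$f"; then
                result=same
            else
                result=different
            fi
            if [ "$result" = "$mode" ]; then
                rm -f -- "$f"
            fi
        done
    ' sh "$mode" "$dir_b" {} +
)
```

It would be called as prune_dir same <dirA> <dirB> or prune_dir different <dirA> <dirB>. As with the scripts above, try it on a copy first, since it deletes without asking.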

_________________
Between depriving a man of one hour from his life and depriving him of his life there exists only a difference of degree. --Muad'Dib

technosaurus


Joined: 18 May 2008
Posts: 4353

Posted: Sun 21 Dec 2008, 00:07

Thanks pizza' & thanks for the warning - looks like I have some more puplet downloading to do.
trapster


Joined: 28 Nov 2005
Posts: 2006
Location: Maine, USA

Posted: Sun 21 Dec 2008, 08:24

I like the identicals script. It will be handy for comparing new music files to my collection.
How do I change it to ignore the suffix so I can compare .mp3 to .ogg filenames?
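
A name-only comparison like this does not need diff at all. As a rough sketch (the function names list_stems and common_stems are invented here, and it assumes file names contain no newlines), one could strip the directory and the last suffix from every name in each tree and print the names that occur in both:

```shell
#!/bin/sh
# list_stems DIR (hypothetical helper): print every file name under DIR with
# its directory part and its last suffix removed, sorted and de-duplicated.
list_stems() {
    find "$1" -type f | sed -e 's|.*/||' -e 's|\.[^.]*$||' | sort -u
}

# common_stems DIR_A DIR_B (hypothetical helper): print the stems that occur
# in both trees, e.g. "song" for DIR_A/song.mp3 and DIR_B/x/song.ogg.
common_stems() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    list_stems "$1" > "$tmp_a"
    list_stems "$2" > "$tmp_b"
    # comm -12 keeps only the lines present in both sorted inputs
    comm -12 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

Note that this only matches names; it says nothing about whether the audio is the same, and it only lists matches rather than deleting anything.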

_________________
trapster
Maine, USA

Asus eeepc 1005HA PU1X-BK
Frugal install: Puppeee4.31 + 1.0, Puppy4.10 + Lupu52
Currently using Slacko AND lupu52 w/ fluxbox
Pizzasgood


Joined: 04 May 2005
Posts: 6270
Location: Knoxville, TN, USA

Posted: Sun 21 Dec 2008, 15:16

I don't think a simple change will suffice. If you just want to find all identical files in any given spot(s), you could just take the md5sum of every file, then identify any duplicate sums.

Here's a script that seems to work, even with spaces. It can be passed a list of files and directories, and it will compare every file to see which ones are identical, regardless of name:
Code:
#!/bin/bash
#bash is required: the script uses arrays

#create a unique temporary file for the checksums
CHECKSUMS="$(mktemp /tmp/checksums_XXXXXX)" || exit 1

#put the md5sums of all files in the passed directories into the file
find "$@" -type f -exec md5sum "{}" + >> "$CHECKSUMS"

#loop through each md5sum in the file
NUM_MATCHES=0
MATCHES=""
for i in $(grep -o '^[^ ]*' "$CHECKSUMS"); do
  #grab all entries that have the same md5sum as the current one
  ENTRIES="$(grep "$i" "$CHECKSUMS")"
  #parse out the filenames
  FILES="$(echo "$ENTRIES" | sed 's/^[0-9a-f]\{32\}\s*//')"
  #if more than one file has the same md5sum as the current one, and the
  #current one hasn't been used before, then list the files as being identical
  if [ "$(echo "$FILES" | grep -c '^')" -gt 1 ] && [ -z "$(echo "${MATCHES[*]}" | grep "$i")" ]; then
    #and add the md5sum to the list of used ones in $MATCHES
    MATCHES[$NUM_MATCHES]=$i
    NUM_MATCHES=$((NUM_MATCHES+1))
    echo "These files are identical:"
    echo "$FILES"
    echo
  fi
done

#clean up
rm -f "$CHECKSUMS"

Code:
# ./Script .
These files are identical:
./asd fff
./sdd

These files are identical:
./q/c/d/FILE
./k/c/d/FILE
./b

These files are identical:
./c
./a

#
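
The "take the md5sum of every file, then identify any duplicate sums" idea can also be sketched as a single pipeline. This version assumes GNU uniq, since -w and --all-repeated are GNU options and a BusyBox uniq may not have them; the function name list_duplicates is made up for illustration.

```shell
#!/bin/sh
# list_duplicates PATH... (hypothetical helper)
# Checksum every regular file under the given paths, sort so that equal sums
# become adjacent, then let uniq compare only the first 32 characters (the
# md5sum) and print every line whose sum repeats, with a blank line between
# groups. Each output line is "<md5sum>  <path>".
list_duplicates() {
    find "$@" -type f -exec md5sum {} + | sort | uniq -w32 --all-repeated=separate
}
```

Run as list_duplicates . it should report the same groups as the script above, just with the checksum kept at the start of each line.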

vtpup


Joined: 15 Oct 2008
Posts: 1141
Location: Republic of Vermont

Posted: Tue 20 Jan 2009, 00:11

Just a note of clarification ..... the two scripts that Technosaurus asked for in the first post are in reverse order to the two scripts that Pizzasgood provided in the second post.

The script that leaves behind all the different files is Pizzasgood's first script, and the script that leaves behind all the identical files is his second script.

Thanks, of course, for both!