I need to write 2 scripts to create the ultimate puplet CD/DVD
The first compares 2 directories and deletes all files in <dirA> that are "different" from <dirB> (an md5sum comparison would be fine, or maybe just name and file size).
The second compares 2 directories and deletes all files in <dirA> that are "the same" as in <dirB>
What is the best way to do this? Any suggestions?
Here is what it is for:
Merge the zdrv_XXX.sfs (if applicable) with all shared files between all of the puplets (of same major version) using dir2sfs zdrv_XXX - this is from the first script
Use the "phome=" boot parameter to access pup_XXX.sfs files from the second script (dir2sfs of each folder)
Need 2 Bash scripts that compare, delete files
- technosaurus
- Posts: 4853
- Joined: Mon 19 May 2008, 01:24
- Location: Blue Springs, MO
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].
- Pizzasgood
- Posts: 6183
- Joined: Wed 04 May 2005, 20:28
- Location: Knoxville, TN, USA
Try these. WARNING: These will not work with spaces in the paths. That could be done, but I'm lazy. If you do fix it to handle spaces, it can also get messed up if a filename or directory has the string " and " in it (that's and with a space on each side).
These assume that the first directory passed to it is the one you want to delete things from. So if you run them like this:
script <dirA> <dirB>
They'll delete the different or identical files from <dirA>.
Code: Select all
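#!/bin/sh
# Deletes from $1 every file that is identical to its counterpart in $2.
# -q: only report whether files differ, -s: also report identical ones, -r: recurse
# LC_ALL=C keeps diff's messages in English so the patterns below match
FILES=$(LC_ALL=C diff -qsr "$1" "$2")
# keep the "Files A and B are identical" lines and reduce each one to A
IDENTICALS="$(echo "$FILES" | grep 'are identical$' | sed 's/ are identical$//' | sed 's|^Files \(.*\) and .*|\1|')"
echo "Identical files"
echo "$IDENTICALS"
echo
# $IDENTICALS is split on whitespace here, which is why spaces in paths break it
for i in $IDENTICALS; do rm -rf "$i"; done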
Code: Select all
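#!/bin/sh
# Deletes from $1 everything that differs from, or does not exist in, $2.
# LC_ALL=C keeps diff's messages in English so the patterns below match
FILES=$(LC_ALL=C diff -qsr "$1" "$2")
# drop the identical lines, turn "Only in DIR: NAME" into DIR/NAME and
# "Files A and B differ" into A, then discard the entries that start with $2
DIFFERENTS="$(echo "$FILES" | grep -v 'are identical$' | sed 's|^Only in \(.*\): \(.*\)|\1/\2|' | sed 's|^Files \(.*\) and .* differ$|\1|' | grep -v "^$2")"
echo "Different files"
echo "$DIFFERENTS"
echo
# $DIFFERENTS is split on whitespace here, which is why spaces in paths break it
for i in $DIFFERENTS; do rm -rf "$i"; done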
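For paths with spaces, one possible workaround is to skip parsing diff's output and compare each file in <dirA> directly against the file of the same relative name in <dirB>. This is only a minimal sketch, assuming bash and cmp are available; prune_dir and its same/different argument are names made up for this example, not part of the scripts above:

```shell
#!/bin/bash
# Sketch only: prune_dir is a hypothetical helper.
# Usage: prune_dir same|different <dirA> <dirB>
#   same      -> delete files in dirA that are byte-identical to the one in dirB
#   different -> delete files in dirA that differ from, or are missing in, dirB
prune_dir() {
    local mode=$1 dir_a=${2%/} dir_b=${3%/} file rel status
    # -print0 with read -d '' keeps names containing spaces intact
    find "$dir_a" -type f -print0 | while IFS= read -r -d '' file; do
        # path of the file relative to dirA
        rel=${file#"$dir_a"/}
        if [ -f "$dir_b/$rel" ] && cmp -s "$file" "$dir_b/$rel"; then
            status=same
        else
            status=different
        fi
        if [ "$status" = "$mode" ]; then
            rm -f -- "$file"
        fi
    done
}
```

Call it as prune_dir same <dirA> <dirB> or prune_dir different <dirA> <dirB>. Unlike the diff-based scripts, it only removes regular files, so directories that exist only in <dirA> are left behind empty.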
[size=75]Between depriving a man of one hour from his life and depriving him of his life there exists only a difference of degree. --Muad'Dib[/size]
[img]http://www.browserloadofcoolness.com/sig.png[/img]
- technosaurus
- Posts: 4853
- Joined: Mon 19 May 2008, 01:24
- Location: Blue Springs, MO
- Pizzasgood
- Posts: 6183
- Joined: Wed 04 May 2005, 20:28
- Location: Knoxville, TN, USA
I don't think a simple change will suffice. If you just want to find all identical files in any given spot(s), you could just take the md5sum of every file, then identify any duplicate sums.
Here's a script that seems to work, even with spaces. It can be passed a list of files and directories, and it will compare every file to see which ones are identical, regardless of name:
Code: Select all
#!/bin/bash
#(bash rather than sh, because $RANDOM and arrays are used below)
#get a unique filename in /tmp
CHECKSUMS="/tmp/checksums_$RANDOM"
while [ -f "$CHECKSUMS" ]; do
    CHECKSUMS="/tmp/checksums_$RANDOM"
done
#put the md5sums of all files in the passed directories into the file
find "$@" -type f -exec md5sum "{}" + > "$CHECKSUMS"
#loop through each md5sum in the file
NUM_MATCHES=0
MATCHES=()
for i in $(grep -o '^[^ ]*' "$CHECKSUMS"); do
    #grab all entries that have the same md5sum as the current one
    ENTRIES="$(grep "^$i" "$CHECKSUMS")"
    #parse out the filenames
    FILES="$(echo "$ENTRIES" | sed 's/^[0-9a-f]\{32\}\s*//')"
    #if more than one file has the same md5sum as the current one, and
    #the current one hasn't been used before, then list the files as being identical
    if [ "$(echo "$FILES" | grep -c '^')" -gt 1 ] && [ "$(echo "${MATCHES[*]}" | grep "$i")" = "" ]; then
        #and add the md5sum to the list of used ones in $MATCHES
        MATCHES[$NUM_MATCHES]=$i
        NUM_MATCHES=$((NUM_MATCHES+1))
        echo "These files are identical:"
        echo "$FILES"
        echo
    fi
done
#clean up
rm -f "$CHECKSUMS"
Code: Select all
# ./Script .
These files are identical:
./asd fff
./sdd
These files are identical:
./q/c/d/FILE
./k/c/d/FILE
./b
These files are identical:
./c
./a
#
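The same md5sum idea can also be written as a single pipeline: sort the checksum list so that equal sums end up next to each other, then let awk print every run of more than one line. This is a rough sketch, assuming md5sum's usual output layout (32 hex digits, two separator characters, then the name); list_duplicates is a made-up name for this example:

```shell
#!/bin/bash
# Sketch only: list_duplicates is a hypothetical helper.
# Usage: list_duplicates <file-or-dir>...
list_duplicates() {
    # sorting by line puts identical checksums on adjacent lines
    find "$@" -type f -exec md5sum {} + | sort | awk '
    {
        sum = substr($0, 1, 32)   # the checksum
        name = substr($0, 35)     # everything after the two separator characters
        if (sum == prev) {
            # second or later file with this checksum: print the group
            if (!printed) {
                print "These files are identical:"
                print first
                printed = 1
            }
            print name
        } else {
            # new checksum: close the previous group, remember this file
            if (printed) print ""
            prev = sum
            first = name
            printed = 0
        }
    }
    END { if (printed) print "" }'
}
```

It needs no temporary file, and names with spaces come through unchanged. Names containing newlines or backslashes are not handled here.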
Just a note of clarification: the two scripts that technosaurus asked for in the first post are in the reverse order of the two scripts that Pizzasgood provided in the second post.
The script that leaves behind all the different files is Pizzasgood's first script, and the script that leaves behind all the identical files is his second script.
Thanks, of course, for both!