Speeding up the SnapMerge

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

#31 Post by jamesbond »

Q5sys wrote:
jamesbond wrote:Tadaa ... s9 (version 9 of the script). Same performance as s7. Comment out the echo to reduce verbosity. This code is only for the copy-down only - I'm not sure what else snapmergepuppy does. Perhaps I'll try it out later - but meanwhile, anyone is welcome to try it.
How much faster do you estimate this is over the default way of doing things? I'm quite impressed by everyone's work in this thread.
Image for everyone. :P
Until this is really merged into a puplet for testing, no one can tell for sure, unfortunately. Benchmarks don't always translate into real-world performance :oops:
Fatdog64 forum links: [url=http://murga-linux.com/puppy/viewtopic.php?t=117546]Latest version[/url] | [url=https://cutt.ly/ke8sn5H]Contributed packages[/url] | [url=https://cutt.ly/se8scrb]ISO builder[/url]

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO

#32 Post by technosaurus »

Umm... are we trying to manually do what aubrsync does?
http://aufs.sourceforge.net/aufs2/brsync/README.txt
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

#33 Post by jamesbond »

technosaurus wrote:Umm... are we trying to manually do what aubrsync does?
http://aufs.sourceforge.net/aufs2/brsync/README.txt
Hahaha yes !!! Good find technosaurus :)

EDIT: Incidentally, that script also uses rsync ... so we're on the right track (when trying to re-invent the wheel, that is) :D

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO

#34 Post by technosaurus »

I really have no clue how the save to cd/dvd parts work, but...
http://freshmeat.net/projects/rdiff-backup/
Seems like it could be sensible?

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#35 Post by jemimah »

Here is the actual script.
Attachments
aubrsync.gz
(3.2 KiB) Downloaded 489 times

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#36 Post by jemimah »

Looking at comments in the code.
The dst_branch must be mounted as writable.
During the operation, the mntpnt is set readonly.
If you are opening a file for writing on the writable branch,
you need to close the file before invoking this script.
They do have a move option, but it has problems.
Like above (2 branches), move and reflect all modifications
from upper to lower. Almost all files on the upper branch will
be removed. You can still use this aufs after the
operation. But the inode number may be changed. If your
application which depends upon the inode number was running at
that time, it may not work correctly.
I get the feeling this is not going to be very transparent to the user. :)

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

#37 Post by jamesbond »

Long way ahead ... but, I think, if one doesn't really try to deliberately break things (like removing /lib or /usr/sbin and then re-creating and re-populating them), it should be ok for most of them. Again, it's a statement that can only be proven by experiments ...

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

#38 Post by jamesbond »

Took a jump and tried my script (replacing snapmergepuppy directly). Ran the script - it doesn't crash, but all other executables are gone (and freemem shows zero space available). Rebooting is impossible at that point, though - the power button is required.
After a reboot, I removed all the tmpfs-deletion stuff - and things work! (doesn't crash, system continues as normal, data is saved, and reboot works properly). I think I just need to be careful about what can and cannot be deleted, as jemimah said before.
I tried this on my netbook with a hard disk, so the delay isn't noticeable. I should try this on my eeepc with the slow SD-card access - and then we can get a benchmark.

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#39 Post by jemimah »

So the slow performance of the script is really only a problem on shutdown. However while you are running, it'd actually be better if the snapmerge doesn't hog your cpu.

It should be fine to use the aubrsync at shutdown as all files should be closed at that point anyway.

I've been stress testing Dougal's patch and it's about twice as fast. But I've discovered some bugs during my tests - deleted files coming back from the dead after a reboot. But turns out the original script has the same problem - hopefully I can identify the source.

jpeps
Posts: 3179
Joined: Sat 31 May 2008, 19:00

#40 Post by jpeps »

jemimah wrote: But turns out the original script has the same problem - hopefully I can identify the source.
I see debugging commented out "set -x"

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#41 Post by jemimah »

The first major problem is that the script is copying down files without checking whether a whiteout for them is already there in the base layer. So that is the probable source of the I/O errors I've been seeing occasionally.
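A minimal sketch of that fix, assuming the RAM layer and the save layer are both plain directories: before a file is copied down, any `.wh.<name>` entry for it in the save layer is removed. `copy_down` is a hypothetical helper written for illustration, not code taken from snapmergepuppy, and it does not try to reproduce directory permissions.

```shell
#!/bin/sh
# Hypothetical helper: copy one file from the RAM layer to the save layer,
# clearing a stale whiteout for it in the save layer first.
# usage: copy_down <tmpfs-root> <save-root> <relative-path>
copy_down() {
	tmpfs=$1 save=$2 rel=$3
	case $rel in
		*/*) dir=${rel%/*} ;;
		*)   dir=. ;;
	esac
	leaf=${rel##*/}
	mkdir -p "$save/$dir" || return 1
	# a leftover .wh.<name> would clash with the file we are about to write
	rm -f "$save/$dir/.wh.$leaf"
	cp -a "$tmpfs/$rel" "$save/$rel"
}
```

On a Puppy system the two roots would presumably be /initrd/pup_rw and /initrd/pup_ro1, e.g. `copy_down /initrd/pup_rw /initrd/pup_ro1 etc/hosts` (the file name is only an example).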

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO

#42 Post by technosaurus »

Man - I wish I had looked at this stuff before starting the zdrv cutter. It seems like it may be possible to write the needed modules directly to the save file. (I could probably just "touch" those and then delete the zdrv altogether without having to do all of the crazy shifting around and rebuilding the squash file) oh well - hindsight is 2010

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

#43 Post by jamesbond »

Applied my s10 script (same as s9 except that I optimised the "find" command, using the same command as found in the original snapmergepuppy, and added --inplace to rsync) to my puppeee running on a 701SD with a very slow SD card.
Merge time is reduced from 30 secs to less than 3 secs. No data loss, no I/O errors so far. No ghost files coming back after deletion either :) This is on a 128MB save file.
Oh yes, I disabled the last rm command that clears the tmpfs - changed that to echo. Haven't got around to making it work.

jpeps
Posts: 3179
Joined: Sat 31 May 2008, 19:00

#44 Post by jpeps »

Why is it that files with a whiteout in the TMPFS layer also appear in lower layers?

example:
/initrd/pup_rw/etc/.wh.windowmanager.openbox
/initrd/pup_ro2/etc/windowmanager.openbox

Edit: Oh, got it. The file is in lupu.sfs and got deleted.

To be safe, it looks like aubrsync needs to be run during shutdown (if I understand it correctly).
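The naming rule behind that example can be sketched in shell: an aufs whiteout is a file called `.wh.<name>` sitting where `<name>` would be, so stripping the prefix gives the path it hides in a lower layer. `whiteout_target` is a hypothetical helper written for illustration; the layer paths are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: map a whiteout in one layer to the path it hides
# in another layer. Pure string handling, it touches no files.
# usage: whiteout_target <layer-root> <whiteout-path> <other-layer-root>
whiteout_target() {
	layer=$1 wh=$2 other=$3
	rel=${wh#"$layer"}        # e.g. /etc/.wh.windowmanager.openbox
	dir=${rel%/*}             # e.g. /etc
	leaf=${rel##*/}           # e.g. .wh.windowmanager.openbox
	case $leaf in
		.wh..wh.*) return 1 ;;  # aufs bookkeeping (.wh..wh.aufs, .wh..wh..opq, ...)
		.wh.*) printf '%s\n' "$other$dir/${leaf#.wh.}" ;;
		*) return 1 ;;          # not a whiteout at all
	esac
}

whiteout_target /initrd/pup_rw /initrd/pup_rw/etc/.wh.windowmanager.openbox /initrd/pup_ro2
# prints /initrd/pup_ro2/etc/windowmanager.openbox
```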

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#45 Post by jemimah »

jamesbond wrote:Applied my s10 script (same as s9 except that I optimised the "find" command, using the same command as found in the original snapmergepuppy, and added --inplace to rsync) to my puppeee running on a 701SD with a very slow SD card.
Merge time is reduced from 30 secs to less than 3 secs. No data loss, no I/O errors so far. No ghost files coming back after deletion either :) This is on a 128MB save file.
Oh yes, I disabled the last rm command that clears the tmpfs - changed that to echo. Haven't got around to making it work.
Does rsync stop copying before your file system fills up? I guess the worry is corruption if you run out of space in the middle of a transfer.
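One way to sidestep the question is to check for room before copying anything. The sketch below is a hypothetical pre-flight check, not part of s10: it compares the size of the RAM layer with the free space in the save layer. It is deliberately pessimistic, since it counts files that already exist in the save layer as if they were new. All function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical pre-flight check: compare what is waiting in the RAM layer
# with the free space left in the save layer.

# fits <needed-kb> <available-kb>: succeed when the data would fit
fits() {
	[ "$1" -le "$2" ]
}

# used_kb <dir>: size of everything under <dir>, in KiB
used_kb() {
	du -sk "$1" | awk '{ print $1 }'
}

# avail_kb <dir>: free space on the filesystem holding <dir>, in KiB
avail_kb() {
	df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# can_merge <tmpfs-root> <save-root>
can_merge() {
	fits "$(used_kb "$1")" "$(avail_kb "$2")"
}
```

A caller could then skip the merge and warn the user when `can_merge /initrd/pup_rw /initrd/pup_ro1` fails, assuming those are the two layer paths.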

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#46 Post by jemimah »

Here is my bug-fixed version of Barry's script combined with Dougal's performance enhancements.

Note: if anyone wants to use this, you need to change the notify-send command back to yaf-splash if your system doesn't have a notification daemon.

Bug fixes include:
  • Apparently opaque files used to be called .wh__dir_opaque but the name has since changed to .wh..wh..opq. Opaque directories were not working at all. Looks like the init script has this issue too!
  • Remount aufs in notify mode during the copy down. This prevents the temporary resurrection of undead files.
  • Delete whiteouts from the base if copying a new file down. This prevents I/O errors.
  • Call df only when needed - speeds things up a little.
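The opaque-marker rename from the first item can be handled by checking for both names. This is only a sketch: `is_opaque` and `opaque_dirs` are hypothetical helpers, not functions from the attached script.

```shell
#!/bin/sh
# Hypothetical helpers: detect directories marked opaque in a layer,
# accepting both the current aufs marker and the older name.

# is_opaque <dir>: succeed when <dir> carries either marker
is_opaque() {
	[ -e "$1/.wh..wh..opq" ] || [ -e "$1/.wh__dir_opaque" ]
}

# opaque_dirs <layer-root>: print every opaque directory under the layer
opaque_dirs() {
	find "$1" \( -name '.wh..wh..opq' -o -name '.wh__dir_opaque' \) -print |
	while IFS= read -r marker; do
		printf '%s\n' "${marker%/*}"
	done
}
```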
Attachments
snapmergepuppy.gz
(3.03 KiB) Downloaded 501 times
Last edited by jemimah on Wed 09 Feb 2011, 03:47, edited 1 time in total.

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

#47 Post by jamesbond »

Yes, it will stop if the file is full. I tested this scenario. Of course some of the files can't be saved - but no corruption whatsoever. So far so good.
Some of the directories need to be excluded from copy down (e.g /var) - I will work on this later. It's just passing the correct --exclude option to rsync.

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#48 Post by jemimah »

If rsync is really that much faster, I'm not sure there's any reason to worry about excludes.

You will want to check out the bit about remounting aufs in notify mode though.

Notify mode for when you're modifying the layers by hand:

Code: Select all

busybox mount -t aufs -o remount,udba=notify unionfs /
Normal mode:

Code: Select all

busybox mount -t aufs -o remount,udba=reval unionfs /
Otherwise weird stuff happens, like files that are whited out appear anyway and then disappear 10 minutes later.
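The two remounts can be paired up so that the switch back to reval is not forgotten when the work in between fails. This is a sketch only, assuming the mount point is / for both commands; it needs root on a live aufs system and has not been tested on one. `with_notify` and the `MOUNT` override are hypothetical names, added so the wrapper can be dry-run.

```shell
#!/bin/sh
# Sketch only: run one command with the aufs root in udba=notify, then go
# back to udba=reval even if the command fails.
# MOUNT is a hypothetical override so the wrapper can be dry-run.
MOUNT=${MOUNT:-busybox mount}

with_notify() {
	$MOUNT -t aufs -o remount,udba=notify unionfs / || return 1
	"$@"
	status=$?
	$MOUNT -t aufs -o remount,udba=reval unionfs /
	return $status
}
```

Setting `MOUNT=echo` before calling `with_notify some_command` prints the mount arguments instead of remounting anything, which is a cheap way to check the sequence.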

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

#49 Post by jamesbond »

jemimah wrote:If rsync is really that much faster, I'm not sure there's any reason to worry about excludes.
Hmm, I didn't think of it that way. I thought the reason to exclude /var etc. is that we don't want to keep all the junk in the savefile. But you're right, in PUPMODE=12 all these things go straight into the savefile as well and nobody bothers to clean it, so perhaps I'll just leave it as is. And yes, rsync is definitely faster than doing a find/cp loop.
You don't want to do this, really.
jemimah wrote:You will want to check out the bit about remounting aufs in notify mode though.
Notify mode for when you're modifying the layers by hand:

Code: Select all

busybox mount -t aufs -o remount,udba=notify unionfs /
Normal mode:

Code: Select all

busybox mount -t aufs -o remount,udba=reval unionfs /
Otherwise weird stuff happens, like files that are whited out appear anyway and then disappear 10 minutes later.
You don't want to do this. Really. Try it without even doing anything on terminal. Even an "-o remount" can kill. I checked everywhere in Barry's code - he didn't even bother to do this.

According to the aufs docs, udba=notify is required when we try to make previously hidden files visible. Otherwise the normal case is handled by udba=reval quite nicely. The default udba is reval (that's how it's set up in initrd), and it's enough for us in most cases, because we're just moving stuff down. What was visible remains visible, what isn't visible remains invisible.

The remount can kill, really, because it makes previously open handles stale. And when the rootfs handle becomes stale ... well ... :shock:

Anyway, I've implemented the "delete from tmpfs" using your lsof idea. It works - no stability issues so far. I've got some nice extra space in tmpfs after doing that. Of course, my comment about thin provisioning still applies - if one runs a 128MB tmpfs with a 1GB pupsave, that's probably ok, but the other way around will get rough very quickly. Even a 1:1 tmpfs/pupsave will get rough very quickly. I'm not sure whether freememapplet shows tmpfs or pupsave in PUPMODE=13? Personally I prefer tmpfs=pupsave and just doing rsync between them ... but that's just me.
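The "only delete what nobody has open" idea boils down to a set difference: every file in the tmpfs layer, minus the files that are currently open. A minimal sketch with sort and comm follows, assuming both inputs are plain lists of paths, one per line. `closed_files` is a hypothetical name, and how the two lists get produced (for example with find and lsof) is left to the caller.

```shell
#!/bin/sh
# Sketch of the "only delete what nobody has open" idea as a set difference.
# usage: closed_files <all-files-list> <open-files-list>
closed_files() {
	all_sorted=$(mktemp) || return 1
	open_sorted=$(mktemp) || { rm -f "$all_sorted"; return 1; }
	sort -u "$1" > "$all_sorted"
	sort -u "$2" > "$open_sorted"
	# comm -23 keeps the lines that appear only in the first list
	comm -23 "$all_sorted" "$open_sorted"
	rm -f "$all_sorted" "$open_sorted"
}
```

Paths in the two lists have to be spelled the same way, so a list of open files reported relative to / would need the tmpfs prefix (e.g. /initrd/pup_rw) prepended before the comparison.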

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

s11 - the ultimate snapmergepuppy

#50 Post by jamesbond »

We need 3 files:
- s11 (direct replacement for snapmergepuppy; replace /usr/sbin/snapmergepuppy with this one)
- closedfiles.awk
- and lsof (this is a static binary, so it should work for everyone)

Code for s11 and closedfiles.awk is here; all three files are inside the tar.bz2 attached below.

s11

Code: Select all

#!/bin/ash
# jamesbond 2011 - GPLv3
# s11 - add --inplace to rsync, replace sed with egrep (easier to understand)
#	  - requires lsof and closedfiles.awk
# Note: works for AUFS only
#  0m6secs (lang utf8)

# Do not use trailing slash here.
TMPFS=/initrd/pup_rw
PUPSAVE=/initrd/pup_ro1

################# main ###################
# check for new whiteouts - remove them from pupsave
echo "deleting newly deleted files"
# skip aufs bookkeeping entries; the dots are escaped so they match literally
find "$TMPFS" -mount \( -regex '.*/\.wh\.[^/]*' -type f \) |
grep -v -E '\.wh\.\.wh\.(orph|plnk|aufs|\.opq)' |
while IFS= read -r FILE; do
	#echo $FILE					# $FILE is TMPFS_WHITEOUT
	FULLNAME="${FILE#$TMPFS}"
	#echo $FULLNAME
	BASE="${FULLNAME%/*}"
	#echo $BASE
	LEAF="${FULLNAME##*/}"
	#echo $LEAF
	#echo $BASE/$LEAF
	
	# strip the ".wh." prefix (portable form of ${LEAF:4})
	PUPSAVE_FILE="${PUPSAVE}${BASE}/${LEAF#.wh.}"
	echo "Deleting $PUPSAVE_FILE"
	rm -rf "$PUPSAVE_FILE"		# delete the file/dir if it's there

done

# check for old whiteouts - remove them from pupsave
echo "deleting old whiteouts"
# skip aufs bookkeeping entries; the dots are escaped so they match literally
find "$PUPSAVE" -mount \( -regex '.*/\.wh\.[^/]*' -type f \) |
grep -v -E '\.wh\.\.wh\.(orph|plnk|aufs|\.opq)' |
while IFS= read -r FILE; do
	#echo $FILE					# $FILE is PUPSAVE_WHITEOUT
	FULLNAME="${FILE#$PUPSAVE}"
	#echo $FULLNAME
	BASE="${FULLNAME%/*}"
	#echo $BASE
	LEAF="${FULLNAME##*/}"
	#echo $LEAF
	#echo $BASE/$LEAF
	
	# strip the ".wh." prefix (portable form of ${LEAF:4})
	TMPFS_FILE="${TMPFS}${BASE}/${LEAF#.wh.}"
	#echo $TMPFS_FILE

	# delete whiteout only if a new file/dir has been created in the tmpfs layer
	if [ -e "$TMPFS_FILE" ] || [ -L "$TMPFS_FILE" ]; then
		# if TMPFS_FILE is a dir, we need to add diropq when remove its pupsave whiteout
		[ -d "$TMPFS_FILE" ] &&	touch "$TMPFS_FILE/.wh..wh..opq"
		echo Deleting whiteout $FILE
		rm -f "$FILE"
	fi
done

# by now we should be consistent - so rsync everything
# and cleanup tmpfs if rsync is successful
echo rsync-ing
if ! rsync --inplace -a "$TMPFS"/ "$PUPSAVE"; then
	Xdialog --infobox "Your save file is full, please copy important items manually elsewhere." 0 0 10000
else
	if which lsof > /dev/null && which gawk > /dev/null && which closedfiles.awk > /dev/null; then 
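		# only clean up closed files; names are read line by line so that
		# paths containing spaces are removed correctly
		closedfiles.awk -v TMPFS="$TMPFS" |
		grep -v -E '\.wh\.\.wh\.(orph|plnk|aufs|\.opq)' |
		while IFS= read -r FILE; do rm -rf "$FILE"; done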
	fi
fi
closedfiles.awk

Code: Select all

#!/bin/gawk -f
# jamesbond 2011 - print list of files in /initrd/pup_rw that are closed
# requires lsof
BEGIN {
	if (TMPFS == "") exit;
	
	# load our "list of open files" before processing - assume they all are in /initrd/pup_rw
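	# "> 0" stops the loop on end of input and on errors (getline returns -1)
	while (("lsof -Fn | sed '/^n/ {s/^n//; p}; d' | sort | uniq " | getline) > 0) {
		openfiles[TMPFS $0]=1
	}
	
	# now compare this with the list of files in /initrd/pup_rw
	# (TMPFS is quoted so a path with spaces reaches find as one argument)
	CMDLINE = "find \"" TMPFS "\" -not -type d"
	while ((CMDLINE | getline) > 0) {
		if (!($0 in openfiles)) print $0;
	}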
}
enjoy.

EDIT: minor fix for closedfiles.awk.
EDIT: file deleted, buggy, please use latest version
Last edited by jamesbond on Sat 12 Feb 2011, 13:10, edited 1 time in total.
