Puppy Linux Discussion Forum Forum Index Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
All times are UTC - 4
Speeding up the SnapMerge
jemimah


Joined: 26 Aug 2009
Posts: 4309
Location: Tampa, FL

PostPosted: Tue 08 Feb 2011, 23:42    Post subject:  

Here is my bug-fixed version of Barry's script combined with Dougal's performance enhancements.

Note: if anyone wants to use this, you need to change the notify-send command back to yaf-splash if your system doesn't have a notification daemon.

Bug fixes include:

    Apparently opaque files used to be called .wh__dir_opaque but the name has since changed to .wh..wh..opq. Opaque directories were not working at all. Looks like the init script has this issue too!

    Remount aufs in notify mode during the copy down. This prevents the temporary resurrection of undead files.

    Delete whiteouts from the base if copying a new file down. This prevents I/O errors.

    Call df only when needed - speeds things up a little.
Attachment: snapmergepuppy.gz (3.03 KB)

Last edited by jemimah on Tue 08 Feb 2011, 23:47; edited 1 time in total
jamesbond

Joined: 26 Feb 2007
Posts: 2232
Location: The Blue Marble

PostPosted: Tue 08 Feb 2011, 23:45    Post subject:  

Yes, it will stop if the file is full. I tested this scenario. Of course some of the files can't be saved - but no corruption whatsoever. So far so good.
Some of the directories need to be excluded from the copy down (e.g. /var) - I will work on this later. It's just a matter of passing the correct --exclude option to rsync.
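
A minimal sketch of what such an exclude could look like, assuming the TMPFS and PUPSAVE layout used elsewhere in this thread. The function name copy_down and the choice of /var as the only excluded directory are illustrative assumptions, not part of any posted script. A leading slash anchors an rsync exclude pattern at the root of the transfer.

```shell
#!/bin/sh
# Hypothetical helper: copy the tmpfs layer down, skipping /var.
# $1 = source layer (e.g. /initrd/pup_rw), $2 = destination (e.g. /initrd/pup_ro1)
copy_down() {
    # --exclude=/var/ is anchored at the top of the transfer, so only
    # SRC/var is skipped, not every directory named "var" further down.
    rsync -a --inplace --exclude=/var/ "$1"/ "$2"
}
```

It would be called as copy_down /initrd/pup_rw /initrd/pup_ro1; more --exclude options can be added the same way.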

_________________
Fatdog64, Slacko and Puppeee user. Puppy user since 2.13.
Contributed Fatdog64 packages thread
jemimah


Joined: 26 Aug 2009
Posts: 4309
Location: Tampa, FL

PostPosted: Wed 09 Feb 2011, 01:05    Post subject:  

If rsync is really that much faster, I'm not sure there's any reason to worry about excludes.

You will want to check out the bit about remounting aufs in notify mode though.

Notify mode for when you're modifying the layers by hand:
Code:
busybox mount -t aufs -o remount,udba=notify unionfs


Normal mode:
Code:
busybox mount -t aufs -o remount,udba=reval unionfs /


Otherwise weird stuff happens, like files that are whited out appear anyway and then disappear 10 minutes later.
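
A rough sketch of how the two remounts could bracket a hand edit of the layers. The helper names run and with_udba_notify and the DRY_RUN switch are assumptions made for illustration. The mount commands are the ones quoted in this post, with the mount point / written on both lines, which is also an assumption. By default the sketch only prints what it would do.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: switch aufs to udba=notify, run a command that
# edits the layers directly, then switch back to udba=reval even if the
# command failed.
with_udba_notify() {
    run busybox mount -t aufs -o remount,udba=notify unionfs / || return 1
    run "$@"
    status=$?
    run busybox mount -t aufs -o remount,udba=reval unionfs /
    return "$status"
}
```

For example, with_udba_notify rm -f /initrd/pup_ro1/some/file would print the three commands in order under DRY_RUN=1. Test on a disposable save file before setting DRY_RUN=0.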
jamesbond

Joined: 26 Feb 2007
Posts: 2232
Location: The Blue Marble

PostPosted: Wed 09 Feb 2011, 08:03    Post subject:  

jemimah wrote:
If rsync is really that much faster, I'm not sure there's any reason to worry about excludes.
Hmm, I didn't think of it that way. I thought the reason to exclude /var etc. is that we don't want to keep all the junk in the savefile. But you're right, in PUPMODE=12 all these things go straight into the savefile as well and nobody bothers to clean it, so perhaps I'll just leave it as is. And yes, rsync is definitely faster than doing a find/cp loop.

Quote:
You don't want to do this, really.
You will want to check out the bit about remounting aufs in notify mode though.
Notify mode for when you're modifying the layers by hand:
Code:
busybox mount -t aufs -o remount,udba=notify unionfs


Normal mode:
Code:
busybox mount -t aufs -o remount,udba=reval unionfs /


Otherwise weird stuff happens, like files that are whited out appear anyway and then disappear 10 minutes later.
You don't want to do this. Really. Try it without even doing anything on terminal. Even an "-o remount" can kill. I checked everywhere in Barry's code - he didn't even bother to do this.

According to the aufs docs, udba=notify is required when we try to make previously hidden files visible. Otherwise the normal cases are handled by udba=reval quite nicely. The default udba is reval (that's how it's set up in initrd), and it's enough for us in most cases, because we're just moving stuff down. What was visible remains visible, what isn't visible remains invisible.

The remount can kill, really, because it makes previously open handles stale. And when the rootfs handle becomes stale ... well ...

Anyway, I've implemented the "delete from tmpfs" using your lsof idea. It works - no stability issues so far. I've got a nice bit of extra space in tmpfs after doing that. Of course, my comment about thin provisioning still applies - if one runs a 128MB tmpfs with a 1GB pupsave, that's probably ok, but the other way around will get rough very quickly. Even a 1:1 tmpfs/pupsave will get rough very quickly. I'm not sure whether freememapplet shows tmpfs or pupsave in PUPMODE=13? Personally I prefer tmpfs=pupsave and just do rsync between them ... but that's just me.
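
To see how the two layers compare in practice, something along these lines could be used. It is only a sketch: the function names free_kb, used_kb and layer_fits are made up for illustration, and the "used space of the tmpfs" figure is just a rough upper bound for what a copy down might need.

```shell
#!/bin/sh
# Print the available space, in KB, of the filesystem holding directory $1.
# df -P prints one line per filesystem; column 4 is "Available".
free_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Print the used space, in KB, of the filesystem holding directory $1
# (column 3 of the df -P output).
used_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $3 }'
}

# Hypothetical check: succeed if everything used in layer $1 would fit
# into the free space of layer $2.
layer_fits() {
    [ "$(used_kb "$1")" -le "$(free_kb "$2")" ]
}
```

For example: layer_fits /initrd/pup_rw /initrd/pup_ro1 || echo "pupsave may be too small".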

jamesbond

Joined: 26 Feb 2007
Posts: 2232
Location: The Blue Marble

PostPosted: Wed 09 Feb 2011, 08:19    Post subject: s11 - the ultimate snapmergepuppy  

We need 3 files:
- s11 (direct replacement for snapmergepuppy) (replace /usr/sbin/snapmergepuppy with this one)
- closedfiles.awk
- and lsof (this is a static binary, so it should work for everyone)

Code for s11 and closedfiles.awk is here; all three files are attached inside the tar.bz2 file below.

s11
Code:
#!/bin/ash
# jamesbond 2011 - GPLv3
# s11 - add --inplace to rsync, replace sed with egrep (easier to understand)
#     - required lsof and closedfiles.awk
# Note: works for AUFS only
#  0m6secs (lang utf8)

# Do not use trailing slash here.
TMPFS=/initrd/pup_rw
PUPSAVE=/initrd/pup_ro1

################# main ###################
# check for new whiteouts - remove them from pupsave
echo "deleting newly deleted files"
find "$TMPFS" -mount \( -regex '.*/\.wh\.[^/]*' -type f \) |
grep -v -E ".wh..wh.orph|.wh..wh.plnk|.wh..wh.aufs|.wh..wh..opq" |
while read -r FILE; do
   #echo $FILE               # $FILE is TMPFS_WHITEOUT
   FULLNAME="${FILE#$TMPFS}"
   #echo $FULLNAME
   BASE="${FULLNAME%/*}"
   #echo $BASE
   LEAF="${FULLNAME##*/}"
   #echo $LEAF
   #echo $BASE/$LEAF
   
   PUPSAVE_FILE="${PUPSAVE}${BASE}/${LEAF:4}"   
   echo "Deleting $PUPSAVE_FILE"
   rm -rf "$PUPSAVE_FILE"      # delete the file/dir if it's there

done

# check for old whiteouts - remove them from pupsave
echo "deleting old whiteouts"
find "$PUPSAVE" -mount \( -regex '.*/\.wh\.[^/]*' -type f \) |
grep -v -E ".wh..wh.orph|.wh..wh.plnk|.wh..wh.aufs|.wh..wh..opq" |
while read -r FILE; do
   #echo $FILE               # $FILE is PUPSAVE_WHITEOUT
   FULLNAME="${FILE#$PUPSAVE}"
   #echo $FULLNAME
   BASE="${FULLNAME%/*}"
   #echo $BASE
   LEAF="${FULLNAME##*/}"
   #echo $LEAF
   #echo $BASE/$LEAF
   
   TMPFS_FILE="${TMPFS}${BASE}/${LEAF:4}"
   #echo $TMPFS_FILE

   # delete whiteout only if a new file/dir has been created in the tmpfs layer
   if [ -e "$TMPFS_FILE" -o -L "$TMPFS_FILE" ]; then
      # if TMPFS_FILE is a dir, we need to add diropq when remove its pupsave whiteout
      [ -d "$TMPFS_FILE" ] &&   touch "$TMPFS_FILE/.wh..wh..opq"
      echo Deleting whiteout $FILE
      rm -f "$FILE"
   fi
done

# by now we should be consistent - so rsync everything
# and cleanup tmpfs if rsync is successful
echo rsync-ing
if ! rsync --inplace -a "$TMPFS"/ "$PUPSAVE"; then
   Xdialog --infobox "Your save file is full, please copy important items manually elsewhere." 0 0 10000
else
   if which lsof > /dev/null && which gawk > /dev/null && which closedfiles.awk > /dev/null; then
      # only cleanup closed files
      closedfiles.awk -v TMPFS="$TMPFS" |
      grep -v -E ".wh..wh.orph|.wh..wh.plnk|.wh..wh.aufs|.wh..wh..opq" |   
      xargs rm -rf
   fi
fi


closedfiles.awk
Code:
#!/bin/gawk -f
# jamesbond 2011 - print list of files in /initrd/pup_rw that are closed
# requires lsof
BEGIN {
   if (TMPFS == "") exit;
   
   # load our "list of open files" before processing - assume they all are in /initrd/pup_rw
   while ("lsof -Fn | sed '/^n/ {s/^n//; p}; d' | sort | uniq " | getline) {
      openfiles[TMPFS $0]=1
   }
   
   # now compare this with the list of files in /initrd/pup_rw
   CMDLINE = "find " TMPFS " -not -type d"
   while (CMDLINE | getline) {
      if (openfiles[$0] != 1) print $0;
   }
}


enjoy.

EDIT: minor fix for closedfiles.awk.
EDIT: file deleted, buggy, please use latest version
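
One thing to watch in the cleanup step of s11: xargs splits its input on whitespace by default, so a path containing a space would be passed to rm as two separate names. A possible alternative, sketched here with a made-up function name delete_listed, reads one path per line instead. Paths containing newlines are still not handled.

```shell
#!/bin/sh
# Read one path per line on stdin and delete each one; paths containing
# spaces stay intact. The list comes from "find -not -type d", so plain
# rm -f is enough here.
delete_listed() {
    while IFS= read -r path; do
        rm -f -- "$path"
    done
}
```

It would replace the last stage of the pipeline, i.e. closedfiles.awk -v TMPFS="$TMPFS" | grep -v -E "..." | delete_listed.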


Last edited by jamesbond on Sat 12 Feb 2011, 09:10; edited 1 time in total
jemimah


Joined: 26 Aug 2009
Posts: 4309
Location: Tampa, FL

PostPosted: Wed 09 Feb 2011, 11:01    Post subject:  

jamesbond wrote:
You don't want to do this. Really. Try it without even doing anything on terminal. Even an "-o remount" can kill. I checked everywhere in Barry's code - he didn't even bother to do this.


I do know for certain my tests failed without it. Specifically, moving whiteout files down fails. Maybe it only fails if you delete them from the top layer.

Barry does use it in pkginstall.sh when he writes to lower layers.
http://bkhome.org/blog/?viewDetailed=01534



jamesbond wrote:

Anyway, I've implemented the "delete from tmpfs" using your lsof idea. It works - no stability issues so far. I've got a nice bit of extra space in tmpfs after doing that. Of course, my comment about thin provisioning still applies - if one runs a 128MB tmpfs with a 1GB pupsave, that's probably ok, but the other way around will get rough very quickly. Even a 1:1 tmpfs/pupsave will get rough very quickly. I'm not sure whether freememapplet shows tmpfs or pupsave in PUPMODE=13? Personally I prefer tmpfs=pupsave and just do rsync between them ... but that's just me.


I still think you'll have an ugly race condition without some kind of locking. You check if a file is open, find that it's not. Then before you can delete it, something opens it. As you noted above, Unix won't even complain, it will happily let you keep editing the deleted file. But once you close the file, whatever changes you made since the last save are gone.
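
The effect can be reproduced safely in a scratch directory. In this small demonstration the shell itself plays the program that has the file open (on file descriptor 3), and rm plays the cleanup; the variable names are arbitrary and mktemp is assumed to be available.

```shell
#!/bin/sh
# Demonstrate that writes to an already-deleted file succeed silently
# and are lost once the last descriptor is closed.
demo_dir=$(mktemp -d)
demo_file="$demo_dir/notes.txt"

exec 3>"$demo_file"          # a "program" opens the file for writing
rm -f "$demo_file"           # the cleanup deletes it in the meantime
echo "unsaved work" >&3      # the write still succeeds, no error
write_status=$?
exec 3>&-                    # close: the data is now unreachable

if [ -e "$demo_file" ]; then file_survived=yes; else file_survived=no; fi
echo "write exit status: $write_status, file still exists: $file_survived"
rmdir "$demo_dir"
```

The write reports success (exit status 0) even though the file no longer exists afterwards.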
Dougal


Joined: 19 Oct 2005
Posts: 2505
Location: Hell more grotesque than any medieval woodcut

PostPosted: Wed 09 Feb 2011, 16:44    Post subject: Re: s11 - the ultimate snapmergepuppy  

technosaurus wrote:
Man - I wish I had looked at this stuff before starting the zdrv cutter. It seems like it may be possible to write the needed modules directly to the save file. (I could probably just "touch" those and then delete the zdrv altogether without having to do all of the crazy shifting around and rebuilding the squash file)

??
Code:
# du -s /initrd/pup_ro1/lib/modules/2.6.28.9/kernel/
2259    /initrd/pup_ro1/lib/modules/2.6.28.9/kernel/

After first boot zdrv is only used for new modules.
It's useful to look how something works before fiddling with it.

jamesbond wrote:
s11
Code:
#!/bin/ash
#ASH ^^^
...

   PUPSAVE_FILE="${PUPSAVE}${BASE}/${LEAF:4}"   
   #                                               BASH ^^^^^^

_________________
What's the ugliest part of your body?
Some say your nose
Some say your toes
But I think it's your mind
jamesbond

Joined: 26 Feb 2007
Posts: 2232
Location: The Blue Marble

PostPosted: Wed 09 Feb 2011, 18:45    Post subject:  

jemimah wrote:
jamesbond wrote:
You don't want to do this. Really. Try it without even doing anything on terminal. Even an "-o remount" can kill. I checked everywhere in Barry's code - he didn't even bother to do this.


I do know for certain my tests failed without it. Specifically, moving whiteout files down fails.
Ah, I see. I didn't move the whiteout files down. I copy them - so the original whiteouts remain in tmpfs. They take 0 bytes anyway.

Quote:
Maybe it only fails if you delete them from the top layer.
Yes, it definitely fails, especially if one tries to delete open files (and directories). Adding files is safe, and udba=notify isn't necessary.

Quote:
Barry does use it in pkginstall.sh when he writes to lower layers.
http://bkhome.org/blog/?viewDetailed=01534
Haha, the installpkg script on my machine is still the old one, that's why I didn't see it. Anyway, I think I need to take back my original comments - I have seen but can't recall the exact scenario where a remount can kill. In my case, after doing those two remounts, I lose access to /initrd/* --- so the next invocation of the script will fail.

jemimah wrote:
jamesbond wrote:
Anyway, I've implemented the "delete from tmpfs" using your lsof idea. It works - no stability issues so far. I've got a nice bit of extra space in tmpfs after doing that. Of course, my comment about thin provisioning still applies - if one runs a 128MB tmpfs with a 1GB pupsave, that's probably ok, but the other way around will get rough very quickly. Even a 1:1 tmpfs/pupsave will get rough very quickly. I'm not sure whether freememapplet shows tmpfs or pupsave in PUPMODE=13? Personally I prefer tmpfs=pupsave and just do rsync between them ... but that's just me.


I still think you'll have an ugly race condition without some kind of locking. You check if a file is open, find that it's not. Then before you can delete it, something opens it. As you noted above, Unix won't even complain, it will happily let you keep editing the deleted file. But once you close the file, whatever changes you made since the last save are gone.
Yes, you're right unfortunately. Our only hope is that nothing like this happens in between, because the operation works in tmpfs and should be very very fast. In aubrsync, as you noted, Okajima locked the entire filesystem to "ro" before doing the work ... I'm not sure whether we can afford that in our rootfs?


Dougal wrote:
jamesbond wrote:
s11
Code:
#!/bin/ash
#ASH ^^^
...

   PUPSAVE_FILE="${PUPSAVE}${BASE}/${LEAF:4}"   
   #                                               BASH ^^^^^^


In my system (and also Fluppy), /bin/ash is symlinked to busybox. Busybox ash seems to support this:
Code:
# busybox ash
# a=abcde
# echo ${a:3}
de
# exit


It isn't documented, and if you do have a link to where busybox ash is documented, I'm happy to take it (the busybox applet man page lists the details of every other applet except ash).

technosaurus


Joined: 18 May 2008
Posts: 4424

PostPosted: Wed 09 Feb 2011, 18:53    Post subject: Re: s11 - the ultimate snapmergepuppy  

Dougal wrote:
After first boot zdrv is only used for new modules.
It's useful to look how something works before fiddling with it.
As I said before, I rarely use a save file unless I am testing it, so the way I use the devx is to rename the devx.sfs to the zdrv name and merge the cutdown modules into the pup*.sfs -- I suspect most users operate it the "normal" way (with a save file and stock files) though and then it would free up a loop device and ~20Mb of disk space after deletion.
_________________
Web Programming - Pet Packaging 100 & 101
jemimah


Joined: 26 Aug 2009
Posts: 4309
Location: Tampa, FL

PostPosted: Thu 10 Feb 2011, 00:40    Post subject:  

jamesbond wrote:
Yes, you're right unfortunately. Our only hope is that nothing like this happens in between, because the operation works in tmpfs and should be very very fast. In aubrsync, as you noted, Okajima locked the entire filesystem to "ro" before doing the work ... I'm not sure whether we can afford that in our rootfs?


Even if you rewrote it in C it wouldn't be fast enough. Imagine you're competing for timeslices with the C compiler or some other automated process that modifies files. Snapmergepuppy also runs reniced so it's not going to compete very well. I think it's extremely likely that this would cause problems for the user, not just in theory.

I don't even think it's possible to make the root filesystem read-only while you are using it. You'd probably get a "device is busy" error. Patriot managed to do it for shutdown by adding a new rw layer and making only the save file read-only - but that is sort of the opposite of what we're trying to accomplish here. What we really need is a way to freeze all processes and flush the buffers like you'd do in a real snapcopy. But I don't think anything like that exists for this type of situation.
jamesbond

Joined: 26 Feb 2007
Posts: 2232
Location: The Blue Marble

PostPosted: Thu 10 Feb 2011, 02:05    Post subject:  

jemimah wrote:
Even if you rewrote it in C it wouldn't be fast enough. Imagine you're competing for timeslices with the C compiler or some other automated process that modifies files. Snapmergepuppy also runs reniced so it's not going to compete very well. I think it's extremely likely that this would cause problems for the user, not just in theory.
Ok, you have a point there. I don't do that scenario (compiling inside savefile) so it eludes me.

Quote:
I don't even think it's possible to make the root filesystem read-only while you are using it. You'd probably get a "device is busy" error. Patriot managed to do it for shutdown by adding a new rw layer and making only the save file read-only - but that is sort of the opposite of what we're trying to accomplish here.
Yes, that wasn't clear thinking on my part. Thinking more about it - even on a read-only fs, a process can still open and read a file, and if that file is then closed behind its back by aufs - well, it's the equivalent of pulling the rug out from under one's feet. So making it read-only really doesn't help.

Quote:
What we really need is a way to freeze all processes and flush the buffers like you'd do in a real snapcopy. But I don't think anything like that exists for this type of situation.
The R1soft ad at the top of this forum - hotcopy for Linux - apparently can do that. Perhaps it has hooks into the kernel VFS or something. Sigh - so unless someone can come up with a better idea, tmpfs can't be reclaimed even after a successful copydown.

Anyone?

jamesbond

Joined: 26 Feb 2007
Posts: 2232
Location: The Blue Marble

PostPosted: Thu 10 Feb 2011, 10:10    Post subject:  

Okay this is still racy. Can't beat them. But this is an inch closer. Anyone willing to give it a try / thoughts?

Code:
#!/bin/awk -f
BEGIN {
   # freeze all process excluding ourself
   MYPID = PROCINFO["pid"]   
   print "Freezing all processes"
   while ("ps -ef" | getline) {
      if ($2 != MYPID &&
         $2 != "PID") {
            system("kill -STOP " $2)
            frozen[$2]=$0            
      }
   }
   
   # do the job
   print "All process frozen, now do clean up"
   CMDLINE = "closedfiles.awk -v TMPFS=\"" TMPFS "\" | \
            grep -v -E \".wh..wh.orph|.wh..wh.plnk|.wh..wh.aufs|.wh..wh..opq\" | \
            xargs rm -rf"
   print CMDLINE
   #system(CMDLINE)
   
   # wake them up again
   for (i in frozen) {
      system("kill -CONT " i)
   }
   print "All processes thawed"
}


cheers!

jamesbond

Joined: 26 Feb 2007
Posts: 2232
Location: The Blue Marble

PostPosted: Thu 10 Feb 2011, 10:20    Post subject:  

Here is s12 with the wrapper above, for anyone who's interested to try. Contains all the necessary stuff.

NOTE: Experimental stuff. Don't run with real pupsave - create a test disposable pupsave instead.

EDIT: file deleted, buggy, please use latest version.


Last edited by jamesbond on Sat 12 Feb 2011, 09:09; edited 1 time in total
technosaurus


Joined: 18 May 2008
Posts: 4424

PostPosted: Thu 10 Feb 2011, 12:24    Post subject:  

jamesbond wrote:
Okay this is still racy. Can't beat them. But this is an inch closer. Anyone willing to give it a try / thoughts?

Code:
#!/bin/awk -f
BEGIN {
   # freeze all process excluding ourself
   MYPID = PROCINFO["pid"]   
   print "Freezing all processes"
   while ("ps -ef" | getline) {
      if ($2 != MYPID &&
         $2 != "PID") {
            system("kill -STOP " $2)
            frozen[$2]=$0            
      }
   }
   
   # do the job
   print "All process frozen, now do clean up"
   CMDLINE = "closedfiles.awk -v TMPFS=\"" TMPFS "\" | \
            grep -v -E \".wh..wh.orph|.wh..wh.plnk|.wh..wh.aufs|.wh..wh..opq\" | \
            xargs rm -rf"
   print CMDLINE
   #system(CMDLINE)
   
   # wake them up again
   for (i in frozen) {
      system("kill -CONT " i)
   }
   print "All processes thawed"
}


cheers!
I'm not on a Puppy box at the moment, but I'm a little concerned about pausing init and subsequently running processes before unpausing it (not sure if it matters though) - maybe add a ... && $2 != "1" ... if there are problems (1 _is_ the init process, right?)
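
A possible way to build the list of processes to freeze while leaving init alone, written as a shell sketch rather than awk. It assumes /proc is mounted and reads the PIDs from there instead of parsing ps output. The function name freezable_pids is made up, and the sketch only prints the list without sending any signal.

```shell
#!/bin/sh
# Print the PIDs that a freeze pass could stop, one per line, leaving out
# init (PID 1), this shell ($$) and its parent ($PPID).
freezable_pids() {
    for dir in /proc/[0-9]*; do
        [ -d "$dir" ] || continue
        pid=${dir#/proc/}
        case "$pid" in
            1|"$$"|"$PPID") continue ;;
        esac
        echo "$pid"
    done
}
```

The freeze would then be for p in $(freezable_pids); do kill -STOP "$p"; done, with a matching kill -CONT loop over the same saved list afterwards.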
Dougal


Joined: 19 Oct 2005
Posts: 2505
Location: Hell more grotesque than any medieval woodcut

PostPosted: Thu 10 Feb 2011, 15:42    Post subject:  

jamesbond wrote:
In my system (and also Fluppy), /bin/ash is symlinked to busybox. Busybox ash seems to support this:
Code:
# busybox ash
# a=abcde
# echo ${a:3}
de
# exit


It isn't documented, and if you do have a link to where busybox ash is documented, I'm happy to take it on (busybox applet man page list the details of every other applet except ash).

It's version-dependent. My version of Busybox Ash doesn't support it. Busybox also supports "let", which is a bashism, so not safe to rely on.
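
For the whiteout name, a portable alternative to ${LEAF:4} is the POSIX prefix-removal expansion ${LEAF#.wh.}, which any POSIX sh should accept. A small sketch, with a made-up function name whiteout_target, assuming the argument always contains at least one slash:

```shell
#!/bin/sh
# Map a whiteout path to the path of the file it hides, using only POSIX
# parameter expansion (no ${var:offset} substring syntax).
whiteout_target() {
    dir=${1%/*}          # everything before the last slash
    leaf=${1##*/}        # the file name itself, e.g. ".wh.foo"
    echo "$dir/${leaf#.wh.}"
}
```

In s11 this could be used as PUPSAVE_FILE="${PUPSAVE}$(whiteout_target "$FULLNAME")".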
