lz4cat - initrd decompression


#1 Post by rufwoof »

For initrd.gz, you typically extract it by cd'ing to where the initrd.gz is located and running

mkdir N
cd N
zcat ../initrd.gz | cpio -id

and reform using

find | cpio -o -H newc | gzip >../initrd.gz
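Spelled out as one round trip (a sketch; run as root so cpio can recreate ownership and device nodes, and /path/to stands for wherever the initrd lives):

Code:

	cd /path/to                           # directory containing initrd.gz
	mkdir N && cd N
	zcat ../initrd.gz | cpio -id          # unpack

	# ... make whatever edits are needed under N/ ...

	mv ../initrd.gz ../initrd.gz.bak      # keep a backup of the original
	find . | cpio -o -H newc | gzip > ../initrd.gz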

For an lz4-compressed one, i.e. an initrd.lz4, the commands are

mkdir N
cd N
lz4 -cd ../initrd.lz4 | cpio -id

and reform using

find | cpio -o -H newc | lz4 -l >../initrd.lz4

i.e. lz4 -cd is comparable to lz4cat, and lz4 -l (lowercase L) on the reform side selects lz4 legacy (Linux kernel) format compression.
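If you're not sure which compression a given initrd uses, a minimal helper along these lines can pick the right decompressor (a hypothetical sketch: it assumes your file(1) recognises the lz4 magic, which older versions may not):

Code:

	#!/bin/sh
	# unpack-initrd: extract a gzip- or lz4-compressed cpio initrd
	# into a new subdirectory N (hypothetical helper, not part of puppy)
	f=$(readlink -f "$1") || exit 1
	mkdir N && cd N || exit 1
	case "$(file -b "$f")" in
	    gzip*) zcat "$f" | cpio -id ;;
	    LZ4*)  lz4 -cd "$f" | cpio -id ;;
	    *)     echo "unrecognised compression: $f" >&2; exit 1 ;;
	esac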

The kernel will need to have been compiled with lz4 support, and you'll need binary versions of mksquashfs and unsquashfs (in addition to the lz4 binary) in order to create/extract squashed file systems (SFS) that utilise lz4 compression. For squashfs that typically means a kernel of version 3.19 or newer, built with lz4 configured.
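To check whether a given kernel has the relevant options, something like this works (assuming the kernel exposes its config via /proc/config.gz, or ships a config file; the option names are the mainline ones):

Code:

	# if the running kernel exposes its configuration
	zgrep -i lz4 /proc/config.gz

	# or against a config file shipped alongside the kernel
	grep -E 'CONFIG_(RD_LZ4|SQUASHFS_LZ4|LZ4_DECOMPRESS)' /boot/config-$(uname -r)

	# you want to see e.g. CONFIG_SQUASHFS_LZ4=y and CONFIG_RD_LZ4=y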

These are the ones I use in Tahr 6.0.5 32-bit PAE:

http://murga-linux.com/puppy/viewtopic. ... 889#879889

and the lz4 binary I use is attached (remove the fake .gz suffix and drop it into /usr/bin).
Attachments
lz4.gz (fake .gz suffix, 118.17 KiB)


#2 Post by Ted Dog »

Thanks for spearheading these efforts. With only a slight increase in size and a massively improved decompression speed, I'm surprised this hasn't been more widely supported in the main spins. Since using layered squashfs in RAM, most people don't realise how much CPU overhead is being wasted just continually decompressing tiny text scripts.


#3 Post by rufwoof »

Ted Dog wrote:With only a slight increase in size and a massively improved decompression speed, I'm surprised this hasn't been more widely supported in the main spins. Since using layered squashfs in RAM, most people don't realise how much CPU overhead is being wasted just continually decompressing tiny text scripts.
Since switching all my sfs's to lz4, immediately after bootup xload starts with just one Y-axis line; more usually, after bootup with other choices of compression such as xz, I have several Y-axis xload lines. I've commented out the Updating stage of bootup, and with an lz4 initrd and lz4 sfs's it really flies through bootup, even on this 15-year-old single-core Celeron.

I would have guessed that once a small script had been read it would be paged into memory for more or less the rest of a single session, and that there probably aren't that many unique small scripts in use during a session; more often it's the same scripts being used repeatedly.

I don't really understand how the layers work in practice; I'd have guessed that it wouldn't decompress one version from the puppy sfs, another from the zdrv sfs and another from the savefile, but would just check pointers/presence and run an actual read (decompression) once. I believe lzop doesn't compress directories, only regular files, so for lzop at least lookups wouldn't involve decompression time.

Shame that regular Tahr 6.0.5 only supports gzip and xz. I'm finding the emsee 4.3.2 kernel, which supports lzo and lz4, highly impressive so far.

With the kernel DOTconfig configured similarly to emsee 4.3.2 (so lz4 and lzo supported), coupled with mksquashfs and unsquashfs builds that have lzo and lz4 support (so you can create sfs's compressed with either lzop or lz4), plus standalone lzop (a version of which is usually in busybox anyway) and lz4 binaries, a pup can be transformed to use lzo or lz4 for each of the initrd and the sfs's (puppy, zdrv, application sfs's); see the sketch below for the sfs part.
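As a concrete sketch of that sfs transformation (filenames here are just examples), an existing gzip- or xz-compressed sfs can be rebuilt with lz4 like so:

Code:

	# unpack the existing sfs (unsquashfs must support its current compressor)
	unsquashfs -d puppy_tree puppy_tahr.sfs

	# repack the tree with lz4 high-compression mode
	mksquashfs puppy_tree puppy_tahr_lz4.sfs -comp lz4 -Xhc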

I'm flipping between lz4 high compression and lzop -1. Both are pretty comparable operationally as far as I can tell. The published benchmarks show lz4 to be very fast at decompressing, so I suspect it has the edge, but in practice I'm not seeing much of a difference from lzop -1. Perhaps given the small size of puppy generally, combined with the very fast speeds, even a factor of three or four wouldn't be perceptible.
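To put rough numbers on that for yourself, timing both compressors over the same input is straightforward (a sketch; test.dat stands for any reasonably large file, and -9 selects lz4 high compression in reasonably recent lz4 builds):

Code:

	# compression: lz4 high compression vs lzop -1
	time lz4 -9 test.dat test.dat.lz4
	time lzop -1 -o test.dat.lzo test.dat

	# decompression to /dev/null, where lz4 is claimed to shine
	time lz4 -cd test.dat.lz4 > /dev/null
	time lzop -dc test.dat.lzo > /dev/null

	# and compare the resulting sizes
	ls -l test.dat test.dat.lz4 test.dat.lzo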

With a gzip'd or xz-compressed Libre Office sfs, its first run after sfs loading would take several seconds, with a bar line expanding across the window. With an lz4-compressed Libre sfs it more or less jumps straight across. Thereafter, once memory-bound (paged in), subsequent Libre loads are as quick as invoking the internal abiword (nigh on instant).


#4 Post by rufwoof »

One aspect that favours lz4 over lzop is multi-core systems. I believe lzop is single-core bound, whereas lz4 apparently supports multi-core: https://code.google.com/p/lz4/
LZ4 is a very fast lossless compression algorithm, providing compression speed at 400 MB/s per core, with near-linear scalability for multi-threaded applications. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
In that same link, the benchmarks suggest lz4 is around six times quicker than gzip (zlib) and three times quicker than lzop at decompressing.
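lz4's built-in benchmark mode offers a quick way to reproduce such figures on your own hardware (assuming a reasonably recent lz4 binary; older builds may lack the -b option):

Code:

	# benchmark compression level 1 against a sample file;
	# reports MB/s for both compression and decompression
	lz4 -b1 /tmp/sample.dat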

Worthy of highlight
typically reaching RAM speed limits on multi-core systems
If, as I'd casually guess from observation, compression averages around a 2:1 ratio, that's comparable to doubling up the ram (virtually, i.e. via compression).


#5 Post by rufwoof »

This article http://falando.es/en/definitions/data-compression/ notes (my bold highlight)
The cloud web hosting platform where your web hosting account will be created uses the outstanding ZFS file system. The LZ4 compression method which the aforementioned uses is superior in numerous aspects, and not only does it compress information better than any compression method that some other file systems use, but it is also much quicker. The gains are significant in particular on compressible content which includes website files. Despite the fact that it may sound unreasonable, [b]uncompressing data with LZ4 is faster than reading uncompressed info from a hard disk[/b]
So not only is it like having additional (virtual) memory (ram), but also like having an upgraded (faster) hard disk as well :)

Their observation about the benefit on HTML content is pertinent to puppy, as much of puppy is similar to HTML but utilising XML instead (textual files).


#6 Post by rufwoof »

I'm running Tahr 6.0.5 on a single-core (15-year-old Celeron D) PC. I just grabbed my PXE server SFS from Wary, loaded that, created a huge initrd with the Tahr puppy/zdrv etc. contained within, and PXE (net) booted another 4-core PC from that.

Testing mksquashfs with LZ4 compression on that four-core machine, running in ram, there was barely enough time to even see the cores blip (in htop) when using 160MB of files (/lib folder content). Unsquashfs of that same sfs and I did get to see the 'using four cores..' message and all four cores in htop jump up a little.

So it's looking good: all four cores are being used for both compression and decompression, and when ram-bound even 160MB+ of data flashes in/out of its compressed state very quickly when using LZ4, similar to each core processing 40MB with a very fast decompressor.
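That test is easy to reproduce along these lines (paths are illustrative; /dev/shm stands in for "running in ram"):

Code:

	# squash ~160MB of /lib content into ram-backed storage, timed
	time mksquashfs /lib /dev/shm/lib.sfs -comp lz4 -Xhc

	# and unsquash it again, also entirely in ram
	time unsquashfs -d /dev/shm/lib_out /dev/shm/lib.sfs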

I would suggest that LZ4 is destined to become more widely adopted within Linux distros.

If you fancy a try for yourself - http://murga-linux.com/puppy/viewtopic. ... 100#880100


#7 Post by rufwoof »

Bearing in mind I know nowt about woof/T2, kernel compiling etc., this is how I compiled mksquashfs and unsquashfs. It's an sfs of the directories containing the source code for mksquashfs, unsquashfs and lz4, i.e. an image of those directories after I'd actually compiled, so the binaries are there as well.

https://drive.google.com/file/d/0B4MbXu ... sp=sharing

I loaded the relevant devx and kernel sources in order to compile (tahr 6.0.5 devx and emsee 4.3.2 kernel sources in my case).

Other than a simple ./configure and make I know little about compiling either, so I guess you might have to run something like make clean or whatever it is.

To get mksquashfs/unsquashfs with lz4 support I copied one (maybe some?) of the lz4 .h files across from the lz4 source/lib directory into the mksquashfs source directory. I also had to edit the Makefile (uncommenting a few lines) to activate the different choices of compression being compiled into mksquashfs/unsquashfs.

After compilation I just copied them in as mksquashfs5 and unsquash5, as there were already 4, 3 and 2 versions, and then changed the mksquashfs and unsquashfs symlinks to point to those 5 versions instead of the 4 versions. I think I may also have copied across some libs (can't remember); i.e. I did a manual install rather than some make install type process (which I don't understand).
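For anyone repeating this, the gist in command form (a rough sketch; the Makefile variable names are those of squashfs-tools 4.x and may differ in other versions, and the "5" names just follow the scheme described above):

Code:

	# in the squashfs-tools source directory, edit Makefile and
	# uncomment the compressor support lines, e.g.:
	#   LZO_SUPPORT = 1
	#   LZ4_SUPPORT = 1
	#   XZ_SUPPORT = 1
	# (the lz4 headers/library must be findable, hence copying the
	# lz4 .h files across as described above)
	make

	# manual install alongside the existing versioned binaries
	cp mksquashfs /usr/bin/mksquashfs5
	cp unsquashfs /usr/bin/unsquashfs5
	ln -sf mksquashfs5 /usr/bin/mksquashfs
	ln -sf unsquashfs5 /usr/bin/unsquashfs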

mksquashfs --help

now looks like

Code:

root# mksquashfs --help
SYNTAX:mksquashfs source1 source2 ...  dest [options] [-e list of exclude
dirs/files]

Filesystem build options:
-comp <comp>		select <comp> compression
			Compressors available:
				gzip (default)
				lzo
				lz4
				xz
-b <block_size>		set data block to <block_size>.  Default 128 Kbytes
			Optionally a suffix of K or M can be given to specify
			Kbytes or Mbytes respectively
-no-exports		don't make the filesystem exportable via NFS
-no-sparse		don't detect sparse files
-no-xattrs		don't store extended attributes
-xattrs			store extended attributes (default)
-noI			do not compress inode table
-noD			do not compress data blocks
-noF			do not compress fragment blocks
-noX			do not compress extended attributes
-no-fragments		do not use fragments
-always-use-fragments	use fragment blocks for files larger than block size
-no-duplicates		do not perform duplicate checking
-all-root		make all files owned by root
-force-uid uid		set all file uids to uid
-force-gid gid		set all file gids to gid
-nopad			do not pad filesystem to a multiple of 4K
-keep-as-directory	if one source directory is specified, create a root
			directory containing that directory, rather than the
			contents of the directory

Filesystem filter options:
-p <pseudo-definition>	Add pseudo file definition
-pf <pseudo-file>	Add list of pseudo file definitions
-sort <sort_file>	sort files according to priorities in <sort_file>.  One
			file or dir with priority per line.  Priority -32768 to
			32767, default priority 0
-ef <exclude_file>	list of exclude dirs/files.  One per line
-wildcards		Allow extended shell wildcards (globbing) to be used in
			exclude dirs/files
-regex			Allow POSIX regular expressions to be used in exclude
			dirs/files

Filesystem append options:
-noappend		do not append to existing filesystem
-root-becomes <name>	when appending source files/directories, make the
			original root become a subdirectory in the new root
			called <name>, rather than adding the new source items
			to the original root

Mksquashfs runtime options:
-version		print version, licence and copyright message
-exit-on-error		treat normally ignored errors as fatal
-recover <name>		recover filesystem data using recovery file <name>
-no-recovery		don't generate a recovery file
-info			print files written to filesystem
-no-progress		don't display the progress bar
-progress		display progress bar when using the -info option
-processors <number>	Use <number> processors.  By default will use number of
			processors available
-mem <size>		Use <size> physical memory.  Currently set to 503M
			Optionally a suffix of K, M or G can be given to specify
			Kbytes, Mbytes or Gbytes respectively

Miscellaneous options:
-root-owned		alternative name for -all-root
-noInodeCompression	alternative name for -noI
-noDataCompression	alternative name for -noD
-noFragmentCompression	alternative name for -noF
-noXattrCompression	alternative name for -noX

-Xhelp			print compressor options for selected compressor

Compressors available and compressor specific options:
	gzip (default)
	  -Xcompression-level <compression-level>
		<compression-level> should be 1 .. 9 (default 9)
	  -Xwindow-size <window-size>
		<window-size> should be 8 .. 15 (default 15)
	  -Xstrategy strategy1,strategy2,...,strategyN
		Compress using strategy1,strategy2,...,strategyN in turn
		and choose the best compression.
		Available strategies: default, filtered, huffman_only,
		run_length_encoded and fixed
	lzo
	  -Xalgorithm <algorithm>
		Where <algorithm> is one of:
			lzo1x_1
			lzo1x_1_11
			lzo1x_1_12
			lzo1x_1_15
			lzo1x_999 (default)
	  -Xcompression-level <compression-level>
		<compression-level> should be 1 .. 9 (default 8)
		Only applies to lzo1x_999 algorithm
	lz4
	  -Xhc
		Compress using LZ4 High Compression
	xz
	  -Xbcj filter1,filter2,...,filterN
		Compress using filter1,filter2,...,filterN in turn
		(in addition to no filter), and choose the best compression.
		Available filters: x86, arm, armthumb, powerpc, sparc, ia64
	  -Xdict-size <dict-size>
		Use <dict-size> as the XZ dictionary size.  The dictionary size
		can be specified as a percentage of the block size, or as an
		absolute value.  The dictionary size must be less than or equal
		to the block size and 8192 bytes or larger.  It must also be
		storable in the xz header as either 2^n or as 2^n+2^(n+1).
		Example dict-sizes are 75%, 50%, 37.5%, 25%, or 32K, 16K, 8K
		etc.
root#
To create an sfs I now tend to use

mksquashfs <dir> somename.sfs -comp lz4 -Xhc

which compresses the content of directory <dir> (change that to the actual name of the directory structure that you want to create an sfs of).

I did consider making lz4 the default as per the options in the Makefile, but then decided to leave it at gzip, as that's currently the more common choice.

Be aware that the kernel has to have lz4 support compiled in for you to be able to view/load sfs's compressed using lz4. That's the primary reason why I switched out the standard kernel that comes with Tahr for one of Stemsee's compiled kernels. See my prior post for an alternative to having to do all that yourself (an ISO of Tahr pup with an lz4-supporting kernel).
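To confirm an sfs really did end up lz4-compressed (and will therefore need that kernel support to mount), unsquashfs can print the superblock:

Code:

	# -s prints superblock details, including the compression used
	unsquashfs -s somename.sfs | grep -i compression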


#8 Post by rufwoof »

As a very cursory glance at LZ4 and (fast) solid state disks (SSDs): whilst SSDs are typically 3 times faster than HDDs (which have physical moving parts), their throughput is still relatively slow compared to what LZ4 can run at, especially on a multi-core PC. LZ4 can run at 10GB/sec on a single core, so 40GB/sec on a 4-core, whilst a reasonable/good SSD might manage 2GB/sec of throughput. Conceptually at least, therefore, LZ4 adds a speed benefit to an SSD-based installation, perhaps doubling the effective throughput.

For an SSD-based frugal puppy installation, I'd suggest pfix=ram,nocopy booting. With just pfix=ram the entire puppy gets copied into memory at startup, whilst with the nocopy option added, parts of puppy are read into memory as/when required (faster boot), which more often than not means certain parts of puppy (gparted, for instance, and/or other less often used programs/libs) are never copied in during a session. Given the relatively small size of puppy compared to available memory/ram, more often once a file has been copied into ram it will remain memory-bound (paged in) for the remainder of the session.
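For reference, a hypothetical syslinux stanza for such a boot (the label and file names are examples; the pfix values are the ones discussed above):

Code:

	label puppy
	  kernel vmlinuz
	  append initrd=initrd.lz4 pfix=ram,nocopy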
