system optimization techniques

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

system optimization techniques

#1 Post by technosaurus »

Wondering whether to recompile a program in your puplet to eliminate large shared libs that are only used once or twice?

Code: Select all

for x in ${PATH//://* }/* ; do
  [ -x "$x" ] && objdump -x "$x" 2>/dev/null
done | grep NEEDED | sort | uniq -c | sort -n > binaudit
this will tell you how many times each library is needed (if a library isn't on the list, it can likely just be removed --- not always, though: remember that dlopen and/or plugins may still use it ... in "module" mode)

now open binaudit in your text editor and get to work, starting with the ones that are needed only once ... you can build them in statically

The libraries that are used the most are the ones where improving the compile benefits the most tools (I will discuss that later)

I'll go over optimization techniques soon.
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#2 Post by technosaurus »

Reserved compile techniques.

In case you ever wondered whether or not to set certain flags, it is helpful to know which ones are already set. This can be done by:

Code: Select all

export CFLAGS="your cflags"
{
  gcc $CFLAGS -Q --help=optimizers
  gcc $CFLAGS -Q --help=target
  gcc $CFLAGS -Q --help=warnings
  gcc $CFLAGS -Q --help=undocumented
} > gccflags
# note: pass your CFLAGS to gcc, or the report only shows the defaults;
# the original one-iteration "for x in x" loop is just command grouping, which { } does directly
CFLAGS
-Os (optimize for size: applies the -O2 optimizations that don't typically increase code size)
-finline-small-functions (inline a function when inlining is smaller than emitting the call)
-finline-functions-called-once -fearly-inlining (requires a newer gcc)
-ffunction-sections -fdata-sections with -Wl,--gc-sections LDFLAG
pdf explanation by Denys Vlasenko (busybox maintainer)
(allows compiler to remove functions and data that aren't actually used even if they are in the same object file as one that is)
-fno-unwind-tables -fno-asynchronous-unwind-tables (don't include unnecessary unwind info)
Rich Felker wrote: By default, modern GCC generates DWARF2 debug/unwind tables in the .eh_frame section of the object files/binaries. This adds significant bloat (as much as 15%) to the size of the busybox binary, including the portion mapped/loaded into memory at runtime (possibly a big issue for NOMMU targets), and the section is not strippable with the strip command due to being part of the loaded program text.

I've since done some further checking - both testing and asking the GCC developers about it - and it seems the solution is to add to the CFLAGS -fno-unwind-tables and -fno-asynchronous-unwind-tables. If debugging is disabled, this will prevent GCC from outputting DWARF2 tables entirely. But since busybox builds with -g by default, the interesting case is what happens then.

I originally thought these options would break debugging, but they don't; instead, they tell GCC to output the DWARF2 tables in the .debug_frame section instead of the newish .eh_frame section (used for exception handling). With these options added, busybox_unstripped is still fully debuggable, and the final busybox binary loses the 15% bloat factor from the DWARF2 tables.
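You can confirm this yourself at the object-file level (a small experiment of mine, assuming gcc and readelf are available; on most targets gcc turns -fasynchronous-unwind-tables on by default):

```shell
# show that -fno-unwind-tables / -fno-asynchronous-unwind-tables drop .eh_frame
cat > /tmp/hello.c <<'EOF'
int main(void){ return 0; }
EOF
gcc -Os -c -o /tmp/hello_unwind.o /tmp/hello.c
gcc -Os -fno-unwind-tables -fno-asynchronous-unwind-tables \
    -c -o /tmp/hello_nounwind.o /tmp/hello.c
readelf -S /tmp/hello_unwind.o   | grep eh_frame      # section is present
readelf -S /tmp/hello_nounwind.o | grep eh_frame || echo ".eh_frame gone"
```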

LDFLAGS
--gc-sections (see above)
--as-needed (don't explicitly link every library passed on the command line; only record DT_NEEDED entries for libraries that actually resolve a symbol. Otherwise, libraries a binary only reaches indirectly through its direct dependencies get stored in the binary too, even though the runtime loader can resolve those far more effectively itself - and as those libraries get updated, the stale entries make loading take much, much longer)
NOTE: gcc should have a --warn-as-needed to inform developers which libraries are unneeded when their crappy build scripts are broken
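Here is a minimal demonstration of what --as-needed does (assumes gcc and readelf; it forces --no-as-needed first, since many toolchains now pass --as-needed by default):

```shell
# link an unused libm with and without --as-needed and compare DT_NEEDED
cat > /tmp/noop.c <<'EOF'
int main(void){ return 0; }
EOF
gcc /tmp/noop.c -Wl,--no-as-needed -lm -o /tmp/noop_all
gcc /tmp/noop.c -Wl,--as-needed    -lm -o /tmp/noop_trim
readelf -d /tmp/noop_all  | grep NEEDED   # lists libm even though it is never called
readelf -d /tmp/noop_trim | grep NEEDED   # libm is dropped
```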
Last edited by technosaurus on Tue 09 Oct 2012, 05:26, edited 5 times in total.

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#3 Post by technosaurus »

Shrinking files.

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#4 Post by technosaurus »

Boot speed.

0. Do as little as necessary, and defer anything not needed to start X into xinit (you really only _need_ to mount /proc, /sys, /dev and the rootfs)

1. run as much as possible in parallel

a. if a command/function has no prerequisites and is not a prerequisite for anything else, you can simply append a & to run it in the background.

Code: Select all

foo &
b. if there are prerequisites involved, use the shell's builtin $! variable (which holds the PID of the most recent background job) to store the PID(s) of the parallel command(s), so that you can use the wait command to ensure the prerequisite calls have completed.

Code: Select all

wait_cmd(){ # wait for the PID(s) in $1 to finish, then run the remaining arguments as a command
  wait $1   # $1 is deliberately unquoted so a list of PIDs word-splits
  shift
  "$@"
}
foo &
foo_PID=$!
bar &
bar_PID=$!
# note the quotes around, and spaces between, the PIDs
wait_cmd "$foo_PID $bar_PID" baz
This method can be even more effective than systemd's pathetic parallelization methods

2. avoid unnecessary sleep(s) or at least unnecessarily long ones

a. where possible, use an appropriate check or wait command with short sleeps in a loop, rather than one really long sleep

Code: Select all

#example
#sleep 5 && jwm
#wait for the X server to establish its unix socket instead
COUNT=99
while [ ! -S /tmp/.X11-unix/X0 ] && [ $COUNT -gt 0 ]; do
  sleep .1
  COUNT=$((COUNT-1))
done
[ -S /tmp/.X11-unix/X0 ] && jwm || echo "X timed out"
3. use busybox (or toybox) instead of the GNU utils; they are smaller, so they load faster, and their shells are about 4x faster than bash

4. when operating on strings, variables, or even small files (under ~100 lines), use the shell's substring manipulation and other builtins instead of external utilities like awk, grep, sed, tr, or which
see http://www.murga-linux.com/puppy/viewtopic.php?t=70238
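For instance, parameter expansion can stand in for basename, dirname, and even simple grep calls (the path here is made up for the sketch):

```shell
# POSIX parameter expansion instead of external basename/dirname/grep
path=/usr/local/lib/libfoo.so.1.2.3

base=${path##*/}     # like basename: libfoo.so.1.2.3
dir=${path%/*}       # like dirname:  /usr/local/lib
stem=${base%%.so*}   # strip the extension: libfoo
echo "$base $dir $stem"

# case can replace "echo ... | grep" on a single string
case $path in
  */lib/*) echo "library path" ;;
esac
```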

5. minimize unnecessary files in your initramfs/rootfs...

One or two large files compress better (and load faster) than many smaller files - it's better to include all the functions in the init script itself, or at least in a single function header, than to use a bunch of smaller files. This speeds up load times in the initramfs because it compresses better, and it reduces seek and load times in both the initramfs and the rootfs (especially if the rootfs is on a slow or fragmented disk drive).
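A rough way to see the compression half of this for yourself (the numbers depend on the content and the compressor; the /tmp paths and file names are arbitrary):

```shell
# five similar files: one gzip stream vs five separate streams
mkdir -p /tmp/cmpdemo && cd /tmp/cmpdemo
for i in 1 2 3 4 5; do seq 1 500 > part$i; done

cat part1 part2 part3 part4 part5 | gzip -9 | wc -c   # one stream
for f in part?; do gzip -9 -c "$f"; done | wc -c      # five separate streams
```

With similar content, the single stream comes out much smaller because the compressor can reference earlier files from later ones.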

If a kernel module is needed in the init, chances are it should be builtin.

Similarly, shared libraries add bloat and load time (for multiple reasons); a static busybox built against musl or uclibc can cut load times significantly. The popular rant against static builds in favor of shared libraries by Ulrich Drepper is backed by metrics based on glibc (which he maintained), and glibc is horribly suited for static builds (a simple static hello world is over 400kb)
see https://sta.li/faq

6. use tools like bootchartd, strace and lsof to measure which files are used, in what order, and how much time is spent in each section; you can use this to "readahead" files before they are needed and to lay out the filesystem so files are read sequentially
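Once a trace has told you which files get opened early, a crude shell-only readahead looks like this (the file list is illustrative; a real list would come from your bootchartd/strace logs):

```shell
# warm the page cache in the background for files boot will need soon
for f in /bin/sh /bin/ls; do
  [ -r "$f" ] && cat "$f" > /dev/null &
done
wait    # later boot stages that open these files now hit the cache
```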

7. use kernel parameters

libahci.ignore_sss=1 #spin up disks in parallel instead of series
quiet #eliminates superfluous messages from the console
loglevel=3 #or lower ... only log the most critical issues; the lower the number, the fewer log messages
#probably more

8. don't write unnecessary stuff to the console ... or if you must, write as much as possible at once (for example, use a single success/fail message that covers multiple items instead of printing them one at a time). printf is useful for this
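A sketch of the batching idea with made-up service names; everything is gathered into one variable and written with a single printf:

```shell
# one console write for all results instead of one write per service
msg=""
for svc in network sound printer; do
  # real startup work for $svc would happen here; assume success for the sketch
  msg="$msg$svc: OK\n"
done
printf '%b' "$msg"   # %b expands the \n escapes in one write
```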
Last edited by technosaurus on Mon 09 Oct 2017, 23:38, edited 1 time in total.

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#5 Post by technosaurus »

X

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#6 Post by technosaurus »

Alternative apps/libs

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#7 Post by technosaurus »

Kernel configs

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#8 Post by technosaurus »

Portability

User avatar
sunburnt
Posts: 5090
Joined: Wed 08 Jun 2005, 23:11
Location: Arizona, U.S.A.

#9 Post by sunburnt »

Hi technosaurus; I wondered if this could do a statistical analysis of lib. usage in common apps.
I did a pseudo-analysis myself that showed a small percentage of libs. are actually in common use.

This is all about rules for which libs. to include in static builds.
Once the libs. are identified, then scripts for statically building them.
I don`t know if a build script "scrap" can be used many times like this.

scsijon
Posts: 1596
Joined: Thu 24 May 2007, 03:59
Location: the australian mallee
Contact:

#10 Post by scsijon »

Technosaurus, I wonder if a common lib could be built, something on the Busybox basis?

It should shrink the overall size a bit!

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#11 Post by disciple »

FWIW there was a project to build GTK and X and stuff all into one binary... but it seems to be dead.
Do you know a good gtkdialog program? Please post a link here

Classic Puppy quotes

ROOT FOREVER
GTK2 FOREVER

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#12 Post by technosaurus »

Goingnuts and I have done several x11, xaw and gtk1 apps as multicall binaries. Gtk2 apps tend to use plugins, thus requiring significant patches. However, there is an in-between method where you mix static and dynamic libs based on an analysis of dependencies.

It gets complicated when you have to build PIC versions of static libs so they can be statically linked into plugins (like png for imlib), and further complicated if another app links to those libs directly (such as jwm with png support), which requires changing the build to link jwm against the imlib plugins. This is fine if you are doing a one-time build of an embedded system with limited resources, but a royal pain to maintain for a distro.

That is why we hand-picked a few compatible apps (sometimes even specific versions) and limited them to the minimum functionality needed to use/build/rebuild a system from scratch using any distro's packages - so that even if your whole shared library system gets corrupted (libc included), you can still recover comfortably with an intact desktop environment and do basic everyday tasks like editing text, browsing the web or your file system, answering email and of course wasting time.

User avatar
sunburnt
Posts: 5090
Joined: Wed 08 Jun 2005, 23:11
Location: Arizona, U.S.A.

#13 Post by sunburnt »

I see no point in BusyBox in Squash file based O.S.s.
It just ends up being double compressed, usually with little improvement.

And here at Puppy we know BusyBox commands just don`t measure up.
Thought it could be compiled with full spec. commands...

BusyBox is a good idea for loose file systems, but not Squash file ones.

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#14 Post by disciple »

It just ends up being double compressed, usually with little improvement.
What do you mean? Busybox isn't about compression, is it?

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#15 Post by technosaurus »

The advantage of statically compiling a multicall binary is that unused functions get compiled out. Busybox also has the capacity to "prefer applets" and call them as a function or by forking itself; both are much faster than running a separate binary, even more so if that binary has to find and load shared libs. The only compression is of the help text. Really, it wouldn't hurt much to put a full busybox in the initramfs and copy it over. The only reason busybox applets sometimes fail is that the scripts weren't written specifically for them (tinycore scripts all work because they were). But I did notice that upx slows busybox down, and the extra compression is negligible.
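The argv[0] dispatch busybox uses can be sketched in pure shell (the applet names here are invented; busybox does this in C with a lookup table):

```shell
#!/bin/sh
# multicall.sh - behave differently depending on the name we were invoked as,
# the same trick busybox uses: ls, cp, sh ... are all links to one binary
me=${0##*/}
case $me in
  hello) echo "hello from applet" ;;
  bye)   echo "goodbye from applet" ;;
  *)     echo "usage: symlink this script as 'hello' or 'bye'" ;;
esac
```

Running the script through a symlink named hello selects the hello branch, just as a symlink named ls selects busybox's ls applet.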

User avatar
sunburnt
Posts: 5090
Joined: Wed 08 Jun 2005, 23:11
Location: Arizona, U.S.A.

#16 Post by sunburnt »

Yes, I realize BusyBox isn`t for the purpose of reduced size.
I just question it`s seeming general usefulness, and it`s value in Squash based systems.
I have no doubt that there`s a lot more going on with it.

If it`s truly an enhancement then perhaps it should be expanded?
Add in sed, grep, cut, paste, comm, awk, and perhaps even more...
This would "containerize" most of the loose Linux utility files.

Ibidem
Posts: 549
Joined: Wed 26 May 2010, 03:31
Location: State of Jefferson

#17 Post by Ibidem »

Busybox has sed, awk, lpd, vi, fdisk, mkfs, fsck, and so on--it's enough to build the kernel, combined with binutils, gcc, and make. You probably want CONFIG_DESKTOP, though.
I've actually used it as a full standalone OS.

IIRC, UPX hurts because it ends up killing the self-exec trick or something similar - could be wrong; I remember they mentioned it was NOT good for anything that runs multiple instances.
I'd expect Busybox to be an improvement on compressed filesystems due to reduced filesystem I/O - it self-executes from RAM that is already required just to run it, so you don't have it eating up cycles on decompression or duplicating pages in the cache.
The big space savings (IIRC) was actually in just having one ELF header.
A shared, split Busybox with libbusybox.so was ~6-7 MB in the 1.18 timeline (an interesting config option that is well hidden: enable PIC, then edit .config and uncomment it; I forget the name of the option...)

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#18 Post by technosaurus »

The problem with libbusybox.so is that it only exports the <applet>_main functions, not the other functions that would be useful to link against when creating other apps (I believe this was brought up on the mailing list at some point) - things like the zlib/bz2/xz and md5/sha1 helpers and numerous others. This is likely due to the license choice being GPL rather than LGPL - industry isn't usually interested in GPLing a major work - but now there is toybox.

scsijon
Posts: 1596
Joined: Thu 24 May 2007, 03:59
Location: the australian mallee
Contact:

Re: system optimization techniques

#19 Post by scsijon »

technosaurus wrote:Wondering whether or not to recompile a program in your puplet to eliminate large shared libs that are only used once or twice

Code: Select all

for x in ${PATH//://* }/* ; do
  [ -x "$x" ] && objdump -x "$x" 2>/dev/null
done | grep NEEDED | sort | uniq -c | sort -n > binaudit
this will tell you how many times each library is needed (if it isn't on the list, it can likely just be removed --- not always though, need to remember dlopen and/or plugins may use it too ... in "module" mode)

now open binaudit in your text editor and get to work, starting with the ones that are needed only once ... you can build them in statically
Turned it into a shell script (easy) and ran it; it's interesting and useful. However, I was wondering if you would be willing to extend it to start from / and test against all installed files at all subdirectory depths, and of course their library directories.

It could then become a tool for backtesting a completed build, and fill a void that we don't have now.

As you know, I'm not much of a coder and some days just thinking is a mental migraine.

thanks
scsijon

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#20 Post by technosaurus »

If you start at /, it will check libs too even if no binaries use them. Waste of time and not useful. Maybe add some from /usr/share and $HOME, but those should really get fixed to be in proper locations.
