Author |
Message |
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 16 Mar 2012, 18:03 Post subject:
system optimization techniques |
|
Wondering whether or not to recompile a program in your puplet to eliminate large shared libs that are only used once or twice?
Code: | for x in ${PATH//://* }/* ; do [ -x "$x" ] && objdump -x "$x" 2>/dev/null ; done|grep NEEDED|sort |uniq -c |sort -n >binaudit |
This will tell you how many times each library is needed. If a library isn't on the list, it can likely just be removed, though not always: remember that dlopen and/or plugins may use it too (in "module" mode).
Now open binaudit in your text editor and get to work, starting with the libraries that are needed only once. Those can be built statically into the one program that uses them.
The libraries that are used the most are the ones where a better-optimized build pays off for the most tools (I will discuss that later).
I'll go over optimization techniques soon.
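Once binaudit exists, the candidates can also be pulled out without scrolling through the file. A minimal sketch, assuming the file has the `count NEEDED libname` layout that uniq -c produces from objdump's output; needed_at_most is a made-up helper name, not part of any tool:

```shell
# needed_at_most MAX [FILE]: print "count libname" for every library in a
# binaudit-style file that is needed by at most MAX executables.
# (hypothetical helper; FILE defaults to ./binaudit)
needed_at_most() {
    max=$1
    file=${2:-binaudit}
    # Each input line looks like: "      1 NEEDED               libfoo.so.1"
    awk -v max="$max" '$2 == "NEEDED" && $1 <= max { print $1, $3 }' "$file"
}
```

For example, `needed_at_most 2` lists the libraries that only one or two programs link against, which are the first candidates for static linking.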
_________________ Check out my github repositories. I may eventually get around to updating my blogspot.
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 16 Mar 2012, 19:30 Post subject:
|
|
Reserved compile techniques.
In case you ever wondered whether or not to set certain flags, it is helpful to know which ones already are set. This can be done by:
Code: | CFLAGS="your cflags"
#gcc does not read CFLAGS from the environment, so pass the flags on the command line
{
gcc $CFLAGS -Q --help=optimizers
gcc $CFLAGS -Q --help=target
gcc $CFLAGS -Q --help=warnings
gcc $CFLAGS -Q --help=undocumented
} >gccflags |
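The resulting gccflags file is long. A small sketch for listing only the options that are switched on, assuming the usual two-column `-fsome-flag [enabled]` layout of gcc -Q --help output; enabled_flags is a made-up helper name:

```shell
# enabled_flags [FILE]: print the names of all options marked [enabled]
# in a file written by "gcc -Q --help=..." (FILE defaults to ./gccflags).
# (hypothetical helper)
enabled_flags() {
    awk '$2 == "[enabled]" { print $1 }' "${1:-gccflags}"
}
```

Generating the file once per set of CFLAGS and diffing the two enabled_flags lists is one way to see what a given flag really changes.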
CFLAGS
-Os (optimize for size: everything -O2 does except the optimizations that tend to increase code size)
-finline-small-functions (inline a function when its body is smaller than the code needed to call it)
-finline-functions-called-once -fearly-inlining (requires gcc update)
-ffunction-sections -fdata-sections with the -Wl,--gc-sections LDFLAG (allows the linker to remove functions and data that aren't actually used, even if they are in the same object file as one that is)
See the pdf explanation by Denys Vlasenko (busybox maintainer).
-fno-unwind-tables -fno-asynchronous-unwind-tables (don't include unnecessary unwind info)
Rich Felker wrote: | By default, modern GCC generates DWARF2 debug/unwind tables in the .eh_frame section of the object files/binaries. This adds significant bloat (as much as 15%) to the size of the busybox binary, including the portion mapped/loaded into memory at runtime (possibly a big issue for NOMMU targets), and the section is not strippable with the strip command due to being part of the loaded program text. I've since done some further checking - both testing and asking the GCC developers about it - and it seems the solution is to add to the CFLAGS -fno-unwind-tables and -fno-asynchronous-unwind-tables. If debugging is disabled, this will prevent GCC from outputting DWARF2 tables entirely. But since busybox builds with -g by default, the interesting case is what happens then. I originally thought these options would break debugging, but they don't; instead, they tell GCC to output the DWARF2 tables in the .debug_frame section instead of the newish .eh_frame section (used for exception handling). With these options added, busybox_unstripped is still fully debuggable, and the final busybox binary loses the 15% bloat factor from the DWARF2 tables. |
LDFLAGS
--gc-sections (see above)
--as-needed (don't explicitly link all libraries passed on the command line; only the ones that actually satisfy a symbol reference get recorded as NEEDED. Without it, libraries that are only reached indirectly through direct dependencies get recorded in the binary too, even though they can be picked up at runtime much more effectively, and every unnecessary entry adds load time)
Both are linker options, so pass them through gcc as -Wl,--gc-sections and -Wl,--as-needed.
NOTE: gcc should have a --warn-as-needed to inform developers which libraries are unneeded when their crappy build scripts are broken
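The effect of the -ffunction-sections / -Wl,--gc-sections pair can be checked on a throwaway program. This is only a sketch under assumptions: it expects gcc, GNU ld and nm to be installed, and gc_sections_demo, demo.c and unused_helper are names invented for the illustration:

```shell
# gc_sections_demo: build the same source with and without section garbage
# collection and report whether the never-called function survives.
# (hypothetical demo; assumes gcc, GNU ld and nm are available)
gc_sections_demo() {
    dir=$(mktemp -d) || return 1
    # One object file containing a used function (main) and an unused one.
    cat >"$dir/demo.c" <<'EOF'
#include <stdio.h>
int unused_helper(int x) { return x * 42; } /* never called */
int main(void) { puts("hello"); return 0; }
EOF
    gcc -Os -o "$dir/plain" "$dir/demo.c" || return 1
    gcc -Os -ffunction-sections -fdata-sections -Wl,--gc-sections \
        -o "$dir/gc" "$dir/demo.c" || return 1
    for bin in plain gc; do
        if nm "$dir/$bin" | grep -q unused_helper; then
            echo "$bin: unused_helper kept"
        else
            echo "$bin: unused_helper removed"
        fi
    done
    rm -rf "$dir"
}
```

Running `gc_sections_demo` prints one line per build. Comparing the two binaries with size or ls -l before the cleanup step is an easy extension.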
Last edited by technosaurus on Tue 09 Oct 2012, 01:26; edited 5 times in total
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 16 Mar 2012, 19:31 Post subject:
|
|
Shrinking files.
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 16 Mar 2012, 19:32 Post subject:
|
|
Boot speed.
0. Do as little as necessary, and put off anything that isn't needed to start X until xinit (you really only _need_ to mount /proc, /sys, /dev and the rootfs)
1. run as much as possible in parallel with &
a. if there are no prerequisites to running a command/function and it is not a prerequisite to others, then you can just add a & to the end.
b. if there are prerequisites involved, use the builtin $! variable to store the PID of each background command, so that you can use the wait command to ensure the prerequisite calls are complete.
Code: | wait_cmd(){ #wait for the PID(s) in $1 to complete, then run the rest of the arguments as a command
wait $1 #unquoted on purpose so that several PIDs split into separate arguments
shift
"$@"
}
foo &
foo_PID=$!
bar &
bar_PID=$!
#Note the quotes around and spaces between the PIDs
wait_cmd "$foo_PID $bar_PID" baz |
This method can be even more effective than systemd's pathetic parallelization methods.
2. avoid unnecessary sleep(s), or at least unnecessarily long ones
a. where possible, use an appropriate check or wait command with short sleeps in a loop instead of a single really long sleep
Code: | #example
#sleep 5 && jwm
#wait for the X server to establish the unix socket instead
COUNT=99
#keep waiting while the socket is missing AND there are tries left
while
[ ! -S /tmp/.X11-unix/X0 ] && [ $COUNT -gt 0 ]
do
sleep .1
COUNT=$((COUNT-1))
done
#start jwm only if the socket really showed up
if [ -S /tmp/.X11-unix/X0 ]; then jwm; else echo X timed out; fi |
3. use busybox (or toybox) instead of GNU utils: they are smaller, so they load faster, and the shells are about 4x faster than bash
4. when operating on strings, variables or even small files (< ~100 lines), use the shell's substring manipulation and other builtins instead of external utilities like awk, grep, sed, tr and which
see http://www.murga-linux.com/puppy/viewtopic.php?t=70238
5. minimize unnecessary files in your initramfs/rootfs...
One or two large files compress better (and load faster) than many smaller files, so it's better to include all the functions in the init script, or at least in a single function header, than to spread them over a bunch of smaller ones. This speeds up load times in the initramfs because it compresses better, and it reduces seek and load times in both the initramfs and the rootfs (especially if the rootfs is on a slower or fragmented disk drive).
If a kernel module is needed in the init, chances are it should be built in.
Similarly, shared libraries add bloat and load time (for multiple reasons); a static busybox built against musl or uclibc can significantly cut down load times. The popular rant against static builds in favor of shared libraries by Ulrich Drepper is backed up by metrics based on glibc (which he maintained), which is horribly suited for static builds (a simple static hello world is over 400kb).
see https://sta.li/faq
6. use tools like bootchartd, strace and lsof to measure which files are used, in what order, and how much time is spent in each stage. You can use this to "readahead" files that will be needed and to order the filesystem so that those files are laid out sequentially
7. use kernel parameters
libahci.ignore_sss=1 #spin up disks in parallel instead of series
quiet #eliminates superfluous messages from the console
loglevel=3 #or less... the lower the number, the fewer log messages; only the most critical issues get logged
#probably more
8. don't write unnecessary stuff to the console ... or if you must, write as much as possible at once (for example, use a single success/fail message that handles multiple items instead of doing them one at a time). printf is useful for this
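Item 4 can be illustrated with plain POSIX shell builtins. A minimal sketch; the function names are made up for illustration, and each one does inside the shell what would otherwise cost a fork and exec of an external utility:

```shell
# basename replacement: strip everything up to the last slash
base_name() { printf '%s\n' "${1##*/}"; }

# dirname replacement: strip the last slash and what follows
# (assumes the path has a directory part, e.g. /usr/bin/foo)
dir_name() { printf '%s\n' "${1%/*}"; }

# "grep -q" replacement for a fixed substring: contains HAYSTACK NEEDLE
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# "cut -d= -f2-" replacement: value part of a KEY=VALUE string
value_of() { printf '%s\n' "${1#*=}"; }

# "head -n1" replacement for small files: read the first line with read
first_line() { IFS= read -r line <"$1"; printf '%s\n' "$line"; }
```

In an init script this might look like `contains "$cmdline" quiet && verbose=0`, where cmdline and verbose are hypothetical variables, with no grep process started at all.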
Last edited by technosaurus on Mon 09 Oct 2017, 19:38; edited 1 time in total
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 16 Mar 2012, 19:41 Post subject:
|
|
X
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 16 Mar 2012, 19:42 Post subject:
|
|
Alternative apps/libs
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 16 Mar 2012, 19:43 Post subject:
|
|
Kernel configs
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 16 Mar 2012, 19:44 Post subject:
|
|
Portability
sunburnt

Joined: 08 Jun 2005 Posts: 5087 Location: Arizona, U.S.A.
|
Posted: Thu 10 May 2012, 17:35 Post subject:
|
|
Hi technosaurus; I wondered if this could do a statistical analysis of lib. usage in common apps.
I did a pseudo-analysis myself that showed only a small percentage of libs are actually in common use.
This is all about rules for which libs to include in static builds.
Once the libs are identified, the next step is scripts for statically building them.
I don't know if a build script "scrap" can be used many times like this.
scsijon
Joined: 23 May 2007 Posts: 1313 Location: the australian mallee
|
Posted: Thu 23 Aug 2012, 22:27 Post subject:
|
|
Technosaurus, I wonder if a common lib could be built, something on the Busybox basis?
It should shrink the overall size a bit!
disciple
Joined: 20 May 2006 Posts: 6781 Location: Auckland, New Zealand
|
Posted: Fri 24 Aug 2012, 02:01 Post subject:
|
|
FWIW there was a project to build GTK and X and stuff all into one binary... but it seems to be dead.
_________________ If you have or know of a good gtkdialog application, please post a link here
Classic Puppy quotes
ROOT FOREVER
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Fri 24 Aug 2012, 11:10 Post subject:
|
|
Goingnuts and I have done several x11, xaw and gtk1 apps as multicall binaries. Gtk2 apps tend to use plugins, thus requiring significant patches. However, there is an in-between method where you mix static and dynamic libs based on analysis of dependencies. It gets complicated when you have to build PIC versions of static libs so they can be statically linked into plugins (like png for imlib), and further complicated if you have another app that links to those libs directly (such as jwm with png support), which requires changing the build to link jwm to the imlib plugins instead. This is fine if you are doing a one-time build of an embedded system with limited resources, but a royal pain to maintain for a distro. That is why we hand picked a few compatible apps (sometimes even specific versions) and limited them to the minimum functionality needed to use/build/rebuild a system from scratch using any distro's packages, or to recover comfortably if your whole shared library system gets corrupted: you still have an intact desktop environment and can do basic everyday tasks like editing text, browsing the web or your file system, answering email and of course wasting time, even if your libc gets corrupted.
sunburnt

Joined: 08 Jun 2005 Posts: 5087 Location: Arizona, U.S.A.
|
Posted: Sat 25 Aug 2012, 16:32 Post subject:
|
|
I see no point in BusyBox in Squash file based O.S.s.
It just ends up being double compressed, usually with little improvement.
And here at Puppy we know BusyBox commands just don't measure up.
Though it could be compiled with full-spec commands...
BusyBox is a good idea for loose file systems, but not Squash file ones.
disciple
Joined: 20 May 2006 Posts: 6781 Location: Auckland, New Zealand
|
Posted: Sat 25 Aug 2012, 17:53 Post subject:
|
|
Quote: | It just ends up being double compressed, usually with little improvement. |
What do you mean? Busybox isn't about compression, is it?
 |
technosaurus

Joined: 18 May 2008 Posts: 4787 Location: Kingwood, TX
|
Posted: Sat 25 Aug 2012, 20:21 Post subject:
|
|
The advantage of statically compiling a multicall binary is that unused functions get compiled out. Busybox also has the capacity to "prefer applets" and call them as a function or by forking itself (both are much faster than running a separate binary, even more so if that binary has to find and load shared libs). The only compression is the help text. Really, it wouldn't hurt much to put a full busybox in the initramfs and copy it over. The only reason busybox applets sometimes fail is that the scripts weren't specifically written for them (tinycore scripts all work because they were). But I did notice that upx slows busybox down, and the extra compression is negligible.