Reduced compiled size by eliminating redundant assembly

Posted: Tue 29 Oct 2013, 16:12
by technosaurus
The topic is covered with better formatting at:
http://stackoverflow.com/questions/1966 ... y-language

I am also interested in Puppy-like solutions, though; maybe someone can figure out a novel way to do this in 10 lines of awk :)

I've been going through some assembly programming videos to get a better understanding of how to manually optimize the *.s files left after compiling with gcc/g++ -S. One of the topics covered was refactoring redundant code: it demonstrates how to move redundant code into its own labeled block ending with a ret and replace each copy with a call.

Out of curiosity, and to try to reduce compiled size, I compiled some C and C++ projects with -S and various options, including -Os, -O2, -O3, -pipe, -combine and -fwhole-program, and analyzed the resulting *.s files for redundancy using a version of duplo lightly patched to handle .s files. Only -fwhole-program (now deprecated IIRC) had a significant effect on eliminating duplicate code across files. I assume its replacement, -flto, would behave similarly at link time (roughly equivalent to compiling with -ffunction-sections -fdata-sections and linking with --gc-sections). Even so, it still misses significantly large blocks of duplicated code.

The example given in the video is 2 blocks containing:

Code:

mov eax,power
mul ebx
mov power,eax
inc count
The video replaces each block with call CalculateNextPower, where CalculateNextPower looks like:

Code:

CalculateNextPower:
mov eax,power
mul ebx
mov power,eax
inc count
ret
Am I missing a compiler option (or even a standalone tool) that does this automatically when compiling for size (including in other compilers: clang, icc, etc.), or is this functionality absent (for a reason)?

If it doesn't exist, it should be possible to modify duplo to ignore lines starting with a '.' or ';' (and others?) and to replace each duplicate code block with a call to a function containing that code, but I am open to other suggestions that would work directly with the compiler's internal representations.
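
As a rough illustration of the detection half of that idea, here is a minimal awk sketch wrapped in a shell function. It is not duplo and not a tested tool: find_dup_blocks is a made-up name, the window size is passed by hand, and it only reports repeated runs of N instruction lines without rewriting anything. Following the suggestion above, lines starting with '.' or ';' are skipped wholesale, which also drops local labels such as .L2:, so treat its output as candidates to inspect by hand.

```shell
#!/bin/sh
# Sketch only: report every window of N consecutive instruction lines that
# occurs more than once across the given .s files.
# find_dup_blocks is a hypothetical helper name, not part of duplo or gcc.
find_dup_blocks() {
    n=$1
    shift
    awk -v n="$n" '
        FNR == 1 { cnt = 0 }                        # a window never spans two files
        { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }  # trim indentation and trailing space
        $0 == "" || /^[.;]/ { next }                # skip blanks, ".directive" and ";" lines
        {
            cnt++
            line[cnt % n] = $0                      # ring buffer holding the last n lines
            start[cnt % n] = FNR                    # source line number of each entry
            if (cnt < n) next                       # not enough lines for a window yet
            key = ""
            for (i = cnt - n + 1; i <= cnt; i++)
                key = key line[i % n] "\n"
            here = FILENAME ":" start[(cnt - n + 1) % n]
            if (key in first)
                print first[key] " == " here        # first occurrence == repeat
            else
                first[key] = here
        }
    ' "$@"
}
```

Something like `find_dup_blocks 4 *.s` prints one `file:line == file:line` pair per repeated window. A long duplicate shows up as several overlapping windows, so merging adjacent hits would be the next step.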

Posted: Tue 29 Oct 2013, 19:42
by technosaurus
Ok, I patched duplo for assembly files:

basic howto:

Code:

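# compile your c/cpp files with
gcc -S <options_here> *.c
gcc -S <options_here> *.cpp
# list the generated .s files, one per line
for x in *.s; do
    echo "$x"
done > files.txt
# look for duplicate blocks of at least 5 lines; results go to out.txt
duplo -ml 5 -ip files.txt out.txt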
If you downloaded this version, try the one in the next post.

Posted: Wed 30 Oct 2013, 00:47
by technosaurus
Repatched duplo; the labels need to be kept in place (I thought my change ended the string at that point, but it actually just removed the labels).
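
To make the label point concrete, here is a small awk filter, again only a sketch. strip_asm_noise is a hypothetical name and this is not the actual duplo patch. The assumption behind it is that a label can be a jump target, so a duplicated run that silently swallows one is not safe to move into a shared routine. The filter drops blank lines, ';' comment lines and assembler directives, but prints every line ending in ':' untouched, including local labels such as .L2: that a plain "starts with '.'" rule would throw away.

```shell
#!/bin/sh
# Sketch only: strip noise from gcc-generated .s files before comparing blocks,
# while leaving every label where it is.
# strip_asm_noise is a hypothetical helper name, not part of duplo.
strip_asm_noise() {
    awk '
        { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }  # trim indentation and trailing space
        $0 == "" { next }                           # blank line
        /^;/     { next }                           # comment line
        /:$/     { print; next }                    # label, including local ones like .L2:
        /^\./    { next }                           # directive such as .text or .align
        { print }                                   # ordinary instruction
    ' "$@"
}
```

For example, `strip_asm_noise foo.s > foo.stripped.s` gives a file in which instructions and labels keep their original order.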