speeding up scripts

For discussions about programming, programming questions/advice, and projects that don't really have anything to do with Puppy.
Message
Author
User avatar
sc0ttman
Posts: 2812
Joined: Wed 16 Sep 2009, 05:44
Location: UK

speeding up scripts

#1 Post by sc0ttman »

I was looking at making a few puppy scripts run faster.
Mostly, I have found many scripts will work using ash.

I am using busybox 1.1.8.4-unicode (from Barry) and bash 4.2

Here's what we can do to make scripts faster...

If we open a shell script and change #!/bin/sh to #!/bin/ash, the script uses 'ash' not 'bash' to execute the code.
The 'ash' shell is part of busybox on most puppies, and is smaller than bash, with fewer features.

Anyway... Ash is faster, but has some problems.

EDIT: A really easy way to check for 'Bashisms' in scripts: checkbashisms.pl!
Just download it, remove the '.gz' extension and make it executable.

Then just 'checkbashisms /path/to/script' to check a script!

Older info continued below....

I learned that ash (supposedly) doesn't like

Code: Select all

[ ! "$VAR" = "something" ]
Apparently, ash prefers

Code: Select all

[ "$VAR" != "something" ]
I changed these, but actually, it was not necessary.


Also, apparently, ash does not support

Code: Select all

${VARNAME%%cutmeofftheend}
or

Code: Select all

${VARNAME##cutmeoffthestart}
but I found this not to be true as well.
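For the record, a quick sketch showing that these POSIX prefix/suffix expansions behave the same in ash, dash and bash:

```shell
#!/bin/sh
# POSIX prefix/suffix stripping works in busybox ash as well as bash
VAR="start-middle-end"
echo "${VAR%%-*}"   # removes the longest suffix matching '-*' -> start
echo "${VAR##*-}"   # removes the longest prefix matching '*-' -> end
```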


Another thing, this will not work in ash

Code: Select all

export VAR="some value"
, so use instead

Code: Select all

VAR="some value"; export VAR

Also, I think ash does not like:

Code: Select all

function my_function_name () {
instead, use only

Code: Select all

my_function_name () {
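A minimal sketch of the portable form, which runs unchanged under ash, dash and bash:

```shell
#!/bin/sh
# POSIX function definition: no 'function' keyword, just name() { ... }
greet() {
    printf 'hello %s\n' "$1"
}
greet world
```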
I also learned that using the following near the start of scripts,
should make things faster, but use only if you don't depend on locales in the script.

Code: Select all

OLDLANG=$LANG
LANG=C
Then you must do the following near the end:

Code: Select all

LANG=$OLDLANG
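As a sketch, the save/restore can wrap just the locale-sensitive section; sorting is a typical place where LANG=C speeds things up:

```shell
#!/bin/sh
# Save the locale, switch to C for heavy text processing, then restore
OLDLANG=$LANG
LANG=C; export LANG
printf 'b\na\nc\n' | sort    # byte-order sort, no locale collation cost
LANG=$OLDLANG; export LANG
```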
However... I have hit some problems, that if I can correct, will make ash more widely usable.

1. Is it possible to execute a command defined in the main script, from within a GTK-Dialog button? If so, that would be great, but at the moment, anything like this in the GTKDialog GUI

Code: Select all

<button>
<action>my_function_name</action>
</button>
simply returns (in the terminal)

Code: Select all

my_function_name: command not found
Can anyone fix this for ash scripts?
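One possible workaround (a sketch only, under the assumption that gtkdialog's &lt;action&gt; runs in a fresh shell that cannot see the parent's functions) is to write the code out as a real helper script, since ash cannot export functions to child processes. The path and name below are made up for illustration:

```shell
#!/bin/sh
# Sketch: expose a 'function' to a child process as a real script,
# because ash has no way to export shell functions.
HELPER=/tmp/my_function_name.$$
cat > "$HELPER" <<'EOF'
#!/bin/sh
echo "button pressed"
EOF
chmod +x "$HELPER"
"$HELPER"     # gtkdialog's <action> would call this path instead
rm -f "$HELPER"
```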

2. Does anyone know anything else that needs to be done to make bash scripts work in ash?
Which other features are missing from ash?

EDIT: Arrays are missing. See 3rd post, below.
Attachments
checkbashisms.gz
fake .gz, just remove the .gz extension and make executable
(20.76 KiB) Downloaded 1071 times
Last edited by sc0ttman on Tue 13 Dec 2011, 16:21, edited 10 times in total.
[b][url=https://bit.ly/2KjtxoD]Pkg[/url], [url=https://bit.ly/2U6dzxV]mdsh[/url], [url=https://bit.ly/2G49OE8]Woofy[/url], [url=http://goo.gl/bzBU1]Akita[/url], [url=http://goo.gl/SO5ug]VLC-GTK[/url], [url=https://tiny.cc/c2hnfz]Search[/url][/b]

User avatar
sc0ttman
Posts: 2812
Joined: Wed 16 Sep 2009, 05:44
Location: UK

ash vs bash

#2 Post by sc0ttman »

More thoughts..

1. We could export certain parts of non-ash compatible scripts to an ash script, which is created and made executable on the fly, then have that ash script run stuff and return its results to the main bash script..

2. I will list scripts which I have successfully converted to ash.

3. I will list scripts that I could not convert, with the errors given.
[b][url=https://bit.ly/2KjtxoD]Pkg[/url], [url=https://bit.ly/2U6dzxV]mdsh[/url], [url=https://bit.ly/2G49OE8]Woofy[/url], [url=http://goo.gl/bzBU1]Akita[/url], [url=http://goo.gl/SO5ug]VLC-GTK[/url], [url=https://tiny.cc/c2hnfz]Search[/url][/b]

PANZERKOPF
Posts: 282
Joined: Wed 16 Dec 2009, 21:38
Location: Earth

Re: speeding up scripts

#3 Post by PANZERKOPF »

sc0ttman wrote: I learned that ash (supposedly) doesn't like

Code: Select all

[ ! "$VAR" = "something" ]
Apparently, ash prefers

Code: Select all

[ "$VAR" != "something" ]
This is fine; I changed these, but it did not seem to be necessary.

Code: Select all

${VARNAME%%cutmeofftheend}
or

Code: Select all

${VARNAME##cutmeoffthestart}
but I found this not to be true.
Strange....
My busybox ash recognizes these constructions.
This construction also works:

Code: Select all

${VARNAME/string/string}
sc0ttman wrote: 2. Does anyone know anything else that needs to be done to make bash scripts work in ash? Which other features are missing from ash?
For example, arrays are missing:

Code: Select all

${VARNAME[INDEX]}
...but do we need them?
SUUM CUIQUE.

User avatar
Dougal
Posts: 2502
Joined: Wed 19 Oct 2005, 13:06
Location: Hell more grotesque than any medieval woodcut

Re: speeding up scripts

#4 Post by Dougal »

sc0ttman wrote:Also, apparently, ash does not support

Code: Select all

${VARNAME%%cutmeofftheend}
or

Code: Select all

${VARNAME##cutmeoffthestart}
but I found this not to be true.
Those are Bourne-compatible (dash also supports them).
What is a bashism is the ${VAR/a/b} notation. Newer versions of busybox ash support it, but it's not portable.
1. Is it possible to execute a command defined in the main script, from within a GTK-Dialog button? If so, that would be great, but at the moment, anything like this in the GTKDialog GUI

Code: Select all

<button>
<action>my_function_name</action>
</button>
simply returns (in the terminal)

Code: Select all

my_function_name: command not found
This is odd. I don't recall encountering that. Did you make sure to export the function?
2. Does anyone know anything else that needs to be done to make bash scripts work in ash?
Which other features are missing from ash?
Just search the ABS-guide for bashisms... he's pretty good at pointing them out. I can't really remember them now, but one that comes to mind is extracting a field out of a string: ${VAR:m:n} (or whatever).
Also, "let" is a bashism (though supported by busybox ash).
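For the ${VAR:m:n} case, a portable stand-in is cut -c (an external call, so slower, but Bourne-safe); just a sketch:

```shell
#!/bin/sh
# bash: ${VAR:2:3} ; portable equivalent with cut -c (1-based, inclusive)
VAR="abcdefgh"
echo "$VAR" | cut -c3-5    # prints cde
```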
What's the ugliest part of your body?
Some say your nose
Some say your toes
But I think it's your mind

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#5 Post by technosaurus »

Busybox hush supports brace expansion, while ash doesn't.
Ex. touch /tmp/{a,b}
Hush # /tmp/a and /tmp/b are touched
Ash # /tmp/{a,b} is touched (yes, a file starting with "{" can be annoying)

Substring manipulations were mentioned (${VAR##...}, ${VAR%%...}, ${VAR//...}, ${VAR:-default}), but...
With a little creativity they can replace sed, grep, cut, tr, truncate, head, basename, dirname and more. It is almost always faster to manipulate a string than to call an external program.
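For instance, basename and dirname fall straight out of ${VAR##} and ${VAR%}:

```shell
#!/bin/sh
# Parameter expansion doing basename/dirname work with no external calls
path=/usr/share/applications/foo.desktop
file=${path##*/}      # like basename: foo.desktop
dir=${path%/*}        # like dirname:  /usr/share/applications
name=${file%.desktop} # strip the extension: foo
echo "$dir $file $name"
```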

For compatibility and speed, make scripts compatible with
http://busybox.net/downloads/BusyBox.html

Use pipes with programs that can use them.
But don't use cat as the first command in a pipe if the second command can read the file itself.

Use strings instead of files if possible.

If you have to use a temp file, /dev/shm is the fastest location because it is in memory

Complicated if/then/elses can be reduced to...
[ "$DISPLAY" ] && xmessage desktop || echo console
This format works with any variable _and_ you can include string manipulations

You can use $((<math>)) for fast integer math or busybox awk for non-integer.
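A quick sketch of both forms:

```shell
#!/bin/sh
# $(( )) handles integers in-shell; awk covers floating point
echo $(( (3 + 4) * 2 ))               # 14
awk 'BEGIN { printf "%.2f\n", 7/2 }'  # 3.50
```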

It is much faster to call a function than a separate script (~20x).

...most importantly, you need to be able to accurately measure the changes

Sorry, no direct examples atm, but here is a way to time functions or external scripts/binaries (the external 'time' command doesn't work on functions)

Code: Select all

time2() {
STARTDELTA=`date +%s.%N`
STOPDELTA=`date +%s.%N`
#needed to remove the time it takes to call date itself
STARTTIME=`date +%s.%N`
"$@"
ENDTIME=`date +%s.%N`
awk "BEGIN{print $ENDTIME - $STARTTIME + $STARTDELTA - $STOPDELTA }" >&2
}
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].

User avatar
alienjeff
Posts: 2265
Joined: Sat 08 Jul 2006, 20:19
Location: Winsted, CT - USA

#6 Post by alienjeff »

A couple/three years back, someone on here was experimenting with parsing or optimizing the scripts in Puppy Linux. He used a different term, though. Sadly, I don't recall the username or post. Perhaps someone else has a better memory.
[size=84][i]hangout:[/i] ##b0rked on irc.freenode.net
[i]diversion:[/i] [url]http://alienjeff.net[/url] - visit The Fringe
[i]quote:[/i] "The foundation of authority is based upon the consent of the people." - Thomas Hooker[/size]

User avatar
sc0ttman
Posts: 2812
Joined: Wed 16 Sep 2009, 05:44
Location: UK

#7 Post by sc0ttman »

Thanks for the info guys, helpful as usual.

I found a nice site that details this stuff as well: https://wiki.ubuntu.com/DashAsBinSh


I found a nice little script that seems to work well,
called 'checkbashisms'.. It's a PERL script, runs fine, attached in main post.
Last edited by sc0ttman on Fri 12 Aug 2011, 20:18, edited 1 time in total.
[b][url=https://bit.ly/2KjtxoD]Pkg[/url], [url=https://bit.ly/2U6dzxV]mdsh[/url], [url=https://bit.ly/2G49OE8]Woofy[/url], [url=http://goo.gl/bzBU1]Akita[/url], [url=http://goo.gl/SO5ug]VLC-GTK[/url], [url=https://tiny.cc/c2hnfz]Search[/url][/b]

User avatar
sc0ttman
Posts: 2812
Joined: Wed 16 Sep 2009, 05:44
Location: UK

Re: speeding up scripts

#8 Post by sc0ttman »

Dougal wrote:This is odd. I don't recall encountering that. Did you make sure to export the function?
As usual I don't really know what's going on, but I used 'set -a', which helps the GUIs find them normally for me... I tried 'export -f', but it's apparently not supported by ash.

Also I read [[ blah ]] is a bashism.. And then the checkbashisms script I attached in main post pointed it out to me...
[b][url=https://bit.ly/2KjtxoD]Pkg[/url], [url=https://bit.ly/2U6dzxV]mdsh[/url], [url=https://bit.ly/2G49OE8]Woofy[/url], [url=http://goo.gl/bzBU1]Akita[/url], [url=http://goo.gl/SO5ug]VLC-GTK[/url], [url=https://tiny.cc/c2hnfz]Search[/url][/b]

User avatar
sc0ttman
Posts: 2812
Joined: Wed 16 Sep 2009, 05:44
Location: UK

#9 Post by sc0ttman »

After running 'checkbashisms' on a number of scripts, it says that

Code: Select all

echo -e "blah blah \n blah"
is not supported .. and to use actual new lines inside the "", with a \n added to the end of the string to produce the same output (I think)...

EDIT: Also printf can be used... I have made the init script in initrd.gz use the ash shell, replacing all echo -e with printf .. I need to add a static printf to initrd, and then it should work fine - already boots a bit quicker, but the green 'done' bits won't show without printf.

But maybe printf is slower than echo -e in the first place... Hmm...

Anyway, it seems most puppy scripts have only 3 or 4 bashisms in there,
and the fixes are often small and easy. Mostly things like:

echo -e

[[ blah ]] ... also [ blah && blah ]

&>/dev/null
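A sketch of the usual one-for-one replacements for those three:

```shell
#!/bin/sh
# &>/dev/null  ->  >/dev/null 2>&1   (redirect stdout and stderr portably)
ls /nonexistent >/dev/null 2>&1 || echo "silenced"
# [[ a && b ]] ->  [ a ] && [ b ]   (two POSIX tests instead of one [[ ]])
[ 1 -eq 1 ] && [ 2 -eq 2 ] && echo "both true"
# echo -e "a\nb" -> printf 'a\nb\n'  (printf interprets escapes everywhere)
printf 'blah blah\nblah\n'
```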
Last edited by sc0ttman on Sat 13 Aug 2011, 12:36, edited 1 time in total.
[b][url=https://bit.ly/2KjtxoD]Pkg[/url], [url=https://bit.ly/2U6dzxV]mdsh[/url], [url=https://bit.ly/2G49OE8]Woofy[/url], [url=http://goo.gl/bzBU1]Akita[/url], [url=http://goo.gl/SO5ug]VLC-GTK[/url], [url=https://tiny.cc/c2hnfz]Search[/url][/b]

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#10 Post by amigo »

Much of the time I go the other way and depend more on bash, instead of less. Usually, slowness in scripts is due to poor design -making too many checks, unnecessary loops- or from calling external programs. Using even medium-level bash functionality can replace many of the external calls.

A couple of years ago I wrote a collection of bash scripts which mimic the action of about a dozen common cli tools -like basename, dirname, cut, etc., which are widely used in shell scripts. Pasting these into a bash script *as functions* can really speed things up.
You can have a look here:
http://distro.ibiblio.org/pub/linux/dis ... /BashTrix/

All those there use absolutely no external calls. When writing scripts, I particularly try to avoid calling sed, awk and grep. When you see a one-liner with multiple calls to cat, cut, grep you should know something is wrong. I see lots of puppy routines which pipe together a bunch of commands -if that is all inside a loop which gets run several times it really puts a drag on things.

User avatar
sc0ttman
Posts: 2812
Joined: Wed 16 Sep 2009, 05:44
Location: UK

#11 Post by sc0ttman »

amigo wrote:Much of the time I go the other way and depend more on bash, instead of less. Usually, slowness in scripts is due to poor design -making too many checks, unnecessary loops- or from calling external programs. Using even medium-level bash functionality can replace many of the external calls.

A couple of years ago I wrote a collection of bash scripts which mimic the action of about a dozen common cli tools -like basename, dirname, cut, etc., which are widely used in shell scripts. Pasting these into a bash script *as functions* can really speed things up.
You can have a look here:
http://distro.ibiblio.org/pub/linux/dis ... /BashTrix/

All those there use absolutely no external calls. When writing scripts, I particularly try to avoid calling sed, awk and grep. When you see a one-liner with multiple calls to cat, cut, grep you should know something is wrong. I see lots of puppy routines which pipe together a bunch of commands -if that is all inside a loop which gets run several times it really puts a drag on things.
Nice, thanks for the link to BashTrix.. I'll have a look...
There's also bashbox by techno, yet to dig deep into that one...

And I know what you mean about making the bash itself faster as a better alternative, but I must face facts: while I can easily use PHP to do such things (remove loops, find the fastest in-built methods, etc), sadly my BASH/shell knowledge is pretty damn poor in comparison.. Although I know enough to (try to) reduce the number of external commands ..

Aside from loops, cat, awk, grep and sed, are there any other time hogs we should avoid? And what should we use instead? If statements are not resource or time intensive, right? (providing the test is a simple one of course)..

Is using . /path/to/functions as quick as having them in the main file?
[b][url=https://bit.ly/2KjtxoD]Pkg[/url], [url=https://bit.ly/2U6dzxV]mdsh[/url], [url=https://bit.ly/2G49OE8]Woofy[/url], [url=http://goo.gl/bzBU1]Akita[/url], [url=http://goo.gl/SO5ug]VLC-GTK[/url], [url=https://tiny.cc/c2hnfz]Search[/url][/b]

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#12 Post by amigo »

I'll start from your last question -sourcing a file obviously takes longer than reading it (the content) directly inline. But sourcing a common set of code has its advantages.

About BashTrix and bashbox -long before technosaurus came out with bashbox, I had thought of producing a 'bashybox' -a multi-call thingy just as he came up with, incorporating all the BashTrix functionality in one long script. But I tired of working with BashTrix as I have my plate full anyway. To be clear, I don't mean to take any credit or proprietorship of techno's ideas and work. Nor do I mean to talk it down. The thing about bashbox is that they are pretty much (all?) puppy-specific utilities which have been rolled into one. BashTrix contains replacements for general-usage GNU utilities -which can also be incorporated into your bash scripts at will. Each one is independent of the others -purposely- though most of them contain some code duplicated in others. The code is always significantly longer than is really needed, but they are meant to be easy to read, understand and modify.

Running through loops which iterate through lots of files or lines of a file is where most serious slow-downs occur. When there is no other way to eliminate the loop cycles, it is often better to go ahead and use the more sophisticated binary tools like sed, awk and grep. But try to find a way to minimize the external calls by mixing them with inlined bash functions.

More tips:
case statements are much faster than if statements, and more flexible, except when you need to use test ([[ ]]) to check on file attributes. case statements can duplicate most of the functionality of grep for simple pattern matching.
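A sketch of case doing a grep-style 'contains' test with no extra process:

```shell
#!/bin/sh
# case as an in-shell substring test, replacing 'echo | grep -q'
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}
contains "hello world" "lo wo" && echo match
contains "hello world" "xyz"  || echo "no match"
```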

When forced to use grep, try to make it possible to use fgrep -it is a smaller binary and uses no regexes (only fixed-string matching, so you may need to adjust the comparison algorithm).

Good general programming practices vis-a-vis the outline/design of your program -spend lots of time working out as many loops and redundant if's as possible.

Reduce file/disk IO as much as possible. Often it is easier to write the output of some loop or grep into a file for later processing by another routine. But, if the answer or output from the original processing is just one line, or can be made into an array (or faked array), then it will be faster to simply capture that output into a variable. Example: Instead of 'grep this-pattern * > file', do this:
var=$(grep this-pattern *)

When using grep, limit the response to a single match when possible, using the '-m1' option. For instance, you want to know if a certain pattern occurs in a file or files which may contain many matches. Using the -m (number of matches) will make grep exit once the first match is found instead of continuing through the whole file.
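Sketch of the early-exit behaviour:

```shell
#!/bin/sh
# -m1 stops grep at the first match instead of scanning everything
printf 'x\nneedle one\ny\nneedle two\n' | grep -m1 needle
```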

That said, good form and good comments are more important. If you write good comments about what your routine should be doing, you'll find it easier to improve the routine later when you have more experience. Fine optimization should be tackled later, but good planning at the first should give you a better platform to work from. It's usually good to do a first stab at the script, then start over from scratch at least once -before you start adding all the feature creep.

If you need to use 'which' to locate programs, use it only once for each program and store the result in a variable. Speaking of 'which', you can use the bash built-in 'type' to do that for you and it will also show you any functions which have been set up.

Shell programming and speed are not really meant to go together, so don't write hard-to-understand, fancy one-liners just for the sake of a few micro-seconds. The beauty of shell programming is the ease of altering/correcting the code and being able to quickly check the result.

Although there are many advanced features of bash (shopt & Co., arrays, etc) I generally avoid them -your bash may have some options compiled in which others don't have. And besides, the less-used features are less readable and you are likely to need to re-study them each time you want to use them -or have to modify code you wrote earlier.

Avoid lengthy multi-call pipes:
cat file.txt | while read LINE ; do blah blah |cut |grep |grep -v |awk |sed
One puppy routine (for working with package repos) was using a loop which, in each iteration, ran a series of 27 piped commands! No wonder some contrib was able to write a small C program which ran much faster. Once, I had written a routine which was looping through some input from hundreds of files. The routine was using multiple if statements. When I re-wrote it to use a single case statement, it ran in 3% of the time of the original!

Somewhere here on the forum is a thread which shows several examples of duplicating grep behaviour using case. The grep replacement in BashTrix shows the most basic 'starts-with', 'ends-with' and 'contains' functionality (at the end of the script). Oddly, when I started writing BashTrix, one of the first programs I tackled was 'cut', which turned out to be the hardest one (except for wget). But the bash cut lets you specify delimiters of more than one char, so is very useful. Elsewhere, I have written small functions named lof, lol, rof and rol -that's left-of-first, left-of-last, right-of-first and right-of-last. Using bash to process strings is much, much faster than echo-cut-rev-cut-rev, and faster than sed too.
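The lof/lol/rof/rol names come from the post above; the bodies here are my guess at how they'd be written with pure expansions, so treat them as a sketch:

```shell
#!/bin/sh
# Assumed implementations of the lof/lol/rof/rol helpers (bodies are a sketch)
lof() { echo "${1%%"$2"*}" ; }   # left-of-first
lol() { echo "${1%"$2"*}"  ; }   # left-of-last
rof() { echo "${1#*"$2"}"  ; }   # right-of-first
rol() { echo "${1##*"$2"}" ; }   # right-of-last
lof "a.b.c" "."   # a
lol "a.b.c" "."   # a.b
rof "a.b.c" "."   # b.c
rol "a.b.c" "."   # c
```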

User avatar
Dougal
Posts: 2502
Joined: Wed 19 Oct 2005, 13:06
Location: Hell more grotesque than any medieval woodcut

#13 Post by Dougal »

sc0ttman wrote:After running 'checkbashisms' on a number of scripts, it says that

Code: Select all

echo -e "blah blah \n blah"
is not supported .. and to use actual new lines inside the ""
Yeah, the builtin echo in dash always acts as "echo -e"... I just hacked dash to ignore the "-e".

Note that with Busybox ash this is not a problem.

A simple workaround to this problem is to replace all instances of "echo" with "\echo", which will ensure that /bin/echo is used, but will slow down your script...
What's the ugliest part of your body?
Some say your nose
Some say your toes
But I think it's your mind

User avatar
Dougal
Posts: 2502
Joined: Wed 19 Oct 2005, 13:06
Location: Hell more grotesque than any medieval woodcut

#14 Post by Dougal »

amigo wrote:Much of the time I go the other way and depend more on bash, instead of less. Usually, slowness in scripts is due to poor design -making too many checks, unnecessary loops- or from calling external programs. Using even medium-level bash functionality can replace many of the external calls.
I generally agree with this, but have some reservations.

Doing things smartly is obviously the best thing (see my x10 speed improvement of the xorgwizard from a few years ago).

Using builtins is also best wherever possible, even if the code ends up a little longer -- just like when coding in C you don't keep using system()... The way I look at it, grep, for example, is a C program optimized for finding strings fast, while the shell is a C program optimized for processing shell (builtin) commands fast... so more LOC is just fine as long as there are no subshells (and IO! Aggregate things you want to echo, if possible).

However, it doesn't have to be bashisms, Bourne shell builtins can solve most problems and sometimes the speed penalty of using Bash only for one place where the bashism is needed (and is not speed critical) makes it not worthwhile.

You also want to think of the amount of data you need to handle: if you need to find something in a short text file, a loop with a case structure and shell internals might be fastest, but at some stage it becomes faster to just open a subshell and use grep, as it is, after all, optimized for that. (A few months ago I actually implemented things like awk '{print $1}' in shell (based on positional parameters) and tested and saw when it became faster to just call awk... I PMed it to Zigbert and don't seem to have it anymore.)
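The positional-parameter trick looks roughly like this (a sketch; Dougal's actual version is lost, per the post):

```shell
#!/bin/sh
# First field of a line via 'set --', like awk '{print $1}'
line="foo bar baz"
set -- $line          # word-splitting fills $1 $2 $3
echo "$1"             # foo
```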
What's the ugliest part of your body?
Some say your nose
Some say your toes
But I think it's your mind

User avatar
Dougal
Posts: 2502
Joined: Wed 19 Oct 2005, 13:06
Location: Hell more grotesque than any medieval woodcut

#15 Post by Dougal »

amigo wrote:When forced to use grep, try to make it possible to use fgrep -it is a smaller binary and uses no regex's (only pattern matching so you may need to adjust the comparison algorithm).
Hmm

Code: Select all

# cat /bin/fgrep 
#!/bin/sh
exec grep -F ${1+"$@"}
I've just trained myself to type "grep -F" by default...

Generally agree with the rest.
What's the ugliest part of your body?
Some say your nose
Some say your toes
But I think it's your mind

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#16 Post by technosaurus »

I just wanted to mention that almost all of the external binary advice goes out the window if you use a busybox shell (ash/hush) with the busybox_prefer_applets option enabled. No major x86 distro, with the exception of maybe alpine linux, enables it by default though.

here is a template that I use to parse desktop files

Code: Select all

#!/bin/ash
#Copywrong 2011 Brad Conroy - released to the public domain
#Parse through the .desktop files, generate use{ful,less} output
#Though they have Var=value structure, we cannot source them directly (damn) because:
#1. many Vars have illegal characters ... but not the ones we actually want (for now - see TODO)
#2. many values have spaces
#Solution
#1. grep for only the fields that we want to eliminate problem #1
#2.(a) use sed to put quotes after the = and at the end of the line to accommodate spaces
#2.(b) ayttm (possibly others) has a non-standard set of quotes so add a sed for that (could tr -s '"',but sed is already called)
#But wait we didn't _actually_ modify the file, so we can't source it - what a waste of time
#That's ok, we can just use eval on a variable (or in this case a return) to do the same thing
#
#Ok now we have the values, but what to do with them?  We will just output them to an easily awkable file
#but writing to a file is slow, so store it in a Variable in-loop and write once out-of-loop
#... at this point you could instead add a single case statement to generate menu entries or anything useful
#TODO add parameters to the sed/grep parts to localize if available (trivial, but would complicate the demo)
for x in /usr/share/applications/* ; do
    eval `grep -E ^Name=\|^Categories=\|^Comment=\|^Icon= $x |sed "s/=/=\"/g ; s/$/\"/g ; s/\"\"/\"/g"`
    OUTPUT=${OUTPUT}${Name:-UNDEFINED}\|${Categories:-UNDEFINED}\|${Comment:-UNDEFINED}\|${Icon:-UNDEFINED}"\n"
done
echo -e $OUTPUT
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#17 Post by technosaurus »

Code: Select all

#this time with fgrep, it needs newline separators and doesn't support "^" (beginning of line)
#This pulls in GenericName and other problem children ... just for demo, then back to grep -E or egrep
#... thus the 4th sed (a blank entry becomes ="$ so fix it)
#This time we will thread the parsing of each file - it is ~10x faster but borks sorting by *.desktop name
#perhaps we want to sort by app name or category ... or, if you are using bash, store them in an mxn array

String="Name=
Categories=
Comment=
Icon=
Exec=" 

parse(){ #by pulling this out of the main loop we can fork it with the & and run many simultaneously
eval $(fgrep "$String" ${1} |sed 's/=/=\"/g ; s/$/\"/g ; s/\"\"/\"/g ; s/=\"$/=\"\"/')
echo ${Name:-UNDEFINED}\|${Categories:-UNDEFINED}\|${Comment:-UNDEFINED}\|${Icon:-UNDEFINED}\|${Exec:-UNDEFINED}
}

printf "" > /tmp/file #start with a clean file
for x in /usr/share/applications/* ; do
 parse "${x}" 2>/dev/null >>/tmp/file &
done

#hmm the recursing of the files is done, but what about all of the threads?
#lets try sorting the file and see?
#sort file > ${1:-sortedfile}
#nope missing entries, need to wait ... but how long
#don't just make a sufficiently long sleep for _your_ box
#we will just wait for sed to complete

while pidof sed >/dev/null ; do
printf . #as long as sed is running, print a dot every 10 milliseconds
sleep 0.01
done
sort /tmp/file > ${1:-sortedfile}
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#18 Post by amigo »

"grep -F by default". Yeah, that's okay. Of course it's an extra process since it's a wrapper. On my older Slack, fgrep is a *link* to grep. But on my own kiss-linux I install the real binary of fgrep. It's a smaller bin so has less latency.

Techno's bit of code makes a good example. I compared it to a pure-shell solution which gives the same output:

Code: Select all

#!/bin/ash
# this runs fine on 'real' ash and should work with bb ash
#!/usr/bin/bsh
# It will also run using 'bsh', the 'real' Bourne shell
# Quite obviously it would run with any bash, as well.


#Copywrong 2011 Brad Conroy - released to the public domain 
#Parse through the .desktop files, generate use{ful,less} output 
#Though they have Var=value structure, we cannot source them directly (damn) because: 
#1. many Vars have illegal characters ... but not the ones we actually want (for now - see TODO) 
#2. many values have spaces 
#Solution 
#1. grep for only the fields that we want to eliminate problem #1 
#2.(a) use sed to put quotes after the = and at the end of the line to accommodate spaces 
#2.(b) ayttm (possibly others) has a non-standard set of quotes so add a sed for that (could tr -s '"',but sed is already called) 
#But wait we didn't _actually_ modify the file, so we can't source it - what a waste of time 
#That's ok, we can just use eval on a variable (or in this case a return) to do the same thing 
# 
#Ok now we have the values, but what to do with them?  We will just output them to an easily awkable file 
#but writing to a file is slow, so store it in a Variable in-loop and write once out-of-loop 
#... at this point you could instead add a single case statement to generate menu entries or anything useful 
#TODO add parameters to the sed/grep parts to localize if available (trivial, but would complicate the demo) 
techno() {
for x in /usr/share/applications/* ; do 
    eval `grep -E ^Name=\|^Categories=\|^Comment=\|^Icon= $x |sed "s/=/=\"/g ; s/$/\"/g ; s/\"\"/\"/g"` 
    OUTPUT=${OUTPUT}${Name:-UNDEFINED}\|${Categories:-UNDEFINED}\|${Comment:-UNDEFINED}\|${Icon:-UNDEFINED}"\n" 
done 
echo -e $OUTPUT
}
#time techno
# On my 700mHz P-III the above runs in
#real    0m0.882s
#user    0m0.380s
#sys     0m0.430s
# on a dir with 56 items
# Hmm, that echo -e may not be entirely portable, but you could use printf or /bin/echo instead

#Copyright Gilbert Ashley <amigo@ibiblio.org>
amigo() {
> test.file
for DESKTOP_FILE in /usr/share/applications/* ; do
#for DESKTOP_FILE in /usr/share/applications/Editra.desktop ; do
	OUT=
	while read LINE ; do
		case $LINE in
			Name=*) NAME="${LINE#*=}"'|'  ;;
			Comment=*) OUT=$OUT"${LINE#*=}"'|'  ;;
			Icon=*) ICON="${LINE#*=}"'|'  ;;
			#Terminal=*)
			#Type=*)
			Categories=*) CATS="${LINE#*=}"'|'  ;;
			Exec=*) EXEC="${LINE#*=}"'|'  ;;
			#Comment=*) COMM="${LINE#*=}"  ;;
		esac
	done < $DESKTOP_FILE
	echo $NAME$ICON$CATS$EXEC
	# To test the extract function below, use the following line instead of above
	# echo $NAME$ICON$CATS$EXEC >> test.file
done
}

time amigo
# On my 700mHz P-III the above runs in
#real    0m0.160s
#user    0m0.130s
#sys     0m0.010s
# on a dir with 56 items

# And about 'awkable', there again we can do that with pure shell:
extract() {
while read LINE ; do
	# skip the line if NULL
	case $LINE in '') continue ;; esac
	
	# more conventional approaches awk:
	#echo $LINE | awk -F '|' '{ print $4 }'
	#real    0m0.684s
	#user    0m0.290s
	#sys     0m0.330s
	
	# and with cut:
	#echo "$LINE" |cut -f4 -d'|'
	#real    0m0.613s
	#user    0m0.200s
	#sys     0m0.350s

	( IFS='|' ; set ${LINE} ; echo $4 )
	# Using ash:
	#real    0m0.298s
	#user    0m0.070s
	#sys     0m0.170s
	# Using bash:
	#real    0m0.296s
	#user    0m0.080s
	#sys     0m0.120s
	# Using bsh:
	#real    0m0.285s
	#user    0m0.060s
	#sys     0m0.110s

done < test.file
}

time extract

Should run even faster using bb ash since it's all a single process. I'd point out that, even though you are using bb grep and sed (right?), you are still running a separate process for each one.

PANZERKOPF
Posts: 282
Joined: Wed 16 Dec 2009, 21:38
Location: Earth

#19 Post by PANZERKOPF »

technosaurus wrote: busybox_prefer_applets option enabled.
That is a great option, but it seems we must mount /proc before using it.
After making this in sysinit:
/bin/busybox mount -t proc proc /proc
We can call any builtin application directly.
Is that right?
SUUM CUIQUE.

User avatar
technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#20 Post by technosaurus »

PANZERKOPF wrote:
technosaurus wrote: busybox_prefer_applets option enabled.
That is great option but seems we must mount /proc before using it.
After making this in sysinit:
/bin/busybox mount -t proc proc /proc
We can call any builtin application directly.
Is that right?
I don't know about _any_, but most (some _may_ need other stuff like sys/dev or specific files/nodes); however, if you have busybox ash or hush as /bin/sh and a shell-compliant script relies on standard utils, it will fail ... you'd need to mod it from util <args> to /path/to/util <args> for those instances ... which is why no major distro is doing it
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].

Post Reply