Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
speeding up scripts
sc0ttman


Joined: 16 Sep 2009
Posts: 2385
Location: UK

PostPosted: Tue 26 Jul 2011, 09:21    Post_subject:  speeding up scripts
Sub_title: what I learned, what I wanna learn
 

I was looking at making a few puppy scripts run faster.
Mostly, I have found many scripts will work using ash.

I am using busybox 1.1.8.4-unicode (from Barry) and bash 4.2

Here's what we can do to make scripts faster...

If we open a shell script and change #!/bin/sh to #!/bin/ash, the script uses 'ash', not 'bash', to execute the code.
The 'ash' shell is part of busybox on most puppies; it is smaller than bash, with fewer features.

Anyway... Ash is faster, but has some problems.

EDIT: A really easy way to check for 'Bashisms' in scripts: checkbashisms.pl!
Just download it, remove the '.gz' and make it executable.

Then just 'checkbashisms /path/to/script' to check a script!

Older info continued below....

I learned that ash (supposedly) doesn't like
Code:
[ ! "$VAR" = "something" ]
Apparently, ash prefers
Code:
[ "$VAR" != "something" ]
I changed these, but actually, it was not necessary.
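Both forms are in fact POSIX and work in busybox ash, dash and bash alike; a quick sketch you can run to confirm (variable and value made up):

```shell
VAR="hello"

# the style ash supposedly dislikes
if [ ! "$VAR" = "something" ]; then
    echo "style 1: not equal"
fi

# the style ash supposedly prefers
if [ "$VAR" != "something" ]; then
    echo "style 2: not equal"
fi
```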


Also, apparently, ash does not support
Code:
${VARNAME%%cutmeofftheend}
or
Code:
${VARNAME##cutmeoffthestart}
but I found this not to be true as well.
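For the record, both expansions are POSIX, so they work in busybox ash too; an illustrative sketch (names made up):

```shell
VARNAME="prefix_middle_suffix"

# strip a suffix
echo "${VARNAME%%_suffix}"    # → prefix_middle

# strip a prefix
echo "${VARNAME##prefix_}"    # → middle_suffix
```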


Another thing, this will not work in ash
Code:
export VAR="some value"
, so use instead
Code:
VAR="some value"; export VAR
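A sketch of the portable form; the child shell is only there to prove the variable really was exported:

```shell
VAR="some value"
export VAR

# a child process now sees it
sh -c 'echo "$VAR"'    # → some value
```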



Also, I think ash does not like:
Code:
function my_function_name () {
instead, use only
Code:
my_function_name () {
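A minimal sketch of the POSIX form (function name and body made up), which works in ash, dash and bash alike:

```shell
# POSIX function definition: no 'function' keyword
my_function_name() {
    echo "called with: $1"
}

my_function_name foo    # → called with: foo
```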


I also learned that using the following near the start of scripts
should make things faster, but only use it if the script doesn't depend on locales.

Code:
OLDLANG=$LANG
LANG=C
Then you must do the following near the end:

Code:
LANG=$OLDLANG


However... I have hit some problems, that if I can correct, will make ash more widely usable.

1. Is it possible to execute a command defined in the main script, from within a GTK-Dialog button? If so, that would be great, but at the moment, anything like this in the GTKDialog GUI
Code:

<button>
<action>my_function_name</action>
</button>
simply returns (in the terminal)
Code:
my_function_name: command not found


Can anyone fix this for ash scripts?

2. Does anyone know anything else that needs to be done to make bash scripts work in ash?
Which other features are missing from ash?

EDIT: Arrays are missing. See 3rd post, below.
checkbashisms.gz
Description: fake .gz, just remove the .gz extension and make executable
Filesize: 20.76 KB
Downloaded: 360 Time(s)

_________________
Akita Linux, VLC-GTK, Pup Search, Pup File Search

sc0ttman


Joined: 16 Sep 2009
Posts: 2385
Location: UK

PostPosted: Tue 26 Jul 2011, 09:36    Post_subject: ash vs bash
Sub_title: more thoughts
 

More thoughts..

1. We could export certain parts of non-ash compatible scripts to an ash script, which is created and made executable on the fly, then have that ash script run stuff and return its results to the main bash script..

2. I will list scripts which I have successfully converted to ash.

3. I will list scripts that I could not convert, with the errors given.

PANZERKOPF

Joined: 16 Dec 2009
Posts: 280
Location: Earth

PostPosted: Tue 26 Jul 2011, 10:02    Post_subject: Re: speeding up scripts
Sub_title: what I learned, what I wanna learn
 

sc0ttman wrote:

I learned that ash (supposedly) doesn't like
Code:
[ ! "VAR" = "something" ]
Apparently, ash prefers
Code:
[ "$VAR" != "something" ]
I changed these, but actually, it was not necessary.
Also, apparently, ash does not support
Code:
${VARNAME%%cutmeofftheend}
or
Code:
${VARNAME##cutmeoffthestart}
but I found this not to be true.

Strange....
My busybox ash recognizes these constructions.
This construction also works:
Code:
${VARNAME/string/string}


sc0ttman wrote:

2. Does anyone know anything else that needs to be done to make bash scripts work in ash? Which other features are missing from ash?

For example, arrays are missing:
Code:
${VARNAME[INDEX]}

...but do we need them?

_________________
SUUM CUIQUE.
Dougal


Joined: 19 Oct 2005
Posts: 2505
Location: Hell more grotesque than any medieval woodcut

PostPosted: Wed 27 Jul 2011, 15:20    Post_subject: Re: speeding up scripts
Sub_title: what I learned, what I wanna learn
 

sc0ttman wrote:
Also, apparently, ash does not support
Code:
${VARNAME%%cutmeofftheend}
or
Code:
${VARNAME##cutmeoffthestart}
but I found this not to be true.

Those are Bourne-compatible (dash also supports them).
What is a bashism is the ${VAR/a/b} notation. Newer versions of busybox ash support it, but it's not portable.

Quote:
1. Is it possible to execute a command defined in the main script, from within a GTK-Dialog button? If so, that would be great, but at the moment, anything like this in the GTKDialog GUI
Code:

<button>
<action>my_function_name</action>
</button>
simply returns (in the terminal)
Code:
my_function_name: command not found

This is odd. I don't recall encountering that. Did you make sure to export the function?

Quote:
2. Does anyone know anything else that needs to be done to make bash scripts work in ash?
Which other features are missing from ash?

Just search the ABS-guide for bashisms... he's pretty good at pointing them out. I can't really remember them now, but one that comes to mind is extracting a field out of a string: ${VAR:m:n} (or whatever).
Also, "let" is a bashism (though supported by busybox ash).

_________________
What's the ugliest part of your body?
Some say your nose
Some say your toes
But I think it's your mind
technosaurus


Joined: 18 May 2008
Posts: 4353

PostPosted: Sat 30 Jul 2011, 02:42    Post_subject:  

Busybox hush supports brace expansion, while ash doesn't.
Ex. touch /tmp/{a,b}
hush: /tmp/a and /tmp/b are touched
ash: a single file named /tmp/{a,b} is touched (yes, a file starting with "{" could be annoying)

Substring manipulations were mentioned (${VAR##...}, ${VAR%%...}, ${VAR//...}, ${VAR:-default}), but...
With a little creativity they can replace sed, grep, cut, tr, truncate, head, basename, dirname and more. It is almost always faster to manipulate a string than to call an external program.
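For instance, basename and dirname can usually be replaced with pure-shell expansions; an illustrative sketch (the path is made up):

```shell
path=/usr/local/bin/script.sh

base=${path##*/}    # like basename → script.sh
dir=${path%/*}      # like dirname  → /usr/local/bin
ext=${path##*.}     # extension     → sh

echo "$base $dir $ext"
```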

For compatibility and speed, make scripts compatible with
http://busybox.net/downloads/BusyBox.html

Use pipes with programs that can use them.
But don't use cat as the first command in a pipe if the second command can read the file itself.

Use strings instead of files if possible.

If you have to use a temp file, /dev/shm is the fastest location because it is in memory

Complicated if/then/elses can often be reduced to...
[ "$DISPLAY" ] && xmessage desktop || echo console
This format works with any variable _and_ you can include string manipulations.

You can use $((<math>)) for fast integer math or busybox awk for non-integer.
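A quick sketch of both (numbers made up); $(( )) is a shell builtin, so it costs no fork, while awk handles the non-integer case:

```shell
i=7
echo $(( i * 3 ))    # builtin integer math → 21

# non-integer math via awk (busybox awk behaves the same way)
awk 'BEGIN { printf "%.2f\n", 22 / 7 }'    # → 3.14
```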

It is much faster to call a function than a separate script (~20x).

...most importantly, you need to be able to accurately measure the changes

Sorry, no direct examples atm, but here is a way to time functions or external scripts/binaries (time doesn't work on functions)

Code:
time2() {
# time two back-to-back date calls first, so their overhead can be subtracted
STARTDELTA=`date +%s.%N`
STOPDELTA=`date +%s.%N`
STARTTIME=`date +%s.%N`
"$@"
ENDTIME=`date +%s.%N`
awk "BEGIN{print $ENDTIME - $STARTTIME + $STARTDELTA - $STOPDELTA }" >&2
}

_________________
Web Programming - Pet Packaging 100 & 101
alienjeff


Joined: 08 Jul 2006
Posts: 2291
Location: Winsted, CT - USA

PostPosted: Sat 30 Jul 2011, 14:09    Post_subject:  

A couple/three years back, someone on here was experimenting with parsing or optimizing the scripts in Puppy Linux. He used a different term, though. Sadly, I don't recall the username or post. Perhaps someone else has a better memory.
_________________
hangout: ##b0rked on irc.freenode.net
diversion: http://alienjeff.net - visit The Fringe
quote: "The foundation of authority is based upon the consent of the people." - Thomas Hooker

sc0ttman


Joined: 16 Sep 2009
Posts: 2385
Location: UK

PostPosted: Fri 12 Aug 2011, 16:05    Post_subject:  

Thanks for the info guys, helpful as usual.

I found a nice site that details this stuff as well: https://wiki.ubuntu.com/DashAsBinSh


I found a nice little script that seems to work well,
called 'checkbashisms'.. It's a Perl script, runs fine, attached in the main post.

sc0ttman


Joined: 16 Sep 2009
Posts: 2385
Location: UK

PostPosted: Fri 12 Aug 2011, 16:14    Post_subject: Re: speeding up scripts
Sub_title: what I learned, what I wanna learn
 

Dougal wrote:
This is odd. I don't recall encountering that. Did you make sure to export the function?

As usual, I don't really know what's going on, but I used 'set -a', which normally helps the GUIs find them for me... I tried 'export -f', but it's apparently not supported by ash.

Also, I read that [[ blah ]] is a bashism... and then the checkbashisms script I attached in the main post pointed it out to me...

sc0ttman


Joined: 16 Sep 2009
Posts: 2385
Location: UK

PostPosted: Fri 12 Aug 2011, 16:46    Post_subject:  

After running 'checkbashisms' on a number of scripts, it says that

Code:
echo -e "blah blah \n blah"

is not supported .. and to use actual new lines inside the "",
with a \n added to the end of the string to produce the same output (I think)...

EDIT: Also printf can be used... I have made the init script in initrd.gz use the ash shell, replacing all echo -e with printf .. I need to add a static printf to initrd, and then it should work fine - already boots a bit quicker, but the green 'done' bits won't show without printf.

But maybe printf is slower than echo -e in the first place... Hmm...
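A sketch of the replacement (example strings made up); printf interprets \n itself, so it is portable where echo -e is not:

```shell
# bashism:   echo -e "line one\nline two"
# portable equivalent:
printf 'line one\nline two\n'
```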

Anyway, it seems most puppy scripts have only 3 or 4 bashisms in there,
and the fixes are often small and easy. Mostly things like:

echo -e

[[ blah ]] ... also [ blah && blah ]

&>/dev/null

amigo

Joined: 02 Apr 2007
Posts: 2257

PostPosted: Sat 13 Aug 2011, 02:32    Post_subject:  

Much of the time I go the other way and depend more on bash, instead of less. Usually, slowness in scripts is due to poor design - making too many checks, unnecessary loops - or to calling external programs. Using even medium-level bash functionality can replace many of the external calls.

A couple of years ago I wrote a collection of bash scripts which mimic the action of about a dozen common cli tools -like basename, dirname, cut, etc., which are widely used in shell scripts. Pasting these into a bash script *as functions* can really speed things up.
You can have a look here:
http://distro.ibiblio.org/pub/linux/distributions/amigolinux/download/AmigoProjects/BashTrix/

All of those use absolutely no external calls. When writing scripts, I particularly try to avoid calling sed, awk and grep. When you see a one-liner with multiple calls to cat, cut and grep, you should know something is wrong. I see lots of puppy routines which pipe together a bunch of commands - if that is all inside a loop which gets run several times, it really puts a drag on things.
Back to top
View user's profile Send_private_message 
sc0ttman


Joined: 16 Sep 2009
Posts: 2385
Location: UK

PostPosted: Sat 13 Aug 2011, 08:28    Post_subject:  

amigo wrote:
Much of the time I go the other way and depend more on bash, instead of less. Usually, slowness in scripts is due to poor design - making too many checks, unnecessary loops - or to calling external programs. Using even medium-level bash functionality can replace many of the external calls.

A couple of years ago I wrote a collection of bash scripts which mimic the action of about a dozen common cli tools -like basename, dirname, cut, etc., which are widely used in shell scripts. Pasting these into a bash script *as functions* can really speed things up.
You can have a look here:
http://distro.ibiblio.org/pub/linux/distributions/amigolinux/download/AmigoProjects/BashTrix/

All of those use absolutely no external calls. When writing scripts, I particularly try to avoid calling sed, awk and grep. When you see a one-liner with multiple calls to cat, cut and grep, you should know something is wrong. I see lots of puppy routines which pipe together a bunch of commands - if that is all inside a loop which gets run several times, it really puts a drag on things.

Nice, thanks for the link to BashTrix.. I'll have a look...
There's also bashbox by techno, yet to dig deep into that one...

And I know what you mean about making the bash itself faster as a better alternative, but I must face facts: while I can easily do such things in PHP (remove loops, find the fastest built-in methods, etc), sadly my Bash/shell knowledge is pretty damn poor in comparison.. Although I know enough to (try to) reduce the number of external commands ..

Aside from loops, cat, awk, grep and sed, are there any other time hogs we should avoid? And what should we use instead? If statements are not resource or time intensive, right? (providing the test is a simple one of course)..

Is using . /path/to/functions as quick as having them in the main file?

amigo

Joined: 02 Apr 2007
Posts: 2257

PostPosted: Sat 13 Aug 2011, 15:17    Post_subject:  

I'll start from your last question - sourcing a file obviously takes longer than reading it (the content) directly inline. But sourcing a common set of code has its advantages.

About BashTrix and bashbox - long before technosaurus came out with bashbox, I had thought of producing a 'bashybox' - a multi-call thingy just like he came up with, incorporating all the BashTrix functionality in one long script. But I tired of working on BashTrix, as I have my plate full anyway. To be clear, I don't mean to take any credit or proprietorship of techno's ideas and work. Nor do I mean to talk it down. The thing about bashbox is that they are pretty much (all?) puppy-specific utilities which have been rolled into one. BashTrix contains replacements for general-usage GNU utilities - which can also be incorporated into your bash scripts at will. Each one is independent of the others - purposely - though most of them contain some code duplicated in others. The code is always significantly longer than is really needed, but they are meant to be easy to read, understand and modify.

Running through loops which iterate through lots of files or lines of a file is where most serious slow-downs occur. When there is no other way to eliminate the loop cycles, it is often better to go ahead and use the more sophisticated binary tools like sed, awk and grep. But try to find a way to minimize the external calls by mixing them with inlined bash functions.

More tips:
case statements are much faster than if statements, and more flexible, except when you need to use test ([[ ]]) to check on file attributes. case statements can duplicate most of the functionality of grep for simple pattern matching.
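For instance, a hypothetical contains() helper (name and strings made up) using case instead of grep:

```shell
# return success if $1 contains the literal string $2 - no external call
contains() {
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

contains "hello world" "lo wo" && echo "match"       # → match
contains "hello world" "xyz"   || echo "no match"    # → no match
```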

When forced to use grep, try to make it possible to use fgrep - it is a smaller binary and uses no regexes (only fixed-string matching, so you may need to adjust the comparison algorithm).

Good general programming practices vis-a-vis the outline/design of your program -spend lots of time working out as many loops and redundant if's as possible.

Reduce file/disk IO as much as possible. Often it is easier to write the output of some loop or grep into a file for later processing by another routine. But if the answer or output from the original processing is just one line, or can be made into an array (or faked array), then it will be faster to simply capture that output into a variable. Example: Instead of 'grep this-pattern * > file', do this:
var=$(grep this-pattern *)
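A self-contained sketch of the same idea (the input lines are made up, fed in with printf so it runs anywhere):

```shell
# instead of writing matches to a temp file, capture them in a variable
var=$(printf 'alpha\nbeta\ngamma\n' | grep a)

echo "$var"    # alpha, beta and gamma, one per line
```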

When using grep, limit the response to a single match when possible, using the '-m1' option. For instance, you want to know if a certain pattern occurs in a file or files which may contain many matches. Using the -m (number of matches) will make grep exit once the first match is found instead of continuing through the whole file.

That said, good form and good comments are more important. If you write good comments about what your routine should be doing, you'll find it easier to improve the routine later when you have more experience. Fine optimization should be tackled later, but good planning at the first should give you a better platform to work from. It's usually good to do a first stab at the script, then start over from scratch at least once -before you start adding all the feature creep.

If you need to use 'which' to locate programs, use it only once for each program and store the result in a variable. Speaking of 'which', you can use the bash built-in 'type' to do that for you and it will also show you any functions which have been set up.

Shell programming and speed are not really meant to go together, so don't write hard-to-understand, fancy one-liners just for the sake of a few micro-seconds. The beauty of shell programming is the ease of altering/correcting the code and being able to quickly check the result.

Although there are many advanced features of bash (shopt & Co., arrays, etc) I generally avoid them -your bash may have some options compiled in which others don't have. And besides, the less-used features are less readable and you are likely to need to re-study them each time you want to use them -or have to modify code you wrote earlier.

Avoid lengthy multi-call pipes:
cat file.txt | while read LINE ; do blah blah | cut | grep | grep -v | awk | sed
One puppy routine (for working with package repos) was using a loop which, in each iteration, ran a series of 27 piped commands! No wonder some contributor was able to write a small C program which ran much faster. Once, I had written a routine which was looping through some input from hundreds of files. The routine was using multiple if statements. When I re-wrote it to use a single case statement, it ran in 3% of the time of the original!

Somewhere here on the forum is a thread which shows several examples of duplicating grep behaviour using case. The grep replacement in BashTrix shows the most basic 'starts-with', 'ends-with' and 'contains' functionality (at the end of the script). Oddly, when I started writing BashTrix, one of the first programs I tackled was 'cut', which turned out to be the hardest one (except for wget). But the bash cut lets you specify delimiters of more than one char, so it is very useful. Elsewhere, I have written small functions named lof, lol, rof and rol - that's left-of-first, left-of-last, right-of-first and right-of-last. Using bash to process strings is much, much faster than echo-cut-rev-cut-rev, and faster than sed too.
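The lof/lol/rof/rol names are from the post above; these one-line implementations are my guess at them, using only parameter expansion (the quoted "$2" keeps the delimiter literal even if it contains glob characters):

```shell
lof() { echo "${1%%"$2"*}" ; }   # left-of-first
lol() { echo "${1%"$2"*}"  ; }   # left-of-last
rof() { echo "${1#*"$2"}"  ; }   # right-of-first
rol() { echo "${1##*"$2"}" ; }   # right-of-last

lof "a.b.c" "."    # → a
lol "a.b.c" "."    # → a.b
rof "a.b.c" "."    # → b.c
rol "a.b.c" "."    # → c
```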
Dougal


Joined: 19 Oct 2005
Posts: 2505
Location: Hell more grotesque than any medieval woodcut

PostPosted: Sun 14 Aug 2011, 10:21    Post_subject:  

sc0ttman wrote:
After running 'checkbashisms' on a number of scripts, it says that

Code:
echo -e "blah blah \n blah"

is not supported .. and to use actual new lines inside the ""

Yeah, the builtin echo in dash always acts as "echo -e"... I just hacked dash to ignore the "-e".

Note that with Busybox ash this is not a problem.

A simple workaround to this problem is to replace all instances of "echo" with "\echo", which will ensure that /bin/echo is used, but will slow down your script...

Dougal


Joined: 19 Oct 2005
Posts: 2505
Location: Hell more grotesque than any medieval woodcut

PostPosted: Sun 14 Aug 2011, 10:54    Post_subject:  

amigo wrote:
Much of the time I go the other way and depend more on bash, instead of less. Usually, slowness in scripts is due to poor design-making too many checks, unnecessary loops or from calling external programs. Using even medium-level bash functionality can replace many of the external calls.

I generally agree with this, but have some reservations.

Doing things smartly is obviously the best thing (see my x10 speed improvement of the xorgwizard from a few years ago).

Using builtins is also best wherever possible, even if the code ends up a little longer -- just like when coding in C you don't keep using system()... The way I look at it, grep, for example, is a C program optimized for finding strings fast, while the shell is a C program optimized for processing shell (builtin) commands fast... so more LOC is just fine as long as there are no subshells (and IO! Aggregate things you want to echo, if possible).

However, it doesn't have to be bashisms, Bourne shell builtins can solve most problems and sometimes the speed penalty of using Bash only for one place where the bashism is needed (and is not speed critical) makes it not worthwhile.

You also want to think of the amount of data you need to handle: if you need to find something in a short text file, a loop with a case structure and shell internals might be fastest, but at some stage it becomes faster to just open a subshell and use grep, as it is, after all, optimized for that. (A few months ago I actually implemented things like awk '{print $1}' in shell (based on positional parameters) and tested and saw when it became faster to just call awk... I PMed it to Zigbert and don't seem to have it anymore.)
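Something along the lines Dougal describes might look like this (my reconstruction, not his actual code): word splitting into the positional parameters does the job awk '{print $1}' would, with no fork at all.

```shell
# shell-only equivalent of awk '{print $1}' for a single line
first_field() {
    set -f          # disable globbing so words aren't expanded
    set -- $1       # re-split the line into $1, $2, ...
    set +f
    echo "$1"
}

first_field "alpha beta gamma"    # → alpha
```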

Dougal


Joined: 19 Oct 2005
Posts: 2505
Location: Hell more grotesque than any medieval woodcut

PostPosted: Sun 14 Aug 2011, 14:55    Post_subject:  

amigo wrote:
When forced to use grep, try to make it possible to use fgrep -it is a smaller binary and uses no regex's (only pattern matching so you may need to adjust the comparison algorithm).

Hmm
Code:
# cat /bin/fgrep
#!/bin/sh
exec grep -F ${1+"$@"}

I've just trained myself to type "grep -F" by default...

Generally agree with the rest.
