gawk
I'm hoping someone can help me with this gawk problem I have.
Create two small files.
-----------------Cut------------------------------
Filename: testgawk with contents:
Uppercase
uppercase
-------------------Cut----------------------------
Filename: test-gawk.sh with contents:
#!/bin/sh
gawk '/Uppercase/ { print $0 } BEGIN {
IGNORECASE = 1
}
' testgawk
--------------------------Cut---------------------
Make test-gawk.sh executable.
When test-gawk.sh is run in a terminal you get:
Uppercase
uppercase
Now change IGNORECASE = 1 to IGNORECASE = 0
Run test-gawk.sh again, this time you get:
Uppercase only.
Now this is correct because only /Uppercase/ is matched.
If you were to change /Uppercase/ to /uppercase/, only
uppercase would be found.
IGNORECASE = 0 is case sensitive (default)
IGNORECASE = 1 means ignore case.
Sorry about the long explanation, here comes the issue.
I want to be able to assign a variable to IGNORECASE.
E.g.: IGNORECASE = variable
I also want to be able to define this variable outside of the
gawk program but use it inside the gawk program.
Something like this:
#!/bin/sh
variable="1"
gawk '/Uppercase/ { print $0 } BEGIN {
IGNORECASE = variable
}
' testgawk
Sounds easy, doesn't it? But it's not. Yes, I can hear you thinking,
"why not put a $ before variable?" Nope, gawk doesn't like that.
I would love a solution on this. I don't need an alternative solution
with awk, sed, grep or anything else. It has to be gawk.
TIA.
Strange. It is possible to set a variable in an AWK script by passing a name=value pair with the -v switch, yet when I tried it with gawk, IGNORECASE could be set from a constant (i.e. IGNORECASE=1) but not from a variable (IGNORECASE=variable). Perhaps it's just a bug in the version of gawk I have. Busybox awk worked as advertised.
Code:
#!/bin/sh
AWKPROGRAM='BEGIN { IGNORECASE=casevar ; } /Uppercase/ { printf("\nIGNORECASE=%s, %s, casevar= %d\n",IGNORECASE,$0,casevar); } '
#
echo -e "\n\n\tDoesnt work:\n"
caseval=0
echo "UPPERCASE" | gawk -v casevar=$caseval "$AWKPROGRAM"
caseval=1
echo "UPPERCASE" | gawk -v casevar=$caseval "$AWKPROGRAM"
#
# works (busybox awk)
#
echo -e "\n\n\tWorks in busybox awk:\n"
caseval=0
echo "UPPERCASE" | awk -v casevar=$caseval "$AWKPROGRAM"
caseval=1
echo "UPPERCASE" | awk -v casevar=$caseval "$AWKPROGRAM"
#
echo -e "\n\n\tWorks in Gawk if IGNORECASE is set by a constant:\n"
AWKPROGRAM='BEGIN { if (casevar) { IGNORECASE=1 } ; } /Uppercase/ { printf("\nIGNORECASE=%s, %s, casevar= %d\n",IGNORECASE,$0,casevar); } '
caseval=0
echo "UPPERCASE" | gawk -v casevar=$caseval "$AWKPROGRAM"
caseval=1
echo "UPPERCASE" | gawk -v casevar=$caseval "$AWKPROGRAM"
The result I got from the above was:
Doesnt work:
IGNORECASE=0, UPPERCASE, casevar= 0
IGNORECASE=1, UPPERCASE, casevar= 1
Works in busybox awk:
IGNORECASE=1, UPPERCASE, casevar= 1
Works in Gawk if IGNORECASE is set by a constant:
IGNORECASE=1, UPPERCASE, casevar= 1
Hi guys.
For the sake of argument, let's say that I have a list of members for a
small stamp collectors club. There are, say, 15 of us. And there are two
named "Peter".
The list is this format: Last-Name First-Name Street-Address E-Mail
To fish out the coordinates of the two Peters, I type this at console:
Code:
A=Peter;awk '$2 ~ /'$A'/ { print }' StampCollClub.lst
First I define var. A.
Then, to query on field #2 (the "First-Name" field), inside the awk line, I
suspend awk's apostrophes around var. $A to let var. $A be "absorbed"
by the awk line as definition of field #2.
What happens is that the quoted awk program text is suspended for a moment,
so bash re-surfaces to provide the contents of the $A variable. And then we have
another apostrophe, which ends the suspension, and the awk program text resumes.
(Of course I could've used the name "Peter" itself in awk to define field #2,
but there are times in bash scripting where we need the above formulation.)
I can't remember where I found this trick, but it's been fool-proof for me
ever since.
-v never worked for me, and I found that inserting an awk variable after
the awk formula but before the file name was iffy.
I hope this answers Smokey01's question.
BFN.
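To make the quoting mechanics described above easier to see, here is a minimal sketch that only prints the program text awk would receive. The helper name show_program is made up for illustration. As far as I can tell, awk itself is never paused: the shell joins the three adjacent pieces into one argument before awk even starts.

```shell
#!/bin/sh
# show_program is a hypothetical helper: it prints the awk program text
# that results from splicing a shell value between two quoted pieces.
show_program() {
    A=$1
    # Three adjacent pieces:  '$2 ~ /'  +  value of A  +  '/ { print }'
    # The double quotes around $A keep a value with spaces in one piece.
    printf '%s\n' '$2 ~ /'"$A"'/ { print }'
}

show_program Peter
```

Run as is, this prints $2 ~ /Peter/ { print }, which is exactly what awk would be asked to execute in the one-liner above.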
musher0
~~~~~~~~~~
"You want it darker? We kill the flame." (L. Cohen)
Code:
A=Peter;awk '$2 ~ /'$A'/ { print }' StampCollClub.lst
Interesting, though the code would need to be refined to assign IGNORECASE from a variable and take its value into account when pattern matching.
In fact there is more than one way to pass variables from a shell script to an AWK program.
One way is to pass a command-line argument and read it via ARGV[n]. smokey01 wanted to use a variable, so I presume that rules out this method.
It would also be possible to export a variable and read its contents through AWK's ENVIRON["VARIABLE"] array.
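As a rough illustration of the ENVIRON route, the sketch below exports a shell variable and reads it in the BEGIN block. The names MYCASE and match_with_env are made up for this example, and the fallback to plain awk when gawk is not installed is my own assumption to keep the sketch runnable. IGNORECASE only has an effect in gawk.

```shell
#!/bin/sh
# Assumption: prefer gawk, because IGNORECASE is a gawk extension.
# Other awks treat it as an ordinary variable and stay case sensitive.
AWK=$(command -v gawk || command -v awk)

# match_with_env and MYCASE are hypothetical names for this sketch.
match_with_env() {
    MYCASE=$1
    export MYCASE    # exported, so the awk process inherits it
    printf 'Uppercase\nuppercase\n' |
        "$AWK" 'BEGIN { IGNORECASE = ENVIRON["MYCASE"] + 0 }  # +0 forces a number
                /Uppercase/ { print }'
}

match_with_env 0    # case-sensitive match
match_with_env 1    # case-insensitive match when run with gawk
```

With gawk, the first call should print only Uppercase and the second call both lines. The + 0 is there so the value is used as a number and not as a string.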
And of course the main loop can dispense with the /<regexp>/ method of selecting which lines to print. Mostly I avoid the slash notation and instead move the logic into the main loop. Below is an example AWK program which I use to search multiple zip files for a search string.
It works by piping the output of unzip -l into the program, where the main loop reads it, and passing the name of the file being searched and the string to be matched via ARGV[]:
Code:
unzip -l "$ZIPFILE" | searchzip.awk "$ZIPFILE" "$SEARCHSTRING"
Code:
#!/bin/awk -f
#
# Case insensitive index function:
#
function indexi(p1,p2) {
    return int(index(toupper(p1),toupper(p2)));
}
#
BEGIN {
    IGNORECASE=1;
    fname=ARGV[1];
    searchstring=ARGV[2];
    ARGC=1;    # stop awk from treating the two arguments as input files
    multimatch=0;
    # A comma in the search string means: both terms must be present
    if (searchstring ~ /,/ ) {
        multimatch=1;
        split(searchstring,search_string2,",");
    }
} # E N D (BEGIN)
{
    if (multimatch) {
        if (indexi($0,search_string2[1]) > 0 && indexi($0,search_string2[2]) > 0)
            { printf("\nFN= %s,\t%s",fname,$0); }
    } else {
        if (indexi($0,searchstring) > 0 )
            { printf("\nFN= %s,\t%s",fname,$0); }
    }
}
smokey: Try this
IGNORECASE=myawkcasevar+0
Code:
#!/bin/sh
#MYSHELLCASECTL=0 # obey case
MYSHELLCASECTL=1 # ignore case
echo "smokeyscase
sMoKeYsCaSe"|awk -v myawkcasevar=$MYSHELLCASECTL 'BEGIN{IGNORECASE=myawkcasevar+0;}
/smokeyscase/{print;}' #>outRES
#expected
#MYSHELLCASECTL=0 -> smokeyscase
#
#MYSHELLCASECTL=1 -> smokeyscase
# -> sMoKeYsCaSe
IGNORECASE has to be a number,
awk assumes that myawkcasevar is a string -
we typecast myawkcasevar into a number by doing a number operation
---------------
Musher0:
It's a bad, inefficient approach, injecting shell vars into awk code. Use the -v switch or read into ARGV instead.
---------------
Edit: Just edited a typo
Last edited by some1 on Wed 30 Aug 2017, 13:41, edited 1 time in total.
some1 wrote:
smokey: Try this
IGNORECASE=myawkcasevar+0
(...)
@some1, success. This is actually working. I can declare a variable external to gawk and it's seen inside gawk as long as I export it like:
export MYSHELLCASECTL=1
or
export MYSHELLCASECTL=0
If I place it in an if/then statement like this it fails:
Code:
case_yes () {
MYSHELLCASECTL=0
}
export -f case_yes
#
case_no () {
MYSHELLCASECTL=1
}
export -f case_no
Code:
<menuitem label="Case Sensitive Search" checkbox="false">
<variable>CHECKBOX</variable>
<action>echo Checkbox is $CHECKBOX now.</action>
<action>"if [ $CHECKBOX = true ]; then
case_yes
echo $MYSHELLCASECTL
else
case_no
echo $MYSHELLCASECTL
fi"</action>
<variable>CHECKBOX</variable>
</menuitem>
Thanks
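One possible explanation, offered as an assumption and not as a diagnosis: if each <action> is run in a child shell of its own, then whatever case_yes or case_no assigns is gone as soon as that action finishes. The sketch below imitates this with sh -c and shows one workaround, passing the value through a file. STATE_FILE is a made-up name.

```shell
#!/bin/sh
# Assumption: each <action> runs in its own child shell, imitated here
# with "sh -c". STATE_FILE is a hypothetical name for this sketch.
MYSHELLCASECTL=0

# The assignment happens in the child and is lost when the child exits.
sh -c 'MYSHELLCASECTL=1'
echo "after child shell: $MYSHELLCASECTL"

# Workaround sketch: the child writes the value to a file ...
STATE_FILE=$(mktemp)
sh -c 'echo 1 > "$1"' sh "$STATE_FILE"

# ... and the parent reads it back just before calling gawk.
read -r MYSHELLCASECTL < "$STATE_FILE"
echo "after reading state file: $MYSHELLCASECTL"
rm -f "$STATE_FILE"
```

The first echo still reports 0 and the second reports 1. Whether this matches what happens inside the dialog depends on how the actions are actually executed there.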
some1 wrote:
smokey: Try this (...)
---------------
Musher0:
Its a bad,inefficient approach -injecting shell-vars into awk-code.Use the -v switch or read into ARGV instead.
---------------
(...)
Hi some1.
Please provide proof that it is bad. "From-the-pulpit arguments" do not
work with me. It's called "code injection", and it's very efficient!
Please see answer #1 here for an example (towards the bottom of
the page).
@Smokey01:
I don't know if it's relevant to your problem, but bash "text variables" can
be made uppercase or lowercase at will. Hopefully you know this trick?
Code:
A=HELLO;echo "${A,,} --- ${A,}"
Result: hello --- hELLO
Code:
B=hello;echo "${B^^} --- ${B^}"
Result: HELLO --- Hello
Ref.: The Bash Manual (https://linux.die.net/man/1/bash/), the
paragraph entitled "case modification".
BFN.
musher0
- MochiMoppel
- Posts: 2084
- Joined: Wed 26 Jan 2011, 09:06
- Location: Japan
@some1: musher0 asked a valid question about your flat rejection of variable injection.
IMO this is an important issue and needs some explanation from you or anybody else. I understand the potential security issue involved. Without knowing smokey01's code it's hard to say if this argument applies, but chances are that there is no risk at all. It would have made smokey01's task very easy. You also mentioned somewhere that variable injection into awk could create "speed bumps" - I didn't understand that. Enlighten us.
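For readers wondering what the practical difference is, here is a small sketch under stated assumptions: the member list, the function names spliced and passed, and the odd second value are all invented for illustration. It compares splicing a value into the program text with handing it over through -v.

```shell
#!/bin/sh
# Hypothetical three-line member list (last name, first name).
list='Smith Peter
Jones Peter
Brown Mary'

# spliced: the value becomes part of the awk program text.
spliced() { printf '%s\n' "$list" | awk '$2 ~ /'"$1"'/ { print }'; }

# passed: the value stays data and is only used as a regexp.
passed() { printf '%s\n' "$list" | awk -v pat="$1" '$2 ~ pat { print }'; }

echo "--- plain value, both ways:"
spliced 'Peter'
passed 'Peter'

echo "--- value that contains awk syntax:"
spliced 'Peter/ && $1 !~ /Smith'   # program becomes: $2 ~ /Peter/ && $1 !~ /Smith/
passed 'Peter/ && $1 !~ /Smith'    # just a strange regexp that matches no name
```

With a plain name both forms list the two Peters. With the second value the spliced form silently changes the condition and lists only Jones Peter, while the -v form prints nothing. So the risk depends entirely on where the value comes from; with a value the script sets itself, both behave the same.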
MochiMoppel wrote:
@some1: musher0 asked a valid question about your flat rejection of variable injection.
(...) You also mentioned somewhere that variable injection into awk could create "speed bumps" - I didn't understand that. Enlighten us
Sorry, I don't understand it either.