Anthony Mellor

AWK Why does this work and this not? Add a field to existing file, multiplied x another existing field.

Why does the first one not work at all (no output) and why does the second one work?
awk '{print $1,$1*12+1}' in.txt > out2.txt
awk '{newval=$1*12+1; print $1,newval}' in.txt > out.txt

Top one doesn't work, second one does, though it seems to miss adding a comma separator so I am thinking I need OFS=, in there somewhere.
And: why does the second one work in that it repeats the existing file and adds a field, when it refers only to $1 ?
Is there an implied $0 somehow?

The idea is to add a field to an existing file, the new field being an existing field times 12, plus one.

So starting with

1,2,3

the result, if say $2 is the field to be multiplied and the result is appended to the line, would be

1,2,3,25

edit: so I have found this edited version of the first one above does work:
awk '{print $0,$1*12+1}' in.txt > out3.txt

except the result 25 is not comma separated, not sure what it is, I guess the default OFS .. space?
Bill Prew

You are correct that OFS defaults to a single space, and can be changed to a comma if you wanted that between fields in a print statement.

~bp
Keep in mind that the default input separator FS is a space, so if you want to split comma delimited input you need to set FS to a comma.

~bp
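
To make the two separators concrete, here is a minimal sketch using the sample line from the question. It assumes the target field is $2, as in the question's example, and sets the separators with the standard -F and -v options.

```shell
# Sample input from the question: one comma-separated line.
# -F, sets the input field separator; -v OFS=, sets the output separator.
result=$(printf '1,2,3\n' | awk -F, -v OFS=, '{ print $0, $2 * 12 + 1 }')
echo "$result"   # prints 1,2,3,25
```

Printing $0 keeps the original line untouched, and the comma in the print statement is what inserts OFS before the new value.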
SOLUTION
Gerwin Jansen
Anthony Mellor

ASKER

When I am getting no output, is it possible that is because I have not set the FS to comma? So when it looks for fields it only finds one, or even none? Can't be none.. and .. so.. in my first (later) example the field is 2, so 12 x 2 plus one is 25, and the output actually looks like this:

my later input example is 2,4,6
and when I get output it looks like this:

2,4,6  25

so is it seeing only one field and acting on the first number it finds (2) and then adding its default space delimiter and then the result 25?

I don't suppose one can change the default to a comma permanently? I never use spaces as delimiters.
Right, you are correct that without changing the FS character AWK will parse on spaces, so all your comma delimited fields will be in $1 (as long as there are no spaces in the data).
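
A quick way to see this is to print NF, the number of fields awk found on each line. A small sketch using the 2,4,6 sample from the thread:

```shell
line='2,4,6'
# Default FS: the line contains no whitespace, so awk sees a single field.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')
# FS set to a comma: the same line splits into three fields.
comma_nf=$(printf '%s\n' "$line" | awk -F, '{ print NF }')
echo "default: $default_nf, comma: $comma_nf"   # prints default: 1, comma: 3
```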

I'm not aware of a generic way to override the AWK default of space.  There may be ways for specific AWK implementations, and I know you are on a Mac so I can't offer too much specific there.

You may be able to create a command alias on your system, I know in un*x bash shell you could do something like:

alias awkc='awk -F","'

Then instead of running your scripts with awk you could use awkc and it would already have the change of the FS to a comma for that session.  I'm not a Mac expert though so you would have to research that approach further.

~bp
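
As a possible alternative to the alias, a shell function does the same job and also works inside scripts, where bash does not expand aliases by default. This is only a sketch: the name awkc is borrowed from the alias above and is not a real command, and setting OFS as well is an assumption about what is wanted.

```shell
# Hypothetical helper: awk with comma as both input and output separator.
awkc() {
    awk -F, -v OFS=, "$@"
}

result=$(printf '2,4,6\n' | awkc '{ print $0, $1 * 12 + 1 }')
echo "$result"   # prints 2,4,6,25
```

The function would normally go in a shell startup file such as ~/.bash_profile so that it is defined in every new session.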

no there weren't any spaces in my test data, so awk selected the first number it came across in the single field it could see, which is interesting in itself.
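
That behaviour comes from how awk converts a string to a number in arithmetic: it uses the leading numeric part of the string and ignores the rest. A short sketch with the values seen in this thread, plus one made-up non-numeric input (abc) for contrast:

```shell
# With the default FS, $1 is the whole string "2,4,6"; its numeric prefix is 2,
# so the result is 2 * 12 + 1 = 25.
a=$(printf '2,4,6\n' | awk '{ print $1 * 12 + 1 }')
# "1a" has the numeric prefix 1, so the result is 13.
b=$(printf '1a\n' | awk '{ print $1 * 12 + 1 }')
# A string with no leading number converts to 0, so the result is 1.
c=$(printf 'abc\n' | awk '{ print $1 * 12 + 1 }')
echo "$a $b $c"   # prints 25 13 1
```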

so I tried this, but it's left me at a > prompt... with no apparent q exit quit or escape.. aha! Ctrl-C , that's a rave from the grave.

 awk ‘FS=“,”; OFS=“,”; {print $0,$1*12+1}' DfTRoadSafety_Accidents_2008.CSV > DfTRoadSafety_Accidents_2008-out2.CSV


must be syntax, see if I can find it before a reply appears..

awk ‘FS=“,”; OFS=“,”; {print $0,$1*12+1}' in.txt > out2.txt


same error = {} means actions and selections (forget the correct word) don't have {}s

maybe this:

awk ‘BEGIN{FS="," OFS=“,”;} {print $0,$1*12+1}' DfTRoadSafety_Accidents_2008.CSV > DfTRoadSafety_Accidents_2008-out2.CSV


but why is BEGIN required?
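
On the BEGIN question: a BEGIN block runs once, before the first input line is read and split into fields, which is why separators are usually set there. The -F and -v command-line options take effect just as early. A minimal sketch, with straight quotes and the 2,4,6 sample line, showing that the two forms behave the same:

```shell
# BEGIN runs before any input is read, so FS and OFS apply from line one.
with_begin=$(printf '2,4,6\n' | awk 'BEGIN { FS = ","; OFS = "," } { print $0, $1 * 12 + 1 }')
# -F and -v assignments are likewise in place before the first line is read.
with_flags=$(printf '2,4,6\n' | awk -F, -v OFS=, '{ print $0, $1 * 12 + 1 }')
echo "$with_begin"
echo "$with_flags"   # both lines print 2,4,6,25
```

Assigning FS inside the main { } rule instead is specified by POSIX to take effect from the next record, so the first line would still be split with the old separator.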

hmm:  -bash: syntax error near unexpected token `}' ah! missing ; after FS !

gave up and put it in a file called YearAdd.awk

‘BEGIN{FS="," ; OFS=“,”;}{ print $0,$1*12+1}' DfTRoadSafety_Accidents_2008.CSV > DfTRoadSafety_Accidents_2008-out2.CSV

then ran

awk -f addyear.awk

and my output file has appeared, not looked in it yet... and.. it's an empty file.
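
For comparison, a file passed to -f is expected to contain only the awk program itself: no surrounding shell quotes, no input filename and no redirection. A minimal sketch, where the temporary directory and the file names addfield.awk, in.txt and out.txt are made up for illustration:

```shell
workdir=$(mktemp -d)

# The script file holds the awk program only.
cat > "$workdir/addfield.awk" <<'EOF'
BEGIN { FS = ","; OFS = "," }
{ print $0, $1 * 12 + 1 }
EOF

printf '2,4,6\n' > "$workdir/in.txt"

# The input file and the redirection stay on the shell command line.
awk -f "$workdir/addfield.awk" "$workdir/in.txt" > "$workdir/out.txt"
result=$(cat "$workdir/out.txt")
echo "$result"   # prints 2,4,6,25

rm -rf "$workdir"
```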

and now:

awk -F "," 'BEGIN {OFS=","} { print $0,$1*12+1}' DfTRoadSafety_Accidents_2008.CSV > DfTRoadSafety_Accidents_2008-out2.CSV

having read Gerwin j's suggestion .. again.. maybe the penny has dropped, it hesitated like it's doing something this time. it's done what it was programmed to do, just not what I intended. And why has it worked with FS missing and just "," present? another "built in" thing?

better now with this:

awk  'BEGIN {FS="," ; OFS=","} { print $0,$1*12+1}' DfTRoadSafety_Accidents_2008.CSV > DfTRoadSafety_Accidents_2008-out2.CSV


But the new field in the output is always on a new line, preceded by a comma. Looks to me like an end of line or new line character is being applied that I don't want. Why?
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Ok, I have newly reborn skills doing that. It certainly looks like a carriage return, Windows is nowhere to be seen. I'll have a look. Suffice it to say there is nothing in the code that generates a new line?
The print statement will add a single "new line" after it prints the data.

I'd also look at the output file in a hex editor and see what that looks like, just for additional clues.

~bp
here it is:
MacPro:awk ADM$ xxd dft.txt
00000000: 4163 6369 6465 6e74 5f49 6e64 6578 2c4c  Accident_Index,L
00000010: 534f 415f 6f66 5f41 6363 6964 656e 745f  SOA_of_Accident_
00000020: 4c6f 6361 7469 6f6e 0d2c 6865 6164 6572  Location.,header
00000030: 3161 0a32 3030 3832 3837 330d 2c32 3030  1a.20082873.,200
00000040: 380a 3230 3038 3238 3139 0d2c 3230 3038  8.20082819.,2008
00000050: 0a 


and there it is: the 0d in front of each added comma is a carriage return (the 0a is the line feed)

so how do I make print emit its 0a after adding the new column?
edit: ok, maybe I have a pre-existing CR in the input, in addition to the line ending that print adds..

here is the input file, with its 0a entries

016388d0: 3230 2c36 2c36 302c 302c 302c 302c 302c  20,6,60,0,0,0,0,
016388e0: 302c 302c 312c 312c 342c 302c 302c 322c  0,0,1,1,4,0,0,2,
016388f0: 312c 0d0a                                1,..

That looks like the output file.  But I do see some 0x0d characters in the data, I suspect they are in the input file as well.  So I think your problem is data driven, not the script or AWK.  It looks like there are 0x0d characters in the input file.

00000000: 4163 6369 6465 6e74 5f49 6e64 6578 2c4c  Accident_Index,L
00000010: 534f 415f 6f66 5f41 6363 6964 656e 745f  SOA_of_Accident_
00000020: 4c6f 6361 7469 6f6e 0d2c 6865 6164 6572  Location.,header
00000030: 3161 0a32 3030 3832 3837 330d 2c32 3030  1a.20082873.,200
00000040: 380a 3230 3038 3238 3139 0d2c 3230 3038  8.20082819.,2008
00000050: 0a
it's Windows!!
I've added a bit of the input file in my post just above. When I saw that I realised, these files are originating from a Windows machine in the UK. That 0d0a is conclusively Windows' signature.

So can I.. that's a new question.
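
For reference, one common way to handle this inside awk is to strip a trailing carriage return from each record before printing. This is a sketch on a simulated CR LF line, not something tested against the DfT file:

```shell
# Simulated Windows-style input: the line ends in CR LF.
# Without any cleanup the CR stays in $0, so the new field lands after it.
raw=$(printf '2,4,6\r\n' | awk -F, -v OFS=, '{ print $0, $1 * 12 + 1 }')
printf '%s\n' "$raw" | od -c   # the dump shows \r sitting before the added comma

# Removing the trailing CR first puts the new field on the same line.
fixed=$(printf '2,4,6\r\n' | awk -F, -v OFS=, '{ sub(/\r$/, ""); print $0, $1 * 12 + 1 }')
echo "$fixed"   # prints 2,4,6,25
```

The output then has plain LF line endings; if the result has to go back to a Windows tool, setting ORS to "\r\n" would restore them.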
Hope honour is satisfied, the Windows data problem has persisted through a number of my questions, and my question answerer has not given up mentioning it. It's nothing to do with AWK at all.
Assisted answer helped me see a light.
Thank you both.
You're welcome, tested on a Mac:

your sample: 2,4,6

is giving me this output: 2,25

with this command: awk -F "," 'BEGIN {OFS=","} {newval=$1*12+1; print $1,newval}' test.txt
what's that "," ?  isn't it FS="," ? Think I was struggling with that, it being why the 4 is missing.
You mean the -F "," or the OFS="," ?

awk -F "," 'BEGIN {OFS="->"} {newval=$1*12+1; print $1,newval}' test.txt

gives this:

2->25
I mean the -F ","

What's that "," doing on its own?
I suspect that was a typo, but not sure.  I would expect that command line was actually:

awk -FS "," 'BEGIN {OFS=","} {newval=$1*12+1; print $1,newval}' test.txt

~bp
No = used for FS ?
Ah yes, sorry, good catch, I was too focused on the variable name.  Definitely need the equals sign.

awk -FS="," 'BEGIN {OFS=","} {newval=$1*12+1; print $1,newval}' test.txt

~bp
ok so my bonus question, why is the FS outside the {} brackets and the OFS inside?
My apologies to @Gerwin, these questions have bounced around so much with so many iterations I'm losing focus, very sorry.

The following is actually correct, there is indeed a -F command line option to set the field separator.  There is also a -f option (lower case f) to read the awk script from a file.  In addition, once in the script you can set the field separator variable by changing the FS variable.  So either of the following should work.

awk -F "," 'BEGIN {OFS="->"} {newval=$1*12+1; print $1,newval}' test.txt

awk 'BEGIN {FS=","; OFS="->"} {newval=$1*12+1; print $1,newval}' test.txt

~bp
That explains my confusion, thank you.
results to those two tests:

test.txt contains this:

1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38
1a,2a,3a,4a,5a,6a,7a,8a,9a,10a,11a,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38


MacPro:Pegasus ADM$ awk -F "," 'BEGIN {OFS="->"} {newval=$1*12+1; print $1,newval}' test.txt
1->13
1a->13
MacPro:Pegasus ADM$ awk -F "," 'BEGIN {OFS="->"} {newval=$1*12+1; print $1,newval}' test.txt
1->13
1a->13
MacPro:Pegasus ADM$ awk 'BEGIN {FS=","; OFS="->"} {newval=$1*12+1; print $1,newval}' test.txt
1->13
1a->13


and both work including a space between the } and the {  which leaves me perplexed.
>> I'm losing focus, very sorry.
@Bill - no problem ;)

>> why is the FS outside the {} brackets and the OFS inside?
@Anthony - it's a matter of personal preference. I keep the separator outside the program itself.

>> space between the } and the {  which leaves me perplexed.
@Anthony - that space is not needed. It looks nicer if you like. The only spaces you really need are the ones that separate the program (the part in single quotes) from the other command-line arguments:
awk -F "," 'BEGIN{OFS=","}{print$1*12$2}' test.txt
thanks Gerwin :-)