Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


bash - Why does AWK refuse to sum up floats

I'm facing a rather strange problem with awk where I want to calculate the average of a column. This is the test input from my file:

1
2
0.4
0.250
0.225
0.221
0.220
0.218

And this is the script I'm trying to run:

awk '{sum += $1} END {print sum; print sum / NR}' ~/Desktop/bar.txt

What I expect as output is:

<calculated sum>
<calculated average>

But this is what I get invariably:

3
0,375

I've checked the formatting and characters of the input file etc., but I can't get awk to sum up those pesky floats.

Any ideas?

I'm running awk version 20070501 in bash 3.2.48 on OS X 10.8.5.

Update

As @sudo_O correctly deduced, the problem is my locale. Replacing the . with a , in the file yields the correct results. That's obviously not the solution I'm looking for, though, so I need to change something in my locale, which is currently set to:

$ locale
LANG="de_CH.UTF-8"
LC_COLLATE="de_CH.UTF-8"
LC_CTYPE="de_CH.UTF-8"
LC_MESSAGES="de_CH.UTF-8"
LC_MONETARY="de_CH.UTF-8"
LC_NUMERIC="de_CH.UTF-8"
LC_TIME="de_CH.UTF-8"
LC_ALL=

I think I'd like to keep the numeric, monetary and date locales. Which locale setting do I need to change (and how) to make awk work?


1 Reply


The problem is not awk here. Explicitly use floats and see what you get:

$ awk '{sum+=sprintf("%f",$1)}END{printf "%.6f\n%.6f\n",sum,sum/NR}' file
4.534000
0.566750

It's probably your locale, since your output uses a , as the decimal separator. Please post the output of the locale command.
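
A quick way to check which decimal separator awk prints is to have it format a known fraction under the current locale and again under the C locale. This is only a rough sketch: awk implementations differ in how far they honour LC_NUMERIC, so treat the result as a hint, not proof. The variable names are made up for illustration.

```shell
# Format 1/2 under the current locale and under the C locale.
# A comma in the first value suggests the locale is changing awk's decimal separator.
current=$(awk 'BEGIN { printf "%.1f\n", 1 / 2 }')
c_locale=$(LC_ALL=C awk 'BEGIN { printf "%.1f\n", 1 / 2 }')

echo "current locale: $current"
echo "C locale:       $c_locale"
```

If the two lines differ, the locale is the likely culprit.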


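Using your LC_NUMERIC I can reproduce your results. With , as the decimal separator, awk stops reading each number at the ., so only 1 and 2 contribute to the sum (and 3 / 8 = 0,375):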

$ LC_NUMERIC="de_CH.UTF-8" awk '{sum += $1} END {print sum; print sum / NR}' file
3
0,375

The fix is to set LC_NUMERIC or LC_ALL to C, or to any other locale that uses . as the decimal separator:

$ LC_NUMERIC="C" awk '{sum += $1} END {print sum; print sum / NR}' file
4.534
0.56675

See man locale for more information.
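
If you want to keep the rest of your locale settings, one option is to override the locale only for the awk call. Below is a minimal sketch under a few assumptions: awk_c is a hypothetical wrapper name, the temporary file just recreates the numbers from the question, and LC_ALL is used instead of LC_NUMERIC because LC_ALL takes precedence when both happen to be set.

```shell
#!/bin/sh
# Recreate the sample input from the question in a temporary file.
datafile=$(mktemp)
printf '%s\n' 1 2 0.4 0.250 0.225 0.221 0.220 0.218 > "$datafile"

# Hypothetical wrapper: run awk in the C locale for this one call only.
# The surrounding shell session keeps its own locale settings.
awk_c() {
    LC_ALL=C awk "$@"
}

# Sum and average of the first column, as in the original script.
result=$(awk_c '{sum += $1} END {print sum; print sum / NR}' "$datafile")
printf '%s\n' "$result"

rm -f "$datafile"
```

Since the assignment is scoped to the single command, date, monetary and other number formatting elsewhere in the session stay as they were.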

