Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

sh - awk: program limit exceeded: maximum number of fields size=32767

When I run my shell script on Ubuntu 14.04, I get the following error:

user@user-Lenovo-IdeaPad-Z410:~/Thesis/BOOKS$ bash Training_POS_Uni_Bi.sh
awk: program limit exceeded: maximum number of fields size=32767
    FILENAME="ensemble_features/Training_BOOKS_POS_Bigram_with_stemming_BOOLEAN_FVT.csv" FNR=1 NR=1
cut: invalid byte, character or field list
Try 'cut --help' for more information.
-1
cut: invalid byte, character or field list
Try 'cut --help' for more information.
6656

My script is below:

cd /home/user/Thesis/BOOKS/Features/Training/POSITIVE/
fname="ensemble_features"
mkdir $fname

cp /home/user/Thesis/BOOKS/Features/Training/POSITIVE/Training_BOOKS_POS_unigram_FVT_with_stemming_BOOLEAN.csv ensemble_features/
cp /home/user/Thesis/BOOKS/Features/Training/POSITIVE/Training_BOOKS_POS_Bigram_with_stemming_BOOLEAN_FVT.csv ensemble_features/


mkdir "proces"
cnt=0
for file in $fname/*
do
    #Number of columns
    num=`awk 'BEGIN {FS=",";c=0};{if (c==0 ){print NF; c=1}}END{}' $file`
    if [[ cnt -eq 0 ]];then
        cut -d, -f $num $file >class.csv
        cnt=1;
    fi
    num=$((num-1))
    echo $num
    nfname=`basename $file`

    #Cut the columns
    cut -d',' -f1-$num $file > proces/cutlast$nfname
done
#Paste multiple csv
paste -d',' proces/* > comb.csv
paste -d, comb.csv class.csv > Training_BOOKS_Unigram_Bigram_POS_Ensemble_Features_BOOLEAN.csv
rm comb.csv
rm class.csv
rm -r proces
rm -r ensemble_features

My input files contain 38453 and 6656 columns, respectively. Can anybody help me fix this error?

1 Reply

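On Ubuntu, awk is a symbolic link to one of several awk implementations; by default it currently points to mawk. The "maximum number of fields size=32767" message comes from mawk, and your first file (38453 columns) exceeds that limit. Try installing gawk and using it instead. gawk does not have a fixed limit on the number of fields in a record.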

BTW, Python may be a better long-term solution if you have the time to learn it.
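
To make the suggestion above concrete, here is a minimal sketch of how the column count in your script could be obtained without running into the mawk limit. The function name count_columns is hypothetical (it is not in your script). It assumes plain comma-separated data with no quoted commas, the same assumption your FS="," already makes, and a non-empty first line. It calls gawk when it is installed (for example via `sudo apt-get install gawk`) and otherwise falls back to counting commas with head, tr and wc, which is a different technique from awk field splitting but gives the same number for such files.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the number of comma-separated columns
# in the first line of the file given as $1.
count_columns() {
    if command -v gawk >/dev/null 2>&1; then
        # gawk has no fixed field limit, so NF is safe for very wide rows.
        gawk -F, 'NR == 1 { print NF; exit }' "$1"
    else
        # Fallback without awk: keep only the commas of the first line
        # and count them; N commas separate N+1 fields.
        local commas
        commas=$(head -n 1 "$1" | tr -cd ',' | wc -c)
        echo $((commas + 1))
    fi
}
```

In your loop, the line that sets num would then become something like `num=$(count_columns "$file")`. The later cut calls should not need changes, since the cut errors in your output appear to be a consequence of awk failing and leaving num empty.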

