bash - process data from text files and convert into CSV

In our organization, a few jobs run every month to collect data at the server level: they record what is running on each server and perform some checks. The results are text files that are copied to one repository server, with file names of the form <servername>_20200911.log.

This is a sample file from a server where PostgreSQL is running.

Date Collected                  || 11-10-2020 03:20:42 GMT ||
Server Name                     || pglinux1             ||
Operating system                || RHEL                     || passed
OS Version                      || 6.9                      || passed
Kernel version                  || 2.6.32-735.23.1.el6      || passed
Kernel architecture             || x86-64                   || passed
Total Memory                    || 16 gig                   || passed
/opt/postgres fs free           || 32 gig                   || passed
/opt/postgres/data fs free      || 54 gig                   || passed
Is cron jobs exist              || yes                      || passed
Is postgres installed           || yes                      || passed
Postgres version >10            || no                       || failed
repmgr installed                || yes                      || passed
repmgr version  >4              || yes                      || passed
How may pg cluster running      || 3                        || Passed
pgbackrest installed            || yes                      || passed

We get similar files for other technologies, like Oracle, MySQL, WebLogic, and so on. Every month we need to process these files, identify the failed checks, and work with the corresponding team. Right now I am consolidating the data for all PostgreSQL/Oracle servers. I get a lot of files, and I need to read each text file and convert the data to CSV as below:

Date Collected, Server Name, Operating system, OS Version, Kernel version, Kernel architecture, Total Memory, /opt/postgres fs free, /opt/postgres/data fs free, Is cron jobs exist,
11-10-2020 03:20:42 GMT, pglinux1, RHEL, passed, passed, passed, passed, passed, passed, passed, passed, failed
11-10-2020 03:20:42 GMT, pglinux2, RHEL, passed, passed, passed, passed, passed, passed, passed, passed, failed
11-10-2020 03:20:42 GMT, pglinux3, RHEL, passed, passed, passed, passed, passed, passed, passed, passed, failed

Initially I thought I would convert these text files into CSV, pick the second row from each file, and consolidate everything into one file. That attempt failed, because the data in some files is not consistent. Now I am thinking of creating a file called servercheck.txt that lists all the checks, then using that file to grep the data in all the log files and print it into a CSV file (one row per server).

#!/bin/bash
# Header row: the list of checks, joined by commas.
awk -v ORS=',' '{print $0}' servercheck.txt | sed 's/,$//' > serverchecks.csv
for file in *_20200911.log
do
     while read -r line
     do
        grep "$line" "$file" | awk -F '[|][|]' '{printf "%s,", $3}' >> serverchecks.csv
     done < servercheck.txt
done

The above code writes the heading and all the data on the same row.
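I suspect this happens because neither the header command nor the per-check grep ever prints a newline. A minimal sketch of what I am trying to get, with each server's checks joined into one comma-separated row and terminated by a newline (assuming servercheck.txt holds one check name per line and the log files match the sample layout above), would be:

#!/bin/bash
# Header row: the check names from servercheck.txt, joined by commas.
paste -sd, servercheck.txt > serverchecks.csv

for file in *_20200911.log
do
    # For every check, pull the third ||-delimited column (passed/failed),
    # trim the padding, then join this server's results into one row.
    while read -r check
    do
        grep "$check" "$file" | awk -F '[|][|]' '{gsub(/^ +| +$/, "", $3); print $3}'
    done < servercheck.txt | paste -sd, -
done >> serverchecks.csv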

I hope I have provided all the necessary details. Please help with code, recommendations, or whatever you think is the best approach to handle this.


1 Reply


This may help you:

for inputfile in *
do
  # Transpose the ||-delimited report: every original column becomes one
  # output line, then runs of padding spaces are turned into commas.
  awk -F '[|][|]' '
  {
    for (i=1; i<=NF; i++) {        # remember every field of every line
      a[NR,i] = $i
    }
  }
  NF>p { p = NF }                  # track the widest line
  END {
    for (j=1; j<=p; j++) {         # one output line per original column
      str = a[1,j]
      for (i=2; i<=NR; i++) {
        str = str" "a[i,j]
      }
      print str
    }
  }' "$inputfile" | sed 's/  */,/g' > tmpfile && mv tmpfile "$inputfile"
done
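With the sample log above, each rewritten file should end up holding three comma-separated lines: the check names, the collected values, and the pass/fail statuses, which is roughly the per-server layout the question describes.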

Edited as suggested by @Ed Morton

for inputfile in *
do
  awk -F '[|][|]' '
  {
    for (i=1; i<=NF; i++) {        # remember every field of every line
      a[NR,i] = $i
    }
  }
  NF>p { p = NF }                  # track the widest line
  END {
    for (j=1; j<=p; j++) {         # one output line per original column
      str = a[1,j]
      for (i=2; i<=NR; i++) {
        str = str" "a[i,j]
      }
      gsub(/ + /, ",", str)        # runs of padding spaces become commas
      print str
    }
  }' "$inputfile" > tmpfile && mv tmpfile "$inputfile"
done
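If the per-server files produced above still need to be merged into the single monthly report the question asks for, one possible follow-up, assuming every rewritten file now begins with the same header line of check names (the output name consolidated_checks.csv is just an example), is to keep that header once and append the rest of every file:

# Print the header line of the first file only, then every remaining line
# of every <servername>_20200911.log file.
awk 'FNR == 1 && NR > 1 { next } { print }' *_20200911.log > consolidated_checks.csv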
