Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
167 views
in Technique[技术] by (71.8m points)

python - script to get the max from column based on other column values

I need a script to read in a csv file(orig.csv) and output a reformatted csv file(format.csv)

The orig csv file will look like this:

Time,Label,frame,slot,SSN,Board,BT,SRN,LabelFrame,SRNAME,LabelID,Integrity,MAX_val
2014-03-17,lableA,1,8,0,,SPUB,1,NNN,NNN,1,100%,60
2014-03-17,lableA,2,22,0,,GOUC,2,NNN02,NNN02,1,100%,
2014-03-17,lableB,2,8,0,,CCCB,2,NNN02,NNN02,1,100%,59
2014-03-17,lableB,1,2,4,,CCCB,1,NNN,NNN,1,100%,48
2014-03-17,lableB,1,0,6,,CCCB,1,NNN,NNN,1,100%,59
2014-03-17,lableC,2,6,0,,SCUA,2,NNN02,NNN02,1,100%,55
2014-03-17,lableD,2,4,1,,CCCB,2,NNN02,NNN02,1,100%,59
2014-03-17,lableD,0,2,7,,CCCB,0,MPS,MPS,1,100%,46
2014-03-17,lableD,1,4,3,,CCCA,1,NNN,NNN,1,100%,43
2014-03-17,lableE,2,2,7,,CCCB,2,NNN02,NNN02,1,100%,58

The reformatting will go through the orig csv file and get all the unique names from column2(Label) and the corresponding max of the values from column 13(MAX_val), see the example below.(E.G. lableA-E is of interest and for lableB the max of [59,48,59] is of interest) I also want to it to cater for a dynamic orig.csv file where possible.

the reported csv file will look like this:

Time,Label,frame,slot,SSN,Board,BT,SRN,LabelFrame,SRNAME,LabelID,Integrity,MAX_val
2014-03-17,lableA,1,8,0,,CCCB,1,NNN,NNN,1,100%,60
2014-03-17,lableB,2,8,0,,CCCB,2,NNN02,NNN02,1,100%,59
2014-03-17,lableC,2,6,0,,SCUA,2,NNN02,NNN02,1,100%,55
2014-03-17,lableD,2,4,1,,CCCB,2,NNN02,NNN02,1,100%,59
2014-03-17,lableE,2,2,7,,CCCB,2,NNN02,NNN02,1,100%,58

Note : i am new to scripting so not sure what is the best language to write this in but was thinking along the lines of bash, shell, perl but open to others.

EDIT:: this is how I would look to pull in my csv data

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>D3: Loading data from a CSV file</title>
        <script type="text/javascript" src="d3/d3.v3.js"></script>
    </head>
    <body>
        <script type="text/javascript">

            d3.csv("XPU max load_format1(XPU load).csv", function(data) {
                console.log(data);
            });

        </script>
    </body>
</html>
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This is a Perl solution to your problem. It keeps a hash %data of the data for each label that has the highest value in for MAX_val. It also keeps a list of labels in @labels that keeps track of new labels as they are encountered, so as to keep the output in the same order as the input.

As I said in my comment, there is a line in your data that has an empty column 13. I have added code to treat this as zero, which is unnecessary if that is an error in your post.

use strict;
use warnings;

open my $orig_fh,   '<', 'orig.csv'   or die $!;
open my $format_fh, '>', 'format.csv' or die $!;

print $format_fh scalar <$orig_fh>; # Copy header line

my %data;
my @labels;

while (<$orig_fh>) {
  chomp;
  my @fields = split /,/, $_, -1;
  my ($label, $max_val) = @fields[1,12];
  if ( exists $data{$label} ) {
    my $prev_max_val = $data{$label}[12] || 0;
    $data{$label} = @fields if $max_val and $max_val > $prev_max_val;
  }
  else {
    $data{$label} = @fields;
    push @labels, $label;
  }
}

for my $label (@labels) {
  print $format_fh join(',', @{ $data{$label} }), "
";
}

output

Time,Label,frame,slot,SSN,Board,BT,SRN,LabelFrame,SRNAME,LabelID,Integrity,MAX_val
2014-03-17,lableA,1,8,0,,SPUB,1,NNN,NNN,1,100%,60
2014-03-17,lableB,2,8,0,,CCCB,2,NNN02,NNN02,1,100%,59
2014-03-17,lableC,2,6,0,,SCUA,2,NNN02,NNN02,1,100%,55
2014-03-17,lableD,2,4,1,,CCCB,2,NNN02,NNN02,1,100%,59
2014-03-17,lableE,2,2,7,,CCCB,2,NNN02,NNN02,1,100%,58

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...