Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


perl - How can I combine files into one CSV file?

If I have one file FOO_1.txt that contains:

FOOA

FOOB

FOOC

FOOD

...

and a lot of other files, FOO_files.txt, each of which contains:

1110000000...

a single line of 0s and 1s, one digit for each of the values listed in FOO_1.txt (FOOA, FOOB, ...)

Now I want to combine them into one file, FOO_RES.csv, that will have the following format:

FOOA,1,0,0,0,0,0,0...

FOOB,1,0,0,0,0,0,0...

FOOC,1,0,0,0,1,0,0...

FOOD,0,0,0,0,0,0,0...

...

What is a simple and elegant way to do that (with a hash and arrays -> $hash{$key} = @data)?

Thanks a lot for any help!

Yohad


1 Reply


If you can't describe your data and your desired result clearly, there is no way that you will be able to code it. That said, taking on a simple project is a good way to get started using a new language.

Allow me to present a simple method you can use to churn out code in any language, whether you know it or not. This method only works for smallish projects. You'll need to actually plan ahead for larger projects.

How to write a program:

  1. Open up your text editor and write down what data you have. Make each line a comment.
  2. Describe your desired results.
  3. Start describing the steps needed to change your data into the desired form.

Numbers 1 & 2 completed:

#!/usr/bin/perl
use strict;
use warnings;

# Read data from multiple files and combine it into one file.
# Source files:
#    Field definitions: has a list of field names, one per line.
#    Data files:  
#      * Each data file has a string of digits.
#      * There is a one-to-one relationship between the digits in the data file and the fields in the field defs file.
# 
# Results File:
# * The results file is a CSV file.
# * Each field will have one row in the CSV file.
# * The first column will contain the name of the field represented by the row.
# * Subsequent values in the row will be derived from the data files.
# * The order of subsequent fields will be based on the order files are read.
# * However, each column (2-X) must represent the data from one data file.

Now that you know what you have, and where you need to go, you can flesh out what the program needs to do to get you there - this is step 3:

You know you need to have the list of fields, so get that first:

# Get a list of fields.
#   Read the field definitions file into an array.
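
As a minimal sketch of this step, the comment could turn into something like the following. The subroutine name `read_fields` is made up for illustration, and skipping blank lines is an assumption based on how the sample file is shown in the question.

```perl
use strict;
use warnings;

# Read one field name per line and return them in file order.
# Assumption: blank lines carry no meaning and can be skipped.
sub read_fields {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @fields;
    while ( my $line = <$fh> ) {
        chomp $line;
        $line =~ s/\r\z//;           # tolerate Windows line endings
        next unless length $line;    # skip empty lines
        push @fields, $line;
    }
    close $fh;
    return @fields;
}
```

With the file from the question, a call might look like `my @fields = read_fields('FOO_1.txt');`.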

Since it is easiest to write CSV in a row-oriented fashion, you will need to process all your files before generating each row. So you'll need someplace to store the data.

# Create a variable to store the data structure.
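
One structure that fits this description is a hash of arrays, keyed by field name. This is only a sketch with made-up sample values, and the name `%values_for` is hypothetical. Note that the `$hash{$key} = @data` form from the question would store the number of elements, because an array in scalar context yields its count; a hash value has to hold an array reference instead.

```perl
use strict;
use warnings;

my %values_for;                      # field name => reference to an array of digits

my @data = ( 1, 0, 0 );              # sample values, for illustration only
$values_for{FOOA} = [@data];         # store a copy of the list as an array reference

push @{ $values_for{FOOB} }, 1;      # the array is created on the first push
push @{ $values_for{FOOB} }, 0;

my $count = @data;                   # scalar context: the element count, not the list

print "FOOA: @{ $values_for{FOOA} }\n";
print "FOOB: @{ $values_for{FOOB} }\n";
```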

Now we read the data files:

# Get a list of data files to parse
# Iterate over list

# For each data file:
#    Read the string of digits.
#    Assign each digit to its field.
#    Store data for later use.
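
The per-file steps could be sketched as below. The name `add_data_file` and the argument order are assumptions for illustration; the checks (digits are only 0 or 1, and there is exactly one digit per field) follow the description of the data given above.

```perl
use strict;
use warnings;

# Add the digits from one data file to the structure, one new column per file.
# $values_for: hash ref (field name => array ref); $fields: array ref in row order.
sub add_data_file {
    my ( $values_for, $fields, $path ) = @_;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $line = <$fh>;
    close $fh;

    die "$path is empty\n" unless defined $line;
    $line =~ s/\s+\z//;                                  # strip newline and trailing spaces
    die "$path: expected only 0s and 1s\n" unless $line =~ /\A[01]+\z/;

    my @digits = split //, $line;
    die "$path: expected " . @$fields . " digits, got " . @digits . "\n"
        unless @digits == @$fields;

    # Digit number $i belongs to field number $i.
    for my $i ( 0 .. $#digits ) {
        push @{ $values_for->{ $fields->[$i] } }, $digits[$i];
    }
    return;
}
```

The list of files to pass in could come from `glob` or from the command line; how the data files are actually named is not clear from the question, so that part is left open here.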

We've got all the data in memory, now write the output:

# Write the CSV file.
# Open a file handle.

# Iterate over list of fields
# For each field
#   Get field name and list of values.
#   Create a string - comma separated string with field name and values  
#   Write string to file handle

# close file handle.
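
A sketch of the output step, assuming the data is held in a hash of array references keyed by field name (the names `write_csv` and `$values_for` are hypothetical). A plain `join` is enough only as long as field names contain no commas, quotes, or newlines; for anything messier, a CSV module such as Text::CSV from CPAN would be the safer choice.

```perl
use strict;
use warnings;

# Write one CSV row per field: the field name, then one value per data file.
sub write_csv {
    my ( $out_path, $fields, $values_for ) = @_;

    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    for my $field (@$fields) {
        # A field with no stored values produces a row with just its name.
        my @values = @{ $values_for->{$field} || [] };
        print {$out} join( ',', $field, @values ), "\n";
    }
    close $out or die "Cannot close $out_path: $!";
    return;
}
```

Iterating over the field list rather than over the hash keys keeps the rows in the same order as the field definitions file, since Perl hashes have no defined order.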

Now you can start converting comments into code. You could have anywhere from 1 to 100 lines of code for each comment. You may find that something you need to do is very complex and you don't want to take it on at the moment. Make a dummy subroutine to handle the complex task, and ignore it until you have everything else done. Then you can solve that complex, thorny sub-problem on its own.
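
A dummy subroutine of that kind might look like the sketch below. The name `parse_data_file` and the hard-coded return value are placeholders invented for illustration; the point is only that the rest of the program can call it and run before the real logic exists.

```perl
use strict;
use warnings;

# Placeholder for a step that has not been written yet. It returns fixed,
# fake data so the surrounding code can be built and tested around it.
sub parse_data_file {
    my ($path) = @_;
    warn "parse_data_file is a stub; ignoring $path\n";
    return ( 1, 0, 0 );    # hard-coded digits until the real parsing exists
}

my @digits = parse_data_file('anything.txt');
print "got @digits\n";
```

The `warn` line is there so the stub cannot be forgotten silently once the program starts producing output.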

Since you are just learning Perl, you'll need to hit the docs to find out how to do each of the subtasks represented by the comments you've written. The best resource for this kind of work is the list of functions by category in perlfunc. The Perl syntax guide (perlsyn) will come in handy too. Since you'll need to work with a complex data structure, you'll also want to read the Data Structures Cookbook (perldsc).

You may be wondering how the heck you should know which perldoc pages you should be reading for a given problem. An article on Perlmonks titled How to RTFM provides a nice introduction to the documentation and how to use it.

The great thing is that if you get stuck, you have some code to share when you ask for help.

