Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
241 views
in Technique by (71.8m points)

bash - Looping through a large file and using each column as input

I want to run a program that takes three inputs. When I run the analysis on the cluster, I can use Slurm to read each column as the input for one task of a job array.

samplesheet="test.txt"
name=$(sed -n "${SLURM_ARRAY_TASK_ID}p" "$samplesheet" | awk '{print $1}')
forward=$(sed -n "${SLURM_ARRAY_TASK_ID}p" "$samplesheet" | awk '{print $2}')
reverse=$(sed -n "${SLURM_ARRAY_TASK_ID}p" "$samplesheet" | awk '{print $3}')

Then I can run the software with the command:

software --input "$forward" "$reverse" --output "$name"
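
The three sed/awk pipelines above can also be collapsed into a single lookup. A minimal sketch, assuming a whitespace-separated sample sheet: load_row is a hypothetical helper name, and the task ID is passed by hand here only so the snippet runs outside Slurm.

```bash
#!/bin/bash

# load_row FILE N: sets name, forward and reverse from line N of FILE.
# (Hypothetical helper, not part of Slurm or of the original script.)
load_row() {
    # sed prints only line N; read then splits it on whitespace.
    read -r name forward reverse <<EOF
$(sed -n "${2}p" "$1")
EOF
}

# Throwaway sample sheet so the sketch runs on its own.
samplesheet=$(mktemp)
printf 'sample001 file001a file001b\nsample002 file002a file002b\n' > "$samplesheet"

# Inside a job array, the second argument would be "$SLURM_ARRAY_TASK_ID".
load_row "$samplesheet" 2
echo "name=$name forward=$forward reverse=$reverse"
rm -f "$samplesheet"
```

This calls sed once per task instead of three times, and no awk is needed because read does the column splitting.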

It works for a job array. I want to do something similar on my computer. For example, I have a file:

sample001   file001a  file001b
.
.
.
sample1000  file1000a  file1000b

The first column is the prefix of the output file, and columns 2 and 3 are the two input files.

I have tried a loop and xargs, but I can only get them to work with a single column. For example:

for line in test.txt
do
     # do something with $line here
done

or

cat test.txt | while read line 
do
   # do something with $line here
done

How can I do this?



1 Reply

0 votes
by (71.8m points)

For something like this, it is best to use read inside a while loop, like so:

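#!/bin/bash

line_num=0
# -r stops read from treating backslashes as escape characters.
while read -r line; do
    line_num=$((line_num + 1))
    # $line is left unquoted on purpose: word splitting collapses runs of
    # spaces into single spaces, so cut sees one delimiter between fields.
    part_1=$(echo $line | cut -d ' ' -f1)
    part_2=$(echo $line | cut -d ' ' -f2)
    part_3=$(echo $line | cut -d ' ' -f3)
    echo "Line ${line_num} Part 1: ${part_1} Part 2: ${part_2} Part 3: ${part_3}"
done < <(cat ./sample.txt)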

sample.txt:

sample001   file001a  file001b
sample002   file002a  file002b
sample003   file003a  file003b
sample004   file004a  file004b
sample005   file005a  file005b
sample006   file006a  file006b
sample007   file007a  file007b
sample008   file008a  file008b
sample009   file009a  file009b
sample010   file010a  file010b

Output:

$ ./read-from-file.sh 
Line 1 Part 1: sample001 Part 2: file001a Part 3: file001b
Line 2 Part 1: sample002 Part 2: file002a Part 3: file002b
Line 3 Part 1: sample003 Part 2: file003a Part 3: file003b
Line 4 Part 1: sample004 Part 2: file004a Part 3: file004b
Line 5 Part 1: sample005 Part 2: file005a Part 3: file005b
Line 6 Part 1: sample006 Part 2: file006a Part 3: file006b
Line 7 Part 1: sample007 Part 2: file007a Part 3: file007b
Line 8 Part 1: sample008 Part 2: file008a Part 3: file008b
Line 9 Part 1: sample009 Part 2: file009a Part 3: file009b
Line 10 Part 1: sample010 Part 2: file010a Part 3: file010b

Note: I assume that your file is space-delimited, hence the cut command. Also, sample.txt needs a trailing newline: without one, read returns a non-zero status on the last line, so the loop ends before that line is processed.
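
The trailing-newline requirement can be relaxed. A minimal sketch, assuming the same kind of while/read loop: when read hits end-of-file without a newline it still stores the partial line in the variable and only then returns non-zero, so an extra test on the variable lets the loop body run for that last line. count_lines is a hypothetical helper used purely for illustration.

```bash
#!/bin/bash

# count_lines FILE: counts the lines the loop actually visits.
count_lines() {
    local count=0 line
    # The || test keeps the loop going when read fails at end-of-file
    # but has still stored an unterminated last line in $line.
    while IFS= read -r line || [ -n "$line" ]; do
        count=$((count + 1))
    done < "$1"
    echo "$count"
}

# Two rows, deliberately written without a final newline.
demo=$(mktemp)
printf 'sample001 file001a file001b\nsample002 file002a file002b' > "$demo"
count_lines "$demo"   # prints 2
rm -f "$demo"
```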

It could probably be prettied up a bit inside the while loop, but it is effective.
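
One way to tidy the loop, as a minimal sketch that assumes whitespace-separated columns: read can assign the columns straight to variables, which removes the need for echo, cut and cat. process_samples is a hypothetical helper name, the --input/--output flags are assumed from the question's placeholder command, and echo only prints each command rather than running it.

```bash
#!/bin/bash

# process_samples FILE: prints one command line per row of FILE.
# (Hypothetical helper name, used only for this sketch.)
process_samples() {
    local name forward reverse
    # read splits each line on runs of spaces or tabs and assigns the
    # columns to the three variables in order.
    while read -r name forward reverse; do
        [ -z "$name" ] && continue   # skip blank lines
        # Dry run: drop the echo to execute the command for real.
        echo software --input "$forward" "$reverse" --output "$name"
    done < "$1"
}

# Demonstration on a throwaway two-row sample sheet.
demo=$(mktemp)
printf 'sample001   file001a  file001b\nsample002   file002a  file002b\n' > "$demo"
process_samples "$demo"
rm -f "$demo"
```

If the real program reads standard input, redirect its input from /dev/null inside the loop; otherwise it will consume the remaining rows of the sample sheet.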

