Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


bash - Read list of files on unix and run command

I am pretty new at shell scripting and I have been struggling all day to figure out how to perform a "for" command. Essentially, what I am trying to do is the following:

I have a list.txt file with a bunch of names:

name1
name2
name3

For every name in the list, there are two different files, each with a different suffix appended to the name. For example:

name1_R1
name1_R2

The program I am trying to run is called sickle. It takes two files that correspond to each other and runs an analysis on them, which is why I need this naming scheme. The sickle command is as follows:

sickle pe -f input_file1.fastq -r input_file2.fastq -t sanger 

If someone could help me out, at least just by telling me how to get unix to read the list of files and treat each line independently I think I could go from there. I tried a few things, but none of them worked.


1 Reply


There are a couple of ways to do it. Since the names are 'one per line' in the data file, we can assume there are no newlines in the file names.

for loop

for file in $(<list.txt)
do
    sickle pe -f "${file}_file1.fastq" -r "${file}_file2.fastq" -t sanger
done

while loop with read

while read file
do
    sickle pe -f "${file}_file1.fastq" -r "${file}_file2.fastq" -t sanger
done < list.txt

The for loop only works if the names contain no blanks or other white-space characters such as tabs, because the unquoted $(<list.txt) is split on white space. The while loop is clean as long as the names contain no newlines, though while read -r file would give even better protection against the unexpected, since -r stops read from treating backslashes as escape characters.

The double quotes around the file name in the for loop are decorative (but harmless) because the file names cannot contain blanks at that point; those in the while loop prevent file names containing blanks from being split when they should not be. It's often a good idea to quote variables every time you use them, though strictly it only matters when the value might contain blanks and you don't want it split up.
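To make the splitting difference concrete, here is a small sketch that runs both loop styles over a throwaway list in which one name contains a blank. The temporary directory and the sample names are made up for the demonstration, and the while loop uses the IFS= read -r form, which also keeps leading and trailing white space intact.

```shell
#!/usr/bin/env bash
# Hypothetical demo: count loop iterations for a list with a blank in one name.
tmpdir=$(mktemp -d)
printf '%s\n' 'name one' 'name2' > "$tmpdir/list.txt"   # two lines, three words

for_count=0
for file in $(<"$tmpdir/list.txt")   # unquoted: split on blanks, tabs and newlines
do
    for_count=$((for_count + 1))
done

while_count=0
while IFS= read -r file              # one iteration per line; backslashes kept literally
do
    while_count=$((while_count + 1))
done < "$tmpdir/list.txt"

echo "for loop iterations:   $for_count"
echo "while loop iterations: $while_count"
rm -rf "$tmpdir"
```

The for loop sees three items (name, one, name2) while the while loop sees the two lines as written, which is why the while form is the safer default for file lists.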

I've had to guess what names should be passed to the sickle command, since your question is not clear about it. I'm 99% sure I've guessed wrong, but the guess matches the different suffixes in your sample command, assuming the base name held in file is input. I've omitted the trailing backslash; it is the 'escape' character and it is not clear what you really want there.
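If the files really follow the _R1 / _R2 scheme shown in the question, the loop could be adapted along these lines. This is only a sketch under assumptions: the .fastq extension on the _R1/_R2 files is a guess, print_sickle_commands is a hypothetical helper name, and the sickle options are copied from the question's command (a real run may need further options that the question does not show). It prints the commands rather than running them, so the result can be checked first.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) one sickle command per name in a list file.
# Assumptions: inputs are named <name>_R1.fastq and <name>_R2.fastq;
# print_sickle_commands is a hypothetical helper, not part of sickle.
print_sickle_commands() {
    local list=$1 name
    while IFS= read -r name
    do
        [ -n "$name" ] || continue          # skip empty lines
        printf 'sickle pe -f %s -r %s -t sanger\n' \
            "${name}_R1.fastq" "${name}_R2.fastq"
    done < "$list"
}

# Demo with a throwaway list; point the function at the real list.txt instead.
demo_list=$(mktemp)
printf '%s\n' name1 name2 name3 > "$demo_list"
output=$(print_sickle_commands "$demo_list")
printf '%s\n' "$output"
rm -f "$demo_list"
```

Once the printed commands look right, the printf line can be replaced by the actual call, keeping the double quotes around "${name}_R1.fastq" and "${name}_R2.fastq" so that names with blanks survive intact.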

