If you have access to BioPerl, a solution along the following lines may work.
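    #!/usr/bin/perl
    use strict;
    use warnings;
    use Bio::SeqIO;

    # Open the GenBank file as a stream of sequence records.
    my $in = Bio::SeqIO->new(
        -file   => "input.txt",
        -format => 'GenBank',
    );

    while ( my $seq = $in->next_seq ) {
        my $acc        = $seq->accession_number;
        my $length     = $seq->length;
        my $definition = $seq->desc;
        my $type       = $seq->molecule // '';

        # Records without an organism annotation have no species object.
        my $species  = $seq->species;
        my $organism = $species ? $species->binomial : '';

        # Keep only human mRNA records with a RefSeq-style accession (e.g. NM_000014).
        if (   $type eq 'mRNA'
            && $organism =~ /homo sapiens/i
            && $acc =~ /[A-Za-z]{2}_[0-9]{6,}/ )
        {
            print "$acc | $definition | $length\n";
            print $seq->seq, "\n";
            print "\n";
        }
    }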
Running this against a sample GenBank file I have (input.txt), I was able to capture all five fields: accession, length, definition, molecule type, and organism. It should simplify your code.