Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - How to use awk variables in regular expressions?

I have a file called domain which contains some domains. For example:

google.com
facebook.com
...
yahoo.com

I have another file called site which contains site URLs, each followed by a number. For example:

image.google.com   10
map.google.com     8
...
photo.facebook.com  22
game.facebook.com   15
..

Now I want to sum the numbers for each domain. For example, google.com has 10+8. So I wrote an awk script like this:

BEGIN{
  while(getline dom < "./domain" > 0) {
    domain[dom]=0;
  }
  for(dom in domain) {
    while(getline < "./site" > 0) {
      if($1 ~ /$dom$/) {   # if $1 ends with $dom
        domain[dom]+=$2;
      }
    }
  }
}

But the test if($1 ~ /$dom$/) doesn't behave as I want, because $dom inside the regular expression is interpreted literally instead of as a variable. So the first question is:

Is there any way to use variable $dom in a regular expression?

Also, since I'm new to writing scripts:

Is there any better way to solve the problem I have?

1 Reply

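awk can match against a variable if you don't use the // regex markers. Inside /.../ nothing is expanded: $ is just the end-of-string anchor and dom is literal text. Also, awk variables are referenced by name alone (dom); outside a regex, $dom would mean field number dom.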

if ( $0 ~ regex ){ print $0; }

In this case, build up the required regex as a string:

regex = dom"$"
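One caveat worth hedging: the string is still interpreted as a regular expression, so each dot in a domain matches any character. A minimal sketch, assuming the domains contain no regex metacharacters other than dots (the helper name ends_with_domain and its arguments are hypothetical, not from the question):

```shell
# ends_with_domain HOST DOMAIN
# Exit status 0 if HOST ends with DOMAIN, 1 otherwise.
# Hypothetical helper for illustration only.
ends_with_domain() {
  awk -v host="$1" -v dom="$2" 'BEGIN {
    gsub(/\./, "[.]", dom)   # make each dot match only a literal dot
    regex = dom "$"          # dynamic regex built as a string
    exit !(host ~ regex)     # 0 on match, 1 on no match
  }'
}
```

Without the gsub step, image.googleXcom would also be accepted as ending with google.com.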

Then match against the regex variable:

if ( $1 ~ regex ) {
  domain[dom]+=$2;
}
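On the second question, one possible alternative, offered as a sketch and not a definitive solution: read the domain file first, then scan site once and test each host against every domain with a regex built as a string. The function name count_by_domain is hypothetical, the file layout is assumed to match the samples in the question, and the (^|[.]) prefix is my own assumption so that a host like notgoogle.com is not counted under google.com.

```shell
# count_by_domain DOMAIN_FILE SITE_FILE
# Prints "domain total" lines, sorted by domain name.
# Hypothetical helper; assumes one domain per line in DOMAIN_FILE
# and "host number" per line in SITE_FILE.
count_by_domain() {
  awk '
    # First file: remember each domain, skipping blank lines.
    NR == FNR { if (NF) total[$1] = 0; next }

    # Second file: ignore lines without both a host and a number.
    NF < 2 { next }

    {
      for (dom in total) {
        re = dom
        gsub(/\./, "[.]", re)               # dots must match literally
        if ($1 ~ ("(^|[.])" re "$"))        # whole host or a subdomain
          total[dom] += $2
      }
    }

    END { for (dom in total) print dom, total[dom] }
  ' "$1" "$2" | sort                        # for-in order is unspecified
}

# Usage: count_by_domain ./domain ./site
```

With the sample lines from the question, this should report 18 for google.com and 37 for facebook.com. It also avoids rereading site once per domain: in the original script the inner getline loop reaches end of file during the first domain and is never reopened, so later domains would see no input.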
