Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - Regex to conditionally replace Twitter hashtags with hyperlinks

I'm writing a small PHP script to grab the latest half dozen Twitter status updates from a user feed and format them for display on a webpage. As part of this I need a regex replace to rewrite hashtags as hyperlinks to search.twitter.com. Initially I tried to use:

<?php
$strTweet = preg_replace('/(^|\s)#(\w+)/', '\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>', $strTweet);
?>

(taken from https://gist.github.com/445729)

In the course of testing I discovered that #test is converted into a link on the Twitter website, however #123 is not. After a bit of checking on the internet and playing around with various tags I came to the conclusion that a hashtag must contain alphabetic characters or an underscore in it somewhere to constitute a link; tags with only numeric characters are ignored (presumably to stop things like "Good presentation Bob, slide #3 was my favourite!" from being linked). This makes the above code incorrect, as it will happily convert #123 into a link.
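To reproduce the problem, here is a minimal sketch that runs the gist's pattern against the "slide #3" sentence. The variable names are only illustrative, and `$1`/`$2` are used as the equivalent of `\1`/`\2` in the replacement string:

```php
<?php
// Pattern from the gist: any run of word characters after '#' counts as a tag.
$pattern = '/(^|\s)#(\w+)/';
$replacement = '$1#<a href="http://search.twitter.com/search?q=%23$2">$2</a>';

// Illustrative input; a digits-only tag should NOT become a link.
$sample = 'Good presentation Bob, slide #3 was my favourite!';
$linked = preg_replace($pattern, $replacement, $sample);

// The numeric tag is linked anyway, which is the behaviour to avoid.
echo $linked, PHP_EOL;
```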

I've not done much regex in a while, so in my rustiness I came up with the following PHP solution:

<?php
$test = 'This is a test tweet to see if #123 and #4 are not encoded but #test, #l33t and #8oo8s are.';

// Get all hashtags out into an array
if (preg_match_all('/(^|\s)(#\w+)/', $test, $arrHashtags) > 0) {
  foreach ($arrHashtags[2] as $strHashtag) {
    // Check each tag to see if there are letters or an underscore in there somewhere
    if (preg_match('/#\d*[a-z_]+/i', $strHashtag)) {
      $test = str_replace($strHashtag, '<a href="http://search.twitter.com/search?q=%23'.substr($strHashtag, 1).'">'.$strHashtag.'</a>', $test);
    }
  }
}

echo $test;
?>

It works; but it seems fairly long-winded for what it does. My question is, is there a single preg_replace similar to the one I got from gist.github that will conditionally rewrite hashtags into hyperlinks ONLY if they DO NOT contain just numbers?



1 Reply

(^|\s)#(\w*[a-zA-Z_]+\w*)

PHP

$strTweet = preg_replace('/(^|\s)#(\w*[a-zA-Z_]+\w*)/', '\1#<a href="http://twitter.com/search?q=%23\2">\2</a>', $strTweet);

This regular expression matches a # followed by zero or more word characters ([a-zA-Z0-9_]), then one or more letters or underscores, then zero or more word characters. A tag made up only of digits has no letter or underscore to satisfy the middle part, so it is left alone.

http://rubular.com/r/opNX6qC4sG <- test it here.
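As a quick check in PHP itself, here is a small sketch that wraps the pattern in a function and runs it on the test string from the question. The function name `linkHashtags` is hypothetical and not part of any library, and `$1`/`$2` stand in for `\1`/`\2`:

```php
<?php
// Hypothetical helper: links a hashtag only when it contains at least one
// letter or underscore, so digits-only tags such as #123 stay plain text.
function linkHashtags(string $tweet): string
{
    return preg_replace(
        '/(^|\s)#(\w*[a-zA-Z_]+\w*)/',
        '$1#<a href="http://twitter.com/search?q=%23$2">$2</a>',
        $tweet
    );
}

// Test string from the question: #123 and #4 should be left untouched,
// while #test, #l33t and #8oo8s should become links.
$test = 'This is a test tweet to see if #123 and #4 are not encoded but #test, #l33t and #8oo8s are.';
echo linkHashtags($test), PHP_EOL;
```

Note that, like the original, this only matches a tag at the start of the string or after whitespace, so a tag glued to preceding punctuation would not be linked.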

