Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

javascript - How to replace plain URLs with links?

I am using the function below to match URLs inside a given text and replace them with HTML links.

The regular expression is working great, but currently I am only replacing the first match.

How can I replace all the URLs?

I guess I should be using the exec command, but I have not figured out how to do it.

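function replaceURLWithHTMLLinks(text) {
    // Match http(s), ftp and file URLs; the slashes are escaped so the regex literal parses.
    var exp = /((https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i;
    // Wrap the matched URL ($1) in an anchor tag.
    return text.replace(exp, "<a href='$1'>$1</a>");
}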
  asked by Sergio del Amo (translated from Stack Overflow)

1 Reply

First off, rolling your own regexp to parse URLs is a terrible idea.

This is a common enough problem that someone has already written, debugged and tested a library for it, according to the RFCs.

URIs are complex - check out the code for URL parsing in Node.js and the Wikipedia page on URI schemes.

There are a ton of edge cases when it comes to parsing URLs: international domain names, actual (.museum) vs. nonexistent (.etc) TLDs, weird punctuation including parentheses, punctuation at the end of the URL, IPv6 hostnames, etc.
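
To make a few of these edge cases concrete, here is a small sketch that runs the pattern from the question (with the g flag added so that every hit is returned) over three made-up inputs. The names urlPattern, findURLs and samples are hypothetical and exist only for this illustration.

```javascript
// The pattern from the question, plus the g flag so match() returns every hit.
var urlPattern = /((https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gi;

// Hypothetical helper: list every substring the pattern treats as a URL.
function findURLs(text) {
    return text.match(urlPattern) || [];
}

// Made-up sample inputs, one per edge case.
var samples = [
    "See http://example.com/page.",                     // punctuation at the end
    "See http://example.com/wiki/URL_(disambiguation)", // parentheses in the path
    "See http://[::1]:8080/status"                      // IPv6 hostname
];

samples.forEach(function (s) {
    console.log(findURLs(s));
});
// [ 'http://example.com/page' ]       trailing period is dropped, as intended
// [ 'http://example.com/wiki/URL_' ]  cut off at the opening parenthesis
// []                                  the IPv6 URL is not detected at all
```

The first case works, but the other two show how quickly a hand-written pattern runs into the problems listed above.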

I've looked at a ton of libraries, and there are a few worth using despite some downsides:

Libraries that I've disqualified quickly for this task:

If you insist on a regular expression, the most comprehensive is the URL regexp from Component, though it will falsely detect some non-existent two-letter TLDs by looking at it.
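
As for the literal question of why only the first match is replaced: String.prototype.replace with a regular expression that lacks the g (global) flag substitutes only the first match, so exec is not needed. A minimal sketch, assuming you keep the pattern from the question as-is (this is not the Component regexp, and the function name below is hypothetical):

```javascript
// Same pattern as in the question; the only change is the added g flag,
// which makes replace() substitute every match instead of just the first.
function replaceAllURLsWithHTMLLinks(text) {
    var exp = /((https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gi;
    return text.replace(exp, "<a href='$1'>$1</a>");
}

console.log(replaceAllURLsWithHTMLLinks("Visit http://a.com and https://b.org/x?y=1."));
// Visit <a href='http://a.com'>http://a.com</a> and <a href='https://b.org/x?y=1'>https://b.org/x?y=1</a>.
```

This fixes the first-match-only behavior, but the pattern keeps all of the parsing limitations described above.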
