Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


jquery - JavaScript Regex to match a URL in a field of text

How can I set up my regex to test whether a URL is contained in a block of text in JavaScript? I can't quite figure out the pattern to use to accomplish this.

 var urlpattern = new RegExp( "(http|ftp|https)://[\w-_]+(\.[\w-_]+)+([\w-\.,@?^=%&:/~\+#]*[\w-\@?^=%&/~\+#])?" );

 var txtfield = $('#msg').val() /*this is a textarea*/

 if ( urlpattern.test(txtfield) ){
        //do something about it
 }

EDIT:

The pattern I have now works in regex testers for what I need it to do, but Chrome throws an error:

  "Invalid regular expression: /(http|ftp|https)://[w-_]+(.[w-_]+)+([w-.,@?^=%&:/~+#]*[w-@?^=%&/~+#])?/: Range out of order in character class"

for the following code:

var urlexp = new RegExp( '(http|ftp|https)://[\w-_]+(\.[\w-_]+)+([\w-\.,@?^=%&:/~\+#]*[\w-\@?^=%&/~\+#])?' );


1 Reply


Though escaping the dash characters (which can have a special meaning as character range specifiers when inside a character class) should work, one other method for taking away their special meaning is putting them at the beginning or the end of the class definition.
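To make the dash placement point concrete, here is a small sketch. The `compiles` helper and the variable names are made up for this illustration, and `[w-_]` is the class that appears in the error message from the question. It checks which placements of the dash the engine accepts.

```javascript
// Illustrative sketch; `compiles` is a made-up helper, not from the answer.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    return false; // RegExp throws a SyntaxError for an invalid pattern
  }
}

const midDash = compiles("[w-_]");       // false: "w" (119) to "_" (95) is a reversed range
const endDash = compiles("[w_-]");       // true: a trailing dash is a literal dash
const startDash = compiles("[-w_]");     // true: a leading dash is a literal dash
const escapedDash = compiles("[w\\-_]"); // true: "\\-" in a string is the regex escape \-

// With the dash last, the class accepts word characters and the dash itself.
const hostPart = /^[\w-]+$/;
const matchesHost = hostPart.test("sharepoint-test-server"); // true
```

Putting the dash last, as in `[\w-]`, avoids the escape altogether, which is the style the recommended pattern uses.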

In addition, \+ and \@ in a character class are indeed interpreted as + and @ respectively by the JavaScript engine; however, the escapes are not necessary and may confuse someone trying to interpret the regex visually.
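A quick way to convince yourself that the escapes are redundant is to compare a class written both ways. This is only a sketch with made-up sample inputs, and it assumes the default (non-unicode) regex mode, where an escaped punctuation character inside a class is simply that character.

```javascript
// Sketch only: compares a class with redundant escapes to one without.
const escaped = /^[\+\@]+$/;
const plain = /^[+@]+$/;

// Sample inputs chosen for this illustration, including a lone backslash.
const inputs = ["+", "@", "+@+", "\\", "a+", ""];

// True when both patterns accept and reject exactly the same inputs.
const sameResults = inputs.every((s) => escaped.test(s) === plain.test(s));
```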

I would recommend the following regex for your purposes:

(http|ftp|https)://[\w-]+(\.[\w-]+)+([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?

This can be specified in JavaScript either by passing it into the RegExp constructor (like you did in your example), doubling each backslash so that the string literal does not swallow it:

var urlPattern = new RegExp("(http|ftp|https)://[\\w-]+(\\.[\\w-]+)+([\\w.,@?^=%&:/~+#-]*[\\w@?^=%&/~+#-])?");

or by directly specifying a regex literal, using the // quoting method (forward slashes in the pattern are then escaped):

var urlPattern = /(http|ftp|https):\/\/[\w-]+(\.[\w-]+)+([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?/;

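The RegExp constructor is necessary if you accept a regex as a string (from user input or an AJAX call, for instance), and might be more readable (as it is in this case) because the forward slashes need no escaping, although every backslash has to be doubled inside a string literal. The // literal form is compiled when the script is loaded rather than each time the constructor runs, so it is usually the more efficient choice for a fixed pattern, and it is at certain times more readable. Both work.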
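The two forms can be checked against each other. The sketch below assumes the recommended pattern from this answer and uses made-up sample strings; `fromString`, `fromLiteral` and `singleBackslash` are names chosen for the example. It also shows what happens when the backslashes in the string form are not doubled.

```javascript
// Constructor form: every regex backslash is written twice in the string.
const fromString = new RegExp(
  "(http|ftp|https)://[\\w-]+(\\.[\\w-]+)+([\\w.,@?^=%&:/~+#-]*[\\w@?^=%&/~+#-])?"
);

// Literal form: backslashes are written once, forward slashes are escaped.
const fromLiteral =
  /(http|ftp|https):\/\/[\w-]+(\.[\w-]+)+([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?/;

// Hypothetical sample inputs.
const samples = [
  "see http://example.com/page?x=1 for details",
  "no link here",
  "ftp://files.example.org",
];

// True when both forms give the same answer for every sample.
const agree = samples.every((s) => fromString.test(s) === fromLiteral.test(s));

// With single backslashes the string becomes "(http|ftp|https)://[w-]+(.[w-]+)+",
// a different pattern whose host part only accepts "w" and "-".
const singleBackslash = new RegExp("(http|ftp|https)://[\w-]+(\.[\w-]+)+");
const stillMatches = singleBackslash.test("http://example.com"); // false
```

Neither pattern uses the g flag, so calling test repeatedly on the same object carries no lastIndex state between calls.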

I tested your original and this modification in Chrome, both on JSFiddle and on RegexLib.com, using the client-side regex engine (browser) and specifically selecting JavaScript. While the first one fails with the error you stated, my suggested modification succeeds. If I remove the h from the http in the source, it fails to match, as it should!
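A check along those lines can be reproduced outside the browser. This is a rough sketch: `containsUrl` and `firstUrl` are hypothetical helpers, the sample sentences are invented, and a plain string stands in for the value that the question reads from the textarea.

```javascript
const urlPattern =
  /(http|ftp|https):\/\/[\w-]+(\.[\w-]+)+([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?/;

// Hypothetical helper: does the text contain something that looks like a URL?
function containsUrl(text) {
  return urlPattern.test(text);
}

// Hypothetical helper: return the first URL-like substring, or null.
function firstUrl(text) {
  const match = text.match(urlPattern);
  return match ? match[0] : null;
}

const withUrl = containsUrl("Docs are at http://example.com/docs.");  // true
const withoutH = containsUrl("Docs are at ttp://example.com/docs.");  // false
// The final class excludes "." and ",", so the sentence-ending period is left out.
const extracted = firstUrl("Docs are at http://example.com/docs.");   // "http://example.com/docs"
```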

Edit

As noted by @noa in the comments, the expression above will not match local network (non-internet) servers or any other server addressed by a single word (e.g. http://localhost/... or https://sharepoint-test-server/...). If matching this type of URL is desired (which it may or may not be), the following might be more appropriate:

(http|ftp|https)://[\w-]+(\.[\w-]+)*([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?
#------changed----here-------------^
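To see the effect of changing that one quantifier, the two variants can be run side by side. This is a sketch under assumptions: `strict` and `relaxed` are names invented here, and the paths after the host names are made up.

```javascript
// "+" after the dotted group: at least one ".something" is required.
const strict =
  /(http|ftp|https):\/\/[\w-]+(\.[\w-]+)+([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?/;

// "*" after the dotted group: a single-word host is accepted too.
const relaxed =
  /(http|ftp|https):\/\/[\w-]+(\.[\w-]+)*([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?/;

// Hypothetical sample URLs.
const urls = [
  "http://localhost/app",
  "https://sharepoint-test-server/sites",
  "http://example.com/",
];

// One row per URL with the verdict of each variant.
const results = urls.map((url) => ({
  url,
  strict: strict.test(url),
  relaxed: relaxed.test(url),
}));
// strict:  false, false, true
// relaxed: true,  true,  true
```

The trade-off is that the relaxed variant also accepts very short strings such as http://x, so which one fits depends on how much noise the text is likely to contain.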

<End Edit>

Finally, an excellent resource that taught me 90% of what I know about regex is Regular-Expressions.info - I highly recommend it if you want to learn regex (both what it can do and what it can't)!

