Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

php - YouTube ID parsing for new URL formats

This question has been asked before and I found this:

Reg exp for youtube link

but I'm looking for something slightly different.

I need to match the YouTube ID itself, in a way that works with every possible YouTube link format, not only the ones that begin with youtube.com.

For example:

http://www.youtube.com/watch?v=-wtIMTCHWuI

http://www.youtube.com/v/-wtIMTCHWuI?version=3&autohide=1

http://youtu.be/-wtIMTCHWuI

http://www.youtube.com/oembed?url=http%3A//www.youtube.com/watch?v%3D-wtIMTCHWuI&format=json

http://s.ytimg.com/yt/favicon-wtIMTCHWuI.ico

http://i2.ytimg.com/vi/-wtIMTCHWuI/hqdefault.jpg

Is there a clever strategy I can use to match the video ID -wtIMTCHWuI across all of these formats? I'm thinking of character counting and matching the = ? / . & characters.


1 Reply


I had to deal with this for a PHP class I wrote a few weeks ago and ended up with a regex that matches all of these kinds of strings: with or without a URL scheme, with or without a subdomain, youtube.com URLs, youtu.be URLs, and any ordering of the query parameters. You can check it out at GitHub or simply copy and paste the code block below:

/**
 *  Check if input string is a valid YouTube URL
 *  and try to extract the YouTube Video ID from it.
 *  @author  Stephan Schmitz <eyecatchup@gmail.com>
 *  @param   $url   string   The string that shall be checked.
 *  @return  mixed           Returns YouTube Video ID, or (boolean) false.
 */
function parse_yturl($url)
{
    $pattern = '#^(?:https?://|//)?(?:www\.|m\.)?(?:youtu\.be/|youtube\.com/(?:embed/|v/|watch\?v=|watch\?.+&v=))([\w-]{11})(?![\w-])#';
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}

Test cases: https://3v4l.org/GEDT0
JavaScript version: https://stackoverflow.com/a/10315969/624466
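
As a quick check, the function can be run against the URLs from the question. The sketch below repeats the function so that it runs on its own. The sample list and the loop are illustrative additions, not part of the original answer. Because the pattern is anchored to a youtube.com or youtu.be host, the thumbnail URL on ytimg.com is rejected.

```php
<?php
// Same function as in the answer, with the regex escapes written out.
function parse_yturl($url)
{
    $pattern = '#^(?:https?://|//)?(?:www\.|m\.)?(?:youtu\.be/|youtube\.com/(?:embed/|v/|watch\?v=|watch\?.+&v=))([\w-]{11})(?![\w-])#';
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}

// Sample inputs taken from the question.
$urls = [
    'http://www.youtube.com/watch?v=-wtIMTCHWuI',                 // string(11) "-wtIMTCHWuI"
    'http://www.youtube.com/v/-wtIMTCHWuI?version=3&autohide=1',  // string(11) "-wtIMTCHWuI"
    'http://youtu.be/-wtIMTCHWuI',                                // string(11) "-wtIMTCHWuI"
    'http://i2.ytimg.com/vi/-wtIMTCHWuI/hqdefault.jpg',           // bool(false): host is not covered
];

foreach ($urls as $url) {
    var_dump(parse_yturl($url));
}
```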

To explain the regex, here's a split-up version:

/**
 *  Check if input string is a valid YouTube URL
 *  and try to extract the YouTube Video ID from it.
 *  @author  Stephan Schmitz <eyecatchup@gmail.com>
 *  @param   $url   string   The string that shall be checked.
 *  @return  mixed           Returns YouTube Video ID, or (boolean) false.
 */
function parse_yturl($url)
{
    $pattern = '#^(?:https?://|//)?' # Optional URL scheme. Either http, or https, or protocol-relative.
             . '(?:www\.|m\.)?'      #  Optional www or m subdomain.
             . '(?:'                 #  Group host alternatives:
             .   'youtu\.be/'        #    Either youtu.be,
             .   '|youtube\.com/'    #    or youtube.com
             .     '(?:'             #    Group path alternatives:
             .       'embed/'        #      Either /embed/,
             .       '|v/'           #      or /v/,
             .       '|watch\?v='    #      or /watch?v=,
             .       '|watch\?.+&v=' #      or /watch?other_param&v=
             .     ')'               #    End path alternatives.
             . ')'                   #  End host alternatives.
             . '([\w-]{11})'         # 11 characters (Length of Youtube video ids).
             . '(?![\w-])#';         # Rejects if overlong id.
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}
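
The pattern above is anchored to youtube.com and youtu.be, so it does not cover the oEmbed and thumbnail URLs listed in the question. One possible way to handle those, sketched here under assumptions, is to decode the string first and then search anywhere in it for an ID that follows a known marker. The helper name find_yt_id and the marker list are hypothetical and not part of the original answer, and the looser matching can produce false positives that the anchored pattern avoids.

```php
<?php
// Hypothetical helper, not part of the original answer: a looser,
// unanchored search for an 11-character ID after a known marker.
function find_yt_id($input)
{
    // Decode percent-escapes so that "v%3D" in an oEmbed URL becomes "v=".
    $decoded = rawurldecode($input);

    // Only consider strings that mention a YouTube-related host (assumed list).
    if (!preg_match('#(?:youtube\.com|youtu\.be|ytimg\.com)/#i', $decoded)) {
        return false;
    }

    // Markers: youtu.be/, /embed/, /v/ or /vi/, and a v= query parameter.
    $pattern = '#(?:youtu\.be/|/embed/|/vi?/|[?&]v=)([\w-]{11})(?![\w-])#';
    return preg_match($pattern, $decoded, $m) ? $m[1] : false;
}

// oEmbed URL with a percent-encoded inner URL.
var_dump(find_yt_id('http://www.youtube.com/oembed?url=http%3A//www.youtube.com/watch?v%3D-wtIMTCHWuI&format=json'));
// Thumbnail URL on ytimg.com.
var_dump(find_yt_id('http://i2.ytimg.com/vi/-wtIMTCHWuI/hqdefault.jpg'));
// Favicon URL: no marker precedes the ID, so this sketch returns false.
var_dump(find_yt_id('http://s.ytimg.com/yt/favicon-wtIMTCHWuI.ico'));
```

The favicon URL from the question stays unmatched because nothing in it marks where an ID would begin. Matching it would need a rule specific to that file name.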
