Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - Google Sheets Formula for Extracting Domain From Website?

REGEXEXTRACT(A1:A,"(?m)http(?:s?)://.*?([^./]+?.[^.]+?)(?:/|$)")

I'm trying to extract the domain from a website URL.

The formula above works when the link looks like this: https://walmart.com/careers

However, it doesn't work when the cell already holds a bare domain (walmart.com) or when the URL has no scheme (www.walmart.com/careers).

Is there a more thorough formula that handles these edge cases?


1 Reply

Try:

=ARRAYFORMULA(INDEX(SPLIT(REGEXREPLACE(A8:A12, "https?://www\.|https?://|www\.", ), "/"),,1))
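
If it helps to check the logic outside Sheets, here is a rough Python equivalent of that formula: strip the scheme / www prefix, then keep whatever comes before the first "/". This is only a sketch. The name host_from_url and the sample URLs are made up for illustration, and the pattern assumes the dot after www is escaped so it matches a literal dot.

```python
import re

# Same alternation as the REGEXREPLACE pattern, with the dot after "www" escaped.
PREFIX = re.compile(r"https?://www\.|https?://|www\.")

def host_from_url(url: str) -> str:
    """Remove the scheme / www prefix, then keep the text before the first '/'."""
    stripped = PREFIX.sub("", url)   # plays the role of REGEXREPLACE(..., "")
    return stripped.split("/")[0]    # plays the role of INDEX(SPLIT(..., "/"),,1)

for sample in ("https://walmart.com/careers", "walmart.com", "www.walmart.com/careers"):
    print(sample, "->", host_from_url(sample))
```

All three inputs from the question come out as walmart.com with this approach.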

UPDATE 1:

=ARRAYFORMULA(IFNA(REGEXEXTRACT(INDEX(SPLIT(
 REGEXREPLACE(A8:A14, "https?://www\.|https?://|www\.", ), "/"),,1), 
 "\.(.+\..+)"), INDEX(SPLIT(
 REGEXREPLACE(A8:A14, "https?://www\.|https?://|www\.", ), "/"),,1)))
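
The extra step in UPDATE 1 can be sketched in Python like this: take the host as before, try to drop everything up to the first dot when at least two dots are present, and fall back to the plain host otherwise (the IFNA branch). The function name domain_without_subdomain is hypothetical, and the patterns assume the dots are escaped as literal dots.

```python
import re

PREFIX = re.compile(r"https?://www\.|https?://|www\.")
# Matches only when the host has at least two dots; group 1 is the part after the first dot.
SUBDOMAIN = re.compile(r"\.(.+\..+)")

def domain_without_subdomain(url: str) -> str:
    """Host with a leading subdomain dropped, or the host itself if there is none."""
    host = PREFIX.sub("", url).split("/")[0]
    match = SUBDOMAIN.search(host)
    # No match corresponds to REGEXEXTRACT returning #N/A, which IFNA replaces with the host.
    return match.group(1) if match else host

print(domain_without_subdomain("https://careers.walmart.com/jobs"))
print(domain_without_subdomain("walmart.com"))
```

One caveat that follows from the pattern itself: a host with a multi-part suffix such as walmart.co.uk has two dots, so this rule would reduce it to co.uk. It is best suited to data without such suffixes.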

UPDATE 2:

=INDEX(IFERROR(REGEXEXTRACT(A1:A, "^(?:https?://)?(?:ftp://)?(?:www\.)?([^/]+)")))
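
UPDATE 2 does everything in a single anchored pattern: optional scheme, optional www prefix, then capture up to the first "/". A small Python sketch of the same pattern makes it easy to try against the edge cases from the question. The function name extract_host is made up for illustration, the dot after www is assumed to be escaped, and returning an empty string on no match is meant to mimic IFERROR leaving the cell blank.

```python
import re

# Optional http(s)://, optional ftp://, optional www., then everything up to the first "/".
PATTERN = re.compile(r"^(?:https?://)?(?:ftp://)?(?:www\.)?([^/]+)")

def extract_host(url: str) -> str:
    """Return the captured host, or an empty string when nothing matches."""
    match = PATTERN.search(url)
    return match.group(1) if match else ""

for sample in ("https://walmart.com/careers", "walmart.com", "www.walmart.com/careers"):
    print(sample, "->", extract_host(sample))
```

Because every prefix group is optional, bare domains and scheme-less URLs go through the same pattern as full links.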
