Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


linux - xpath html combine columns

I'm trying to extract the IP and port columns from the table on socks-proxy.net.

I'm using these commands on Linux to get the IP and port. How can I combine them?

wget -q -O - "https://socks-proxy.net" | xmllint --html --xpath '//table[@id="proxylisttable"]//tr//td[1]//text()' - 2>/dev/null

Output:

103.254.12.3393.12.55.94192:12:44:11 

That gets all the IPs from the website table, but they are run together with no separator, which is not usable.

wget -q -O - "https://socks-proxy.net" | xmllint --html --xpath '//table[@id="proxylisttable"]//tr//td[2]//text()' - 2>/dev/null

That gets all the ports.

Output:

108025951082

The ports are run together too, which is not usable either.

Question: how can I combine them to get output like this?

103.254.12.33:1080
93.12.55.94:2595
192:12:44:11:1082

and so on...


1 Reply


A bit late, but seeing that you're using 4(!) different tools to accomplish something so simple, I just had to jump in and show you another amazing XML parser, Xidel, which can do it all by itself:

xidel -s https://pastebin.com/raw/F14VRNBc -e '//table[@id="proxylisttable"]/tbody/tr/concat("my",td[5],"://",td[1],":",td[2])'
mySocks4://103.254.126.130:1080
mySocks5://192.228.194.87:25950
mySocks5://173.162.95.122:62168
mySocks4://183.166.22.194:1080
mySocks5://70.44.216.252:40656
[...]
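
If you'd rather stay with standard tools, the combining step on its own can be sketched with paste. This is only a rough sketch: it assumes you already have the IPs and the ports as two files with one value per line, which the xmllint output shown in the question does not give you as-is. The helper name combine_columns and the temp file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: join two one-value-per-line lists into "IP:port" lines.
# combine_columns is a hypothetical helper name, not part of any tool above.
combine_columns() {
    # $1 = file with one IP per line, $2 = file with one port per line
    # paste -d: glues line N of both files together with ":" in between.
    paste -d: "$1" "$2"
}

# Demo using the first two sample values from the question.
tmpdir=$(mktemp -d)
printf '%s\n' 103.254.12.33 93.12.55.94 > "$tmpdir/ips"
printf '%s\n' 1080 2595 > "$tmpdir/ports"
combine_columns "$tmpdir/ips" "$tmpdir/ports"
rm -rf "$tmpdir"
```

The demo prints 103.254.12.33:1080 and 93.12.55.94:2595, one per line. With Xidel itself, reducing the concat() in the command above to concat(td[1],":",td[2]) should give the plain IP:port format asked for in the question.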
