Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


html - How to get the text inside an element in a WebBrowser document, ignoring its attributes (VB.NET)?

How can I get the text inside an element in a WebBrowser document, regardless of any attributes on the tag, in VB.NET?

example1:

<h1>text here</h1>

example2:

<h1 name="anything">text here</h1>

How can I get "text here" in both cases?

Thanks. :)


1 Reply


You could either 1) use the WebBrowser's built-in methods to get the very first <h1> tag or iterate through all of them, or 2) use a Regex.

Using the built-in methods

Iterating through all tags is simple: call the HtmlDocument.GetElementsByTagName() method, which returns every matching element regardless of its attributes.

Getting the first tag found (in document order):

Dim h1Text As String = WebBrowser1.Document.GetElementsByTagName("h1")(0).InnerText
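The one-liner above throws if no page has been loaded yet or if the page contains no <h1> element. A minimal guarded sketch follows; the module name H1Helpers and the function GetFirstH1Text are hypothetical names chosen for illustration, not part of the WebBrowser API.

```vbnet
Imports System.Windows.Forms

Module H1Helpers
    ' Hypothetical helper: returns the text of the first <h1>,
    ' or Nothing when no document or no <h1> is available.
    Public Function GetFirstH1Text(browser As WebBrowser) As String
        ' Document is Nothing until a page has been loaded.
        If browser.Document Is Nothing Then Return Nothing

        Dim headings As HtmlElementCollection = browser.Document.GetElementsByTagName("h1")

        ' Indexing an empty collection would throw, so check the count first.
        If headings.Count = 0 Then Return Nothing

        Return headings(0).InnerText
    End Function
End Module
```

Usage would then be Dim h1Text As String = GetFirstH1Text(WebBrowser1), followed by a check for Nothing.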

Iterating through all tags:

Dim h1Strings As New List(Of String)

For Each h1Tag As HtmlElement In WebBrowser1.Document.GetElementsByTagName("h1")
    h1Strings.Add(h1Tag.InnerText)
Next
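Both snippets above assume the page has finished loading. One way to make sure of that is to read the document inside the DocumentCompleted event handler. The sketch below is a standalone form written without the designer; the names H1DemoForm and Program, the inline test HTML, and showing the result in the window title are assumptions made for the example.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms

Public Class H1DemoForm
    Inherits Form

    Private WithEvents WebBrowser1 As New WebBrowser()
    Private ReadOnly h1Strings As New List(Of String)

    Public Sub New()
        WebBrowser1.Dock = DockStyle.Fill
        Controls.Add(WebBrowser1)
        ' Load the two examples from the question as the page content.
        WebBrowser1.DocumentText = "<h1>text here</h1><h1 name=""anything"">text here</h1>"
    End Sub

    ' Raised when a document has finished loading (it can fire more than once on pages with frames).
    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        h1Strings.Clear()
        For Each h1Tag As HtmlElement In WebBrowser1.Document.GetElementsByTagName("h1")
            h1Strings.Add(h1Tag.InnerText)
        Next
        ' Show the collected heading texts in the window title.
        Text = String.Join(" | ", h1Strings)
    End Sub
End Class

Module Program
    ' The WebBrowser control requires a single-threaded apartment.
    <STAThread>
    Sub Main()
        Application.Run(New H1DemoForm())
    End Sub
End Module
```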

Using a Regex

Using a Regex is straightforward once you understand the pattern. To start with, put this Imports statement at the very top of your code file:

Imports System.Text.RegularExpressions

Now you just have to search the WebBrowser's DocumentText for the <h1> tag. Regex.Match() returns only the first match, and its Value is an empty string if nothing matches.

Dim h1Text As String = Regex.Match(WebBrowser1.DocumentText, "(?<=<h1[^<>/]*>)((?!</h1>).)*(?=</h1>)", RegexOptions.IgnoreCase).Value
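To collect every <h1> rather than only the first, the same pattern can be passed to Regex.Matches(). The sketch below runs against a plain string instead of WebBrowser1.DocumentText so it can be tried in a console project; the module name H1RegexDemo and the function ExtractH1Texts are hypothetical. RegexOptions.Singleline is added here as an assumption that heading text may span several lines, since "." does not match a newline without it.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module H1RegexDemo
    ' Same pattern as in the answer above.
    Private Const H1Pattern As String = "(?<=<h1[^<>/]*>)((?!</h1>).)*(?=</h1>)"

    ' Hypothetical helper: returns the inner text of every <h1> found in the HTML string.
    Function ExtractH1Texts(html As String) As List(Of String)
        Dim results As New List(Of String)
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        For Each m As Match In Regex.Matches(html, H1Pattern, options)
            results.Add(m.Value)
        Next
        Return results
    End Function

    Sub Main()
        ' The two examples from the question.
        Dim html As String = "<h1>text here</h1>" & vbCrLf & "<h1 name=""anything"">text here</h1>"
        For Each headingText As String In ExtractH1Texts(html)
            Console.WriteLine(headingText) ' prints "text here" twice
        Next
    End Sub
End Module
```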

The Regex pattern explained:

(?<=<h1[^<>/]*>)((?!</h1>).)*(?=</h1>)

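(?<= ...): Positive lookbehind. The matched text must be preceded by whatever ... is, but that text is not included in the match.

<h1[^<>/]*>: Match the <h1> opening tag with any attributes.

[^<>/]*: Match any number of characters that are not <, > or /. Because / is excluded, an opening tag whose attribute value contains a slash (a URL, for example) will not match.

((?!</h1>).)*: Match any character, one at a time, as long as the closing </h1> tag does not start at that position. This keeps the match from running past the first </h1>.

(?=</h1>): Positive lookahead. The match must be followed by a </h1> tag, which is not included in the match.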
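Since [^<>/]* excludes the slash, the lookaround pattern finds nothing when an attribute value contains one. A possible alternative, which is not the pattern from the answer above, is to match the whole element and read the inner text from a capture group. The sketch below compares the two on a sample string; the module name H1CaptureDemo and the data-url attribute are made up for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module H1CaptureDemo
    Sub Main()
        ' Sample heading whose attribute value contains a slash.
        Dim html As String = "<h1 data-url=""a/b"">text here</h1>"

        ' Pattern from the answer: the lookbehind cannot cross the "/" in the attribute.
        Dim lookaround As String = Regex.Match(html, "(?<=<h1[^<>/]*>)((?!</h1>).)*(?=</h1>)", RegexOptions.IgnoreCase).Value

        ' Alternative: match the whole element and capture the inner text lazily in group 1.
        Dim m As Match = Regex.Match(html, "<h1\b[^>]*>(.*?)</h1>", RegexOptions.IgnoreCase Or RegexOptions.Singleline)

        Console.WriteLine("lookaround: [" & lookaround & "]")     ' lookaround: []
        Console.WriteLine("capture:    [" & m.Groups(1).Value & "]") ' capture:    [text here]
    End Sub
End Module
```

Neither pattern copes with nested markup or unusual HTML, so for anything beyond simple pages the GetElementsByTagName() approach is the safer choice.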

