Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Web scraping using Excel and VBA

I wrote the VBA code below in an Excel workbook, but it does not scrape the data and I don't know why. The only result it gives me is "Click here to read more". I want to scrape the entire data set, such as first name, last name, state, zip code, and so on. Can anyone help?

Sub extractTablesData()
    Dim IE As Object, obj As Object
    Dim myState As String
    Dim r As Integer, c As Integer, t As Integer
    Dim elemCollection As Object

    Set IE = CreateObject("InternetExplorer.Application")

    myState = InputBox("Enter the city where you wish to work")

    With IE

        .Visible = True
        .navigate ("http://www.funeralhomes.com/go/listing/Search?  name=&city=&state=&country=USA&zip=&radius=")

        While IE.readyState <> 4
            DoEvents
        Wend

        For Each obj In IE.document.all.item("state").Options
            If obj.innerText = myState Then
                obj.Selected = True
            End If
        Next obj

        IE.document.getElementsByValue("Search").item.Click

        Do While IE.Busy: DoEvents: Loop

        ThisWorkbook.Sheets("Sheet1").Range("A1:K1500").ClearContents

        Set elemCollection = IE.document.getElementsByTagName("TABLE")

        For t = 0 To (elemCollection.Length - 1)

            For r = 0 To (elemCollection(t).Rows.Length - 1)
                For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
                    ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
                Next c
            Next r
        Next t

    End With
    Set IE = Nothing
End Sub

1 Reply


Using the same URL as the answer already given, you could alternatively use CSS selectors to pick out the elements of interest, then use Split to extract just the name and address parts from the text. We can also do away with the browser altogether and request the page directly over HTTP, which returns the first results page faster.


Business name:

You can get the name with the following selector (using the paid listings as an example):

div.paid-listing .listing-title

This selects every element with class listing-title inside a div with class paid-listing.

[Image: sample view of the CSS query]


Address info:

The associated descriptive information can be retrieved with the selector:

div.paid-listing .address-summary

Splitting that text on line feeds and taking the first piece then leaves just the address information.
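
The split step can be sketched on its own, as below. This is only an illustration: FirstLine and DemoFirstLine are hypothetical names, and the sample string is made up rather than taken from the site. It assumes the address summary puts the address on its first line, separated from the rest by a line feed.

```vba
Option Explicit

' Hypothetical helper: returns the first line of a multi-line string.
' Splits on line feeds (Chr$(10), i.e. vbLf) and strips any stray carriage return.
Public Function FirstLine(ByVal s As String) As String
    ' Split on an empty string returns an empty array, so guard before indexing
    If Len(s) = 0 Then Exit Function
    FirstLine = Trim$(Replace(Split(s, vbLf)(0), vbCr, vbNullString))
End Function

Public Sub DemoFirstLine()
    ' Made-up sample text standing in for an address-summary innerText
    Dim sample As String
    sample = "123 Example St, New York, NY 10001" & vbLf & "Click here to read more"
    Debug.Print FirstLine(sample)   ' prints the text before the first line feed
End Sub
```

Guarding against an empty string matters because indexing element 0 of an empty Split result raises a "Subscript out of range" error.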


Code:

Option Explicit

' Requires a reference to "Microsoft HTML Object Library" (VBE > Tools > References)
Public Sub GetTitleAndAddress()
    Dim oHtml As HTMLDocument, nodeList1 As Object, nodeList2 As Object, i As Long
    Const URL As String = "http://www.funeralhomes.com/go/listing/ShowListing/USA/New%20York/New%20York"
    Set oHtml = New HTMLDocument

    ' Request the page directly instead of automating a browser
    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        .Open "GET", URL, False   ' False = synchronous request
        .send
        oHtml.body.innerHTML = .responseText
    End With

    ' Business names and address summaries of the paid listings
    Set nodeList1 = oHtml.querySelectorAll("div.paid-listing .listing-title")
    Set nodeList2 = oHtml.querySelectorAll("div.paid-listing .address-summary")

    With Worksheets("Sheet3")
        .UsedRange.ClearContents
        For i = 0 To nodeList1.Length - 1
            .Range("A" & i + 1) = nodeList1.Item(i).innerText
            ' Keep only the first line of the address summary
            .Range("B" & i + 1) = Split(nodeList2.Item(i).innerText, Chr$(10))(0)
        Next i
    End With
End Sub
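
The request in the code above parses whatever the server sends back, even an error page. One possible guard is sketched below: FetchHtml is a hypothetical helper, not part of the original answer, and the error number is an arbitrary choice. It uses the Status and StatusText properties of the WinHttpRequest object to stop early when the response is not 200.

```vba
Option Explicit

' Hypothetical helper: fetches a page and returns its HTML,
' raising an error when the server does not answer with HTTP 200.
Public Function FetchHtml(ByVal pageUrl As String) As String
    With CreateObject("WinHttp.WinHttpRequest.5.1")
        .Open "GET", pageUrl, False   ' False = synchronous request
        .send
        If .Status <> 200 Then
            ' vbObjectError + 513 is an arbitrary user-defined error number
            Err.Raise vbObjectError + 513, "FetchHtml", _
                      "Request failed: " & .Status & " " & .StatusText
        End If
        FetchHtml = .responseText
    End With
End Function
```

With this in place, the assignment inside the With block could become oHtml.body.innerHTML = FetchHtml(URL), so a failed request surfaces as a runtime error instead of an empty worksheet.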

Example output:

[Image: example output]

