Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
427 views
in Technique[技术] by (71.8m points)

webforms - Automate website log-in and form filling?

I'm trying to log in to a website and save an HTML page automatically (I want to be able to do this on a regular time interval). From the surface, this is a typical modern website where, if the user navigates directly to a "locked" URL, a log-in form pops up, and after logging in, the user is redirected to the intended page.

I gave mechanize a shot (http://wwwsearch.sourceforge.net/mechanize/) but it wasn't finding some form elements which were needed for login (hidden elements that have some values put in by a javascript function that runs when the user clicks the "log in" button).

I played a bit with the "web browser" control in .NET but quickly lost interest because I couldn't even get it to submit a query on the Google page.

I don't care what the language is; I'll learn it to solve this problem. At a minimum it has to work in Windows.

A simple example, say, typing in a query into the Google search box would be a great bonus.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

In my experience, the most reliable way is to use javascript. It works well in .Net. To test, browse to the following addresses one after another in Firefox or Internet Explorer:

http://www.google.com
javascript:function f(){document.forms[0]['q'].value='stackoverflow';}f();
javascript:document.forms[0].submit()

That performs a search for "stackoverflow" on Google. To do it in VB .Net using the webbrowser control, do this:

WebBrowser1.Navigate("http://www.google.com")
Do While WebBrowser1.IsBusy OrElse WebBrowser1.ReadyState <> WebBrowserReadyState.Complete
    Threading.Thread.Sleep(1000)
    Application.DoEvents()
Loop
WebBrowser1.Navigate("javascript:function%20f(){document.forms[0]['q'].value='stackoverflow';}f();")
Threading.Thread.Sleep(2000) 'wait for javascript to run
WebBrowser1.Navigate("javascript:document.forms[0].submit()")
Threading.Thread.Sleep(2000) 'wait for javascript to run

Notice how the space in the URL is converted to %20. I'm not certain if this is necessary but it can't hurt. It is important that the first javascript be in a function. The calls to Sleep() are to wait for Google to load and also for the javascript stuff. The Do While Loop might run forever if the page fails to load so for automation purposes have a counter that will timeout after, say, 60 seconds.

Of course, for Google you can just navigate directly to www.google.com?q=stackoverflow but if your site has hidden input fields, etc, then this is the way to go. Only works for HTML sites - flash is a whole other matter.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...