Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

java - Getting Captcha image using jsoup

I am trying to develop a small Java GUI application that fetches the captcha from my academics URL, asks the user for a username, password and captcha, and displays the content after login. However, I am stuck at the login page itself: after submitting the form, the response from the server is

alert('Please enter correct code.'); window.history.go(-1);

Code

public Map cookies;
public void downloadCaptcha()throws Exception {
Connection.Response response = Jsoup.connect("https://academics.ddn.upes.ac.in/upes/")
.timeout(300000)
.userAgent("Mozilla/5.0")
.method(Connection.Method.GET).execute();
cookies = response.cookies();
Connection.Response resultImageResponse = Jsoup.connect("https://academics.ddn.upes.ac.in/upes/modules/create_image.php")
.cookies(cookies)
.ignoreContentType(true)
.method(Connection.Method.GET).timeout(30000).execute();
FileOutputStream out = new FileOutputStream(new java.io.File("F:\\abc.jpg"));
out.write(resultImageResponse.bodyAsBytes()); 
out.close();
System.out.println("Captcha Fetched");

}

After downloading the captcha:

public static void getData(String captacha)throws Exception{
Connection.Response response = Jsoup.connect("https://academics.ddn.upes.ac.in/upes/index.php")
.userAgent("Mozilla/5.0")
.cookies(cookies)
.data("username",username)
.data("passwd",password)
.data("txtCaptcha",captacha)
.data("submit","Login")
.data("option","login")
.data("op2","login")
.data("lang","english")
.data("return","https://academics.ddn.upes.ac.in/upes/index.php?option=com_content&task=view&id=53&Itemid=6420")
.data("message","0")
.data("j1643f05a0c7fc7910424fb3fc4fbbb6f","1")
.timeout(0)
.method(Connection.Method.POST)
.execute();
cookies = response.cookies();
System.out.println(response.cookies());
Document doc= response.parse();
FileWriter fr = new FileWriter("F:\\response.html");
PrintWriter pw= new PrintWriter(fr);
pw.println(doc.toString());
pw.close();
fr.close();
}

response.cookies() gives the output {PHPSESSID=ai0r017bmb55gv0m4ikeu6jfc6, 61c78a27855d239ae8682ff6befaa989=5ae2e5baf548bc293c943d3416e7d400}

The website is https://academics.ddn.upes.ac.in/upes/index.php

Please point out my mistakes.


1 Reply

You need two changes for your code to work:

1 - Pick up the cookies returned by the second call (the image download) and add them to the cookies from the first call.

2 - The field "j1643f05a0c7fc7910424fb3fc4fbbb6f" looks very suspicious, and in fact that field is variable, so you need to read the hidden inputs from the form and send them back.

3 (extra) - It is not the case here, but some servers complain if you don't send certain headers, such as Accept, Accept-Encoding, Accept-Language ...
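
To make the first point concrete, here is a minimal sketch of the cookie merge using only java.util maps. The class name CookieJarSketch, the merge helper and the cookie values are made up for illustration and are not part of jsoup; in the real code the two maps would come from response.cookies() on each call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of jsoup): keeps cookies from several responses together. */
class CookieJarSketch {

    /** Returns a new map with the earlier cookies plus the later ones; on a name clash the later value wins. */
    static Map<String, String> merge(Map<String, String> earlier, Map<String, String> later) {
        Map<String, String> merged = new LinkedHashMap<>(earlier);
        merged.putAll(later);
        return merged;
    }

    public static void main(String[] args) {
        // Made-up values standing in for what the two GET requests return.
        Map<String, String> fromLoginPage = new LinkedHashMap<>();
        fromLoginPage.put("PHPSESSID", "session-from-first-call");

        Map<String, String> fromCaptchaImage = new LinkedHashMap<>();
        fromCaptchaImage.put("captchaCookie", "set-by-image-call");

        // Sending only fromLoginPage with the POST would drop the captcha cookie.
        System.out.println(merge(fromLoginPage, fromCaptchaImage));
    }
}
```

Assigning the second response's cookies over the first would lose the session cookie, and ignoring them would lose whatever the image call set; merging keeps both.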

When I run your code with those changes, I get:

<script>alert('Incorrect username or password. Please try again.'); window.history.go(-1);</script> 

Of course, I don't have a username and password; with valid credentials I think you'll get the desired page.

The code with the necessary changes is:
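
Both failures shown in this thread arrive as a script alert in the response body, so a GUI application may want to pull that message out and show it to the user. The following is only a sketch under that assumption: AlertSniffer and alertMessage are hypothetical names, and the regular expression handles just the single-quoted alert('...') form seen here.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds the message of the first alert('...') call in a response body. */
class AlertSniffer {

    // Matches alert('...') with a single-quoted message; other quoting styles are not handled.
    private static final Pattern ALERT = Pattern.compile("alert\\('([^']*)'\\)");

    /** Returns the alert text if the body contains one, otherwise an empty Optional. */
    static Optional<String> alertMessage(String body) {
        Matcher m = ALERT.matcher(body);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String body = "<script>alert('Please enter correct code.'); window.history.go(-1);</script>";
        // Prints: Please enter correct code.
        System.out.println(alertMessage(body).orElse("no alert found"));
    }
}
```

An empty result does not prove that the login succeeded; it only means no alert of this shape was found in the body.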

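import java.io.BufferedReader;
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.util.HashMap;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SO_28619161 {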


    public Map<String, String> cookies;
    private String username = "u";
    private String password = "p";

    public HashMap<String,String> downloadCaptcha()throws Exception {
        Connection.Response response = Jsoup.connect("https://academics.ddn.upes.ac.in/upes/")
                .timeout(300000)
                .userAgent("Mozilla/5.0")
                .method(Connection.Method.GET).execute();

        //keep the session cookies from the first request
        cookies = response.cookies();

        //now load the form's input fields (including the hidden ones)
        Document doc = response.parse();
        Elements fields = doc.select("form input");
        HashMap<String,String> formFields = new HashMap<String, String>();
        for (Element field : fields ){
            formFields.put(field.attr("name"), field.attr("value"));
        }

        Connection.Response resultImageResponse = Jsoup.connect("https://academics.ddn.upes.ac.in/upes/modules/create_image.php")
                .cookies(cookies)
                .ignoreContentType(true)
                .method(Connection.Method.GET).timeout(30000).execute();

        //we will need these cookies also!
        cookies.putAll(resultImageResponse.cookies());

        FileOutputStream out = (new FileOutputStream(new java.io.File("abc.jpg")));
        out.write(resultImageResponse.bodyAsBytes()); 
        out.close();

        System.out.println("Captcha Fetched");

        return formFields;
    }

    public void getData(HashMap<String, String> formFields) throws Exception{
        Connection conn = Jsoup.connect("https://academics.ddn.upes.ac.in/upes/index.php")
                .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0")
                //not necessary, but these extra headers won't hurt
                .header("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
                .header("Accept-Encoding", "gzip, deflate")
                .header("Accept-Language", "es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3")
                .header("Host", "academics.ddn.upes.ac.in")
                .header("Referer", "https://academics.ddn.upes.ac.in/upes/index.php")
                .cookies(cookies)
                .timeout(0)
                .method(Connection.Method.POST);

        //we send the fields
        conn.data(formFields);

        Response response = conn.execute();
        cookies = response.cookies();
        System.out.println(response.cookies());
        Document doc= response.parse();
        FileWriter fr = new FileWriter("response.html");
        PrintWriter pw= new PrintWriter(fr);
        pw.println(doc.toString());
        System.out.println(doc.toString());
        pw.close();
        fr.close();
    }

    private void run() throws Exception {
        HashMap<String, String> formFields = downloadCaptcha();

        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String captcha = br.readLine();

        //we set user/pass and captcha
        formFields.put("username", username);
        formFields.put("passwd", password);
        formFields.put("txtCaptcha", captcha);

        getData(formFields);
    }

    public static void main(String[] args) throws Exception {
        SO_28619161 main = new SO_28619161();
        main.run();
    }

}
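
One detail the loop over "form input" does not guard against: jsoup's attr returns an empty string when an attribute is missing, so an input without a name would be stored under an empty key. Whether the login form contains such an input is not known, so treat the following as an optional sketch. FormFieldsSketch, buildPostData and the sample field "exampleToken" are hypothetical, and plain maps stand in for the scraped fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: assembles the POST data from scraped form fields plus user input. */
class FormFieldsSketch {

    /**
     * Copies the scraped name/value pairs, skips entries whose name is null or empty,
     * then puts the user-supplied values on top so they replace the blank form defaults.
     */
    static Map<String, String> buildPostData(Map<String, String> scraped, Map<String, String> userInput) {
        Map<String, String> data = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : scraped.entrySet()) {
            if (e.getKey() != null && !e.getKey().isEmpty()) {
                data.put(e.getKey(), e.getValue());
            }
        }
        data.putAll(userInput);
        return data;
    }

    public static void main(String[] args) {
        // Made-up stand-in for what the "form input" loop collects.
        Map<String, String> scraped = new LinkedHashMap<>();
        scraped.put("username", "");
        scraped.put("passwd", "");
        scraped.put("txtCaptcha", "");
        scraped.put("exampleToken", "1"); // placeholder for the variable hidden field
        scraped.put("", "unnamed input");  // what an input without a name attribute would produce

        Map<String, String> userInput = new LinkedHashMap<>();
        userInput.put("username", "u");
        userInput.put("passwd", "p");
        userInput.put("txtCaptcha", "typed-by-user");

        System.out.println(buildPostData(scraped, userInput));
    }
}
```

The hidden token passes through untouched, which is the point of scraping the form instead of hard-coding the field name.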
