
Monday, September 7, 2009

Use Python mechanize library to simulate a browser


Mechanize is a library for programmatic web browsing in Python. Some basic features are:
  • HTML form filling and submitting
  • Link parsing and following
  • Managing browser history (.back() and .reload() methods)
  • Modifying and viewing HTTP headers
  • Handling cookies
  • Downloading files
  • Setting proxies
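As a rough illustration of a few of these features, here is a minimal sketch, not a definitive recipe. The helper names configure_browser and list_links, the example user-agent string, and the proxy address in the comment are made up for illustration; the mechanize calls used are Browser(), addheaders, set_proxies(), open(), links(), follow_link() and back(). mechanize is imported inside list_links so the configuration helper can also be tried with a stand-in object.

```python
def configure_browser(br, user_agent, proxies=None):
    """Apply a custom User-agent and optional proxies to a Browser-like object."""
    # Replace the default request headers with a custom User-agent.
    br.addheaders = [('User-agent', user_agent)]
    # Route traffic through proxies if given,
    # e.g. {'http': 'proxy.example.com:3128'} (hypothetical address).
    if proxies:
        br.set_proxies(proxies)
    return br


def list_links(url):
    """Open url, follow its first link, step back, and return all link URLs."""
    import mechanize  # imported here so configure_browser has no hard dependency

    br = configure_browser(mechanize.Browser(), 'Mozilla/5.0 (example)')
    br.open(url)
    links = list(br.links())
    if links:
        br.follow_link(links[0])  # link following
        br.back()                 # browser history
    return [link.url for link in links]
```

Calling list_links with the address of an HTML page should return the URLs of the links found on it; it needs network access and an installed mechanize.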

The examples from here were very helpful. I was working on programming challenge 10 from Security Override. The task is to write a script that scans 100 subdirectories for 3 given passwords, then formulates an answer and submits it in a form. The script below uses mechanize to log in to the site, submit the requests, and compute the answer.


import mechanize

# Login to site
url = 'http://securityoverride.com/login.php'
userinfo = {'username' : 'pennypecker', 'password' : 'abracadabra'}
br = mechanize.Browser()
br.addheaders = [('User-agent', 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10')]
br.open(url)
br.select_form(name = "loginform")
br["user_name"] = userinfo['username']
br["user_pass"] = userinfo['password']
br.form.action = url
response = br.submit()

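# Base URL of the challenge page, ending in a slash. The original post does not
# show this value, so fill it in before running.
challenge_url = ''

answer = ''

# Request each of the 100 subdirectories and collect every non-"fail" response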
for i in range(1, 101):
    response = br.open(challenge_url + 'moo/' + str(i) +'/index.php')
    content = response.read()
    print str(i) + ' = ' + content
    if 'fail' != content:
        answer = answer + str(i) + ':' + content + '; '

# Drop the trailing '; ' separator
answer = answer[:-2]
print answer

# Open index and fill submit form with answer
br.open(challenge_url)
br.select_form(name = "submitform")
br["string"] = answer
# challenge_url already ends in a slash (see the loop above), so no extra '/'
br.form.action = challenge_url + 'index.php'
print br.form
response = br.submit()
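
The answer-building step of the script above can also be pulled out into a small function that needs no network access, which makes it easy to check on its own. This is only a sketch: the name build_answer and the sample data are made up for illustration, and it assumes, like the script, that a page without a password returns exactly 'fail'.

```python
def build_answer(results):
    """Join (directory number, page content) pairs into 'n:content; n:content'.

    Pages whose content is exactly 'fail' are skipped, as in the script above.
    """
    parts = []
    for number, content in results:
        if content != 'fail':
            parts.append(str(number) + ':' + content)
    # join() puts '; ' only between entries, so nothing has to be trimmed.
    return '; '.join(parts)


# Made-up sample data, not real challenge output.
sample = [(1, 'fail'), (2, 'alpha'), (3, 'fail'), (4, 'beta')]
print(build_answer(sample))  # prints 2:alpha; 4:beta
```

Collecting the pieces in a list and joining them avoids the trailing separator that the loop in the script has to cut off afterwards.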
