A simple Python HTTP client
A simple HTTP client I had lying around that I wrote a long time ago. It supports cookies, redirects and stuff:
```python
#!/usr/bin/env python
#
# Http
#
# A simple HTTP client that supports persistent cookies
#

import cookielib
import httplib
#httplib.HTTPConnection.debuglevel = 1
import urllib
import urllib2

class Http:
    def __init__(self, redirect_callback = None):
        self.redirect_callback = redirect_callback
        self.cookie_jar = cookielib.CookieJar()
        self.opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(self.cookie_jar))

        urllib2.install_opener(self.opener)

    def get(self, url, headers = None):
        # urllib2.Request expects a dict of headers, so fall back to an empty one
        request = urllib2.Request(url, headers = headers or {})
        return self.execute_request(request)

    def post(self, url, headers = None, parameters = None):
        data = None
        if parameters is not None:
            data = urllib.urlencode(parameters)

        request = urllib2.Request(url, data, headers or {})
        return self.execute_request(request)

    def execute_request(self, request):
        response = self.opener.open(request)
        # Check for redirect, maybe better way to do this
        if response.geturl() != request.get_full_url():
            if self.redirect_callback is None:
                # String exceptions no longer work in Python 2.6+, so raise a real exception
                raise Exception("Redirected to '" + response.geturl() + "' but no redirect callback defined")
            else:
                self.redirect_callback(response)

        return response
```
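The listing above is Python 2 code (cookielib, urllib2). If you are on Python 3, a rough equivalent might look like the sketch below, which swaps in http.cookiejar and urllib.request. The names RedirectError and build_request are my own additions for illustration and are not part of the original client.

```python
import http.cookiejar
import urllib.parse
import urllib.request


class RedirectError(Exception):
    """Hypothetical exception: redirected, but no callback was supplied."""


class Http:
    def __init__(self, redirect_callback=None):
        self.redirect_callback = redirect_callback
        # The jar keeps cookies in memory for the lifetime of this object
        self.cookie_jar = http.cookiejar.CookieJar()
        self.opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(self.cookie_jar))

    def build_request(self, url, headers=None, parameters=None):
        # Form parameters must be URL-encoded and sent as bytes in Python 3
        data = None
        if parameters is not None:
            data = urllib.parse.urlencode(parameters).encode("ascii")
        return urllib.request.Request(url, data=data, headers=headers or {})

    def get(self, url, headers=None):
        return self.execute_request(self.build_request(url, headers))

    def post(self, url, headers=None, parameters=None):
        return self.execute_request(
            self.build_request(url, headers, parameters))

    def execute_request(self, request):
        response = self.opener.open(request)
        # Same redirect check as the original: compare final and requested URL
        if response.geturl() != request.get_full_url():
            if self.redirect_callback is None:
                raise RedirectError(
                    "Redirected to '%s' but no redirect callback defined"
                    % response.geturl())
            self.redirect_callback(response)
        return response
```

Building the request in a separate method makes it possible to inspect what would be sent without touching the network.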
How to install and use the mysql-python library
First download mysql-python from http://sourceforge.net/projects/mysql-python.
Extract it and run:
```
python setup.py build
sudo python setup.py install
```
If you get this error, you need to install the python-dev package:
```
In file included from _mysql.c:29:
pymemcompat.h:10:20: error: Python.h: No such file or directory
_mysql.c:30:26: error: structmember.h: No such file or directory
In file included from /usr/include/mysql/mysql.h:44,
                 from _mysql.c:40:
.
.
.
_mysql.c:2808: warning: return type defaults to 'int'
_mysql.c: In function 'DL_EXPORT':
_mysql.c:2808: error: expected declaration specifiers before 'init_mysql'
_mysql.c:2886: error: expected '{' at end of input
error: command 'gcc' failed with exit status 1
```
Installing the python-dev package on Debian is done with apt-get or synaptic:
```
apt-get install python-dev
```
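If you want to check whether the headers are actually in place before rebuilding, something like the following sketch may help. It assumes you run it with the same interpreter you will build against; header_path is a hypothetical helper name introduced here.

```python
import os
import sysconfig


def header_path():
    # Directory where this interpreter expects its C headers to live
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


if __name__ == "__main__":
    path = header_path()
    if os.path.exists(path):
        print("Found %s" % path)
    else:
        print("Missing %s; install the Python development package" % path)
```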
Installing the library should now work:
```
python setup.py build
python setup.py install
```
Next, test the library in the Python console:
```python
import MySQLdb

# Note that this example uses UTF-8 encoding
conn = MySQLdb.connect(host='localhost', user='...', passwd='...', db='...', charset="utf8", use_unicode=True)
cursor = conn.cursor()


cursor.execute("SELECT * FROM cities")
rows = cursor.fetchall()

for row in rows:
    print "%s, %s" % (row[0], row[1].encode('utf-8'))

print "Number of rows returned: %d" % cursor.rowcount
```
Don’t forget to close the cursor and the connection. If you’re inserting data, commit before closing, because autocommit is disabled by default:
```python
cursor.close()
conn.commit()
conn.close()
```
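To see why the commit matters, here is a small sketch of the same close/commit pattern. Note that it deliberately uses the standard-library sqlite3 module instead of MySQLdb so that it runs without a MySQL server; both follow the Python DB-API, but this is an illustration of the pattern, not MySQLdb code. The cities table layout and the helper names insert_cities and count_cities are assumptions made up for the example.

```python
import os
import sqlite3
import tempfile


def insert_cities(db_path, names, commit=True):
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.cursor()
        cursor.execute("CREATE TABLE IF NOT EXISTS cities (name TEXT)")
        conn.commit()  # save the table itself before inserting rows
        cursor.executemany("INSERT INTO cities (name) VALUES (?)",
                           [(name,) for name in names])
        cursor.close()
        if commit:
            # Without this, the inserted rows are discarded on close
            conn.commit()
    finally:
        conn.close()


def count_cities(db_path):
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM cities").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "cities.db")
    insert_cities(path, ["Oslo", "Lima"], commit=False)
    print("rows after close without commit: %d" % count_cities(path))
    insert_cities(path, ["Oslo", "Lima"], commit=True)
    print("rows after commit and close: %d" % count_cities(path))
```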
For more information about MySQLdb see this article.
How to parse an RSS or Atom feed with Python and the Universal Feed Parser library
This example uses the Universal Feed Parser, one of the best and fastest parsers for Python.
Feed Parser is a lot faster than feed_tools for Ruby and it’s about as fast as the ROME Java library according to my simple benchmark.
Feed Parser uses less memory and about as much CPU as ROME, but I didn’t test this with a long-running process, so don’t take my word for it.
```python
import time
import feedparser

start = time.time()

feeds = [
    'http://..',
    'http://'
]

for url in feeds:
    options = {
        'agent'   : '..',
        'etag'    : '..',
        'modified': feedparser._parse_date('Sat, 29 Oct 1994 19:43:31 GMT'),
        'referrer' : '..'
    }

    feed = feedparser.parse(url, **options)

    print len(feed.entries)
    print feed.feed.title.encode('utf-8')

end = time.time()

print 'fetch took %0.3f s' % (end-start)
```
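The etag and modified options are what make the fetch a conditional GET: they end up as If-None-Match and If-Modified-Since request headers, so a server can answer 304 Not Modified instead of resending the whole feed. The sketch below is not feedparser code; it only illustrates, with the Python 3 standard library, roughly what such a request looks like. conditional_request is a hypothetical helper and the example.com URL is a placeholder.

```python
import calendar
import email.utils
import urllib.request


def conditional_request(url, etag=None, modified=None,
                        agent=None, referrer=None):
    headers = {}
    if agent:
        headers["User-Agent"] = agent
    if referrer:
        headers["Referer"] = referrer
    if etag:
        # Ask the server to skip the body if the entity tag still matches
        headers["If-None-Match"] = etag
    if modified is not None:
        # modified is a UTC time tuple; format it as an HTTP date
        timestamp = calendar.timegm(modified)
        headers["If-Modified-Since"] = email.utils.formatdate(
            timestamp, usegmt=True)
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    last_seen = email.utils.parsedate("Sat, 29 Oct 1994 19:43:31 GMT")
    request = conditional_request("http://example.com/feed",
                                  etag='"abc"', modified=last_seen)
    for name, value in request.header_items():
        print("%s: %s" % (name, value))
```

The request is only built here, never sent, so you can inspect the headers offline.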