python snippets

How to parse an RSS or Atom feed with Python and the Universal Feed Parser library

Tagged universal, feed, parser, atom, rss, python  Languages python

This example uses the Universal Feed Parser, one of the best and fastest parsers for Python.

Feed Parser is a lot faster than feed_tools for Ruby and it's about as fast as the ROME Java library according to my simple benchmark.

Feed Parser uses less memory than ROME and about as much CPU, but I didn't test this with a long-running process, so don't take my word for it.

import time
import feedparser

start = time.time()

feeds = [
    'http://..', 
    'http://'
]

for url in feeds:
  options = {
    'agent'   : '..',
    'etag'    : '..',
    'modified': feedparser._parse_date('Sat, 29 Oct 1994 19:43:31 GMT'),
    'referrer' : '..'
  }

  feed = feedparser.parse(url, **options)

  print len(feed.entries)
  print feed.feed.title.encode('utf-8')

end = time.time()

print 'fetch took %0.3f s' % (end-start)
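
The etag and modified options only pay off when they come from an earlier fetch of the same feed. Here is a minimal sketch of that bookkeeping, written for Python 3. It assumes the earlier result behaves like a dict with optional 'etag' and 'modified' keys; conditional_options is a made-up helper name, not part of feedparser:

```python
def conditional_options(previous, agent=None):
    """Build keyword arguments for the next feedparser.parse() call.

    Only the 'etag' and 'modified' keys of `previous` are read; missing or
    empty values are skipped.
    """
    options = {}
    if agent:
        options['agent'] = agent
    for key in ('etag', 'modified'):
        value = previous.get(key)
        if value:
            options[key] = value
    return options


# Stand-in for an earlier result; a real one would come from feedparser.parse(url)
earlier = {'etag': '"abc123"', 'modified': 'Sat, 29 Oct 1994 19:43:31 GMT'}
options = conditional_options(earlier, agent='ExampleReader/1.0')
print(options)
```

With feedparser installed, the next fetch would be feedparser.parse(url, **options). As far as I know, a server that supports conditional requests then answers with status 304 and no entries when nothing has changed.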

How to install and use the mysql-python library

Tagged python, mysql, mysql-python, install  Languages python

First download mysql-python from http://sourceforge.net/projects/mysql-python.

Extract it and run:

python setup.py build
sudo python setup.py install

If you get this error, you need to install the python-dev package:

In file included from _mysql.c:29:
pymemcompat.h:10:20: error: Python.h: No such file or directory
_mysql.c:30:26: error: structmember.h: No such file or directory
In file included from /usr/include/mysql/mysql.h:44,
                 from _mysql.c:40:
.
.
.
_mysql.c:2808: warning: return type defaults to 'int'
_mysql.c: In function 'DL_EXPORT':
_mysql.c:2808: error: expected declaration specifiers before 'init_mysql'
_mysql.c:2886: error: expected '{' at end of input
error: command 'gcc' failed with exit status 1

Installing the python-dev package on Debian is done with apt-get or synaptic:

apt-get install python-dev

Installing the library should now work:

python setup.py build
python setup.py install

Next test the library in the python console:

import MySQLdb

# Note that this example uses UTF-8 encoding
conn = MySQLdb.connect(host='localhost', user='...', passwd='...', db='...', charset = "utf8", use_unicode = True)
cursor = conn.cursor()


cursor.execute ("SELECT * FROM cities")
rows = cursor.fetchall ()

for row in rows:
  print "%s, %s" % (row[0], row[1].encode('utf-8'))

print "Number of rows returned: %d" % cursor.rowcount

Don't forget to close the cursor and connection, and if you're inserting data commit before closing, because autocommit is disabled by default:

cursor.close ()
conn.commit ()
conn.close ()
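
The commit-then-close order is easier to get right with try/finally. The sketch below is written for Python 3 and uses the standard library's sqlite3 module as a stand-in so it runs without a MySQL server. Both modules follow the DB-API 2.0 interface, but sqlite3 uses ? placeholders where MySQLdb uses %s. The city names are made-up sample data:

```python
import sqlite3  # stand-in for MySQLdb; both follow the DB-API 2.0 interface

conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
try:
    cursor.execute("CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT)")
    # Parameterized insert; with MySQLdb the placeholder would be %s instead of ?
    cursor.executemany("INSERT INTO cities (name) VALUES (?)",
                       [("Oslo",), ("Zürich",)])
    # Commit before closing, otherwise the inserted rows are discarded
    conn.commit()
    cursor.execute("SELECT id, name FROM cities ORDER BY id")
    rows = cursor.fetchall()
finally:
    # Runs even when a statement above raises
    cursor.close()
    conn.close()

for row in rows:
    print("%s, %s" % (row[0], row[1]))
```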

For more information about MySQLdb see this article.

A simple Python HTTP client

Tagged python, http, client  Languages python

A simple HTTP client I wrote a long time ago and had lying around. It supports cookies, redirects and stuff:

#!/usr/bin/env python
#
#     Http
#
#     A simple HTTP client that supports persistent cookies
#

import cookielib
import httplib
#httplib.HTTPConnection.debuglevel = 1
import urllib   # needed for urllib.urlencode in post()
import urllib2

class Http:
  def __init__(self, redirect_callback = None):
    self.redirect_callback = redirect_callback
    self.cookie_jar = cookielib.CookieJar()
    self.opener = urllib2.build_opener(urllib2.HTTPCookieProcessor (self.cookie_jar))

    urllib2.install_opener(self.opener)

  def get(self, url, headers = None):
    # urllib2.Request expects a dict of headers, not None
    request = urllib2.Request(url, headers = headers or {})
    return self.execute_request(request)

  def post(self, url, headers = None, parameters = None):
    data = None
    if parameters is not None:
      data = urllib.urlencode(parameters)

    request = urllib2.Request(url, data, headers or {})
    return self.execute_request(request)

  def execute_request(self, request):
    response = self.opener.open(request)
    # Check for redirect, maybe better way to do this
    if response.geturl() != request.get_full_url():
      if self.redirect_callback is None:
        # String exceptions are not allowed, so raise a real exception
        raise Exception("Redirected to '" + response.geturl() + "' but no redirect callback defined")
      else:
        self.redirect_callback(response)

    return response
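
The class above targets Python 2 (cookielib, urllib2). Below is a rough Python 3 port, as a sketch only. build_request is a helper I added for the example, and unlike the original it does not call install_opener, so the cookie-aware opener stays local to the instance:

```python
import http.cookiejar
import urllib.parse
import urllib.request


class Http:
    def __init__(self, redirect_callback=None):
        self.redirect_callback = redirect_callback
        self.cookie_jar = http.cookiejar.CookieJar()
        self.opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(self.cookie_jar))

    def build_request(self, url, headers=None, parameters=None):
        # A request with a body is sent as POST, one without as GET
        data = None
        if parameters is not None:
            data = urllib.parse.urlencode(parameters).encode('utf-8')
        return urllib.request.Request(url, data, headers or {})

    def get(self, url, headers=None):
        return self.execute_request(self.build_request(url, headers))

    def post(self, url, headers=None, parameters=None):
        return self.execute_request(self.build_request(url, headers, parameters))

    def execute_request(self, request):
        response = self.opener.open(request)
        # The opener follows redirects itself; compare URLs to notice one
        if response.geturl() != request.get_full_url():
            if self.redirect_callback is None:
                raise RuntimeError("Redirected to '%s' but no redirect callback defined"
                                   % response.geturl())
            self.redirect_callback(response)
        return response


client = Http()
request = client.build_request('http://example.com/', parameters={'a': '1'})
print(request.get_method(), request.data)
```

Nothing is sent over the network here; the last two lines only build a request to show that passing parameters turns it into a POST.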

Fixing "the fastcgi-backend /usr/bin/python2.5 app.py failed to start:"

Tagged python, lighttpd, fastcgi  Languages 

If you're getting an error similar to this one on Ubuntu:

$ 2010-01-07 11:35:20: (log.c.97) server started 
2010-01-07 11:35:20: (mod_fastcgi.c.1051) the fastcgi-backend /usr/bin/python2.5 /var/www/xxx/app.py failed to start: 
2010-01-07 11:35:20: (mod_fastcgi.c.1055) child exited with status 1 /usr/bin/python2.5 /var/www/xxx/app.py 
2010-01-07 11:35:20: (mod_fastcgi.c.1058) If you're trying to run PHP as a FastCGI backend, make sure you're using the FastCGI-enabled version.
You can find out if it is the right one by executing 'php -v' and it should display '(cgi-fcgi)' in the output, NOT '(cgi)' NOR '(cli)'.
For more information, check http://trac.lighttpd.net/trac/wiki/Docs%3AModFastCGI#preparing-php-as-a-fastcgi-programIf this is PHP on Gentoo, add 'fastcgi' to the USE flags. 
2010-01-07 11:35:20: (mod_fastcgi.c.1365) [ERROR]: spawning fcgi failed. 
2010-01-07 11:35:20: (server.c.902) Configuration of plugins failed. Going down.

check the following:

* are the /var/www and /var/www/python-test directories readable by the www-data group? If not: chgrp -R www-data /var/www
* are you specifying the full path to both the python binary and the script?
* have you installed flup? If not: sudo easy_install flup or sudo easy_install-2.5 flup
* can the www-data user run the script? Check with: su - www-data then /usr/bin/python2.5 /var/www/xxx/app.py.
* does your configuration work? This works for me:

"/app.py" => ((
                "bin-environment" => (
                    "REAL_SCRIPT_NAME" => ""
                ),
                "check-local" => "disable",
                "min-procs" => 1,
                "bin-path" => "/usr/bin/python2.5 /var/www/xxx/app.py",
                "socket"   => "/tmp/fastcgi.socket"
        ))
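
Part of the checklist can be automated. This is a small sketch, not a complete diagnosis: preflight is a made-up helper that checks that both paths are absolute and readable and that flup can be imported. It only means something when run as the www-data user with the same interpreter lighttpd starts, so it sticks to syntax that old interpreters should accept too:

```python
import os


def preflight(python_bin, script, module='flup'):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for label, path in (('python binary', python_bin), ('script', script)):
        if not os.path.isabs(path):
            problems.append('%s path is not absolute: %s' % (label, path))
        elif not os.access(path, os.R_OK):
            problems.append('%s is missing or not readable: %s' % (label, path))
    try:
        __import__(module)
    except ImportError:
        problems.append('module is not installed: %s' % module)
    return problems


# Paths taken from the configuration above
for problem in preflight('/usr/bin/python2.5', '/var/www/xxx/app.py'):
    print(problem)
```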

How to use Python's simplejson to read and write JSON data

Tagged simplejson, python, json  Languages python

First you need to install simplejson:

easy_install simplejson

Now you can dump data to JSON:

import simplejson as json

class Something:

    def __init__(self):
        self.test = "test"

    def to_json(self):
        return json.dumps(self.__dict__)

Or if you have complex objects:

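import simplejson as json

# Other is not shown in the original snippet; this is a placeholder definition
class Other:

    def __init__(self, name, value):
        self.name = name
        self.value = value

class Something:

    def __init__(self):
        self.devices = [Other('a', 'b'), Other('a', 'c')]

    def to_json(self):
        return json.dumps([p.__dict__ for p in self.devices])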
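
For nested objects the default argument of dumps saves you from unpacking every level by hand, and loads covers the reading side. This sketch is written for Python 3 and uses the standard library's json module, which exposes the same dumps/loads calls as simplejson. The Other class and its attribute names are placeholders:

```python
import json  # standard library; simplejson exposes the same dumps/loads calls


class Other:
    # Placeholder class; the attribute names are made up for this example
    def __init__(self, name, value):
        self.name = name
        self.value = value


class Something:
    def __init__(self):
        self.devices = [Other('a', 'b'), Other('a', 'c')]

    def to_json(self):
        # default= is called for every object json cannot serialize on its own
        return json.dumps(self, default=lambda obj: obj.__dict__, sort_keys=True)


text = Something().to_json()
print(text)

# Reading: loads() gives back plain dicts and lists, not Something/Other instances
data = json.loads(text)
print(data['devices'][1]['value'])
```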

How to parse XML with Python's built-in ElementTree parser

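Tagged elementtree, python, xml, parse  Languages python

from xml.etree.ElementTree import fromstring, tostring

namespace = 'https://xxx.com/xxx'
# xml is the document to parse, as a string
element = fromstring(xml)

device = element.find('.//{%s}Device' % namespace)
detail = device.find('.//{%s}Details' % namespace)
series = device.findall('.//{%s}Series' % namespace)

Watch out for namespaces: ElementTree matches elements by their fully qualified name, so every tag in a search path needs the {namespace} prefix, otherwise find() returns None.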
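
A self-contained version that shows the namespace pitfall, written for Python 3. The sample document is invented; only the element names and the namespace URL come from the snippet above. It uses a prefix map instead of string formatting, which find() and findall() accept as a second argument:

```python
from xml.etree.ElementTree import fromstring

# Invented sample document using the default namespace from the snippet
xml = """<Root xmlns="https://xxx.com/xxx">
  <Device>
    <Details>first</Details>
    <Series>1</Series>
    <Series>2</Series>
  </Device>
</Root>"""

# Maps a prefix of our choosing to the namespace URL
ns = {'d': 'https://xxx.com/xxx'}

root = fromstring(xml)
device = root.find('.//d:Device', ns)
details = device.find('d:Details', ns)
series = [item.text for item in device.findall('d:Series', ns)]

print(details.text, series)

# Without the namespace the same element is not found
print(root.find('.//Device'))
```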

How to use a Python decorator wrapper to get a reference to the calling class instance

Tagged python, decorator, self  Languages python

def requires_authentication(method):
    """
    Inside wrapper, self points to the SheisseController instance the method
    was called on, not to the decorator function.
    """
    def wrapper(self, *args, **kwargs):
        # response() is assumed to be provided by the surrounding web framework
        if self._requires_authentication and not self._authenticated:
            return response('403 Forbidden or whatever')

        return method(self, *args, **kwargs)
    return wrapper

class SheisseController:
    @requires_authentication
    def index(self):
        pass  # handler body goes here
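
Here is a version that runs on its own, written for Python 3. Controller, its attributes and the returned strings are placeholders rather than anything from a real framework. functools.wraps is added so the decorated method keeps its name and docstring:

```python
import functools


def requires_authentication(method):
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        # self is the instance the decorated method was called on
        if self.requires_auth and not self.authenticated:
            return '403 Forbidden'
        return method(self, *args, **kwargs)
    return wrapper


class Controller:
    def __init__(self, authenticated):
        self.requires_auth = True
        self.authenticated = authenticated

    @requires_authentication
    def index(self):
        return 'index page'


print(Controller(authenticated=False).index())
print(Controller(authenticated=True).index())
```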

How to retrieve information about Python errors in a C extension

Tagged python, pyeval_callobject, pyerr_fetch  Languages python

result = PyEval_CallObject(tmp_callback, args);
// result == NULL means an error occurred
if (PyErr_Occurred()) {
    PyObject* ptype;
    PyObject* pvalue;
    PyObject* ptraceback;
    PyErr_Fetch(&ptype, &pvalue, &ptraceback);
    // ptraceback can be NULL, so check it before reading the line number
    if (ptraceback != NULL) {
        printf("Error occurred on line: %d\n", ((PyTracebackObject*)ptraceback)->tb_lineno);
    }
    // Restore the exception instead of disposing of it. PyErr_Restore takes
    // over the three references, so they must not be DECREF'ed afterwards.
    PyErr_Restore(ptype, pvalue, ptraceback);
    PyErr_Print();
}

via http://www.ragestorm.net/tutorial?id=21
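
For comparison, the three objects PyErr_Fetch returns are the same ones sys.exc_info() gives you in Python. The traceback is a linked list, and as far as I understand it, tb_lineno on the first entry belongs to the outermost frame, so you have to follow tb_next to reach the line that actually raised. A Python 3 sketch with a made-up failing callback:

```python
import sys


def failing_callback():
    raise ValueError('boom')


try:
    failing_callback()
except ValueError:
    exc_type, exc_value, tb = sys.exc_info()
    outer_line = tb.tb_lineno  # line of the call inside the try block
    # Walk to the innermost frame, where the exception was raised
    innermost = tb
    while innermost.tb_next is not None:
        innermost = innermost.tb_next
    inner_line = innermost.tb_lineno  # line of the raise statement
    function_name = innermost.tb_frame.f_code.co_name
    print('%s: %s (line %d, in %s)'
          % (exc_type.__name__, exc_value, inner_line, function_name))
```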

How to generate screenshots on Debian Linux with python-webkit2png

Tagged python, python-webkit2png, webkit2png  Languages bash

Install:

apt-get install python-qt4 libqt4-webkit
git clone git://github.com/adamn/python-webkit2png.git
cd python-webkit2png/
python setup.py install

Test:

./webkit2png.py www.google.com

You might get this error:

webkit2png.py: cannot connect to X server

Fix:

# install xvfb
apt-get install xvfb xbase-clients xfonts-base libgtk2.0-0
# start a virtual X server in the background and point X clients at it
Xvfb :99 -ac &
export DISPLAY=:99

See details here

Take a screenshot:

./webkit2png.py -o obama.png -x 1024 768 "http://obama.com"
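
If you want to take screenshots from a Python script, the same command can be built and run with subprocess. This is a sketch for Python 3 that assumes Xvfb is running on display :99; the two helper functions are made up, and only the -o and -x flags shown above are used:

```python
import os
import subprocess


def screenshot_command(url, output, width=1024, height=768, script='./webkit2png.py'):
    """Build the argument list for one webkit2png.py run."""
    return [script, '-o', output, '-x', str(width), str(height), url]


def xvfb_environment(display=':99'):
    """Copy the current environment and point DISPLAY at the Xvfb server."""
    env = dict(os.environ)
    env['DISPLAY'] = display
    return env


command = screenshot_command('http://obama.com', 'obama.png')
print(' '.join(command))

# To actually run it (needs webkit2png.py and a running Xvfb):
# subprocess.check_call(command, env=xvfb_environment())
```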

On OS X I would recommend using Paparazzi instead.

How to extract the palette from an image with Python

Tagged python, palette, colors, extract, colorific  Languages python

Detect the color palette of an image:

# See https://github.com/99designs/colorific/blob/master/colorific.py
# min_saturation = The minimum saturation needed to keep a color
# min_prominence = The minimum proportion of pixels needed to keep a color
import colorific
palette = colorific.extract_colors('test.jpg', min_prominence=0.1)
colorific.print_colors('test.jpg', palette)

Example

This example will scan a directory for images and create an HTML file showing the images and the detected color palette for each image:

import colorific
import glob

html = open("index.html", "w")

for filename in glob.glob('./images/*'):
    html.write("<div>")
    html.write("<img width=\"150px\" src=\"" + filename + "\">")
    print filename
    palette = colorific.extract_colors(filename)
    print palette
    for color in palette.colors:
        print color
        hex_value = colorific.rgb_to_hex(color.value)
        html.write("""
            <div style="background: {color}; width: 500px; height: 50px; color: white;">
            {prominence}
            </div>
        """.format(color=hex_value, prominence=color.prominence))

    if palette.bgcolor is not None:
        hex_value = colorific.rgb_to_hex(palette.bgcolor.value)
        html.write("""
            <div style="background: {color}; width: 500px; height: 50px; color: white;">
            {prominence}
            </div>
        """.format(color=hex_value, prominence=palette.bgcolor.prominence))

    # Close the per-image wrapper once, after all swatches are written
    html.write("</div>")

html.close()

Issues

Note, on OSX I had to edit colorific.py (/Library/Python/2.7/site-packages/colorific-0.2.0-py2.7.egg/colorific.py) slightly to get it to work:

#from PIL import Image as Im
#from PIL import ImageChops, ImageDraw
import Image as Im
import ImageChops, ImageDraw

Before this, I got this error:

ImportError: No module named PIL
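
In your own code, a fallback import avoids having to edit files depending on which layout is installed. This is a generic Python 3 sketch: import_first is a made-up helper, and the module names in the demo are stand-ins so it runs anywhere. For the case above the call would be import_first('PIL.Image', 'Image'):

```python
import importlib


def import_first(*names):
    """Return the first module in `names` that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError('none of these modules could be imported: %s' % ', '.join(names))


# The first name does not exist, so the helper falls back to the second one
module = import_first('no_such_package_xyz.json', 'json')
print(module.__name__)
```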