Archive for the 'Python' Category

Django query set iterator – for really large querysets

When you try to iterate over a query set with about 0.5 million items (a few hundred megs of db storage), the memory usage can become somewhat problematic. Adding .iterator() to your query set helps somewhat, but still loads the entire query result into memory. Cronjobs at YouTellMe.nl were unfortunately starting to fail. My colleague Rick came up with the following fix.

This solution chunks up the querying in bits of 1000 (by default). While this is somewhat heavier on your database (multiple queries), it seriously reduces the memory usage. Curious to hear how other Django developers have worked around this problem.

import gc

def queryset_iterator(queryset, chunksize=1000):
    '''
    Iterate over a Django queryset ordered by the primary key.

    This method loads a maximum of chunksize (default: 1000) rows into
    memory at the same time, while Django would normally load all rows into
    memory. Using the iterator() method only causes it to not preload all the
    model instances.

    Note that this implementation does not support ordered querysets: any
    existing ordering is replaced by ordering on the primary key.
    '''
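    pk = 0  # assumes positive integer primary keys
    # Fetch the highest pk; an empty queryset would raise IndexError on [0].
    last_rows = list(queryset.order_by('-pk')[:1])
    if not last_rows:
        return  # nothing to iterate over
    last_pk = last_rows[0].pk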
    queryset = queryset.order_by('pk')
    while pk < last_pk:
        for row in queryset.filter(pk__gt=pk)[:chunksize]:
            pk = row.pk
            yield row
        gc.collect()

#Some Examples:
#old
MyItem.objects.all()

#better
MyItem.objects.all().iterator()

#even better
queryset_iterator(MyItem.objects.all())
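
The chunking idea itself does not depend on Django. As a rough sketch, here is the same "filter on the last seen primary key, then take the next chunk" loop written against the standard-library sqlite3 module. The table name `item` and the helpers `make_demo_db` and `iter_rows_in_chunks` are made up for illustration and are not part of the snippet above.

```python
import sqlite3


def make_demo_db(row_count):
    """Build an in-memory table named `item` (a made-up stand-in for a model)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO item (id, name) VALUES (?, ?)",
        ((i, "item-%d" % i) for i in range(1, row_count + 1)),
    )
    conn.commit()
    return conn


def iter_rows_in_chunks(conn, chunksize=1000):
    """Yield (id, name) rows in primary-key order, one query per chunk."""
    last_id = 0  # assumes positive integer ids
    while True:
        # Only rows after the last seen id are fetched, at most chunksize of them.
        rows = conn.execute(
            "SELECT id, name FROM item WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunksize),
        ).fetchall()
        if not rows:
            break  # no rows left, so the loop cannot spin forever
        for row in rows:
            yield row
        last_id = rows[-1][0]
```

This variant stops when a chunk comes back empty rather than comparing against a precomputed last pk, so it also terminates if rows are deleted while the loop is running.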

Django snippet here.

Django & Python & Web Development & YouTellMe tschellenbach 03 Mar 2010 6 Comments

YTM launch!!

No more beta for YouTellMe.nl
The website which is taking over the Dutch product comparison market is officially going out of beta @ 8 o'clock.
Party in Amsterdam, Keizersgracht 182 :) Festivities starting right now!

Things are going well, and we are very much looking forward to the international launch.
We’ve changed a lot since the first reviews!

Better pictures coming after the event :P

PS. Thanks to Python and Django for enabling us to beat the competition :)

PPS. Next2News, eduhub, come and join :)

Apache & Business & Css & Django & Dutch & Events & Javascript & PHP & Prototype & Python & Symfony & Web Development & YouTellMe tschellenbach 11 Dec 2009 1 Comment

Django template tags – Google chart – Python 2.4 port

I just tried the Django template tags for the Google charting API by Jacob. Unfortunately they were Python 2.5 only and I happen to still be stuck on 2.4. The changes to move it to 2.4 were minimal though. Still, to save some of you Googlers out there the hassle:

charts

I was just browsing the code a bit. There is something peculiar there:

args, varargs, varkw, defaults = inspect.getargspec(func)

I don’t get the point of using the inspect functionality. Anyone care to explain?
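
For what it's worth, here is a small sketch of what that call gives you: the function's argument names, the names of its `*args` and `**kwargs` catch-alls, and its default values. A template tag helper could plausibly use this to work out which tag arguments are required and which are optional, though whether that is the reason in Jacob's code is only a guess. The example uses `inspect.getfullargspec`, the Python 3 counterpart (`getargspec` was removed in Python 3.11); `chart_tag` and `describe_signature` are made-up names.

```python
import inspect


def chart_tag(data, width=300, height=200, *extra, **options):
    """Hypothetical tag function, invented for this example."""
    return data, width, height


def describe_signature(func):
    """Split a function's positional arguments into required and optional."""
    spec = inspect.getfullargspec(func)
    defaults = spec.defaults or ()
    # Defaults always belong to the last len(defaults) positional arguments.
    split = len(spec.args) - len(defaults)
    return {
        "required": spec.args[:split],
        "optional": dict(zip(spec.args[split:], defaults)),
        "varargs": spec.varargs,  # name of the *args parameter, or None
        "varkw": spec.varkw,      # name of the **kwargs parameter, or None
    }
```

Calling `describe_signature(chart_tag)` reports `data` as the only required argument and `width` and `height` as optional ones with their defaults.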

Django & Python & Web Development tschellenbach 02 Jan 2009 3 Comments