Posted by: dresstosurvive | December 21, 2007

Six Function Calculator in One Line of Python

Usage Instructions

  • Copy
  • Paste
  • Press Enter
  • Type one simple expression (e.g. 2 + 2 or -42 ^ 16) per line
  • Press Ctrl-D (EOF) to evaluate all the expressions

print map((lambda environ:environ[0]['ops'][(environ[0]['re'].match(environ[1]).group('op')or'+')](int((environ[0]['re'].match(environ[1]).group('lvalue')or 0)),int((environ[0]['re'].match(environ[1]).group('rvalue')or 0)))),(environ for environ in(({'ops':{'+':__import__("operator").add,'-':__import__("operator").sub,'*':__import__("operator").mul,'^':pow,'%':__import__("operator").mod,'/':__import__("operator").div},'re':__import__("re").compile(r'^(?P<lvalue>[\+\-]{,1}(\s*)\d+)(\s*)((?P<op>[\+\-\*\^\%/])(\s*)(?P<rvalue>[\+\-]{,1}(\s*)\d+)(\s*)){,1}$')},line)for line in __import__("sys").stdin.readlines())))

Commented Version

# Here, we print out the final list of evaluated expressions
# returned by mapping a lambda over a generator comprehension
print map(
    # This lambda evaluates one line
    (lambda environ:
        # Environ is a tuple containing the tools and the current line
        environ[0]['ops']
            # This gets the right operation to perform
            [(environ[0]['re'].match(environ[1]).group('op') or '+')](
                # Grab our lvalue and our rvalue from the expression
                int((environ[0]['re'].match(environ[1]).group('lvalue') or 0)),
                int((environ[0]['re'].match(environ[1]).group('rvalue') or 0))
            )
    ),
    # This is a generator comprehension returning the tuple
    (environ for environ in (
        # The first item of the tuple is a dict containing useful modules
        # and “variables” such as the regex matcher
        ( {
                # Available operations, stolen from the operator module
                'ops': {
                            '+': __import__("operator").add,
                            '-': __import__("operator").sub,
                            '*': __import__("operator").mul,
                            '^': pow,
                            '%': __import__("operator").mod,
                            '/': __import__("operator").div
                        },
                # The regex matcher to split up an expression into groups
                # Tweaking the last {,1} into {,} or * would allow you to parse
                # more complicated expressions, albeit requiring a change to
                # the lambda
                're': __import__("re").compile(r'^(?P<lvalue>[\+\-]{,1}(\s*)\d+)(\s*)((?P<op>[\+\-\*\^\%/])(\s*)(?P<rvalue>[\+\-]{,1}(\s*)\d+)(\s*)){,1}$')
            },
            # The current line
            line)
        # Grab all lines until the user hits Ctrl-D (EOF)
        for line in __import__("sys").stdin.readlines())
    )
)
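
For comparison, here is a rough multi-line sketch of the same idea, written for Python 3 rather than the Python 2 of the one-liner. The names OPS, EXPRESSION and evaluate are illustrative and not part of the original. Two assumptions are baked in: operator.floordiv stands in for operator.div (which Python 3 no longer has, and which floored for integer operands), and the pattern is simplified so that no whitespace is allowed between a sign and its digits.

```python
import operator
import re

# The same six operations as the one-liner. floordiv is a stand-in for
# Python 2's operator.div, which floors when both operands are integers.
OPS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '^': pow,
    '%': operator.mod,
    '/': operator.floordiv,
}

# lvalue, then an optional operator followed by rvalue.
EXPRESSION = re.compile(
    r'^\s*(?P<lvalue>[+-]?\d+)\s*'
    r'(?:(?P<op>[-+*^%/])\s*(?P<rvalue>[+-]?\d+)\s*)?$'
)


def evaluate(line):
    """Evaluate one simple expression such as '2 + 2' or '-42 ^ 16'."""
    match = EXPRESSION.match(line)
    if match is None:
        raise ValueError("not a simple expression: %r" % line)
    # A bare number is treated as "number + 0", as in the one-liner.
    op = match.group('op') or '+'
    lvalue = int(match.group('lvalue'))
    rvalue = int(match.group('rvalue') or 0)
    return OPS[op](lvalue, rvalue)
```

Unlike the one-liner, this version raises a ValueError on a line it cannot parse instead of failing on a missing match object.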

Posted by: dresstosurvive | December 18, 2007

Juno’s Hand Drawn Opening Titles

The opening titles for Diablo Cody’s Juno are beautifully hand-drawn, courtesy of Shadowplay Studio. They remind me of coloring over a photocopy with a crayon or colored pencil. The accompanying music is also an excellent, upbeat choice. If you’ve read the plot synopsis on IMDb or seen the film, it’s also an ironic touch.

I’ve updated Corey Goldberg’s ystockquote.py module. I originally emailed him the updated code, but I haven’t received a reply. Since it’s LGPL’d, here is the updated version for your perusal.

Corey, if you’re out there, please email me back.

Posted by: dresstosurvive | September 27, 2007

GNU MP Floating Point Extension for Python 3000

I’ve been quietly working on an extension for Python 3000 (current release: 3.0a1) that provides fast, arbitrary-precision floating point via the GNU MP library. You can find out all the details and get the code on the MPF Python page.

Posted by: dresstosurvive | August 24, 2007

Mourning Media Teaser #1

I’ve been slowly putting together a mod team and working on some stuff behind the scenes. Last night, I threw together this little image. If you’re wondering what the game might feel like, just soak in it. Check back later for some audio. ;)

Mourning

Posted by: dresstosurvive | August 20, 2007

The Push versus Pull Model of Software Development

Shell pipelines, reading files, and streaming data; structure your code for maximal efficiency.

Have you ever considered why your code bottlenecks at certain points? It pays to distinguish between data that is pushed to its next destination and data that is pulled. Pulling is a great thing when the bottleneck is the user. Pulling data saves resources when there is little or no work to be done.

However, things change when you need to operate on a large data set or perform multiple transformations in sequence. Pulling data becomes ineffective—the slowest link in the chain becomes a choke point. This can be significantly alleviated by pushing data instead.

Pushing data works best if you can structure your chain of operations to perform the most expensive ones first. By doing this, the later operations are able to execute without starving for data. As each operation completes, it sends the transformed data or a subset of it to the next component.

Care must be taken to avoid placing heavier duty operations later, otherwise you risk flooding a component with data. In the case that a more expensive operation must be performed later, it is best to pull data into that component.

Pulling data entails components completing their work as fast as possible, while also serving requests to dump new data. A component requests more data as needed from the previous component. If there is nothing available, it must sleep and poll again later, or employ a mechanism to be signaled when new data is available. Reading a file is a pull operation, because the component must wait for the drive before the data is supplied.
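
One way to picture the pull model in Python is a chain of generators, where nothing is produced until the final consumer asks for it. This is only a sketch written for Python 3; source, square and the produced list are made-up names used to show how little work gets done.

```python
from itertools import islice

produced = []  # records what the source was actually asked to generate


def source(limit):
    """Source stage: yields a value only when the next stage requests one."""
    for n in range(limit):
        produced.append(n)
        yield n


def square(values):
    """Transform stage: pulls one value at a time from the stage before it."""
    for v in values:
        yield v * v


# Building the chain does no work at all.
pipeline = square(source(1000000))

# The consumer drives everything: it pulls three items and then stops asking.
first_three = list(islice(pipeline, 3))
```

Although the source could supply a million values, only the three that were requested are ever generated.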

Shell pipelines are an example of pushing data. Suppose a user wished to find all of their movies that are stored in AVI containers. They might construct a shell pipeline looking something like the one below. First, the listing of all possible movies would be created. Then, the subset of those stored in AVI containers would be created.

Pushing Data
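
A rough Python analogue of such a pipeline, written for Python 3, has each stage hand its output to the next stage as soon as it is ready. The function names and file names here are invented for illustration.

```python
def list_movies(names, send):
    """Source stage: pushes every candidate file name to the next stage."""
    for name in names:
        send(name)


def only_avi(send):
    """Filter stage: forwards only the names that end in .avi."""
    def stage(name):
        if name.lower().endswith('.avi'):
            send(name)
    return stage


results = []

# Hypothetical file names standing in for a real directory listing.
movies = ['alien.avi', 'brazil.mkv', 'casablanca.AVI', 'notes.txt']

# The source drives the chain; the final stage simply collects what arrives.
list_movies(movies, only_avi(results.append))
```

Here the producer decides when data moves, which is the opposite of a generator chain where the consumer asks for each item.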

This method of structuring data by progressively filtering results is at the heart of good data mining practice. Note that at each stage, you can take the new data set and perform various operations on it. This allows for cheap non-destructive updates provided you maintain a copy of the data set. As each subsequent operation is cheaper, filters can be added and removed at will.

Pushing works well for things like streaming music. You don’t want to starve the final output. If you use a pull model which requests an update every few milliseconds, you might experience skipping or delays.

Pulling works well for things like AJAX interfaces. You don’t want to continuously do work, but you want data whenever the user asks for it. Ideally, on the server side, the code will be structured to do its processing in a push fashion and provide an interface for data to be pulled.

In short, consider whether your operations are driven by user input, physical motion (such as in a disk drive) or whether they are limited by processing power. Also take into account whether the final output must be available on a time-critical or realtime basis. In a multiplayer online game, data becomes irrelevant when it is out of date and must be discarded. Other operations, such as analysing stock market activity, will not allow for data to be discarded at random.

Posted by: dresstosurvive | August 20, 2007

Retrieving GET and POST Variables with Mod_python

Painlessly retrieve information from forms and queries.

If you’re frustrated trying to retrieve the GET and POST variables you’re used to with PHP or some other language, check out get_env.

An important thing to note is that a POST variable of the same name as a GET variable in a request will overwrite the GET value.

Under the hood, get_env uses a few tricks to maximize performance. First, it tries to use the version of parse_qsl provided by Mod_python instead of the standard one provided by cgi, as it is much faster. Next, the GET values are retrieved, but not parsed yet. If the method was not POST, get_env does not attempt to find any POST variables. If it was, the values are appended to the query string containing any possible GET values. Finally, the query string is parsed and transformed into a dictionary using the standard dictionary constructor. There are other ways to build the dictionary, but the constructor is the fastest in this case.

options = {"content_type": "text/plain"}

from mod_python import apache
try:
    from mod_python.util import parse_qsl
except ImportError:
    from cgi import parse_qsl

def get_env(req):
    #   grab GET variables
    req.add_common_vars()
    req.content_type    = options['content_type']
    query               = req.subprocess_env['QUERY_STRING']

    #   grab POST variables
    if req.subprocess_env['REQUEST_METHOD'] == 'POST':
        query += '&' + req.read()

    #   break down the urlencoded query string
    query       = parse_qsl(query)
    http_var    = dict(query)

    return http_var

def handler(req):
    http_env = get_env(req)

    #   lookie ma, we got an environment!
    for key, value in http_env.items():
        req.write("%s: %s\n" % (key, value))

    return apache.OK
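
The overwrite behaviour mentioned above can be seen without a running Apache. The sketch below is for Python 3 and uses urllib.parse.parse_qsl from the standard library as a stand-in for the Mod_python version; merge_vars is a hypothetical helper, not part of get_env.

```python
from urllib.parse import parse_qsl


def merge_vars(query_string, post_body=None):
    """Combine GET and POST data the way get_env does: append, then parse."""
    query = query_string
    if post_body is not None:
        query += '&' + post_body
    # dict() keeps the last value seen for each key, so POST wins over GET.
    return dict(parse_qsl(query))


merged = merge_vars('name=get&page=2', 'name=post&comment=hi')
```

Because the POST body is appended after the GET query string, its value for name is the one that survives in the dictionary.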

Posted by: dresstosurvive | August 18, 2007

An Impossible Distance

Enjoying the ride for its own sake.

Do you often find yourself rushing to finish a book, impatient for the ending?

Slow down and examine the scenery. Mark Z. Danielewski’s fantastic debut, House of Leaves, has a reminder for readers. In the words of Johnny Truant, annotator to The Navidson Record:

“…one sinking ship after another, in fact that was the conclusion to every single story he told, so that we, his strange audience learned not to wonder about the end but paid more attention to the tale preceding the end…”

House of Leaves is blithely unaware of its own labyrinthine character for the majority of the novel; this little passage provides an exception. We, the readers, are reminded that the soul of a book is in the journey it takes us on, not the destination we arrive at.

Posted by: dresstosurvive | August 18, 2007

Creating Dictionaries from Lists of Tuples in Python

Thoughts and Benchmarks.

I tested various ways of building dictionaries from lists of tuples in Python. Three ways to accomplish this are benchmarked in the code below. Not surprisingly, the default dict() constructor is the fastest. More interesting is that the functional version using map and lambda is the slowest by far. The looping version shows decent performance, but nothing special.

On a Pentium 4, 2.6 GHz with 760 MB of RAM under Python 2.5 on Win32 the benchmark gives these results:

$ python prof.py
map_list_to_dict took 46.063ms
loop_list_to_dict took 12.511ms
build_dict_from_list took 11.182ms

In #python @ irc.freenode.net, bpietro54 reported similar numbers on an unidentified Linux box.

$ python prof.py
map_list_to_dict took 62.500ms
loop_list_to_dict took 16.900ms
build_dict_from_list took 14.000ms

The obligatory benchmark chart (smaller bars are faster):

Benchmark Results

Unless you need to add tuples to an existing dictionary from time to time, dict() will be the fastest. In that case, adding them incrementally might be cheaper.

import time

def print_timing(func):
    def wrapper(*arg):
        t1 = time.clock()
        res = func(*arg)
        t2 = time.clock()
        return (t2-t1)*1000.0
    return wrapper

def benchmark(func, list_of_tuples):
    avg_time = [func(list_of_tuples) for i in xrange(100)]
    return sum(avg_time) / len(avg_time)

#   Dataset of 10,000 tuples
list_of_tuples = [("key_%d" % i, i) for i in xrange(10000)]

#   The messy functional approach
@print_timing
def map_list_to_dict(list_of_tuples):
    mapped_dict = {}
    map((lambda mapping: mapped_dict.update({mapping[0]: mapping[1]})), list_of_tuples)
    return mapped_dict

#   The obvious, but overcomplicated way
@print_timing
def loop_list_to_dict(list_of_tuples):
    looped_dict = {}
    for key, value in list_of_tuples:
        looped_dict[key] = value
    return looped_dict

#   Control, the normal way to do it
@print_timing
def build_dict_from_list(list_of_tuples):
    return dict(list_of_tuples)

#   The results! Hallelujah!
print "map_list_to_dict took %0.3fms" % benchmark(map_list_to_dict, list_of_tuples)
print "loop_list_to_dict took %0.3fms" % benchmark(loop_list_to_dict, list_of_tuples)
print "build_dict_from_list took %0.3fms" % benchmark(build_dict_from_list, list_of_tuples)
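
The script above targets Python 2. On Python 3, xrange and the print statement are gone and time.clock has been removed, so a rough equivalent might lean on the timeit module instead. This is a sketch under those assumptions, not the code that produced the numbers above; note that map is lazy on Python 3 and has to be forced with list().

```python
import timeit

# Dataset of 10,000 tuples, as in the original script
list_of_tuples = [("key_%d" % i, i) for i in range(10000)]


def map_list_to_dict(pairs):
    mapped = {}
    # map() is lazy on Python 3, so list() forces the updates to run
    list(map(lambda pair: mapped.update({pair[0]: pair[1]}), pairs))
    return mapped


def loop_list_to_dict(pairs):
    looped = {}
    for key, value in pairs:
        looped[key] = value
    return looped


def build_dict_from_list(pairs):
    return dict(pairs)


def benchmark(func, pairs, runs=100):
    """Average time of one call to func(pairs), in milliseconds."""
    total = timeit.timeit(lambda: func(pairs), number=runs)
    return total / runs * 1000.0


if __name__ == "__main__":
    for func in (map_list_to_dict, loop_list_to_dict, build_dict_from_list):
        print("%s took %0.3fms" % (func.__name__,
                                   benchmark(func, list_of_tuples)))
```

Timings will differ from machine to machine, so treat any output as relative rather than absolute.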

Posted by: dresstosurvive | August 1, 2007

Scrounging for Dinner

And this is what I came up with:

Leftovers galore
