Here are posterous posts filed under python...

Jared says...

We are flying back from Boston after an excellent week at the Architecture Technology Review. This was my first interaction with the Nokia architecture community at large, and I was really pleased (and, I have to admit, somewhat surprised) to see how awesome many of the developments coming down the pipe are. Ville gave a talk on what we have been doing with Disco, and we also gave a demo during one of the 'speed geeking' sessions. One of the most common questions we were asked was, "Why not Hadoop?", so I thought I'd give my opinion on the subject.

Prior to coming to the NRC, I was using Hadoop for about a year and a half (doing bioinformatics), and I must say that it served me quite well. To be sure, there were problems along the way, but Hadoop enabled me to do analyses that I would not otherwise have done. That is not because they would be impossible without Hadoop, but because mapreduce makes it so easy to parallelize a huge class of problems that the overhead of working with big data becomes amazingly small.

Even when using Hadoop, I always used Python (with Hadoop Streaming) to write map/reduce functions, because Python is such a pleasure to write, and because I am much more productive writing Python than Java (or pretty much any other language). Because of my love for Python, I often wondered why no one had yet written a Python implementation of mapreduce, and even considered writing my own. I think it is natural for anyone who thinks about the design of systems to question the validity of architecture decisions and to wonder how those designs might be improved. Of course, actually implementing a new design is a whole other story, and finding the impetus to do so, especially when a reasonably good implementation (with lots of high-profile developers) already exists, is not always easy.
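To give a feel for what map/reduce functions written in Python for Hadoop Streaming tend to look like, here is a minimal word-count sketch. It is not code from my actual analyses: the function names and the sample input are made up for illustration, and the `sorted()` call is only a local stand-in for the sort that Hadoop performs between the map and reduce phases. It is written for Python 3.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated "word<TAB>1" record per word (illustrative name)."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts per word; the records must already be sorted by key."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Local stand-in for the shuffle/sort step between map and reduce.
records = sorted(mapper(["to be or not to be"]))
counts = list(reducer(records))
```

In a real Streaming job the mapper and reducer would be separate scripts that read records from standard input and write them to standard output; the generators above just make the same logic easy to try in one file.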

When I discovered the Disco project, which is part Erlang, part Python, I was deeply intrigued. I questioned the choice of Erlang (not knowing much about it), but Ville's argument was extremely pragmatic: Erlang is really good at distributed stuff (that's what it was built to do), and Python is awesome for high-level programming (i.e. it's fun, easy to read/write, expressive, etc.). But I guess the question remains, why not Hadoop? This question is hard to answer because it is largely a matter of taste. The bottom line is that neither Hadoop nor Disco is really a mature project (Hadoop IS more highly developed than Disco, though), while it seems to me the choice of framework is a long-term question. For me, wanting to use Python to improve the framework itself is a no-brainer (additionally, Jython is currently too far behind CPython for me to consider it a replacement).

Why Disco? Because of its philosophy: massive data - "minimal code". Lightweight is a design goal in Disco, and we really, truly, care about programmer overhead. Framework development should be as agile as possible, if we are trying to optimize programmer productivity. My vision of Disco is a framework that can be shaped to the needs of its users (including myself), by its users. For me, the reality of Hadoop was quite different.

Filed under: architecture, erlang, hadoop, nokia, python

Bascht says...

This is all pretty understandable: it’s easy to define community in terms of what we’re not. A common enemy focuses and drives us. Competition can take a positive form: when it’s friendly and constructive both communities benefit.

Lately, though, I’ve noticed the tone of the arguments in the Django community getting nastier — especially when it comes to Rails. Again, I’m far from innocent in this regard: I’ve certainly done my fair share of Rails-bashing, and I regret it.

Neat article - seeing Rails from a Django person's view.

Filed under: django, python, rails, rants, ruby

amnorvend says...

I've been meaning to blog about this for some time, so I suppose now
is as good an opportunity as any. This is going to be a very "stream
of consciousness"-esque posting, so bear with me. Some of these are
things that have radically changed the way I use emacs. Others are
minor changes that I like. Feel free to pick and choose from them.

I'll assume you're running Ubuntu. Also, not all of these are Python
specific. But I feel that they will be useful to most Python
programmers who use emacs.

I should also note that (like most emacs users), most of these things
are tricks that I've picked up from various sources along the way. If
you wrote something that I put in here, thanks!

Ropemacs

Ropemacs is among the tools I love the most and hate the most. When
it works, it opens up a new world of automated refactorings for you.
But it seems to be a bit buggy at times. That said, setting it up is
really easy. Firstly, you need to have ropemacs installed. This is
pretty easy:

 
sudo apt-get install python-ropemacs 

After that, just a couple of lines of elisp in your .emacs file and
you're good to go!

 
(require 'pymacs) 
(pymacs-load "ropemacs" "rope-") 
(setq ropemacs-enable-autoimport t) 

Anything

Anything is almost like Quicksilver for emacs. To begin, you need to
download anything.el and anything-config. I also use 
anything-match-plugin. Then you just need the following lines of elisp:

 
(require 'anything-config) 
(require 'anything-match-plugin) 
(global-set-key "\C-ca" 'anything) 
(global-set-key "\C-ce" 'anything-for-files) 

Then, prepare to spend a lot less time searching for files!

Line number mode

When you're pair programming, nothing is more helpful than being able
to direct people to a certain line of code. This lets you spend less
time saying "hey, see that over there? It's about 3 lines up. No,
too far! go down another two lines." Installing this is really easy.
You just need linum.el and two lines of elisp:

 
(require 'linum) 
(global-linum-mode 1) 

Flymake through pyflakes

In case you miss the automatic error highlighting of the Visual Studio
world, you should realize that emacs has a similar system built-in.
It's called flymake. And you can make it work with Python as well. I
personally prefer to use pyflakes for this. All that's needed is
pyflakes:

 
sudo apt-get install pyflakes 

...and some elisp:

 
(when (load "flymake" t)
  (defun flymake-pyflakes-init ()
    (let* ((temp-file (flymake-init-create-temp-buffer-copy
                       'flymake-create-temp-inplace))
           (local-file (file-relative-name
                        temp-file
                        (file-name-directory buffer-file-name))))
      (list "pyflakes" (list local-file))))

  (add-to-list 'flymake-allowed-file-name-masks
               '("\\.py\\'" flymake-pyflakes-init)))

Uniquify

How many __init__.py buffers do you have open at this moment? If
you're using emacs for Python programming, probably a lot. This is
where emacs's uniquify functionality is useful. It gives you a more
useful name for your buffers other than just appending a number at the
end. I have mine use the reverse of the directory. For instance, if
I have foo/__init__.py and bar/__init__.py open, they will be named
__init__.py/foo and __init__.py/bar respectively.

You just need this in your .emacs:

 
(require 'uniquify) ; needed on older emacs versions where uniquify is not preloaded
(setq uniquify-buffer-name-style 'reverse)
(setq uniquify-separator "/")
(setq uniquify-after-kill-buffer-p t)    ; rename after killing uniquified
(setq uniquify-ignore-buffers-re "^\\*") ; don't muck with special buffers (or Gnus mail buffers)

python-mode

To be totally honest with you, I haven't used the built-in emacs mode
for python. I just installed python-mode because I was told it was
better. What I can tell you is that there is an occasional plugin
that requires python-mode. Installing it is easy. Just install
python-mode:

 
sudo apt-get install python-mode 

And add some elisp:

 
(autoload 'python-mode "python-mode" "Python Mode." t) 
(add-to-list 'auto-mode-alist '("\\.py\\'" . python-mode)) 
(add-to-list 'interpreter-mode-alist '("python" . python-mode)) 

Pylookup

Pylookup is useful for those moments when you find yourself asking
something like "Is join in os or os.path?" Unfortunately, the setup
can be complex, but well worth it. There are instructions here.
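Until Pylookup is set up, that particular question can also be settled from a Python prompt. This is only a quick sketch using standard-library calls; the variable names and the sample path pieces are made up for illustration.

```python
import os
import os.path

# Ask each module directly whether it has a "join" attribute.
join_in_os = hasattr(os, "join")
join_in_os_path = hasattr(os.path, "join")

# The separator used in the result depends on the platform.
example = os.path.join("pkg", "__init__.py")
```

It works for a one-off, but it does not replace having the documentation one keystroke away.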

     
Click here to download:
7_tools_for_working_with_Pytho.zip (49 KB)

Filed under: emacs, python, ubuntu

Adios says...

Robot Versioning

Robots within the Wave API are versioned. This allows the Wave system to detect when robots have changed and/or their capabilities have been altered. If you modify a robot's capabilities (by adding or removing monitored events, for example), you should also modify the version identifier in the Robot's constructor.

When deploying a robot, the Wave system will check if the robot identifier is different from what it has cached. (The robot identifier is simply a text string.) If so, Wave will refresh the robot and alter the system to generate any new events you've indicated interest in.

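The version in app.yaml is the application's version. GAE remembers the files of each version, so you can easily roll back to an earlier one.

The robot's version parameter, on the other hand, indicates whether the robot's capabilities.xml has changed. If you add some event handlers to the robot, the corresponding capabilities.xml is updated as well, and you have to tell GAE to cache the new one again.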
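The check described above can be pictured with a small toy model. This is not the Wave API and not how Wave or GAE is actually implemented: the class, the method, the robot name, and the event strings are all hypothetical, and the sketch only illustrates the idea "refresh the cached capabilities when the version string changes". It is written for Python 3.

```python
class CapabilityCache:
    """Hypothetical stand-in for the server-side cache; not part of the Wave API."""

    def __init__(self):
        self._cached = {}   # robot name -> (version string, capabilities)
        self.refreshes = 0

    def deploy(self, name, version, capabilities):
        """Return the capabilities the server will use after this deploy."""
        cached = self._cached.get(name)
        if cached is None or cached[0] != version:
            # Version string differs from the cached one: refresh.
            self._cached[name] = (version, list(capabilities))
            self.refreshes += 1
        return self._cached[name][1]


cache = CapabilityCache()
cache.deploy("demo-robot", "1", ["BLIP_SUBMITTED"])
# New handler added, but the version string was left unchanged: nothing refreshes.
stale = cache.deploy("demo-robot", "1", ["BLIP_SUBMITTED", "WAVELET_SELF_ADDED"])
# After bumping the version string, the new capabilities are picked up.
fresh = cache.deploy("demo-robot", "2", ["BLIP_SUBMITTED", "WAVELET_SELF_ADDED"])
```

The second deploy is the trap: the handler list changed, but without a new version string the old capabilities stay in effect.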

Filed under: gae, google, google wave, python

DK says...

The previous post described how I went about cleaning up some yfd data using Python and numpy. I have no doubt it can be done in fewer lines of code, but I think the post described how useful it can be to manipulate arrays rather than looping through everything. With the data cleaned up, I hoped to visualize my newborn son's sleep schedule. I recently received an example that does the same thing as my python code, but in 3 lines! It uses R, ggplot2, and plyr. A few more lines can generate pretty plots like this (box plot of sleep length in hrs vs. start time):


As the plot above shows, my son doesn't sleep a helluva lot during the day. The boxplot also illustrates how volatile his night sleeping has been. This tells me I need to do a better job of getting the boy to nap during the day in hopes of producing longer and more restful sleep periods at night.

While Python has been my gateway drug into the world of programming, I've been itching to try out a plotting package based on R, ggplot2. R is a popular language in the statistics community that has enjoyed some good press recently. Anyway, my little sleep duration project seemed perfect for some R exploration.

After searching around on the Interweb, I managed to write some broken R code that didn't really do what I wanted. Luckily, Hadley Wickham (the author of plyr and ggplot2) took pity on me and offered up some example code to point me in the right direction. I was shocked at the efficiency of the example, particularly given all the wrangling I had to do in python. Now, just for the record, I'm not making any statements about R vs. Python. Hadley obviously created plyr and ggplot2 to make R easier to use, and I imagine the same could be (or already has been) done for python. I just lack the experience and education to know!

Anyway, plyr and ggplot2 are very nice libraries that offer yet more reasons to learn R. Thank you Professor Wickham! Between python and R, I've got to believe one can slice and dice almost anything. If I could only get rpy2 working...

Filed under: life, python, R

Adios says...

4.5 pass Statements

The pass statement does nothing. It can be used when a statement is required syntactically but the program requires no action. For example:

    >>> while True:
    ...       pass # Busy-wait for keyboard interrupt
    ... 

I don't like this style...
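For comparison, here is a sketch of two places where pass is syntactically required but involves no busy loop. The class and function names are placeholders invented for this example.

```python
class Placeholder:
    pass  # a class body is required syntactically; nothing to define yet


def todo():
    pass  # stub to be filled in later


result = todo()  # a function whose body is only pass returns None
```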

Filed under: python

Adios says...

Files named __init__.py are used to mark directories on disk as Python package directories. If you have the files

mydir/spam/__init__.py
mydir/spam/module.py

and mydir is on your path, you can import the code in module.py as:

import spam.module

or

from spam import module
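The layout above can be tried out end to end with a short sketch. It builds the two files in a temporary directory, puts that directory on the path, and runs both import forms. The greet function and its return string are made up for illustration, and the sketch assumes no other module named spam is importable. It is written for Python 3.

```python
import importlib
import os
import sys
import tempfile

# Build mydir/spam/__init__.py and mydir/spam/module.py in a temp directory.
mydir = tempfile.mkdtemp()
os.makedirs(os.path.join(mydir, "spam"))
open(os.path.join(mydir, "spam", "__init__.py"), "w").close()  # empty marker file
with open(os.path.join(mydir, "spam", "module.py"), "w") as f:
    f.write("def greet():\n    return 'hello from spam.module'\n")

# Put mydir on the import path, as the text requires.
sys.path.insert(0, mydir)
importlib.invalidate_caches()

import spam.module
from spam import module

message = module.greet()
```

Both import statements bind the same module object; they differ only in the name you use to reach it.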

Filed under: python

amnorvend says...

There are two schools of thought in the programming world:

  1. Explicit is better than implicit (configuration over convention)
  2. A developer should only have to program the unconventional aspects of a program (convention over configuration).

We'll call #1 the Python school and #2 the Ruby school. In fact, I
would argue that this is an issue that's at the core of whether code
is considered "Pythonic" or "Rubyic" (I doubt the last one is a word).

So which school of thought is right? I personally think they both
are. It doesn't really take a whole lot to demonstrate that the
Python school of thought isn't always right. Think about it. Did you
know that the Python runtime has a component that goes around deleting
objects from memory totally implicitly? How unpythonic is that?
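To see that component at work, here is a small sketch using the standard gc and weakref modules. The Node class is hypothetical, and the explicit gc.collect() call is only there to make the outcome deterministic; normally the collector runs on its own schedule. It assumes CPython and is written for Python 3.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""

    def __init__(self):
        self.other = None


a, b = Node(), Node()
a.other, b.other = b, a      # the two objects reference each other (a cycle)
watcher = weakref.ref(a)     # lets us observe the object without keeping it alive

del a, b                     # drop our names; the cycle alone cannot free itself
gc.collect()                 # force a collector run instead of waiting for one
collected = watcher() is None
```

Nothing in the program ever says "free this memory", yet after the collector runs the objects are gone.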

The Ruby school of thought takes a bit more work though. After all,
if it's unconventional, why should you have to configure it? Of
course, the problem here is in defining "conventional". What's
conventional to me is likely unconventional to others. And what's
conventional to others could be unconventional to me.

I wish I had more advice on how to reconcile these two schools of
thought. The truth is that I struggle with them daily. But I think
having an intuition about this is the dividing line between
"experienced programmer" and "newb". After all, if programming were
merely about "make everything explicit" or "make everything implicit",
any idiot could do it.

I think this is also the core skill for writing readable code. You
need to determine what details are relevant to each piece of code.
Whatever the case, you need to make a conscious decision as to what
details shine through and what details you obscure. Because if these
things happen by accident, they're almost guaranteed to be wrong.

Filed under: programming, python, ruby

chexov says...

    xrange([start,] stop[, step]) -> xrange object

    Like range(), but instead of returning a list, returns an object that
    generates the numbers in the range on demand.  For looping, this is
    slightly faster than range() and more memory efficient.
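A rough way to see the memory difference for yourself. This sketch is written for Python 3, where range() behaves like the xrange described above and list(range(...)) plays the part of the old list-returning range(). The variable names are illustrative, and the exact sizes reported will vary by interpreter build.

```python
import sys

n = 1_000_000
lazy = range(n)           # produces its numbers on demand
eager = list(range(n))    # materializes every number up front

lazy_size = sys.getsizeof(lazy)    # size of the range object itself
eager_size = sys.getsizeof(eager)  # size of the list's storage alone

total = sum(lazy)         # looping over the lazy object still works as usual
```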

Filed under: python

ssk says...

python
Apparently JSON support is built into the standard library as of the 2.6 series, but I'm on 2.5 here, so I'm using simplejson.

#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys, codecs
sys.stdout = codecs.getwriter('utf_8')(sys.stdout)
 
import simplejson
 
data = { 'items':[
  {'name':'iPhone',  'price':50000},
  {'name':'macbook', 'price':100000},
  {'name':"マクド",  'price':100},
]}
text = simplejson.dumps(data)  # encode
copy = simplejson.loads(text)  # decode
 
print "data = " + str(data)
print "text = " + text
print "copy = " + str(copy)
for item in copy['items']:
  print item["name"]
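For anyone on a newer interpreter, the same round trip can be sketched with the standard-library json module, which offers the same dumps/loads calls. This version is written for Python 3; passing ensure_ascii=False is a choice made here so that the Japanese name stays readable in the encoded text, and the names list is added only for illustration.

```python
import json

data = {"items": [
    {"name": "iPhone", "price": 50000},
    {"name": "macbook", "price": 100000},
    {"name": "マクド", "price": 100},
]}

text = json.dumps(data, ensure_ascii=False)  # encode, keeping non-ASCII as-is
copy = json.loads(text)                      # decode

names = [item["name"] for item in copy["items"]]
```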

Filed under: json, python