Through a blog post by John D. Cook on Planet Python I became aware of the Microsoft Sho project for data analysis and scientific computing. I haven't installed it yet, but it looks promising, and I always love to see progress and usage of IronPython on Windows.
Here's the "Computing a Word Histogram" example:
>>> fp = System.IO.File.ReadAllText("./declarationofindependence.txt")
>>> table = System.Collections.Hashtable()
>>> for word in fp.split():
...     if table.ContainsKey(word):
...         table[word] += 1
...     else:
...         table[word] = 1
>>> pairs = zip(list(table.Keys), list(table.Values))
>>> pairs.sort(lambda a,b: a[1]<b[1])
>>> bar([elt[0] for elt in pairs[0:10]], [elt[1] for elt in pairs[0:10]])
I've used a hash table for counting in the past too, but the Counter data structure from the Python standard library (added in 2.7) is much better suited to this kind of task:
>>> from collections import Counter
>>> table = Counter()
>>> table.update(fp.split())
>>> pairs = table.most_common(10)
It's shorter and more readable.
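To try the Counter approach without Sho or IronPython, here is a minimal sketch that should run on plain Python. The inline `text` string is an assumption standing in for the contents of the text file; everything else uses only `collections.Counter`.

```python
from collections import Counter

# Stand-in text (an assumption); the example above reads a file instead.
text = "the quick brown fox jumps over the lazy dog the fox"

# Counter accepts any iterable and tallies its items in one step.
table = Counter(text.split())

# most_common(n) returns the n highest counts as (word, count) pairs,
# ordered from most to least frequent.
pairs = table.most_common(2)
print(pairs)  # [('the', 3), ('fox', 2)]

# Unlike a plain dict, a missing key counts as 0 instead of raising KeyError.
print(table["cat"])  # 0
```

Passing the iterable straight to the constructor also removes the need for a separate update step.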
For sorting a list of tuples by a specific element, I prefer using operator.itemgetter instead of a lambda expression.
>>> from operator import itemgetter
>>> lst = [('orange', 5), ('banana', 7), ('apple', 2)]
>>> lst.sort(key=itemgetter(1))
>>> lst
[('apple', 2), ('orange', 5), ('banana', 7)]
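For a histogram you usually want the largest counts first. A small sketch of that variation, assuming the same sample list: `sorted()` with `reverse=True` gives descending order and, unlike `list.sort()`, leaves the original list unchanged.

```python
from operator import itemgetter

lst = [('orange', 5), ('banana', 7), ('apple', 2)]

# sorted() returns a new list; reverse=True puts the largest count first.
# The slice keeps only the top two entries.
top_two = sorted(lst, key=itemgetter(1), reverse=True)[:2]
print(top_two)  # [('banana', 7), ('orange', 5)]

# The original list keeps its order.
print(lst[0])  # ('orange', 5)
```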
The bottom line is that Python has a great standard library, and it is worth knowing it well.