Friday, July 29

Python for dummies, Part III

Ah, the GUI at last. This wasn't really a requirement, but I'd rather have something that looks like a proper application than a command line interface; command line interfaces are sooo eighties. To combine this with my previous sections, I just need to import them and 'call' their functionality from within this class.
The code to create the user interface is listed below.

from Tkinter import *

class TemplateGUI(Frame):
  # the outermost frame
  def __init__(self, parent=None):
    Frame.__init__(self, parent)
    # Tk variable shared by the radio buttons: 2 = Male, 1 = Female
    self.type = IntVar()
    self.type.set(2)
    self.master.title('GUI Template')
    self.buildUI()
    self.pack()

  def buildUI(self):
    fFile = Frame(self)
    Label(fFile, text="Filename: ").pack(side=LEFT)
    self.eName = Entry(fFile)
    self.eName.pack(side=LEFT, padx=5)

    # example of radio buttons to get gender data
    fType = Frame(fFile, borderwidth=1, relief=SUNKEN)
    self.rMale = Radiobutton(fType, text="Male", variable=self.type, value=2, command=self.doMale)
    self.rMale.pack(side=TOP, anchor=W)
    self.rFemale = Radiobutton(fType, text="Female", variable=self.type, value=1, command=self.doFemale)
    self.rFemale.pack(side=TOP, anchor=W)

    # 'Male' is the default radio button selection
    fType.pack(side=RIGHT, padx=3)
    fFile.pack(side=TOP, fill=X)
    self.txtBox = Text(self, width=60, height=10)
    self.txtBox.pack(side=TOP, padx=3, pady=3)

    # buttons to execute application functionality
    fButts = Frame(self)
    self.bStart = Button(fButts, text="Start", command=self.doStart)
    self.bStart.pack(side=LEFT, anchor=W, padx=50, pady=2)
    self.bReset = Button(fButts, text="Reset", command=self.doReset)
    self.bReset.pack(side=LEFT, padx=10)
    self.bQuit = Button(fButts, text="Quit", command=self.doQuit)
    self.bQuit.pack(side=RIGHT, anchor=E, padx=50, pady=2)
    fButts.pack(side=BOTTOM, fill=X)

  def doQuit(self):
    # leave the Tk main loop
    self.quit()

  def doReset(self):
    self.txtBox.delete(1.0, END)
    self.eName.delete(0, END)

  def doMale(self):
    self.type.set(2)

  def doFemale(self):
    self.type.set(1)

  def doStart(self):
    filename = self.eName.get()
    if filename == "":
      self.txtBox.insert(END, "\nNo filename provided!\n")
      return

    self.txtBox.insert(END, "\nStarted application...\n")

    # the search and parsing code from the previous parts is called here
    resultStr = "Success"
    self.txtBox.insert(END, resultStr)

myApp = TemplateGUI()
myApp.mainloop()

When run, the code above produces the GUI below:

Neat, huh?

Thursday, July 28

My color?


You are a very calm and contemplative person. Others are drawn to your peaceful, nurturing nature.

Find out your color at Quiz Me!

Green? Hmmm.

Thanks to Michael Collins for the link. Check out your color here.

Wednesday, July 27

Python for dummies, Part II

In my previous Python entry I posted code to collect string/word information from a document/string, plus a simple MSN-specific parser that allows me to navigate a DOM document and extract its nodes and content.

Now I need functionality to search the disk for the files to parse, and I need a GUI.

Search first.
I spent an awful lot of time playing around with OS-specific code to traverse and navigate directory structures only to find out that Python has most of this functionality built-in, faster, and cross-platform.

I didn't want my disk search to take ages. Hence, I decided to split it into three phases:

1 - Search for log files in the current folder, in case the user drops the executable there. Assuming the parser from Part I is named 'my_parser':

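import os, fnmatch
import my_parser  # assumed module name for the parser from Part I

for f in os.listdir('.'):
  # only list XML files
  if fnmatch.fnmatch( f, '*.xml' ):
    # hand the file to the parser; load() is the function defined in Part I
    document = my_parser.load(f)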

2 - Search for files, starting from the user's profile folder. This prunes the search space by avoiding OS-specific and other users' folders. I separated searching for the right folders from listing the log files:

import os, fnmatch, re
folders = []

files = []
r = re.compile(r'msn_username') # msn_username is passed by the user

def browse((r, folders), dirpath, namelist):
  for name in namelist:
    # list folders' names starting with msn_username given
    if r.match(name):
      folders.append(os.path.join(dirpath, name))

def listfiles((wildcard, files), dirpath, namelist):
  for name in namelist:
    # only append 'wildcard'-specified filenames
    if fnmatch.fnmatch( name, wildcard ):
      files.append(os.path.join(dirpath, name))

userbase = os.environ['USERPROFILE']
# change directory to user profile folder
os.chdir(userbase)
os.path.walk(os.getcwd(), browse, (r, folders))

if folders:
  # populate the list of XML files from the folders
  wcard = '*.xml'
  for fld in folders:
    os.path.walk(fld, listfiles, (wcard, files))

3 - Search for files on the whole disk, in case none were found with the two previous methods:

# replace the userbase value with root directory
userbase = '\\'
os.chdir(userbase)
os.path.walk(os.getcwd(), browse, (r, folders))
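For anyone reading this on Python 3, where os.path.walk no longer exists, the three phases above could be sketched with os.walk instead. This is only a rough sketch under assumptions: the names find_logs and search_in_phases are hypothetical, and the prefix argument stands in for the MSN username that the folder names start with.

```python
import fnmatch
import os


def find_logs(base, prefix, wildcard="*.xml"):
    """Collect wildcard matches inside folders under base whose name starts with prefix."""
    matches = []
    for dirpath, dirnames, _ in os.walk(base):
        hits = [d for d in dirnames if d.startswith(prefix)]
        # matching folders are walked below, so keep os.walk from entering them twice
        dirnames[:] = [d for d in dirnames if d not in hits]
        for name in hits:
            for sub, _, filenames in os.walk(os.path.join(dirpath, name)):
                for filename in fnmatch.filter(filenames, wildcard):
                    matches.append(os.path.join(sub, filename))
    return matches


def search_in_phases(prefix, cwd, profile, root, wildcard="*.xml"):
    """Phase 1: current folder. Phase 2: profile folder. Phase 3: whole disk."""
    local = fnmatch.filter(os.listdir(cwd), wildcard)
    if local:
        return [os.path.join(cwd, f) for f in local]
    for base in (profile, root):
        found = find_logs(base, prefix, wildcard)
        if found:
            return found
    return []
```

Passing the three starting folders in as arguments, rather than changing directory, keeps each phase independent of the process's working directory.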

And that's it. Amazing how simple it looks now. Part III, the GUI, will follow soon.

UPDATE: This module does most of the above in a nicer, more elegant way. No need to reinvent the wheel.

So very special

I wonder if Radiohead knows about this Flash video for Creep, a terrific match for the song, much better than their actual video. Thanks to Michael Collins for the link:

Thom Yorke has a superb talent for creating and singing simple, beautifully melodic choruses, two of which come to mind now:

"I wish I were special
So very special"
from the song Creep on the Pablo Honey album and

"If I could be who you wanted
If I could be who you wanted
All the time
All the time"

from the song Fake Plastic Trees on The Bends album.

The Unbearable Weightness of Being

Every time I go back home to my parents, they always remark that "I've lost weight" and "I look very skinny". Now if I had really lost all that much weight at each visit, I would be a popular skeleton model at some med school by now.
Interestingly enough, I've also noticed that my scales seem to disagree with that observation, that my food intake hasn't seen any significant drop (quite the opposite!), and that I feel "puffier".
Since they are also the only ones who notice this extraordinary weight loss of mine, I guess what they are really trying to say is "we miss you beloved son, we want you back at our dinner table like in the past, we want to cook for you, we miss your company!"
I love my parents :)

Saturday, July 23

Geek dinner

Yesterday I attended my first geek dinner, held at this place in central London. It was good to finally put some faces to the blogs I've been reading. Met a few 'famous' geeks such as Chris DiBona from Google (the man behind the Summer of Code program) and Jeremy Zawodny from Yahoo, along with Helen, Rachel, Peter, and some others whose names I can't remember or didn't get.

Thought it was very interesting to have a guy from Google and a guy from Yahoo together at the same table (which happened to be the table I was sitting at).
Their personalities highlighted the current state of the companies they work for:

  • Yahoo, the colossal Internet mammoth now somewhat peripheral to the media hype, was represented by an obviously very intelligent but quite low profile Jeremy.
  • Google, on the other hand, the place to be, the company-du-jour, the site at the forefront of most of today's technology innovation, and probably the company with the highest level of media attention in the corporate-sphere these days, was represented by a very eloquent, persuasive, excited, charismatic and over-confident Chris.

Tell me the company you work for and I'll tell you who you are?

Wednesday, July 20

Python for dummies

I have been using Python extensively in my research at UCL. Since it offers simple and elegant solutions for some issues I had, I'm posting my development sketches for future reference.
The IM logs I'm analyzing are stored in XML files and I'm extracting and normalizing the following data:

  • duration of sessions
  • number of messages
  • words per message
  • frequency of words
I addressed each of the above tasks separately and then combined them all (plus a GUI) in the final program. I wanted to collect some more data, such as the language preference of sender and recipient, but that is not kept by any IM client.
Collecting the duration of the session is a simple matter of keeping the start and end time of the session.
Counting the number of messages is a simple iterator.
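As a rough sketch of those two measurements in Python 3: the function name session_stats is hypothetical, and the ISO-style timestamp strings are an assumption made for illustration; a real log may store its dates in another format.

```python
from datetime import datetime, timedelta


def session_stats(timestamps, fmt="%Y-%m-%dT%H:%M:%S"):
    """Return (duration, message_count) for the message timestamps of one session."""
    if not timestamps:
        return timedelta(0), 0
    times = [datetime.strptime(t, fmt) for t in timestamps]
    # duration is the gap between the earliest and the latest message
    return max(times) - min(times), len(times)
```

For example, session_stats(["2005-07-20T10:00:00", "2005-07-20T10:12:30"]) gives a duration of twelve and a half minutes and a count of two messages.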

Number of words per message is achieved with the pristine code:

lst = msg.split()
len( lst )

Frequency of words is addressed with the elegant:

freq = {}
for word in lst:
  # 'word' rather than 'str', which would shadow the built-in type
  word = word.lower()
  freq[word] = freq.get(word, 0) + 1
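On newer Pythons the same two counts can be had from collections.Counter. A minimal sketch, with word_stats as a hypothetical helper name; like the code above, it splits on whitespace only, so punctuation stays attached to the words.

```python
from collections import Counter


def word_stats(msg):
    """Return (number_of_words, word_frequencies) for one message."""
    words = msg.lower().split()
    return len(words), Counter(words)
```

Counter also offers most_common(n), which is handy for listing the most frequent words once all messages have been tallied.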

Next, some hardcore XML parsing was needed. This took a bit longer to write, since I was re-learning Python at the same time, so I started by tackling MSN log files. Generalizing this to other IM clients should be a simple task.

from xml.dom import minidom

def load(filename):
  return minidom.parse(filename)

def getElementsByTagName(node, tagName):
  children = node.getElementsByTagName(tagName)
  if len(children):
    return children
  return []

def first(node, tagName):
  children = getElementsByTagName(node, tagName)
  return len(children) and children[0] or None

def textOf(node):
  # join the text of all child nodes; empty string when there is no node
  return node and "".join([child.data for child in node.childNodes]) or ""

if __name__ == '__main__':
  document = load("msn_log_example.xml")
  for item in getElementsByTagName(document, 'Message'):
    print 'Message:', textOf(first(item, 'Text'))

Note that the example above is customized for MSN Messenger logs where the <Text> element (containing the message content) is contained in the <Message> tag. Part 2 of this tutorial will follow soon.
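To try the idea without a log file at hand, here is a small Python 3 sketch that parses an in-memory string. Only the <Message> and <Text> element names come from the description above; the SAMPLE layout, the <Log> and <From> elements, and the names text_of and message_texts are made up for illustration.

```python
from xml.dom import minidom

# hypothetical sample; a real MSN log carries more elements and attributes
SAMPLE = """<Log>
  <Message><From>alice</From><Text>hello there</Text></Message>
  <Message><Text>hi</Text></Message>
  <Message><From>bob</From></Message>
</Log>"""


def text_of(node):
    """Concatenate the text children of node; empty string when node is None."""
    if node is None:
        return ""
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)


def message_texts(xml_string):
    """Return the <Text> content of every <Message>, in document order."""
    document = minidom.parseString(xml_string)
    texts = []
    for message in document.getElementsByTagName("Message"):
        found = message.getElementsByTagName("Text")
        texts.append(text_of(found[0] if found else None))
    return texts
```

The explicit None check replaces the and/or idiom, and the nodeType filter skips any child that is not plain text, so a message without a <Text> element simply yields an empty string.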

Oh no! (just for Seinfeld fans)

Saturday, July 16

Updating Ant on JBuilder9 Enterprise

From Tools -> Configure Libraries, select the Ant library on the left hand side panel. You should see the following under the Class tab on the right hand side:


Replace ant.jar and optional.jar (all but jbuilder.jar) with your Ant installation jars under %ANT_INSTALL_DIR%/lib.

Now I have all my Java IDEs (IDEA, Eclipse, JBuilder) in sync.


From this thread on slashdot:

It's all about the memory and bus architecture...
The sad uninformed people say (pinch airflow from nose so you sound geeky) "It's a 64 bit processor, you need a 64 bit OS to take advantage of it, period".

What they fail to realize is the 64 bit memory and bus architecture happen *below the HAL*. the OS doesn't even see it, let alone need to be 64 bit to take advantage of it. I politely let them flap their gums and went out and bought one anyway, then proved them wrong.

Computationally = intel, tho that gap is narrowing
IO = AMD64 or opteron.

I don't care if you are running windows 95, amd64 will be faster for IO bound stuff, than any 32 bit architecture is capable of even getting close to.
If you are crunching spreadsheets, word processing or videogaming, *generally*, intel is better, tho that gap is getting smaller. It will likely disappear when 64 bit OS's get apps caught up to them.
For DB, working with large files, shunting lots of streams around the mobo, AMD 64 *smokes* intel 32 right now, 32 bit apps notwithstanding.

There is simply no contest.

Wednesday, July 13


At last, the full set of HTML components is customizable by CSS.

This article tells you how to do it, with great examples given too.

Canvas talking

My friend Kamila is doing some fabulous research involving museums in London. I took part in one of her experiments and it was an extraordinary self-revelation exercise.
Armed with a voice recorder, I had to choose two painters exhibited in the gallery, a couple of paintings from each, and talk for half an hour about the paintings whilst facing them.

I chose Rembrandt and Seurat, Rembrandt being one of my favorites, Seurat because I wanted to have contrasting artists. The experiment was incredible, as I have never verbalized art before. I was also amazed (and so was Kamila) with how much I had to say (note: not necessarily amazed with how good what I had to say was!).
It will definitely be something I will be doing often, obviously without the recorder and without the need to say it out loud.
It made me see paintings I took for granted in a totally different way.

Saturday, July 9

A plan for Africa

My plan for Africa relies simply on education. The situation in the African continent will only improve with the education of its people.
My idea is to spend (Live8?) money on scholarships, grants, and stipends aplenty so that 100s or 1000s of young people from many African countries access good schools and universities abroad.
Sure some would remain abroad, but many others would return and do/give something back to their home nation.
I think the hope of the continent lies in those who return, educated, new sounds in their ears, new visuals populating their dreams, their minds flourishing with new ideas.

Friday, July 8

London 7/7

No more, no less:

Londoners react to explosions not with fear and terror but with resolution and bravery. The eyes of the world are on London today. The world will see a display of stiff upper lips and unity. If there’s one thing that Londoners can do well, it’s this: they cope.

Update: The very powerful speech by Mayor Ken Livingstone is available online. Read it here, or see/hear it here. It is really a very inspired, sincere, and highly emotional moment from a man who has done some great work for this great city.

Wednesday, July 6

LondonTown 2012!

Yup, LondonTown got the Olympics! It's been crazy out in Trafalgar Sq!

I gotta tell you I've been choosing my cities pretty well:
-In 2001 I was living in China when Beijing won their 2008 bid. What a party that was!
-Now I'm back in London and London won.

Ok, you know exactly what I'm getting at: 2016 is up for grabs now.

Metropolis of the world, where should I move to and help win the next Olympics bid?