December 23, 2009

Dynamic Methods and Garbage Collecting

I really enjoy the Python programming language. Even when it bites you, it takes you on an interesting journey.

In the MUD, I use the term Client to refer to the session with the user at the other end of the wire. The Client is passed to a User object that represents the player in some state or another. When a User gets deleted, it gets chomped up by Python's garbage collector, which should delete the client, which in turn should delete the socket. That fires the socket's close() method on the way out and gives the cat a fish, err, drops the user, resulting in the famous 'Connection closed by foreign host.'

Only it wasn't.

Deleted clients and "disconnected" users were hanging around typing to unmonitored, uncaring sockets. After a bit of poking I found the problem was some cheap hackery I was using for the login process. I was using a property called "cmd_driver" like a state machine, changing it to point to the method I wanted the next input to go to:
self.cmd_driver = self.get_password
Turns out, this was setting cmd_driver to a bound method: a method object that carries a reference to the instance it is bound to, which in this case was the very object storing it. That reference cycle kept the object's reference count from ever dropping to zero, so deleting the last outside reference no longer freed it. Here's a demonstration snippet:
>>> class A(object):
...     def __del__(self):
...         print "__del__ called"
...     def foo(self):
...         pass
...
>>> a = A()
>>> del a
__del__ called
>>> a2 = A()
>>> a2.bar = a2.foo
>>> del a2
>>>
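The same effect can be checked without relying on __del__ at all. This is a minimal Python 3 sketch, assuming CPython's reference counting; survives_del is a made-up helper name, and a weak reference is used as a probe because it does not keep the object alive. The cyclic collector is switched off during the check so that only reference counting is in play:

```python
import gc
import weakref


class A:
    def foo(self):
        pass


def survives_del(make_cycle):
    """Return True if an A instance is still alive after its only name is deleted."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector out of the way for this check
    try:
        a = A()
        if make_cycle:
            # a.foo is a bound method holding a reference to a (its __self__),
            # so storing it on a makes the instance refer to itself.
            a.bar = a.foo
        probe = weakref.ref(a)  # a weak reference does not keep a alive
        del a
        return probe() is not None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up the deliberately created cycle


if __name__ == '__main__':
    print(survives_del(make_cycle=False))
    print(survives_del(make_cycle=True))
```

On CPython this should print False and then True. Worth noting: on Python 2, an object that defines __del__ and sits in a reference cycle is never freed by the cycle collector either (it is parked in gc.garbage), which matches the silent `del a2` in the session above. Python 3.4 and later will finalize such cycles, but only when the collector eventually runs.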

So I need to work out a healthier version of the state machine. Maybe one that calls unbound methods of the class or instance methods via some introspection.
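One way that introspection idea might look, as a rough Python 3 sketch rather than the actual MUD code: store the handler's name as a string and look it up with getattr() when input arrives. The attribute cmd_driver and the method get_password come from the snippet above; the User class body, handle(), get_name(), play(), and log are assumptions made up for illustration.

```python
import weakref


class User:
    """Hypothetical stand-in for the MUD's User object."""

    def __init__(self):
        # Store the *name* of the next handler, not a bound method,
        # so the instance never holds a reference to itself.
        self.cmd_driver = 'get_name'
        self.log = []

    def handle(self, line):
        # Look the handler up at call time; the temporary bound method
        # is discarded as soon as the call returns.
        getattr(self, self.cmd_driver)(line)

    def get_name(self, line):
        self.log.append(('name', line))
        self.cmd_driver = 'get_password'

    def get_password(self, line):
        self.log.append(('password', line))
        self.cmd_driver = 'play'

    def play(self, line):
        self.log.append(('command', line))


if __name__ == '__main__':
    user = User()
    user.handle('bob')
    user.handle('secret')
    probe = weakref.ref(user)
    del user
    # With no cycle, CPython frees the object as soon as the name goes away.
    print(probe() is None)
```

The trade-off is that a typo in a state name only shows up at runtime as an AttributeError, so a real version would probably want to validate the name when it is assigned.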

December 18, 2009

Random Name Generation

A few years ago I was playing around with a random name generator. My approach was to cobble together random letter combinations like this:

leading consonants + vowels + inner consonants + vowels + closing consonants


Basically, I was aiming for something pronounceable, with common letters weighted to appear more often. It produced output such as:

Votharn Eristacark Iplortidot Birtoil Udaeteahieb Aceastoherk Reloist Tharnog Wasterk Femewelav Ublyrrielic Cekird Owritothol Hoogoh Obloukajarriem Sleebont Niestart Pekev Lirtooth Efentoidagix Klyckas Yryfesat Klooton

Yeah ... that's really, really awful.

So I decided to give it another whack. This time I started with the premise, 'what sounds most like a name?'

Names do!

I found a couple of files with over a thousand of the most common male and female first names on the US Census Bureau's web page and started playing. I wrote a Python script that used regular expressions to slice a batch of words into three lists:

List 1 = Zero or more vowels + One or more consonants at the start of the word
List 2 = One or more vowels + One or more consonants inside the word (not at the start or the end). We can get 0 or more of these patterns depending on the word.
List 3 = One or more vowels + Zero or more consonants at the end of the word.
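To make the three lists concrete, here is a small Python 3 sketch that applies the same kind of patterns to a single lowercase name. The patterns mirror the ones in the script further down, with the trailing pattern written to allow zero consonants as List 3 describes; split_name and the constant names are made up for this illustration.

```python
import re

# A consonant "unit" is either 'qu' or a single consonant letter.
CONSONANTS = r'(?:qu|[bcdfghjklmnpqrstvwxz])'
LEAD = re.compile(r'^[aeiouy]*' + CONSONANTS + '+')
INNER = re.compile(r'\B[aeiouy]+' + CONSONANTS + r'+\B')
TRAIL = re.compile(r'[aeiouy]+' + CONSONANTS + '*$')


def split_name(name):
    """Return (lead, inners, tail) for one lowercase name."""
    lead = LEAD.match(name)
    tail = TRAIL.search(name)
    return (lead.group(0) if lead else '',
            INNER.findall(name),
            tail.group(0) if tail else '')


if __name__ == '__main__':
    print(split_name('christopher'))  # ('chr', ['ist', 'oph'], 'er')
    print(split_name('maria'))        # ('m', ['ar'], 'ia')
```

The \B (not a word boundary) on both sides of the inner pattern is what keeps it from grabbing the pieces at the very start or end of the word.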

Side note: If you haven't dug into regular expressions yet, I highly recommend you check them out. I avoided them for years and now they're an essential part of my programmer tool box. Another big plus is that their utility spans multiple languages.

I also tracked the frequency of each pattern, sorting by most common first and discarding the rares. Finally, I dumped the output formatted as Python lists that I could paste right into the source of the next script.

import re
import operator


_FILENAME = 'data/elves2.txt'
_CULL = 1

## Match 0 or more vowels + 1 or more consonants at the start of the word
_LEAD = re.compile(r'^[aeiouy]*(?:qu|[bcdfghjklmnpqrstvwxz])+')
## Match 1 or more vowels + 1 or more consonants inside a word (not start/end)
_INNER = re.compile(r'\B[aeiouy]+(?:qu|[bcdfghjklmnpqrstvwxz])+\B')
## Match 1 or more vowels + 0 or more consonants at the end of a word
_TRAIL = re.compile(r'[aeiouy]+(?:qu|[bcdfghjklmnpqrstvwxz])*$')


def token_lists(names):

    lead, inner, tail = {}, {}, {}

    ## Populate dictionaries; key=pattern, value=frequency
    for name in names:

        match = re.match(_LEAD, name)
        if match:
            pat = match.group(0)
            count = lead.get(pat, 0)
            lead[pat] = count + 1

        matches = re.findall(_INNER, name)
        for pat in matches:
            print pat,
            count = inner.get(pat, 0)
            inner[pat] = count + 1

        match = re.search(_TRAIL, name)
        if match:
            pat = match.group(0)
            count = tail.get(pat, 0)
            tail[pat] = count + 1


    ## Convert dicts to a list of tuples in the format (pattern, frequency)
    lead_srt = sorted(lead.items(), key=operator.itemgetter(1), reverse=True)
    inner_srt = sorted(inner.items(), key=operator.itemgetter(1), reverse=True)
    tail_srt = sorted(tail.items(), key=operator.itemgetter(1), reverse=True)

    ## Build lists of patterns ordered most to least frequent and cull rares
    lead_list = [x[0] for x in lead_srt if x[1] > _CULL]
    inner_list = [x[0] for x in inner_srt if x[1] > _CULL]
    tail_list = [x[0] for x in tail_srt if x[1] > _CULL]

    return lead_list, inner_list, tail_list


if __name__ == '__main__':

    names = open(_FILENAME, 'rt').readlines()
    lead_list, inner_list, tail_list = token_lists(names)

    print '#', len(lead_list), len(inner_list), len(tail_list)
    print '_LEADS = ', lead_list
    print '_INNERS = ', inner_list
    print '_TAILS = ', tail_list
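
For anyone redoing this on a current Python, the count, sort, and cull steps can probably be condensed with collections.Counter. This is only a sketch of the idea, not the script I ran; ranked_patterns and CULL are names made up here, with CULL playing the same role as _CULL above.

```python
from collections import Counter

CULL = 1  # keep only patterns seen more than this many times


def ranked_patterns(patterns, cull=CULL):
    """Return patterns ordered most to least frequent, dropping rare ones."""
    counts = Counter(patterns)
    # most_common() yields (pattern, count) pairs, highest count first.
    return [pat for pat, n in counts.most_common() if n > cull]


if __name__ == '__main__':
    sample = ['er', 'ar', 'er', 'el', 'er', 'ar']
    print(ranked_patterns(sample))  # ['er', 'ar']
```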


Next I used a script to assemble random names from these lists. Here's the one for male names:

import random


_LEADS = ['d', 'j', 'm', 'r', 'l', 'w', 'c', 'h', 'g', 'b', 'br', 't', 'k',
    'n', 's', 'cl', 'fr', 'f', 'p', 'st', 'v', 'ch', 'sh', 'gr', 'tr']
_INNERS = ['er', 'ar', 'el', 'or', 'an', 'ic', 'arr', 'am', 'ol', 'on',
    'al', 'en', 'ill', 'in', 'err', 'and', 'il', 'om', 'et', 'arl', 'ev',
    'ac', 'ust', 'av', 'ert', 'enn', 'ent', 'ath', 'onn', 'it', 'os', 'enc',
    'yr', 'em', 'ist', 'anc', 'arc', 'ich', 'as', 'est']
_TAILS = ['o', 'on', 'e', 'y', 'ey', 'er', 'in', 'an', 'ie', 'io', 'en',
    'is', 'el', 'us', 'es', 'as', 'ian', 'ed']


def namegen():
    syllables = random.randint(0, 1)
    if random.random() > .85:
        syllables += 1
    name = random.choice(_LEADS)
    for x in range(syllables):
        name += random.choice(_INNERS)
    name += random.choice(_TAILS)
    return name.title()


if __name__ == '__main__':

    for x in range(100):
        print namegen(),


Which gives output like:

Tyron Frian Grathan Charcan Tren Garrian Stasas Cholian Dites Ris Tie Non Lin Lon Wian Wio Ramas Lo Larus Pan Cio Wasie Mely Chie Levey Bustel Lillen Ben Ponny Panian Nyrus Bre Raso Stel Fasis Konandas Starrin Hian Starlio Rey Jathen Frander Ven Talamio Samin Ches Handey Sterian Nencin Sathon Ponel Citel Momen Wus Rencey Dence Rencan Bevon Jo Bre Tis Non Heved Ne Shestas Lales Ferrus Werrus Foso Standie Ver Gis Giner Ver Nin Res Veled Clon Geler Mustin Gio Bry Juste Shanan Cled Trely Chon Trin Binis Jinen Jacy Ralio Bo Metel De Sian Brie Fanan Donen Jed

Seeding with female names, I get:

Alianne Cler Cishel Kroller Namie Dah Kyn Niton Hie Trosancel Cher Na Angy Sestin Jah Dathe Pin Traton Teris Dishon Goren Sheny Frindis Chalin Frie Lolleen Ware Chrena Hy Veannon Dyn Padica Sher Chie Kon Brollis Cisi Gy Chey Kreni Dulani Gancis Shancey Satheen Chreanny Shy Clin Cessia Kriner Shestie Hancah Ner Trorer Ger Tren Ramerrer Freanne Kaurah Frolah Kritah Harley Kisten Eleen Trellyn Chiannel Trosi Alisin Genery Cla Mie Jate Shosacey De Angoron Kis Gianon Mandite Chryn Wer Shadalie Polon Gita Ner Fron Va Chrabica Vatia Clia Tacel Rareen Treen Angia Henen Angin Poreen Eleannel Ken Brer Branca Hesta
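
A Python 3 take on the generator might look something like the sketch below. The token lists are shortened samples taken from the male lists above, and the rng parameter is an addition of this sketch (not in the original script) so that a seeded random.Random can be passed in for repeatable output.

```python
import random

# Shortened sample lists for illustration; swap in the full generated lists.
LEADS = ['d', 'j', 'm', 'br', 'ch']
INNERS = ['er', 'ar', 'el', 'ill']
TAILS = ['o', 'on', 'ey', 'ian']


def namegen(rng=random):
    """Build one name: a lead, zero to two inner tokens, and a tail."""
    syllables = rng.randint(0, 1)
    if rng.random() > .85:
        syllables += 1
    parts = [rng.choice(LEADS)]
    parts.extend(rng.choice(INNERS) for _ in range(syllables))
    parts.append(rng.choice(TAILS))
    return ''.join(parts).title()


if __name__ == '__main__':
    rng = random.Random(2009)  # fixed seed so reruns give the same names
    print(' '.join(namegen(rng) for _ in range(10)))
```

With these numbers, about 42.5% of names get no inner token (0.5 x 0.85), 50% get one, and 7.5% get two (0.5 x 0.15), which is why so many short names like "Lo" and "De" turn up in the output.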

December 17, 2009

MiniBoa

Over on Mudbytes, Idealiad (aka Kooneiform) started a thread about creating a Python Socket MUD -- a very tiny socket library that Python coders could use as the basis for their MUD projects and experiments. There already were a few of these for other languages, such as TeensyMud for Ruby, but nothing for Python. I really liked this idea and, as I had a working Telnet server already, I decided to repackage my network modules as MiniBoa. Instead of BogBoa's GPL licensing, I switched to the more permissive Apache 2.0 (one of the recommended licenses for maximum compatibility with Python's own).

To be honest, the code was a lot messier to convert into a stand-alone library than I expected -- because this was the first code I wrote for BogBoa and I was hooking into it less than gracefully. Since I didn't want to maintain two copies of (mostly) the same modules, I decided to revamp BogBoa to use MiniBoa for networking.

You can find code and documentation at the project page and some discussion on this Mudbytes thread.