Friday, December 28

Types and Programming Languages

This excellent overview of type systems by nostrademons @ reddit describes latent, manifest, static, dynamic, weak, strong, nominal, and structural typing, and places a number of popular programming languages on some of these dimensions.

I got curious about listing all the possible combinations and then find programming languages to fill in the slots.

I cooked up some quick and dirty Python code to do the job.

Let's call S = {static, dynamic, latent, manifest, weak, strong, nominal, structural}
The powerset of S has 28 = 256 combinations, only some of them useful but I can fix that later.

# return a list with 2len(l) elements (definition of powerset)
def powerset(l):
     r = []
     if l:
         head = l[:1]
         tail = l[1:]
         for item in powerset(tail):
             r.append(head + item)
     return r

The powerset of S returns the complete list of subsets of S from sizes 1 to 8 plus the empty set {}.
This means subsets {static}, {dynamic}, {static, dynamic}, ..., all the way to {static, dynamic, latent, manifest, weak, strong, nominal, structural} are included too.

However, there are 4 pairs of [kind of] mutually exclusive types and I am only interested in the subsets of size 4. Thus, I need only consider the unique 4-dimension combinations.

The maths tells me I should expect n!/k!(n-k)! = 8!/4!(8-4)! = 70 such combinations (see binomial coefficient or look up the middle number of the 8th row of Pascal's triangle).

I can obtain this by keeping only those subsets of size 4 from the list returned by the powerset. The function below will help:

# filter list keeping only n-tuples where n=size
def len_filter(l, size=1):
     return [item for item in l if len(item) == size]

Still, the set of 70 size-4 combinations contain lots of invalid members because of the 4 pairs of [kind of] mutually exclusive types (read original comment for details):

    ("static", "dynamic"),
    ("latent", "manifest"),
    ("weak", "strong"),
    ("nominal", "structural")

A simple solution is to just remove any combination that contains these "invalid" pairs. The 2 functions below will do that:

# retain n-tuples that don't contain any of the invalid pairs
def invalidpairs_filter(l):
     return [item for item in l if not is_invalid(INVALID_PAIRS, item)]

# returns True if any of the invalid tuples is contained in the given n-tuple,
# False otherwise
def is_invalid(invalid_pairs, ntuple):
     s = set(ntuple)
     for invalid_pair in invalid_pairs:
         si = set(invalid_pair)
         if si.issubset(s):
             return True
     return False

Et voilá! Running the Python program prints the 16 combinations we want:

    ['latent', 'nominal', 'static', 'weak']
    ['dynamic', 'latent', 'nominal', 'weak']
    ['manifest', 'nominal', 'static', 'weak']
    ['dynamic', 'manifest', 'nominal', 'weak']
    ['latent', 'nominal', 'static', 'strong']
    ['dynamic', 'latent', 'nominal', 'strong']
    ['manifest', 'nominal', 'static', 'strong']
    ['dynamic', 'manifest', 'nominal', 'strong']
    ['latent', 'static', 'structural', 'weak']
    ['dynamic', 'latent', 'structural', 'weak']
    ['manifest', 'static', 'structural', 'weak']
    ['dynamic', 'manifest', 'structural', 'weak']
    ['latent', 'static', 'strong', 'structural']
    ['dynamic', 'latent', 'strong', 'structural']
    ['manifest', 'static', 'strong', 'structural']
    ['dynamic', 'manifest', 'strong', 'structural']

Should there be a programming language for all the above combinations? nostrademons and grauenwolf already filled in a few of these slots (apologies if my copy & paste introduced errors):

latent, nominal, static, weak =
dynamic, latent, nominal, weak = PHP
manifest, nominal, static, weak = C, Access SQL, Java, C++, C#,
dynamic, manifest, nominal, weak =
latent, nominal, static, strong = Haskell
dynamic, latent, nominal, strong = Scheme
manifest, nominal, static, strong = Java, C++, C#, VB.NET (strict), T-SQL
dynamic, manifest, nominal, strong =
latent, static, structural, weak = VB.NET, VBScript
dynamic, latent, structural, weak = JavaScript, PHP, Assembly
manifest, static, structural, weak =
dynamic, manifest, structural, weak =
latent, static, strong, structural = Ocaml, Haskell
dynamic, latent, strong, structural = Erlang, Scheme, Common Lisp, Python, Ruby
manifest, static, strong, structural = Haskell, C++ Templates
dynamic, manifest, strong, structural = Common Lisp

Repetitions occur because most languages support multiple forms of typing.

My next task (for another night, another post) is to clean-up the table and find languages that fit in the 5 unused dimensions from the list above:

    ['latent', 'nominal', 'static', 'weak'] =
    ['dynamic', 'manifest', 'nominal', 'weak'] =
    ['dynamic', 'manifest', 'nominal', 'strong'] =
    ['manifest', 'static', 'structural', 'weak'] =
    ['dynamic', 'manifest', 'structural', 'weak'] =

Possible Questions:

  1. No programming languages for a given combination? Why is that?

  2. Once most known programming languages found their place in the matrix, it will be interesting to understand why some combinations are more popular than others

I've created project programming-misc on Google Code to hold these and other programming exercises I might engage in. The code shown above is available in the SVN repository and downloads page.

3-31-04 I'm Over It (the article by Bruce Eckel that started the conversation @ reddit)
Comment by nostrademons
Comment bt grauenwolf

It is quite bizarre that the otherwise very polished Google Analytics has a bug known since (at least) Nov 2005!
The URL considered invalid by Google Analytics is actually the one generated by and taken from Google Code. The issue is well documented with easy work-arounds, though.

1 comment:

Anonymous said...

so where does actionscript fit into something like this :)