Sunday, December 30

Types and Programming Languages Revisited

My previous brute-force approach to list all the 4-dimension combinations of S = {(latent, manifest), (static, dynamic), (weak, strong), (nominal, structural)} typing -by filtering the desired tuples from the powerset of S - wasn't very efficient.
It was a perfect example of The Danger of Naïveté: an inefficient algorithm that was nevertheless simple to devise and implement.

Moreover, quoting The eight rules of Dean Carney, I should have observed that "the first place to look for a solution is within the problem itself". If the problem involves 4 tuples of 2 elements each to begin with, the efficient solution is not likely to require the powerset of all 8 elements.

Thus, I figured I should give it another go.

First, for simplicity and better readability, let me replace the values of S with the character-tuples on the left hand side:

(A, B) = (latent, manifest)
(C, D) = (static, dynamic)
(E, F) = (weak, strong)
(G, H) = (nominal, structural)

S = {(A, B), (C, D), (E, F), (G, H)}

Next, I'll write down some 4-size combinations. After a short while, a pattern emerges:


The pattern becomes clear when I look at the elements like this:

X1=(A, B) x X2=(C, D) x X3=(E, F) x X4=(G, H)

This is a Cartesian product of X1 x X2 x ... x Xn, the set containing all possible combinations of one element from each set.

With this knowledge, implementing an efficient algorithm is easy. First, a method to return the Cartesian product of two sets:

# arguments must be lists, e.g.: cartesian([1, 2], [3, 4])
def cartesian(list1, list2):
     result = []
     for el in list1:
         for item in list2:
             if isinstance(el, list):
                 result.append(el + [item])
                 result.append([el] + [item])
     return result

Testing the method with input X=[1, 2] x Y=[3, 4] yields:

     [[1, 3], [1, 4], [2, 3], [2, 4]]

Which looks good.
Next, a method to process the Cartesian product of any number of sets. Here's a candidate implementation:

def types_cartesian(aList):
     head = aList[0]
     tail = aList[1:]
     for item in tail:
         head = cartesian(head, item)
     return head

The output of method types_cartesian(aList) given input aList=S is:

S={['static', 'dynamic'], ['weak', 'strong'], ['latent', 'manifest'], ['nominal', 'structural']}

S1 x S2 x ... Sn =

     ['static', 'weak', 'latent', 'nominal']
     ['static', 'weak', 'latent', 'structural']
     ['static', 'weak', 'manifest', 'nominal']
     ['static', 'weak', 'manifest', 'structural']
     ['static', 'strong', 'latent', 'nominal']
     ['static', 'strong', 'latent', 'structural']
     ['static', 'strong', 'manifest', 'nominal']
     ['static', 'strong', 'manifest', 'structural']
     ['dynamic', 'weak', 'latent', 'nominal']
     ['dynamic', 'weak', 'latent', 'structural']
     ['dynamic', 'weak', 'manifest', 'nominal']
     ['dynamic', 'weak', 'manifest', 'structural']
     ['dynamic', 'strong', 'latent', 'nominal']
     ['dynamic', 'strong', 'latent', 'structural']
     ['dynamic', 'strong', 'manifest', 'nominal']
     ['dynamic', 'strong', 'manifest', 'structural']

This is the same list as the one obtained earlier with the powerset approach.
The big difference is that with the powerset implementation the number of possible combinations explodes exponentially ( O(2n) ), whereas the Cartesian product is a quadratic ( O(n2) ) algorithm.

Here's the code.

Types and Programming Languages
What to know before debating type systems

Saturday, December 29

Anchoring & Brainstorming

From Quote of the week: Why brainstorming is a bad idea:

"To their surprise, the researchers found that virtual groups, where
people brainstormed individually, generated nearly twice as many
ideas as the real groups.

The result, it turned out, is not an anomaly. In a 1987 study,
researchers concluded that brainstorming groups have never
outperformed virtual groups.
Of the 25 reported experiments by psychologists all over
the world, real groups have never once been shown to be
more productive than virtual groups.
In fact, real groups that engage in brainstorming consistently
generate about half the number of ideas they would have
produced if the group's individuals had worked alone."

Fascinating stuff.
Could this be related to [Nobel laureate] Daniel Kanheman's work on the Anchoring bias, where [apparently individual] contributions during a brainstorming session overly rely (anchor) on each others' ideas?

Useful Links:
Tversky, A. & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases
The Medici Effect

Friday, December 28

Types and Programming Languages

This excellent overview of type systems by nostrademons @ reddit describes latent, manifest, static, dynamic, weak, strong, nominal, and structural typing, and places a number of popular programming languages on some of these dimensions.

I got curious about listing all the possible combinations and then find programming languages to fill in the slots.

I cooked up some quick and dirty Python code to do the job.

Let's call S = {static, dynamic, latent, manifest, weak, strong, nominal, structural}
The powerset of S has 28 = 256 combinations, only some of them useful but I can fix that later.

# return a list with 2len(l) elements (definition of powerset)
def powerset(l):
     r = []
     if l:
         head = l[:1]
         tail = l[1:]
         for item in powerset(tail):
             r.append(head + item)
     return r

The powerset of S returns the complete list of subsets of S from sizes 1 to 8 plus the empty set {}.
This means subsets {static}, {dynamic}, {static, dynamic}, ..., all the way to {static, dynamic, latent, manifest, weak, strong, nominal, structural} are included too.

However, there are 4 pairs of [kind of] mutually exclusive types and I am only interested in the subsets of size 4. Thus, I need only consider the unique 4-dimension combinations.

The maths tells me I should expect n!/k!(n-k)! = 8!/4!(8-4)! = 70 such combinations (see binomial coefficient or look up the middle number of the 8th row of Pascal's triangle).

I can obtain this by keeping only those subsets of size 4 from the list returned by the powerset. The function below will help:

# filter list keeping only n-tuples where n=size
def len_filter(l, size=1):
     return [item for item in l if len(item) == size]

Still, the set of 70 size-4 combinations contain lots of invalid members because of the 4 pairs of [kind of] mutually exclusive types (read original comment for details):

    ("static", "dynamic"),
    ("latent", "manifest"),
    ("weak", "strong"),
    ("nominal", "structural")

A simple solution is to just remove any combination that contains these "invalid" pairs. The 2 functions below will do that:

# retain n-tuples that don't contain any of the invalid pairs
def invalidpairs_filter(l):
     return [item for item in l if not is_invalid(INVALID_PAIRS, item)]

# returns True if any of the invalid tuples is contained in the given n-tuple,
# False otherwise
def is_invalid(invalid_pairs, ntuple):
     s = set(ntuple)
     for invalid_pair in invalid_pairs:
         si = set(invalid_pair)
         if si.issubset(s):
             return True
     return False

Et voilá! Running the Python program prints the 16 combinations we want:

    ['latent', 'nominal', 'static', 'weak']
    ['dynamic', 'latent', 'nominal', 'weak']
    ['manifest', 'nominal', 'static', 'weak']
    ['dynamic', 'manifest', 'nominal', 'weak']
    ['latent', 'nominal', 'static', 'strong']
    ['dynamic', 'latent', 'nominal', 'strong']
    ['manifest', 'nominal', 'static', 'strong']
    ['dynamic', 'manifest', 'nominal', 'strong']
    ['latent', 'static', 'structural', 'weak']
    ['dynamic', 'latent', 'structural', 'weak']
    ['manifest', 'static', 'structural', 'weak']
    ['dynamic', 'manifest', 'structural', 'weak']
    ['latent', 'static', 'strong', 'structural']
    ['dynamic', 'latent', 'strong', 'structural']
    ['manifest', 'static', 'strong', 'structural']
    ['dynamic', 'manifest', 'strong', 'structural']

Should there be a programming language for all the above combinations? nostrademons and grauenwolf already filled in a few of these slots (apologies if my copy & paste introduced errors):

latent, nominal, static, weak =
dynamic, latent, nominal, weak = PHP
manifest, nominal, static, weak = C, Access SQL, Java, C++, C#,
dynamic, manifest, nominal, weak =
latent, nominal, static, strong = Haskell
dynamic, latent, nominal, strong = Scheme
manifest, nominal, static, strong = Java, C++, C#, VB.NET (strict), T-SQL
dynamic, manifest, nominal, strong =
latent, static, structural, weak = VB.NET, VBScript
dynamic, latent, structural, weak = JavaScript, PHP, Assembly
manifest, static, structural, weak =
dynamic, manifest, structural, weak =
latent, static, strong, structural = Ocaml, Haskell
dynamic, latent, strong, structural = Erlang, Scheme, Common Lisp, Python, Ruby
manifest, static, strong, structural = Haskell, C++ Templates
dynamic, manifest, strong, structural = Common Lisp

Repetitions occur because most languages support multiple forms of typing.

My next task (for another night, another post) is to clean-up the table and find languages that fit in the 5 unused dimensions from the list above:

    ['latent', 'nominal', 'static', 'weak'] =
    ['dynamic', 'manifest', 'nominal', 'weak'] =
    ['dynamic', 'manifest', 'nominal', 'strong'] =
    ['manifest', 'static', 'structural', 'weak'] =
    ['dynamic', 'manifest', 'structural', 'weak'] =

Possible Questions:

  1. No programming languages for a given combination? Why is that?

  2. Once most known programming languages found their place in the matrix, it will be interesting to understand why some combinations are more popular than others

I've created project programming-misc on Google Code to hold these and other programming exercises I might engage in. The code shown above is available in the SVN repository and downloads page.

3-31-04 I'm Over It (the article by Bruce Eckel that started the conversation @ reddit)
Comment by nostrademons
Comment bt grauenwolf

It is quite bizarre that the otherwise very polished Google Analytics has a bug known since (at least) Nov 2005!
The URL considered invalid by Google Analytics is actually the one generated by and taken from Google Code. The issue is well documented with easy work-arounds, though.

Friday, December 7

One-liners that made me lol

(original theme by Niniane)

"Don't let your beard get in the way of your typing"

A comment on this Reddit thread.