Saturday, May 27, 2006

Neptune

Neptune is a new programming language inspired mainly by smalltalk and C/C++ but also taking ideas from many other places. It is, in short, a dynamically typed object-oriented language with an optional static type system. It is similar in many ways to smalltalk but has a C-like syntax. The language was designed by Esmertec AG in Århus, Denmark. Esmertec Denmark is now closed but work has started on a new implementation at sourceforge.

Posts

I've written about it in these posts:

  • Types #2: more about the type system, including the void type and protocol literals.

  • Types: an overview of the static type system

  • Selectors: about selector objects, a dynamic representation of method names, which is a very powerful abstraction.

  • Characters: about a neat way to specify character literals in neptune.

  • Why Neptune?: A post that tries to explain why we decided to switch from using smalltalk to designing our own language.

  • Traits: How traits work in neptune.

  • Exceptions: About neptune's exception mechanism which has some unusual features

  • Using: How neptune's using statement works

  • Brevity: An example of how you could represent a simple concept in neptune, demonstrating various language features

  • C# 3.0: A look at some new features in C# that look very similar to features in neptune.

  • Constructors: How constructors work.

  • Interpol: About neptune's approach to string construction: string interpolation.

  • Structs and Memory: Describes a tool I've written to make it easier to work with external C structures from neptune. Describes neptune's interface to external calls and external memory.

I'll keep this list updated as I write new posts.

Wednesday, May 24, 2006

Selectors

One of the most important rules in software engineering is don't repeat yourself. If you find yourself writing the same code, or almost the same code, in more than one place then that is a sign that your code smells. For instance, if you see this code
Node root = current_node;
while (root.get_parent() != null)
  root = root.get_parent();
near this code
Node topmost = left_leaf;
while (topmost.get_parent() != null)
  topmost = topmost.get_parent();
you should feel a strong urge to factor out the similarities:
Node find_root(Node start) {
  Node current = start;
  while (current.get_parent() != null)
    current = current.get_parent();
  return current;
}
and then just use that method:
Node root = find_root(current_node);
...
Node topmost = find_root(left_leaf);
Refactorings like this are something we do all the time: we notice similarities in our code and factor them out.

But not all similarities are easy to factor out. In the example above it was easy: the same thing was done with two different objects, current_node and left_leaf. Factoring out the subject of an operation is usually easy: you just create a method or function that takes the subject as an argument. But consider these two code snippets:
Node root = current_node;
while (root.get_parent() != null)
  root = root.get_parent();
and
File root_directory = current_directory;
while (root_directory.get_enclosing_directory() != null)
  root_directory = root_directory.get_enclosing_directory();
These two pieces of code are almost identical in what they do but in this case they're not only different in the object they operate on but also in which method is called. In most object-oriented languages you can't "factor out" the name of a method, like get_parent or get_enclosing_directory in this example, so you can't write a find_root method that can be used to replace both loops as we could in the previous example.

In neptune, on the other hand, there is a mechanism for abstracting over method names: selectors. A selector is an object that represents the name of a method. For instance, the name of the get_enclosing_directory method is written as ##get_enclosing_directory:0. The syntax of a selector, at least in the common case, is ## followed by the name of the method, colon, and the number of arguments expected by the method. Given a selector object you can invoke the corresponding method on an object using the perform syntax:
Selector sel = ##to_string:0;
Point p = new Point(3, 5);
String s = p.{sel}(); // = p.to_string()
The syntax recv.{expr}(args...) means "invoke the method specified by expr on recv with the specified arguments". Using this, the loop example from before can be refactored into
find_root(var start, Selector method_name) {
  var current = start;
  while (current.{method_name}() != null)
    current = current.{method_name}();
  return current;
}
and then the two instances can call that method:
Node root = find_root(current_node, ##get_parent:0);
...
File root_directory = find_root(current_directory,
    ##get_enclosing_directory:0);
Using selectors this way can sometimes be useful but code that is identical except for the name of a method is pretty rare, at least in my experience. But selectors can be used for many other things.
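
For readers without a neptune implementation at hand, the same refactoring can be roughly sketched in Python. This is only an analogy: a plain string stands in for the selector and getattr plays the role of perform. The Node and Directory classes are made up for illustration.

```python
class Node:
    """Hypothetical tree node, standing in for the Node class above."""

    def __init__(self, parent=None):
        self._parent = parent

    def get_parent(self):
        return self._parent


class Directory:
    """Hypothetical directory, standing in for the File class above."""

    def __init__(self, enclosing=None):
        self._enclosing = enclosing

    def get_enclosing_directory(self):
        return self._enclosing


def find_root(start, method_name):
    """Follow the zero-argument method named method_name until it returns None."""
    current = start
    while True:
        # getattr looks the method up by name, roughly like recv.{sel}()
        parent = getattr(current, method_name)()
        if parent is None:
            return current
        current = parent


top = Node()
leaf = Node(Node(top))
root_dir = Directory()
home = Directory(Directory(root_dir))

root = find_root(leaf, "get_parent")
root_directory = find_root(home, "get_enclosing_directory")
```

Unlike a neptune selector, the string here carries no argument count, so a misspelled or mismatched name only shows up when the call is made.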

One of the most useful applications of selectors is delegates. A delegate is a selector coupled with an object. You can think of it as a delayed method call: you specify a particular method to call on a particular object but you don't perform the call just yet.
Point p = new Point(3, 5);
Delegate del = new Delegate(p, ##to_string:0);
String s = del();
Here, we create an object, then we create a delegate which can be used to send to_string() to the object, and finally we invoke the delegate which causes to_string() to be called on the point. The syntax for invoking a delegate is the standard function call syntax: delegate(args...).

The syntax new Delegate(...) is a bit cumbersome so there is also a binary operator, =>, that can be used to create delegates:
...
Delegate del = (##to_string:0 => p);
...
How are delegates useful? Well, the place where I've had most use for them is as event handlers. For instance, we have a rudimentary GUI toolkit based on Qt that uses delegates for all events:
void draw_controls(qt::Widget parent) {
  qt::Button ok_button = new qt::Button(parent);
  ok_button.add_on_click_listener(##ok_button_clicked:0 => this);
}

void ok_button_clicked() {
  System.out.println("Ok button clicked");
}
This code demonstrates how delegates can be used in a very light-weight mechanism for specifying event handlers, in this case causing the system to print a message on the console each time the button is clicked. And if we use accessor methods the code that sets the event handler can be made even more concise:
...
ok_button.on_click = (##ok_button_clicked:0 => this);
...
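
As a point of comparison, here is a small Python sketch of the same event handler pattern. In Python a bound method such as self.ok_button_clicked already couples a method with an object, so it can stand in for a delegate. The Button and Dialog classes are hypothetical stand-ins and have nothing to do with the actual Qt API.

```python
class Button:
    """Hypothetical stand-in for a toolkit button; not the Qt API."""

    def __init__(self):
        self.on_click = None  # a callable taking no arguments, or None

    def click(self):
        """Simulate a user click by invoking the registered handler."""
        if self.on_click is not None:
            self.on_click()


class Dialog:
    def __init__(self):
        self.messages = []
        self.ok_button = Button()
        # A bound method couples the method with self, much like
        # (##ok_button_clicked:0 => this) does in the neptune code.
        self.ok_button.on_click = self.ok_button_clicked

    def ok_button_clicked(self):
        # Recorded in a list instead of printed, to keep the sketch quiet.
        self.messages.append("Ok button clicked")


dialog = Dialog()
dialog.ok_button.click()
dialog.ok_button.click()
```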
Another use of delegates is for spawning threads. Besides just invoking a delegate, you can also call the spawn method which invokes the delegate in a new thread:
void start_process() {
  Worklist list = new Worklist();
  (##produce:1 => this).spawn(list); // spawn producer
  (##consume:1 => this).spawn(list); // spawn consumer
}

void produce(Worklist list) {
  while (true) {
    var obj = produce();
    list.offer(obj);
  }
}

void consume(Worklist list) {
  while (true) {
    var obj = list.take();
    consume(obj);
  }
}
The start_process method starts two threads: one that adds objects to the worklist and one that consumes those objects, again in a very light-weight fashion using delegates to invoke two local methods in separate threads. I think that's pretty elegant!

Unlike in languages like C#, delegates in neptune are not "magic"; they are implemented as pure neptune code that uses selectors and perform to do the actual delegation. While selectors might not look like that useful a construct, they can be used to build some very powerful abstractions.
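
To give a feel for how little is needed, here is a rough Python sketch of a delegate built from nothing but name-based method lookup, including a spawn method. It is not neptune's actual Delegate implementation, just an analogy: strings stand in for selectors, and the Process class is a made-up producer/consumer pair that, unlike the endless loops above, stops after a fixed number of items so that the sketch terminates.

```python
import queue
import threading


class Delegate:
    """A receiver plus a method name; built only from getattr, no magic."""

    def __init__(self, receiver, method_name):
        self.receiver = receiver
        self.method_name = method_name

    def __call__(self, *args):
        # The "perform" step: look the method up by name and invoke it.
        return getattr(self.receiver, self.method_name)(*args)

    def spawn(self, *args):
        """Invoke the delegate in a new thread and return that thread."""
        thread = threading.Thread(target=self, args=args)
        thread.start()
        return thread


class Process:
    """Hypothetical producer/consumer pair; stops after `count` items."""

    def __init__(self, count):
        self.count = count
        self.consumed = []

    def produce(self, worklist):
        for i in range(self.count):
            worklist.put(i)
        worklist.put(None)  # sentinel: tells the consumer to stop

    def consume(self, worklist):
        while True:
            item = worklist.get()
            if item is None:
                break
            self.consumed.append(item)

    def start(self):
        worklist = queue.Queue()
        threads = [
            Delegate(self, "produce").spawn(worklist),  # spawn producer
            Delegate(self, "consume").spawn(worklist),  # spawn consumer
        ]
        for thread in threads:
            thread.join()


process = Process(5)
process.start()
```

Calling a Delegate directly, as in Delegate(p, "to_string")(), goes through the same __call__ method that the spawned thread uses.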

Tuesday, May 16, 2006

Prefix Keywords

Even though we've decided not to use smalltalk in OSVM it doesn't mean that we haven't been happy with smalltalk ourselves. You always want what you can't have and if I wasn't a smalltalk fan before, I was certainly made one by the decision that we were not going to use it anymore. Having said that, I think there are situations where smalltalk is less than perfect; in particular, I find some smalltalk code very hard to read. In this post I'll discuss some examples of this and describe a neat solution (at least I think so) I once experimented with.

One thing that can make smalltalk code hard to read is the fact that almost everything is a message send, and that the receiver of a message send is always written first. For instance, in a send to ifTrue:ifFalse: you always write the condition first because that is the receiver of the send:
self size ~= other size ifTrue: [ ... ] ifFalse: [ ... ]
When you read this line you have to read past the self size ~= other size part before you see that what you're reading is actually the condition of a conditional. In this case the line is short so it's not a big problem, but it still costs me one or two extra brain cycles when reading the line. I would save those if I could write the conditional so that it was immediately clear, when reading from left to right, that it was a conditional. For instance:
if: (self size ~= other size) ...
The problem becomes much worse when you consider the operations whose receiver is a block, for instance some repetition, thread and exception operations, because blocks can be very long. An example is this code snippet taken from Squeak (well, slightly changed):
[
  [ | promise |
    promise := Processor tallyCPUUsageFor: secs
                         every: msecs.
    tally := promise value.
    promise := nil.
    self findThePig.
  ] repeat
] on: Exception do: [
  ^nil.
]
What's happening here? The outer block is an exception handler, the body of which repeatedly performs some operation. When you read this code you have to read ahead several lines to understand what each block means. By the way, I have no idea what findThePig does.

Another example, also taken from Squeak, is this:
[
  [
    newMorph := NetworkTerminalMorph connectTo:
      self ipAddress.
    WorldState
      addDeferredUIMessage:
        [newMorph openInStyle: #scaled] fixTemps.
  ]
    on: Error
    do: [ :ex |
      WorldState addDeferredUIMessage: [
        self inform: 'No connection to: ',
          self ipAddress,' (',ex printString,')'
      ] fixTemps
    ].
] fork
Here you not only have to read way ahead to see that the outer block forks a thread, but you have to look closely at the body of the inner block to spot the on:do: selector which doesn't stand out at all but is very important in understanding the code. This is only two levels of nesting and if you add another level the code becomes even harder to read.

Back when I wrote a lot of smalltalk I started inserting comments to make code such as this easier to read. For instance, I would insert "do" before a loop and "try" before an exception handler:
"try" [
  "do" [ | promise |
    promise := Processor tallyCPUUsageFor: secs
                         every: msecs.
    tally := promise value.
    promise := nil.
    self findThePig.
  ] repeat
] on: Exception do: [
  ^nil.
]
Here I don't have to read ahead because I can see which kind of construct I'm dealing with even before I see the opening bracket. You still have to read to the end to understand exactly how the code is repeated and which exceptions are caught, but my experience is that this gives just enough context that you can read and understand the code in one go, something I couldn't do with the previous version of the code, without the comments.

Back when we were discussing how to change the language, before we decided to go for a C/C++ style syntax, I suggested something called prefix keywords to make the language slightly more C-like and easier to read. It was never very popular but I still think it's a pretty decent idea. The suggestion was to allow keyword sends that start with a keyword. You can still write ordinary keyword sends such as:
(...) ifTrue: [...] ifFalse: [...]
but you would also be allowed to write:
if: (...) then: [...] else: [...]
which means sending the if:then:else: message to the first expression, with the two blocks as arguments. You would declare the if:then:else: method on booleans like this (here on True):
if: self then: thenPart else: elsePart
  ^thenPart value
That way, instead of using comments to mark the beginning of constructs like above, you can use a prefix keyword:
try: [
  do: [ | promise |
    promise := Processor tallyCPUUsageFor: secs
                         every: msecs.
    tally := promise value.
    promise := nil.
    self findThePig.
  ].
] on: Exception do: [
  ^nil.
]
There's no problem in parsing it and the semantics is as simple as the three other kinds of sends. It also works pretty well for methods such as acquireDuring:. Consider
mutex acquireDuring: [
  ... critical section ...
]
compared with
acquire: mutex during: [
  ... critical section ...
]
I think it's pretty neat. It does change the flavor of the code somewhat but I think it makes the code much easier to read.
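
The dispatch rule behind if:then:else: can be mimicked in Python, purely as an illustration and not as anything we implemented. In this sketch the classes TrueValue and FalseValue stand in for Smalltalk's True and False, lambdas stand in for blocks, and the free function if_ is a hypothetical name for the prefix form: it simply treats its first argument as the receiver.

```python
class TrueValue:
    """Stand-in for Smalltalk's True class."""

    def if_then_else(self, then_part, else_part):
        # On True, only the "then" block is evaluated.
        return then_part()


class FalseValue:
    """Stand-in for Smalltalk's False class."""

    def if_then_else(self, then_part, else_part):
        # On False, only the "else" block is evaluated.
        return else_part()


def if_(receiver, then, else_):
    """Prefix form: the receiver is simply the first argument of the send."""
    return receiver.if_then_else(then, else_)


yes, no = TrueValue(), FalseValue()

# Receiver-first form, like: (...) ifTrue: [...] ifFalse: [...]
a = yes.if_then_else(lambda: "then", lambda: "else")

# Prefix form, like: if: (...) then: [...] else: [...]
b = if_(no, then=lambda: "then", else_=lambda: "else")
```

Both forms end up in the same method on the receiver; the prefix form only changes what you read first.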

Monday, May 15, 2006

Totten

I read the Middle East Journal, a blog written by freelance journalist Michael Totten. Recently, he asked his readers what we thought about him no longer going on paid assignments. Instead, he would write exclusively for the readers of his blog and be financed by our voluntary tips. The feedback in the comments was very positive and apparently so was the feedback in his tip jar. So now, if things work out, he's off to Iran. And maybe later Afghanistan, or Syria, or North Korea.

It's an interesting experiment -- a journalist cutting out the middleman and writing directly for, and interacting with, a group of people. At first I was pretty excited about it but as I've thought a bit more about it I'm having some second thoughts. For one thing, some might argue that the middleman, the editor, is there for a reason. I would tend to agree. Also, these are dangerous places he'll be going. Will I be (partially) responsible if something happens to him? I'm sure that the more remote and dangerous the places he writes about, the more people will read his blog and hit his tip jar. Will that put pressure on him to visit these places and put himself at even greater risk?

On the other hand there's the distinct possibility that I shouldn't think so much and just enjoy the blog.

Tuesday, May 09, 2006

Entertainment

Where does the weirdest wikipedia content go when it dies? Why, to BJAODN, bad jokes and other deleted nonsense.

From Coleoptera

The most musical of the Insecta, Coleoptera are known for their tight, 4-part harmonizing and catchy melodies.

From Malaga (province)

The Sun Coast (Costa del Sol) is a concrete monster that swallows, burns, and spits back millions of happy European tourists.

From Alternative rock

Alternative rock is the name given to one stone when you're looking at another stone. The term was coined by photographer Edwin Blastocyst when looking at one stone and speaking about another, oddly enough.

    The quote from Edwin Blastocyst needs to be verified.

In other (old) news: They're made out of meat is pure genius and now a film has been made based on the original short story.