A great summary of OOPSLA 2007
[info]chanson
Dan Weinreb, one of the founders of Symbolics, has posted an excellent summary of OOPSLA 2007 to his weblog.

For anyone who doesn't know, OOPSLA is the ACM SIGPLAN's annual Conference on Object-Oriented Programming Systems, Languages and Applications.

I wasn't able to make it to Montréal for OOPSLA 2007, so I've made sure to listen to the OOPSLA 2007 podcast and read what others have written about it. Hopefully I can make it to Nashville for OOPSLA 2008!

If you have anything at all to do with software development, and you didn't make it to Montréal either, I can't encourage you strongly enough to listen to everything in the OOPSLA podcast. Richard Gabriel (conference chair) and his team did an absolutely amazing job of lining up keynote speakers — you'll be very entertained and you'll learn a huge amount too.

The SmalltalkAgents foreign-function interface
[info]chanson
I learned object-oriented programming first by reading the introductory materials shipped with Digitalk Smalltalk/V and then by using QKS SmalltalkAgents on my Centris 610 and my PowerBook 520. I think that's had a profound influence on my career and outlook compared to those who learned with C++ or Java, for example, almost as profound as the fact that I originally learned to program using Logo instead of Applesoft BASIC.

One of the great things about SmalltalkAgents compared to most other similar environments was its great integration with the Macintosh System Software. It accomplished this by creating a great foreign-function interface that let you call arbitrary non-Smalltalk code from within SmalltalkAgents easily, efficiently, and in a way that Just Worked when it came to interacting with the operating system. And the fact that it could do this with the 68000-based Macintosh was a pretty interesting feat, due to the multiplicity of runtime models and calling conventions that wouldn't be unified until the release of the Power Mac.

Here's how it worked: SmalltalkAgents extended normal Smalltalk syntax with an additional construct, the ExternalMethod invocation, which was effectively a function call with inline type information wrapped in «double angle brackets». (On a Mac, those are option-\ and option-shift-\ respectively; you could also use << and >> if you wanted.) Not only that, but SmalltalkAgents also supported a concept of structured storage for objects that would allow you to pass around C-style structures and pointers to them very easily.

So an invocation of a foreign function would look like this:
  sayHello
    | result |
    
    result := «printf(('Hello, world!' asCString):Ptr):Int».
    
    (result < 0) ifTrue: [ Exception raise: 'Error calling printf.' ].

    ^self
This is a method, sayHello, that invokes the ExternalMethod listed in the module-global ExternalMethodDictionary under the key symbol #printf. It declares the argument and return value types and uses some of the object coercions built into the SmalltalkAgents frameworks to get values that will be passed to the foreign code properly.

Creating a new ExternalMethod is very easy, too, especially if you can rely on it to follow one of a couple of standard calling conventions. SmalltalkAgents defines all of the basic machine-style parameter types you'd expect, and you pass them at the call site rather than when creating an ExternalMethod, so you don't have to do the sort of complicated header file parsing that many similar technologies require. Instead, you can just create what amounts to a list of the external methods you want to support and provide some way to hook them to their actual code (whether via a load of a code resource or a pointer in an external library), and then you can Just Use Them in your Smalltalk code.
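To give a feel for "types at the call site," here's a minimal sketch in Python using the standard ctypes module. It is not SmalltalkAgents code. EXTERNAL_METHODS and call_external are names made up for this illustration, and it assumes a POSIX system where the C library's symbols are visible through CDLL(None).

```python
import ctypes

# Assumption: a POSIX system, where CDLL(None) exposes the C library's symbols.
libc = ctypes.CDLL(None)

# Hypothetical stand-in for the ExternalMethodDictionary: names mapped to
# entry points, with no type information recorded at registration time.
EXTERNAL_METHODS = {
    "strlen": libc.strlen,
    "abs": libc.abs,
}

def call_external(name, restype, *typed_args):
    """Call a registered C function, taking all type information from the call site."""
    func = EXTERNAL_METHODS[name]
    func.restype = restype
    func.argtypes = [ctype for ctype, _ in typed_args]
    return func(*(value for _, value in typed_args))

# Each argument is written as (type, value), loosely like «strlen(value:Ptr):Int».
length = call_external("strlen", ctypes.c_size_t, (ctypes.c_char_p, b"Hello, world!"))
magnitude = call_external("abs", ctypes.c_int, (ctypes.c_int, -42))
```

Registering a function is just adding an entry to the dictionary; nothing has to parse a header file first.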

It worked great, and I think it's something a lot of modern systems could learn from.

Interface Builder helps you leverage MVC
[info]chanson
Kevin Hoffman, The .NET Addict's Blog: A NYC .NET Developer in Steve Jobs' Court - Day 1:
I took a step back and tried to figure out what the hell was going on. After a few minutes, I realized that Xcode was actually doing a really, really, really good thing. The vast majority of problems that arise from a poor separation of concerns between the GUI and the underlying code, model, and controller (if you even have such constructs in your app!) stem from the fact that you can double-click a button and immediately start writing code without thinking about the consequences of such a thing.

The NIB (stands for NeXTStep Interface Builder) file is a loosely coupled, standalone, user interface definition. It wouldn't make sense to double-click a button and immediately be taken to a code-behind. Instead, you have to create a controller class, and then ctrl-drag from the button to the controller and then pick the outlet you're going to use (something like click: or calculate:). To me, as a huge fan of the MVC pattern, this makes perfect sense. And it seems so elegant in its simplicity, and so incredibly cool in the fact that it is truly enforcing good design simply by the way the IDE works.
Kevin sees very clearly what sometimes takes people quite a while to realize: The design of Interface Builder and the design of the Cocoa frameworks go hand-in-hand to help you best leverage a model-view-controller architecture in your application and thus create something maintainable over the long haul. Interface Builder isn't just a "form painter" for a rapid application development environment that's discarded by the "more serious" developers — it's a critical piece of technology that every Cocoa developer can leverage to improve their application's design.

Unfortunately I can't say a whole lot about Leopard and its improvements to Interface Builder here. I can, however, point you to the Xcode 3.0 page on Apple's Mac OS X Leopard Sneak Peek site and the Leopard Technology Series for Developers at the Apple Developer Connection, which includes a great high-level Developer Tools Overview.

Create classes to represent your model objects
[info]chanson
Collections are quite widely used in a lot of the code I see posted to the Cocoa-dev mailing list and elsewhere. Unfortunately, many times they're not used to represent collections of objects but rather to represent objects themselves. In one recent pathological case, I even saw some code that used an NSMutableArray as a model object instead of the typical NSMutableDictionary, complete with hard-coded indexes to represent the different properties of the object!

Why's this a problem? This goes back to the core tenet of object-oriented programming: you write software by sending messages to objects, and these objects are state plus behavior. When you use a dictionary or an array as a model object, behavior goes out the window: the only behavior you'll get out of that model object is that which makes it a dictionary or array, not anything specific to your application.

Thus even if you're only just going to implement accessors to start with, you're generally better off creating classes to stand behind your model objects because you'll have an easier time moving appropriate behavior into them. And if your application targets Leopard this is extra-easy thanks to the new support in Objective-C 2.0 for properties.
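To make the difference concrete, here's a minimal sketch in Python rather than Objective-C. The Invoice class and its fields are invented for the example; the point is only that the class gives the behavior somewhere to live.

```python
from dataclasses import dataclass

# A dictionary standing in for a model object: state only. Any behavior
# that belongs to an invoice has to live somewhere else.
invoice_as_dict = {"subtotal": 100.0, "tax_rate": 0.08}

@dataclass
class Invoice:
    """Hypothetical model class: the same state, plus behavior that goes with it."""
    subtotal: float
    tax_rate: float

    def total(self) -> float:
        # Application-specific behavior now has an obvious home.
        return round(self.subtotal * (1 + self.tax_rate), 2)

invoice = Invoice(subtotal=100.0, tax_rate=0.08)
amount_due = invoice.total()
```

Even if total were the only method for now, later behavior (validation, formatting, change tracking) can move into Invoice without touching its callers.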

Collect 'em all!
[info]chanson
As of a used book delivery today, I now own three-quarters of all the products that Taligent shipped! The only thing I don't have is CommonPoint 1.0 for AIX, which did actually ship. (I've seen a CD set for it.) The CommonPoint documentation is online though!

Haskell and functional programming.
[info]chanson
So I'm learning Haskell, a purely functional programming language, by reading Yet Another Haskell Tutorial. It's pretty interesting, and appears to have a much nicer feel than Standard ML did.

I'm particularly interested in the way Haskell uses monads to represent non-functional aspects (I/O and other state changes) without introducing side effects to the language itself or compromising its purely functional nature.

Functional programming is quite different from object-oriented programming, which is all about state manipulation (after all, OOP is "programming by sending messages to objects"). On the other hand, pure functional programming and pure object-oriented programming have a lot of the same "values" underneath them: Leverage the type system to do most of the work in modeling your problem domain, and when you write code, write it in very small, concise, discrete units that each express one concept clearly. In either situation, long do-everything methods are a major code smell.

I was spurred to actually sit down and learn Haskell after reading through most of Google's MapReduce Programming Model — Revisited by Ralf Lämmel (blog).

Leverage Cocoa patterns
[info]chanson
David Aames posted on the Cocoa-Dev mailing list asking for some general development tips for managing complexity in the project he's working on. I responded pretty comprehensively based on my experience dealing with complex Cocoa projects, but I wanted to call out part of it here because I think it's a subtle point that a lot of people could leverage. (Note that I've cleaned it up just a little from what I posted. I'm a compulsive editor, what can I say?)

The Cocoa frameworks are built around a number of common design patterns (model-view-controller, target-action and chain of responsibility, delegation, notification and observation, etc.), so structure your code along the same lines. That will keep your cognitive load low as you work with your code, in that the only thing you'll really have to worry about is what's unique about your code, since its interaction with the rest of both your code and the frameworks you're using all follows common patterns.

As an example, say you want a view that displays a hierarchical diagram of some data in your application. One way to do this is to tie your view very tightly to the data it's displaying — perhaps even to the point of having the view act as the "container" for that data. However, if you do this, you'll have to remember everything about this code and how it works every time you want to change it. That's a pretty high cognitive load.

Instead, with probably about the same amount of effort, you could implement your view in such a way that it uses an arbitrary cell to actually draw its data, and it's designed to be bound to an NSTreeController (and/or uses the NSOutlineView data source informal protocol) to obtain the data to display. This may result in a slightly more complex internal design, but it's likely to be both more reusable and easier to maintain because much of what it does goes through standard interfaces. In other words, the only code in the view is the code that makes it unique and the code that hooks it into the rest of Cocoa.
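The shape of that second design can be sketched in a few lines. This is Python, not Cocoa, and OutlineDataSource, OutlineView, and TupleTreeSource are hypothetical names that only loosely echo the outline-view data source idea.

```python
from typing import Protocol, Sequence

class OutlineDataSource(Protocol):
    """Hypothetical protocol: everything the view is allowed to ask about the data."""
    def children_of(self, item: object) -> Sequence[object]: ...
    def label_for(self, item: object) -> str: ...

class OutlineView:
    """Knows how to lay out a hierarchy; knows nothing about what the items are."""
    def __init__(self, data_source: OutlineDataSource) -> None:
        self.data_source = data_source

    def render(self, item: object = None, depth: int = 0) -> list:
        lines = []
        for child in self.data_source.children_of(item):
            lines.append("  " * depth + self.data_source.label_for(child))
            lines.extend(self.render(child, depth + 1))
        return lines

class TupleTreeSource:
    """Example data source over (label, children) tuples; None means the root."""
    def __init__(self, roots):
        self.roots = roots

    def children_of(self, item):
        return self.roots if item is None else item[1]

    def label_for(self, item):
        return item[0]

source = TupleTreeSource([("src", [("main.c", []), ("util.c", [])]), ("README", [])])
lines = OutlineView(source).render()
```

Swapping in a different data source changes what is displayed without touching OutlineView, which is the maintainability win described above.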

Steve Yegge describes what's wrong with Lisp
[info]chanson
Steve Yegge, Lisp is Not an Acceptable Lisp:
You've all read about the Road to Lisp. I was on it for a little over a year. It's a great road, very enlightening, blah blah blah, but what they fail to mention is that Lisp isn't the at the end of it. Lisp is just the last semi-civilized outpost you hit before it turns into a dirt road, one that leads into the godawful swamp most of us spend our programming careers slugging around in. I guarantee you there isn't one single Lisp programmer out there who uses exclusively Lisp. Instead we spend our time hacking around its inadequacies, often in other languages.
Steve does a very good job of articulating a lot of the things I dislike about Lisp, especially Common Lisp. One interesting thing, though, is that a lot (but not all) of the issues he raises are addressed by Dylan.

One of the more interesting things about Dylan in this context is that, despite not adopting a traditional message-based object system like Smalltalk or Objective-C (or their more static cousins C++ and Java), Dylan does push objects all the way down, but in a functional style. It appears to work pretty well, making it easy to define both nouns and verbs in the combinations a developer might need, and it even (through its hygienic macro system) allows developers to extend language syntax too.

Dylan
[info]chanson
From the Introduction to The Dylan Reference Manual:

Dylan is a general-purpose, high-level programming language, designed for use in application and systems programming. Dylan includes garbage collection, type-safety, error recovery, a module system, and programmer control over runtime extensibility of programs.

The name "Dylan" is a portmanteau of the words "dynamic" and "language." Dylan is designed to allow efficient, static compilation of features normally associated with dynamic languages.

There's a lot more information at Gwydion Dylan. I became interested in the language back in the early 1990s, when Apple sent copies of the original book on the original version of the language to any developer that asked.

A lot of top-notch Lisp hackers worked on Dylan, including a lot of people who came from the Lisp machine community. Take a look at these screenshots of a project browser and a class browser from Apple's Dylan environment.

A reasonable way of describing Dylan would be as Scheme plus the Common Lisp Object System (CLOS), cleaned up quite a bit, with a Pascal-style infix syntax. I much preferred it before the change to the infix syntax. That change made code needlessly verbose, and it made both the macro system itself and Dylan implementations much more difficult to build than the original Lisp-style syntax would have.

After reading The Art of the Metaobject Protocol, I have to say that Lisp-syntax Dylan is a much cleaner language than Common Lisp. The price of that is, of course, that Dylan isn't compatible with the existing body of Lisp code, whereas Common Lisp strove for portability.

Like CLOS, Dylan is based on interacting with objects which are instances of classes and are made up of slots. Classes can inherit from other classes, even more than one, and there are fairly straightforward rules describing what happens when inheriting from multiple classes that declare slots with the same name, or from multiple classes that share base classes.

The most significant difference between CLOS and Dylan on one side and Smalltalk and C++ on the other side is that interacting with objects isn't done by sending them messages. Instead, objects are strictly data; they have generic functions applied to them. Generic functions are essentially collections of methods whose arguments are specialized on the classes of the objects they interact with. When a generic function is applied to some objects, the most specific method that is specialized on those objects' classes is invoked; that method can, in turn, invoke the next-most-specific method, and so on.
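Here's a rough sketch of the generic function idea in Python. It is neither Dylan nor CLOS: GenericFunction is a toy written for this example, it ranks methods by each argument's distance along the class's method resolution order with left-to-right precedence, and it passes next_method explicitly as the first parameter. Real implementations are considerably more careful.

```python
class GenericFunction:
    """Toy generic function: a collection of methods keyed by argument classes."""
    def __init__(self, name):
        self.name = name
        self.methods = []  # list of (classes, implementation)

    def method(self, *classes):
        def register(implementation):
            self.methods.append((classes, implementation))
            return implementation
        return register

    def applicable(self, args):
        """Return the applicable implementations, most specific first."""
        found = []
        for classes, implementation in self.methods:
            if len(classes) == len(args) and all(
                isinstance(arg, cls) for arg, cls in zip(args, classes)
            ):
                # Smaller MRO index means the specializer is closer to the
                # argument's own class, i.e. more specific.
                rank = tuple(type(arg).__mro__.index(cls) for arg, cls in zip(args, classes))
                found.append((rank, implementation))
        found.sort(key=lambda pair: pair[0])
        return [implementation for _, implementation in found]

    def __call__(self, *args):
        chain = self.applicable(args)
        if not chain:
            raise TypeError(f"no applicable method for {self.name}")

        def call_from(index):
            def next_method():
                return call_from(index + 1)
            return chain[index](next_method, *args)

        return call_from(0)

class Shape: pass
class Circle(Shape): pass
class Square(Shape): pass

collide = GenericFunction("collide")

@collide.method(Shape, Shape)
def _(next_method, a, b):
    return "generic shape collision"

@collide.method(Circle, Square)
def _(next_method, a, b):
    return "circle hits square, then " + next_method()

result = collide(Circle(), Square())
fallback = collide(Square(), Circle())
```

Note that the methods belong to collide, not to Circle or Square; the classes themselves stay pure data.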


Why I Hate Code Generation
[info]chanson
One of the areas where I diverge most strongly from the Software Factories people is in the use of code generation.

In the standard articulation of the concept by Jack Greenfield et al., code generation plays a central role. Experts craft components and tools that can be manipulated using a domain-specific language, which has an XML representation, which in turn is used to generate the code for applications built with those components and in that language.

Data and code are really just two sides of the same coin. This is one of the fundamental truths of computer science. So why would you bother translating some data into code in order to do something with it? You can just use some code whose execution is driven by the data!

This has some important advantages. For one, you don't have to open that huge can of worms known as "round-tripping": you simply aren't in situations where you might have to translate backwards from modified, generated code to your domain-specific language. Even more importantly, though, you can actually update the framework without requiring a developer to re-generate their application!

These are such powerful advantages that I simply can't understand why anyone would want to work any other way. Especially in today's world where commonly-used languages have very powerful introspection mechanisms, there are very good persistence frameworks that make it easy to manage large and complex object graphs, there are standard data representation and query languages, and software flexibility and extensibility is paramount.
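A small sketch of "code whose execution is driven by the data," in Python. PERSON_MAPPING, Person, to_row, and insert_statement are all invented for the example; the mapping plays the role of the domain-specific description that a generator would otherwise turn into source code.

```python
# Hypothetical persistence mapping. A code generator would emit a class per
# mapping; here the mapping stays data and is interpreted at runtime.
PERSON_MAPPING = {
    "table": "people",
    "columns": {"name": "full_name", "age": "age_years"},  # attribute -> column
}

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

def to_row(obj, mapping):
    """Generic code: walk the mapping and read attributes by introspection."""
    return {column: getattr(obj, attr) for attr, column in mapping["columns"].items()}

def insert_statement(obj, mapping):
    """Build a parameterized INSERT from the mapping; nothing here is Person-specific."""
    row = to_row(obj, mapping)
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    sql = f"INSERT INTO {mapping['table']} ({columns}) VALUES ({placeholders})"
    return sql, list(row.values())

sql, params = insert_statement(Person("Ada", 36), PERSON_MAPPING)
```

Changing the mapping changes the behavior immediately, and improving to_row or insert_statement improves every application that uses them, with nothing to regenerate.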

Note that what I'm talking about is code generation "in the large." This is the kind of code generation that some human interface design tools and some persistence tools engage in. You lay out an interface and then the tool spits out a whole bunch of code that actually creates the interface at runtime. You define some persistence mappings between classes and a database and then the tool spits out some code that actually performs the mapping. And so on. This kind of code generation is harmful because the generated code is both static and fragile, and it's ultimately working at the wrong level of abstraction.

Code generation "in the small" — for creating method and class stubs, for example, or accessor methods — is just fine by me. It can be a great way to streamline the software development workflow, just like code completion and easy documentation browsing. And it's sensible code generation: The whole point is to generate code rather than to solve a higher-level problem, so it's working at an appropriate level of abstraction.

Model/View/ViewModel
[info]chanson
This guy must be kidding. Introduction to Model/View/ViewModel pattern for building WPF apps (John Gossman):
Model/View/ViewModel is thus a refinement of MVC that evolves it from its Smalltalk origins where the entire application was built using one environment and language, into the very familiar modern environment of Web and now Avalon development.
Yeah, because Model-View-Controller is just so inadequate when you're using visual human interface construction tools to create desktop applications or web applications.

I mean, it's not like anyone else has had an "interface builder" as an integral part of their platform. Or tools for creating web applications out of reusable and composable "web objects."

Why can't I do that in a thread?
[info]chanson
Threads are a very powerful concept, but there's a lot of confusion about what is and isn't thread-safe in Cocoa. Just this morning there was a question on the Cocoa-Dev list about how to append to an NSTextStorage from a non-main thread.

Cocoa is a framework, not just a class library. The distinction is subtle but important: A class library provides a set of classes you can use to build software. The C++ Standard Template Library is a class library. On the other hand, a framework is something that your application plugs into to build software. In other words, a framework is like Hollywood: Don't call us, we'll call you.

Furthermore, Cocoa does all of its event handling and drawing on the main thread, the first thread created in your application. This means that no matter what you're doing on another thread, Cocoa may try to process user events or do some drawing. And since Cocoa is in control, not your code, just because you add locks around all of the non-thread-safe functionality in your application doesn't mean that Cocoa will use them.

So, for example, if you want to append to an NSTextStorage you need to do so from the main thread. If you want to reload an NSTableView you need to do so from the main thread. If you want to update or access the value of any control, you need to do so from the main thread.

What's more, this kind of thing can happen as a side-effect now as a result of Key-Value Observing and Cocoa bindings. If you change a property using Key-Value Coding — or even using the property's accessors when automatic observer notifications are enabled — and there are observers, value-changed notifications will be sent to those observers immediately. In other words, on the same thread where the value was changed.
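The usual remedy is to hand the update over to the main thread instead of performing it where the work happened; in Cocoa that's what performSelectorOnMainThread:withObject:waitUntilDone: is for. Here's a minimal Python sketch of the same shape, not Cocoa code: main_queue, text_storage, and append_text are stand-ins made up for the example, and the loop at the end is a drastically simplified "run loop."

```python
import queue
import threading

main_queue = queue.Queue()   # work items destined for the thread that owns the UI
text_storage = []            # stand-in for a non-thread-safe UI object
touched_by = set()           # names of the threads that actually mutated it

def append_text(text):
    touched_by.add(threading.current_thread().name)
    text_storage.append(text)

def worker():
    # Background work happens here; the UI update is posted, not performed.
    for i in range(3):
        main_queue.put(lambda i=i: append_text(f"line {i}"))
    main_queue.put(None)     # sentinel: no more work

thread = threading.Thread(target=worker, name="worker")
thread.start()

# Simplified run loop on the owning thread: run posted jobs until the sentinel.
for job in iter(main_queue.get, None):
    job()
thread.join()
```

The worker never touches text_storage itself, so no lock around it is needed, and the framework's own activity on the owning thread can't interleave with a half-finished update.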

Good Reads Online
[info]chanson
A while back, I saw on Lambda the Ultimate that Smalltalk-80: Bits of History, Words of Advice was available online.

There are quite a few more worthwhile language books on the Web. For example, Tim Budd's A Little Smalltalk is also available at Professor Ducasse's site.

You can also find Common Lisp: The Language, Second Edition in various places online, and Paul Graham has made On Lisp available.

I started down this route because I'm reading Graham's Hackers & Painters right now and in some ways it reminds me of Patterns of Software by Richard P. Gabriel which, it turned out, Gabriel posted online once it went out of print.

The brick wall thanks you!
[info]chanson
Charles Miller, Weighing into the Static vs Dynamic Typing Debate, The Fishbowl:
One place where dynamic typing has truly 0wned me, however, has been the Cocoa framework for OS X. Cocoa have shown me that programming a GUI doesn't have to be an exercise in banging my head against a brick wall, it can actually be fun. A lot of the flexibility of Cocoa comes from the dynamic nature of Objective-C. If you've got a Mac and you haven't learned Cocoa yet, set aside a week to go through a tutorial or two. You won't be disappointed.
That's almost exactly what I thought when I first started learning OpenStep development on a used NeXTstation in 1997. OpenStep was to developing software what the Macintosh was to using it.

I haven't lost that feeling: OpenStep got even better when it became Yellow Box, and Yellow Box got even better when it became Cocoa. And Cocoa just keeps getting better with features like NSNetService in Mac OS X 10.2 and Cocoa Bindings in Mac OS X 10.3.

As for static versus dynamic typing: I think Objective-C is a great compromise. (What, you didn't see that one coming?) All of the actual messaging is done at runtime, dynamically; there is no static binding. However, you can — if you want — use types when writing your code to let the compiler help you. No, there are no templates for typed collections; this winds up not being a significant problem in practice, and it enables you to easily have heterogeneous collections when you need them.

Use id — the "any object" type in Objective-C, which supports multiple root classes — whenever you truly don't care about type, use a protocol (where Java got the idea for interfaces) when you only care about whether the receiver responds to a very restricted subset of messages, and use a full type when you actually care about the receiver being of that class or a subclass. And use unit testing for everything else. It works quite well.
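The same three levels of "how much do I care about the type" can be sketched with Python's optional type hints. This is only an analogy to id, protocols, and class types, not Objective-C; Drawable, Shape, Stamp, and the three functions are invented for the example, and the hints are checked by a static checker rather than by the language at runtime.

```python
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class Drawable(Protocol):
    """Only promises that the receiver responds to draw."""
    def draw(self) -> str: ...

class Shape:
    def draw(self) -> str:
        return "shape"

    def area(self) -> float:
        return 0.0

class Stamp:
    """Unrelated to Shape, but it also responds to draw."""
    def draw(self) -> str:
        return "stamp"

def store(anything: Any) -> Any:
    # Like id: no interest in the type at all.
    return anything

def render(item: Drawable) -> str:
    # Like a protocol: only cares that draw exists.
    return item.draw()

def measure(shape: Shape) -> float:
    # Like a full type: needs Shape or a subclass.
    return shape.area()

rendered = [render(Shape()), render(Stamp())]
```

A heterogeneous list like the one passed through render is exactly the case where typed collections would get in the way.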

Class Clusters
[info]chanson
[info]moonlessnights has a good post on class clusters in the [info]ood community. Class clusters are one of the unsung great design patterns in Cocoa; they don't just encapsulate instance functionality but class functionality as well, and a lot more naturally and flexibly than the Factory pattern used in a lot of Java and C++ development.
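For anyone who hasn't run into the pattern, here's a minimal sketch in Python rather than Objective-C. Number, _SmallInt, and _Real are made-up names; the idea is that callers only ever name the public class, and the class itself decides which private subclass to instantiate.

```python
class Number:
    """Public abstract class: the only name callers ever use."""
    def __new__(cls, value):
        if cls is Number:
            # The class, not a separate factory object, picks the concrete subclass.
            cls = _SmallInt if isinstance(value, int) else _Real
        return super().__new__(cls)

    def __init__(self, value):
        self._value = value

    def description(self):
        raise NotImplementedError

class _SmallInt(Number):
    """Private concrete subclass for integer values."""
    def description(self):
        return f"int {self._value}"

class _Real(Number):
    """Private concrete subclass for everything else."""
    def description(self):
        return f"real {self._value:.2f}"

three = Number(3)
half = Number(2.5)
```

Compared with a separate factory, there's no extra type for callers to learn: construction looks like ordinary instantiation of Number.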

Microsoft further corrupting C++
[info]chanson
Stan Lippman, famous C++ author, is working on Visual C++ now at Microsoft. You'd think this would mean better standards conformance, better cross-platform support, and so on.

Wrong.

Reading his weblog about "Managed C++," it's obvious that Microsoft wants to lock C++ in their trunk too. They're adding all sorts of extensions, new keywords, new syntax, all to better "integrate" C++ into .NET. And incidentally make any code you write using their extensions horribly non-portable.

Now, I've worked on ports of games to the Mac. Sometimes they're a bit difficult because their original developers abuse semi-standard language features. Many times they're a lot more difficult than they should be because developers use non-standard language misfeatures like anonymous unions. This is going to make the problem much, much worse.

What really gets me is that the majority of Windows C++ developers don't even seem to know the difference between Microsoft's tools, frameworks, and language extensions and what's actually supported within the standard language.

OK, there's another thing that gets me about some of what Lippman talks about. For instance, Managed C++ is getting "properties." This is an attempt to create something vaguely similar to (but more limited than) Key-Value Coding. Microsoft could have instead decided to keep more class metadata around at runtime; this would enable them to both avoid cluttering up the language with extra keywords and syntax and get even greater flexibility in a way that could be supported easily by other vendors and eventually rolled into the standard language.

Oh, wait, they're not going to do that. That would mean doing something that could possibly benefit someone other than Microsoft.

What crap!
[info]chanson
I knew Visual Basic was a bad environment in which to develop quality software. But I didn't realize just how bad it really really was until now. According to various people on the Extreme Programming mailing list, Visual Basic doesn't have implementation inheritance.

To borrow an example from Carl Manaster, this means that if you have a class Widget and need to specialize it, you can either add a flag to Widget and put in a bunch of "if" statements, or you can create a class SpecialWidget and copy all of the common code into it. So, true to the cargo-cult nature of so much Microsoft and Windows software, it mimics the forms of something useful — in this case, object-oriented programming — but fails utterly to grasp the details.

Tell me again why people use this garbage? It's virtually guaranteed to lead to unmaintainable messes that cost a lot to maintain!

Simonyi Leaves Microsoft
[info]chanson
Steve Lohr, A Microsoft Pioneer Leaves to Strike Out on His Own (New York Times) - Charles Simonyi, a computer scientist who joined Microsoft when it had 40 employees and who helped set its technical strategy for years, is leaving the company to found his own software start-up.

Simonyi's the guy who came up with the winning idea that to make code more maintainable you should make it less readable. Of course, it wasn't actually put that way; the idea is that you prefix the names of all of your variables with the types of those variables and that somehow helps you maintain the code more easily. This is called, of course, "Hungarian notation." It's one of the things that makes Windows code damn near illegible because, like everything else Microsoft does, Windows programmers treat it as gospel.

I was sort of hoping he'd stay with Microsoft. Keep the damage confined to the Microsoft sphere, you know?

I don't get the obsession with variable typing that lots of developers seem to have. Do they really make that many type errors? Even when I was doing lots of work in C++, I didn't make nearly enough type errors that such bondage & discipline techniques as Hungarian notation and templates were worth the time they took to learn and use and debug. (In the case of templates, they also get in the way of writing good object-oriented code since they're actively hostile to object-oriented programming.)

Using languages like Smalltalk and Objective-C that leave types to values instead of variables, I'm far more productive and wind up writing far less code. And every line of code you don't have to write is a line of code that can't have an error in it. (That's also why you shouldn't hard-code graphical human interfaces - or web interfaces, for that matter. Use tools that can automatically wire up your controller objects to your views based on data.)

For those who don't know the language yet, Objective-C's type system is essentially the same as C's with one additional type: id. An id represents a pointer to any kind of object; it's like a void * for objects, and it's necessary because while Objective-C only has single-inheritance of classes it supports multiple root classes. Even so, you can declare a variable as having type Foo * and if you send a message to the object referenced by it that neither the Foo class nor any of its superclasses understands, you'll get a compiler warning.

Yes, a warning, not an error. Sometimes it's valid to send a message to an object that doesn't declare it can handle it. For instance, you might have a proxy object that's standing in for a distant object on the other side of a Distributed Objects connection. Or an object that acts as a bridge to another language or object system by translating messages it receives into whatever the other system requires.
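Here's a minimal sketch of such a stand-in object, in Python rather than Objective-C. ForwardingProxy and Account are invented for the example, and __getattr__ is only a rough analogue of message forwarding: the proxy declares none of the target's methods, yet it can still respond to them.

```python
class ForwardingProxy:
    """Declares no methods of its own, yet responds to anything its target can."""
    def __init__(self, target):
        self._target = target
        self.log = []

    def __getattr__(self, name):
        # Called only when normal lookup fails, roughly like a forwarding hook.
        attribute = getattr(self._target, name)
        if not callable(attribute):
            return attribute

        def forward(*args, **kwargs):
            self.log.append(name)  # a real proxy might send this over a connection
            return attribute(*args, **kwargs)

        return forward

class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance

proxy = ForwardingProxy(Account())
new_balance = proxy.deposit(50)
```

A checker that only looks at ForwardingProxy's declared interface would flag proxy.deposit, even though the call is perfectly valid, which is why a warning is the right severity.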

MetaKit
[info]chanson
MetaKit is an Open Source embedded structured database engine. It looks interesting, though it has some limitations. (For instance, no 64-bit integer type that I could see.)

It's written in C++, with a very STL "static objects and overloaded operators everywhere" feel. Unfortunately.

I wonder how hard it would be to wrap in Objective-C for use with Cocoa applications. Especially mapping between it and model objects without requiring them to be subclasses of any particular class...

OK, I just built MetaKit 2.4.6 for Mac OS X. When doing the makes, I had to add CXX='cc' CXXFLAGS='-lstdc++' to make it find needed libraries properly, and I had to add a CVS directory to the tests directory so diff wouldn't puke during the make test stage. But it seems to have built and installed just fine.

Eventually I'll write some code for it. Maybe I should also package it as a framework or something.

Model-View-Controller
[info]chanson
Back in the 1970s, when Alan Kay's team at Xerox PARC (now "Palo Alto Research Center Incorporated") was inventing object-oriented programming, the Smalltalk language and the basis of the modern graphical human interface, they invented a design pattern that every software developer should know, understand, and use today: Model-View-Controller.

In an MVC application, there are three primary types of objects: Model objects, view objects, and controller objects. Each type of object has its own role to play, and by keeping objects' roles separate an application can be designed to be much more maintainable and extensible than if everything is intermingled.

A model object typically represents a single piece of data or knowledge. It may have relationships to other model objects, but it is not directly linked to anything in the application's human interface. Keeping human interface knowledge out of model objects helps them to be very reusable between different applications. For instance, model objects can be shared between a desktop and a web application, even though those applications may have radically different interfaces.

A view object represents an interface element in your application's human interface. This includes both display-only elements like windows, pictures and static text strings, and editable elements such as edit fields, checkboxes, buttons, and pop-up menus. View objects are "dumb"; they have no knowledge of the information they present, they only know how to show it. By keeping views dumb, views can often be made extremely reusable between different applications. A slider is a slider and a checkbox is a checkbox and a window is a window, no matter what they represent in your application.

Controller objects are the glue that binds an application together. Controllers have deep knowledge about both your model objects and your view objects, and control the interaction between the two. When you pick an item from a popup menu, the popup sends a message to a controller, which determines which item was picked and how that should modify your data model. The controller modifies the data model, and then modifies other elements as necessary based on the data model changes. Controller objects are often not reusable between different applications because they represent the bulk of the application-specific behavior.
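The three roles can be sketched in a few lines of Python. Every class here is hypothetical and stripped to the bone. Note that the popup only ever holds human-readable titles, while the raw values stay with the controller.

```python
class Temperature:
    """Model: data only, no knowledge of any interface."""
    def __init__(self, celsius):
        self.celsius = celsius

class PopupMenu:
    """View: shows titles and reports which index was picked."""
    def __init__(self):
        self.titles = []
        self.on_select = None

    def select(self, index):
        if self.on_select:
            self.on_select(index)

class Label:
    """View: shows whatever text it is given."""
    def __init__(self):
        self.text = ""

class TemperatureController:
    """Controller: knows both sides and mediates between them."""
    UNITS = [("C", "Celsius"), ("F", "Fahrenheit")]  # (raw value, human-readable title)

    def __init__(self, model, popup, label):
        self.model, self.popup, self.label = model, popup, label
        popup.titles = [title for _, title in self.UNITS]
        popup.on_select = self.unit_selected
        self.unit_selected(0)

    def unit_selected(self, index):
        raw_unit, _ = self.UNITS[index]
        celsius = self.model.celsius
        degrees = celsius if raw_unit == "C" else celsius * 9 / 5 + 32
        self.label.text = f"{degrees:.1f} {raw_unit}"

model, popup, label = Temperature(100.0), PopupMenu(), Label()
controller = TemperatureController(model, popup, label)
popup.select(1)
```

Changing what the popup displays means editing the titles in one place; nothing else depends on reading data back out of the view.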

It's disturbing how many people write applications without thinking about how to break them down into model, view, and controller objects. I've seen applications -- and written some, before I understood MVC -- that do things like rely on interface elements for data storage, subclass interface elements to represent an element that manipulates or displays a particular type of object, and so on. You may be able to write and debug an application this way fairly easily, but extending it down the line will be tough.

Why did I write this? I just added a feature to a non-MVC application. It had a popup menu that needed to display a human-readable version of some raw data, instead of just displaying the raw data. Changing the popup menu to display the human-readable version was easy. Changing the places in the code that relied on being able to extract the raw data from the popup menu was a pain. If instead there was just a little bit more abstraction, it would have been a lot easier. And if this feature needs to change again, it's going to be a whole bunch harder...