New in Leopard: Objective-C Garbage Collection!
chanson
That's right, Objective-C has garbage collection now in Leopard! And it's not just a conservative collector like the Boehm collector GNUstep supports, either; it's a full-fledged multi-threaded, generational garbage collector! (Though due to the limitations imposed by C semantics, it isn't a copying collector.)

Practically speaking, this means that you don't have to worry about memory management in Cocoa applications the way you used to, if you can require Leopard. You're not entirely freed from worrying about memory management, of course: For one thing, you need to be careful about multithreading, as any -finalize methods you implement will be run in a collector thread. For another, you can still easily over-root objects that should be collectable; this is a common cause of memory leaks in garbage-collected applications, and should be familiar to Java developers.
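To make over-rooting concrete, here is a minimal sketch. It is written in Python rather than Objective-C, purely as an analogy: the `Document` class, the `_cache` dictionary, and `open_document` are all hypothetical names, and CPython reclaims objects mainly by reference counting rather than with a collector like Leopard's. The leak itself has the same shape, though: a long-lived container keeps a strong reference to an object that the rest of the program has finished with.

```python
import gc
import weakref


class Document:
    """Hypothetical stand-in for any object that should become collectable."""

    def __init__(self, name):
        self.name = name


# A long-lived cache. Everything stored here is strongly referenced (rooted).
_cache = {}


def open_document(name):
    doc = Document(name)
    _cache[name] = doc  # over-rooting: the cache keeps doc alive indefinitely
    return doc


doc = open_document("report")
probe = weakref.ref(doc)  # observes liveness without keeping doc alive
del doc                   # the application is done with the document
gc.collect()
still_alive = probe() is not None  # the cache is still a strong root

del _cache["report"]      # removing the root makes the object collectable
gc.collect()
collected = probe() is None
```

Nothing here is a leak in the malloc/free sense. The object simply stays reachable from a root that nobody remembered to clear.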

To address the over-rooting problem, zeroing weak references have been introduced to Objective-C. By prefixing an instance variable declaration with __weak, you tell the garbage collector that if it's the only reference to an object that the object should be considered collectable. And when the object is collected, the instance variable will be set to nil for you!

This works for any number of weak references to a particular collectable object. So long as there are no strong references to an object — which are the regular kind of non-weak references you're used to — an object is eligible for collection no matter how many weak references there are to it. Thus I can create an object, assign it to a thousand different __weak instance variables, and then not assign it to any regular instance variables, and it will eventually be collected and all of those __weak instance variables will be "simultaneously" changed to nil.
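The "thousand weak references" scenario can be sketched like this. Again, this is Python used as an analogy, not Objective-C: `weakref.ref` plays the role of a `__weak` instance variable, `Node` and `Observer` are hypothetical names, and reading a cleared reference yields `None` where Objective-C would give you nil.

```python
import gc
import weakref


class Node:
    """Hypothetical object that many observers point at weakly."""


class Observer:
    """Holds a weak reference, playing the role of a __weak instance variable."""

    def __init__(self, target):
        self._target = weakref.ref(target)

    @property
    def target(self):
        return self._target()  # None once the target has been reclaimed


node = Node()
observers = [Observer(node) for _ in range(1000)]
all_set = all(o.target is node for o in observers)

del node      # drop the only strong reference
gc.collect()
all_cleared = all(o.target is None for o in observers)
```

The number of weak references never enters into it. Only the single strong reference kept the object alive, and once it is gone every observer sees the reference cleared.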

Because objects are commonly stored in collections (for example, in static NSMutableDictionary instances that are used as caches), the Foundation framework also has some additional classes that can support weak references. There's NSPointerArray, NSHashTable, and NSHashMap (which ultimately shipped under the name NSMapTable):
  • NSPointerArray is a lot like NSMutableArray, but it can contain arbitrary pointers (not just objects), supports nil elements, and can hold zeroing weak references.
  • NSHashTable is a lot like NSMutableSet, but it can hold zeroing weak references.
  • NSHashMap is a lot like NSMutableDictionary, but it can hold zeroing weak references for either its keys or its values, and doesn't have to copy its keys.
And most importantly, the behavior of an instance of any of these classes can be customized by creating an instance of the NSPointerFunctions class: That's how all of the above collection classes implement their equality comparison, hashing, and weak-reference behavior. See the header or documentation for details.
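For a feel of how weak-reference collections behave, here is a rough analogy in Python. These are not the Foundation classes: `weakref.WeakValueDictionary` stands in for a map with strong keys and weak values, `weakref.WeakSet` stands in for a weak hash table, and `Image` is a hypothetical cached object.

```python
import gc
import weakref


class Image:
    """Hypothetical cached object."""

    def __init__(self, path):
        self.path = path


cache = weakref.WeakValueDictionary()  # strong keys, weak values
visible = weakref.WeakSet()            # weak members

img = Image("a.png")
cache[img.path] = img
visible.add(img)
before = (len(cache), len(visible))

del img       # drop the only strong reference
gc.collect()
after = (len(cache), len(visible))     # both entries disappear on their own
```

The cache never needed a notification or an explicit removal call. Its entries went away when the object they referred to did.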

One other thing you can do is create hybrid applications and frameworks. As Scott Stevenson shows in his Objective-C 2.0 tutorial, you can build an application or framework such that garbage collection is either supported or required. For frameworks this is particularly useful; after all, you may want to use a framework in both a GC and in a non-GC application. By building it as GC-supported rather than GC-required, you can create a single framework binary that can be used in either.

How does this work? It's simple: When running under GC, the Objective-C runtime will "eat" all -retain, -release and -autorelease messages such that objects never receive them, and objects will never be sent a -dealloc. However, you can still write your code such that it has all of the proper non-GC memory management too. Then when you build it GC-supported, if it's loaded into a process that is using the collector it will use GC, but if it's loaded into a process using reference counting it will behave correctly there too.

There are a couple of things that you can't do with GC. One is that you can't use a non-GC framework in a GC application; an entire process has to be either GC or non-GC. Thus if you use third-party frameworks that haven't been built as hybrid (GC-supported) frameworks, you'll need to stick with reference counting until the frameworks are updated.

The other thing you can't do with GC is write code that makes use of GC but that runs on pre-Leopard operating systems. Code built for GC uses certain runtime calls that are not present in earlier versions of the Objective-C runtime, and unfortunately this means that code built for GC requires the Leopard version of the runtime to execute. Since Leopard contains so many other compelling developer and end-user features, and other Objective-C 2.0 features also require Leopard, in practice this probably won't be much of a deterrent.

Update: At jcr's request, I've expanded the discussion of weak references to make it clear that the number of weak references to an object doesn't matter; as long as there are no strong references to it (whether in instance variables, reachable via globals, or on the stack or in registers), it will be eligible for collection.


over-root?

(Anonymous)

2007-10-29 03:01 am (UTC)

Is there another term that's more conventionally used than "over-root"? A quick Google for "over-root garbage collection" returns this page for the first hit and then no more relevant-looking results.

The other term that's most often used is "leak," since that's really what an over-root of an object in a garbage-collected system is. You can probably find a bit by looking for "memory leaks in Java" or "memory leaks in .NET" on Google or elsewhere. Remember, GC doesn't mean you don't have to worry about object ownership or overall memory management, it just means you don't have to worry about the nitty-gritty details of retain/release and gives you the tools to make your everyday coding go more smoothly.

Say for example you have a bit of old-school code that keeps a non-retained reference to the object "foo" in a mutable dictionary, and relies on a notification posted by foo's -dealloc method to remove that non-retained reference. That would work under non-GC. But when you run that code under GC, the notification will never be posted because foo's -dealloc will never be invoked — not just because -dealloc isn't sent under GC, but also because "foo" will live forever since it is referenced by the mutable dictionary and the collector will treat that as a strong reference.

That's why the new classes like NSHashMap exist: You can make the keys or values of an NSHashMap weak references instead of strong ones, and they will go to nil (and clean up the entire map entry) when the object is collected. And you can have them function like your previous mutable dictionary implementation when you're not running under GC.

PS - It would be great if you could sign your replies. You can use a pseudonym if you like (anonymity's not bad at all); it just makes it easier to distinguish one thread of anonymous replies from another.

". By prefixing an instance variable declaration with __weak, you tell the garbage collector that if it's the only reference to an object that the object should be considered collectable'

Wait a second.. So if there are >1 weak references to the object, it's not collectable?

-jcr

No, an object is collectable if there aren't any strong references to it. So there can be any number of weak references to an object and it will be collectable, so long as there are no strong references.

I'll update my post to make that clear.

Re: ???

(Anonymous)

2007-10-30 12:38 am (UTC)

The word "only" was what was throwing me there.

-jcr

It would be nice to mention that the name was actually changed from NSHashMap to NSMapTable, and perhaps any history behind the change that you may know of. All I know is that Googling for "NSHashMap" turns up this post and some libFoundation code, but nothing from Apple. :-)
