Let's merge managed object models!
[info]chanson
There was a question recently on Stack Overflow asking how to handle cross-model relationships in managed object models. Now, the poster wasn't asking about how to handle relationships across persistent stores — he was asking how to handle splitting a model up into pieces such that the pieces could be recombined.

It turns out that this is somewhat straightforward to do using Core Data. Let's say you have a simple model with Song and Artist entities. I'll write it out here in a pseudo-modeling language for ease of reading:
MusicModel = {
    Song = {
        attribute title : string;
        attribute duration : float;
        to-one-relationship artist : Artist,
            inverse : songs,
            delete-rule : nullify;
        userInfo = { };
    };

    Artist = {
        attribute name : string;
        to-many-relationship songs : Song,
            inverse : artist,
            delete-rule : cascade;
        userInfo = { };
    };
};
Now let's say you want to split this up into two models, where Song is in one and Artist is in the other. You could just try and create two xcdatamodel files in Xcode, one with each entity, and wire the relationships together after loading them and merging them with +[NSManagedObjectModel modelByMergingModels:]. Except that won't work: Relationships with no destination entity won't be compiled by the model compiler.

What else might you try? You could try just putting dummy entities in for relationships to point to. However, merging models will fail then, because NSManagedObjectModel won't merge models that have entity name collisions.

It turns out, though, that you can merge models very easily by hand, by taking advantage of the way Core Data's model-description objects handle the NSCopying protocol. All you have to do is create your destination model, loop through every entity in each of your source models, and copy every entity that you haven't tagged as a stand-in using a special key in their userInfo dictionary.

Why does this work? The trick is that before you tell a persistent store coordinator to use a model, that model is mutable and references relationship destination entities and inverse relationships by name. So you can have only a minimal representation of Artist in one model, and a minimal representation of Song in another model:
SongModel = {
    Song = {
        attribute title : string;
        attribute duration : float;
        to-one-relationship artist : Artist,
            inverse : songs,
            delete-rule : nullify;
        userInfo = { };
    };

    Artist = {
        /* Note no attributes. */
        to-many-relationship songs : Song,
            inverse : artist,
            delete-rule : cascade;
        userInfo = { IsPlaceholder = YES; };
    };
};

ArtistModel = {
    Song = {
        /* Note no attributes. */
        to-one-relationship artist : Artist,
            inverse : songs,
            delete-rule : nullify;
        userInfo = { IsPlaceholder = YES; };
    };

    Artist = {
        attribute name : string;
        to-many-relationship songs : Song,
            inverse : artist,
            delete-rule : cascade;
        userInfo = { };
    };
};
Then, when you write some code to combine them, the merged model will wind up with the full definition of Song and the full definition of Artist. Here's an example of the code you might write to do this:
- (NSManagedObjectModel *)mergeModelsReplacingDuplicates:(NSArray *)models {
    NSManagedObjectModel *mergedModel = [[[NSManagedObjectModel alloc] init] autorelease];

    // General strategy:  For each model, copy its non-placeholder entities
    // and add them to the merged model. Placeholder entities are identified
    // by a true IsPlaceholder value in their userInfo; that flag alone is
    // sufficient for the merging.

    NSMutableArray *mergedModelEntities = [NSMutableArray arrayWithCapacity:0];

    for (NSManagedObjectModel *model in models) {
        for (NSEntityDescription *entity in [model entities]) {
            if ([[[entity userInfo] objectForKey:@"IsPlaceholder"] boolValue]) {
                // Ignore placeholder.
            } else {
                NSEntityDescription *newEntity = [entity copy];
                [mergedModelEntities addObject:newEntity];
                [newEntity release];
            }
        }
    }

    [mergedModel setEntities:mergedModelEntities];

    return mergedModel;
}
This may seem like a bit of overhead for this simple example. The critical thing to see above is that only that which is necessary for model consistency is in the placeholder entities. Thus you only need the inverse relationship from Song to Artist in ArtistModel. Say you wanted to add a Picture entity related to the Artist entity — you don't have to add that to both models, only to ArtistModel. The benefit of this method for merging models should then be pretty apparent: It gives you the ability to make your model separable, just like your code.
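As a usage sketch, here is one way the source models might be gathered before handing them to -mergeModelsReplacingDuplicates:. It assumes the two models are compiled into the application bundle as SongModel.mom and ArtistModel.mom; those file names and the LoadModelsNamed helper are hypothetical, and the code uses manual reference counting like the method above.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: loads the compiled models with the given resource
// names (without the "mom" extension) from a bundle. Names that can't be
// found or loaded are skipped rather than treated as errors.
static NSArray *LoadModelsNamed(NSArray *names, NSBundle *bundle) {
    NSMutableArray *models = [NSMutableArray array];

    for (NSString *name in names) {
        NSString *path = [bundle pathForResource:name ofType:@"mom"];
        if (path == nil) continue;

        NSManagedObjectModel *model = [[NSManagedObjectModel alloc]
            initWithContentsOfURL:[NSURL fileURLWithPath:path]];
        if (model != nil) {
            [models addObject:model];
            [model release];
        }
    }

    return models;
}

// Possible use from the object that implements the merge method:
//
//     NSArray *names = [NSArray arrayWithObjects:@"SongModel", @"ArtistModel", nil];
//     NSArray *models = LoadModelsNamed(names, [NSBundle mainBundle]);
//     NSManagedObjectModel *model = [self mergeModelsReplacingDuplicates:models];
```

The merged model should be built before it's handed to a persistent store coordinator, since that's the point after which a model can no longer be changed.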

Ever hear of normalization? How about relations?
[info]chanson
Roll Your Own Clustered Index, The Daily WTF:
Levi checked the query used to run the report. It had several thousand UNIONs that combined CustomerHistory_2007_02_75, CustomerHistory_2007_02_74, CustomerHistory_2007_02_73, and so on. Talking to one of the developers, he found that the tables were named to include a year, month, and customer ID. Forgetting to "initialize the database" meant failing to create a CustomerHistory table for each customer ID that month, or forgetting to update reports that queried the thousands of tables in the database. No, creating the tables wasn't automated - it had to be done manually.
This "we have only the most marginal understanding of the relational model or SQL but have conned people into paying us to fuck up their data" shit never gets old!

Anyone brand new to the relational model, data modeling, and working extensively with data in general should read up on the subject before just diving in. It will save you a world of hurt and make a great many things a whole lot clearer.

Reminder: CocoaHeads Silicon Valley at Apple on Thursday, December 14, 2006
[info]chanson
The next CocoaHeads Silicon Valley meeting will be tonight, Thursday, December 14, 2006, at 7:30 P.M. in the Hong Kong conference room at Apple. That's just inside the entrance to Infinite Loop 1, the main headquarters building at Apple's campus in Cupertino. See the web site for directions.

Dan Wood of Karelia Software will be showing off their excellent Sandvox web site creation tool and talking about its development. Sandvox is a really cool application, and was one of the first major third-party applications to adopt Core Data. Thanks a ton to Scott Stevenson for setting it up, and for letting me know about it!

Last month, Scott Stevenson gave a great presentation on using TextMate for Cocoa development, and Simon Fell demonstrated his very cool SF3 and SoqlXplorer applications for synchronizing with and working with Salesforce.com data on Mac OS X. Simon also gave us the scoop on his plans for releasing his Cocoa Salesforce.com client API, zkSforce, as Open Source!

Here's a tidbit I heard on a podcast: The Canberra, Australia .NET users group has 60 people attend a typical meeting. Given the large number of Mac developers — not to mention people interested in Mac development — in the San Francisco Bay Area, we should be able to beat that easily! Spread the word and post a link to the CocoaHeads Silicon Valley page on your own blog, and let everybody know your plans for attending! And if you have something you'd like to talk about, contact either myself, our organizer Steve Zyszkiewicz, or Scott Stevenson.

CocoaHeads Silicon Valley at Apple on Thursday, December 14, 2006
[info]chanson
The next CocoaHeads Silicon Valley meeting will be on Thursday, December 14, 2006 at 7:30 P.M. in the Hong Kong conference room at Apple. That's just inside the entrance to Infinite Loop 1, the main headquarters building at Apple's campus in Cupertino. See the web site for directions.

Update: Dan Wood of Karelia Software will be showing off their excellent Sandvox web site creation tool and talking about its development. Sandvox is a really cool application, and was one of the first major third-party applications to adopt Core Data. Thanks a ton to Scott Stevenson for setting it up, and for letting me know about it!

In general, at a CocoaHeads meeting we do some introductions, have a presentation including Q&A time with the presenter, and then have an open Q&A and demo-your-cool-app period. After the meeting there's more independent mingling and discussion.

When we haven't had a presentation or two lined up, we've also had some great "unmeetings" (in the spirit of "unconferences") where we came up with an agenda for the core of the meeting on the fly by writing down topics and questions on our room's whiteboard and talking about each one of them for a few minutes. It worked really well.

Last month, Scott Stevenson gave a great presentation on using TextMate for Cocoa development, and Simon Fell demonstrated his very cool SF3 and SoqlXplorer applications for synchronizing with and working with Salesforce.com data on Mac OS X. Simon also gave us the scoop on his plans for releasing his Cocoa Salesforce.com client API, zkSforce, as Open Source!

Here's a tidbit I heard on a podcast today: The Canberra, Australia .NET users group has 60 people attend a typical meeting. Given the large number of Mac developers — not to mention people interested in Mac development — in the San Francisco Bay Area, we should be able to beat that easily! Spread the word and post a link to the CocoaHeads Silicon Valley page on your own blog, and let everybody know your plans for attending! And if you have something you'd like to talk about, contact either myself or our organizer Steve Zyszkiewicz.

Designing for Core Data performance
[info]chanson
On the comp.sys.mac.programmer.help newsgroup, Florian Zschocke asked about improving the performance of his Core Data application. Here's an adapted version of my reply to his post.

Core Data applications should scale quite well to large data sets when using an SQLite persistent store. That said, there are a couple implementation tactics that are critical to performance for pretty much any application using a technology like Core Data:
  1. Maintain a well-normalized data model.
  2. Don't fetch or keep around more data than you need to.
Implementing these tactics will make it much easier both to create well-performing Core Data applications in the first place and to optimize the performance of applications already in progress.

Maintaining a normalized data model is critical for not fetching more data than you need from a persistent store, because for data consistency Core Data will fetch all of the attributes of an instance at once. For example, consider a Person entity that can have a binary data attribute containing a picture. Even if you're just displaying a table of Person instances by name, Core Data will still fetch the picture because it's an attribute of Person. Thus for performance in a situation like this, you'd normalize your data so that you have a separate entity, Picture, to represent the picture for a Person on the other side of a relationship. That way the image data will only be retrieved from the persistent store if the relationship is actually traversed; until it's traversed, it will just be represented by a fault.

Conversely, if you have lots of to-many relationships and need to display summary information about them, de-normalizing your data model slightly and caching the summary information in the main entity can help.

For example, say your app works with Authors and Books. Author.books is a to-many relationship to Book instances and Book.authors is a to-many relationship to Author instances. You may want to show a table of Authors that includes the number of Books related to the Author. However, binding to books.@count for that column value will cause the relationship fault to fire for every Author displayed, which can generate a lot more traffic to the persistent store than you want.

One strategy would be to de-normalize your data model slightly so Author also contains a booksCount attribute, and maintains that whenever the Author.books relationship is maintained. This way you can avoid firing the Author.books relationship fault just because you want to display the number of Books an Author is related to, by binding the column value to booksCount instead of books.@count.
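A minimal sketch of keeping such a cached count current, assuming an Author entity with a to-many books relationship and an integer booksCount attribute as described above. The UpdateBooksCount function is hypothetical; where you call it (for example, right after any code that adds or removes Books) is up to your application.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: recomputes the cached booksCount attribute from the
// books relationship. This does fire the books fault for this one Author,
// which is acceptable because the relationship is being modified anyway.
static void UpdateBooksCount(NSManagedObject *author) {
    NSUInteger books = [[author valueForKey:@"books"] count];
    NSNumber *count = [NSNumber numberWithUnsignedInteger:books];

    // Only write when the value actually changes, so the Author isn't
    // marked as updated needlessly.
    if (![count isEqual:[author valueForKey:@"booksCount"]]) {
        [author setValue:count forKey:@"booksCount"];
    }
}

// Possible use after changing the relationship:
//
//     [[author mutableSetValueForKey:@"books"] addObject:book];
//     UpdateBooksCount(author);
```

The table column can then bind to booksCount, and reading it never touches the books relationship.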

Another thing to be careful of is entity inheritance. It's an implementation detail, but inheritance in Core Data is single-table. Thus if you have every entity in your application inheriting from one abstract entity, it'll all wind up in a single table, potentially increasing the amount of time fetches take etc. because they require scanning more data.

Retaining or copying the arrays containing fetch results will keep those results (and their associated row cache entries) in memory for as long as you retain the arrays or copies of them, because the arrays and any copies will be retaining the result objects from the fetch. And as long as the result objects are in memory, they'll also be registered with a managed object context.

If you want to prune your in-memory object graph, you can use -[NSManagedObjectContext refreshObject:mergeChanges:] to effectively turn an object back into a fault, which can also prune its relationship faults. A more extreme measure would be to use -[NSManagedObjectContext reset] to return a context to a clean state with no changes or registered objects. Finally, you can of course just ensure that any managed objects that don't have changes are properly released, following normal Cocoa memory management rules: So long as your managed object context isn't set to retain registered objects, and you aren't retaining objects that you've fetched, they'll be released normally like any other autoreleased objects.
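The first of these options might be sketched as follows. The FaultUnchangedObjects function is a hypothetical helper, not part of Core Data; it only touches objects with no pending changes, because passing NO for mergeChanges: discards any unsaved changes on the refreshed object.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: turns every registered object that has no pending
// changes back into a fault. Returns the number of objects refreshed.
static NSUInteger FaultUnchangedObjects(NSManagedObjectContext *context) {
    NSUInteger refreshed = 0;

    // Iterate over a snapshot so the loop doesn't depend on the context's
    // own set staying untouched while objects are refreshed.
    for (NSManagedObject *object in [[context registeredObjects] allObjects]) {
        if ([object isFault]) continue;  // already a fault
        if ([object isInserted] || [object isUpdated] || [object isDeleted]) {
            continue;                    // keep unsaved changes intact
        }

        [context refreshObject:object mergeChanges:NO];
        refreshed++;
    }

    return refreshed;
}
```

Calling this after a large import or after closing a detail view is one place it could fit; -reset remains the blunter tool when nothing in the context needs to survive.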

ADC Video Tutorial on Core Data
[info]chanson
Yesterday, the Apple Developer Connection posted a new video tutorial by Wolf Rentzsch, Building a Sample Core Data Application.

If you're at all interested in Cocoa and Core Data development on Mac OS X, you should definitely check it out!

Core Data on Wikipedia
[info]chanson
Wikipedia's article on Core Data could use some significant revisions. In particular, the stuff about vCards and XML Schema and such is completely off the wall — I have no idea how anyone would have come up with that.

Just pointing this out in case, you know, anyone is interested in updating it.

Happy birthday, Mac OS X!
[info]chanson
Five years ago yesterday, Mac OS X was released. I managed to get a few people together at Gameworks in Schaumburg, IL to celebrate the release, which was a lot of fun.

Things have come a long, long way since that time. Congratulations to everyone who has helped to make Mac OS X a success — I'm proud to play my small part in it, as a third-party developer for most of my career, and lately in Development Technologies at Apple.

I can't wait to see what the next five years brings, especially as technologies like Cocoa bindings, Core Data, and Quartz Composer become more and more mainstream. The "force multipliers" available to Mac OS X developers enable truly amazing applications to be created.

"Enterprise" thought leadership?
[info]chanson
David Heinemeier Hansson, creator of Rails at 37signals, takes James McGovern — some Java/J2EE author — to task for his über-lame rant against Ruby in the Enterprise in a great post titled Boy, is James McGovern enterprise or what!

So by Enterprise, Architect, and Enterprise Architect standards, this gent must be the top of the pop. Thus, allow me to make this perfectly clear: I would be as happy as a clam never to write a single line of software that guys like James McGovern found worthy of The Enterprise.

If Ruby, Rails, and the rest of the dynamic gang we're lumped together to represent, is not now, nor ever, McGovern Enterprise Ready™, I say hallelujah! Heck, I'll repeat that in slow motion just to underscore my excitement: HAL-LE-LU-JAH!

With that out of the way, we're faced with a more serious problem. How do we fork the word enterprise? The capitalized version has obviously been hijacked by McGovern and his like-minded to mean something that is synonymous with hurt and pain and torment.

Indeed, McGovern's rant reads more like a parody of a rant than the real thing:
13. Lets say there is a sixteen week project and the productivity stuff was true and Ruby could save me an entire three weeks which would be significant. Since Ruby is a new vendor and not represented by existing vendors I already do business with, do you think that I will spend more than three weeks in just negotiating the contract?
Yes, because there is some vendor out there named "Ruby" that you need to sign a contract with before you can begin a project.

Despite his claims to be agile, McGovern obviously doesn't know the first thing about agile development. People come first, sure, but agile development doesn't say that tools aren't important. Not using good tools makes it harder for good people to do good work.

That's why I love developing software for Mac OS X and why I love helping people develop software on Mac OS X: We have great tools like Cocoa, Core Data, Interface Builder, OCUnit, WebObjects, and Xcode, and these can be used by great developers to do great things.

Creating an Application with Tiger Technologies
[info]chanson
There's a new article on the ADC web site, Creating an Application with Tiger Technologies:
To illustrate how to take advantage of the new technologies in Tiger, we're going to do something a bit different and present the creation of a prototypical Cocoa application over a series of articles. This first article in this series covers the first few steps of creating our application, including putting together a data model and providing a user-interface. As we build up the application, we'll look at most of the new technologies in Tiger and how they can be utilized. When we're done, we'll have covered the spectrum of technologies that you should consider using in your own applications.
Check it out!

WWDC 2005 Wrap-Up
[info]chanson
WWDC 2005 is over, and damn was it a great week! Apple made some incredible announcements and shipped some incredible software, I got to see lots of old friends and make a lot of new ones, and I got to talk to lots of developers about things that I'm passionate about: Core Data, unit testing, setting up and streamlining your build process, and creating insanely great software to make users' lives better.

It was a wonderful, wonderful time. Thanks to everyone!

Unit testing and Core Data
[info]chanson
Mike Zornek asks about unit testing and Core Data. I've been meaning to write about this, so this is the perfect opportunity to do so.

Writing unit tests against your model and code that uses Core Data is easy. For example, it's trivial to load your compiled model in a unit test:
NSManagedObjectModel *model = [NSManagedObjectModel mergedModelFromBundles:nil];
Not only that, but you can introspect it:
NSArray *entities = [model entities];
And you can do this all the way down to the property level. This means that it's possible to assert that your entire model is set up the way you expect it to be. For example, you can make sure that your Employee entity has a mandatory salary attribute with a minimum value of 1 and a type of NSDecimalAttributeType, and descends from a Person entity that has a mandatory name attribute with a minimum length of 1 and a default value of "name."
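A check like the salary example could be sketched as below. ModelHasMandatoryAttribute is a hypothetical helper, and the entity and attribute names are the ones from the example above; in a real test case you'd wrap the calls in your testing framework's assertion macros.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: returns YES if the named entity has a non-optional
// attribute with the given name and attribute type.
static BOOL ModelHasMandatoryAttribute(NSManagedObjectModel *model,
                                       NSString *entityName,
                                       NSString *attributeName,
                                       NSAttributeType type) {
    NSEntityDescription *entity = [[model entitiesByName] objectForKey:entityName];
    NSAttributeDescription *attribute =
        [[entity attributesByName] objectForKey:attributeName];

    if (attribute == nil) return NO;  // no such entity or attribute

    return ![attribute isOptional] && [attribute attributeType] == type;
}

// Possible use in a test:
//
//     NSManagedObjectModel *model = [NSManagedObjectModel mergedModelFromBundles:nil];
//     BOOL ok = ModelHasMandatoryAttribute(model, @"Employee", @"salary",
//                                          NSDecimalAttributeType);
//     BOOL descends = [[[[[model entitiesByName] objectForKey:@"Employee"]
//                          superentity] name] isEqualToString:@"Person"];
```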

But how do you test your use of Core Data? You just use Core Data in your tests as you would in your project. For example, to instantiate a complete Core Data "stack" (as it's sometimes referred to):
NSManagedObjectModel *model;
NSPersistentStoreCoordinator *coordinator;
NSManagedObjectContext *context;

model = [[NSManagedObjectModel alloc] initWithContentsOfURL:urlToModel];
coordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
context = [[NSManagedObjectContext alloc] init];
[context setPersistentStoreCoordinator:coordinator];
To instantiate managed objects associated with that context from entities in your model:
NSManagedObject *employee;

employee = [NSEntityDescription insertNewObjectForEntityForName:@"Employee"
                                 inManagedObjectContext:context];
This gives you an autoreleased Employee (assuming your context's coordinator's model has an Employee entity, of course). It's that easy.

You can then do things like check that this object was created with the correct defaults (e.g. the ones specified in your model), that it posts KVO notifications properly for properties where you care about such things, and so on. You can even add a persistent store to the coordinator and test that saving and loading work (and don't work when they're supposed to fail, of course) just by using normal Core Data and Foundation APIs.
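For tests that need to save, one option is an in-memory persistent store, which avoids touching the file system. The sketch below is an assumption about how you might set that up rather than the only way; InMemoryContextForModel is a hypothetical helper, written for manual reference counting like the code above.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: builds a complete stack backed by an in-memory store
// and returns an autoreleased context, or nil (with *outError set) if the
// store couldn't be added.
static NSManagedObjectContext *InMemoryContextForModel(NSManagedObjectModel *model,
                                                       NSError **outError) {
    NSPersistentStoreCoordinator *coordinator =
        [[[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model] autorelease];

    if (![coordinator addPersistentStoreWithType:NSInMemoryStoreType
                                   configuration:nil
                                             URL:nil
                                         options:nil
                                           error:outError]) {
        return nil;
    }

    NSManagedObjectContext *context = [[[NSManagedObjectContext alloc] init] autorelease];
    [context setPersistentStoreCoordinator:coordinator];
    return context;
}

// Possible use in a test, assuming an Employee entity with a salary default:
//
//     NSError *error = nil;
//     NSManagedObjectContext *context = InMemoryContextForModel(model, &error);
//     NSManagedObject *employee =
//         [NSEntityDescription insertNewObjectForEntityForName:@"Employee"
//                                       inManagedObjectContext:context];
//     // Compare [employee valueForKey:@"salary"] with the model's default,
//     // then check the result of [context save:&error].
```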

You can multiply the full power of data modeling with the full power of unit testing and test driven development. It kicks ass.

Core Data: Generating an interface for an entity
[info]chanson
The new Core Data framework and Xcode 2 modeling tools in Tiger are an extremely powerful way to develop great end-user applications quickly. You can even easily generate a human interface for your application that will let you work with its data model with little to no code.

To generate an interface, create an empty window in Interface Builder and make sure you'll be able to see it with Xcode in front. Switch to your model in Xcode. Then just option-drag the entity you want an interface for into your window. Interface Builder will ask you whether you want an interface for one or many instances of that entity, and then generate a basic form-style human interface for all of the attributes and to-one relationships in that entity.

This generated interface isn't a special monolithic "NSCoreDataControl" or anything of the sort. It's composed of standard Cocoa controls that are wired to standard Cocoa controllers via bindings. If your nib file's owner is set to be an instance of NSPersistentDocument or a subclass, Interface Builder will even bind the controllers' managed object contexts to the document's.

If you just want to create controllers rather than full interfaces, or if you want to update the controllers in your nib file with the latest definition of your entity, drag the entity from your model straight to your nib's document window. (That's the one with the tabs for classes and instances etc.)

Note that none of this requires generating or writing code. You can create a new Core Data Document-based Application from the project template, create a data model for it in Xcode, create an interface for it in Interface Builder, and then build and run it. You can create, save, load, and manipulate documents and even undo and redo changes and avoid saving invalid data with no code.

ADC: Developing with Core Data
[info]chanson
ADC posted a great article on Developing with Core Data. Check it out!