Don’t retain anything unless you must
Regarding reference counting, this is the idea I’d like a developer to at least consider, for several reasons:
- A dangling pointer / weak reference needn’t be evil. During development, hitting a dangling pointer is better than preventing an object from deallocating when it should. An object that exceeds its intended lifetime can behave in undesirable, unpredictable ways. All it takes to detect an invalid object is turning NSZombieEnabled on.
- Everything retained needs to be released. Retain less = less work.
- If the sums don’t add up (over-released/over-retained object) it’s easier to work out what’s wrong when the number of objects involved is small (e.g. 1, 2 or 3).
- Potentially any object that retains a target will increase the target’s lifetime; this a priori translates into increased memory usage.
- Whenever we retain an object we risk creating a cyclic reference. Of course ‘we know what we’re doing’ (and cyclic dependencies can be removed) but isn’t it just easier to avoid getting too many of these?
Unwanted objects that remain inside the runtime after they should have deallocated will harm you. The more dynamic your system is (e.g. game, simulation), the more likely these objects are to generate functional bugs that are hard to figure out.
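The cyclic reference risk can be sketched with a toy model. What follows is plain C rather than Objective-C, and every name in it (`Node`, `node_set_peer`, `leaked_after_release`, `live_nodes`) is invented for the illustration; it only mimics the counting and is not how the Cocoa runtime is implemented. Two nodes point at each other: when both references are retained, neither count ever reaches zero; when the back reference is left unretained, both nodes go away.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node with a retain count and one reference to a peer. */
typedef struct Node {
    int retain_count;
    struct Node *peer;      /* reference to another node */
    int peer_is_retained;   /* 1 = retained (strong), 0 = unretained (weak) */
} Node;

int live_nodes = 0;         /* nodes allocated and not yet deallocated */

Node *node_create(void) {
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) abort();
    n->retain_count = 1;    /* the creator is the first owner */
    live_nodes++;
    return n;
}

void node_set_peer(Node *n, Node *peer, int retain) {
    if (retain) peer->retain_count++;
    n->peer = peer;
    n->peer_is_retained = retain;
}

void node_release(Node *n) {
    if (--n->retain_count > 0) return;
    /* Last owner gone: give up what we own, then deallocate. */
    if (n->peer != NULL && n->peer_is_retained) node_release(n->peer);
    live_nodes--;
    free(n);
}

/* Returns how many nodes are still alive once both creators have let go. */
int leaked_after_release(int retain_back_reference) {
    int before = live_nodes;
    Node *a = node_create();
    Node *b = node_create();
    node_set_peer(a, b, 1);                      /* a owns b */
    node_set_peer(b, a, retain_back_reference);  /* b points back at a */
    node_release(b);                             /* creator gives up b */
    node_release(a);                             /* creator gives up a */
    int leaked = live_nodes - before;
    if (leaked > 0) {
        /* Break the cycle by hand so the demo itself does not leak. */
        Node *peer = a->peer;
        a->peer = NULL;
        node_release(peer);
    }
    return leaked;
}
```

In this model `leaked_after_release(1)` reports two stranded nodes, while `leaked_after_release(0)` reports none.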
I read a little about Automatic Reference Counting (ARC), which we are getting in iOS 5. If I understand correctly (reading here) the central idea of this article will translate to ‘don’t abuse strong references’. Now that I more or less get it I look forward to using ARC, but I guess I’ll be waiting for another 6 months or so, being a happy laggard.
A quick introduction
Reference counting approaches memory management indirectly, using concurrent ownership:
- Take ownership of an object by retaining it.
- Relinquish ownership by releasing the object.
- When all owners of an object have released it, the object is deallocated.
Indirect: we don’t explicitly deallocate the object.
Concurrent: several objects can simultaneously retain the same target.
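As a rough sketch of these rules, here is the counting itself modelled in plain C (not Objective-C). `Object`, `object_create`, `object_retain`, `object_release` and `live_objects` are hypothetical names made up for this example; the real runtime is more involved, so treat this as an illustration of the bookkeeping only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal object: a name plus a retain count. */
typedef struct {
    const char *name;
    int retain_count;
} Object;

int live_objects = 0;   /* objects allocated and not yet deallocated */

/* Comparable to alloc/init: the creator starts out as the first owner. */
Object *object_create(const char *name) {
    Object *o = malloc(sizeof *o);
    if (o == NULL) abort();
    o->name = name;
    o->retain_count = 1;
    live_objects++;
    return o;
}

/* Take ownership. */
Object *object_retain(Object *o) {
    o->retain_count++;
    return o;
}

/* Relinquish ownership; the last release deallocates the object. */
void object_release(Object *o) {
    assert(o->retain_count > 0);
    if (--o->retain_count == 0) {
        live_objects--;
        free(o);
    }
}

/* Two owners share one target; returns the number of objects still alive. */
int two_owners_demo(void) {
    Object *x = object_create("x");  /* owner A, count is 1 */
    object_retain(x);                /* owner B, count is 2 */
    object_release(x);               /* A is done: count is 1, x stays alive */
    assert(live_objects == 1);
    object_release(x);               /* B is done: count is 0, x is freed */
    return live_objects;
}
```

Nobody calls `free` on `x` directly (indirect), and both owners hold it at the same time (concurrent); it disappears only when the last of them releases it.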
The basic rules are covered in many places, like here and here and from the horse’s mouth, here.
Reference counting is efficient, error-prone and occasionally awkward.
Efficient, at least in the sense of being predictable: objects get deallocated as soon as their reference count reaches zero, whereas a garbage collector decides for itself when to run and may cause your program to slow down unexpectedly while it’s doing its thing.
Error-prone: programmers need to pair retain/release statements (either directly or indirectly). Mismatched statements cause a program to leak or crash. Additionally, reference counting is hackable; it is easy to fiddle the sums (either accidentally or by design) and obtain a valid program that handles memory correctly while violating reference counting rules.
Awkward, notably when we know beforehand that we would like to deallocate a well-defined subset of the runtime graph. A typical example is when you start a kind of ‘session’, allocate any number of objects in the course of the session and wish to deallocate all of them at the end of the session. In such cases opting out of reference counting may be somewhat short-sighted, yet it remains attractive.
A design idea
There is a principle which I find rather productive: instead of thinking about whether an object should retain X or not, consider X, then try to think of an object ‘up the runtime graph’ that should own X. Often there is an object Y such that, if Y deallocates, X should also deallocate. It could be the parent of X or maybe another object up the chain.
Now, if there is only one such object Y, then you don’t need to retain X anywhere else. You can even assert the retain count to ensure that X is unambiguously managed by Y.
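A minimal sketch of this single-owner idea, again modelled in plain C rather than Objective-C. `Owner` plays the role of Y and `Item` the role of X; these names and the helper functions are assumptions made for the example. The assertion in `owner_release` is the ‘assert the retain count’ check: it fires if anything other than Y has retained X.

```c
#include <assert.h>
#include <stdlib.h>

/* X: the object whose lifetime we want to pin down. */
typedef struct { int retain_count; } Item;

/* Y: the single object up the graph that owns X. */
typedef struct { int retain_count; Item *item; } Owner;

int live_items = 0;   /* items allocated and not yet deallocated */

Item *item_create(void) {
    Item *x = calloc(1, sizeof *x);
    if (x == NULL) abort();
    x->retain_count = 1;
    live_items++;
    return x;
}

void item_release(Item *x) {
    if (--x->retain_count == 0) {
        live_items--;
        free(x);
    }
}

/* Y creates X and keeps the only owning reference. */
Owner *owner_create(void) {
    Owner *y = calloc(1, sizeof *y);
    if (y == NULL) abort();
    y->retain_count = 1;
    y->item = item_create();
    return y;
}

void owner_release(Owner *y) {
    if (--y->retain_count > 0) return;
    /* If anyone else retained X, this fires and flags the problem early. */
    assert(y->item->retain_count == 1);
    item_release(y->item);
    free(y);
}

/* Other code borrows X without retaining it. */
int single_owner_demo(void) {
    Owner *y = owner_create();
    Item *borrowed = y->item;   /* unretained, so Y stays the only owner */
    assert(borrowed->retain_count == 1);
    owner_release(y);           /* Y goes away and takes X with it */
    return live_items;
}
```

The trade-off is the one discussed throughout: `borrowed` dangles once Y is gone, which is acceptable only because nothing is supposed to use X after Y has deallocated.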
Anti-patterns?
There are little recipes around (e.g. here and here) that you can use to ‘ease the pain of memory management’. From the point of view of ownership these recipes work the same way GC does: an easy way out of memory management issues and a hard way into functional bugs, with all the enticing prospects of a muddle-through approach.
One point these approaches have in common is ‘if in doubt, retain’. I’m OK with that as long as I know (beyond reasonable doubt) that keeping the target alive won’t generate unwanted behavior. If the target is an observer that receives and processes notifications, then if in doubt, don’t retain. A clean, happy crash will provide the decision point where you can say:
- ‘Yea, this object should still be alive at this point’ (in which case maybe something else should have retained it) or…
- ‘No, this object is well and truly dead, we should have sent a death notice’.
Additionally these approaches look incomplete. You need to use class extensions if you want to declare everything as a property without exposing all your ivars.
Weak references
An unretained field is a ‘weak reference’. At least as a first approximation, the use of weak references is encouraged in a number of situations:
- Backward references from a child to a parent
- Listener sets. See a straightforward application here.
- Objects of the same type that cross-reference each other.
- Any situation where you feel unsure whether to claim ownership (retain) or not.
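The listener-set case can be sketched as follows, once more in plain C as a stand-in for Objective-C. `Notifier`, `Listener` and the functions around them are hypothetical names for this illustration, not Cocoa API. The set stores unretained pointers, so registering never extends a listener’s lifetime; the price is that a dying listener must unregister itself, otherwise the notifier is left with a dangling pointer.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LISTENERS 8

struct Notifier;

typedef struct Listener {
    int retain_count;
    int received;              /* notifications handled so far */
    struct Notifier *source;   /* weak: a listener does not own its source */
} Listener;

typedef struct Notifier {
    Listener *listeners[MAX_LISTENERS];  /* weak: listeners are not retained */
    int count;
} Notifier;

void notifier_add(Notifier *n, Listener *l) {
    assert(n->count < MAX_LISTENERS);
    n->listeners[n->count++] = l;        /* note: no retain here */
    l->source = n;
}

void notifier_remove(Notifier *n, Listener *l) {
    for (int i = 0; i < n->count; i++) {
        if (n->listeners[i] == l) {
            n->listeners[i] = n->listeners[--n->count];
            return;
        }
    }
}

void notifier_post(Notifier *n) {
    for (int i = 0; i < n->count; i++) n->listeners[i]->received++;
}

Listener *listener_create(void) {
    Listener *l = calloc(1, sizeof *l);
    if (l == NULL) abort();
    l->retain_count = 1;
    return l;
}

void listener_release(Listener *l) {
    if (--l->retain_count > 0) return;
    /* Unregister before dying so the set never holds a dangling pointer. */
    if (l->source != NULL) notifier_remove(l->source, l);
    free(l);
}

/* Returns how many notifications the surviving listener received. */
int listener_set_demo(void) {
    Notifier n = { {0}, 0 };
    Listener *a = listener_create();
    Listener *b = listener_create();
    notifier_add(&n, a);
    notifier_add(&n, b);
    notifier_post(&n);       /* both listeners receive it */
    listener_release(a);     /* a's owner lets go: a unregisters and is freed */
    notifier_post(&n);       /* only b is still registered */
    int received_by_b = b->received;
    assert(n.count == 1);
    listener_release(b);
    return received_by_b;
}
```

Had the set retained its listeners, `a` would have stayed alive and kept processing notifications after its owner had given it up, which is exactly the kind of unwanted object described earlier.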
Implementation details
Maybe for historical reasons there are two approaches to enforcing reference counting in Objective-C:
- The non-intrusive approach revolves around tagging using properties and indirect access using self.x = . Although this approach looks theoretically better and safer, there are practical details of how it is done in Objective-C which I often find off-putting. For one, I like to not declare properties until I want to expose public state, and I’m not used to class extensions, so I find myself unwilling to add class extensions to my .m files.
- An older approach revolves around the [release] and [retain] statements. Both the advantage (easier reviewing/debugging) and the drawback (an intrusive approach leading to somewhat cluttered code) come from the explicit way in which things are done. This leads to a weird situation: it makes it more likely that bugs are introduced while making the same bugs easier to fix.
Note about [autorelease]
[autorelease] is very convenient and helps avoid errors in many situations. Sadly enough, when an error does occur and [autorelease] is the lucky guy that causes a target to deallocate, we get very little debugging information: the release is deferred until the enclosing autorelease pool drains (typically at the end of the current run loop iteration), by which time the code that requested it is no longer on the stack.
So I try to limit its usage to where it’s unavoidable.
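To illustrate why the deferred release is hard to trace, here is a toy autorelease pool modelled in plain C. `Pool`, `object_autorelease`, `pool_drain` and `make_temporary` are invented for the sketch and only imitate the idea behind NSAutoreleasePool; they are not its implementation. The point to notice is where the object actually dies: inside `pool_drain`, long after `make_temporary` has returned.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_CAPACITY 16

typedef struct { int retain_count; } Object;

/* Hypothetical pool: a list of releases to perform later. */
typedef struct {
    Object *pending[POOL_CAPACITY];
    int count;
} Pool;

int live_objects = 0;   /* objects allocated and not yet deallocated */

Object *object_create(void) {
    Object *o = calloc(1, sizeof *o);
    if (o == NULL) abort();
    o->retain_count = 1;
    live_objects++;
    return o;
}

void object_release(Object *o) {
    if (--o->retain_count == 0) {
        live_objects--;
        free(o);
    }
}

/* Queue a release for later instead of performing it now. */
Object *object_autorelease(Pool *p, Object *o) {
    assert(p->count < POOL_CAPACITY);
    p->pending[p->count++] = o;
    return o;
}

/* Perform all queued releases; this is where objects actually die. */
void pool_drain(Pool *p) {
    for (int i = 0; i < p->count; i++) object_release(p->pending[i]);
    p->count = 0;
}

/* Hands back a temporary that the caller does not have to release. */
Object *make_temporary(Pool *p) {
    return object_autorelease(p, object_create());
}

/* Stores the live count before the drain, returns the live count after it. */
int autorelease_demo(int *alive_before_drain) {
    Pool pool = { {0}, 0 };
    Object *t = make_temporary(&pool);
    assert(t->retain_count == 1);       /* still valid after the call returned */
    *alive_before_drain = live_objects;
    pool_drain(&pool);                  /* the deferred release happens here */
    return live_objects;
}
```

If the temporary were over-released, the failure would surface in `pool_drain`, with nothing on the call stack pointing back at `make_temporary` or its caller.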
What is covered elsewhere (or should be)
- Cocoa collections (NSArray, NSSet, NSDictionary) retain all their elements. This can be a hindrance in some cases, but you can configure the underlying, toll-free bridged counterparts, as demonstrated here.
- There are various approaches to notifying stakeholders when an object gets deallocated. I will try to write a quick article about an approach I find useful when implementing observer schemes.