We know how to use Instruments to fine-tune the performance of our games: we do a quick run (maybe one minute), check the heaviest stack traces, and do some surgery here and there, right?

How about cases when…

  • There’s just a couple of frames hanging every once in a while.
  • The game slows down for a short period of time.

Here’s a case from my own game. I had a short span where the frame rate dropped dramatically. While this typically lasts for less than 5 seconds, you may have noticed that the best games rarely, if ever, slow down. Besides, what looks like an isolated incident in a game may signal a problem in the game engine. Let’s get to work.

Here’s what I did:

  1. I created a new file in Instruments, choosing the CPU sampler.
  2. I recorded a 2 to 5 minute session, making sure to capture the part of the game where the frame rate drops (!)
  3. After the run, spikes (higher spikes…) showed up, matching the time when the frame drop occurred. I tried going through this span sample by sample, but I couldn’t extract any useful information.
  4. Then, looking at the inspection range button group at the top of the Instruments UI, I noticed that I could restrict the time span considered for profiling.
  5. I then compared overheads inside and outside the frame drop time span.
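Instruments does this comparison for you once the inspection range is set, but the idea behind the last step can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample format (a timestamp plus the symbol at the top of the stack), the symbol names and the data are all made up, and none of it comes from an Instruments export.

```python
from collections import Counter


def shares(symbols):
    """Map each symbol to its fraction of the given samples."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


def compare_ranges(samples, drop_start, drop_end):
    """Split (time, symbol) samples at the frame-drop window and
    return (inside_shares, outside_shares)."""
    inside = [sym for t, sym in samples if drop_start <= t < drop_end]
    outside = [sym for t, sym in samples if not (drop_start <= t < drop_end)]
    return shares(inside), shares(outside)


# Hypothetical samples: (seconds since start, top-of-stack symbol).
samples = [
    (1.0, "wait_for_frame"), (2.0, "wait_for_frame"),
    (3.0, "wait_for_frame"), (4.0, "draw_scene"),
    (10.0, "draw_actors"), (10.5, "draw_actors"),
    (11.0, "evaluate_decorations"), (11.5, "wait_for_frame"),
]

# Assume the frame drop was observed between t=10s and t=12s.
inside, outside = compare_ranges(samples, 10.0, 12.0)
for name in sorted(set(inside) | set(outside)):
    print(f"{name:22s} inside={inside.get(name, 0):.0%} "
          f"outside={outside.get(name, 0):.0%}")
```

With this made-up data, draw_actors accounts for half of the samples inside the window and none outside it. That is the kind of contrast the inspection range makes visible.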

I took a couple of screenshots so I could compare easily. Here’s what we see:

  • In the first case, CPU usage is very low. The game spends about 80% of the time waiting for the frame callback (note we’re running on an iPhone 4, and this is a universal app also available on iPod 2nd gen).
  • In the second case, 85% of machine time is used up. There will be other things happening in the background, but more importantly, we can identify several activities that are otherwise simply insignificant (too fast for the sampler to pick up):
    • Drawing actors. This occupies 50% of the run loop.
    • Evaluating decorations.

Oh well. I happen to have 4 actors in there. Incidentally, the model is the heaviest I’ve got. The actors aren’t showing on-screen when the frame drop occurs; they’re just nearby. But as a general rule, it’s much better to send more than there is to view than to send less (!).
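To illustrate why nearby but off-screen actors can still cost draw time, here is a minimal sketch of a visibility test with a safety margin. It is a deliberate simplification and not how my engine decides what to draw: a real renderer would more likely test against the view frustum, and should_draw, view_radius and margin are hypothetical names chosen for this example.

```python
import math


def should_draw(actor_pos, camera_pos, view_radius, margin):
    """Conservative test: draw anything within view_radius + margin
    of the camera, so nothing pops in late at the edge of the view."""
    return math.dist(actor_pos, camera_pos) <= view_radius + margin


camera = (0.0, 0.0, 0.0)
nearby_actor = (11.0, 0.0, 0.0)  # just outside a 10-unit view radius

# Strict test: the actor is skipped.
print(should_draw(nearby_actor, camera, view_radius=10.0, margin=0.0))
# Conservative test: the actor is sent to the renderer even though
# it is not in view, which is where the extra draw cost comes from.
print(should_draw(nearby_actor, camera, view_radius=10.0, margin=2.0))
```

The margin trades some wasted drawing for safety, which is fine as long as what gets drawn is cheap. With a heavy model duplicated four times, it is not.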

The next obvious step was to go back to the game script and see what happens when the actors are removed. The frame rate recovered completely – all that was needed was optimizing the geometry for these actors.

Conclusion

This quick case study shows how the CPU sampler can be used to identify overheads within a specific time span. No traces, no manual profiling. It’s a very simple technique, and it can keep you from heading in the wrong direction based on vague intuitions of where the overheads should lie.

In this case, the scene considered actually combined several candidates:

  • The scene is complex. In fact I had to break up the scenery into several components because the max vertex count (owing to the number format in my files) is ~8000.
  • Procedurally generated decorations add to the rendering overhead for static elements.
  • But the actor model, duplicated 4 times, is also heavy.

When I started off, I was so convinced that the complex scene was responsible for my overhead that I was about to do the artwork all over again, even though scene rendering uses VBOs. Taking the trouble to fire up the profiler and run a 15-minute session in total pointed me in the right direction.