Thursday, April 14, 2016

One simple change removed heap allocations from our generic IO extensions

The more I use generics, the more I miss C++ templates (we'll just ignore the horrendous compile errors we can get with them when not using Clang).

While profiling our Unity project the other day, I wondered why, exactly, parts of our serialization extensions were popping up in our per-frame heap allocations report. We don't do many serialization ops in client code, but the server nodes do, so I felt it was worth investigating.

I normally only profile in a non-development "release" build (although I'm not certain Unity compiles assemblies with the /optimize+ flag). So by default I don't see file/line info for everything, since Unity doesn't give the option to force-include the .mdb files (which our profiler tries to use when tracing allocations and other bits).

After making another build, this time with "Development Build" checked, and after modifying the .pdb path in Unity's prebaked executable, I was up and running our memory-profiled builds with full file/line info for managed and unmanaged code. The reports showed that line 330 of our extensions file was to blame for the curious heap allocations.

But how could I easily fix this near the end of a milestone without going and changing a bunch of callsites? You can't overload generic code based on constraints alone, so I couldn't have two Read<T> methods that differed only by "where T : class" versus "where T : struct". I really didn't want to make, and then reeducate others to use, separate ReadValue and ReadReference methods when our Write<T> extensions don't require such verbose code. A few other suggestions were thrown at me, but I wasn't biting.

But wait...what about default arguments? What if I instead added a parameter of type T that defaults to default(T)? This would let me avoid updating a huge number of callsites, but what about still having to allocate reference objects? Turns out a null check is enough for them: a defaulted reference type is null and still needs its new T(), while a value type never compares equal to null. So we can skip the obj = new T() statement altogether for value types, and the default argument means no other code needs fixing up!
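Here is a minimal sketch of the idea in text form. It is not our actual extensions file: the IStreamable interface, its ReadFrom method, and the Vec2 and PlayerName types are hypothetical stand-ins, and I'm assuming the allocations came from new T() on a value type (which the C# compiler typically routes through Activator.CreateInstance<T>()).

```csharp
using System;
using System.IO;

// Hypothetical interface standing in for whatever the real serializable types implement.
public interface IStreamable
{
    void ReadFrom(BinaryReader reader);
}

// Hypothetical value type.
public struct Vec2 : IStreamable
{
    public float X, Y;

    public void ReadFrom(BinaryReader reader)
    {
        X = reader.ReadSingle();
        Y = reader.ReadSingle();
    }
}

// Hypothetical reference type.
public sealed class PlayerName : IStreamable
{
    public string Value;

    public void ReadFrom(BinaryReader reader)
    {
        Value = reader.ReadString();
    }
}

public static class IoExtensions
{
    // obj defaults to default(T): null for classes, a zeroed value for structs.
    public static T Read<T>(this BinaryReader reader, T obj = default(T))
        where T : IStreamable, new()
    {
        // A value type never compares equal to null, so new T() is only
        // reached for reference types, which need an instance anyway.
        if (obj == null)
            obj = new T();

        obj.ReadFrom(reader);
        return obj;
    }
}

public static class Demo
{
    public static void Main()
    {
        var buffer = new MemoryStream();
        var writer = new BinaryWriter(buffer);
        writer.Write(1.5f);
        writer.Write(2.5f);
        writer.Write("player");
        writer.Flush();
        buffer.Position = 0;

        var reader = new BinaryReader(buffer);
        Vec2 v = reader.Read<Vec2>();             // callsite unchanged, struct path
        PlayerName n = reader.Read<PlayerName>(); // callsite unchanged, class path
        Console.WriteLine(v.X + " " + v.Y + " " + n.Value);
    }
}
```

Existing callsites like reader.Read<Vec2>() keep compiling as-is, and a caller that already has an instance can pass it in to have it filled instead of getting a fresh one.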


Yeah, I hate it when people post code in picture form too, so I also put the bits up on gist.github.

There can be a hidden cost here, but I consider anything GC related to be the biggest, most nondeterministic hidden cost. Specifically, the larger the T value type is, the more code can get generated to deal with copy-by-value semantics for the stack and the return value, although it's possible the runtime or native compiler works some magic under the hood to use a reference to a stack value instead. It all depends! I'm not positive what Unity's Mono is doing under the hood for JIT'd code (IL2CPP would be easier to figure out), but I do know we're no longer generating unexpected temporary garbage in our serialization code!
