Sunday, January 3, 2016

Seeing Sharp C++: The love/hate relationship with Enums

Enums are great. They give you an identifier to either an explicit value (Apples = 1, Oranges = 2), or a compiler derived value. They're something you can easily refactor. They're quasi-type-safe. They're just great.

Except when it comes to generics. And IL2CPP.

In a project I'm currently working on, I make use of a custom BitVector32 struct for cases like object flags storage, etc. For the project I've added some additional methods to interop with it using an Enum via generics, where each member in a given Enum maps to a bit index:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
#region Access via Enum
private bool ValidateBit<TEnum>(TEnum bit, int bitIndex)
 where TEnum : struct, IComparable, IFormattable, IConvertible
{
 if (bitIndex < 0 || bitIndex >= this.Length)
 {
  Assert("Enum member {0} with value {1} is out of range for indexing",
   bit, bitIndex);
  return false;
 }

 return true;
}

public bool Test<TEnum>(TEnum bit)
 where TEnum : struct, IComparable, IFormattable, IConvertible
{
 int bitIndex = bit.ToInt32(null);
 if (!ValidateBit(bit, bitIndex))
  return false;

 return Bitwise.Flags.Test(mWord, ((uint)1) << bitIndex);
}

public void Set<TEnum>(TEnum bit, bool value)
 where TEnum : struct, IComparable, IFormattable, IConvertible
{
 int bitIndex = bit.ToInt32(null);
 if (!ValidateBit(bit, bitIndex))
  return;

 var flag = ((uint)1) << bitIndex;

 Bitwise.Flags.Modify(value, ref mWord, flag);
}
#endregion

The "where" clause uses as many Enum-specific constraints as possible since whatever lightbulbs behind the language thought it was a bright idea to not allow "where T : enum" (but you can in IL and F#). I mean, what the hell? But I digress, this isn't a Unity problem.

The IConvertible interface gives us the power of ToInt32(), which is needed to generically 'interpret' the TEnum value "bit" as a bit index. For the rest of this article I'm just going to focus on the Set method (line 25). Let's inspect the IL and how IL2CPP (as of 5.3.1p1) transforms it:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
.method public hidebysig 
 instance void Set<valuetype .ctor ([mscorlib]System.ValueType, [mscorlib]System.IComparable, [mscorlib]System.IFormattable, [mscorlib]System.IConvertible) TEnum> (
  !!TEnum bit,
  bool 'value'
 ) cil managed 
{
 // Method begins at RVA 0x159f8
 // Code size 51 (0x33)
 .maxstack 3
 .locals init (
  [0] int32,
  [1] uint32
 )

 IL_0000: ldarga.s bit
 IL_0002: ldnull
 IL_0003: constrained. !!TEnum
 IL_0009: callvirt instance int32 [mscorlib]System.IConvertible::ToInt32(class [mscorlib]System.IFormatProvider)
 IL_000e: stloc.0
 IL_000f: ldarg.0
 IL_0010: ldarg.1
 IL_0011: ldloc.0
 IL_0012: call instance bool BitVector32::ValidateBit<!!TEnum>(!!0, int32)
 IL_0017: brtrue IL_001d

 IL_001c: ret

 IL_001d: ldc.i4.1
 IL_001e: ldloc.0
 IL_001f: ldc.i4.s 31
 IL_0021: and
 IL_0022: shl
 IL_0023: stloc.1
 IL_0024: ldarg.2
 IL_0025: ldarg.0
 IL_0026: ldflda uint32 BitVector32::mWord
 IL_002b: ldloc.1
 IL_002c: call bool Bitwise.Flags::Modify(bool, uint32&, uint32)
 IL_0031: pop
 IL_0032: ret
} // end of method BitVector32::Set

Here's the IL2CPP of Set instanced with an unimportant enum named MyEnum ('KM00' marks my own comments, not the compiler):
UPDATE: A fix for the double boxing described below is in the pipes on Unity's end. It is now in as of 5.3.1p4, "Removed an unnecessary Box used to null check before calling a virtual method". 

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
// generic sharing
#define IL2CPP_RGCTX_DATA(rgctxVar,index) (InitializedTypeInfo(rgctxVar[index].klass))
#define IL2CPP_RGCTX_METHOD_INFO(rgctxVar,index) (rgctxVar[index].method)

// System.Void BitVector32::Set<MyEnum>(TEnum,System.Boolean)
// System.Void BitVector32::Set<MyEnum>(TEnum,System.Boolean) // [sic] KM00: why is this written twice?
extern TypeInfo* IConvertible_t1162873557_0_il2cpp_TypeInfo_var;
extern const uint32_t BitVector32_Set_TisMyEnum_t1322059486_0_m_97831884_0_MetadataUsageId;
extern "C"  void BitVector32_Set_TisMyEnum_t1322059486_0_m_97831884_0_gshared (BitVector32_t_84621378_0 * __this, int32_t ___bit, bool ___value, const MethodInfo* method)
{
 static bool s_Il2CppMethodIntialized;
 if (!s_Il2CppMethodIntialized)
 {
  il2cpp_codegen_initialize_method (BitVector32_Set_TisMyEnum_t1322059486_0_m_97831884_0_MetadataUsageId);
  s_Il2CppMethodIntialized = true;
 }
 int32_t V_0 = 0;
 uint32_t V_1 = 0;
 {
  // KM00: RGCTX stands for Runtime Generic Context
  // Index 0 in the RGCTX data should be a TypeInfo pointer to MyEnum.
  // So this should be trying to box ___bit to a full MyEnum object (including vtable, etc)
  // And yes, this is resulting in double the garbage: Null checked Box, followed by another Box for the actual ToInt32 call.
  NullCheck((Object_t *)Box(IL2CPP_RGCTX_DATA(method->rgctx_data, 0), (&___bit)));
  // KM00: the actual ToInt32() call
  int32_t L_0 = (int32_t)InterfaceFuncInvoker1< int32_t, Object_t * >::Invoke(7 /* System.Int32 System.IConvertible::ToInt32(System.IFormatProvider) */, IConvertible_t1162873557_0_il2cpp_TypeInfo_var, (Object_t *)Box(IL2CPP_RGCTX_DATA(method->rgctx_data, 0), (&___bit)), (Object_t *)NULL);
  V_0 = (int32_t)L_0;
  int32_t L_1 = ___bit;
  int32_t L_2 = V_0;
  // KM00: this is the ValidateBit() call
  bool L_3 = ((  bool (*) (BitVector32_t_84621378_0 *, int32_t, int32_t, const MethodInfo*))IL2CPP_RGCTX_METHOD_INFO(method->rgctx_data, 1)->method)((BitVector32_t_84621378_0 *)__this, (int32_t)L_1, (int32_t)L_2, /*hidden argument*/IL2CPP_RGCTX_METHOD_INFO(method->rgctx_data, 1));
  if (L_3)
  {
   // KM00: If ValidateBit() didn't fail
   goto IL_001d;
  }
 }
 {
  return;
 }

IL_001d:
 {
  int32_t L_4 = V_0;
  V_1 = (uint32_t)((int32_t)((int32_t)1<<(int32_t)((int32_t)((int32_t)L_4&(int32_t)((int32_t)31)))));
  bool L_5 = ___value;
  uint32_t* L_6 = (uint32_t*)&(__this->___mWord_3);
  uint32_t L_7 = V_1;
  Flags_Modify_m_912810781_0(NULL /*static, unused*/, (bool)L_5, (uint32_t*)L_6, (uint32_t)L_7, /*hidden argument*/NULL);
  return;
 }
}

Essentially what is happening here is that IL2CPP is missing a possible optimization of just treating the ___bit as itself: an int32_t value. Instead it goes through the full plumbing of Enum's IConvertible implementation, which 'requires' a virtual call which itself requires a complete Il2CppObject (vtable, etc).
Line 24 in the above snippet looks an awful lot like bad IL2CPP codegen. It's a bit painful to trace without source (hint hint!), but it appears to be generated in IL2CPP's MethodBodyWriter.CallMethod, specially where it invokes MethodBodyWriter.WriteNullCheckForInvocationIfNeeded. They should probably cache args[0] in a stack variable, then null check that, and then assign args[0] to that stack variable to avoid more boxing and garbage than needed.
As it turns out, calling ToInt32() on an enum value will always result in a GC operation in Mono, as it calls the get_value method which returns an Object. Internally, this is implemented in Mono via ves_icall_System_Enum_get_value. It essentially boxes the enum value to an instance of the underlying-type (int, long, etc) and does a memcpy on the field data. From there it would end up calling the IConvertible on the underlying type (in this case, Int32).

Final thoughts

No matter what your runtime, Unity's super-old Mono or IL2CPP (MS .NET can be more forgiving), you're probably running into more overhead than you realize with Enums and value types in general. Just because you use generics and constrain your type parameters to given interfaces doesn't mean that Mono or IL2CPP will properly optimize the output using that knowledge (in the event types are structs or otherwise sealed).

If C# had more relaxed forms of 'generics' as with C++'s templates, then I could have simply done an 'int bitIndex = (int)bit' cast, with the cast being resolved at template instantiation. The goal of these generic methods was to abstract the need for these casts from client code, but it looks like I'll need to roll back their usage in game loop code.

It also appears interface method calls on value types, as of 5.3.1p1, can result in an extra GC alloc and box operation. IL2CPP has come a long, long way over the past year but it obviously still has lots of room for improvement. For Unity and the Unity community's sake, more developers should inspect their IL2CPP code to not only ensure they're not generating funky IL, but that Unity itself isn't generating funky C++!

One day we'll be able to have nice things. Like enums and generics. Until then I'm just going to byte the bullet and explicitly cast them to integers :(

Saturday, January 2, 2016

IL2CPP codegen bugs as of Unity 5.3.1.p1

While researching the changes made in Patch 5.3.1p1 ("IL2CPP: Optimize method prologues for code size and incremental builds"), I came across both a codegen bug and filegen oversight.

Codegen bug

For IL2CPP codegen, Unity uses StreamWriter. Said class inherits from TextWriter, which exposes a NewLine property. Viewing Unity's mscorlib assembly on Windows in ILSpy shows the default value of NewLine is "\r\n", which is the line ending Windows uses. However, Unity prefers Unix-style line endings. Which wouldn't be a problem if they would either set or override (say, in a UnityStreamWriter class) NewLine to just be "\n"! But they don't.

For IL2CPP's CodeWriter class, it gracefully uses Unix-style since they wrap an internal StreamWriter object and expose WriteLine methods that just call the Write method. However, there are multiple instances across the IL2CPP assembly where StreamWriter is used directly without having NewLine set to Unix-style line endings.

Good Newlines
Bad Newlines
I wouldn't be surprised if there are other places in Unity's managed code where this is an issue. Perhaps Unity's test cases should include checks to ensure there are no carriage returns prior to new lines.

Filegen oversight

While looking through my project's IL2CPP I noticed that enums had method decl headers, like "mscorlib_System_AttributeTargets_59449993MethodDeclarations.h". However, IL2CPP doesn't actually spit out anything (of interest) in these headers for enums! For a project I'm working on IL2CPP spits out almost 13k header and source files. 512 of those are meaningless MethodDeclations for enums.

Should probably check type != enum too