Known copy(...) Bugs

Currently Claro's copy implementation suffers from two different implementation problems that will be resolved in a future release. I'll describe them below just for the sake of clarity.

Please feel free to reach out if you'd like to help to address these bugs!

Compiler Stack Overflows on Copying Recursive Types

Currently, the Claro compiler doesn't do any special handling of recursively defined types, and so as it attempts to generate code for an inlined copy of a recursive type, it ends up infinitely looping over the codegen phase.

Fig 1:


# This type is recursive (with int as its "bottom" to terminate recursion).
newtype ComplexData : oneof<int, tuple<ComplexData, ComplexData>, {ComplexData}>

function generateRandomComplexData(rng: random::RandomNumberGenerator, maxDepth: int) -> ComplexData {
  # ...
  return generateRandomComplexData_impl(rng, 0, maxDepth);
}

function generateRandomComplexData_impl(rng: random::RandomNumberGenerator, currDepth: int, maxDepth: int) -> ComplexData {
  if (currDepth == maxDepth) {
    return ComplexData(-1); # Let's just avoid attempting to create some infinitely large data structure.
  }
  var next = lambda () -> ComplexData { return generateRandomComplexData_impl(rng, currDepth + 1, maxDepth); };
  match (random::nextNonNegativeBoundedInt(rng, 3)) {
    case 0 -> # int
      return ComplexData(random::nextNonNegativeBoundedInt(rng, 100));
    case 1 -> # tuple<ComplexData, ComplexData>
      return ComplexData((next(), next()));
    case _ -> # {ComplexData}
      return ComplexData({next(), next(), next()});
  }
}

var someComplexData = generateRandomComplexData(random::forSeed(3), 3);
print(someComplexData);

# KNOWN COMPILER BUG: CURRENTLY CLARO IS UNABLE TO CORRECTLY GENERATE COPY LOGIC OVER RECURSIVE TYPES!
#     This currently causes the compiler to stack overflow. This will be resolved in a future release.
# var copied = copy(someComplexData);
# print(copied);

Output:

ComplexData({ComplexData((ComplexData((ComplexData(-1), ComplexData(-1))), ComplexData(82))), ComplexData({ComplexData(37), ComplexData(6), ComplexData((ComplexData(-1), ComplexData(-1)))}), ComplexData({ComplexData(64), ComplexData(81), ComplexData(2)})})

In the future, this will be fixed by statically identifying when a recursive type is being copied, and then generating a custom copy function for that particular type that will actually recurse at runtime rather than at compile time. Note, this will put the onus on the programmer to ensure that they never call copy(...) on any cyclical data structure.

Generated Copy Logic Severs Shared References to Mutable Data

Potentially more nefarious than the previous bug, Claro's current copy implementation handles the copying of shared references to mutable data in a way that is potentially likely to cause confusion or lead to bugs. A piece of nested data that contains multiple fields of the same mutable type has the potential to contain shared references to the same mutable value. This is a semantically meaningful feature, not just some esoteric feature of the low-level memory layout. Mutation of this shared mutable data will be observable via each reference in the containing structure. Problematically, when a copy is made, every single mutable value within the entire recursive structure will be guaranteed to have a single, unique reference. This may be a useful guarantee in some contexts, but I believe that this goes against Claro's goals of being as unsurprising as possible.

The copied data should have the exact same semantics as the original data that it was derived from, but in this one subtle way that is not currently the case. This will be fixed in a future release.

Fig 2:


var X = mut [99];
var l1 = [X, X];
var l2 = copy(l1);
print("l1: {l1}");
print("l2: {l2}");

l1[0][0] = -1;
print("\nl1: {l1}  # <-- Notice that both list elements have updated after a single write to the shared reference.");
print("l2: {l2}");

l2[0][0] = -2;
print("\nl1: {l1}");
print("l2: {l2}  # <-- This is arguably a bug. The shared reference was severed.");

Output:

l1: [mut [99], mut [99]]
l2: [mut [99], mut [99]]

l1: [mut [-1], mut [-1]]  # <-- Notice that both list elements have updated after a single write to the shared reference.
l2: [mut [99], mut [99]]

l1: [mut [-1], mut [-1]]
l2: [mut [-2], mut [99]]  # <-- This is arguably a bug. The shared reference was severed.

Mutability Coercion Can Circumvent a User Defined Type's initializers Restrictions

User Defined Types support the declaration of initializers that restrict the usage of the type's default constructor to only the procedures defined within the initializers block. Claro's builtin copy(...) currently provides a backdoor to initialize and instance of a user defined type without actually using one its initializers.

This is fortunately of limited impact as the worst thing a user can do is create instances with a mutability declaration that the type would otherwise not support. But regardless, this will be addressed in a future release.

Fig 3:


newtype Foo<T> : T

initializers Foo {
  # Calling this function should be the **only** way to get an instance of Foo<T>.
  function getFooForInts(ints: [int]) -> Foo<[int]> {
    return Foo(ints);
  }
}

var original: Foo<[int]> = getFooForInts([0, 1, 2]);

# The fact that this is somehow permitted is arguably a bug... why are you able
# to initialize a Foo<T> without invoking the declared initializer?? This seems
# to break the semantic intent of declaring initializers to restrict the direct
# instantiation of user defined types to have to "go through the front door".
var coercedCopy: Foo<mut [int]> = copy(original);
print(coercedCopy);

Output:

[0.001s][warning][perf,memops] Cannot use file /tmp/hsperfdata_runner/6 because it is locked by another process (errno = 11)
Foo(mut [0, 1, 2])