
Type class derivation with ZIO Schema

February 3, 2024

Making the compiler automatically derive implementations of a type class for custom algebraic data types is common across programming languages. This post examines how Haskell, Rust, and Scala handle this capability.

Haskell example:

data Literal = StringLit String
             | BoolLit Bool
               deriving (Show)

Rust example:

#[derive(Debug)]
enum Literal {
  StringLit(String),
  BoolLit(bool)
}

Scala 3 example:

enum Literal derives Show:
  case StringLit(value: String)
  case BoolLit(value: Boolean)

Scala 2/traditional approach:

sealed trait Literal
object Literal {
  final case class StringLit(value: String) extends Literal
  final case class BoolLit(value: Boolean) extends Literal

  implicit val show: Show[Literal] = DeriveShow[Literal]
}

To automatically generate an implementation for an arbitrary type, we need to gather information about that type as values available at compile time, and to generate new code fragments from that information.
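To make the goal concrete, here is a hand-written instance of the kind a derivation mechanism would have to generate for the Literal type above. It is only a sketch: the Show trait is a hypothetical minimal type class defined for this example, not the one from any particular library.

```scala
// Hypothetical minimal type class, defined only for this sketch.
trait Show[A] {
  def show(value: A): String
}

sealed trait Literal
object Literal {
  final case class StringLit(value: String) extends Literal
  final case class BoolLit(value: Boolean) extends Literal

  // What derivation has to produce: one branch per constructor,
  // built from the constructor name and its field values.
  implicit val show: Show[Literal] = new Show[Literal] {
    def show(value: Literal): String = value match {
      case StringLit(v) => "StringLit(" + v + ")"
      case BoolLit(v)   => "BoolLit(" + v + ")"
    }
  }
}
```

Everything in the instance body follows mechanically from the shape of the type (constructor names, field types), which is exactly the information a deriver needs to obtain.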

Example: Desert Library

Desert is a Scala serialization library. Its core trait is:

trait BinaryCodec[T] extends BinarySerializer[T] with BinaryDeserializer[T]

trait BinarySerializer[T] {
  def serialize(value: T)(implicit context: SerializationContext): Unit
  // ...
}

trait BinaryDeserializer[T] {
  def deserialize()(implicit ctx: DeserializationContext): T
  // ...
}

Desired usage:

final case class Point(x: Int, y: Int, z: Int)
object Point {
  implicit val codec: BinaryCodec[Point] = DerivedBinaryCodec.derive
}

Alternative Approaches

Scala 3 Mirrors

Scala 3 provides built-in support via Mirror types:

inline def gen[T](using m: Mirror.Of[T]): DeriveGen[T] =
  new DeriveGen[T] {
    def derive: Gen[Any, T] = {
      val elemInstances = summonAll[m.MirroredElemTypes]
      inline m match {
        case s: Mirror.SumOf[T]     => genSum(s, elemInstances)
        case p: Mirror.ProductOf[T] => genProduct(p, elemInstances)
      }
    }
  }
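For comparison, a complete but minimal Mirror-based derivation can be sketched using only the Scala 3 standard library. The Show type class and the summonShows helper are assumptions made for this example, and the sketch handles case classes only; it is a variant of the pattern above, not the code behind it.

```scala
import scala.compiletime.{constValue, erasedValue, summonInline}
import scala.deriving.Mirror

// Hypothetical minimal type class, defined only for this sketch.
trait Show[A] {
  def show(value: A): String
}

object Show {
  given intShow: Show[Int] = new Show[Int] {
    def show(value: Int): String = value.toString
  }

  // Summons a Show instance for every element type of the tuple type T.
  inline def summonShows[T <: Tuple]: List[Show[?]] =
    inline erasedValue[T] match {
      case _: EmptyTuple => Nil
      case _: (h *: t)   => summonInline[Show[h]] :: summonShows[t]
    }

  // Derivation for case classes only; sum types are left out of this sketch.
  inline def derived[A](using m: Mirror.ProductOf[A]): Show[A] = {
    val name: String = constValue[m.MirroredLabel]
    val instances    = summonShows[m.MirroredElemTypes]
    new Show[A] {
      def show(value: A): String = {
        // Pair each field value with the instance summoned for its type.
        val fields = value.asInstanceOf[Product].productIterator
          .zip(instances.iterator)
          .map { case (field, instance) =>
            instance.asInstanceOf[Show[Any]].show(field)
          }
        fields.mkString(name + "(", ", ", ")")
      }
    }
  }
}

final case class Point(x: Int, y: Int, z: Int) derives Show

val shown = summon[Show[Point]].show(Point(1, 2, 3))
```

The derives clause expands to a given in the companion of Point that calls Show.derived, so the mirror and the field instances are resolved at the definition site of the case class.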

Scala 2 Macros

Custom derivation can also be written directly with Scala 2 macros, here defined against a whitebox context:

object Derive {
  def derive[A]: BinaryCodec[A] = macro deriveImpl[A]

  def deriveImpl[A: c.WeakTypeTag](
    c: whitebox.Context
  ): c.Tree = {
    import c.universe._
    // ...
  }
}

Type examination:

val tpe: Type = weakTypeOf[A]

def isCaseClass(tpe: Type): Boolean = tpe.typeSymbol.asClass.isCaseClass

val fields = tpe.decls.sorted.collect {
  case p: TermSymbol if p.isCaseAccessor && !p.isMethod => p
}

AST construction with quotes:

val fieldSerializationStatements = // ...

val codec = q"""
  new BinaryCodec[$tpe] {
    def serialize(value: $tpe)(implicit context: SerializationContext): Unit = {
      ..$fieldSerializationStatements
    }
    // ...
  }
"""

Shapeless

Shapeless provides type-level programming utilities. Example transformation:

final case class Point(x: Int, y: Int, z: Int)

val point: Point = Point(1, 2, 3)
val genericPoint: Int :: Int :: Int :: HNil = 
  1 :: 2 :: 3 :: HNil
val labelledGenericPoint = 
  ("x" ->> 1) :: ("y" ->> 2) :: ("z" ->> 3) :: HNil

Derivation setup:

def derive[T, H](implicit gen: LabelledGeneric.Aux[T, H]) = {
  new BinaryCodec[T] {
    def serialize(value: T)(implicit context: SerializationContext): Unit = {
      val h: H = gen.to(value)
      // ...
    }
    // ...
  }
}

HList serialization pattern:

implicit val hnilSerializer: BinarySerializer[HNil] =
  new BinarySerializer[HNil] {
    def serialize(value: HNil)(implicit context: SerializationContext): Unit = {
      // no (more) fields
    }
  }

implicit def hlistSerializer[K <: Symbol, H, T <: HList](implicit
  witness: Witness.Aux[K],
  headSerializer: BinarySerializer[H],
  tailSerializer: BinarySerializer[T]
): BinarySerializer[FieldType[K, H] :: T] = // ...
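The recursion behind this pattern can be tried without Shapeless. The sketch below uses hand-rolled stand-ins (HCons, HNil, and a hypothetical Encoder type class) in place of Shapeless' labelled HList and Desert's serializer, so it shows only the inductive implicit resolution, not field names or binary output.

```scala
// Hand-rolled stand-ins for a heterogeneous list, to stay dependency-free.
sealed trait HList
final case class HCons[H, T <: HList](head: H, tail: T) extends HList
sealed trait HNil extends HList
case object HNil extends HNil

// Hypothetical type class: encodes a value as a list of strings.
trait Encoder[A] {
  def encode(value: A): List[String]
}

object Encoder {
  implicit val intEncoder: Encoder[Int] = new Encoder[Int] {
    def encode(value: Int): List[String] = List(value.toString)
  }
  implicit val stringEncoder: Encoder[String] = new Encoder[String] {
    def encode(value: String): List[String] = List(value)
  }

  // Base case: no (more) fields.
  implicit val hnilEncoder: Encoder[HNil] = new Encoder[HNil] {
    def encode(value: HNil): List[String] = Nil
  }

  // Inductive case: encode the head, then let the compiler resolve the tail.
  implicit def hconsEncoder[H, T <: HList](implicit
    headEncoder: Encoder[H],
    tailEncoder: Encoder[T]
  ): Encoder[HCons[H, T]] = new Encoder[HCons[H, T]] {
    def encode(value: HCons[H, T]): List[String] =
      headEncoder.encode(value.head) ::: tailEncoder.encode(value.tail)
  }
}

object HListDemo {
  val generic: HCons[Int, HCons[String, HNil]] = HCons(1, HCons("a", HNil))
  val encoded: List[String] =
    implicitly[Encoder[HCons[Int, HCons[String, HNil]]]].encode(generic)
}
```

The compiler assembles the instance for the whole list from the two implicit definitions, one step per element, which is why long records translate into long implicit searches.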

Complex real-world Shapeless signature:

def derive[T, H, Ks <: HList, Trs <: HList, Trcs <: HList, KsTrs <: HList, TH](implicit
      gen: LabelledGeneric.Aux[T, H],
      keys: Lazy[Symbols.Aux[H, Ks]],
      transientAnnotations: Annotations.Aux[transientField, T, Trs],
      transientConstructorAnnotations: Annotations.Aux[transientConstructor, T, Trcs],
      taggedTransients: TagTransients.Aux[H, Trs, Trcs, TH],
      zip: Zip.Aux[Ks :: Trs :: HNil, KsTrs],
      toList: ToTraversable.Aux[KsTrs, List, (Symbol, Option[transientField])],
      serializationPlan: Lazy[SerializationPlan[TH]],
      deserializationPlan: Lazy[DeserializationPlan[TH]],
      toConstructorMap: Lazy[ToConstructorMap[TH]],
      classTag: ClassTag[T]
  ): BinaryCodec[T]

All these type-level computations and implicit resolutions can make compilation quite slow. The code is complex and hard to understand or modify and, most importantly, the error messages are a nightmare.

Magnolia

Magnolia provides a user-friendly approach by hiding macro complexity:

object BinaryCodecDerivation {
  type Typeclass[T] = BinaryCodec[T]

  def join[T](ctx: CaseClass[BinaryCodec, T]): BinaryCodec[T] =
    new BinaryCodec[T] {
      def serialize(value: T)(implicit context: SerializationContext): Unit = {
        for (parameter <- ctx.parameters) {
          parameter.typeclass.serialize(parameter.dereference(value))
        }
        // ...
      }
    }

  def split[T](ctx: SealedTrait[BinaryCodec, T]): BinaryCodec[T] =
    // ...

  def gen[T]: BinaryCodec[T] = macro Magnolia.gen[T]
}

Why Not Magnolia?

Two key limitations with Magnolia:

Evolution parameters: Earlier Desert versions took the user-defined evolution steps as parameters of the derive call:

object Point {
  implicit val codec: BinaryCodec[Point] = BinaryCodec.derive(FieldAdded[Int]("z", 1))
}

Modern approach using annotations:

@evolutionSteps(FieldAdded[Int]("z", 1))
final case class Point(x: Int, y: Int, z: Int)

Transient field support: With Magnolia it is not possible to shortcut the derivation tree. Desert supports transient fields and transient constructors, and for fields and constructors marked as transient we neither want to nor can define codec instances.

ZIO Schema Based Derivation

ZIO Schema's Deriver feature was first released in v0.3.0, in November 2022.

Core trait implementation:

trait Deriver[F[_]] {
  def deriveRecord[A](
    record: Schema.Record[A],
    fields: => Chunk[WrappedF[F, _]],
    summoned: => Option[F[A]]
  ): F[A]

  // more deriveXXX methods to implement
}

Key features:

Receives Schema.Record describing case classes
WrappedF makes field instances lazy for transient field support
summoned parameter provides control over implicit instances
.autoAcceptSummoned modifier enables automatic behavior
.cached modifier stores instances in concurrent hash map
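Why lazy field instances matter for transient fields can be shown with a small, dependency-free sketch in Scala 3 syntax. Codec, Field and recordCodec are hypothetical, simplified stand-ins and not the ZIO Schema or Desert API; the point is only that an instance passed lazily is never looked up for a field that gets skipped.

```scala
// Hypothetical, simplified stand-ins; not the ZIO Schema or Desert API.
trait Codec[A] {
  def encode(value: A): String
}

// One field of a record R. `codec` is by-name, so the instance is only
// evaluated when `encode` is actually called for this field.
final class Field[R, A](
  val name: String,
  val isTransient: Boolean,
  get: R => A,
  codec: => Codec[A]
) {
  def encode(record: R): String = name + "=" + codec.encode(get(record))
}

// Record codec that never touches the codecs of transient fields.
def recordCodec[R](fields: List[Field[R, ?]]): Codec[R] =
  new Codec[R] {
    def encode(record: R): String =
      fields.filterNot(_.isTransient).map(_.encode(record)).mkString(",")
  }

final class Connection // deliberately has no codec
final case class Session(id: Int, connection: Connection)

val intCodec: Codec[Int] = new Codec[Int] {
  def encode(value: Int): String = value.toString
}

val sessionCodec: Codec[Session] = recordCodec(
  List(
    new Field[Session, Int]("id", false, _.id, intCodec),
    // The by-name argument below would throw if it were ever evaluated.
    new Field[Session, Connection]("connection", true, _.connection, sys.error("no codec"))
  )
)

val encoded = sessionCodec.encode(Session(42, new Connection))
```

With eagerly evaluated field instances, constructing the second Field would already fail, which mirrors the problem of having to provide codecs for transient fields.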

Implementation Example

Desert codec derivation:

object DerivedBinaryCodec {
  lazy val deriver = BinaryCodecDeriver().cached.autoAcceptSummoned

  private final case class BinaryCodecDeriver() extends Deriver[BinaryCodec] {
    // ...
  }
}
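The idea behind the cached modifier, storing derived instances in a concurrent hash map, can be illustrated in isolation. InstanceCache and CacheDemo are hypothetical names, the string key and string "codec" are placeholders, and how ZIO Schema actually keys and populates its cache is not shown here.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical helper: remembers one derived instance per key.
final class InstanceCache[K, V] {
  private val instances = new ConcurrentHashMap[K, V]()

  // Runs `derive` only if no instance is stored for `key` yet.
  // Note: `derive` must not update this cache recursively; nested
  // derivation would need a more careful implementation.
  def getOrDerive(key: K)(derive: => V): V =
    instances.computeIfAbsent(key, _ => derive)
}

object CacheDemo {
  var derivations = 0
  val cache = new InstanceCache[String, String]

  // Stand-in for an expensive derivation; counts how often it really runs.
  def codecFor(typeName: String): String =
    cache.getOrDerive(typeName) {
      derivations += 1
      "codec for " + typeName
    }
}
```

Requesting the same key twice runs the derivation once and returns the stored instance the second time.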

Required methods include deriveRecord, deriveEnum, derivePrimitive, deriveOption, deriveSequence, deriveMap, and deriveTransformedRecord.

Primitive type handling:

override def derivePrimitive[A](
  st: StandardType[A],
  summoned: => Option[BinaryCodec[A]]
): BinaryCodec[A] =
  st match {
    case StandardType.UnitType           => unitCodec
    case StandardType.StringType         => stringCodec
    case StandardType.BoolType           => booleanCodec
    case StandardType.ByteType           => byteCodec
    // ...
  }

Entry Point Signatures

Scala 3:

inline def derive[A](implicit schema: Schema[A]): F[A]

Scala 2:

def derive[F[_], A](deriver: Deriver[F])(
  implicit schema: Schema[A]
): F[A] = macro deriveImpl[F, A]

Usage patterns:

val binaryCodecDeriver: Deriver[BinaryCodec] = // ...
val pointCodec: BinaryCodec[Point] = binaryCodecDeriver.derive[Point]

Or:

object BinaryCodecDeriver extends Deriver[BinaryCodec] {
  // ...
}

val pointCodec: BinaryCodec[Point] = BinaryCodecDeriver.derive[Point]

Cross-compilation Support

Scala 2 wrapper macro:

trait DerivedBinaryCodecVersionSpecific {
  def deriver: Deriver[BinaryCodec]

  def derive[T](implicit schema: Schema[T]): BinaryCodec[T] =
    macro DerivedBinaryCodecVersionSpecific.deriveImpl[T]
}

object DerivedBinaryCodecVersionSpecific {
  def deriveImpl[T: c.WeakTypeTag](c: whitebox.Context)(
    schema: c.Expr[Schema[T]]
  ): c.Tree = {
    import c.universe._
    val tpe = weakTypeOf[T]
    q"_root_.zio.schema.Derive.derive[BinaryCodec, $tpe](_root_.io.github.vigoo.desert.zioschema.DerivedBinaryCodec.deriver)($schema)"
  }
}

Scala 3 wrapper:

trait DerivedBinaryCodecVersionSpecific {
  lazy val deriver: Deriver[BinaryCodec]

  inline def derive[T](implicit schema: Schema[T]): BinaryCodec[T] =
    Derive.derive[BinaryCodec, T](DerivedBinaryCodec.deriver)
}

Conclusion

ZIO Schema's Deriver is one more option next to the approaches above. Consider it if you want a single deriver source for both Scala 2 and Scala 3, if you need more flexibility than Magnolia provides, or if you already use ZIO Schema in your project.