Decoding structured JSON arrays with circe in Scala

Suppose I need to decode JSON arrays that look like the following, where there are a couple of fields at the beginning, some arbitrary number of homogeneous elements, and then some other field:

[ "Foo", "McBar", true, false, false, false, true, 137 ]

I don't know why anyone would choose to encode their data like this, but people do weird things, and suppose in this case I just have to deal with it.

I want to decode this JSON into a case class like this:

case class Foo(firstName: String, lastName: String, age: Int, stuff: List[Boolean])

We can write something like this:

import cats.syntax.either._
import io.circe.{ Decoder, DecodingFailure, Json }
import io.circe.literal._ // provides the json"..." interpolator used in the examples below

implicit val fooDecoder: Decoder[Foo] = Decoder.instance { c =>
  c.focus.flatMap(_.asArray) match {
    case Some(fnJ +: lnJ +: rest) =>
      rest.reverse match {
        case ageJ +: stuffJ =>
          for {
            fn    <- fnJ.as[String]
            ln    <- lnJ.as[String]
            age   <- ageJ.as[Int]
            stuff <- Json.fromValues(stuffJ.reverse).as[List[Boolean]]
          } yield Foo(fn, ln, age, stuff)
        case _ => Left(DecodingFailure("Foo", c.history))
      }
    case _ => Left(DecodingFailure("Foo", c.history)) // also covers arrays with fewer than two elements
  }
}

…which works:

scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false, 137 ]""")
res3: io.circe.Decoder.Result[Foo] = Right(Foo(Foo,McBar,137,List(true, false)))

But ugh, that's horrible. Also the error messages are completely useless:

scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false ]""")
res4: io.circe.Decoder.Result[Foo] = Left(DecodingFailure(Int, List()))

Surely there's a way to do this that doesn't involve switching back and forth between cursors and Json values, throwing away history in our error messages, and just generally being an eyesore?


Some context: questions about writing custom JSON array decoders like this in circe come up fairly often (e.g. this morning). The specific details of how to do this are likely to change in an upcoming version of circe (although the API will be similar; see this experimental project for some details), so I don't really want to spend a lot of time adding an example like this to the documentation, but it comes up enough that I think it does deserve a Stack Overflow Q&A.

Asked By: Travis Brown

Answer #1:

Working with cursors

There is a better way! You can write this much more concisely while also maintaining useful error messages by working directly with cursors all the way through:

case class Foo(firstName: String, lastName: String, age: Int, stuff: List[Boolean])

import cats.syntax.either._
import io.circe.Decoder
import io.circe.literal._ // provides the json"..." interpolator used in the examples below

implicit val fooDecoder: Decoder[Foo] = Decoder.instance { c =>
  val fnC = c.downArray

  for {
    fn     <- fnC.as[String]
    lnC     = fnC.deleteGoRight
    ln     <- lnC.as[String]
    ageC    = lnC.deleteGoLast
    age    <- ageC.as[Int]
    stuffC  = ageC.delete
    stuff  <- stuffC.as[List[Boolean]]
  } yield Foo(fn, ln, age, stuff)
}

This also works:

scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false, 137 ]""")
res0: io.circe.Decoder.Result[Foo] = Right(Foo(Foo,McBar,137,List(true, false)))

But it also gives us an indication of where errors happened:

scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false ]""")
res1: io.circe.Decoder.Result[Foo] = Left(DecodingFailure(Int, List(DeleteGoLast, DeleteGoRight, DownArray)))

Also it's shorter, more declarative, and doesn't require that unreadable nesting.

How it works

The key idea is that we interleave "reading" operations (the .as[X] calls on the cursor) with navigation / modification operations (downArray and the three delete method calls).

When we start, c is an HCursor that we hope points at an array. c.downArray moves the cursor to the first element in the array. If the input isn't an array at all, or is an empty array, this operation will fail, and we'll get a useful error message. If it succeeds, the first line of the for-comprehension tries to decode that first element into a string and leaves our cursor pointing at that first element.

The second line in the for-comprehension says "okay, we're done with the first element, so let's forget about it and move to the second". The delete part of the method name doesn't mean it's actually mutating anything—nothing in circe ever mutates anything in any way that users can observe—it just means that that element won't be available to any future operations on the resulting cursor.

The third line tries to decode the second element in the original JSON array (now the first element in our new cursor) as a string. When that's done, the fourth line "deletes" that element and moves to the end of the array, and then the fifth line tries to decode that final element as an Int.

The next line is probably the most interesting:

    stuffC  = ageC.delete

This says, okay, we're at the last element in our modified view of the JSON array (where earlier we deleted the first two elements). Now we delete the last element and move the cursor up so that it points at the entire (modified) array, which we can then decode as a list of booleans, and we're done.
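The claim that "delete" only changes what the resulting cursor sees, and never the original JSON, can be checked with a small sketch. It assumes circe-parser is on the classpath and Scala 2.12 or later; the object name DeleteSketch and the sample array are made up for illustration.

```scala
import io.circe.Json
import io.circe.parser.parse

object DeleteSketch {
  // Parse a small array; fall back to Json.Null if parsing fails (not expected here).
  val doc: Json = parse("[1, 2, 3]").getOrElse(Json.Null)

  // A cursor pointing at the whole array.
  val c = doc.hcursor

  // Move to the first element, then "delete" it. The result is a cursor
  // on the parent array, viewed without that element.
  val withoutFirst = c.downArray.delete

  // Decoding through the new cursor sees the modified view...
  val trimmed = withoutFirst.as[List[Int]]

  // ...while the original cursor still sees every element.
  val original = c.as[List[Int]]
}
```

Here trimmed should hold the last two elements only, while original still decodes all three, which is the same behavior the final stuffC step in the decoder relies on.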

More error accumulation

There's actually an even more concise way you can write this:

import cats.syntax.all._
import io.circe.Decoder

implicit val fooDecoder: Decoder[Foo] = (
  Decoder[String].prepare(_.downArray),
  Decoder[String].prepare(_.downArray.deleteGoRight),
  Decoder[Int].prepare(_.downArray.deleteGoLast),
  Decoder[List[Boolean]].prepare(_.downArray.deleteGoRight.deleteGoLast.delete)
).map4(Foo)

This will also work, and it has the added benefit that if decoding would fail for more than one of the members, you can get error messages for all of the failures at the same time. For example, if we have something like this, we should expect three errors (for the non-string first name, the non-integral age, and the non-boolean stuff value):

val bad = """[["Foo"], "McBar", true, "true", false, 13.7 ]"""

val badResult = io.circe.jawn.decodeAccumulating[Foo](bad)

And that's what we see (together with the specific location information for each failure):

scala> badResult.leftMap(_.map(println))
DecodingFailure(String, List(DownArray))
DecodingFailure(Int, List(DeleteGoLast, DownArray))
DecodingFailure([A]List[A], List(MoveRight, DownArray, DeleteGoParent, DeleteGoLast, DeleteGoRight, DownArray))

Which of these two approaches you should prefer is a matter of taste and of whether you care about error accumulation. I personally find the first a little more readable.
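If deleteGoRight and deleteGoLast are not available in the circe version you are on (my understanding is that they were deprecated in later releases, but check the API docs for your version), a similar decoder can be sketched with index-based navigation instead. This is not from the answer above. It is a minimal sketch assuming circe-parser on the classpath and Scala 2.12 or later, and the names IndexedSketch and decodeRange are hypothetical. It keeps the cursor history in failures by decoding each element through c.downN rather than rebuilding Json values.

```scala
import io.circe.{ Decoder, DecodingFailure, HCursor }

object IndexedSketch {
  final case class Foo(firstName: String, lastName: String, age: Int, stuff: List[Boolean])

  // Hypothetical helper: decode the elements at indices [from, until) as booleans,
  // stopping at the leftmost failure so the error points at the offending index.
  private def decodeRange(c: HCursor, from: Int, until: Int): Decoder.Result[List[Boolean]] =
    (from until until).toList.foldRight[Decoder.Result[List[Boolean]]](Right(Nil)) { (i, acc) =>
      for {
        b  <- c.downN(i).as[Boolean]
        bs <- acc
      } yield b :: bs
    }

  implicit val fooDecoder: Decoder[Foo] = Decoder.instance { c =>
    // values is Some(...) only when the cursor is focused on a JSON array.
    c.values.map(_.size) match {
      case Some(n) if n >= 3 =>
        for {
          fn    <- c.downN(0).as[String]
          ln    <- c.downN(1).as[String]
          age   <- c.downN(n - 1).as[Int]
          stuff <- decodeRange(c, 2, n - 1)
        } yield Foo(fn, ln, age, stuff)
      case _ =>
        Left(DecodingFailure("Foo: expected an array with at least three elements", c.history))
    }
  }
}
```

With import IndexedSketch._ in scope, io.circe.parser.decode[Foo] should accept the same inputs as the cursor-based decoder above. Unlike the prepare-based version, this sketch fails fast rather than accumulating errors.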

Answered By: Travis Brown
The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.


