Generating from types

I actually managed to do a little bit of work on the pole-prediction backend last night. There is a part where we store messages in the "database" (it's not really a database, just persistent storage). The messages are a variant type, so we need to write code both to encode each message into JSON and to decode each message from JSON.

type DatabaseMessage
    = AddDriver Year Driver
    | AddTeam Year Team
    | AddEntrant Driver.Id Team


encodeDatabaseMessage : DatabaseMessage -> Encode.Value
encodeDatabaseMessage dMsg =
    case dMsg of
        AddDriver year driver ->
            [ ( "tag", "AddDriver" |> Encode.string )
            , ( "arg1", year |> Encode.int )
            , ( "arg2", driver |> Driver.encode )
            ]
                |> Encode.object

        AddTeam year team ->
            [ ( "tag", "AddTeam" |> Encode.string )
            , ( "arg1", year |> Encode.int )
            , ( "arg2", team |> Team.encode )
            ]
                |> Encode.object

        AddEntrant driverId team ->
            [ ( "tag", "AddEntrant" |> Encode.string )
            , ( "arg1", driverId |> Encode.string )
            , ( "arg2", team |> Team.encode )
            ]
                |> Encode.object


databaseMessageDecoder : Decoder DatabaseMessage
databaseMessageDecoder =
    let
        interpret s =
            case s of
                "AddDriver" ->
                    Decode.succeed AddDriver
                        |> Decode.andField "arg1" Decode.int
                        |> Decode.andField "arg2" Driver.decoder

                "AddTeam" ->
                    Decode.succeed AddTeam
                        |> Decode.andField "arg1" Decode.int
                        |> Decode.andField "arg2" Team.decoder

                "AddEntrant" ->
                    Decode.succeed AddEntrant
                        |> Decode.andField "arg1" Decode.string
                        |> Decode.andField "arg2" Team.decoder

                _ ->
                    Decode.fail (String.append "Unknown message string: " s)
    in
    Decode.field "tag" Decode.string
        |> Decode.andThen interpret

As you can see, all of this is very repetitive and lends itself well to being generated automatically. You can easily imagine some meta-code that, given a type definition, generates an encoder and decoder (there is also the elm-codec library, but if you're auto-generating these anyway then that's less useful).
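To make that a bit more concrete, here is a rough sketch of what such meta-code might look like for the encoder half. Everything in it is hypothetical: the `TypeDef`, `Constructor` and `TypeRef` types, the `encoderSource` function, and the convention that a type called `Foo` is encoded by `Foo.encode`. It simply builds the source text of an encoder in the same shape as the hand-written one.

```elm
module GenerateEncoder exposing (Constructor, TypeDef, TypeRef(..), encoderSource)

-- Hypothetical, minimal description of a custom type.


type TypeRef
    = Name String


type alias Constructor =
    { name : String
    , args : List TypeRef
    }


type alias TypeDef =
    { name : String
    , constructors : List Constructor
    }


{-| Build the source text of an encoder function for the given type. -}
encoderSource : TypeDef -> String
encoderSource typeDef =
    let
        functionName =
            "encode" ++ typeDef.name
    in
    String.join "\n"
        ([ functionName ++ " : " ++ typeDef.name ++ " -> Encode.Value"
         , functionName ++ " value ="
         , "    case value of"
         ]
            ++ List.map branchSource typeDef.constructors
        )


{-| Arguments are named arg1, arg2, ... to match the JSON field names. -}
argName : Int -> String
argName index =
    "arg" ++ String.fromInt (index + 1)


branchSource : Constructor -> String
branchSource constructor =
    let
        argNames =
            List.indexedMap (\index _ -> argName index) constructor.args

        fieldSource index (Name typeName) =
            "            , ( \"" ++ argName index ++ "\", " ++ argName index ++ " |> " ++ encoderFor typeName ++ " )"
    in
    String.join "\n"
        ([ "        " ++ String.join " " (constructor.name :: argNames) ++ " ->"
         , "            [ ( \"tag\", \"" ++ constructor.name ++ "\" |> Encode.string )"
         ]
            ++ List.indexedMap fieldSource constructor.args
            ++ [ "            ]"
               , "                |> Encode.object"
               , ""
               ]
        )


{-| Assumed convention: primitives use Json.Encode, anything else uses Foo.encode. -}
encoderFor : String -> String
encoderFor typeName =
    case typeName of
        "Int" ->
            "Encode.int"

        "String" ->
            "Encode.string"

        _ ->
            typeName ++ ".encode"
```

Note that this naive convention would emit `Year.encode` for a `Year` argument, whereas the hand-written encoder uses `Encode.int`. A real generator would need some way to look through type aliases.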

Anyway, given some type for the abstract syntax of Elm code, both of these functions could be generated directly from the parsed type definition. That is great, since it puts no extra burden on the programmer to write the type definition in some meta-code. That is, it's much nicer to write:

type DatabaseMessage
    = AddDriver Year Driver

than

databaseMessageType : ElmSyntax.TypeDef
databaseMessageType =
    { name = "DatabaseMessage"
    , constructors =
        [ { name = "AddDriver"
          , args = [ Name "Year", Name "Driver" ]
          }
        ]
    }

using some imagined meta-programming library for Elm. However, there is a small niggle here. At some point, you may wish to update the type definition in a backwards-compatible way. Adding a constructor is fine: you can still just generate code from the type definition, and it doesn't matter that additional code is generated. However, suppose I wish to change the AddDriver constructor so that it also takes a driver number. We now want:

type DatabaseMessage
    = AddDriver Year Driver Int

The problem is that the obvious generated decoder will assume that every message in the existing store has a third argument, when the older ones do not. Now, it's easy enough to make this backwards compatible: you can just handle the case where there is no arg3. So instead of the following:

                "AddDriver" ->
                    Decode.succeed AddDriver
                        |> Decode.andField "arg1" Decode.int
                        |> Decode.andField "arg2" Driver.decoder
                        |> Decode.andField "arg3" Decode.int

you can instead write:

                "AddDriver" ->
                    Decode.succeed AddDriver
                        |> Decode.andField "arg1" Decode.int
                        |> Decode.andField "arg2" Driver.decoder
                        |> Decode.optionalField "arg3" 0 Decode.int
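Neither `andField` nor `optionalField` is part of the standard `Json.Decode` module, so they are presumably project helpers. As a guess at how such helpers could be written on top of elm/json (the module name `DecodeHelpers` and the exact behaviour are my assumptions, not necessarily what the backend uses):

```elm
module DecodeHelpers exposing (andField, optionalField)

import Json.Decode as Decode exposing (Decoder)


{-| Decode a required field and feed it to the constructor in the pipeline. -}
andField : String -> Decoder a -> Decoder (a -> b) -> Decoder b
andField name decoder =
    Decode.map2 (|>) (Decode.field name decoder)


{-| Like andField, but fall back to a default when the field is absent.
A field that is present but malformed still fails, rather than silently
becoming the default.
-}
optionalField : String -> a -> Decoder a -> Decoder (a -> b) -> Decoder b
optionalField name default decoder =
    let
        fieldOrDefault =
            Decode.maybe (Decode.field name Decode.value)
                |> Decode.andThen
                    (\found ->
                        case found of
                            Nothing ->
                                Decode.succeed default

                            Just _ ->
                                Decode.field name decoder
                    )
    in
    Decode.map2 (|>) fieldOrDefault
```

The design choice worth noticing is in `optionalField`: wrapping the whole field decoder in `Decode.maybe` would be shorter, but it would also swallow genuinely corrupt values, which is probably not what you want from persistent storage.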

The question is, how do we communicate that to the meta-programming system if we're just writing the normal Elm type definition? There are a few options:

Force people to never change constructors. Instead, you add a new constructor name and factor out the common update logic. For example:

type DatabaseMessage
    = AddDriver Year Driver -- Doesn't change
    | AddDriverWithNumber Year Driver Number


-- ... in an update function ...
    AddDriver year driver ->
        update (AddDriverWithNumber year driver 0) model

Always generate code that just picks a default for when the argument isn't there.

                "AddDriver" ->
                    Decode.succeed AddDriver
                        |> Decode.optionalField "arg1" 0 Decode.int
                        |> Decode.optionalField "arg2" Driver.empty Driver.decoder
                        |> Decode.optionalField "arg3" 0 Decode.int

Force migrations. You could even generate the migrations themselves, given the old and new type definitions.
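As a minimal sketch of what one such migration might look like, assuming the stored data is rewritten once rather than patched up at decode time: decode with the old shape, upgrade, and re-encode with the new shape. All names here (`OldMessage`, `NewMessage`, `upgrade`, `migrateStored`) are hypothetical, and the driver is reduced to a plain string to keep the example self-contained.

```elm
module Migration exposing (migrateStored)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


-- Simplified stand-ins: the driver is just a name string here.


type OldMessage
    = OldAddDriver Int String


type NewMessage
    = NewAddDriver Int String Int


oldDecoder : Decoder OldMessage
oldDecoder =
    Decode.map2 OldAddDriver
        (Decode.field "arg1" Decode.int)
        (Decode.field "arg2" Decode.string)


{-| The only hand-written (or generated) decision: the default driver number. -}
upgrade : OldMessage -> NewMessage
upgrade (OldAddDriver year driver) =
    NewAddDriver year driver 0


encodeNew : NewMessage -> Encode.Value
encodeNew (NewAddDriver year driver number) =
    Encode.object
        [ ( "tag", Encode.string "AddDriver" )
        , ( "arg1", Encode.int year )
        , ( "arg2", Encode.string driver )
        , ( "arg3", Encode.int number )
        ]


{-| Rewrite one stored message from the old shape to the new shape. -}
migrateStored : Decode.Value -> Result Decode.Error Encode.Value
migrateStored stored =
    Decode.decodeValue oldDecoder stored
        |> Result.map (upgrade >> encodeNew)
```

After running something like this over the whole store, the generated decoder can stay strict, at the cost of having to run a migration on every type change.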

Write the type definition in the ugly meta-programming library syntax, but have some way to specify that an argument to a constructor may be missing at decode time.

databaseMessageType : ElmSyntax.TypeDef
databaseMessageType =
    { name = "DatabaseMessage"
    , constructors =
        [ { name = "AddDriver"
          , args =
            [ { kind = Name "Year", decoder = strictIntField }
            , { kind = Name "Driver", decoder = strictDriverField }
            , { kind = Name "Number", decoder = optionalIntField 0 }
            ]
          }
        ]
    }

Introduce some commenting on type definitions to specify this, such as:

type DatabaseMessage
    = AddDriver Year Driver {- optional -} Int

There are probably some other solutions I haven't thought of. All in all, this is a non-trivial problem to solve, which is why so far I haven't done any meta-programming here at all, and have just written all the encoders/decoders for messages by hand. Sometimes, when you do not know the best route forward, you are best to delay the decision. Not always, but that's the approach I'm taking here.

#programming #elm