Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backwards compatibility? (new structure definitions but old json serialization) #997

Open
infn-ke opened this issue Mar 27, 2024 · 17 comments

Comments

@infn-ke
Copy link

infn-ke commented Mar 27, 2024

What is the best way to achieve backward compatibility with boost json? My data structures has changed, but I still need to interact with json data generated by previous versions of my structs. What is the best way to achieve this? Any examples? Simply trying to invoke boost::json::value_to will throw an exception.

I'm using boost json and boost describe version 1.83.

@grisumbras
Copy link
Member

grisumbras commented Mar 27, 2024

So, the simplest thing to do is use optionals in the new version of the structure. That works if that new version simply extends the previous one. But if some types are changed or some fields are removed, that won't work.

The next option is using a variant. We support any type with the interface similar to that of std::variant (I recommend using boost::variant2::variant). E.g. you parse into variant<mystruct_v1_1, mystruct_v1_0>. And then you check index, and if it corresponds to the old version, you construct the new struct from the old one.

Finally, if the struct is not the root of the target, then you can write a tag_invoke overload. It can also use the variant trick above in its implementation.

Update:
Although, do remember that if you invoke value_to< variant<mystruct_v1_1, mystruct_v1_0> > from inside tag_invoke for mystruct_v1_1, you'll get an infinite recursion. This can be avoided with using a third type (e.g. mystruct) or using contextual conversions.

@infn-ke
Copy link
Author

infn-ke commented Mar 28, 2024

Thanks for the suggestions. Currently using a tag_invoke overload as a user-provided conversion, as in example here

But this seems to not correctly handle the std::variant. Is it anything specific missing in this example to handle the variant? I can see std::variant working correctly without the user-provided conversion linked above. I wonder if there is something with the context that needs to passed?

@grisumbras
Copy link
Member

That example doesn't deal with variant, so I'm not sure what you mean. Do you have a variant as a member of a described struct?

@infn-ke
Copy link
Author

infn-ke commented Mar 30, 2024

Yes I have a described struct containing a variant. What is needed to handle that? Is there an example you can point me to?

@grisumbras
Copy link
Member

It should work: https://godbolt.org/z/1YzKs4s4e

@infn-ke
Copy link
Author

infn-ke commented Apr 2, 2024

It does not work for me if I use std::variant and a tag_invoke overload.

https://godbolt.org/z/Gc7Tq9PzK

Passing the contextual tag there causes it to not hold the alternative B when doing value_to on the json string.

@grisumbras
Copy link
Member

The problem with your approach is that your tag_invoke download never fails. Inside value_to<C> you invoke value_to< variant<A, B> >, it calls value_to<A>, it tries to convert to A, fails, the failure is ignored and a default-constructed A is returned.

You should either throw an exception (since you're using a throwing overload) or use a non-throwing overload and return a result with set error_code.

Also, I've noticed that you are using contexts incorrectly. When you are using a two context overload of tag_invoke your context is supposed to be used in the function which is defined, but you should pass the full context to inner calls of value_to. That is, unless you explicitly don't want that for some reason.

BTW, why do you need that custom overload? Currently it doesn't do anything that the library isn't doing already. And the library actually does more, e.g. it has special handling for optionals.

@infn-ke
Copy link
Author

infn-ke commented Apr 2, 2024

The custom overload is there to handle backward compatibility in a generic way. As new parameters are added I want to decode them as default if they are missing in the json serialization. Consider the following use case

struct C {
    std::variant<A, B> variant;
    double new_param{14.1}; // new parameter added
};
BOOST_DESCRIBE_STRUCT(C, (), (variant, new_param))

// json string generated by old describe, need backward compatibility
std::string_view json = R"foo({"variant":{"d":3.14E0}})foo";

Is there a way to distinguish between boost json trying the different variants versus the case where a new field has been added? Can I look at boost::json::error to tell what happened?

@grisumbras
Copy link
Member

grisumbras commented Apr 2, 2024

I see. You are writing a templated overload seemingly intended for all described classes, but in fact you only need it for C. So, make it just for C. That way you don't need a context. Also, if it's an overload for a specific type, you can easily implement it using e.g.

C c;
c.a = value_to<A>( obj.at("a") );
c.b = value_to<B>( obj.at("b") );

But if you don't want to do that, you can take the implementation for described classes from the library and slightly change it. I actuall did it here: https://godbolt.org/z/frbhPxqe5

The main interesting point is usage of descriptor_by_pointer from Boost.Describe. With that component I define a special trait that judges wether a particular member of C is required (default) or not.

template<>
struct is_required<
    boost::describe::descriptor_by_pointer<C_Ds, &C::new_param> >
    : std::false_type
{};

And then inside the overload I use it like this:

auto const found = obj->find(D.name);
if( found == obj->end() )
{
    if constexpr( !is_required< decltype(D) >::value ) // ignore missing new fields
        return;
    throw std::runtime_error("missing field");
}

@infn-ke
Copy link
Author

infn-ke commented Apr 2, 2024

I would probably want to make is_required default to false since I have many newly added fields. But since an std::variant may be a subset of another variant it is not clear to me how to make boost continue trying.

struct A {
    int i;
};
BOOST_DESCRIBE_STRUCT(A, (), (i))

struct B {
     int i;
    double d;
};
BOOST_DESCRIBE_STRUCT(B, (), (i,d))

struct C {
    std::variant<A, B> variant;
    double new_param = 14.1; // new parameter added
};

@grisumbras
Copy link
Member

Oh, so you want to support these added fields in other structs too? Then, you do need to make the overload a template.

@infn-ke
Copy link
Author

infn-ke commented Apr 3, 2024

Backwards support means I need to support added and removed fields in data structures. I can see boost 1.83 does support std::optional to handle added fields (using is_optional). That is good, something I could try to use. But next problem would then be deprecated fields. Currently the boost implementation returns error::size_mismatch is a field has been deprecated. Any suggestion on how to handle deprecated fields?

@grisumbras
Copy link
Member

Ok, so you have several structs with a few members which were added, and a few which were removed. In addition, those structs are nested inside each other, right?

To be honest, in this case I'd write custom tag_invoke overloads for each struct. The complexity of trying to handle it in a generic way is probably not worth it.

BTW, this issue accidentally helped me to catch a bug: #999.

@kiwixz
Copy link

kiwixz commented Apr 17, 2024

Currently the boost implementation returns error::size_mismatch is a field has been deprecated. Any suggestion on how to handle deprecated fields?

I opened #991 for this. My use case was more about forward-compatibility and ignoring new fields the program doesn't know about, but it's kind of the same.

@grisumbras
Copy link
Member

grisumbras commented Apr 19, 2024

At some point we can't provide a good enough generic solution. Generic solutions are for generic problems. This is why we allow you to fully customise behaviour with tag_invoke overloads. There's also the problem of having customisation options on top of customisation options. Because of that I approach the topic with some caution.

This problem of supporting older versions of a struct isn't very generic. Hence, I believe, the most efficient (in terms of time spent on writing it and in terms of ease of updating it in the future) is a tag_invoke overload.

If you wish to have some sort of generic approach, you can write a generic tag_invoke overload for described structs, where you mark certain data members as required (or instead you mark them as optional). It can use contextual conversions if you wish to choose whether or not to use it, or it can be defined for all described structs from your namespaces. This is basically what I did here: https://godbolt.org/z/frbhPxqe5.

But., semantially, you expect a "variant" of versions of described struct: variant<S, S_v0_2, S_v0_1>. This should work correctly in Boost 1.85. But you probably don't want to end up with a variant, you want an S. In this case you can go back to custom tag_invoke overload, in which you convert to variant, then pull out the value from it. Like this: https://godbolt.org/z/1qEj4aEos

Finally, this approach of converting to/from type A instead of type B, followed/preceded by conversion bewtween B and A seems to arise quite often. And we probably want to support it in JSON explicitly. This is what #989 is about. If that is implemented, then user code would look something like this:

namespace ns {

struct S_v0_1 { ... };
BOOST_DESCRIBE_STRUCT(S_v0_1, (), (...))

struct S_v0_2 { ... };
BOOST_DESCRIBE_STRUCT(S_v0_2, (), (...))

struct S_v0_2 { ... };
BOOST_DESCRIBE_STRUCT(S_v0_2, (), (...))

struct S_v1_0 { ... };
BOOST_DESCRIBE_STRUCT(S_v1_0, (), (...))

struct S : S_v1_0
{ 
  ...
  explicit S( variant<S_v1_0, S_v0_2, S_v0_1>&& ); // for value_to
  explicit operator variant<S_v1_0, S_v0_2, S_v0_1>(); // for value_from
};

} // namespace ns

namespace boost::json {
template<>
struct serialize_as<ns::S> { using type = variant<ns::S_v1_0, ns::S_v0_2, ns::S_v0_1>;  };
} // namespace boost::json

@kiwixz
Copy link

kiwixz commented Apr 19, 2024

This problem of supporting older versions of a struct isn't very generic.

I agree this is not an easy one!

The problem I have with the variant approach you suggest is that, while quite elegant, it doesn't scale very well for big structs and/or lots of versions. If I ever completely shuffle my struct this would serve me well, but not for adding fields one by one.

But thanks for the clear examples, they helped me implement exactly what I wanted:

#include <cstdio>
#include <boost/describe/class.hpp>
#include <boost/json/value_to.hpp>
#include <boost/json/parse.hpp>

struct A {
    int a = 0;
    int b = 1;
};
BOOST_DESCRIBE_STRUCT(A, (), (a, b))


struct JsonLenient {};
constexpr JsonLenient json_lenient;


template<typename T>
T tag_invoke(boost::json::value_to_tag<T>, const boost::json::value &v, JsonLenient) {
    printf("debug: lenient tag_invoke for %s\n", typeid(T).name());

    const boost::json::object& obj = v.as_object();

    T result;

    using Ds = boost::describe::describe_members<T,
        boost::describe::mod_any_access | boost::describe::mod_inherited>;
    size_t count = 0;

    boost::mp11::mp_for_each<Ds>([&](auto D) {
        auto it = obj.find(D.name);

        if (it == obj.end()) {
            printf("debug: default-constructed field %s\n", D.name);
            return;
        }

        result.*D.pointer = boost::json::value_to<
                std::decay_t<decltype(result.*D.pointer)>>(it->value());

        ++count;
    });

    if (count != obj.size())
        printf("debug: extra members\n");

    return result;
}


int main()
{
    A a;

    a = boost::json::value_to<A>(boost::json::parse(R"( {} )"), json_lenient);
    printf(" *** a=%d b=%d\n\n", a.a, a.b);

    a = boost::json::value_to<A>(boost::json::parse(R"( {"a":10} )"), json_lenient);
    printf(" *** a=%d b=%d\n\n", a.a, a.b);

    a = boost::json::value_to<A>(boost::json::parse(R"( {"a":10,"b":11} )"), json_lenient);
    printf(" *** a=%d b=%d\n\n", a.a, a.b);

    a = boost::json::value_to<A>(boost::json::parse(R"( {"a":10,"b":11,"c":12} )"), json_lenient);
    printf(" *** a=%d b=%d\n\n", a.a, a.b);
}
debug: lenient tag_invoke for 1A
debug: default-constructed field a
debug: default-constructed field b
 *** a=0 b=1

debug: lenient tag_invoke for 1A
debug: default-constructed field b
 *** a=10 b=1

debug: lenient tag_invoke for 1A
 *** a=10 b=11

debug: lenient tag_invoke for 1A
debug: extra members
 *** a=10 b=11

The json_lenient is exactly what I was wishing for in #991. I guess it's not that hard to do with tag_invoke!

Maybe you could consider integrating something similar upstream ? It feels like a generic solution to a generic problem, because it just makes described structs parsing much less strict without any notion of versions/migrations (and all the headaches that comes with that).

It would help to keep the simple cases simple. And for cases where you're not just adding ints and strings, you could add the variant trick on top of that.

@infn-ke
Copy link
Author

infn-ke commented Apr 19, 2024

My solution to getting backwards compatibility was to use a templated tag_invoke overload and then specialize on a few ones. I also had to make templated tag_invoke for enums (boost::describe::describe_enumerators). It may be safer to serialize enums as integers rather than strings if you want to rename enums (as long as you don't renumber them).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants