Further improve serialization of "homogenous" lists #2191

Closed
fingolfin opened this issue Apr 3, 2023 · 1 comment

@fingolfin
Member

Recently @antonydellavecchia implemented optimizations that dramatically reduced the overhead for serializing e.g. Vector{Int}.

We should push this further to extend to other types of homogeneous vectors. For example, here I am serializing a vector of finite field elements (FFEs), all over the same finite field.

julia> F = GF(911)
Galois field with characteristic 911

julia> save("ffe.json", F.([0,1,2,3]))

julia> save("intvec.json", [0,1,2,3]) # for comparison

This is the resulting ffe.json (after pretty printing and with the _ns entry removed):

{
   "data" : {
      "vector" : [
         {
            "data" : {
               "data" : 0,
               "parent" : {
                  "data" : {
                     "characteristic" : 911
                  },
                  "id" : "92c02e5f-dfa2-4318-9031-4b78a9f33b02",
                  "type" : "Nemo.fpField"
               }
            },
            "id" : "e070f5f4-8261-4378-adc3-a8c918509843",
            "type" : "Nemo.fpFieldElem"
         },
         {
            "data" : {
               "data" : 1,
               "parent" : {
                  "id" : "92c02e5f-dfa2-4318-9031-4b78a9f33b02",
                  "type" : "#backref",
                  "version" : 1
               }
            },
            "id" : "d57f0213-951a-41e0-9c6b-b89549711dca",
            "type" : "Nemo.fpFieldElem"
         },
         {
            "data" : {
               "data" : 2,
               "parent" : {
                  "id" : "92c02e5f-dfa2-4318-9031-4b78a9f33b02",
                  "type" : "#backref",
                  "version" : 1
               }
            },
            "id" : "26713fb3-5e71-4ef5-8f5d-d650a04d0cca",
            "type" : "Nemo.fpFieldElem"
         },
         {
            "data" : {
               "data" : 3,
               "parent" : {
                  "id" : "92c02e5f-dfa2-4318-9031-4b78a9f33b02",
                  "type" : "#backref",
                  "version" : 1
               }
            },
            "id" : "ba90ef62-50b4-4da8-9975-2e2e8df7a6ea",
            "type" : "Nemo.fpFieldElem"
         }
      ]
   },
   "id" : "e6416bbb-8976-47ba-a433-377a8b1eec2f",
   "type" : "Vector"
}

For reference, this is intvec.json (again leaving out the namespace entry):

{
   "data" : {
      "entry_type" : "Base.Int",
      "vector" : [
         "0",
         "1",
         "2",
         "3"
      ]
   },
   "id" : "160c2eb6-4740-4cd5-b50e-b850ad3ba984",
   "type" : "Vector"
}

Some observations:

  • for FFEs of a prime field, I don't think we should assign them an id or consider them for backrefs
  • storing the parent with every entry should be redundant, since all entries share the same parent
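
Both observations rest on every entry having the same parent. A minimal sketch of that check in plain Julia, without Oscar; `common_parent` and the named-tuple stand-ins for field elements are hypothetical and not part of the actual serialization code:

```julia
# Return the parent shared by all entries of `v`, or `nothing` if `v` is
# empty or the entries do not all have the same parent.
# `parent_of` maps an entry to its parent (hypothetical hook; for real
# Oscar elements this role would be played by `parent`).
function common_parent(parent_of, v::AbstractVector)
    isempty(v) && return nothing
    p = parent_of(first(v))
    return all(x -> parent_of(x) == p, v) ? p : nothing
end

# Stand-ins for elements of GF(911) and GF(7): a value plus the
# characteristic of its parent field.
elems = [(value = k, parent = 911) for k in 0:3]
mixed = vcat(elems, [(value = 1, parent = 7)])

common_parent(x -> x.parent, elems)  # shared parent found, so it can be stored once
common_parent(x -> x.parent, mixed)  # `nothing`, so fall back to per-entry parents
```

Only when such a check succeeds does it make sense to hoist the parent out of the individual entries.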

Here is what it could look like (I did not check whether:

{
   "data" : {
      "entry_parent" : {
         "data" : {
            "characteristic" : 911
         },
         "id" : "92c02e5f-dfa2-4318-9031-4b78a9f33b02",
         "type" : "Nemo.fpField"
      },
      "entry_type" : "Nemo.fpFieldElem",
      "vector" : [
         {
            "data" : 0
         },
         {
            "data" : 1
         },
         {
            "data" : 2
         },
         {
            "data" : 3
         }
      ]
   },
   "id" : "e6416bbb-8976-47ba-a433-377a8b1eec2f",
   "type" : "Vector"
}

And why stop there? We can special-case this, say "FFEs essentially just store an integer", and compact it further:

{
   "data" : {
      "entry_parent" : {
         "data" : {
            "characteristic" : 911
         },
         "id" : "92c02e5f-dfa2-4318-9031-4b78a9f33b02",
         "type" : "Nemo.fpField"
      },
      "entry_type" : "Nemo.fpFieldElem",
      "vector" : [
         0,
         1,
         2,
         3
      ]
   },
   "id" : "e6416bbb-8976-47ba-a433-377a8b1eec2f",
   "type" : "Vector"
}

So really, what we are storing here is a Vector{Int16} plus the parent plus the information that this parent should be used to re-create the elements of the vector.
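
To make that concrete, here is a rough sketch of the round trip in plain Julia. `PrimeField`, `PrimeFieldElem`, `encode_compact` and `decode_compact` are hypothetical stand-ins so that the example runs without Oscar; this is not how the serialization code is actually implemented:

```julia
# Hypothetical stand-ins for Nemo.fpField / Nemo.fpFieldElem.
struct PrimeField
    characteristic::Int
end

struct PrimeFieldElem
    value::Int
    parent::PrimeField
end

# Encode: store the parent once, and only the integer representative per entry.
function encode_compact(v::Vector{PrimeFieldElem})
    isempty(v) && error("need at least one entry to determine the parent")
    F = first(v).parent
    all(x -> x.parent == F, v) || error("entries have different parents")
    return Dict(
        "entry_type" => "PrimeFieldElem",
        "entry_parent" => Dict("characteristic" => F.characteristic),
        "vector" => [x.value for x in v],
    )
end

# Decode: rebuild the parent first, then use it to re-create each element.
function decode_compact(d::Dict)
    F = PrimeField(d["entry_parent"]["characteristic"])
    return [PrimeFieldElem(mod(k, F.characteristic), F) for k in d["vector"]]
end

F = PrimeField(911)
v = [PrimeFieldElem(k, F) for k in 0:3]
d = encode_compact(v)   # parent appears once, entries are bare integers
w = decode_compact(d)   # elements re-created from the stored parent
```

The decoder never sees a per-entry parent or id; everything it needs beyond the integers lives in the single `entry_parent` entry.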

@lgoettgens
Member

Resolved by #2102.
