ENG-2572: VectorClockIds are a tuple. Editing changesets are gone #3988

Merged 10 commits on Jul 11, 2024

Commits on Jul 11, 2024

  1. feat(dal): VectorClockIds are a tuple, no editing changesets

    Previously, vector clock ids were the same as change set ids, and we
    generated an "editing change set" any time we mutated the graph. This
    change set was ephemeral and not connected to a real change set in
    the system. In addition, for conflict detection to work correctly,
    the node write clocks have to store every vector clock write that has
    *ever* happened to them. This meant these clocks would grow
    indefinitely, since they had to store every ephemeral "editing change
    set" in the node, forever.
    
    This change transforms the vector clock id into a tuple of the real
    ChangeSetId and the UserPk/ActorId of the current user, and removes
    editing change sets entirely. For system actors, like Pinga and the
    Rebaser, the WorkspacePk is used in place of the UserPk. Now the
    bound on the write clocks in node weights is the number of change
    sets and users in the system, which will grow much more slowly than
    the number of editing change sets.
    
    This is a breaking change, since it changes both the Node and Edge
    weight data structures. A migration must be in place before this can
    be deployed.
    zacharyhamm committed Jul 11, 2024
    abf5993
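
The effect of the tuple id described in the first commit can be sketched as follows. This is a minimal illustration, not the code from this PR: the `u64` id wrappers, the `VectorClock` type, and its methods are assumptions made for the example. The point is that repeated writes by the same (change set, actor) pair update one entry instead of adding a new one.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real id types, which are not shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ChangeSetId(pub u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ActorId(pub u64);

/// A vector clock id modeled as a (change set, actor) pair.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct VectorClockId {
    pub change_set_id: ChangeSetId,
    pub actor_id: ActorId,
}

impl VectorClockId {
    pub fn new(change_set_id: ChangeSetId, actor_id: ActorId) -> Self {
        Self { change_set_id, actor_id }
    }
}

/// Hypothetical write clock: the latest logical timestamp seen per clock id.
#[derive(Debug, Default)]
pub struct VectorClock {
    entries: HashMap<VectorClockId, u64>,
}

impl VectorClock {
    /// Record a write, keeping only the newest timestamp for this id.
    pub fn record_write(&mut self, id: VectorClockId, timestamp: u64) {
        let entry = self.entries.entry(id).or_insert(0);
        *entry = (*entry).max(timestamp);
    }

    /// Number of distinct clock ids stored in this clock.
    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn is_empty(&self) -> bool {
        self.entries.is_empty()
    }

    pub fn get(&self, id: &VectorClockId) -> Option<u64> {
        self.entries.get(id).copied()
    }
}
```

Under these assumptions, one hundred writes by the same user in the same change set leave a single entry in the clock, whereas one ephemeral id per mutation would have left one hundred.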
  2. cb84214
  3. 2e05bb1
  4. feat(dal): session based actor ids for system actors

    Whenever we don't have a HistoryActor, generate an actor id that
    lasts for the current DalContext and use it as the actor id of the
    vector clock id. When the rebaser writes out the final snapshot,
    however, use the workspace pk as the actor id.
    
    Co-Authored-By: Jacob Helwig <[email protected]>
    zacharyhamm and jhelwig committed Jul 11, 2024
    ab6d2a7
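
The actor id selection described in this commit message might look roughly like the sketch below. Everything here is assumed for illustration: the `Context` struct stands in for DalContext, the `HistoryActor` variants and `u64` keys are simplified, and the session id comes from a process-wide counter where a real implementation would more likely use a globally unique id.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ActorId(pub u64);

/// Simplified, hypothetical version of the actor attached to a request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HistoryActor {
    User(u64),
    SystemInit,
}

// Process-wide counter used to mint session actor ids in this sketch.
// It ignores possible collisions with user keys.
static NEXT_SESSION_ACTOR: AtomicU64 = AtomicU64::new(1);

/// Hypothetical stand-in for DalContext.
pub struct Context {
    history_actor: HistoryActor,
    workspace_pk: u64,
    session_actor_id: ActorId,
}

impl Context {
    pub fn new(history_actor: HistoryActor, workspace_pk: u64) -> Self {
        Self {
            history_actor,
            workspace_pk,
            // Minted once, so it stays stable for the life of this context.
            session_actor_id: ActorId(NEXT_SESSION_ACTOR.fetch_add(1, Ordering::Relaxed)),
        }
    }

    /// Actor id for ordinary writes: the user if there is one,
    /// otherwise the per-context session id.
    pub fn actor_id(&self) -> ActorId {
        match self.history_actor {
            HistoryActor::User(pk) => ActorId(pk),
            HistoryActor::SystemInit => self.session_actor_id,
        }
    }

    /// Actor id used when writing out the final snapshot: the workspace key.
    pub fn actor_id_for_final_snapshot(&self) -> ActorId {
        ActorId(self.workspace_pk)
    }
}
```

In this sketch two system contexts never share a session actor id, but every final snapshot written for a workspace carries the same workspace-derived actor id.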
  5. 7a6133f
  6. feat(dal): Deprecate current snapshot graph

    Co-Authored-By: Jacob Helwig <[email protected]>
    zacharyhamm and jhelwig committed Jul 11, 2024
    d481164
  7. feat(dal): Auto migrate snapshots.

    On SDF boot, attempt to automatically migrate all snapshots for a
    deployment, beginning with the builtin workspace's snapshot. The
    migration follows the "based_on_change_set_id" paths, treating the
    snapshots as a dependency graph, so that shared clock ids are
    migrated to the new clock ids correctly.
    
    Once this code is deployed, SDF will panic if it encounters a
    'legacy' snapshot.
    
    Co-Authored-By: Jacob Helwig <[email protected]>
    zacharyhamm and jhelwig committed Jul 11, 2024
    6bd2768
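
One way to read "treating the snapshots as a dependency graph" is that a change set's snapshot is migrated only after the snapshot it is based on. The sketch below shows that ordering with a breadth-first walk. It is an assumption-laden illustration: `ChangeSetRow` and `migration_order` are hypothetical names, and the actual migration code in this PR may order its work differently.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ChangeSetId(pub u64);

/// Hypothetical row: a change set and the change set it was based on.
pub struct ChangeSetRow {
    pub id: ChangeSetId,
    pub based_on_change_set_id: Option<ChangeSetId>,
}

/// Return an order in which every change set appears after its base.
/// Rows with no base are the roots and are visited first.
/// Rows whose base is missing from `rows` are never reached in this sketch.
pub fn migration_order(rows: &[ChangeSetRow]) -> Vec<ChangeSetId> {
    let mut children: HashMap<ChangeSetId, Vec<ChangeSetId>> = HashMap::new();
    let mut queue: VecDeque<ChangeSetId> = VecDeque::new();

    for row in rows {
        match row.based_on_change_set_id {
            Some(base) => children.entry(base).or_default().push(row.id),
            None => queue.push_back(row.id),
        }
    }

    let mut order = Vec::with_capacity(rows.len());
    while let Some(id) = queue.pop_front() {
        order.push(id);
        if let Some(dependents) = children.get(&id) {
            // Dependents become eligible once their base has been handled.
            queue.extend(dependents.iter().copied());
        }
    }
    order
}
```

Because each change set has at most one base, the graph is a forest, and a breadth-first walk from the roots is enough to guarantee that a base precedes everything derived from it.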
  8. feat(si-data-pg): Discard query plans when recycling connections

    If a table's structure changes, cached query plans against that table
    need to be invalidated, or PostgreSQL will return an error. This
    change prevents that error after migrating the database in a
    production system running pg_bouncer, which holds on to connections
    and reuses them even if our services are restarted. We could avoid
    needing to discard plans by selecting exactly the columns we need
    instead of using SELECT * (unless a column type changes!).
    
    This issue never hit us before because we haven't changed table
    structures much since adding pg_bouncer to the stack.
    
    Co-Authored-By: Jacob Helwig <[email protected]>
    zacharyhamm and jhelwig committed Jul 11, 2024
    8cb8090
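
The recycling behavior in this commit message can be illustrated with a small sketch. PostgreSQL's `DISCARD PLANS` statement releases cached query plans for the session; the surrounding pieces here (`SqlConnection`, `recycle`, and the `RecordingConnection` test double) are hypothetical and are not the si-data-pg API.

```rust
/// Hypothetical minimal connection interface for this sketch.
pub trait SqlConnection {
    fn batch_execute(&mut self, sql: &str) -> Result<(), String>;
}

/// Hook assumed to run each time a pooled connection is handed back out.
/// Discarding plans forces the server to re-plan statements against the
/// current table structure.
pub fn recycle<C: SqlConnection>(conn: &mut C) -> Result<(), String> {
    conn.batch_execute("DISCARD PLANS")
}

/// Test double that records the statements it was asked to run.
#[derive(Default)]
pub struct RecordingConnection {
    pub executed: Vec<String>,
}

impl SqlConnection for RecordingConnection {
    fn batch_execute(&mut self, sql: &str) -> Result<(), String> {
        self.executed.push(sql.to_string());
        Ok(())
    }
}
```

The trade-off is one extra round trip per recycle in exchange for not depending on every query naming its columns explicitly.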
  9. 897e08a
  10. 2f82b77