-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retention Module #5344
base: main
Are you sure you want to change the base?
Retention Module #5344
Conversation
Hello, thanks for this PR , I think we need to check some elements because the different persistence stores are not managed in the same way as in the v2. For example, I don't think the WorkflowInstanceStore delete all the dependency logs or events around the instance. A WorkflowInstanceManager was introduced and used for eg by the Rest Api to delete Instance. This Manager trigger event that are then handle by other services to remove dependency. |
Ah, I missed that! I've changed the job to use the I wonder if there is a better way to combine 2 workflow instance filters, maybe using |
Hello @Sverre-W, I do have a same requirement of usage of Retention Module is Elsa 3.0. Thank You! |
@jayachandra21 I've updated it to be current with the main branch. I also moved the cloning of the |
Thank You very much! |
No not yet, this I can't do myself. |
src/modules/Elsa.Retention/HostedServices/CleanupHostedService.cs
Outdated
Show resolved
Hide resolved
src/modules/Elsa.Retention/HostedServices/CleanupHostedService.cs
Outdated
Show resolved
Hide resolved
@Sverre-W Thank you for the initiative! In general, we definitely want this module. There are some good suggestions from @raymonddenhaan that would be good to review. In addition, I would like to see the following enhancement: Currently, we're doing a bulk delete on workflow instances that match the filter. However, it would be good to have the option of customizing the cleanup job with something else, like for instance: archiving the workflow instances and related data by sending it to cheap storage, like Elastic Search or a local DB file. To that end, it might be good to introduce a strategy for the cleanup job, e.g. What do you think? |
src/modules/Elsa.Retention/Extensions/WorkflowInstanceFilterExtensions.cs
Show resolved
Hide resolved
Sounds good, I'll make the requested changes. Do you have in mind a contract for Something along the lines of: interface IRetentionCleanupStrategy {
public Task Handle(IWorkflowInstanceFilter filter, CancellationToken CancellationToken);
} Or should we already load the workflow instances that match the filter? In this case, we would need to have something that calls |
Awesome! I believe the cleanup strategies should be responsible only for handling the items they receive. This means the hosted service will manage the querying and batching, while the cleanup strategy processes each batch (deleting, archiving, etc.). The hosted service should load not only the workflow instance to be cleaned up but also related entities such as workflow execution log records, activity execution records, bookmarks, and potentially third-party tables dependent on a workflow instance. The challenge may be implementing support for third-party modules that want to hook into the cleanup process. Currently, we have the following known entities eligible for cleanup:
Ideally, we should design an API that also accommodates third-party entities, for example:
One solution could be to raise events and let individual handlers perform the cleanup. This approach would allow third-party modules to handle the cleanup event as well. These handlers should ideally receive the objects to be cleaned up without having to query them first. This would reduce code repetition when supporting both deletion and archiving to Elasticsearch and other platforms. Furthermore, this approach would enable scenarios where data is sent to multiple destinations. Perhaps this could be handled nicely with generic type interfaces, for example: /// Gets entities related to a given batch of workflow instances.
interface IRelatedEntityCollector<TEntity>
{
// Returns an async enumerable stream of entities to be cleaned up.
IAsyncEnumerable<TEntity> GetRelatedEntities(ICollection<WorkflowInstance> workflowInstances);
}
class BookmarkCollector : IRelatedEntityCollector<Bookmark>
{
IAsyncEnumerable<Bookmark> GetRelatedEntities(ICollection<WorkflowInstance> workflowInstances)
{
var workflowInstanceIds = workflowInstances.Select(x => x.Id).ToList();
// Load bookmarks in batches
// For each batch:
yield return bookmark;
}
}
// A strategy responsible for cleaning up a set of entities.
interface ICleanupStrategy<TEntity>
{
Task Cleanup(ICollection<TEntity> entities);
}
class DeleteBookmarkCleanupStrategy : ICleanupStrategy<Bookmark>
{
Task Cleanup(ICollection<Bookmark> entities)
{
// Delete bookmarks
}
}
class ElasticsearchBookmarkCleanupStrategy : ICleanupStrategy<Bookmark>
{
Task Cleanup(ICollection<Bookmark> entities)
{
// Send bookmarks to Elasticsearch
}
} These are just initial ideas that serve as an illustration and may not be entirely viable. But please consider them and see if they capture the outlined requirements, which, I think, should be:
We might start simple by just sending a notification and letting each handler in the enabled modules figure out what to do. We can refine the API from there if we start to see too much code repetition. |
This approach might work well. I will also check if this approach can be used if we would introduce a |
I just stumbled upon this old thread that may be interesting to consider as part of this PR: #412 Even if not in one go, it could be part of one of the next steps. |
I like the retention policy idea 👍 |
If we allow a custom clean-up strategy which in the end does not enforce the deletion of workflow, we could have some scenarios that are not really retention focused. For example, I could create a clean-up strategy that just cancels a workflow and use a filter that targets long running suspended workflows. I wonder if deleting a workflow is some concrete job that you would schedule using Elsa scheduling module. Maybe where you have some generic helpers around the principle of scheduling a job that filters on some workflows and then does some transformation on the filtered workflows, like deleting, archiving, cancelling, restarting, .... |
|
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
13223926 | Triggered | Generic Password | e2014d3 | test/component/Elsa.Workflows.ComponentTests/Helpers/Fixtures/Infrastructure.cs | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
@sfmskywalker, I've updated the module with the following usage: elsa.UseRetention(retention =>
{
retention.ConfigureCleanupOptions = options =>
{
options.SweepInterval = TimeSpan.FromHours(4);
options.PageSize = 25;
};
retention.AddDeletePolicy("Delete completed workflows", sp =>
{
ISystemClock systemClock = sp.GetRequiredService<ISystemClock>();
RetentionWorkflowInstanceFilter filter = new()
{
TimestampFilters = new List<TimestampFilter>(),
WorkflowStatus = WorkflowStatus.Finished
};
filter.TimestampFilters.Add(
new TimestampFilter
{
Column = nameof(WorkflowInstance.CreatedAt),
Operator = TimestampFilterOperator.LessThanOrEqual,
Timestamp = systemClock.UtcNow.Subtract(TimeSpan.FromDays(1))
}
);
return filter;
});
}); To implement the policy concept, I used a marker interface to determine which You can view the implementation here: https://github.com/Sverre-W/elsa-core/blob/f7bb304643a281d23c1253fffe32741ec87f8efd/src/modules/Elsa.Retention/Policies/DeletionRetentionPolicy.cs#L9-L21 |
I've ported the retention module from Elsa V2 to V3.
Usage is similar to before:
To be able to change the configured WorkflowInstanceFilter to add the time to live filter, I also made
WorkflowInstanceFilter
implementIClonable
to make a deep clone of the filter.This change is