Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve logging when skipping values or entities in mappings #1425

Merged
merged 3 commits into from
May 8, 2024

Conversation

tillprochaska
Copy link
Contributor

This is a follow-up to #1404. This PR only includes the changes that improve logging as I agree that completely disabling value cleaning can be dangerous.

This is a small change that allows callers of the `unsafe_add` method to determine if a value has been removed as part of value cleaning. This is helpful to be able to log/warn when values are removed.
Right now, mappings silently discard unclean values. This makes it difficult to debug mapping issues during development as well as to monitor mappings running in production.
prop.map(proxy, record, entities)
discarded_values = prop.map(proxy, record, entities)
for value in discarded_values:
log.warn(f"[{self.name}] Discarded unclean value \"{value}\" for property \"{prop.prop.qname}\".")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pudo You suggested also logging the proxy ID here. I can see how this can be useful, especially when the mapping uses id_column. I just don’t know how to implement this in a smart way as the proxy ID hasn’t been computed yet at this point (and it cannot be computed earlier as that requires knowing the property values).

Only option I see is not to log the warning message directly, but cache them until the ID has been computed, something like this:

discarded_values: dict[Property, set[value]] = defaultdict(set)

for prop in self.properties:
  if prop.prop.type == registry:country:
    # Save discarded values
    discarded_values = prop.map(proxy, record, entities)
    discarded_values[prop].update(discarded_values)

for prop in self.properties:
  if prop.prop.type == registry:country:
    # Save discarded values
    discarded_values = prop.map(proxy, record, entities)
    discarded_values[prop].update(discarded_values)

proxy.id = self.compute_key(record)
if proxy.id is None:
  if self.id_column:
    log.warn(...)
  if self.keys:
    log.warn(...)
  return None

if discarded_values:
  # Finally about discarded values here
  for prop, values in discarded_values.items():
    log.warn(...)

But that’s a lot of code and indirection for a simple log message… Any good ideas?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it's too much faff in the core loop of the mapper. Let's leave it.

@tillprochaska tillprochaska requested a review from pudo April 2, 2024 20:29
Copy link
Contributor

@pudo pudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be OK to merge this and make a minor version release?

prop.map(proxy, record, entities)
discarded_values = prop.map(proxy, record, entities)
for value in discarded_values:
log.warn(f"[{self.name}] Discarded unclean value \"{value}\" for property \"{prop.prop.qname}\".")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it's too much faff in the core loop of the mapper. Let's leave it.

@tillprochaska tillprochaska merged commit e3fb599 into main May 8, 2024
9 checks passed
@tillprochaska
Copy link
Contributor Author

@pudo Thanks for taking another look, just merged the PR! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants