Fuzzy_join example for help or vignette. Match (key==key) and date between (startDate and endDate) #63

Open
IndigoJay opened this issue Oct 2, 2019 · 2 comments

IndigoJay commented Oct 2, 2019

More examples: I've used this package in other powerful ways, but on proprietary data. I'm interested in ideas for use cases that can be provided as vignettes.

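library(fuzzyjoin)

# Match on key equality and on date falling within [startDate, endDate]
fuzzy_left_join(
  A, B,
  by = c(
    "key" = "key",
    "date" = "startDate",
    "date" = "endDate"
  ),
  # pass the operators as functions (backticks), one per `by` pair
  match_fun = list(`==`, `>=`, `<=`)
)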
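
For a help page or vignette, a self-contained version with toy data might look like the following. This is only a sketch: the contents of `A` and `B` and the `label` column are made up for illustration, while the column names `key`, `date`, `startDate` and `endDate` follow the call above.

```r
library(fuzzyjoin)

# Hypothetical toy data: A holds dated events, B holds one validity period per key.
A <- data.frame(
  key  = c("a", "a", "b", "c"),
  date = as.Date(c("2019-01-15", "2019-03-01", "2019-02-10", "2019-02-01")),
  stringsAsFactors = FALSE
)
B <- data.frame(
  key       = c("a", "b"),
  startDate = as.Date(c("2019-01-01", "2019-01-01")),
  endDate   = as.Date(c("2019-01-31", "2019-01-31")),
  label     = c("a-january", "b-january"),
  stringsAsFactors = FALSE
)

# Keep every row of A; attach the B row whose key is equal and whose
# period contains the date. The operators are passed as functions.
matched <- fuzzy_left_join(
  A, B,
  by = c("key" = "key", "date" = "startDate", "date" = "endDate"),
  match_fun = list(`==`, `>=`, `<=`)
)

print(matched)
```

Because `key` exists in both tables, the result carries it twice, as `key.x` and `key.y`. Rows of `A` without a matching period are kept, with `NA` in the columns coming from `B`.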
@IndigoJay

FYI: on larger datasets (e.g. 6,000 rows on each side), it is substantially faster (0.11 s vs. 1662.77 s) to do this in two steps: an ordinary join on the key, followed by a filter on the date range.

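library(dplyr)

system.time({
  # Step 1: ordinary equality join on the key
  a <- inner_join(A, B, by = "key")
  # Step 2: keep only rows whose date falls within the period
  a <- filter(a, date >= startDate & date <= endDate)
})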
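
One caveat: `inner_join()` followed by `filter()` drops rows of `A` that have no matching period, whereas `fuzzy_left_join()` keeps them. If the left-join shape matters, the unmatched rows can be added back with `anti_join()`. The following is a minimal sketch under that assumption, using made-up toy data (the `label` column is hypothetical):

```r
library(dplyr)

# Hypothetical toy data with the same column names as in the issue.
A <- data.frame(
  key  = c("a", "a", "b", "c"),
  date = as.Date(c("2019-01-15", "2019-03-01", "2019-02-10", "2019-02-01")),
  stringsAsFactors = FALSE
)
B <- data.frame(
  key       = c("a", "b"),
  startDate = as.Date(c("2019-01-01", "2019-01-01")),
  endDate   = as.Date(c("2019-01-31", "2019-01-31")),
  label     = c("a-january", "b-january"),
  stringsAsFactors = FALSE
)

# Steps 1 and 2: equality join on key, then keep dates inside the period.
hits <- inner_join(A, B, by = "key") %>%
  filter(date >= startDate, date <= endDate)

# Step 3: rows of A that found no period, to restore left-join semantics.
misses <- anti_join(A, hits, by = c("key", "date"))

# bind_rows() fills the columns missing from `misses` with NA.
result <- bind_rows(hits, misses)

print(result)
```

Unlike the fuzzy join output, this result has a single `key` column rather than `key.x` and `key.y`, so downstream code may need adjusting when switching between the two approaches.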

@espinielli

I ended up doing it that way too, because I kept running out of memory...
