Estimation of R-loss criterion and the Monte-carlo error using causal survival forest #1339

Open
mnose opened this issue Sep 5, 2023 · 2 comments

mnose commented Sep 5, 2023

After fitting a causal survival forest (Cui et al., 2023), is it possible to compute and report the debiased error (the R-loss) and the excess error (the Monte Carlo error)? Even when I run the sample code for causal survival forest from the package manual (https://cran.r-project.org/web/packages/grf/grf.pdf), both errors are "NaN". The same happens with my own data. I cannot find them in the original paper either.

Are they not supported in the current code? If not, is there a way to compute them, or to report alternative test statistics for assessing the CSF estimates?

erikcs (Member) commented Sep 5, 2023

Hi @mnose, one suggested way to assess the CSF estimates would be to use the TOC/RATE as mentioned at the end of the CSF docstring example.
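As a rough illustration of the RATE suggestion, the sketch below fits one CSF on a training half, uses its predicted CATEs as priorities, and evaluates them with `rank_average_treatment_effect` on a forest fit to the held-out half. The simulated data, the 50/50 split, and the reduced sample size are assumptions made for illustration, not part of the original reply.

```r
# Sketch only: simulated data and the train/evaluation split are assumptions.
rate <- NULL
if (requireNamespace("grf", quietly = TRUE)) {
  library(grf)
  set.seed(1)

  # Hypothetical right-censored data with a treatment effect on survival time.
  n <- 1000
  p <- 5
  X <- matrix(runif(n * p), n, p)
  W <- rbinom(n, 1, 0.5)
  horizon <- 1
  failure.time <- pmin(rexp(n) * X[, 1] + W, horizon)
  censor.time <- 2 * runif(n)
  Y <- pmin(failure.time, censor.time)
  D <- as.integer(failure.time <= censor.time)

  # Fit on one half and evaluate on the other, so the priorities are not
  # derived from the same observations used to evaluate them.
  train <- sample(n, n / 2)
  train.forest <- causal_survival_forest(X[train, ], Y[train], W[train], D[train],
                                         horizon = horizon)
  eval.forest <- causal_survival_forest(X[-train, ], Y[-train], W[-train], D[-train],
                                        horizon = horizon)

  # Rank held-out units by the CATEs predicted from the training forest.
  priorities <- predict(train.forest, X[-train, ])$predictions

  rate <- rank_average_treatment_effect(eval.forest, priorities)
  print(rate)  # RATE point estimate and standard error
  plot(rate)   # the TOC curve
}
```

A RATE estimate that is large relative to its standard error would suggest that the forest's ranking captures some treatment effect heterogeneity; an estimate near zero would suggest it does not.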

debiased.error predictions are currently not implemented for CSF (which is why they are reported as NaN), but the implied R-loss could be backed out with:

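# Out-of-bag CATE estimates from the fitted causal survival forest
tau.hat <- predict(forest)$predictions
# Mean squared residual built from the forest's stored `_psi` numerator and denominator
mean(
  ((forest[["_psi"]]$numerator - forest[["_psi"]]$denominator * tau.hat) / (forest$W.orig - forest$W.hat))^2
)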

This would be the appropriate loss to tune a CSF with. But given a fitted CSF (the default parameters are typically reasonable), the RATE is more informative.
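To make the tuning remark concrete, here is a minimal sketch that wraps the expression above in a helper and compares the implied R-loss across a few values of `min.node.size`. The helper name `csf_r_loss`, the candidate values, and the simulated data are all hypothetical. Because each forest re-estimates its own nuisance components, the comparison is noisy and should be read as indicative only.

```r
# Hypothetical helper wrapping the expression from the reply above.
# `_psi`, `W.orig` and `W.hat` are the fields used in that expression.
csf_r_loss <- function(forest) {
  tau.hat <- predict(forest)$predictions  # out-of-bag CATE estimates
  psi <- forest[["_psi"]]
  mean(((psi$numerator - psi$denominator * tau.hat) /
          (forest$W.orig - forest$W.hat))^2)
}

losses <- NULL
if (requireNamespace("grf", quietly = TRUE)) {
  library(grf)
  set.seed(1)

  # Hypothetical right-censored data, for illustration only.
  n <- 1000
  p <- 5
  X <- matrix(runif(n * p), n, p)
  W <- rbinom(n, 1, 0.5)
  horizon <- 1
  failure.time <- pmin(rexp(n) * X[, 1] + W, horizon)
  censor.time <- 2 * runif(n)
  Y <- pmin(failure.time, censor.time)
  D <- as.integer(failure.time <= censor.time)

  # Assumed candidate values; any other tuning parameter could be varied instead.
  candidates <- c(5, 20, 50)
  losses <- sapply(candidates, function(mns) {
    forest <- causal_survival_forest(X, Y, W, D, horizon = horizon,
                                     min.node.size = mns, num.trees = 500)
    csf_r_loss(forest)
  })
  names(losses) <- candidates
  print(losses)  # smaller values indicate a better fit under this criterion
}
```

The division by `W.orig - W.hat` becomes unstable when estimated propensities are close to 0 or 1, so this sketch is best suited to settings with good overlap.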

erikcs added the question label Sep 5, 2023

mnose (Author) commented Sep 13, 2023

Thank you for the information.

After estimating the CATE with the causal survival forest, I examined its partial dependence, i.e., how the estimate changes when a single variable is varied across its quintiles while all other variables are held at their medians. I obtained a partial dependence plot similar to the one produced with a causal forest in this tutorial (https://gsbdbi.github.io/ml_tutorial/hte_tutorial/hte_tutorial.html).
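The partial dependence procedure described above might look roughly like the following sketch. The helper `make_pd_grid`, the choice of quintile midpoints (10%, 30%, ..., 90%) as evaluation points, and the simulated data are assumptions. The grid construction is plain base R; the prediction step uses `predict` with `estimate.variance = TRUE` to obtain pointwise standard errors.

```r
# Hypothetical helper: hold every covariate at its median and vary one
# covariate over the chosen quantiles (assumed here: quintile midpoints).
make_pd_grid <- function(X, var, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)) {
  medians <- apply(X, 2, median)
  grid <- matrix(medians, nrow = length(probs), ncol = ncol(X), byrow = TRUE)
  colnames(grid) <- colnames(X)
  grid[, var] <- quantile(X[, var], probs)
  grid
}

set.seed(1)
n <- 1000
p <- 5
X <- matrix(runif(n * p), n, p)
grid <- make_pd_grid(X, var = 1)

pd <- NULL
if (requireNamespace("grf", quietly = TRUE)) {
  library(grf)

  # Hypothetical right-censored outcomes, for illustration only.
  W <- rbinom(n, 1, 0.5)
  horizon <- 1
  failure.time <- pmin(rexp(n) * X[, 1] + W, horizon)
  censor.time <- 2 * runif(n)
  Y <- pmin(failure.time, censor.time)
  D <- as.integer(failure.time <= censor.time)

  forest <- causal_survival_forest(X, Y, W, D, horizon = horizon)

  # CATE estimates and pointwise variance estimates at each grid row.
  pred <- predict(forest, grid, estimate.variance = TRUE)
  pd <- data.frame(
    x1 = grid[, 1],
    estimate = pred$predictions,
    std.err = sqrt(pred$variance.estimates)
  )
  pd$lower <- pd$estimate - 1.96 * pd$std.err
  pd$upper <- pd$estimate + 1.96 * pd$std.err
  print(pd)
}
```

The intervals here are pointwise at each grid row; on their own they do not constitute a joint test of equality across the evaluation points.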

Using the estimated coefficient and standard error of a particular covariate at each quintile, is there a way to test the equality of coefficients across quintiles (i.e., b_Q1 = b_Q2 = b_Q3 = b_Q4 = b_Q5) under a causal survival forest? I suspect that the usual hypothesis tests on linear combinations of coefficients are not appropriate here. Is there an appropriate alternative, with adjustments for multiple hypothesis testing?
