Interpretation for best linear projection #1260

Open
Einspanner8888 opened this issue Feb 7, 2023 · 1 comment

@Einspanner8888

Dear grf team,
Thank you for making such a wonderful package.
I am currently analyzing data with a causal forest. While checking the best_linear_projection results, I ran into two questions.

  1. In the best_linear_projection function, the statistical significance changes considerably depending on which covariates are included in the matrix passed to it. In other words, the results of a BLP on all variables differ from those of a BLP on only the variables of interest (see the results below). Is it reasonable to run a variable selection procedure, such as stepwise selection in a linear model, and then perform the BLP on a matrix containing only the selected variables?

Result for BLP with all variables

Best linear projection of the conditional average treatment effect.
Confidence intervals are cluster- and heteroskedasticity-robust (HC3):

               Estimate  Std. Error t value Pr(>|t|)  
(Intercept) -8.3923e-01  3.4658e-01 -2.4215  0.01584 *
PRAPACHE     1.7305e-02  9.8373e-03  1.7591  0.07923 .
AGE          7.5675e-03  3.0559e-03  2.4764  0.01363 *
BLGCS       -1.3452e-02  1.5681e-02 -0.8578  0.39143  
ORGANNUM     2.9168e-02  6.1689e-02  0.4728  0.63656  
BLIL6        7.4330e-07  5.6685e-07  1.3113  0.19042  
BLLPLAT      4.7364e-04  3.4598e-04  1.3690  0.17168  
BLLBILI      1.0151e-02  1.2267e-02  0.8276  0.40836  
BLLCREAT    -4.9351e-03  1.2213e-02 -0.4041  0.68635  
TIMFIRST     5.8356e-05  4.6491e-05  1.2552  0.21005  
BLADL       -1.0051e-02  1.2494e-02 -0.8045  0.42154  
blSOFA      -4.6013e-04  2.2480e-02 -0.0205  0.98368  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Result for BLP with only the variables of interest

Best linear projection of the conditional average treatment effect.
Confidence intervals are cluster- and heteroskedasticity-robust (HC3):

              Estimate Std. Error t value  Pr(>|t|)    
(Intercept) -0.8888329  0.2125291 -4.1822 3.449e-05 ***
PRAPACHE     0.0222930  0.0069279  3.2179  0.001381 ** 
AGE          0.0066299  0.0028854  2.2977  0.022020 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  2. In the second result above (BLP with only the variables of interest), I interpreted the AGE coefficient as: "When AGE increases by one unit, the CATE increases by 0.0066, and this is statistically significant." Is this interpretation valid?

Best regards.

@erikcs
Member

erikcs commented Mar 22, 2023

Hi @Einspanner8888,
You can interpret the BLP as just another linear regression, one that under the hood uses a more involved construction for the left-hand side (LHS). Thus running your favorite stepwise selection procedure, and interpreting the coefficients as you would in an OLS regression, is perfectly reasonable (but of course the usual considerations around multiple selection and post-model-selection inference apply).
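
To make the comparison in the question concrete, here is a minimal sketch in R on simulated data (not the data from this issue). The covariate names X1 to X5, the sample size, and the simulated treatment effect are assumptions made purely for illustration; causal_forest and best_linear_projection are the grf functions discussed above.

```r
library(grf)

set.seed(1)

# Simulated data; all names and sizes here are hypothetical.
n <- 1000
p <- 5
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("X", 1:p)

W <- rbinom(n, 1, 0.5)            # randomized binary treatment
tau <- 0.5 * X[, "X1"]            # simulated effect varies with X1 only
Y <- tau * W + X[, "X2"] + rnorm(n)

forest <- causal_forest(X, Y, W, num.trees = 500)

# BLP of the CATE on all covariates.
blp_all <- best_linear_projection(forest, X)

# BLP of the CATE on a chosen subset of covariates.
blp_sub <- best_linear_projection(forest, X[, c("X1", "X2")])

print(blp_all)
print(blp_sub)
```

Each table is read like an OLS summary: a coefficient describes how the projected CATE changes with a one-unit increase in that covariate, holding the other included columns fixed. Because the set of included columns differs between the two calls, the estimates and standard errors for the same covariate can differ as well.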
