
Custom error metrics #281

Closed · lukedex opened this issue Sep 17, 2021 · 11 comments
Labels: enhancement (New feature or request)

Comments

@lukedex commented Sep 17, 2021

Hello, is it possible to use custom error metrics/loss functions (such as mean Poisson deviance from the sklearn package) for our model builds?

My target has a Poisson distribution, which I believe is causing some adverse effects when comparing the EBM's predictions to those from a GBM on dual lift charts. The EBM performs better on RMSE and Gini but does significantly worse on mean Poisson deviance.
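
For concreteness, here is a minimal sketch of the kind of comparison described here, on synthetic Poisson-distributed data rather than the real dataset (EBM with default settings; sklearn's mean_poisson_deviance as the metric):

```python
# Synthetic illustration: compare RMSE vs. mean Poisson deviance for an EBM
# trained with its default (squared-error) objective on a Poisson target.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_poisson_deviance
from interpret.glassbox import ExplainableBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = rng.poisson(lam=np.exp(0.4 * X[:, 0] - 0.2 * X[:, 1]))  # Poisson-distributed target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
ebm = ExplainableBoostingRegressor().fit(X_train, y_train)
pred = np.clip(ebm.predict(X_test), 1e-9, None)  # deviance requires positive predictions

print("RMSE:", np.sqrt(mean_squared_error(y_test, pred)))
print("mean Poisson deviance:", mean_poisson_deviance(y_test, pred))
```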

Any advice is greatly appreciated.

Thank you
Luke

@interpret-ml (Collaborator) commented

Hi @lukedex,

This is a great question, but unfortunately we have no support for custom loss functions in EBMs today. It's on our backlog and something we're working towards, but it may take quite some time before we can support it.

It is very reasonable that a mismatch between the target distribution and the loss function could cause performance issues, so I think your intuition is right. Regression EBMs directly optimize MSE right now, so it makes sense that they perform better on RMSE and worse on metrics like mean Poisson deviance. Happy to jump on a call and brainstorm if that would be helpful -- feel free to reach out to us at [email protected]. We'll also update this issue if we add support for this in the future.
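
For context, the metric in question, mean Poisson deviance, weights errors quite differently from squared error; per the sklearn definition it is

$$\mathrm{D}(y, \hat{y}) = \frac{1}{n}\sum_{i=1}^{n} 2\left(y_i \log\frac{y_i}{\hat{y}_i} - y_i + \hat{y}_i\right)$$

so a model that minimizes squared error is not directly optimizing the quantity being reported.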

-InterpretML Team

@lukedex (Author) commented Oct 6, 2021

@interpret-ml

Hey, I emailed two weeks ago but haven't had a response. Just a bit worried it's been pushed into a spam/junk folder.

@interpret-ml (Collaborator) commented

Hi @lukedex,

Really appreciate you reaching out again here -- we found your email and have just sent you a reply. Let us know if you didn't receive it!

@nchesk commented Jun 7, 2022

Hi @interpret-ml, any update on custom error metrics?

@interpret-ml (Collaborator) commented

Hi @nchesk -- We've done some work on custom losses, but it's a big change and it'll be a while (months at least) before it's ready.

-InterpretML team

@lmssdd commented Nov 25, 2022

Hi, any news on this? I need quantile regression and am thinking about implementing it myself.
Do you have a branch I could fork and work on?

@paulbkoch (Collaborator) commented

Hi @lmssdd - We've done some work on it, but it isn't ready yet. You might want to look at my response to another loss-function question about the implementation in #380 (comment).

We work from the develop branch.

paulbkoch added the enhancement (New feature or request) label on Feb 10, 2023

@paulbkoch (Collaborator) commented

Hi @lukedex, @nchesk, and @lmssdd -- I'm happy to report that we finally support alternative objectives in v0.4.0, which has recently been published on PyPI. "poisson_deviance" is one of the supported objectives. More details are available in our documentation: https://interpret.ml/docs/ebm.html#explainableboostingregressor
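
For reference, selecting the new objective from Python looks like the following (a minimal sketch against interpret v0.4.0+ on synthetic data; see the linked docs for the full parameter list):

```python
# Minimal sketch for interpret >= 0.4.0: train an EBM with the newly supported
# Poisson deviance objective, then evaluate with sklearn's matching metric.
import numpy as np
from sklearn.metrics import mean_poisson_deviance
from interpret.glassbox import ExplainableBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = rng.poisson(lam=np.exp(0.5 * X[:, 0]))  # Poisson-distributed target

ebm = ExplainableBoostingRegressor(objective="poisson_deviance")
ebm.fit(X, y)
print("train mean Poisson deviance:",
      mean_poisson_deviance(y, np.clip(ebm.predict(X), 1e-9, None)))
```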

@lmssdd commented May 16, 2023 via email

@JDE65 commented Nov 23, 2023

Hello,
Thanks a lot for the hard work and the fantastic tool! I used it effectively to benchmark explainable versus non-explainable models in doi.org/10.1007/978-3-031-44064-9_26.

Are you still working on allowing users to define custom loss functions? In some fields, asymmetric loss functions prove very effective when there is a significant imbalance between the costs of different types of prediction errors.
Thanks in advance.

@paulbkoch (Collaborator) commented

Thanks @JDE65, we really liked your paper and I've added it to the readme.

We're at the point where it's possible to specify custom objectives, but this currently requires a small modification to the C++. I've added an example objective to simplify this process.

To specify that the example objective should be used, you can invoke it this way from Python:

ebm = ExplainableBoostingRegressor(objective="example")

The default "example" objective is currently RMSE. To change the objective you need to modify the CalcMetric, CalcGradient, and CalcGradientHessian functions, and then recompile using either "build.sh" or "build.bat".

Those functions are located here:

GPU_DEVICE inline TFloat CalcMetric(const TFloat & score, const TFloat & target) const noexcept {
   const TFloat prediction = score; // identity link function
   const TFloat error = prediction - target;
   return error * error;
}

GPU_DEVICE inline TFloat CalcGradient(const TFloat & score, const TFloat & target) const noexcept {
   const TFloat prediction = score; // identity link function
   const TFloat error = prediction - target;
   // Alternatively, the 2.0 factor could be moved to GradientConstant()
   const TFloat gradient = Two * error;
   return gradient;
}

// If the loss function doesn't have a second derivative, then delete the CalcGradientHessian function.
GPU_DEVICE inline GradientHessian<TFloat> CalcGradientHessian(const TFloat & score, const TFloat & target) const noexcept {
   const TFloat prediction = score; // identity link function
   const TFloat error = prediction - target;
   // Alternatively, the 2.0 factors could be moved to GradientConstant() and HessianConstant()
   const TFloat gradient = Two * error;
   const TFloat hessian = Two;
   return MakeGradientHessian(gradient, hessian);
}

If you implement a nice objective that would be useful to the wider community, please consider contributing it back to InterpretML. I've included some instructions on how to create a new objective with its own tag instead of re-using the "example" tag:

https://github.com/interpretml/interpret/blob/develop/shared/libebm/compute/objectives/objective_registrations.hpp
