AD norm is NaN at origin #191

Open
koehlerson opened this issue Jan 11, 2023 · 3 comments

@koehlerson
Member

Not really specific to Tensors.jl, but maybe it's good to document it here as well. I ran into the following issue yesterday:

julia> using Tensors

julia> Tensors.hessian(x->norm(x),zero(Tensor{2,2}),:all)
([NaN NaN; NaN NaN;;; NaN NaN; NaN NaN;;;; NaN NaN; NaN NaN;;; NaN NaN; NaN NaN], [NaN NaN; NaN NaN], 0.0)

Basically a duplicate of JuliaDiff/ChainRules.jl#576 and JuliaDiff/ForwardDiff.jl#547, just at a different repo. You can close it if you think it's too redundant.
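
For context (an illustrative sketch, not from the original report): the NaN comes from the chain rule at the origin, where the partials of norm involve x / norm(x), i.e. 0/0. The same behaviour shows up with plain ForwardDiff in its default, non-NaN-safe mode, without Tensors.jl:

julia> using ForwardDiff

julia> f(x) = sqrt(x^2)   # d/dx sqrt(x^2) = x / sqrt(x^2), i.e. 0/0 at x = 0
f (generic function with 1 method)

julia> ForwardDiff.derivative(f, 0.0)   # default mode turns 0 * Inf into NaN
NaN

julia> ForwardDiff.derivative(f, 1.0)
1.0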

@KnutAM
Member

KnutAM commented Jan 11, 2023

Not really a solution, because the "correct" answer should be Inf, NaN, 0, right?
But to circumvent some of the issues, one can define the following subgradient:

julia> using LinearAlgebra, Tensors
julia> h,g,v = Tensors.hessian(norm,zero(Tensor{2,2}),:all); println.((h,g,v));
[NaN NaN; NaN NaN;;; NaN NaN; NaN NaN;;;; NaN NaN; NaN NaN;;; NaN NaN; NaN NaN]
[NaN NaN; NaN NaN]
0.0
julia> function dnorm_dx(x::AbstractTensor)
           n = norm(x)
           dn = (n == zero(n)) ? zero(x) : x / n
           return n, dn
       end
julia> LinearAlgebra.norm(x::Tensor{2,<:Any,<:Tensors.Dual}) = Tensors._propagate_gradient(dnorm_dx, x)
julia> h,g,v = Tensors.hessian(norm,zero(Tensor{2,2}),:all); println.((h,g,v));
[0.0 0.0; 0.0 0.0;;; 0.0 0.0; 0.0 0.0;;;; 0.0 0.0; 0.0 0.0;;; 0.0 0.0; 0.0 0.0]
[0.0 0.0; 0.0 0.0]
0.0
julia> h,g,v = Tensors.hessian(norm,rand(Tensor{2,2}),:all); 
julia> norm(g)
0.9999999999999999

(See the docs in #181) if using NaN-safe mode is not an option!
Adding a note about ForwardDiff's NaN-safe mode to the Tensors docs might be beneficial, since the user isn't actually importing ForwardDiff.
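
For reference, a minimal sketch of enabling it (assuming a ForwardDiff version that reads the Preferences.jl-based nansafe_mode preference; it is written to LocalPreferences.toml and takes effect after a Julia restart):

julia> using Preferences, ForwardDiff

julia> set_preferences!(ForwardDiff, "nansafe_mode" => true)   # restart Julia for this to take effect

Since the preference is set on ForwardDiff itself, it would presumably also affect ForwardDiff use outside Tensors in the same environment.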

@KristofferC
Collaborator

IMO, ForwardDiff should switch to NaN-safe mode by default.

@KnutAM
Member

KnutAM commented Jan 11, 2023

I tried running the benchmarks with and without NaN-safe mode, but couldn't figure out how to show the results from the Tensors.jl AD benchmarks nicely.
Would it be too breaking to use NaN-safe mode in Tensors (is that even possible)?
If possible, would that then only apply to Tensors, or would it also affect user code using ForwardDiff outside Tensors?
