
New Gauss-Newton Solver (pressio issue, closed, 9 comments)

pressio commented on September 27, 2024
New Gauss-Newton Solver

from pressio.

Comments (9)

mhoemmen commented on September 27, 2024

However, this choice was just a bit more convenient (for now): making it a run-time choice that is queried when the actual solve happens would have required passing the convergence condition through the various layers needed to construct a ROM problem.

Designing convergence tests and nested solvers is tricky. You might think you're sufficiently general, but then some new kind of solver comes along that breaks your interface assumptions.

In general, convergence criteria need run-time information. For example, users might want to supply a custom norm or inner product (ROL has this issue). An outer solver might want to control behavior of successive inner solves based on previous inner solves' results (e.g., with inexact Krylov methods, or Newton-Krylov). Users might want to combine convergence criteria (e.g., Belos' GMRES wants both the implicitly computed and explicitly computed residual norms to satisfy the tolerance).
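To make the composability point concrete, here is a minimal sketch of a run-time convergence-test interface with an "all of" combinator, in the spirit of Belos requiring both residual norms to satisfy the tolerance. All names here (`SolverStatus`, `ConvergenceTest`, `AllOf`, etc.) are hypothetical, not pressio or Belos API:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical sketch: a run-time convergence test that an outer solver
// queries each iteration, plus a combinator requiring ALL child tests to pass.
struct SolverStatus {
  int iteration;
  double residualNorm;
  double correctionNorm;
};

struct ConvergenceTest {
  virtual ~ConvergenceTest() = default;
  virtual bool converged(const SolverStatus& s) const = 0;
};

struct ResidualNormBelow : ConvergenceTest {
  double tol;
  explicit ResidualNormBelow(double t) : tol(t) {}
  bool converged(const SolverStatus& s) const override {
    return s.residualNorm < tol;
  }
};

struct MinIterationsReached : ConvergenceTest {
  int minIters;
  explicit MinIterationsReached(int m) : minIters(m) {}
  bool converged(const SolverStatus& s) const override {
    return s.iteration >= minIters;
  }
};

// "All" combinator: every child criterion must hold simultaneously.
struct AllOf : ConvergenceTest {
  std::vector<std::shared_ptr<ConvergenceTest>> tests;
  bool converged(const SolverStatus& s) const override {
    for (const auto& t : tests)
      if (!t->converged(s)) return false;
    return true;
  }
};
```

Because the criterion is an object queried at run time, an outer solver could also swap or tighten the inner solver's test between successive inner solves, which is exactly the inexact-Krylov use case above.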


fnrizzi commented on September 27, 2024

@mhoemmen I agree with you! I think what we have now can be made more general, and making it also a run-time option seems good to have. Btw, now that I am thinking about it, I think the current design does not prevent us from accessing run-time information; it just forces the user to choose the criterion upfront (which, I agree, might not be ideal).


fnrizzi commented on September 27, 2024

@eparish1
A few things to clarify:
[1] what if we call/refer to:

  • J^T J: hessian (since it is an approximation of it)
  • J^T R: projected residual (maybe you have a better name)?

[2] in view of [1] above, the target system type is admissible if it meets the following conditions:
it has these typedefs:

using scalar_type = ...;
using state_type = ...;
using hessian_type = ...;
using projected_residual_type = ...;

and these methods:

void computeHessianAndProjectedResidual(const state_type & state,
                                        hessian_type & JTJ,
                                        projected_residual_type & JTR) const;

hessian_type createHessianObject(const state_type & state) const{
  // create and return an empty object; it does not have to contain any real data.
  // It is only called once, to initialize members of the solver class.
}

projected_residual_type createProjectedResidualObject(const state_type & state) const{
  // create and return an empty object; it does not have to contain any real data.
  // It is only called once, to initialize members of the solver class.
}

[3] for the line search, we need to talk more, since some methods need the residual. What if we wanted to compute just the residual (or J^T R)? In that case, we should make the system method support that too. Maybe using another argument to the above method, or having a separate residual method? Basically, forcing the system class to either compute J^T J and J^T R together, or just R (or J^T R).
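For reference, a minimal toy system satisfying the admissibility conditions in [2] could look like the following. This is only an illustration using plain `std::vector` containers and the names proposed above; it models the one-parameter least-squares fit r_i(x) = a_i*x - b_i, so J^T J = sum a_i^2 and J^T R = sum a_i*(a_i*x - b_i):

```cpp
#include <cstddef>
#include <vector>

// Toy system meeting the proposed API; types are illustrative only.
struct ToySystem {
  using scalar_type             = double;
  using state_type              = std::vector<double>;  // size 1 here
  using hessian_type            = std::vector<double>;  // 1x1 "matrix"
  using projected_residual_type = std::vector<double>;  // size 1

  std::vector<double> a, b;  // data defining the residuals r_i(x) = a_i*x - b_i

  void computeHessianAndProjectedResidual(const state_type& state,
                                          hessian_type& JTJ,
                                          projected_residual_type& JTR) const {
    JTJ[0] = 0.0; JTR[0] = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
      JTJ[0] += a[i] * a[i];                        // J^T J
      JTR[0] += a[i] * (a[i] * state[0] - b[i]);    // J^T R
    }
  }

  // Only called once, to initialize solver members; contents need not be meaningful.
  hessian_type createHessianObject(const state_type&) const {
    return hessian_type(1);
  }
  projected_residual_type createProjectedResidualObject(const state_type&) const {
    return projected_residual_type(1);
  }
};
```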


mhoemmen commented on September 27, 2024

Regarding converged_when_tag, does it actually pay for this to be a compile-time choice? Why not make it a run-time choice?


fnrizzi commented on September 27, 2024

Regarding converged_when_tag, does it actually pay for this to be a compile-time choice? Why not make it a run-time choice?

@mhoemmen Good point! Right now, I made this choice so one would do:

using converged_when_t = solvers::iterative::converged_when::completingNumMaxIters;
using gn_t = solvers::iterative::GaussNewton<linear_solver_t, converged_when_t, problem_t, hessian_t>;
gn_t GNSolver(problem, x, linSolver);
// problem is the problem object, x is the initial state here, linSolver is the linear solver object

One caveat (which I think is what you thought about) is that this implies a fixed convergence method for a solver object, so if you want to vary the convergence condition, you would need to create a different solver object. I agree this might be annoying.
However, this choice was just a bit more convenient (for now): making it a run-time choice that is queried when the actual solve happens would have required passing the convergence condition through the various layers needed to construct a ROM problem. But I agree that it would be ideal to support both options (and at some point we should).

The other thing I want to add is that currently one can only choose among a few convergence criteria specified within pressio. At some point, this should be made more flexible by allowing the user to pass a callable that sets the convergence condition, with pressio performing a compile-time analysis of the callable type to make sure it meets the right API. This kind of flexibility was not super urgent (at least so far), which is why it is missing, but I would love for this to be done properly at some point :)
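The "compile-time analysis of the callable type" idea could be sketched roughly as follows (names and signature are hypothetical, not pressio API); a `static_assert` on `std::is_invocable_r` rejects callables with the wrong API at compile time while leaving the criterion itself a run-time object:

```cpp
#include <type_traits>

// Sketch: accept any callable as the convergence criterion, and verify at
// compile time that it can be invoked as bool(iteration, residualNorm).
template <typename Criterion>
bool checkConvergence(const Criterion& crit, int iteration, double residualNorm) {
  static_assert(std::is_invocable_r_v<bool, const Criterion&, int, double>,
                "convergence criterion must be callable as bool(int, double)");
  return crit(iteration, residualNorm);
}
```

A user would then pass, e.g., a lambda such as `[](int it, double norm){ return it >= 3 || norm < 1e-8; }`, and a mismatched signature fails with a readable diagnostic instead of a deep template error.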


eparish1 commented on September 27, 2024

[1] For the naming convention, I think they are both ok. I personally don't mind JTJ and JTR because I think they are descriptive (but that's just me). Calling JTJ the Hessian is definitely reasonable (especially since that's what you currently have). For JTR, I don't love projected_residual as J^TR is not really a projected residual. To be consistent with Hessian, I would call J^T R the gradient.

[2] I agree with this, although is there a reason that we are doing a "createHessianObject" rather than having a "void" and a "return" equivalent of "computeHessianAndProjectedResidual"?

[3] Yes, this is a good point with the line search. It would be useful to have a method that computes the residual norm (not the whole residual). It is a little annoying to have a method that is required only for the line search, however. Maybe we can't really get around it though.


fnrizzi commented on September 27, 2024

[1] For the naming convention, I think they are both ok. I personally don't mind JTJ and JTR because I think they are descriptive (but that's just me). Calling JTJ the Hessian is definitely reasonable (especially since that's what you currently have). For JTR, I don't love projected_residual as J^TR is not really a projected residual. To be consistent with Hessian, I would call J^T R the gradient.

Ok, I like the gradient... :) The reason I don't like JTJ and JTR is that these are just letters. If somebody has a formulation that relies on denoting things differently, e.g. \zeta for the residual, then we are off (probably that will never happen, but just in case). I would argue that, for generality and expressiveness, we should use names that convey the concept/meaning of things, not letters. For the same reason, we don't call computeResidual something like computeR. So hessian and gradient are nice. I would even say approximate Hessian and gradient; what do you think of that?

[2] I agree with this, although is there a reason that we are doing a "createHessianObject" rather than having a "void" and a "return" equivalent of "computeHessianAndProjectedResidual"?

Yes, to follow the residual API, and to clearly tell any user (or even ourselves) that the create... methods are only meant to create an object with no useful data in it, and to convey the idea that these will only be called once. Makes sense?

[3] Yes, this is a good point with the line search. It would be useful to have a method that computes the residual norm (not the whole residual). It is a little annoying to have a method that is required only for the line search, however. Maybe we can't really get around it though.

I agree. So you think just a method for the norm?


fnrizzi commented on September 27, 2024
using scalar_type = ...;
using state_type = ...;
using hessian_type = ...;
using gradient_type = ...; 
//...
void computeHessianAndGradient(const state_type & state,
                               hessian_type & JTJ,
                               gradient_type & JTR,
                               const ::pressio::solvers::Norm & normType,
                               scalar_type & residualNorm) const;

hessian_type createHessianObject(const state_type & state) const{
  // create and return an empty object; it does not have to contain any real data.
  // It is only called once, to initialize members of the solver class.
}

gradient_type createGradientObject(const state_type & state) const{
  // create and return an empty object; it does not have to contain any real data.
  // It is only called once, to initialize members of the solver class.
}
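To sanity-check this revised API, here is a self-contained toy implementation and a bare-bones Gauss-Newton loop driving it. A stand-in enum replaces `::pressio::solvers::Norm`, and the model is the scalar least-squares problem r_i(x) = x^2 - b_i (so J_i = 2x); everything here is illustrative, not pressio code:

```cpp
#include <cmath>
#include <vector>

// Stand-in for ::pressio::solvers::Norm, just for this self-contained sketch.
enum class Norm { L2 };

// System implementing the revised API for r_i(x) = x^2 - b_i:
// J^T J = sum (2x)^2, J^T R = sum 2x*(x^2 - b_i).
struct ScalarSystem {
  using scalar_type   = double;
  using state_type    = std::vector<double>;
  using hessian_type  = std::vector<double>;
  using gradient_type = std::vector<double>;

  std::vector<double> b;  // data to fit

  void computeHessianAndGradient(const state_type& state,
                                 hessian_type& JTJ,
                                 gradient_type& JTR,
                                 const Norm& /*normType*/,
                                 scalar_type& residualNorm) const {
    const double x = state[0];
    JTJ[0] = 0.0; JTR[0] = 0.0;
    double sumSq = 0.0;
    for (double bi : b) {
      const double r = x * x - bi;
      JTJ[0] += (2.0 * x) * (2.0 * x);
      JTR[0] += 2.0 * x * r;
      sumSq  += r * r;
    }
    residualNorm = std::sqrt(sumSq);  // L2 norm of the residual
  }

  hessian_type  createHessianObject(const state_type&)  const { return hessian_type(1); }
  gradient_type createGradientObject(const state_type&) const { return gradient_type(1); }
};

// Minimal Gauss-Newton loop over the API above: x <- x - (J^T J)^{-1} J^T R.
inline double solveGaussNewton(const ScalarSystem& sys, double x0, int maxIters) {
  ScalarSystem::state_type x{x0};
  auto JTJ = sys.createHessianObject(x);   // create... called once, up front
  auto JTR = sys.createGradientObject(x);
  double residualNorm = 0.0;
  for (int k = 0; k < maxIters; ++k) {
    sys.computeHessianAndGradient(x, JTJ, JTR, Norm::L2, residualNorm);
    if (residualNorm < 1e-12) break;
    x[0] -= JTR[0] / JTJ[0];
  }
  return x[0];
}
```

Note how the single `computeHessianAndGradient` call returns the residual norm as a by-product, which is what the line-search discussion below needs.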


eparish1 commented on September 27, 2024

If we want to be able to integrate the Wolfe conditions, we should support a method for computing: r(x_1)^T J(x_2), where r is the residual, J is the Jacobian, and x_1, x_2 are two instances of the optimization variables (e.g., the optimization variable at the current step and the guess for the next iteration). This leads to storing a vector in memory that scales with the size of the optimization variable, which is doable. So I'd say we could have two additional methods:

1.) residual_norm(x)
2.) jacobianTransposeResid(x_1,x_2)

Obviously I think it is important to not require both of these for the solver, but only for the line search (e.g., the solver will run if none are provided, but can't use the line search). Just a side note, we should be careful with how the line search is implemented for the case of hyper reduction and different norms, as these then change the Wolfe conditions.
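To illustrate how the two extra methods would feed a Wolfe-conditions check, here is a hedged sketch for the merit function f(x) = 0.5*||r(x)||^2 on the scalar model r_i(x) = x^2 - b_i, where the gradient of f is J(x)^T r(x). All names (`residual_norm`, `jacobianTransposeResid`, `wolfeConditionsHold`) are illustrative, not pressio API:

```cpp
#include <cmath>
#include <vector>

// Sketch of the two line-search methods for r_i(x) = x^2 - b_i (J_i(x) = 2x).
struct LineSearchSystem {
  std::vector<double> b;

  // ||r(x)||_2
  double residual_norm(double x) const {
    double s = 0.0;
    for (double bi : b) { const double r = x * x - bi; s += r * r; }
    return std::sqrt(s);
  }

  // r(x1)^T J(x2): here sum_i (x1^2 - b_i) * (2*x2).
  double jacobianTransposeResid(double x1, double x2) const {
    double s = 0.0;
    for (double bi : b) s += (x1 * x1 - bi) * (2.0 * x2);
    return s;
  }
};

// Strong Wolfe check for f(x) = 0.5*||r(x)||^2, whose gradient along the
// search direction p is jacobianTransposeResid(x, x) * p.
inline bool wolfeConditionsHold(const LineSearchSystem& sys, double x,
                                double p, double alpha,
                                double c1 = 1e-4, double c2 = 0.9) {
  const double xNew   = x + alpha * p;
  const double f0     = 0.5 * std::pow(sys.residual_norm(x), 2);
  const double fNew   = 0.5 * std::pow(sys.residual_norm(xNew), 2);
  const double slope0 = sys.jacobianTransposeResid(x, x) * p;
  const double slopeN = sys.jacobianTransposeResid(xNew, xNew) * p;
  const bool sufficientDecrease = fNew <= f0 + c1 * alpha * slope0;  // Armijo
  const bool curvature          = std::abs(slopeN) <= c2 * std::abs(slope0);
  return sufficientDecrease && curvature;
}
```

This also shows why storing the extra vector is unavoidable in the general (non-scalar) case: the curvature condition needs J^T r evaluated at the trial point, not just a norm.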

