cuda_mlp::CudaGD Class Reference

Gradient descent with optional momentum.


Public Member Functions

    CudaGD(CublasHandle &handle)
        Construct the optimizer.
    void setLearningRate(CudaScalar lr)
        Set the learning rate.
    void setMomentum(CudaScalar momentum)
        Set the momentum factor in [0,1).
    void solve(int n, CudaScalar *params, const CudaScalar *input, const CudaScalar *target, int total_samples, const LossGradFun &loss_grad) override
        Run full-batch gradient descent.
Public Member Functions inherited from cuda_mlp::CudaMinimizerBase

    CudaMinimizerBase(CublasHandle &handle)
        Construct with a cuBLAS handle reference.
    virtual ~CudaMinimizerBase() = default
    int iterations() const noexcept
        Return the number of iterations performed in the last solve.
    void setRecorder(::IterationRecorder<CudaBackend> *recorder)
        Attach a recorder for loss/gradient-norm history.
    void setMaxIterations(int iters)
        Set the maximum number of iterations.
    void setTolerance(CudaScalar tol)
        Set the stopping tolerance (interpretation depends on the optimizer).
    void setLineSearchParams(int max_iters, CudaScalar c1, CudaScalar rho)
        Configure Armijo line-search parameters.
Additional Inherited Members

Public Types inherited from cuda_mlp::CudaMinimizerBase

    using LossGradFun = std::function<CudaScalar(const CudaScalar *params, CudaScalar *grad, const CudaScalar *input, const CudaScalar *target, int batch)>
        Loss and gradient callback signature.
    using IterHook = std::function<void(int)>
        Optional per-iteration hook signature.
Protected Attributes inherited from cuda_mlp::CudaMinimizerBase

    CublasHandle &handle_
        cuBLAS handle used by the optimizer.
    int max_iters_ = 200
    int max_line_iters_ = 20
        Iteration limits.
    CudaScalar tol_ = 1e-6f
    CudaScalar c1_ = 1e-4f
    CudaScalar rho_ = 0.5f
        Stopping and line-search parameters.
    int last_iterations_ = 0
        Iterations performed in the last run.
    ::IterationRecorder<CudaBackend> *recorder_ = nullptr
        Optional recorder for diagnostics.
Gradient descent with optional momentum.

Minimizes a loss L(theta) using updates of the classical momentum form, with learning rate eta (setLearningRate) and momentum factor mu (setMomentum):

    v     <- mu * v - eta * grad L(theta)
    theta <- theta + v

With mu = 0 this reduces to plain gradient descent.
CudaGD(CublasHandle &handle)  [inline, explicit]

    Construct the optimizer.
void setLearningRate(CudaScalar lr)  [inline]

    Set the learning rate.
void setMomentum(CudaScalar momentum)  [inline]

    Set the momentum factor in [0,1).
void solve(int n, CudaScalar *params, const CudaScalar *input, const CudaScalar *target, int total_samples, const LossGradFun &loss_grad)  [inline, override, virtual]

    Run full-batch gradient descent.

    Parameters:
        n              Number of parameters
        params         Parameter vector (device)
        input          Input batch (device)
        target         Target batch (device)
        total_samples  Number of samples in the full batch
        loss_grad      Callback returning loss and gradient

    Implements cuda_mlp::CudaMinimizerBase.
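Putting the pieces together, a call site might look like the following sketch. The header name, the train() wrapper, and the device-pointer arguments are assumptions; only CudaGD, CublasHandle, LossGradFun, and the member functions documented above come from this page, and the code will not build without the cuda_mlp headers.

```cpp
// Sketch only -- header and helper names are assumptions.
#include "cuda_gd.hpp"

void train(CublasHandle &handle,
           CudaScalar *d_params,          // device parameter vector
           const CudaScalar *d_input,     // device input batch
           const CudaScalar *d_target,    // device target batch
           int n, int total_samples,
           const cuda_mlp::CudaMinimizerBase::LossGradFun &loss_grad) {
    cuda_mlp::CudaGD gd(handle);
    gd.setLearningRate(0.01f);
    gd.setMomentum(0.9f);      // must lie in [0,1)
    gd.setMaxIterations(500);  // inherited from CudaMinimizerBase
    gd.setTolerance(1e-6f);

    // loss_grad evaluates the loss and writes the gradient on-device.
    gd.solve(n, d_params, d_input, d_target, total_samples, loss_grad);
}
```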
