Releases: lixilinx/psgd_torch
PSGD 2.0 release
PSGD 2.0 supports both the old triangular-solver-based update formulae for Q and several inverse-free, matmul-only methods for updating Q, including online Newton-Schulz iterations. Main files:
- psgd.py: functional APIs that expose the full flexibility of the methods.
- wrapped_as_torch_optimizer_for_ddp.py: a basic example that wraps momentum whitening as a torch.optim.Optimizer for DDP training.
- wrapped_as_torch_optimizer_for_dtensor.py: another basic example that wraps momentum whitening as a torch.optim.Optimizer, this one for DTensor-based distributed training.
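
To give a feel for what "inverse-free, matmul-only" means, here is a minimal sketch of the classical Newton-Schulz iteration for a matrix inverse, X <- X (2I - A X), which uses nothing but matrix products. This is a generic textbook illustration, not the update that psgd.py applies to Q. The names `matmul` and `newton_schulz_inverse` are hypothetical, and plain Python lists are used only to keep the sketch dependency-free; with torch one would use tensors and `@` instead.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def newton_schulz_inverse(a, num_iters=30):
    """Approximate inv(a) using only matrix products (no solver, no inverse).

    Hypothetical helper for illustration. Assumes `a` is square and
    nonsingular.
    """
    n = len(a)
    # Initial guess X0 = A^T / (||A||_1 * ||A||_inf). Because
    # ||A||_2^2 <= ||A||_1 * ||A||_inf, the eigenvalues of A @ X0 lie in
    # (0, 1], so the residual I - A @ X0 has spectral radius below 1.
    norm_1 = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in a)
    scale = 1.0 / (norm_1 * norm_inf)
    x = [[a[j][i] * scale for j in range(n)] for i in range(n)]

    for _ in range(num_iters):
        ax = matmul(a, x)
        # Form 2I - A X, then update X <- X (2I - A X).
        correction = [[(2.0 if i == j else 0.0) - ax[i][j] for j in range(n)]
                      for i in range(n)]
        x = matmul(x, correction)
    return x
```

For a well-conditioned matrix such as `[[4.0, 1.0], [2.0, 3.0]]`, `matmul(a, newton_schulz_inverse(a))` should be close to the identity. The residual roughly squares at each step, so a few dozen iterations are usually plenty; ill-conditioned inputs need more.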
archived code
Update preconditioned_stochastic_gradient_descent.py:
1. Replace trtrs with triangular_solve due to torch's API update.
2. Use torch.chain_matmul for things like A @ B @ C ...