Adaptive Normalization

Adaptive Normalization, or AdaNorm, is a variation of Layer Normalization which replaces the bias and scaling terms with a scaling function, based on the statistics of the input. Furthermore the gradients of the scaling function are 'detached' (that is, not required) during backpropagation.
Related concepts:
Layer Normalization
Related video:
https://youtube.com/shorts/Q84K1XJAhtc
External reference:
https://arxiv.org/abs/1911.07013