Notes on the tensor2tensor framework

tensor2tensor 1.13.2

tensorflow 1.13.1

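These versions are fairly old, so letting pip resolve dependencies automatically leads to version conflicts. Pin the dependency explicitly: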

!pip install tensorflow-datasets==1.0.1

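The default hyperparameters are mostly defined in layers/common_hparams.py, which also explains some of them.

Some training-related parameters are in utils/trainer_lib.py.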

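The learning rate schedule is usually constant*linear_warmup*rsqrt_decay*rsqrt_hidden_size.

Each factor is computed as follows (values in parentheses are the hparams settings):

constant = hparams.learning_rate_constant (2)
linear_warmup = tf.minimum(1.0, step_num / hparams.learning_rate_warmup_steps) (learning_rate_warmup_steps = 8000)
rsqrt_decay = tf.rsqrt(tf.maximum(step_num, hparams.learning_rate_warmup_steps))
rsqrt_hidden_size = hparams.hidden_size ** -0.5 (hidden_size = 1024)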

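import math

def calc_lr(step, warmup_up=8000):
    # Plain-Python version of constant*linear_warmup*rsqrt_decay*rsqrt_hidden_size.
    constant = 0.2  # hparams.learning_rate_constant
    learning_rate_warmup_steps = warmup_up
    hidden_size = 1024
    # Ramps linearly from 0 to 1 over the warmup steps, then stays at 1.
    linear_warmup = min(1.0, step / learning_rate_warmup_steps)
    # Constant during warmup, then decays as 1/sqrt(step).
    rsqrt_decay = 1/math.sqrt(max(step, learning_rate_warmup_steps))
    rsqrt_hidden_size = hidden_size ** -0.5
    result = constant * linear_warmup * rsqrt_decay * rsqrt_hidden_size
    return result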
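
To get a feel for the shape of this schedule, it can help to evaluate it at a few steps. The following is a minimal standalone sketch, not tensor2tensor code: the function name lr_schedule and its keyword arguments are hypothetical, and the defaults (constant 0.2, 8000 warmup steps, hidden size 1024) are simply the values used in calc_lr above.

```python
import math


def lr_schedule(step, constant=0.2, warmup_steps=8000, hidden_size=1024):
    """Return constant * linear_warmup * rsqrt_decay * rsqrt_hidden_size.

    Hypothetical helper for illustration; the defaults are assumptions
    taken from the notes, not read from tensor2tensor hparams.
    """
    linear_warmup = min(1.0, step / warmup_steps)
    rsqrt_decay = 1.0 / math.sqrt(max(step, warmup_steps))
    rsqrt_hidden_size = hidden_size ** -0.5
    return constant * linear_warmup * rsqrt_decay * rsqrt_hidden_size


# The rate is highest exactly when warmup ends.
peak = lr_schedule(8000)

# Print the schedule at a few representative steps.
for step in (1, 4000, 8000, 16000, 32000):
    lr = lr_schedule(step)
    print(f"step {step:>6}: lr = {lr:.3e} ({lr / peak:.2f} x peak)")
```

Under these assumptions the rate grows linearly during warmup, peaks at step == warmup_steps with value constant / sqrt(warmup_steps * hidden_size), and then decays as 1/sqrt(step), so it falls back to half the peak at 4x the warmup steps. With the constant held fixed, a longer warmup therefore also lowers the peak rate.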