TensorFlow Distributed Training

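1. Replication strategies: in-graph replication, where a single client builds one graph holding one set of parameters and several copies of the compute-intensive part of the model, each pinned to a different worker task; and between-graph replication, where every worker task runs its own client and builds a similar graph. Both are forms of data parallelism (the same model trained on different mini-batches); model parallelism, by contrast, splits a single model across devices.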
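The data-parallel idea behind both replication strategies can be sketched without TensorFlow. The following is only a toy illustration in plain Python, assuming a one-parameter linear model and a synchronous update; the names `worker_gradient`, `train`, and `shards` are hypothetical and do not come from the TensorFlow API.

```python
# Toy illustration of synchronous data parallelism in plain Python (no TensorFlow).
# All names here (worker_gradient, train, shards) are hypothetical.

def worker_gradient(w, shard):
    """Gradient of the mean squared error for y = w * x on one worker's mini-batch."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def train(shards, steps=200, lr=0.05):
    w = 0.0  # shared parameter, in the role a parameter server would play
    for _ in range(steps):
        # Every worker evaluates the same model on its own shard of the data.
        grads = [worker_gradient(w, shard) for shard in shards]
        # Synchronous update: average the workers' gradients, then apply once.
        w -= lr * sum(grads) / len(grads)
    return w

# Data generated from y = 3x, split across two workers.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
shards = [data[:2], data[2:]]
w = train(shards)
```

Here `w` converges toward 3.0. In a real TensorFlow cluster the loop over shards would be replaced by separate worker tasks, and the shared parameter would live on a parameter-server task.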

tf.train.Supervisor

tf.train.MonitoredTrainingSession

Reference link:

https://github.com/tensorflow/examples/blob/master/community/en/docs/deploy/distributed.md