Hitting a jvm.out error while compiling TensorFlow


1. Problem

[root@k8s-master tensorflow]# bazel build --config=opt --verbose_failures //tensorflow:libtensorflow_cc.so
INFO: Analysed target //tensorflow:libtensorflow_cc.so (138 packages loaded, 11509 targets configured).
INFO: Found 1 target...
[10,192 / 10,400] 48 actions running
    Compiling tensorflow/core/kernels/cwise_op_igammas.cc; 33s local
    Compiling tensorflow/core/kernels/cwise_op_greater.cc; 33s local
    Compiling tensorflow/core/kernels/cwise_op_floor_mod.cc; 33s local
    Compiling tensorflow/core/kernels/cwise_op_mod.cc; 33s local
    Compiling tensorflow/core/kernels/cwise_op_mul_2.cc; 33s local
    Compiling tensorflow/core/kernels/cwise_op_equal_to_2.cc; 33s local
    Compiling tensorflow/core/kernels/cwise_op_pow.cc; 33s local
    Compiling tensorflow/core/kernels/cwise_op_bitwise_and.cc; 33s local ...

Server terminated abruptly (error code: 14, error message: 'Socket closed', log file: '/root/.cache/bazel/_bazel_root/c67f7401a5c0cf4a446e6a7f5e6a0388/server/jvm.out')
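The error message points at a log file, so a reasonable first step is to read its tail and to check whether the kernel killed the Bazel server for lack of memory. The sketch below is only an assumption about how one might do that: `oom_lines` is a hypothetical helper name, and the exact wording of kernel OOM messages varies between kernel versions.

```shell
# Filter kernel log lines on stdin down to those that mention the OOM killer.
# "|| true" keeps the exit status at 0 when nothing matches.
oom_lines() {
    grep -i 'out of memory' || true
}

# Example usage (path copied from the error message above):
#   tail -n 50 /root/.cache/bazel/_bazel_root/c67f7401a5c0cf4a446e6a7f5e6a0388/server/jvm.out
#   dmesg | oom_lines
```

If `dmesg | oom_lines` prints a line naming the Bazel Java process, memory exhaustion is the likely cause.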

2. Solution

2.1 Check whether memory is the problem, swap in particular

Run free -m to check memory and swap usage.

If no swap is configured, add and enable swap space, then reboot the system.
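The swap check and setup can be sketched as follows. The function names, the /swapfile path, and the 8 GiB size are assumptions chosen for illustration; pick a size that suits the machine. `create_swapfile` must be run as root and is only defined here, not called.

```shell
# Print the SwapTotal value (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo; the optional argument exists only for testing.
swap_total_kb() {
    awk '/^SwapTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# Create and enable a swap file. Must be run as root.
# Path and size (in MiB) are assumed defaults, not values from the original post.
create_swapfile() {
    local path="${1:-/swapfile}" size_mb="${2:-8192}"
    dd if=/dev/zero of="$path" bs=1M count="$size_mb"
    chmod 600 "$path"   # swap files should not be readable by other users
    mkswap "$path"
    swapon "$path"
}

# Example usage (as root):
#   [ "$(swap_total_kb)" -eq 0 ] && create_swapfile /swapfile 8192
```

`swapon` takes effect immediately. For the swap file to still be active after the reboot, it also needs an entry in /etc/fstab, for example `/swapfile none swap sw 0 0`.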

2.2 Since the build uses CUDA, check the GPU temperature

Run nvidia-smi. If the temperature is too high, cool the GPU down before building again.
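To read only the temperature instead of scanning the full nvidia-smi table, something like the following may help. `hot_gpus` is a hypothetical helper, and the 85 °C default limit is an assumed value, not an NVIDIA specification. It relies on nvidia-smi listing GPUs in index order.

```shell
# Read one temperature (degrees C) per line on stdin and print "index temp"
# for every GPU hotter than the limit (default 85, an assumed threshold).
hot_gpus() {
    awk -v limit="${1:-85}" '$1+0 > limit {print NR-1, $1}'
}

# Example usage on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | hot_gpus 85
```

No output means every GPU is at or below the limit.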

3. References

[1] Resolving the jvm.out problem when compiling TensorFlow