TensorFlow Source Code Analysis: Tensor and Eigen

2024-04-27 07:04 • Android • 752 views

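Tensor is the data structure that TensorFlow's low-level operations work on. It can represent multidimensional data and is implemented in core/framework/tensor.h. Understanding it comes down to two questions: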

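1. What a Tensor is made of

2. How a Tensor performs math (TensorFlow is, at its core, about processing large training datasets, and its low-level code has to implement the algorithms common in deep learning, which inevitably involves math operations such as adding and subtracting Tensors, softmax, and reductions)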

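What a Tensor is made of:

A Tensor can be broken into three parts: DataType, TensorShape, and TensorBuf (the TensorBuffer class in the source), that is, the data type, the shape of the tensor, and the memory that stores the data.

DataType: the type of the Tensor's elements, such as float, double, or int.

TensorShape: the shape of the Tensor. Representing it requires storing two things: the number of dimensions (the rank) and the size of each dimension. For example, a 3×4 matrix has rank 2; its first dimension has size 3 and its second has size 4.

TensorBuf: the memory that holds the data. For example, if a Tensor is a 3×4 matrix of type float, its TensorBuf holds a float* buffer of 12 elements. The implementation is template-based.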
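
To make the three parts concrete, here is a minimal sketch of a tensor-like type. All names (MiniDataType, MiniShape, MiniFloatTensor) are hypothetical and chosen for illustration; this is not TensorFlow's actual class layout, and a std::vector stands in for the raw float* buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not TensorFlow's real class layout.
enum class MiniDataType { kFloat, kDouble, kInt32 };

// Shape: the rank is dims.size(), and dims[i] is the size of dimension i.
struct MiniShape {
  std::vector<std::int64_t> dims;

  std::size_t rank() const { return dims.size(); }

  // The total element count is the product of all dimension sizes.
  std::int64_t num_elements() const {
    std::int64_t n = 1;
    for (std::int64_t d : dims) n *= d;
    return n;
  }
};

// A float-only tensor: element type tag, shape, and the backing memory.
struct MiniFloatTensor {
  MiniDataType dtype = MiniDataType::kFloat;
  MiniShape shape;
  std::vector<float> buf;  // stands in for the float* buffer

  explicit MiniFloatTensor(MiniShape s)
      : shape(std::move(s)),
        buf(static_cast<std::size_t>(shape.num_elements()), 0.0f) {}
};
```

Constructing a MiniFloatTensor from the shape {3, 4} gives a rank of 2 and a buffer of 12 floats, matching the 3×4 example above.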

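How a Tensor performs math:

Tensor math is mostly carried out by kernels, each of which implements a different operation. Most CPU kernels are implemented with the Eigen library; among the GPU kernels, a few use Eigen and most are written directly in CUDA.

Eigen's unsupported module provides a class that is also called Tensor. It handles a wide range of complex math operations and can run them in parallel (mainly with a thread pool or CUDA), so it is efficient. Its API documentation is at https://github.com/PX4/eigen/blob/master/unsupported/Eigen/CXX11/src/Tensor/README.md and is worth consulting. When TensorFlow uses Eigen for computation, it must first convert objects of its own Tensor class into Tensor objects that Eigen understands. The Eigen types it uses are listed in core/framework/tensor_types.h; they are mostly TensorMap instantiations, given shorter names through typedefs. The source is as follows: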

template <typename T, int NDIMS = 1, typename IndexType = Eigen::DenseIndex>
struct TTypes {
  // Rank-<NDIMS> tensor of scalar type T.
  typedef Eigen::TensorMap<Eigen::Tensor<T, NDIMS, Eigen::RowMajor, IndexType>,
                           Eigen::Aligned>
      Tensor;
  typedef Eigen::TensorMap<
      Eigen::Tensor<const T, NDIMS, Eigen::RowMajor, IndexType>, Eigen::Aligned>
      ConstTensor;

  // Unaligned Rank-<NDIMS> tensor of scalar type T.
  typedef Eigen::TensorMap<Eigen::Tensor<T, NDIMS, Eigen::RowMajor, IndexType> >
      UnalignedTensor;
  typedef Eigen::TensorMap<
      Eigen::Tensor<const T, NDIMS, Eigen::RowMajor, IndexType> >
      UnalignedConstTensor;

  typedef Eigen::TensorMap<Eigen::Tensor<T, NDIMS, Eigen::RowMajor, int>,
                           Eigen::Aligned>
      Tensor32Bit;

  // Scalar tensor (implemented as a rank-0 tensor) of scalar type T.
  typedef Eigen::TensorMap<
      Eigen::TensorFixedSize<T, Eigen::Sizes<>, Eigen::RowMajor, IndexType>,
      Eigen::Aligned>
      Scalar;
  typedef Eigen::TensorMap<Eigen::TensorFixedSize<const T, Eigen::Sizes<>,
                                                  Eigen::RowMajor, IndexType>,
                           Eigen::Aligned>
      ConstScalar;

  // Unaligned Scalar tensor of scalar type T.
  typedef Eigen::TensorMap<
      Eigen::TensorFixedSize<T, Eigen::Sizes<>, Eigen::RowMajor, IndexType> >
      UnalignedScalar;
  typedef Eigen::TensorMap<Eigen::TensorFixedSize<const T, Eigen::Sizes<>,
                                                  Eigen::RowMajor, IndexType> >
      UnalignedConstScalar;

  // Rank-1 tensor (vector) of scalar type T.
  typedef Eigen::TensorMap<Eigen::Tensor<T, 1, Eigen::RowMajor, IndexType>,
                           Eigen::Aligned>
      Flat;
  typedef Eigen::TensorMap<
      Eigen::Tensor<const T, 1, Eigen::RowMajor, IndexType>, Eigen::Aligned>
      ConstFlat;
  typedef Eigen::TensorMap<Eigen::Tensor<T, 1, Eigen::RowMajor, IndexType>,
                           Eigen::Aligned>
      Vec;
  typedef Eigen::TensorMap<
      Eigen::Tensor<const T, 1, Eigen::RowMajor, IndexType>, Eigen::Aligned>
      ConstVec;

  // Unaligned Rank-1 tensor (vector) of scalar type T.
  typedef Eigen::TensorMap<Eigen::Tensor<T, 1, Eigen::RowMajor, IndexType> >
      UnalignedFlat;
  typedef Eigen::TensorMap<
      Eigen::Tensor<const T, 1, Eigen::RowMajor, IndexType> >
      UnalignedConstFlat;
  typedef Eigen::TensorMap<Eigen::Tensor<T, 1, Eigen::RowMajor, IndexType> >
      UnalignedVec;
  typedef Eigen::TensorMap<
      Eigen::Tensor<const T, 1, Eigen::RowMajor, IndexType> >
      UnalignedConstVec;

  // Rank-2 tensor (matrix) of scalar type T.
  typedef Eigen::TensorMap<Eigen::Tensor<T, 2, Eigen::RowMajor, IndexType>,
                           Eigen::Aligned>
      Matrix;
  typedef Eigen::TensorMap<
      Eigen::Tensor<const T, 2, Eigen::RowMajor, IndexType>, Eigen::Aligned>
      ConstMatrix;

  // Unaligned Rank-2 tensor (matrix) of scalar type T.
  typedef Eigen::TensorMap<Eigen::Tensor<T, 2, Eigen::RowMajor, IndexType> >
      UnalignedMatrix;
  typedef Eigen::TensorMap<
      Eigen::Tensor<const T, 2, Eigen::RowMajor, IndexType> >
      UnalignedConstMatrix;
};
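
The key idea behind these typedefs is that a TensorMap does not own memory: it interprets an existing buffer as a row-major tensor of a fixed rank. The sketch below imitates that idea without depending on Eigen. RowMajorView and MiniTypes are hypothetical names invented for this example, not Eigen or TensorFlow APIs, and alignment is ignored.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the idea behind Eigen::TensorMap: a view that
// interprets memory it does not own as a rank-NDIMS row-major tensor.
template <typename T, std::size_t NDIMS>
class RowMajorView {
 public:
  RowMajorView(T* data, const std::array<std::size_t, NDIMS>& dims)
      : data_(data), dims_(dims) {}

  // Row-major offset: the last index varies fastest.
  T& operator()(const std::array<std::size_t, NDIMS>& idx) const {
    std::size_t offset = 0;
    for (std::size_t d = 0; d < NDIMS; ++d) {
      assert(idx[d] < dims_[d]);
      offset = offset * dims_[d] + idx[d];
    }
    return data_[offset];
  }

  std::size_t dimension(std::size_t d) const { return dims_[d]; }

 private:
  T* data_;  // not owned; the caller keeps the buffer alive
  std::array<std::size_t, NDIMS> dims_;
};

// Alias bundle in the spirit of TTypes: one place that names the common views.
template <typename T, std::size_t NDIMS = 1>
struct MiniTypes {
  typedef RowMajorView<T, NDIMS> Tensor;
  typedef RowMajorView<const T, NDIMS> ConstTensor;
  typedef RowMajorView<T, 1> Vec;
  typedef RowMajorView<T, 2> Matrix;
};
```

Because the view only stores a pointer and the dimension sizes, a MiniTypes<float>::Matrix and a MiniTypes<float>::Vec built over the same buffer read and write the same elements; no data is copied.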

tensor.h contains conversion functions such as flat (flattening), vec (view as a vector), and matrix (view as a matrix). Part of the source is shown below:

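  // Member function templates of class Tensor (excerpt).

  // View a rank-1 tensor as an Eigen vector.
  template <typename T>
  typename TTypes<T>::Vec vec() {
    return tensor<T, 1>();
  }

  // View a rank-2 tensor as an Eigen matrix.
  template <typename T>
  typename TTypes<T>::Matrix matrix() {
    return tensor<T, 2>();
  }

  // View the tensor as an Eigen tensor of rank NDIMS.
  template <typename T, size_t NDIMS>
  typename TTypes<T, NDIMS>::Tensor tensor();

  // View all elements as a single rank-1 tensor, whatever the original shape.
  template <typename T>
  typename TTypes<T>::Flat flat() {
    return shaped<T, 1>({NumElements()});
  }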
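
The pattern above, a tensor whose shape is a run-time value handing out fixed-rank views over its buffer, can be sketched as follows. FlatView, MatrixView, and this MiniTensor are hypothetical types made up for the example; the real functions return Eigen::TensorMap types and do considerably more checking.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal view types; real TensorFlow returns Eigen::TensorMap.
template <typename T>
struct FlatView {
  T* data;
  std::size_t size;
  T& operator()(std::size_t i) const { return data[i]; }
};

template <typename T>
struct MatrixView {
  T* data;
  std::size_t rows, cols;
  // Row-major indexing into the shared buffer.
  T& operator()(std::size_t r, std::size_t c) const {
    return data[r * cols + c];
  }
};

// Hypothetical tensor whose rank is only known at run time.
class MiniTensor {
 public:
  explicit MiniTensor(std::vector<std::size_t> dims) : dims_(std::move(dims)) {
    std::size_t n = 1;
    for (std::size_t d : dims_) n *= d;
    buf_.assign(n, 0.0f);
  }

  std::size_t NumElements() const { return buf_.size(); }

  // flat(): view all elements as one vector, whatever the rank is.
  FlatView<float> flat() { return {buf_.data(), buf_.size()}; }

  // matrix(): only valid for rank-2 tensors, checked at run time.
  MatrixView<float> matrix() {
    assert(dims_.size() == 2);
    return {buf_.data(), dims_[0], dims_[1]};
  }

 private:
  std::vector<std::size_t> dims_;
  std::vector<float> buf_;
};
```

A write through matrix() is visible through flat() because both views point at the same buffer, which is the property the kernels rely on.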

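Calling these functions converts a tensor into a type Eigen supports, after which Eigen's API can be used directly for computation. Matrix multiplication, binary operations, unary operations, and so on are all implemented by calling Eigen. So once you know Eigen's basic data types, this part of the tensor source is easy to read, and once you know the Eigen API for operating on those types, the kernels code is too.

Note: why not use Eigen::Tensor directly? One reason is that Eigen::Tensor only provides the basic math operations, while a computation often needs some amount of preprocessing first. See the BinaryOp class in cwise_ops_common.h and the ReductionOp class in reduction_ops_common.h: both kernels preprocess before computing. Without wrapping Eigen::Tensor in TensorFlow's own Tensor, the code would need many more Eigen API calls and be harder to follow; with the wrapper, a kernel only calls Eigen's computation APIs and does not have to deal with querying properties. Another reason is that Tensor has properties and methods that are awkward to implement with the Eigen::Tensor API. In addition, the rank of an Eigen Tensor is a template parameter when it is declared, so it must be a compile-time constant, which is inconvenient in many situations. In short, the extra wrapper layer gives the code clearer layering and makes it easier to read.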
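
The split between preprocessing and computation described in the note might look roughly like this. It is a simplified sketch, not the actual BinaryOp code: MiniTensor, AddFlat, and AddKernel are hypothetical names, broadcasting is left out, and a plain loop stands in for the Eigen expression a real kernel would evaluate.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper type: the shape is a run-time value, not a template
// parameter.
struct MiniTensor {
  std::vector<std::size_t> dims;
  std::vector<float> buf;
};

// The pure computation step; in TensorFlow this is where an Eigen expression
// would run.
inline void AddFlat(const float* a, const float* b, float* out,
                    std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

// Kernel-style entry point: validate and prepare first, then compute.
inline MiniTensor AddKernel(const MiniTensor& a, const MiniTensor& b) {
  // Preprocessing: check that the shapes agree (no broadcasting here).
  if (a.dims != b.dims) throw std::invalid_argument("shape mismatch");
  assert(a.buf.size() == b.buf.size());
  // Preprocessing: allocate an output tensor with the same shape.
  MiniTensor out{a.dims, std::vector<float>(a.buf.size())};
  // Computation on the flattened buffers.
  AddFlat(a.buf.data(), b.buf.data(), out.buf.data(), out.buf.size());
  return out;
}
```

Keeping the shape checks and allocation in the wrapper layer means the computation step only ever sees flat buffers of a known length.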
