Data preprocessing in MATLAB and sklearn

Data preprocessing (normalize, scale)

0. Dimensionality reduction with PCA

  • MATLAB:

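    [coeff, score] = pca(A);          % rows of A are observations; pca centers A by default
    reducedDimension = coeff(:,1:5);  % first 5 principal directions
    reducedData = (A - mean(A, 1)) * reducedDimension;  % same as score(:,1:5)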
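
For comparison, a minimal sklearn sketch of the same reduction to 5 components. The toy matrix A and the random seed are made up for illustration. It also checks that the projection is taken on centered data, which is what sklearn's PCA does when it transforms.

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data (illustrative only): 100 samples, 8 features, non-zero mean
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 8)) + 3.0

pca = PCA(n_components=5).fit(A)
reduced = pca.transform(A)  # shape (100, 5)

# transform() projects the data after subtracting the fitted mean
manual = (A - pca.mean_) @ pca.components_.T

# Projecting the raw, uncentered data gives a different result
uncentered = A @ pca.components_.T
```

Here reduced and manual agree up to floating-point error, while uncentered is shifted by the projected mean.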

1. Min-max mapping (MATLAB)

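[trainx, s1] = mapminmax(trainx);       % scales each row to [-1, 1]; s1 stores the training settings
testx = mapminmax('apply', testx, s1);  % reuse the training settings on the test set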
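
A rough sklearn counterpart, sketched with MinMaxScaler. The arrays are toy values chosen for illustration. Note the orientation difference: mapminmax scales each row, whereas sklearn scalers treat each column as a feature, so a MATLAB matrix with features in rows would need transposing first.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data (illustrative only): rows are samples, columns are features
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[2.5, 40.0]])

# feature_range=(-1, 1) mirrors the default output range of mapminmax
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # uses the training min and max

print(X_train_scaled)  # each column spans [-1, 1]
print(X_test_scaled)   # [[0.5 2. ]]
```

Because the test set is mapped with the training minimum and maximum, test values outside the training range fall outside [-1, 1], as the second feature does here.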

2. sklearn.preprocessing

When removing the mean, the mean subtracted from the test set at prediction time is the one computed on the training set:

import sklearn.preprocessing as prep

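def standard_scale(X_train, X_test):
    # Fit on the training set only, so the mean and standard deviation come from X_train.
    preprocessor = prep.StandardScaler().fit(X_train)
    # Apply the same training statistics to both sets.
    X_train = preprocessor.transform(X_train)
    X_test = preprocessor.transform(X_test)
    return X_train, X_test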
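
A short usage sketch on toy arrays (the values are made up for illustration). It shows that the scaled training set has zero mean and unit variance, while the test set, scaled with the training statistics, generally does not.

```python
import numpy as np
import sklearn.preprocessing as prep

def standard_scale(X_train, X_test):
    # Fit on the training set, then transform both sets with the same statistics
    preprocessor = prep.StandardScaler().fit(X_train)
    return preprocessor.transform(X_train), preprocessor.transform(X_test)

# Toy data (illustrative only): one feature, training mean 2
X_train = np.array([[0.0], [2.0], [4.0]])
X_test = np.array([[2.0], [6.0]])

X_train_s, X_test_s = standard_scale(X_train, X_test)

print(X_train_s.mean(axis=0), X_train_s.std(axis=0))  # approximately 0 and 1
print(X_test_s.mean(axis=0))                          # not 0 in general
```

The test sample equal to the training mean (2.0) maps to 0, which confirms that the training mean is the one being subtracted.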