python - Results differ whether using a list or a numpy array in scikit-learn

Translated from: https://stackoverflow.com/questions/17674523 2013-07-16T10:46:34.967

Viewed 904 times

I have a dataset, data, and an array of labels, target, from which I build a supervised model in scikit-learn using the k-Nearest Neighbors algorithm.

from sklearn.neighbors import KNeighborsClassifier

neigh = KNeighborsClassifier()
neigh.fit(data, target)

I can now classify my training set with this same model. To get the classification score:

neigh.score(data, target)
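For readers who want to reproduce the setup, here is a minimal sketch of the fit and score steps above. The toy points and labels are hypothetical (the question does not show the real data), and n_neighbors=1 is chosen only so the outcome is easy to reason about. score reports the mean accuracy of the predictions against the labels passed in.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: four 2-D points in two well-separated groups.
data = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]]
target = ["a", "a", "b", "b"]  # plain Python list of string labels

# n_neighbors=1 is an assumption made for this sketch; the question
# uses the default constructor.
neigh = KNeighborsClassifier(n_neighbors=1)
neigh.fit(data, target)

# Predict the whole training set in one call, then score it.
predictions = neigh.predict(data)
score = neigh.score(data, target)
print(predictions, score)
```

With a single neighbor and no duplicate points, every training point is its own nearest neighbor, so the score on the training set is perfect here. The question's 0.68 comes from the default settings on real data.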

Now my problem is that this score depends on the type of the target object.

If it is a Python list, that is, created using list() and filled in with target.append(), the score method returns 0.68.
If it is a NumPy array, created using target = np.empty(shape=(length,1), dtype="S36") (it contains only 36-character strings) and filled in with target[k] = value, the score method returns 0.008.
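One possible cause, offered as an assumption that the source does not confirm: the two targets differ in shape, not only in type. A list of n labels converts to shape (n,), while the array above has shape (n, 1). If the score is computed as the mean of an elementwise comparison between predictions and target, NumPy broadcasting would turn the second case into an n-by-n comparison. The sketch below shows that effect with NumPy alone. The labels and predictions are hypothetical, and dtype "U36" stands in for the question's "S36" so that the comparison works with Python 3 strings.

```python
import numpy as np

# Hypothetical labels and predictions, for illustration only.
labels = ["cat", "dog", "cat", "bird"]
predicted = np.array(["cat", "dog", "dog", "bird"])  # shape (4,)

# Case 1: labels held in a list, which converts to a 1-D array.
target_flat = np.array(labels)                       # shape (4,)

# Case 2: labels stored the way the question describes, as a column.
target_column = np.empty(shape=(4, 1), dtype="U36")
for k, value in enumerate(labels):
    target_column[k] = value                         # shape stays (4, 1)

# Elementwise comparison: one result per sample.
flat_match = predicted == target_flat                # shape (4,)

# Broadcasting: every prediction is compared with every label.
column_match = predicted == target_column            # shape (4, 4)

flat_score = flat_match.mean()      # fraction of correct predictions
column_score = column_match.mean()  # fraction of matching (label, prediction) pairs
print(flat_match.shape, flat_score)
print(column_match.shape, column_score)
```

If this is what happens, flattening the array (for example with target.ravel()) or allocating it with shape=(length,) would make both cases comparable. Whether a given scikit-learn version computes the score this way is not something the question establishes.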

To check whether the predictions really differed, I created text files listing the results of

for k in data:
    neigh.predict(k)

in each case. The results were the same.

What could explain the difference in score?
