[Clarification Chapter 3] FutureWarning: elementwise comparison failed; returning scalar instead #583

vedanthv · 2022-08-03T04:28:54Z

I got this warning while running the Binary Classifier (5 Detector) code from Chapter 3, specifically when creating the "5 vs. not 5" target vectors for the train and test sets.

y_train_5 = (y_train == 5)
y_test_5 = (y_test == 5)

This warning prevents me from running the SGDClassifier in the next code block of the book/Jupyter notebook, because y is not a 1D array.

I also noticed that the same warning is still open as an issue in the NumPy and pandas repositories.

I'm using the versions mentioned in the readme of this repository.

Any help regarding this is appreciated. If a similar issue exists, please leave a comment and I'll close this.

Thanks!


ian-coccimiglio · 2022-08-08T06:23:50Z

Hi Vedanthv,

I noticed this too. The problem seems to occur because "y_train" is created with dtype "object" and holds strings. The condition "y_train == 5" then compares those strings with an integer. They are never equal, so every element comes back False. Here, we can see that the first element ought to be True.

>>> X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
>>> y_train
array(['5', '0', '4', ..., '5', '6', '8'], dtype=object)

>>> y_train == 5
array([False, False, False, ..., False, False, False])
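The same behaviour can be reproduced without downloading MNIST. The following is a minimal sketch on a tiny stand-in array (the name `y_demo` and its four values are made up for illustration); it only shows that comparing string labels with an integer never matches, while comparing after a cast does.

```python
import numpy as np

# Tiny stand-in for the MNIST labels: strings stored in an object array
# (y_demo is a hypothetical name, not from the notebook).
y_demo = np.array(['5', '0', '4', '5'], dtype=object)

# Each element is a str, and '5' == 5 is False in Python,
# so no label matches the integer 5.
as_loaded = (y_demo == 5)
print(as_loaded)   # [False False False False]

# Comparing against the string '5' matches the expected elements.
as_str = (y_demo == '5')
print(as_str)      # [ True False False  True]

# Casting the labels to integers first makes the original comparison work.
as_int = (y_demo.astype(np.uint8) == 5)
print(as_int)      # [ True False False  True]
```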

My solution was to cast y_train to integers. In the last step I also reshape some_digit, because predict() expects a 2D array of shape (n_samples, n_features).

>>> X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
>>> y_train = y_train.astype(int)
>>> y_train
array([5, 0, 4, ..., 5, 6, 8])

>>> y_train == 5
array([ True, False, False, ...,  True, False, False])

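>>> from sklearn.linear_model import SGDClassifier
>>> y_train_5 = (y_train == 5)  # rebuild the boolean target after the cast
>>> sgd_clf = SGDClassifier(max_iter=1000, tol=1e-3, random_state=42)
>>> sgd_clf.fit(X_train, y_train_5)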
>>> sgd_clf.predict(some_digit.reshape(1,-1))
array([ True])
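For anyone who wants to check the shapes involved without loading MNIST, here is a hedged sketch on synthetic data. The arrays `X_demo` and `y_demo` are invented for illustration (random features, so the predictions themselves are meaningless); the point is only that the target passed to fit() is a 1D boolean array and that a single sample has to be reshaped to 2D before predict().

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(42)

# Synthetic stand-in for MNIST: 200 samples, 4 features, string labels '0'..'9'.
X_demo = rng.rand(200, 4)
y_demo = np.array([str(i % 10) for i in range(200)], dtype=object)

# Cast the string labels to integers, then build the boolean target.
y_demo = y_demo.astype(np.uint8)
y_demo_5 = (y_demo == 5)
print(y_demo_5.shape, y_demo_5.dtype)   # (200,) bool

clf = SGDClassifier(max_iter=1000, tol=1e-3, random_state=42)
clf.fit(X_demo, y_demo_5)

# A single sample is 1D; predict() wants shape (n_samples, n_features).
one_sample = X_demo[0]
pred = clf.predict(one_sample.reshape(1, -1))
print(pred.shape)                       # (1,)
```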

vedanthv · 2022-08-08T07:33:06Z

Hi Ian,
Thanks for the clarification! This fixed the problem.

ageron · 2022-09-26T01:20:47Z

Thanks for your question @vedanthv, and thanks for the solution @ian-coccimiglio!
It's indeed important to cast the labels to integers. The book includes this line at the bottom of page 86: y = y.astype(np.uint8).
Also, since the book was published, fetch_openml() changed: it used to return NumPy arrays, but now it returns Pandas DataFrames. This breaks some of the code in the notebooks. Luckily there's an easy fix: just set as_frame=False when calling fetch_openml() and everything should work fine.
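Put together, the two fixes above could look like the sketch below. The helper names `load_mnist_arrays` and `to_int_labels` are hypothetical, not from the book or the notebooks, and the dataset name "mnist_784" with version=1 is assumed to be the one used in Chapter 3. Calling `load_mnist_arrays()` downloads the dataset, so the import sits inside the function.

```python
import numpy as np


def to_int_labels(y):
    """Cast string labels such as '5' to unsigned 8-bit integers."""
    return np.asarray(y).astype(np.uint8)


def load_mnist_arrays():
    """Hypothetical helper: fetch MNIST as NumPy arrays with integer labels.

    Assumes the dataset is "mnist_784", version 1. Needs network access
    the first time it is called.
    """
    from sklearn.datasets import fetch_openml

    # as_frame=False asks for NumPy arrays instead of pandas objects.
    X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
    return X, to_int_labels(y)


# Usage on a small stand-in array (no download needed):
labels = to_int_labels(np.array(['5', '0', '4'], dtype=object))
print(labels, labels.dtype)   # [5 0 4] uint8
```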

Btw, the third edition of the book will come out in October 2022, and the updated notebooks are available at https://github.com/ageron/handson-ml3

Hope this helps!
