In this notebook, we implement linear regression from scratch using NumPy and compare it with scikit-learn's linear regression model and with a closed-form ordinary least squares (OLS) solution.
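Because we use gradient descent to optimize the weights of our model, we normalize the data first: features on very different scales can make gradient descent converge slowly, or diverge, for a given learning rate.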
def prep_data(X, y):
    # Standardize X column-wise (mean=0, std=1), then shift by 1,
    # so each feature ends up with mean=1, std=1
    X = (X - X.mean(axis=0)) / X.std(axis=0) + 1
    # Normalize y the same way, so it has mean=1, std=1
    y = (y - y.mean()) / y.std() + 1
    return X, y
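As a quick sanity check, the sketch below applies the same transformation to a small made-up array and confirms the resulting statistics. The toy data and the variable names (`X_raw`, `y_raw`, `X_scaled`, `y_scaled`) are assumptions for illustration only; `prep_data` is repeated so the snippet runs on its own.

```python
import numpy as np

def prep_data(X, y):
    # Standardize each feature column, then shift by 1 (mean=1, std=1)
    X = (X - X.mean(axis=0)) / X.std(axis=0) + 1
    # Apply the same transformation to the target
    y = (y - y.mean()) / y.std() + 1
    return X, y

# Hypothetical toy data: two features on very different scales
X_raw = np.array([[1.0, 1000.0],
                  [2.0, 3000.0],
                  [3.0, 2000.0],
                  [4.0, 5000.0]])
y_raw = np.array([10.0, 20.0, 15.0, 30.0])

X_scaled, y_scaled = prep_data(X_raw, y_raw)

# Per-column mean and standard deviation after scaling
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
print(y_scaled.mean(), y_scaled.std())
```

After scaling, both features share the same mean and spread, so a single learning rate is more likely to suit every weight.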
We use gradient descent to optimize the weights of our model, with mean squared error (MSE) as the loss function.
Each gradient descent step needs the gradient of the loss function with respect to the weights and the bias.
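For reference, here is a sketch of that gradient, assuming the prediction for sample $i$ is $\hat{y}_i = \mathbf{w}^\top \mathbf{x}_i + b$. The symbols $L$, $\mathbf{w}$, $b$, and $X$ are notation chosen here, and correspond to the loss, `weights`, `bias`, and the feature matrix in the code.

$$ \begin{equation} L(\mathbf{w}, b) = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2 \end{equation} $$

$$ \begin{equation} \frac{\partial L}{\partial \mathbf{w}} = \frac{2}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)\, \mathbf{x}_i = \frac{2}{N} X^\top (\hat{\mathbf{y}} - \mathbf{y}) \end{equation} $$

$$ \begin{equation} \frac{\partial L}{\partial b} = \frac{2}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i) \end{equation} $$

The factor of 2 comes from differentiating the square, and the matrix form of the weight gradient is what a vectorized implementation would compute in a single dot product.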
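def fit(self, X, y):
    n_samples, n_features = X.shape
    # Start from zero weights and zero bias
    self.weights = np.zeros(n_features)
    self.bias = 0
    for _ in range(self.n_iters):
        # Current predictions
        z = np.dot(X, self.weights) + self.bias
        # Gradients of the MSE loss with respect to the weights and the bias
        dw = (2 / n_samples) * np.dot(X.T, (z - y))
        db = (2 / n_samples) * np.sum(z - y)
        # Step against the gradient
        self.weights -= self.learning_rate * dw
        self.bias -= self.learning_rate * db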
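We update the weights and bias using the following formulas:

$$ \begin{equation} \mathbf{w} \leftarrow \mathbf{w} - \alpha \frac{\partial L}{\partial \mathbf{w}}, \qquad b \leftarrow b - \alpha \frac{\partial L}{\partial b} \end{equation} $$

Where $L$ is the mean squared error loss and $\alpha$ is the learning rate, a hyperparameter that sets the step size (`learning_rate` in the code).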
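To see the update rule in action, here is a minimal self-contained sketch. The class name `LinearRegressionGD`, the constructor defaults, the `predict` method, and the synthetic data are all assumptions made for this example; only the body of `fit` follows the code above.

```python
import numpy as np

class LinearRegressionGD:  # hypothetical name, for illustration only
    def __init__(self, learning_rate=0.1, n_iters=1000):
        # Assumed defaults; tune them for real data
        self.learning_rate = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        for _ in range(self.n_iters):
            error = X @ self.weights + self.bias - y
            # Gradients of the MSE loss
            dw = (2 / n_samples) * (X.T @ error)
            db = (2 / n_samples) * np.sum(error)
            # Gradient descent step
            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db
        return self

    def predict(self, X):
        return X @ self.weights + self.bias

# Synthetic, noise-free data with known coefficients
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 0.5

model = LinearRegressionGD(learning_rate=0.1, n_iters=1000).fit(X_demo, y_demo)
print(model.weights, model.bias)
```

Because the synthetic target has no noise and the features are already on a unit scale, the fitted weights and bias should land very close to the coefficients used to generate the data.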
We use the coefficient of determination, $R^2$, to score our model:
$$ \begin{equation} R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \end{equation} $$
def score(self, X, y):  # R^2 score
    y_pred = self.predict(X)
    u = ((y - y_pred) ** 2).sum()    # residual sum of squares
    v = ((y - y.mean()) ** 2).sum()  # total sum of squares
    return 1 - u / (v + 1e-10)  # Add small constant to avoid division by zero
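A few reference points help when reading an $R^2$ value. The sketch below uses a standalone function, hypothetically named `r2`, that mirrors the `score` method above, and evaluates it on made-up targets: a perfect prediction, a prediction that always returns the mean, and a prediction that is worse than the mean.

```python
import numpy as np

def r2(y_true, y_pred, eps=1e-10):
    # Standalone version of the score computation (hypothetical helper)
    u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
    v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return 1 - u / (v + eps)

y_true = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r2(y_true, y_true)                                # exact predictions
baseline = r2(y_true, np.full_like(y_true, y_true.mean()))  # always predict the mean
worse = r2(y_true, y_true[::-1])                            # reversed targets

print(perfect, baseline, worse)
```

A perfect fit scores 1, always predicting the mean scores approximately 0, and a model that does worse than the mean gets a negative score, so $R^2$ is not bounded below by 0.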
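For the OLS model, we prepare the data exactly as we did for our gradient descent model. For the fit function, however, we use the normal equation to calculate the weights in closed form:

$$ \begin{equation} \hat{\boldsymbol{\theta}} = (X_b^\top X_b)^{-1} X_b^\top \mathbf{y} \end{equation} $$

Here $X_b$ is $X$ with a leading column of ones, and $\hat{\boldsymbol{\theta}}$ stacks the bias followed by the weights.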
def fit(self, X, y):
    # Add a column of ones to X for the bias term
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    X_with_bias = np.column_stack([np.ones(X.shape[0]), X])
    # Compute the coefficients using the normal equation
    coeffs = np.linalg.inv(X_with_bias.T @ X_with_bias) @ X_with_bias.T @ y
    # Extract bias and weights
    self.bias = coeffs[0]
    self.weights = coeffs[1:]
We have to add a column of ones to X for the bias term because our X contains only the features. Then we use the normal equation to calculate the weights and bias in a single step.
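Explicitly inverting $X^\top X$ works for small, well-conditioned problems, but it can lose accuracy when features are nearly collinear. As an alternative sketch (not what the model above uses), `np.linalg.lstsq` solves the same least squares problem without forming the inverse. The synthetic data and variable names below are assumptions for illustration.

```python
import numpy as np

# Synthetic data with known coefficients plus a little noise
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(100, 3))
y_demo = X_demo @ np.array([1.5, -0.5, 2.0]) + 4.0 + rng.normal(scale=0.1, size=100)

# Prepend a column of ones so the first coefficient acts as the bias
X_with_bias = np.column_stack([np.ones(X_demo.shape[0]), X_demo])

# Normal equation with an explicit inverse, as in the fit method above
coeffs_inv = np.linalg.inv(X_with_bias.T @ X_with_bias) @ X_with_bias.T @ y_demo

# Least squares solver; rcond=None selects NumPy's default cutoff
coeffs_lstsq, *_ = np.linalg.lstsq(X_with_bias, y_demo, rcond=None)

print(coeffs_inv)
print(coeffs_lstsq)
```

On well-conditioned data like this, the two approaches should agree to within floating-point tolerance; the difference matters mainly when $X^\top X$ is close to singular.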
We score this model with an $R^2$ score, just as we did for our own model.
After implementing our linear regression model from scratch, the final $R^2$ scores are:
Scikit-learn's R^2 score: 0.49656835105076846
Our Model: 0.4965683510508899
OLS Model: 0.49656835105076835