Simple Linear Regression in Python – Step 4: Predicting Results with the Simple Linear Regression Model

In the previous section, we created a variable named ‘regressor’, which learned the mathematical relationship between our x variable and y variable.

We can now predict the results for the testing set using ‘regressor’.
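
To make the "mathematical relationship" concrete: in simple linear regression the fitted model is a straight line, y = intercept + slope * x, and predicting means plugging new x values into that line. The sketch below fits such a line by ordinary least squares in plain Python. The function names `fit_line` and `predict` and the training numbers are made up for illustration; they are not taken from Data.csv or from scikit-learn.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, xs):
    """Apply the fitted line to new x values."""
    return [intercept + slope * x for x in xs]


# Hypothetical training data lying exactly on y = 2x + 1
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(x_train, y_train)
predictions = predict(intercept, slope, [5.0, 6.0])
print(intercept, slope, predictions)
```

With scikit-learn, the corresponding fitted values are available after `fit` as `regressor.intercept_` and `regressor.coef_`, and `regressor.predict` performs the plugging-in step.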

Predicting Results with the Simple Linear Regression Model

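#Import Libraries

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

#Import data
dataset = pd.read_csv('Data.csv')
x = dataset.iloc[:, :-1].values  # all columns except the last, kept 2-D as scikit-learn expects
y = dataset.iloc[:, 1].values    # second column: the dependent variable

#Splitting training set and testing set
# train_test_split lives in sklearn.model_selection
# (the old sklearn.cross_validation module was removed in scikit-learn 0.20)
from sklearn.model_selection import train_test_split
# No random_state is set, so the split (and therefore the predictions) changes from run to run
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.25)

#Training and Fitting model
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(xtrain, ytrain)

#Predicting using the Model

y_prediction = regressor.predict(xtest)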

  • In the section where we split the data into a training set and a testing set, we created the xtest and ytest variables.
  • ytest holds the TRUE results that we observed for the xtest values of the independent variable.
  • With ‘regressor’ from the previous section, we have drawn the mathematical relationship between xtrain and ytrain. Now we are going to see whether that relationship can help us predict the values of ytest using xtest.
  • We assign the predicted values to y_prediction.
  • If the Simple Linear Regression Model is good, then the y_prediction values should be close to the ytest values.
  • The table below shows the predicted values. Each predicted value of Y is within about 10% of the true value of Y, which is a good result.
xtest    ytest (True Value)    y_prediction (Predicted Value)
4.21     64,194.70             65,744.20
8.71     103,234.62            109,691.96
4.19     60,560.02             65,548.88
7.11     101,400.67            94,066.09
2.07     48,593.80             44,844.60
2.46     44,312.45             48,653.41
3.80     64,651.30             61,740.07
4.53     69,147.45             68,869.38
6.63     82,687.99             89,378.33
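
One way to put a number on "close" is to compute the prediction errors directly. The sketch below copies the nine rows from the table above into plain Python lists and computes the absolute error per row, the mean absolute error, and the largest miss. The variable names `abs_errors`, `mae`, `max_abs_error`, and `max_rel_error` are chosen here for illustration and do not appear in the original code.

```python
# Values copied from the table above
y_test = [64194.70, 103234.62, 60560.02, 101400.67, 48593.80,
          44312.45, 64651.30, 69147.45, 82687.99]
y_prediction = [65744.20, 109691.96, 65548.88, 94066.09, 44844.60,
                48653.41, 61740.07, 68869.38, 89378.33]

# Absolute difference between each prediction and the observed value
abs_errors = [abs(pred - true) for pred, true in zip(y_prediction, y_test)]

# Mean absolute error: the average size of a miss, in the units of Y
mae = sum(abs_errors) / len(abs_errors)

# Largest miss, in absolute terms and relative to the true value
max_abs_error = max(abs_errors)
max_rel_error = max(err / true for err, true in zip(abs_errors, y_test))

print(f"Mean absolute error: {mae:,.2f}")
print(f"Largest absolute error: {max_abs_error:,.2f}")
print(f"Largest relative error: {max_rel_error:.1%}")
```

When working with the arrays from the script instead of copied numbers, `sklearn.metrics.mean_absolute_error(ytest, y_prediction)` should give the same kind of summary.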

 
