Principal Component Analysis (PCA) in Python – Step 1: Import Libraries and Import Data

Start with Importing the Libraries and the Data

  • Setting up a Principal Component Analysis model is straightforward. As with any other regression or machine learning model, we start by importing the libraries and the data.

Import Libraries 

  • NumPy is the library for numerical computing on arrays. The feature matrix that we pass to a PCA model is a NumPy array.
  • Matplotlib is the plotting library. It is imported in the example below but not used in this step.
  • Pandas is the library for handling tabular data. We use Pandas to load the data for regression and machine learning models.
  • For example, we are going to import the CSV data into Python using Pandas.

Import Data 

  • In the example below, we have imported Pandas under the alias pd.
  • We import the CSV data by calling the “read_csv” function from Pandas.
  • “dataset” is the variable that stores all of our CSV data as a Pandas DataFrame.
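
To see what “read_csv” returns without needing the actual file, here is a minimal sketch that reads a small in-memory CSV. The column names and values are made up for illustration and are not taken from the real data set.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file: three feature columns and one label column.
csv_text = """feature_a,feature_b,feature_c,label
1.0,2.0,3.0,1
4.0,5.0,6.0,2
7.0,8.0,9.0,1
"""

# read_csv accepts a file path or a file-like object; StringIO plays the file here.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)   # (number of rows, number of columns)
print(sample.head())  # first rows, with column names taken from the header line
```

Checking the shape and the first rows right after loading is a quick way to confirm that the header and the column count are what you expect.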

Dependent Variables vs Independent Variables 

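  • iloc is the Pandas indexer that selects rows and columns by integer position. We use it to split the data from the CSV file into X and y.
  • X is the set of independent variables.
  • y is the dependent variable.
  • In the example below, columns 0 to 12 are the independent variables. The slice 0:13 stops before column 13.
  • Hence we take all the rows and columns 0 to 12 as “X”, and all the rows of column 13 as “y”.
  • PCA works on X alone, reducing the independent variables to a few components. y is kept so that the relationship between the reduced X and y can be examined afterwards.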
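
The split described above can be tried on a small table first. This is only a sketch: the 4-column DataFrame and its column names are hypothetical, with the last column standing in for the dependent variable.

```python
import numpy as np
import pandas as pd

# Hypothetical 4-row table: columns 0 to 2 are features, column 3 is the label.
demo = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    columns=["f0", "f1", "f2", "label"],
)

# The stop index of a slice is exclusive, so 0:3 selects columns 0, 1 and 2.
X_demo = demo.iloc[:, 0:3].values
# A single integer selects one column and yields a 1-D array.
y_demo = demo.iloc[:, 3].values

print(X_demo.shape)  # 2-D: one row per observation, one column per feature
print(y_demo.shape)  # 1-D: one value per observation
```

The same pattern scales to the real data: only the slice bounds change with the number of feature columns.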

# Import the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

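# Import Data
dataset = pd.read_csv('PCA data.csv')
# Independent variables: all rows, columns 0 to 12 (the stop index 13 is excluded)
X = dataset.iloc[:, 0:13].values
# Dependent variable: all rows, column 13
y = dataset.iloc[:, 13].values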
