## What will we cover?

In the first lesson we learnt how to load data into a **DataFrame**. In this lesson we will learn how to work with, and make calculations on, each column of the **DataFrame**. The columns are represented by a different data type, called **Series**.

Each column in a **DataFrame** is a **Series** and can be easily accessed. It is also easy to calculate new **Series** of data, similar to calculating new columns of data in an **Excel** sheet.

We will explore that and more in this lesson.

## Step 1: Load the data

We will start by importing the data (CSV file available here).

```
import pandas as pd
data = pd.read_csv("AAPL.csv", index_col=0, parse_dates=True)
```
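If you do not have the CSV file at hand, the effect of **index_col=0** and **parse_dates=True** can be tried on a tiny in-memory sample. This is only a sketch: the three rows below are made-up values, not real AAPL prices, and the names **csv_text** and **sample** are introduced here for illustration.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for AAPL.csv (values are made up).
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,100.0,102.0,99.0,101.0,100.5,1000
2020-01-03,101.0,103.0,100.0,102.0,101.5,1200
2020-01-06,102.0,104.0,101.0,103.5,103.0,900
"""

# index_col=0 uses the first CSV column (Date) as the index,
# parse_dates=True asks pandas to parse that index as dates.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(sample.index.name)   # name of the column that became the index
print(type(sample.index))  # the index type after date parsing
print(sample.head())
```

With these settings the dates end up in the index rather than in a regular column, which is what makes date-based lookups convenient later on.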

## Step 2: Explore the data and data type

In the video we explore the data to ensure it is correct. You can do that by using **data.head()**.

Then we investigate the data type of the columns of the **DataFrame** **data**.

```
data.dtypes
```

This results in the following.

```
Open         float64
High         float64
Low          float64
Close        float64
Adj Close    float64
Volume         int64
dtype: object
```

This shows that each column has one data type. Here **Open** is float64. This is one difference from Excel sheets, where each cell has its own data type. The advantage of restricting each column to a single data type is speed.
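The one-type-per-column rule can be seen on a small example. The sketch below uses a made-up **DataFrame** called **frame** and a made-up **Series** called **mixed** (both names are assumptions for illustration). When an integer and a float are placed in the same **Series**, pandas stores the whole column as float64.

```python
import pandas as pd

# Hypothetical two-column frame (values are made up).
frame = pd.DataFrame({"Close": [101.0, 102.0], "Volume": [1000, 1200]})

# One dtype per column: a float column and an integer column.
print(frame.dtypes)

# Mixing an int and a float in one Series gives a single common dtype.
mixed = pd.Series([1, 2.5])
print(mixed.dtype)  # float64
```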

The data type of **data** is **DataFrame**.

```
type(data)
```

The built-in function **type(…)** gives you the type of an object. It is handy when exploring data.

```
pandas.core.frame.DataFrame
```

Notice that it is given by a long string, **pandas.core.frame.DataFrame**, which reflects the structure of the **Pandas** library.

The data type of a column in a **DataFrame** can be found as follows.

```
type(data['Close'])
```

Here **data['Close']** gives access to the column **Close** in the **DataFrame** **data**.

```
pandas.core.series.Series
```

We see that a column is represented as a **Series**. A **Series** is similar to a **DataFrame** in that it has an index. E.g. the **Series** **data['Close']** has the same index as the **DataFrame** **data**. This is handy when you need to work with the data, as you will see in a moment.
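The shared index can be checked directly. The following is a minimal sketch on a made-up two-row **DataFrame** (the names **frame** and **close** and all values are assumptions for illustration, not the AAPL data).

```python
import pandas as pd

# Hypothetical frame with a date index (values are made up).
frame = pd.DataFrame(
    {"Open": [100.0, 101.0], "Close": [101.0, 102.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03"]),
)

close = frame["Close"]

print(type(close))                      # a pandas Series
print(close.index.equals(frame.index))  # True: same index as the frame

# Because the index is shared, a date picks the same row in both.
print(close.loc[pd.Timestamp("2020-01-03")])  # 102.0
```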

## Step 3: Calculating with Series

To keep it simple, we will start with the daily difference between open and close.

```
daily_chg = data['Open'] - data['Close']
```

This calculates a **Series daily_chg** with the opening price minus the closing price.

Please compare the data in **daily_chg** with the data in **data**.

A more advanced calculation is this one.

```
daily_pct_chg = (data['Close'] - data['Open'])/data['Open']*100
```

Here we calculate the daily percentage change. In the calculation above we have limited ourselves to using data from the same rows (same dates). Later we will learn how to do it with data from the previous day (the row above).
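To see what the two calculations produce, here is a small worked sketch on made-up prices (the **DataFrame** **frame** and its values are assumptions chosen so the arithmetic is easy to check by hand). Each result is computed row by row, matching rows by their index.

```python
import pandas as pd

# Hypothetical prices for two days (values are made up).
frame = pd.DataFrame(
    {"Open": [100.0, 200.0], "Close": [125.0, 150.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03"]),
)

# Opening price minus closing price, row by row.
daily_chg = frame["Open"] - frame["Close"]
print(daily_chg.tolist())      # [-25.0, 50.0]

# Percentage change from open to close on the same day.
daily_pct_chg = (frame["Close"] - frame["Open"]) / frame["Open"] * 100
print(daily_pct_chg.tolist())  # [25.0, -25.0]
```

Note the sign: **daily_chg** is negative on a day where the price rose, because it subtracts the close from the open, while **daily_pct_chg** is positive on that same day.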

## Step 4: Normalize stock data

Now we will normalize the data by using **iloc**, which we learned about in the previous lesson.

```
norm = data['Close']/data['Close'].iloc[0]
```

The above statement calculates a **Series norm** where the **Close** price is normalized by dividing by the first available **Close** price, accessed by using **iloc[0]**.

As a result, **norm.iloc[0]** will be **1.0000**, and **norm.iloc[-1]** shows the growth factor of this particular stock if it was bought on day 1 (index 0) and sold on the day of the last index (index -1). In the case of the video it is **1.839521**, i.e. an increase of about 84%.
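The normalization can be checked by hand on a short made-up price series. This is only a sketch: the **Series** **close**, its values, and the name **total_return_pct** are assumptions for illustration.

```python
import pandas as pd

# Hypothetical closing prices for three days (values are made up).
close = pd.Series(
    [50.0, 75.0, 100.0],
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
    name="Close",
)

# Divide every price by the first price.
norm = close / close.iloc[0]

print(norm.iloc[0])   # 1.0, the starting point
print(norm.iloc[-1])  # 2.0, the price doubled over the period

# Turn the final growth factor into a percentage return.
total_return_pct = (norm.iloc[-1] - 1) * 100
print(total_return_pct)  # 100.0
```

Normalizing this way makes stocks with very different price levels comparable, since every normalized series starts at 1.0.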

## Next step?

Want to learn more?

This is part of the **FREE** online course on my page. No signup required and 2 hours of free video content with code and **Jupyter** **Notebooks** available on **GitHub**.

Follow the link and read more.