# DataFrame Columns and Series for Financial Analysis

## What will we cover?

In the first lesson we learnt how to load data into a DataFrame. In this lesson we will work with the individual columns of the DataFrame and make calculations on them.

Each column in a DataFrame is represented by a data type called Series and can be easily accessed. It is also easy to calculate a new Series from existing ones, much like calculating new columns in an Excel sheet.

We will explore that and more in this lesson.

## Step 1: Load the data

We will start by importing the data (CSV file available here).

```python
import pandas as pd
```
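The loading step itself is not shown above, so here is a minimal sketch of how it might look. The sample values and the Date index column are assumptions made for illustration. With the real CSV file you would pass its path to pd.read_csv instead of the in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file; the values are made up.
csv_text = """Date,Open,High,Low,Close,Volume
2020-01-02,10.0,11.0,9.5,10.5,1000
2020-01-03,10.5,12.0,10.0,11.5,1500
2020-01-06,11.5,12.5,11.0,12.0,1200
"""

# Assumption: the first column holds dates and is used as the index.
data = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print(data.head())
```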

## Step 2: Explore the data and data type

In the video we explore the data to ensure it is correct. You can do that by using data.head().

Then we investigate the data type of the columns of the DataFrame data.

```python
data.dtypes
```

Which results in the following.

```
Open         float64
High         float64
Low          float64
Close        float64
Volume         int64
dtype: object
```

This shows that each column has one data type. Here Open is float64. This is one difference from Excel sheets, where each cell has its own data type. The advantage of restricting each column to a single data type is speed.
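To see the one-type-per-column rule on a small scale, the sketch below builds a tiny DataFrame from made-up values (an assumption, not the lesson's data) and inspects the data type of each column.

```python
import pandas as pd

# Made-up values for illustration only.
data = pd.DataFrame({"Open": [10.0, 10.5], "Volume": [1000, 1500]})

# dtypes is itself a Series with one entry per column.
print(data.dtypes)

# A single column reports its one data type directly.
print(data["Open"].dtype)
```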

The data type of data is DataFrame.

```python
type(data)
```

The built-in function type(…) returns the type of an object. It is handy when exploring data.

```
pandas.core.frame.DataFrame
```

Notice that the type is given by the long name pandas.core.frame.DataFrame, which reflects the structure of the pandas library.

The data type of a column in a DataFrame can be found as follows.

```python
type(data['Close'])
```

```
pandas.core.series.Series
```

Here we see that a column is represented as a Series. A Series is similar to a DataFrame in that it has an index. E.g. the Series data['Close'] has the same index as the DataFrame data. This is handy when you need to work with the data, as you will see in a moment.
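The shared index can be checked directly. The following is a small sketch with made-up prices and dates (both are assumptions for illustration).

```python
import pandas as pd

# Made-up prices and dates for illustration only.
data = pd.DataFrame(
    {"Open": [10.0, 10.5, 11.5], "Close": [10.5, 11.5, 12.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)

close = data["Close"]

# The column and the DataFrame share the same index.
print(close.index.equals(data.index))

# So a date that selects a row in data also selects a value in the Series.
print(close.loc[pd.Timestamp("2020-01-03")])
```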

## Step 3: Calculating with Series

To keep it simple, we will start with the daily difference between open and close.

```python
daily_chg = data['Open'] - data['Close']
```

This calculates a Series daily_chg with the opening price minus the closing price.

Please compare the values in daily_chg with the data in data.
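One way to make that comparison is to put the new Series next to the original columns. The sketch below uses made-up prices (an assumption) and the assign method, which returns a copy and leaves data unchanged.

```python
import pandas as pd

# Made-up prices for illustration only.
data = pd.DataFrame(
    {"Open": [100.0, 50.0, 20.0], "Close": [110.0, 45.0, 20.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)

daily_chg = data["Open"] - data["Close"]

# assign returns a copy with the extra column; data itself is not modified.
comparison = data.assign(daily_chg=daily_chg)
print(comparison)
```

Because daily_chg has the same index as data, each value lands on the row of the date it was calculated from.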

A more advanced calculation is this one.

```python
daily_pct_chg = (data['Close'] - data['Open'])/data['Open']*100
```

Here we calculate the daily percentage change. In the calculation above we have limited ourselves to data on the same row (same date). Later we will learn how to use data from the previous day (the row above).
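As a worked example of the percentage calculation, the sketch below applies the same formula to a few made-up rows (the prices are assumptions chosen to give round results).

```python
import pandas as pd

# Made-up prices for illustration only.
data = pd.DataFrame(
    {"Open": [100.0, 50.0, 20.0], "Close": [110.0, 45.0, 20.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)

# Row by row: (Close - Open) / Open * 100
daily_pct_chg = (data["Close"] - data["Open"]) / data["Open"] * 100

print(daily_pct_chg.round(2))
```

A positive value means the price closed above the open that day, a negative value means it closed below, and zero means it closed unchanged.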

## Step 4: Normalize stock data

Now we will normalize the data by using iloc, which we learned about in the previous lesson.

```python
norm = data['Close']/data['Close'].iloc[0]
```

The above statement calculates a Series norm where the Close price is normalized by dividing by the first available Close price, accessed by using iloc[0].

As a result, norm.iloc[0] will be 1.0000, and norm.iloc[-1] shows the return of this particular stock if it was bought on day 1 (index 0) and sold on the day of the last index (index -1). In the case of the video this is 1.839521, i.e. a gain of about 84%.
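The normalization can be tried out on a short made-up price series (the values are assumptions, not the data from the video).

```python
import pandas as pd

# Made-up closing prices for illustration only.
close = pd.Series(
    [50.0, 55.0, 60.0, 75.0],
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]),
)

# Divide every price by the first price.
norm = close / close.iloc[0]

print(norm.iloc[0])   # 1.0
print(norm.iloc[-1])  # 1.5

# The last normalized value expressed as a percentage gain.
total_return_pct = (norm.iloc[-1] - 1) * 100
print(total_return_pct)  # 50.0
```

In this sketch a price that goes from 50 to 75 ends at 1.5 times its starting value, which corresponds to a 50% gain over the period.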