Loading Data and Visualisation with Pandas
Pandas has other magic powers. Two of these that are used often are its functions to load and visualise data.
As usual, we load the libraries. There are two new lines to explain:
- we import matplotlib.pyplot to get access to the plotting functions in matplotlib, another library that we will get into later
- we then use a magic function, %matplotlib inline, so that the charts appear inside the notebook. Otherwise they would open in a new window
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

Now we use NumPy to generate 20 samples from a normal distribution with a mean of 10 and a standard deviation of 10, save them as a Pandas Series, and plot it. We also call plt.legend so that a legend appears in the chart.
s = pd.Series(np.random.normal(10,10, 20))
s.plot(style='r--', alpha=0.6, label='Pandas Plot')
plt.legend()
plt.show()

Next, we create a DataFrame from a dictionary of lists.
data = {'Employment': [10, 20, 30, 40, 50],
        'Income': [100, 50, 200, 300, 20]}
df = pd.DataFrame(data, index=['2000', '2001', '2002', '2003', '2004'])

Plotting bars (instead of lines) is super simple. Just set kind to 'bar'.
df.plot(kind='bar', subplots=True, sharex=True)
plt.tight_layout()
plt.show()

Now we use Pandas to load an external CSV file. Easy peasy.
SNP = pd.read_csv('SNP.csv')

And plot the data.
SNP.plot(style='r--', alpha=0.6, label='Pandas Plot')
plt.legend()
plt.show()

The Jupyter notebook with the code, and some other useful tips, is here
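
If you don't have SNP.csv to hand, the load-and-plot steps above can be tried with a small stand-in. This is only a sketch: the column names (Date, Close) and the values are made up for illustration, and the real file's layout may well differ. It also uses two optional read_csv arguments, parse_dates and index_col, so that the dates end up on the x-axis.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for SNP.csv: the column names and values are invented,
# since the real file's contents are not shown in this post.
csv_text = """Date,Close
2020-01-01,100.0
2020-01-02,101.5
2020-01-03,99.0
2020-01-06,102.0
2020-01-07,103.5
"""

# read_csv accepts a file path or any file-like object, so StringIO works here.
# parse_dates turns the Date column into timestamps; index_col makes it the index.
prices = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# Plot a single column so that the label maps to exactly one line in the legend.
ax = prices["Close"].plot(style="r--", alpha=0.6, label="Close")
ax.legend()
plt.show()
```

With a date index, Pandas puts the dates on the x-axis instead of the row numbers you get from a plain read_csv call.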


