Pandas Practice Code

2/7/22, 11:44 AM

Pandas Practice Code - Set 2 - Jupyter Notebook

localhost:8888/notebooks/Pandas Practice Code - Set 2.ipynb

1/6

Output of In [1]:
[5, 8, 2, 11, 9]

Output of In [2]:
Empty DataFrame
Columns: []
Index: []

# A Pandas series can also be converted to a Python list using
# the tolist() method, as shown in the script below:

# Converting to List
import pandas as pd

my_series = pd.Series([5, 8, 2, 11, 9])
print(my_series.tolist())
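As a small illustration of the conversion above, the sketch below checks what tolist() hands back. The names `as_list` and `round_trip` are made up for this example and are not part of the original notebook.

```python
import pandas as pd

my_series = pd.Series([5, 8, 2, 11, 9])

# tolist() returns a plain Python list holding Python scalars
as_list = my_series.tolist()
print(type(as_list))     # <class 'list'>
print(type(as_list[0]))  # <class 'int'>

# Building a new series from the list should give back the same values
round_trip = pd.Series(as_list)
print(round_trip.equals(my_series))
```

Because the result is an ordinary list, it can be passed to any code that does not know about Pandas.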

# A Pandas dataframe is a tabular data structure that stores
# data in rows and columns. By convention, the rows
# correspond to records while the columns correspond to attributes.
# Put simply, a Pandas dataframe is a collection of series that
# share a common index.
# As with a series, there are multiple ways to create a
# Pandas dataframe.
# To create an empty dataframe, you can use the DataFrame
# class from the Pandas module, as shown below:

# empty pandas dataframe
import pandas as pd

my_df = pd.DataFrame()
print(my_df)
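To make the idea of an empty dataframe a little more concrete, here is a minimal sketch that inspects one and then fills it column by column. The column names and values are borrowed from the later examples purely for illustration.

```python
import pandas as pd

my_df = pd.DataFrame()

# An empty dataframe has no rows and no columns
print(my_df.empty)  # True
print(my_df.shape)  # (0, 0)

# Columns can be added afterwards by assignment
my_df['Subject'] = ['Mathematics', 'English']
my_df['Score'] = [85, 91]
print(my_df.shape)  # (2, 2)
```

Starting from an empty dataframe is mostly useful as a placeholder. When the data is already available, passing it directly to the constructor is usually simpler.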


Out[3]:
       Subject  Score
0  Mathematics     85
1      English     91
2      History     95

Out[4]:
       Subject  Score
0  Mathematics     98
1      History     75
2      English     68
3      Science     82
4         Arts     99

# You can create a Pandas dataframe using a list of lists. Each
# sublist in the outer list corresponds to a row in the dataframe,
# and each item within a sublist becomes an attribute value.
# To specify column headers, pass a list of names to
# the columns parameter of the DataFrame constructor.
# Here is an example of how you can create a Pandas dataframe
# using a list of lists.

# dataframe using list of lists
import pandas as pd

scores = [['Mathematics', 85], ['English', 91], ['History', 95]]
my_df = pd.DataFrame(scores, columns=['Subject', 'Score'])
my_df
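The following sketch contrasts the list-of-lists constructor with and without column headers. The names `unnamed_df` and `named_df` are hypothetical and only serve this comparison.

```python
import pandas as pd

scores = [['Mathematics', 85], ['English', 91], ['History', 95]]

# Without column headers, Pandas labels the columns 0, 1, ...
unnamed_df = pd.DataFrame(scores)
print(list(unnamed_df.columns))  # [0, 1]

# With headers, each sublist still becomes one row
named_df = pd.DataFrame(scores, columns=['Subject', 'Score'])
print(named_df.shape)              # (3, 2)
print(named_df.loc[0, 'Subject'])  # Mathematics
```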

# Similarly, you can create a Pandas dataframe using a
# dictionary. One way is to create a dictionary whose
# keys correspond to column headers and whose values are
# lists holding the values of each column.
# Here is an example for your reference:

# dataframe using dictionaries
import pandas as pd

scores = {'Subject': ["Mathematics", "History", "English", "Science", "Arts"],
          'Score': [98, 75, 68, 82, 99]}  # Score values as shown in Out[4]
my_df = pd.DataFrame(scores)
my_df
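Since each dictionary value becomes one column, two things follow that the sketch below tries to show: a single column can be pulled out as a series, and the lists must all have the same length. The shortened data and the `lengths_ok` flag are assumptions made for this example.

```python
import pandas as pd

scores = {'Subject': ['Mathematics', 'History', 'English'],
          'Score': [98, 75, 68]}
my_df = pd.DataFrame(scores)

# Selecting one column returns a Pandas series
score_column = my_df['Score']
print(type(score_column).__name__)  # Series
print(score_column.tolist())        # [98, 75, 68]

# Lists of different lengths cannot form a dataframe
try:
    pd.DataFrame({'Subject': ['Mathematics', 'History'], 'Score': [98]})
    lengths_ok = True
except ValueError as err:
    lengths_ok = False
    print('ValueError:', err)
```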


Out[5]:
       Subject  Score
0  Mathematics     85
1      History     98
2      English     76
3      Science     72
4         Arts     95

Out[6]:
       Subject  Score
0  Mathematics     85
1      History     98
2      English     76
3          NaN     72
4         Arts     95

# Another way to create a Pandas dataframe is using a list of
# dictionaries. Each dictionary corresponds to one row, and its
# keys become the column headers. Here is
# an example of how to do that.

# dataframe using list of dictionaries
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)
my_df
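The row-per-dictionary layout also works in the other direction. As a rough sketch, the to_dict() method with orient='records' turns a dataframe back into a list of dictionaries. The name `records` is chosen here only for illustration.

```python
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
]
my_df = pd.DataFrame(scores)

# One dictionary per row, keyed by column header
records = my_df.to_dict(orient='records')
print(len(records))           # 3
print(records[0]['Subject'])  # Mathematics
print(records == scores)
```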

# The dictionaries within the list used to create a Pandas
# dataframe need not have the same keys.
# For example, in the script below, the fourth dictionary in the
# list contains only one item, unlike the rest of the dictionaries in
# the list. The resulting dataframe will contain a null value (NaN)
# in place of the missing Subject, as shown in the output of the
# script (Out[6]):

# dataframe using list of dictionaries
# with null items
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)
my_df
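Once a dataframe contains null values, it is often useful to locate or replace them. The sketch below uses isna() and fillna() on a shortened version of the data. The placeholder text 'Unknown' and the names `missing_mask` and `filled_df` are assumptions for this example.

```python
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)

# isna() marks the rows where the Subject is missing
missing_mask = my_df['Subject'].isna()
print(missing_mask.tolist())  # [False, True, False]

# fillna() returns a copy with the gaps replaced
filled_df = my_df.fillna({'Subject': 'Unknown'})
print(filled_df['Subject'].tolist())  # ['Mathematics', 'Unknown', 'Arts']
```

The original dataframe is left unchanged here, since fillna() returns a new object by default.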


Out[7]:
       Subject  Score
0  Mathematics     85
1      History     98

Out[8]:
   Subject  Score
3  Science     72
4     Arts     95

# Let’s now see some of the basic operations that you can
# perform on Pandas dataframes.
# To view the first N rows of a dataframe (N defaults to 5), you can call the
# head() method, as shown in the script below:

# viewing the first rows
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)
my_df.head(2)
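A few edge cases of head() are worth a quick look. This is a minimal sketch on the same data, with variable names invented for the example. It covers a request for more rows than exist and a negative argument.

```python
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)

# Asking for more rows than exist simply returns every row
all_rows = my_df.head(10)
print(len(all_rows))  # 5

# A negative N returns everything except the last |N| rows
without_last_two = my_df.head(-2)
print(without_last_two['Subject'].tolist())  # ['Mathematics', 'History', 'English']
```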

# To view the last N rows, you can use the tail() method. Here is
# an example:

# viewing the last rows
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)
my_df.tail(2)
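Note that tail() keeps the original row labels, which is why Out[8] shows the indexes 3 and 4. The sketch below shows one possible way to renumber the result with reset_index(). The names `last_two` and `renumbered` are hypothetical.

```python
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)

# The last two rows keep their original labels
last_two = my_df.tail(2)
print(last_two.index.tolist())  # [3, 4]

# drop=True discards the old labels instead of keeping them as a column
renumbered = last_two.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1]
```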


Output of In [9]:
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   Subject  5 non-null      object
 1   Score    5 non-null      int64
dtypes: int64(1), object(1)
memory usage: 208.0+ bytes

# You can also get a summary of your Pandas dataframe using
# the info() method.

# getting dataframe info
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)
my_df.info()

# In the output of info(), you can see the number of entries in your
# Pandas dataframe, the number of columns along with their
# data types, the non-null count of each column, and the memory usage.
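One detail that is easy to miss: info() prints its summary and returns None, so the text cannot be stored by plain assignment. As a hedged sketch, the buf parameter can redirect the summary into a string buffer. The names `buffer` and `summary` are chosen only for this example.

```python
import io

import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)

# Write the summary into an in-memory buffer instead of the screen
buffer = io.StringIO()
result = my_df.info(buf=buffer)
summary = buffer.getvalue()

print(result)  # None
print('RangeIndex: 5 entries, 0 to 4' in summary)  # True
```

Capturing the text this way can be handy for logging, although the exact wording of the summary may differ between Pandas versions.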


Out[10]:
           Score
count   5.000000
mean   85.200000
std    11.388591
min    72.000000
25%    76.000000
50%    85.000000
75%    95.000000
max    98.000000

# Finally, to get summary statistics such as the count, mean,
# standard deviation, minimum, maximum, and quartiles for the
# numeric columns in your Pandas dataframe, you can use the
# describe() method, as shown in the script below:

# getting info about numeric columns
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)
my_df.describe()
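The result of describe() is itself a dataframe, so individual statistics can be looked up by label. The sketch below reads two values that appear in Out[10] and compares them with the matching series methods. The names `stats`, `mean_score`, and `std_score` are illustrative.

```python
import pandas as pd

scores = [
    {'Subject': 'Mathematics', 'Score': 85},
    {'Subject': 'History', 'Score': 98},
    {'Subject': 'English', 'Score': 76},
    {'Subject': 'Science', 'Score': 72},
    {'Subject': 'Arts', 'Score': 95},
]
my_df = pd.DataFrame(scores)

# Rows of the result are statistic names, columns are numeric columns
stats = my_df.describe()
mean_score = float(stats.loc['mean', 'Score'])
std_score = float(stats.loc['std', 'Score'])
print(round(mean_score, 1))  # 85.2
print(round(std_score, 6))   # 11.388591

# The same numbers are available directly from the column
print(round(float(my_df['Score'].mean()), 1))  # 85.2
```

The standard deviation reported here is the sample standard deviation, which divides by N - 1 rather than N.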

CYBORG TUTORIALS
