Topic 12.3: Data Frame
DataFrame is the most widely used data structure in pandas. A Series works with one-dimensional data, whereas a DataFrame works with two-dimensional data. A DataFrame has two indexes: a row index and a column index. The most common way to create a DataFrame is from a dictionary of equal-length lists, as shown below. Further, spreadsheets and text files are read into DataFrames, which makes the DataFrame a very important data structure in pandas.
'Date': ['19-March-2022', '18-March-2022', '17-March-2022', '16-March-2022']
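As a minimal sketch of the dictionary-of-lists approach described above, the following builds a small DataFrame and prints its two indexes. Only the 'Date' values come from the text; the 'Close' column and its values are hypothetical, added so that the frame has more than one column.

```python
import pandas as pd

# Each key becomes a column name; every list must have the same length.
data = {
    "Date": ["19-March-2022", "18-March-2022", "17-March-2022", "16-March-2022"],
    "Close": [101.5, 99.8, 100.2, 98.7],  # hypothetical values for illustration
}
df = pd.DataFrame(data)

print(df)
print(df.columns)  # column index
print(df.index)    # row index (a default RangeIndex when none is given)
```

If the lists differ in length, pandas raises a ValueError instead of building the frame.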
Indexing and Slicing
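A minimal sketch of indexing and slicing a DataFrame follows. The frame and its 'Close' column are hypothetical, and the selection tools used here ([], loc, iloc) are standard pandas accessors chosen for illustration, not necessarily the ones this topic originally demonstrated.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["19-March-2022", "18-March-2022", "17-March-2022", "16-March-2022"],
    "Close": [101.5, 99.8, 100.2, 98.7],  # hypothetical values
})

dates = df["Date"]              # select one column, returned as a Series
first_two = df[0:2]             # slice rows by position (rows 0 and 1)
row_by_label = df.loc[2]        # select one row by its index label
cell = df.iloc[0, 1]            # select by integer position: row 0, column 1
subset = df.loc[1:2, ["Date"]]  # label-based slice; both end labels are included

print(first_two)
print(subset)
```

Note that a positional slice such as df[0:2] excludes the end position, whereas a label-based slice with loc includes it.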
Reading data from various sources
In this section, two data files are used: ‘titles.csv’ and ‘cast.csv’. The ‘titles.csv’ file contains a list of movies with their release years, whereas the ‘cast.csv’ file has six columns, which store the movie title, release year, cast member, type (actor/actress), character and rating for the actor.
read_csv : loads the data from a CSV file.
index_col = None : no column is used as the index, i.e. the first column is data.
head() : shows only the first five rows of the DataFrame.
tail() : shows only the last five rows of the DataFrame.
If reading the file fails because of an encoding error, try passing the encoding argument to read_csv as well.
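The options above can be sketched as follows. Because the actual ‘titles.csv’ file is not reproduced here, the example reads a small in-memory stand-in with hypothetical rows. The encoding value 'latin1' is likewise an assumption about what might resolve an encoding error; the right value depends on how the file was saved.

```python
import io
import pandas as pd

# Hypothetical stand-in for 'titles.csv' (title and release year), stored as Latin-1 bytes.
raw = (
    "title,year\n"
    "Amélie,2001\nHamlet,1948\nMacbeth,2015\n"
    "Othello,1995\nKing Lear,1987\nThe Tempest,2010\n"
).encode("latin1")

# index_col=None: no column becomes the row index, so the first column stays as data.
# encoding: tells pandas how to decode the bytes of the file.
titles = pd.read_csv(io.BytesIO(raw), index_col=None, encoding="latin1")

print(titles.head())  # first five rows
print(titles.tail())  # last five rows
```

With a file on disk, the same call would take the file path in place of the in-memory buffer, e.g. pd.read_csv('titles.csv', index_col=None, encoding='latin1').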
Note: head() and tail() can be used to remind ourselves of the header and contents of the file.
These two commands show the first and last five rows of the file, respectively. The number of rows displayed can be changed by passing it as an argument, e.g. head(3).
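To illustrate changing the number of displayed rows, the sketch below passes a row count to head() and tail(). The DataFrame is a hypothetical stand-in for the contents of ‘titles.csv’.

```python
import pandas as pd

titles = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F", "G"],  # hypothetical titles
    "year": [1990, 1995, 2000, 2005, 2010, 2015, 2020],
})

first_three = titles.head(3)  # first 3 rows instead of the default 5
last_two = titles.tail(2)     # last 2 rows instead of the default 5

print(first_three)
print(last_two)
```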
Data Frame properties
dtypes: Return the dtype of each column in the DataFrame.
ndim: Return an int representing the number of axes / array dimensions.
shape: Return a tuple representing the dimensionality of the DataFrame.
size: Return an int representing the number of elements in this object.
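The properties listed above can be tried on a small frame, as in this sketch. The data is hypothetical; the attribute names are those of the pandas DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "title": ["A", "B", "C"],    # hypothetical data
    "year": [1990, 1995, 2000],
})

print(df.dtypes)  # one dtype per column
print(df.ndim)    # 2, since a DataFrame has rows and columns
print(df.shape)   # (3, 2): 3 rows and 2 columns
print(df.size)    # 6: number of rows times number of columns
```

These are attributes, not methods, so they are written without parentheses.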
info(): Gives summary information about the data (column names, non-null counts and dtypes).
describe(): Describes the data in detail with summary statistics.
isnull(): Checks for null values; chaining sum() adds them up for each column.
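A short sketch of these three calls is given below. The cars frame is a hypothetical stand-in with made-up values, and one price is deliberately left missing so that the null count is non-zero.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cars data; NaN marks a missing price.
cars = pd.DataFrame({
    "body-style": ["sedan", "hatchback", "sedan", "wagon"],
    "price": [13495.0, np.nan, 16500.0, 13950.0],
})

cars.info()                    # prints column names, non-null counts and dtypes
summary = cars.describe()      # count, mean, std, min, quartiles, max of numeric columns
missing = cars.isnull().sum()  # number of null values in each column

print(summary)
print(missing)
```

By default describe() covers only the numeric columns, so 'body-style' does not appear in the summary.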
unique(): Gives the unique values in a column.
cars["engine-type"].unique()
value_counts(): Counts how many times each value is repeated in the data.
cars["body-style"].value_counts()
print(cars["price"].max())
print(cars["price"].min())
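The calls above assume a cars DataFrame that is not defined in this section. As a runnable sketch, the example below builds a small hypothetical one with the same column names; all values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the cars data.
cars = pd.DataFrame({
    "engine-type": ["dohc", "ohc", "ohc", "ohcv", "ohc"],
    "body-style": ["convertible", "sedan", "sedan", "hatchback", "wagon"],
    "price": [13495, 13950, 17450, 16500, 15250],
})

engine_types = cars["engine-type"].unique()       # distinct values, in order of first appearance
style_counts = cars["body-style"].value_counts()  # occurrences of each value, most frequent first

print(engine_types)
print(style_counts)
print(cars["price"].max())  # highest price
print(cars["price"].min())  # lowest price
```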