5 Most Popular Pandas Functions

Here are some of the most frequently used Pandas commands, grouped by their primary functions:

1. Data Loading and Saving:

  • pd.read_csv(): Reads data from a CSV file into a DataFrame.
  • pd.read_excel(): Reads data from an Excel file into a DataFrame.
  • pd.read_json(): Reads data from a JSON file into a DataFrame.
  • df.to_csv(): Saves a DataFrame to a CSV file.
  • df.to_excel(): Saves a DataFrame to an Excel file.
  • df.to_json(): Saves a DataFrame to a JSON file.
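
As a rough illustration of the loading and saving functions above, here is a minimal round trip through CSV and JSON. The sample data and variable names are invented for this sketch, and io.StringIO stands in for a real file on disk. Note that the readers are called on the pandas module (pd), while the writers are called on the DataFrame itself.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({"name": ["Ana", "Budi"], "score": [90, 85]})

# Writers are DataFrame methods; with no path given, to_csv returns a string.
csv_text = df.to_csv(index=False)

# Readers live on the pandas module; StringIO stands in for a file on disk.
df_from_csv = pd.read_csv(io.StringIO(csv_text))

# The same round trip through JSON, using one record per row.
json_text = df.to_json(orient="records")
df_from_json = pd.read_json(io.StringIO(json_text), orient="records")

print(df_from_csv)
print(df_from_json)
```

In real use you would normally pass a file path instead, for example df.to_csv("scores.csv", index=False), where the file name is just a placeholder. Reading and writing Excel files works the same way but needs an extra engine package such as openpyxl installed.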

2. Data Inspection and Selection:

  • df.head(): Displays the first few rows of a DataFrame.
  • df.tail(): Displays the last few rows of a DataFrame.
  • df.shape: Returns the dimensions (rows and columns) of a DataFrame.
  • df.info(): Prints summary information about the DataFrame’s columns, including data types and non-null counts (which reveal missing values).
  • df.describe(): Calculates descriptive statistics for numerical columns.
  • df.columns: Returns the column labels as an Index object (use list(df.columns) for a plain list).
  • df.loc[]: Selects data by label.
  • df.iloc[]: Selects data by position.
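
The inspection and selection tools above can be sketched on a small example. The DataFrame, its index labels, and the variable names are made up for this illustration. The part that most often trips people up is the difference between label-based and position-based slicing.

```python
import pandas as pd

# Hypothetical sample data with string row labels, invented for illustration.
df = pd.DataFrame(
    {"item": ["pen", "book", "lamp"], "price": [1.5, 12.0, 30.0]},
    index=["a", "b", "c"],
)

first_two = df.head(2)            # first 2 rows (the default is 5)
last_one = df.tail(1)             # last row
n_rows, n_cols = df.shape         # shape is an attribute, not a method
column_names = list(df.columns)   # df.columns is an Index; list() converts it

by_label = df.loc["b", "item"]    # row label "b", column label "item"
by_position = df.iloc[1, 0]       # second row, first column (zero-based)

label_slice = df.loc["a":"b"]     # label slices include the end label
position_slice = df.iloc[0:2]     # position slices exclude the end position

df.info()                         # prints dtypes and non-null counts
stats = df.describe()             # summary statistics for numeric columns
print(stats)
```

Both slices return the same two rows here, but for different reasons: loc stops at the label "b" inclusively, while iloc stops before position 2.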

3. Data Cleaning and Handling Missing Values:

  • df.isnull(): Identifies missing values in a DataFrame.
  • df.fillna(): Fills missing values with specified values.
  • df.dropna(): Drops rows or columns with missing values.
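
Here is a small sketch of the three missing-value functions working together. The data and the fill values are assumptions chosen for the example, not recommendations for real datasets.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in two of the three columns.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "a": [1.0, np.nan, 3.0],
        "b": ["x", None, "z"],
    }
)

# isnull() returns a True/False mask; summing it counts gaps per column.
missing_per_column = df.isnull().sum()

# fillna() accepts a single value or a per-column mapping.
filled = df.fillna({"a": 0.0, "b": "unknown"})

# dropna() drops rows by default; axis="columns" drops columns instead.
dropped_rows = df.dropna()
dropped_cols = df.dropna(axis="columns")

print(missing_per_column)
print(filled)
```

None of these calls change df itself; each returns a new DataFrame, so assign the result if you want to keep it.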

4. Data Manipulation and Transformation:

  • df.sort_values(): Sorts a DataFrame by one or more columns.
  • df.groupby(): Groups data by one or more columns and applies aggregate functions.
  • df.apply(): Applies a function to each row or column of a DataFrame.
  • df.merge(): Combines DataFrames based on common columns (like SQL joins).
  • pd.concat(): Concatenates DataFrames along rows or columns (a top-level pandas function that takes a list of DataFrames).
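
To make the manipulation functions above more concrete, here is a short sketch using two invented tables, orders and customers. All names and values are hypothetical. Note that concatenation is done with the top-level pd.concat() function, which takes a list of DataFrames.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
orders = pd.DataFrame(
    {"customer_id": [1, 2, 1, 3], "amount": [50, 20, 30, 10]}
)
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "name": ["Ana", "Budi", "Citra"]}
)

# Sort by amount, largest first.
sorted_orders = orders.sort_values("amount", ascending=False)

# Group by customer and total the amounts.
totals = orders.groupby("customer_id")["amount"].sum()

# apply() runs a function on each column by default (axis=1 would run it per row).
value_range = orders.apply(lambda col: col.max() - col.min())

# merge() works like a SQL join; how="left" keeps every order.
merged = orders.merge(customers, on="customer_id", how="left")

# pd.concat() stacks a list of DataFrames; ignore_index renumbers the rows.
stacked = pd.concat([orders, orders], ignore_index=True)

print(totals)
print(merged)
```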

5. Data Aggregation and Summary:

  • df.sum(): Calculates the sum of the values in each column.
  • df.mean(): Calculates the mean of the values in each column.
  • df.median(): Calculates the median of the values in each column.
  • df.count(): Counts the non-null values in each column.
  • df.max(): Finds the maximum value in each column.
  • df.min(): Finds the minimum value in each column.
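
The aggregation functions above can be called on a whole DataFrame, which gives one result per column, or on a single column, which gives one number. A minimal sketch with made-up data follows; it also shows that missing values are skipped by default.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column "x" has one missing value.
df = pd.DataFrame({"x": [1.0, 2.0, np.nan, 6.0], "y": [10, 20, 30, 40]})

# On a single column, each call returns one number; NaN is skipped by default.
col_sum = df["x"].sum()
col_mean = df["x"].mean()
col_median = df["x"].median()
col_count = df["x"].count()   # counts non-null values only

# On the whole DataFrame, each call returns one value per column.
per_column = df.sum()

# agg() computes several summaries at once, one row per statistic.
summary = df.agg(["sum", "mean", "median", "count", "max", "min"])

print(per_column)
print(summary)
```

Because the missing value is skipped, the mean of x is computed over three values rather than four, which is worth keeping in mind when comparing columns with different numbers of gaps.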
