
Pandas is one of the most powerful and widely used libraries in Python for data analysis and manipulation. Whether you are a beginner stepping into data science or a professional working with large datasets, understanding Pandas is essential. This quiz is designed to test your foundational knowledge of key Pandas concepts such as DataFrames, Series, data cleaning, filtering, grouping, and merging. By attempting these questions, you can evaluate your understanding, identify knowledge gaps, and strengthen your data handling skills. It serves as a practical way to reinforce concepts that are frequently used in real-world data analysis and analytics projects.

Q1. What is Pandas primarily used for?

A. Web Development
B. Data Analysis and Manipulation
C. Game Development
D. Machine Learning Only


Q2. Which data structure is 1-dimensional in Pandas?

A. DataFrame
B. Panel
C. Series
D. Array


Q3. Which function is used to read a CSV file in Pandas?

A. pd.load_csv()
B. pd.read_csv()
C. pd.open_csv()
D. pd.import_csv()


Q4. How do you display the first 5 rows of a DataFrame?

A. df.first()
B. df.top()
C. df.head()
D. df.show()


Q5. Which method is used to get summary statistics of a DataFrame?

A. df.summary()
B. df.describe()
C. df.stats()
D. df.info()


Q6. What does df.shape return?

A. Number of rows only
B. Number of columns only
C. Tuple of rows and columns
D. Total number of elements


Q7. Which syntax can be used to select a column in a DataFrame?

A. df.column_name
B. df['column_name']
C. Both A and B
D. df.get(column_name)


Q8. What does df.isnull() do?

A. Deletes null values
B. Fills null values
C. Detects missing values
D. Replaces null values


Q9. Which function removes missing values?

A. dropna()
B. fillna()
C. remove()
D. clean()


Q10. How do you rename columns in Pandas?

A. df.rename()
B. df.columns.rename()
C. df.change()
D. df.modify()


Q11. What does df.info() provide?

A. Data summary statistics
B. Data types and non-null counts
C. Column names only
D. Shape of data


Q12. Which function sorts data in Pandas?

A. sort()
B. order()
C. sort_values()
D. arrange()


Q13. How do you filter rows based on a condition?

A. df.filter()
B. df.where()
C. df[df['col'] > value]
D. df.select()


Q14. Which method is used to group data?

A. group()
B. groupby()
C. aggregate()
D. split()


Q15. What does fillna() do?

A. Removes null values
B. Fills missing values
C. Detects null values
D. Converts data types


Q16. Which function can be used to combine two DataFrames?

A. concat()
B. join()
C. merge()
D. All of the above


Q17. What is the default axis in most Pandas operations?

A. axis=0 (rows)
B. axis=1 (columns)
C. axis=None
D. axis=2


Q18. Which method removes duplicate rows?

A. drop_duplicates()
B. remove_duplicates()
C. unique()
D. distinct()


Q19. What does df.iloc[] use?

A. Label-based indexing
B. Integer-based indexing
C. Boolean indexing
D. String indexing


Q20. Which method converts a DataFrame to a NumPy array?

A. df.to_array()
B. df.to_numpy()
C. df.convert()
D. df.numpy()

Q1. Answer: B — Data Analysis and Manipulation

Pandas is mainly used for handling structured data, cleaning datasets, and performing analysis. It is a core library in data science workflows.


Q2. Answer: C — Series

A Series is a one-dimensional labeled array, while a DataFrame is two-dimensional.
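
A minimal sketch of the difference, using made-up fruit data (the names and numbers are illustrative assumptions, not part of the quiz):

```python
import pandas as pd

# Hypothetical data, for illustration only.
prices = pd.Series([1.5, 2.0, 3.25],
                   index=["apple", "banana", "cherry"],
                   name="price")
table = pd.DataFrame({"price": [1.5, 2.0, 3.25], "stock": [10, 0, 4]},
                     index=["apple", "banana", "cherry"])

print(prices.ndim)  # 1
print(table.ndim)   # 2

# Selecting a single column of a DataFrame gives back a Series.
print(type(table["price"]).__name__)  # Series
```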


Q3. Answer: B — pd.read_csv()

This is the standard function used to load CSV files into a Pandas DataFrame.
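
As a quick sketch, the example below reads CSV text from an in-memory buffer so it runs without a file on disk. The column names and values are invented for illustration; in practice you would usually pass a file path to pd.read_csv().

```python
import io

import pandas as pd

# In-memory stand-in for a CSV file; pd.read_csv() also accepts a path string.
csv_text = "name,score\nAda,91\nLin,78\n"

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (2, 2)
print(list(df.columns))  # ['name', 'score']
```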


Q4. Answer: C — df.head()

df.head() shows the first 5 rows by default. You can pass a number like df.head(10).


Q5. Answer: B — df.describe()

df.describe() reports summary statistics such as count, mean, std, min, quartiles, and max. By default it covers numeric columns only.
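
A small sketch with made-up scores shows what comes back. The column names are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data: one numeric column and one text column.
df = pd.DataFrame({"score": [70, 80, 90], "name": ["a", "b", "c"]})

stats = df.describe()

# Only the numeric column is summarised by default.
print(list(stats.columns))         # ['score']
print(stats.loc["mean", "score"])  # 80.0
print(stats.loc["std", "score"])   # 10.0
```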


Q6. Answer: C — Tuple of rows and columns

df.shape returns (rows, columns) — useful for quickly understanding dataset size.
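
For comparison with the other answer options, here is a short sketch on a toy DataFrame (the data is illustrative). The total number of elements comes from df.size, not df.shape.

```python
import pandas as pd

# Toy data: 3 rows, 2 columns.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

rows, cols = df.shape  # the tuple unpacks directly

print(df.shape)  # (3, 2)
print(len(df))   # 3  (rows only)
print(df.size)   # 6  (rows * columns)
```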


Q7. Answer: C — Both A and B

You can access a column using:

  • df['col'] (preferred; works for any column name)
  • df.col (only if the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method)
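
The sketch below illustrates why bracket access is the safer habit. The column names are chosen purely to show the edge cases: one contains a space and one shares its name with the DataFrame.count method.

```python
import pandas as pd

# Hypothetical columns picked to trigger the edge cases.
df = pd.DataFrame({"price": [1, 2], "unit cost": [3, 4], "count": [5, 6]})

print(df["price"].tolist())      # [1, 2]
print(df.price.tolist())         # [1, 2]

# A name with a space has no attribute form; brackets still work.
print(df["unit cost"].tolist())  # [3, 4]

# df.count resolves to the DataFrame.count method, not the "count" column.
print(callable(df.count))        # True
print(df["count"].tolist())      # [5, 6]
```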

Q8. Answer: C — Detects missing values

df.isnull() returns a boolean DataFrame indicating missing values.
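
A common pattern, sketched here on made-up data, is to sum the boolean mask to count missing values per column:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", None, "z"]})

mask = df.isnull()  # same shape as df, True where a value is missing

print(mask.shape)             # (3, 2)
print(int(mask["a"].sum()))   # 1
print(int(mask["b"].sum()))   # 1
```

df.isna() is an alias and behaves the same way.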


Q9. Answer: A — dropna()

dropna() removes rows (the default) or columns (axis=1) that contain missing values.
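
A minimal sketch on toy data shows both directions, and that the original DataFrame is left untouched unless you assign the result:

```python
import numpy as np
import pandas as pd

# Toy data: column "a" has one missing value.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

rows_kept = df.dropna()        # drops the row containing NaN
cols_kept = df.dropna(axis=1)  # drops the column containing NaN

print(rows_kept.shape)          # (2, 2)
print(list(cols_kept.columns))  # ['b']
print(df.shape)                 # (3, 2)  original is unchanged
```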


Q10. Answer: A — df.rename()

Used to rename columns or index labels. It returns a new DataFrame, so assign the result:

df = df.rename(columns={'old': 'new'})

Q11. Answer: B — Data types and non-null counts

df.info() gives a quick overview including:

  • column names
  • data types
  • non-null values

Q12. Answer: C — sort_values()

Used to sort data by column values:

df.sort_values(by='col')

Q13. Answer: C — df[df['col'] > value]

This is boolean indexing, the most common way to filter rows.
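
Here is a short sketch with invented data. Note that when combining conditions, each one needs its own parentheses and you use & or | rather than Python's and / or.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"col": [5, 12, 8, 20], "tag": ["a", "b", "a", "a"]})
value = 10

# Single condition: keep rows where col is greater than value.
big = df[df["col"] > value]
print(big["col"].tolist())   # [12, 20]

# Two conditions combined with & (element-wise "and").
both = df[(df["col"] > value) & (df["tag"] == "b")]
print(both["col"].tolist())  # [12]
```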


Q14. Answer: B — groupby()

Used for grouping data and applying aggregation functions like sum, mean, count.
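
As an illustration, the sketch below groups made-up sales figures by region. The column names and numbers are assumptions for the example:

```python
import pandas as pd

# Hypothetical sales data.
df = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "sales": [100, 50, 30, 20],
})

# One aggregation per group.
totals = df.groupby("region")["sales"].sum()
print(totals["north"], totals["south"])  # 130 70

# Several aggregations at once.
summary = df.groupby("region")["sales"].agg(["sum", "mean", "count"])
print(summary.loc["north", "mean"])      # 65.0
```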


Q15. Answer: B — Fills missing values

fillna() replaces null values with a specified value.
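
Two typical fills, sketched on a toy column: a constant, and the column mean. Which one is appropriate depends on your data; this is only an illustration.

```python
import numpy as np
import pandas as pd

# Toy column with one missing value.
df = pd.DataFrame({"qty": [1.0, np.nan, 3.0]})

filled_zero = df["qty"].fillna(0)
filled_mean = df["qty"].fillna(df["qty"].mean())  # mean() skips NaN by default

print(filled_zero.tolist())  # [1.0, 0.0, 3.0]
print(filled_mean.tolist())  # [1.0, 2.0, 3.0]
```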


Q16. Answer: D — All of the above

  • merge() → SQL-style joins
  • join() → index-based joining
  • concat() → stacking data
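
The three approaches can be compared side by side in a small sketch. The two tables and their column names are invented for illustration:

```python
import pandas as pd

# Hypothetical tables sharing an "id" column.
left = pd.DataFrame({"id": [1, 2, 3], "name": ["Ada", "Lin", "Sam"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [78, 91, 60]})

# merge(): SQL-style join on a key column (inner join by default).
merged = pd.merge(left, right, on="id")
print(merged["id"].tolist())  # [2, 3]

# join(): aligns on the index (left join by default).
joined = left.set_index("id").join(right.set_index("id"))
print(joined.shape)           # (3, 2)

# concat(): stacks DataFrames on top of each other.
stacked = pd.concat([left, left], ignore_index=True)
print(stacked.shape)          # (6, 2)
```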

Q17. Answer: A — axis=0 (rows)

Most Pandas operations default to axis=0.

  • axis=0 → operate down the rows (one result per column, e.g. df.sum())
  • axis=1 → operate across the columns (one result per row)
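
The axis convention is easy to mix up, so here is a quick check on toy data. For aggregations the axis is the direction that gets collapsed; for drop() it says whether the label refers to a row or a column.

```python
import pandas as pd

# Toy data: 2 rows, 2 columns.
df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})

# Default axis=0: collapse the rows, one total per column.
col_totals = df.sum()
print(col_totals["a"], col_totals["b"])  # 3 30

# axis=1: collapse the columns, one total per row.
row_totals = df.sum(axis=1)
print(row_totals.tolist())               # [11, 22]

# drop() follows the same convention.
print(df.drop(0).index.tolist())            # [1]   (row label 0 removed)
print(list(df.drop("a", axis=1).columns))   # ['b'] (column "a" removed)
```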

Q18. Answer: A — drop_duplicates()

Removes duplicate rows from the DataFrame.
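
A brief sketch on made-up data, including the subset and keep parameters that control which columns are compared and which copy survives:

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 are identical.
df = pd.DataFrame({
    "user": ["a", "a", "b", "b"],
    "city": ["X", "X", "Y", "Z"],
})

# Compare whole rows (keeps the first occurrence by default).
print(len(df.drop_duplicates()))               # 3

# Compare only the "user" column.
print(len(df.drop_duplicates(subset="user")))  # 2

# Keep the last occurrence instead of the first.
last = df.drop_duplicates(subset="user", keep="last")
print(last["city"].tolist())                   # ['X', 'Z']
```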


Q19. Answer: B — Integer-based indexing

iloc uses position-based indexing (row/column numbers), unlike loc which uses labels.
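
The difference is easiest to see when the index labels are integers that do not match the row positions. The sketch below uses a deliberately shuffled index for illustration:

```python
import pandas as pd

# Index labels 2, 0, 1 deliberately differ from positions 0, 1, 2.
df = pd.DataFrame({"val": [10, 20, 30]}, index=[2, 0, 1])

# iloc: position 0 is the first row, whatever its label.
print(df.iloc[0, 0])       # 10

# loc: label 0 is the row whose index value is 0 (the second row here).
print(df.loc[0, "val"])    # 20

# iloc slices exclude the end position, like Python lists.
print(df.iloc[0:2]["val"].tolist())  # [10, 20]
```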


Q20. Answer: B — df.to_numpy()

Converts DataFrame into a NumPy array for numerical operations.
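
A minimal sketch on toy data. One detail worth knowing: when columns have different numeric types, the resulting array uses a single common dtype.

```python
import pandas as pd

# Toy data: one integer column and one float column.
df = pd.DataFrame({"x": [1, 2], "y": [3.5, 4.5]})

arr = df.to_numpy()

print(type(arr).__name__)  # ndarray
print(arr.shape)           # (2, 2)
print(arr.dtype)           # float64 (ints are upcast to match the floats)
```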


Conclusion

This Pandas MCQ Quiz Set 1 serves as a solid foundation for anyone looking to build or assess their data analysis skills using Python. The questions covered a wide range of essential topics, including data structures like Series and DataFrames, data loading techniques, handling missing values, filtering datasets, and performing operations such as sorting, grouping, and merging. These are not just theoretical concepts but practical tools that data professionals use daily to extract insights from raw data.

Understanding Pandas is critical in today’s data-driven world, where businesses rely heavily on analytics to make informed decisions. From cleaning messy datasets to performing complex transformations, Pandas provides a flexible and efficient framework that simplifies data manipulation tasks. Mastering these basics ensures that you are well-prepared to handle real-world datasets and perform meaningful analysis.

If you found certain questions challenging, it’s a good indicator of areas where you can focus your learning. Practice is key when it comes to mastering Pandas. Try working on small datasets, experimenting with different functions, and applying concepts in real scenarios like sales analysis, customer segmentation, or marketing data evaluation.

As you progress, you can move toward more advanced topics such as time series analysis, data visualization integration, and performance optimization. Continuous learning and hands-on practice will help you transition from basic understanding to expertise.

Overall, this quiz is not just a test but a stepping stone toward becoming proficient in data analysis with Python. Keep practicing, stay curious, and consistently apply your knowledge to real-world problems to truly master Pandas and unlock its full potential.