matplotlib & seaborn Basics for Data Analysts
Analysis nobody can see is analysis nobody acts on. As a data analyst in India, your numbers become decisions only when you can show them clearly - and the two libraries that do the heavy lifting in Python are matplotlib and seaborn. This guide gives you the core charts, the rule for choosing between them, and the clean-styling habits interviewers notice.
How the two libraries relate
- matplotlib is the foundational plotting engine. It gives you full control over every element but is verbose.
- seaborn is built on top of matplotlib. It produces attractive statistical charts with far less code and sensible defaults.
The practical workflow: use seaborn for quick, good-looking statistical plots, and drop down to matplotlib when you need fine control over labels, titles, or layout. They cooperate - seaborn returns matplotlib axes you can keep tweaking.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
sns.set_theme(style="whitegrid") # instantly cleaner defaults
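As a minimal sketch of how the two libraries cooperate: the snippet below passes a matplotlib axes into seaborn and then keeps styling the object seaborn hands back. The `demo` table and its values are illustrative assumptions, not real data.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

sns.set_theme(style="whitegrid")

# Illustrative numbers only (assumed for this sketch)
demo = pd.DataFrame({
    "city": ["Mumbai", "Delhi", "Pune"],
    "sales": [820000, 760000, 540000],
})

fig, ax = plt.subplots(figsize=(6, 4))

# seaborn draws onto the axes we pass in and returns that same axes object
returned_ax = sns.barplot(data=demo, x="city", y="sales", ax=ax)

# From here on it is plain matplotlib
returned_ax.set_title("Sales by City (Rupees)")
returned_ax.set_xlabel("City")
returned_ax.set_ylabel("Sales")
```

In a script, finish with plt.show() to open the window. Passing ax= explicitly is one reasonable way to keep seaborn calls inside the figure-and-axes style; calling sns.barplot without it also returns an axes you can tweak.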
The figure and axes model
Interviewers prefer the explicit object-oriented style to scattered plt. calls. Create a figure and an axes, then draw on the axes:
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot([1, 2, 3], [10, 20, 15])
ax.set_title("Sample")
plt.show()
fig is the whole canvas; ax is the plotting area. This pattern scales cleanly to multiple subplots and is the professional default.
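To show how the same pattern extends to several subplots, here is a small sketch with two panels on one figure. The panel titles and numbers are placeholders chosen for illustration.

```python
import matplotlib.pyplot as plt

# One figure, two axes side by side; `axes` is a NumPy array of Axes objects
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))

# Each panel is addressed by index and styled independently
axes[0].plot([1, 2, 3], [10, 20, 15])
axes[0].set_title("Left panel")

axes[1].bar(["A", "B", "C"], [5, 8, 3])
axes[1].set_title("Right panel")

# Figure-level title, then tidy the spacing between panels
fig.suptitle("Two plots on one canvas")
fig.tight_layout()
```

Add plt.show() at the end when running this as a script.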
Core charts and when to use each
Line chart - trends over time
Use a line chart when the x-axis has a natural order, especially dates or other continuous values. "Zomato revenue by month" is the classic case:
revenue = pd.DataFrame({
"month": ["Jan", "Feb", "Mar", "Apr", "May"],
"amount": [120000, 135000, 128000, 156000, 171000],
})
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(revenue["month"], revenue["amount"], marker="o")
ax.set_title("Monthly Revenue (Rupees)")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
plt.show()
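The example above uses month names as text labels, which matplotlib plots in the order given. With real dates, a variant like the following sketch lets matplotlib treat the x-axis as time. The year 2024 is an assumption made purely for illustration; the amounts are the same sample figures.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

revenue = pd.DataFrame({
    # Month-start dates; the year is assumed for this sketch
    "month": pd.date_range("2024-01-01", periods=5, freq="MS"),
    "amount": [120000, 135000, 128000, 156000, 171000],
})

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(revenue["month"], revenue["amount"], marker="o")

# One tick per month, formatted like "Jan 2024"
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))

ax.set_title("Monthly Revenue (Rupees)")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
```

A true datetime axis spaces points by elapsed time, so a missing month shows up as a gap instead of being silently squeezed out.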
Bar chart - comparing categories
Use bars to compare discrete categories - revenue per city, orders per restaurant. seaborn makes this one line:
city_sales = pd.DataFrame({
"city": ["Mumbai", "Delhi", "Pune", "Jaipur"],
"sales": [820000, 760000, 540000, 390000],
})
sns.barplot(data=city_sales, x="city", y="sales")
plt.title("Sales by City (Rupees)")
plt.show()
Histogram - distribution of one numeric variable
Use a histogram to see how values spread - are most cab fares small with a few large ones?
fares = pd.Series([120, 150, 90, 320, 200, 110, 480, 175, 130, 260])
sns.histplot(fares, bins=8)
plt.title("Distribution of Cab Fares (Rupees)")
plt.show()
Scatter plot - relationship between two numerics
Use scatter to check whether two variables move together - does order amount relate to delivery distance?
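# Illustrative sample data; swap in your own orders table
orders = pd.DataFrame({"distance_km": [1.2, 2.5, 3.8, 5.0, 6.4], "amount": [180, 240, 310, 290, 420]})
sns.scatterplot(data=orders, x="distance_km", y="amount")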
plt.title("Order Amount vs Delivery Distance")
plt.show()
Box plot - distribution across groups, with outliers
A box plot compares spreads side by side and exposes outliers - ideal for "fare distribution per city":
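# Illustrative sample data; swap in your own rides table
rides = pd.DataFrame({"city": ["Mumbai", "Mumbai", "Mumbai", "Delhi", "Delhi", "Delhi"], "fare": [120, 150, 480, 90, 110, 320]})
sns.boxplot(data=rides, x="city", y="fare")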
plt.title("Cab Fare Distribution by City")
plt.show()
A quick chart-choice cheat sheet
- Trend over time -> line
- Compare categories -> bar
- Distribution of one number -> histogram
- Relationship between two numbers -> scatter
- Distribution across groups / spot outliers -> box plot
Choosing the wrong chart is the most common mistake - a line chart over unordered cities, or a bar chart where a histogram belongs. Match the chart to the question.
Clean styling habits that signal competence
Recruiters can tell a polished chart from a default one. Build these habits:
- Always label. Title, x-label, y-label - a chart without them is unreadable. Use set_title, set_xlabel, set_ylabel.
- Size sensibly. Set figsize so bars are not cramped and text is legible.
- Rotate crowded ticks. Use plt.xticks(rotation=45) when city or product names overlap.
- Use sns.set_theme() once at the top for consistent, professional aesthetics.
- Call plt.tight_layout() before showing or saving to stop labels being clipped.
- Save crisp images with plt.savefig("chart.png", dpi=150, bbox_inches="tight") for reports.
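The habits above can be combined in one short sketch. It uses the axes-based equivalents (ax.tick_params for rotation, fig.tight_layout, fig.savefig) and writes to the system temp folder; that output location is an assumption for the example, so point it wherever your report lives.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

sns.set_theme(style="whitegrid")  # set once, at the top

# Illustrative numbers only
city_sales = pd.DataFrame({
    "city": ["Mumbai", "Delhi", "Pune", "Jaipur"],
    "sales": [820000, 760000, 540000, 390000],
})

fig, ax = plt.subplots(figsize=(8, 4))  # size sensibly
sns.barplot(data=city_sales, x="city", y="sales", ax=ax)

# Always label
ax.set_title("Sales by City (Rupees)")
ax.set_xlabel("City")
ax.set_ylabel("Sales")

# Rotate tick labels so long names do not overlap
ax.tick_params(axis="x", labelrotation=45)

# Stop labels being clipped, then save a crisp image
fig.tight_layout()
out_path = Path(tempfile.gettempdir()) / "chart.png"  # assumed location
fig.savefig(out_path, dpi=150, bbox_inches="tight")
```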
seaborn extras worth knowing
- hue splits any plot by a category with automatic colours: sns.barplot(data=df, x="city", y="sales", hue="category").
- sns.countplot is a bar chart of category frequencies - "number of orders per city" without pre-aggregating.
- sns.heatmap visualises a correlation matrix or a pivot table - excellent for "sales by city x month".
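A rough sketch of all three extras on one small table follows. The `orders` rows, column names, and values are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up order rows, one row per order
orders = pd.DataFrame({
    "city": ["Mumbai", "Mumbai", "Delhi", "Delhi", "Pune", "Mumbai"],
    "category": ["Food", "Grocery", "Food", "Food", "Grocery", "Food"],
    "month": ["Jan", "Feb", "Jan", "Feb", "Jan", "Jan"],
    "sales": [500, 300, 450, 520, 280, 610],
})

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# hue: split each city's bar by category (bars show the mean of sales by default)
sns.barplot(data=orders, x="city", y="sales", hue="category", ax=axes[0])
axes[0].set_title("Average sale by city and category")

# countplot: number of rows per city, no groupby needed
sns.countplot(data=orders, x="city", ax=axes[1])
axes[1].set_title("Orders per city")

# heatmap: total sales in a city x month pivot table
pivot = orders.pivot_table(
    index="city", columns="month", values="sales", aggfunc="sum", fill_value=0
)
sns.heatmap(pivot, annot=True, fmt=".0f", ax=axes[2])
axes[2].set_title("Sales by city x month")

fig.tight_layout()
```

Note that sns.barplot aggregates with the mean unless told otherwise, so on row-level data it shows the average sale, not the total.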
Common pitfalls
- Forgetting plt.show() in scripts - the figure is built in memory but no window ever opens.
- Overplotting. Thousands of points in a scatter become a blob; sample or add transparency with alpha=0.3.
- Mixing plt. and ax. carelessly. Pick the axes-based style and stay consistent.
- No labels. The fastest way to look like a beginner.
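For the overplotting pitfall, the sketch below shows both remedies side by side: transparency on the full data and a random sample. The points are synthetic, generated only to produce a dense cloud, and the column names are assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: 5,000 orders with a loose distance-to-amount relationship
rng = np.random.default_rng(seed=42)
n = 5000
points = pd.DataFrame({"distance_km": rng.uniform(0.5, 15, n)})
points["amount"] = 150 + 30 * points["distance_km"] + rng.normal(0, 60, n)

fig, (ax_all, ax_sample) = plt.subplots(1, 2, figsize=(12, 4))

# Remedy 1: keep every point but make each one mostly transparent
sns.scatterplot(data=points, x="distance_km", y="amount", alpha=0.3, ax=ax_all)
ax_all.set_title("All points, alpha=0.3")

# Remedy 2: plot a reproducible random sample of 500 rows
sample = points.sample(n=500, random_state=0)
sns.scatterplot(data=sample, x="distance_km", y="amount", ax=ax_sample)
ax_sample.set_title("Random sample of 500")

fig.tight_layout()
```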
Visualization is where your analysis earns its keep. Learn these five charts cold, label everything, and let seaborn handle the polish.
Shashikant · Founder, DevWithData
Data professional and Power BI instructor. Building DevWithData to help analysts prove their skills, not just collect certificates.