Python Pandas Basics in Hindi | Data Handling Part 2 for Data Analysts

Pandas Tutorial in Hindi

Data Handling Part 2: Pandas 📊

If you want to become a data analyst, Pandas is your most important tool.

In data science, analytics, and machine learning, Pandas is usually the first practical tool people learn. With it you can bring raw data into a readable format, clean it, filter it, and get it ready for analysis. Work that takes a long time in Excel can often be done in Pandas with just a few lines of code.

Pandas is widely used in:

  • Data analysis
  • Data cleaning
  • Machine learning pipelines
  • Business analytics
  • Financial analysis

If you plan to work on reporting, dashboarding, or AI/ML projects in the future, Pandas is a skill that will be useful everywhere. That is why Pandas is essential if you want to build a career as a data analyst.

Short summary: Pandas is a Python-based way to handle data that is more powerful than Excel, offering automation, speed, and flexibility.

1️⃣ Install Pandas

First, you need to install Pandas:

pip install pandas

Run this command in Command Prompt, Terminal, or Anaconda Prompt. Once it is installed, you can import and use it in any Python project that uses the same environment.

2️⃣ Import Pandas

The standard way to import Pandas in your code:

import pandas as pd
  

Here pd is the standard alias for Pandas that almost everyone uses. Whenever you use Pandas, you will see syntax like pd.function_name() or df.method().

3️⃣ What is a Series?

A Series is a one-dimensional labeled array. You can think of it as a single Excel column, or as an advanced version of a Python list in which every value has an index.

import pandas as pd
data = [10, 20, 30, 40]
s = pd.Series(data)
print(s)
  

Output:

0    10
1    20
2    30
3    40
dtype: int64

The 0, 1, 2, 3 on the left are the index, and the values on the right are the actual data. The last line shows the data type of the values. If you like, you can supply your own custom index, such as roll numbers or dates.
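As a small sketch of a custom index, the example below labels each value with a roll number. The roll numbers 101 to 104 are made up for illustration.

```python
import pandas as pd

# Hypothetical roll numbers used as the index instead of the default 0, 1, 2, 3
marks = pd.Series([10, 20, 30, 40], index=[101, 102, 103, 104])

print(marks)
print(marks.loc[102])  # look up a value by its index label
```

Using .loc makes it explicit that the lookup is by label (the roll number), not by position.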

4️⃣ What is a DataFrame?

A DataFrame is a table-like structure, just like an Excel sheet. It has rows and columns, and each column behaves like a Series.

import pandas as pd
data = {
    "Name": ["Amit", "Deepak", "Ravi"],
    "Marks": [78, 85, 90]
}
df = pd.DataFrame(data)
print(df)
  

Output:

     Name  Marks
0    Amit     78
1  Deepak     85
2    Ravi     90
  

Series → data shaped like a single column.

DataFrame → a complete table with multiple columns.
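To see the relationship between the two, here is a quick check (reusing the same sample names and marks) showing that a column taken out of a DataFrame is a Series that shares the table's row index.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Deepak", "Ravi"],
    "Marks": [78, 85, 90],
})

marks = df["Marks"]                  # pull one column out of the table
print(type(df).__name__)             # DataFrame
print(type(marks).__name__)          # Series
print(marks.index.equals(df.index))  # the column keeps the table's row index
```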

Short comparison: Pandas vs Excel

Excel is easy for beginners, but when the data gets large (hundreds of thousands of rows), formulas become complex, and the same work has to be repeated again and again, Pandas becomes more useful.

  • Work in Excel is mostly manual; in Pandas it can be automated with code.
  • Excel has a limit of 1,048,576 rows per worksheet; Pandas can handle millions of rows, depending on your system RAM.
  • Version control, reproducible reports, and ML integration are much easier with Pandas.

5️⃣ Reading CSV Files

CSV files are very common in data analysis. Reading a CSV with Pandas is very easy:

import pandas as pd
df = pd.read_csv("students.csv")
print(df)
  

This loads an external file into a DataFrame, which you can then filter, sort, group, or visualize. In real projects you will often load data from CSV, Excel, a database, or an API.
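If you do not have a students.csv file yet, you can still try read_csv. The sketch below feeds it made-up CSV text through io.StringIO, which behaves like an open file; the column names and values are assumptions for illustration.

```python
import io

import pandas as pd

# Made-up CSV text standing in for the contents of students.csv
csv_text = """Name,Marks
Amit,78
Deepak,85
Ravi,90
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

The first line of the CSV becomes the column names, and the Marks values are parsed as numbers automatically.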

6️⃣ Basic Data Exploration

After loading the data, the first thing to do is explore it:

print(df.head())      # first 5 rows (default)
print(df.tail())      # last 5 rows (default)
print(df.shape)       # rows and columns
print(df.columns)     # column names
  

If the output of df.shape is (100, 5), it means 100 rows and 5 columns. This tells you right away how big the dataset is and which columns it has.
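Because df.shape is a plain tuple, you can unpack it into two variables. The sketch below does that on a small made-up table and also prints each column's data type with df.dtypes.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Deepak", "Ravi", "Neha"],
    "Marks": [78, 85, 90, 82],
})

rows, cols = df.shape                  # shape is a (rows, columns) tuple
print(f"{rows} rows, {cols} columns")  # 4 rows, 2 columns
print(list(df.columns))                # ['Name', 'Marks']
print(df.dtypes)                       # data type of each column
```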

7️⃣ Selecting Columns

To select a single column:

print(df["Name"])

To select multiple columns:

print(df[["Name", "Marks"]])

This step is important because in analysis you often need to work with a few specific columns, not all of them.
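The double square brackets matter: passing a list of column names returns a smaller DataFrame. Here is a minimal sketch, with a hypothetical City column added so there is something to leave out.

```python
import pandas as pd

# Hypothetical table with an extra "City" column
df = pd.DataFrame({
    "Name": ["Amit", "Deepak", "Ravi"],
    "Marks": [78, 85, 90],
    "City": ["Delhi", "Pune", "Jaipur"],
})

subset = df[["Name", "Marks"]]  # list of names -> smaller DataFrame
one_col = df[["Name"]]          # list with one name -> one-column DataFrame

print(list(subset.columns))     # ['Name', 'Marks']
print(subset.shape)             # (3, 2)
print(one_col.shape)            # (3, 1)
```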

8️⃣ Basic Statistics

Basic statistics on the Marks column:

print(df["Marks"].mean())
print(df["Marks"].max())
print(df["Marks"].min())
  

With these three functions you can get quick insights such as the class average, the highest marks, and the lowest marks. In the same way, you can also compute the sum, median, standard deviation, and so on.
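The other statistics work the same way. The sketch below uses four made-up marks; describe() is a convenient shortcut that prints several of these numbers at once.

```python
import pandas as pd

marks = pd.Series([78, 85, 90, 82], name="Marks")

print(marks.sum())       # 335
print(marks.median())    # 83.5
print(marks.std())       # sample standard deviation (divides by n - 1)
print(marks.describe())  # count, mean, std, min, quartiles, max together
```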

Real Example — Sales Data

import pandas as pd
data = {
    "Product": ["A","B","C"],
    "Sales": [100,150,200]
}
df = pd.DataFrame(data)
print("Total Sales:", df["Sales"].sum())
  

In this example we calculated total sales. You can apply the same logic to monthly sales reports, product analysis, or revenue tracking.
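As one possible next step for product analysis, the sketch below finds the best-selling product and each product's share of total sales. The Share column name is an assumption added for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Sales": [100, 150, 200],
})

total = df["Sales"].sum()                       # 450
df["Share"] = df["Sales"] / total               # fraction of total sales per product
best = df.loc[df["Sales"].idxmax(), "Product"]  # product in the row with the highest sales

print("Total Sales:", total)
print("Best seller:", best)                     # C
print(df)
```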

Real‑world mini use case: Student result analysis

Suppose you have a CSV with the names, marks, and subject details of 500 students. Finding the topper, computing averages, and counting pass/fail manually in Excel would take a lot of time.

With Pandas you can:

  • Compute average marks per subject.
  • Filter the top 10 students.
  • Build a separate list of students who failed.

All of this takes only a few lines of code, which makes a data analyst's daily workflow much faster.
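The three tasks above can be sketched as follows. The data is a tiny made-up sample instead of a 500-row file, the pass mark of 33 is an assumption, and the example takes the top 3 rows rather than the top 10 because the sample is so small.

```python
import pandas as pd

# Made-up sample; a real file would have many more rows
df = pd.DataFrame({
    "Name": ["Amit", "Deepak", "Ravi", "Neha", "Amit", "Deepak", "Ravi", "Neha"],
    "Subject": ["Maths"] * 4 + ["Science"] * 4,
    "Marks": [78, 85, 90, 30, 66, 28, 74, 82],
})
PASS_MARK = 33  # assumed pass mark, adjust to your own rules

avg_by_subject = df.groupby("Subject")["Marks"].mean()  # average marks per subject
top = df.nlargest(3, "Marks")                           # rows with the highest marks
failed = df[df["Marks"] < PASS_MARK]                    # rows below the pass mark

print(avg_by_subject)
print(top)
print(failed)
```

With real data you would replace the hand-written DataFrame with pd.read_csv() and change 3 to 10 in nlargest.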

Practice Tasks

1. Create DataFrame of students with name and marks.

2. Print first 3 rows using head().

3. Find average marks.

4. Find highest marks.

5. Select only Name column.

✅ Practice Task Solutions

Task 1. Create DataFrame of students with name and marks

import pandas as pd
data = {
    "Name": ["Amit", "Deepak", "Ravi", "Neha"],
    "Marks": [78, 85, 90, 82]
}
df = pd.DataFrame(data)
print(df)
  

Output example:

     Name  Marks
0    Amit     78
1  Deepak     85
2    Ravi     90
3    Neha     82
  

Task 2. Print first 3 rows using head()

print(df.head(3))
  

Output:

     Name  Marks
0    Amit     78
1  Deepak     85
2    Ravi     90
  

Task 3. Find average marks

print("Average Marks:", df["Marks"].mean())
  

Example output: Average Marks: 83.75

Task 4. Find highest marks

print("Highest Marks:", df["Marks"].max())
  

Output: Highest Marks: 90

Task 5. Select only Name column

print(df["Name"])
  

Output:

0      Amit
1    Deepak
2      Ravi
3      Neha
Name: Name, dtype: object
  

✅ Key Learning

  • pd.DataFrame() → to create a table
  • df.head(n) → preview first rows
  • df["column"].mean() → average value
  • df["column"].max() → highest value
  • pd.read_csv() → to load a dataset
  • Series → single column
  • DataFrame → table / dataset

Pandas makes data analysis simple and fast. The more you practice, the sooner you will be able to handle real-world datasets, whether it is student data, sales, or website analytics.
