EXPLORATORY DATA ANALYSIS - HAPPINESS INDEX 2021
The World Happiness Index is a survey of the state of global happiness. Every country is scored on criteria such as social support, life expectancy and perceptions of corruption.
Thank you to Aakash N S, whose course taught me how to approach any dataset: https://jovian.ai/learn/data-analysis-with-python-zero-to-pandas
Thank you also to Armonia for the idea behind this blog, as I am new to blogging.
Let's get started:
This is a Kaggle dataset that you can easily download and analyze. I will also upload the 2022 dataset by the end of the year.
We need to import a few libraries before loading the dataset. Let's import those.
Now let's load the dataset. I have the CSV file already downloaded from Kaggle.
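A minimal sketch of the import and loading steps. The file name is an assumption (use whatever name your Kaggle download has), and the small in-memory CSV with made-up values is only there so the snippet runs without the real file.

```python
import io

import pandas as pd

# Assumption: the Kaggle download is a single CSV file; the real name may differ.
CSV_PATH = "world-happiness-report-2021.csv"  # hypothetical file name


def load_happiness(source):
    """Read the happiness CSV from a file path or a file-like object."""
    return pd.read_csv(source)


# Stand-in data with made-up values (NOT real survey figures).
sample_csv = io.StringIO(
    "Country name,Regional indicator,Ladder score,Logged GDP per capita\n"
    "Country A,Western Europe,7.5,10.8\n"
    "Country B,Sub-Saharan Africa,4.2,7.9\n"
)

raw_df = load_happiness(sample_csv)  # with the real file: load_happiness(CSV_PATH)
print(raw_df.shape)
print(raw_df.head())
```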
Now that this is done, let's do some data cleaning. The steps are below.
We don't need every column for the analysis, so we keep only the required columns.
Next, we rename the columns to remove spaces. pandas can read column names with spaces, but they cannot be used with attribute access (df.column_name) and are awkward to type.
Result: columns renamed.
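The two cleaning steps can be sketched roughly as follows. The original column names are assumed to match the Kaggle CSV, the snake_case names are my own choice for illustration, and the values are made up.

```python
import pandas as pd

# Toy frame with made-up values; column names assumed to match the Kaggle CSV.
raw_df = pd.DataFrame({
    "Country name": ["Country A", "Country B"],
    "Regional indicator": ["Western Europe", "Sub-Saharan Africa"],
    "Ladder score": [7.5, 4.2],
    "Standard error of ladder score": [0.03, 0.06],  # not needed for this analysis
    "Logged GDP per capita": [10.8, 7.9],
    "Perceptions of corruption": [0.2, 0.8],
})

# Step 1: keep only the columns used in the analysis.
required_columns = [
    "Country name",
    "Regional indicator",
    "Ladder score",
    "Logged GDP per capita",
    "Perceptions of corruption",
]
happy_df = raw_df[required_columns].copy()

# Step 2: rename to lowercase snake_case (hypothetical naming scheme).
happy_df.columns = happy_df.columns.str.lower().str.replace(" ", "_")

print(happy_df.columns.tolist())
```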
Now let's do EDA (EXPLORATORY DATA ANALYSIS) AND VISUALIZATION
We will draw some plots to see the relationships between the parameters.
Let's begin by importing matplotlib.pyplot and seaborn.
Scatterplot between Happiness and GDP
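A scatterplot like this could be drawn along the following lines. This is a sketch, not the exact plot from my notebook: the snake_case column names are assumptions and the numbers are made up.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data with made-up values; column names are hypothetical.
happy_df = pd.DataFrame({
    "country_name": ["Country A", "Country B", "Country C", "Country D"],
    "regional_indicator": [
        "Western Europe", "Western Europe", "South Asia", "Sub-Saharan Africa",
    ],
    "ladder_score": [7.5, 7.0, 5.0, 4.0],
    "logged_gdp_per_capita": [11.0, 10.8, 9.0, 7.5],
})

fig, ax = plt.subplots(figsize=(10, 6))
sns.scatterplot(
    data=happy_df,
    x="logged_gdp_per_capita",
    y="ladder_score",
    hue="regional_indicator",  # one colour per region
    s=100,
    ax=ax,
)
ax.set_xlabel("Logged GDP per capita")
ax.set_ylabel("Ladder (happiness) score")
ax.set_title("Happiness vs GDP")
# In a notebook, finish with plt.show()
```

Swapping the x column for a corruption column would give a corruption vs happiness plot in the same way.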
Most of the happiest countries are in Western Europe, where GDP per capita is also high and happiness scores are close to 8. This suggests that the more a country earns, the more of its people are employed and the happier they tend to be.
Western Europe is followed by North America, with scores around 7.
In the middle are Latin American countries like Brazil, a few East and South Asian countries, and Commonwealth of Independent States countries like Armenia, Belarus, Georgia and Kazakhstan.
Last come African countries like Cameroon, Ethiopia and Angola.
There is one outlier country, marked in blue, which we presume is Afghanistan.
Let us look at GDP by region.
For that, we sum the GDP values of all countries under each regional indicator.
I have used a pie chart to visualize GDP, as it is a traditional and easy way to understand each region's share of a total.
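The regional sum and pie chart could look roughly like this. The column names are assumptions and the values are made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy data with made-up values; column names are hypothetical.
happy_df = pd.DataFrame({
    "regional_indicator": ["Western Europe", "Western Europe", "Sub-Saharan Africa"],
    "logged_gdp_per_capita": [10.0, 11.0, 8.0],
})

# Sum the GDP column within each region.
gdp_by_region = happy_df.groupby("regional_indicator")["logged_gdp_per_capita"].sum()

fig, ax = plt.subplots(figsize=(8, 8))
wedges, texts, autotexts = ax.pie(
    gdp_by_region.values,
    labels=gdp_by_region.index,
    autopct="%1.1f%%",  # print each region's percentage share
)
ax.set_title("Share of summed GDP by region")
```

Because this is a sum over countries, a region with many countries gets a large slice even when each country's value is low.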
Surprisingly, the largest share of the summed GDP comes from Sub-Saharan Africa, with 20.7%.
It is followed by Western Europe and Central and Eastern Europe.
Another surprise is that North America's share is only 3.1%, the smallest of all.
To clear this up, let's look at the number of countries in each region.
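Counting countries per region is a one-liner with value_counts. A small sketch on made-up rows, with a hypothetical column name:

```python
import pandas as pd

# Toy data; the column name is an assumption and the rows are invented.
happy_df = pd.DataFrame({
    "country_name": ["Country A", "Country B", "Country C"],
    "regional_indicator": ["Western Europe", "Western Europe", "Sub-Saharan Africa"],
})

# Number of countries (rows) in each region, largest first.
countries_per_region = happy_df["regional_indicator"].value_counts()
print(countries_per_region)
```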
The counts make this clear: Sub-Saharan Africa has the most countries contributing to the sum (36), followed by Western Europe (21). The pie chart therefore reflects how many countries a region has more than how rich it is.
Looking at the correlations, ladder score and GDP per capita are highly correlated, as we saw in the previous scatter plot. The higher the GDP, the happier people tend to be in their country.
We can also see that the ladder (happiness) score is highly correlated with social support and life expectancy, and less correlated with perceptions of corruption. I read this as: the happiest countries have less corruption, and their people strongly believe that.
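The correlations described above can be computed with DataFrame.corr and shown as a heatmap. This is a sketch under assumptions: the column names are hypothetical and the five rows are made up, so the printed numbers say nothing about the real data.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data with made-up values; column names are hypothetical.
happy_df = pd.DataFrame({
    "ladder_score": [7.5, 7.0, 6.0, 5.0, 4.0],
    "logged_gdp_per_capita": [11.0, 10.8, 9.5, 9.0, 7.5],
    "social_support": [0.95, 0.92, 0.85, 0.80, 0.65],
    "healthy_life_expectancy": [72.0, 70.0, 67.0, 65.0, 55.0],
    "perceptions_of_corruption": [0.2, 0.4, 0.8, 0.7, 0.8],
})

# Pairwise Pearson correlation between the numeric columns.
corr = happy_df.corr()

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation between parameters")
print(corr["ladder_score"].sort_values(ascending=False))
```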
Let's see the average perception of corruption in each region.
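One way to get the regional average is a groupby followed by a horizontal bar chart. The column names are assumptions and the values are made up.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy data with made-up values; column names are hypothetical.
happy_df = pd.DataFrame({
    "regional_indicator": [
        "Western Europe", "Western Europe",
        "Sub-Saharan Africa", "Sub-Saharan Africa",
    ],
    "perceptions_of_corruption": [0.25, 0.5, 0.75, 0.75],
})

# Mean corruption perception per region, highest first.
avg_corruption = (
    happy_df.groupby("regional_indicator")["perceptions_of_corruption"]
    .mean()
    .sort_values(ascending=False)
)

fig, ax = plt.subplots(figsize=(8, 5))
ax.barh(avg_corruption.index, avg_corruption.values)
ax.set_xlabel("Average perception of corruption")
ax.set_title("Corruption perception by region")
print(avg_corruption)
```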
The highest corruption is found in Central and Eastern Europe, followed by Latin America and South Asia.
The least corruption is found in North America, Australia and New Zealand, followed by Western Europe.
Finally, let's plot corruption vs happiness score.
We can see that where corruption is low, the happiness index is high.
By region, Western Europe is at the top in terms of low corruption, followed by North America and ANZ.
Africa has high corruption and less happy people.
Now that you have understood the relationships between the parameters, here are some questions for you.
Q1: Which are the top 10 and bottom 10 happiest countries? Simple question :)
The answer is simple.
Just use the head and tail methods.
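A short sketch of the head and tail approach. The helper name top_and_bottom and the column names are hypothetical, and the toy frame uses n=2 only because it has six made-up rows; on the full dataset you would keep n=10.

```python
import pandas as pd

# Toy data with made-up values; column names are hypothetical.
happy_df = pd.DataFrame({
    "country_name": [
        "Country C", "Country A", "Country F",
        "Country B", "Country E", "Country D",
    ],
    "ladder_score": [6.0, 7.5, 3.5, 7.0, 4.0, 5.0],
})


def top_and_bottom(df, n=10):
    """Return the n happiest and n least happy rows by ladder score."""
    ranked = df.sort_values("ladder_score", ascending=False)
    return ranked.head(n), ranked.tail(n)


top, bottom = top_and_bottom(happy_df, n=2)
print(top)
print(bottom)
```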
Q2: Which region has the highest freedom to make life choices?
As discussed before, Western Europe leads in this parameter too, and Africa is last.
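A question like this can be checked with a regional mean plus idxmax and idxmin. The column names are assumptions and the values are made up, so the printed ranking is only an illustration.

```python
import pandas as pd

# Toy data with made-up values; column names are hypothetical.
happy_df = pd.DataFrame({
    "regional_indicator": [
        "Western Europe", "Western Europe", "South Asia",
        "Sub-Saharan Africa", "Sub-Saharan Africa",
    ],
    "freedom_to_make_life_choices": [0.9, 0.8, 0.7, 0.6, 0.7],
})

# Average freedom score per region, highest first.
freedom_by_region = (
    happy_df.groupby("regional_indicator")["freedom_to_make_life_choices"]
    .mean()
    .sort_values(ascending=False)
)

print(freedom_by_region)
print("Highest:", freedom_by_region.idxmax())
print("Lowest:", freedom_by_region.idxmin())
```

The same pattern with a life expectancy column answers region-level questions about life expectancy.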
Q3: Simple question: what is the average happiness score?
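The average is just the mean of the score column. A minimal sketch with a hypothetical column name and made-up scores:

```python
import pandas as pd

# Toy data with made-up values; the column name is an assumption.
happy_df = pd.DataFrame({"ladder_score": [7.5, 6.0, 4.5]})

average_score = happy_df["ladder_score"].mean()
print(f"Average happiness score: {average_score:.2f}")
```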
Q4: Which region has the highest life expectancy?
Again, Western Europe leads in life expectancy and Africa is at the bottom.
Q5: Final question: how do you plot the top 10 countries based on corruption?
You see! Singapore has the lowest perception of corruption, followed by Rwanda, and the highest among these ten is Ireland.
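One possible way to draw such a plot is nsmallest followed by a bar chart. I am assuming here that "top 10" means the ten countries with the lowest perceived corruption. The column names are hypothetical, the values are made up, and the toy frame uses n=3 instead of 10.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy data with made-up values; column names are hypothetical.
happy_df = pd.DataFrame({
    "country_name": ["Country A", "Country B", "Country C", "Country D", "Country E"],
    "perceptions_of_corruption": [0.5, 0.1, 0.9, 0.3, 0.7],
})

n = 3  # use 10 on the full dataset
least_corrupt = happy_df.nsmallest(n, "perceptions_of_corruption")

fig, ax = plt.subplots(figsize=(8, 5))
ax.barh(least_corrupt["country_name"], least_corrupt["perceptions_of_corruption"])
ax.invert_yaxis()  # lowest corruption at the top of the chart
ax.set_xlabel("Perception of corruption")
ax.set_title(f"{n} countries with the lowest perceived corruption")
```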
Inferences and Conclusion:
The inference I draw is this.
The Happiness Index gives us a view of the following:
A: How is a country administered?
B: What opinion do people have of their government?
C: People's health and lifestyle in that particular country
Also, from the dataset we can infer that Western Europe leads in most parameters, like life expectancy and GDP per capita. In the 2021 ranking, most of the top 5 happiest countries are from Europe. A link is attached in the references.
Finally, I want to conclude that the more a government works for its country, the happier its people will be, and eventually the ladder score will be high.
References and Future Work
Pandas for Data Manipulation https://pandas.pydata.org/docs/reference/index.html#api
Seaborn for Advanced Visualization https://seaborn.pydata.org/tutorial.html
Matplotlib for interactive visualizations https://matplotlib.org/stable/tutorials/introductory/pyplot.html
Statista For finding top Happiest countries https://www.statista.com/statistics/1225047/ranking-of-happiest-countries-worldwide-by-score/
About Me:
My name is Rakesh. I am from a non-technical background, but I love working with data. I am going to work more and more with data, and eventually learn machine learning and deep learning as well.
You can visit my GitHub for my work and my LinkedIn profile to know more about me.
https://www.linkedin.com/in/rakesh-sethu-n-p-6621b327/
Thank you all. Keep smiling always. Keep analyzing :)