Univariate graphs
Variable 1: Income per person
The first graph is skewed to the right: many countries have a low average income per person, whereas a few have a very high one. The mean income is 8740, but the median is only 2553, so more than half of the countries sit well below the mean. A handful of very rich countries (like Monaco, with an income per person of 105147) pull the mean far above the median.
Variable 2: Suicide rate per 100,000 inhabitants
This graph is slightly skewed to the right, but the peak is at the second bar (about 6 suicides per 100,000 inhabitants), not the first (up to 2 suicides per 100,000 inhabitants). About 70% of the countries have suicide rates of up to 10 per 100,000, and very few have high rates. The mean is 9.64 and the median is 8.26, with a standard deviation of 6.3.
Bivariate graphs
Income per person X Suicide per 100,000 inhabitants
In the scatterplot, it is possible to observe that the variables do not correlate as expected: the research question asked whether higher incomes are associated with higher suicide rates, and no such pattern appears. The differences between income levels are very small, as can be further explored in the graph below:
Countries were distributed in four groups, with group 1 having the lowest income per person and group 4 the highest. The mean suicide rate in each group is very similar: groups 2 and 4 have a mean suicide rate of about 9 per 100,000, and groups 1 and 3 have a mean suicide rate of about 10 per 100,000.
My conclusion is that average income per person does not correlate with suicide rates.
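The conclusion above rests on the graphs alone. As a rough numeric check, something like the following could be run after the DATA step in the listing below. It is only a sketch: it assumes the dataset `new` and the variable names from my code, and it adds the Spearman coefficient because income is strongly skewed. I have not included its output here.

```sas
/* Correlation between income and suicide rate.
   Spearman is rank-based, so it is less affected by outliers such as Monaco. */
PROC CORR DATA=new PEARSON SPEARMAN;
  VAR incomeperperson suicideper100TH;
RUN;

/* Mean, median and standard deviation of the suicide rate within each income group.
   The WHERE statement drops countries with no income value. */
PROC MEANS DATA=new N MEAN MEDIAN STD;
  WHERE incomeperperson NE .;
  CLASS incomegroup;
  VAR suicideper100TH;
RUN;
```

If both coefficients are close to zero and the group means stay near 9 to 10 per 100,000, that would support the reading of the graphs.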
----
BTW, this is my code, written in SAS:
LIBNAME mydata "/courses/d1406ae5ba27fe300 " access=readonly;
DATA new; set mydata.gapminder;
LABEL incomeperperson="Income per person"
suicideper100TH="Suicide rate among 100,000 inhabitants"
lifeexpectancy="Life expectancy"
incomegroup="Aggregated income per person"
suicidegroup="Aggregated suicide rate"
lifegroup="Aggregated life expectancy";
/* Missing values compare lower than any number in SAS, so they are tested first;
   otherwise countries with no data would fall into group 1 */
if incomeperperson = . then incomegroup=.;
else if incomeperperson LE 744.239 then incomegroup=1; /*Up to 744 per person*/
else if incomeperperson LE 2553.496 then incomegroup=2; /*Up to 2,553 per person*/
else if incomeperperson LE 9425.326 then incomegroup=3; /*Up to 9,425 per person*/
else incomegroup=4; /*More than 9,425 per person*/
if suicideper100TH = . then suicidegroup=.;
else if suicideper100TH LE 5 then suicidegroup=1; /*Up to 5 suicides per 100,000*/
else if suicideper100TH LE 10 then suicidegroup=2; /*Between 5 and 10 suicides per 100,000*/
else suicidegroup=3; /*More than 10 suicides per 100,000*/
if lifeexpectancy = . then lifegroup=.;
else if lifeexpectancy LE 60 then lifegroup=1; /*Life expectancy up to 60 years*/
else if lifeexpectancy LE 75 then lifegroup=2; /*Life expectancy between 60 and 75 years*/
else lifegroup=3; /*Life expectancy over 75 years*/
PROC SORT; by COUNTRY;
PROC FREQ; TABLES incomeperperson suicideper100TH lifeexpectancy incomegroup suicidegroup lifegroup;
PROC PRINT; var incomegroup suicidegroup lifegroup;
PROC Univariate; var incomeperperson suicideper100TH;
PROC GCHART; VBar incomeperperson/ type=pct width=30;
PROC GCHART; VBar suicideper100TH/ type=pct width=15;
proc gplot; plot suicideper100TH*incomegroup;
PROC GPLOT; PLOT lifeexpectancy*incomeperperson;
PROC GPLOT; PLOT lifeexpectancy*suicideper100TH;
proc gchart; vbar incomegroup/discrete type=mean SUMVAR=suicideper100TH;
RUN;



