The Workforce dataset gives the labour force participation rates for males and females in 2000 and 2013 for several countries in Europe and North America. The data for this example were taken from the World Bank website.
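# Load the plotting and data-handling packages (ggpubr provides ggdensity and ggqqplot)
library("gplots")
library("tidyverse")
library("ggpubr")
# Read the data, drop rows with missing values, and give the columns short names
df <- read.csv('../../data/workforce.csv')
df <- na.omit(df)
colnames(df) <- c('Country', 'M2000', 'M2013', 'F2000', 'F2013', 'Continent')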
Global trends in the female workforce over time
Has there been a change in the rates of females participating in the labour force (RFPLF) across both continents between 2000 and 2013?
To answer this question, we plan to conduct a Student's t-test. To trust the result of this approach, we first need to check that the assumptions of the t-test hold for our data. We start by studying the distribution of the samples.
Checking the normality assumption
A qualitative approach
ggdensity(df$F2000,
main = "Density plot of rates of females participating in the labour force in 2000",
xlab = "RFPLF")
ggdensity(df$F2013,
main = "Density plot of rates of females participating in the labour force in 2013",
xlab = "RFPLF")
These density plots give no clear evidence against the assumption of Gaussian distributions. To get more information, we continue the analysis with Q-Q plots.
ggqqplot(df$F2000, title='Q-Q plot for the rates of females participating in the labour force in 2000');
ggqqplot(df$F2013, title='Q-Q plot for the rates of females participating in the labour force in 2013')
The Q-Q plots seem to confirm our first impression: the Gaussian assumption looks reasonable.
A quantitative approach
To back up the visual checks of the previous section, we run a Shapiro-Wilk normality test on each sample before drawing a conclusion about the distribution of our data.
shapiro.test(df$F2000)
shapiro.test(df$F2013)
##
## Shapiro-Wilk normality test
##
## data: df$F2000
## W = 0.98909, p-value = 0.94
##
## Shapiro-Wilk normality test
##
## data: df$F2013
## W = 0.98503, p-value = 0.8117
Both tests give a p-value greater than 0.05, so neither sample differs significantly from a normal distribution. In other words, the data give no reason to reject the normality assumption.
Checking the assumption that both samples have the same variance
To apply a Student's t-test, we further need to check that both samples have comparable variances. To do this rigorously, we conduct a Fisher F-test.
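Because the t-test used later in this section is run with paired = TRUE, the normality assumption that matters most for it is arguably that of the per-country differences, not of each year taken separately. The sketch below illustrates that check on simulated values: the vectors f2000 and f2013 are hypothetical stand-ins, not the real workforce data, and the sample size of 46 is only inferred from the degrees of freedom reported in the outputs.

```r
# Synthetic stand-in for the workforce data (NOT the real values):
# 46 hypothetical countries with correlated 2000 and 2013 rates.
set.seed(1)
n <- 46
f2000 <- rnorm(n, mean = 51, sd = 8)
f2013 <- f2000 + rnorm(n, mean = 2.5, sd = 3.8)

# A paired t-test works on the per-country differences,
# so we test those differences for normality.
d <- f2000 - f2013
sw <- shapiro.test(d)
print(sw)
```

With the real data, the same check would be shapiro.test(df$F2000 - df$F2013).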
tfisher1 <- var.test(df$F2000, df$F2013, alternative = "two.sided", conf.level = 0.95);
tfisher1;
##
## F test to compare two variances
##
## data: df$F2000 and df$F2013
## F = 1.3403, num df = 45, denom df = 45, p-value = 0.3294
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.7416192 2.4222837
## sample estimates:
## ratio of variances
## 1.340303
We cannot reject the null hypothesis of equal variances (p = 0.33). Provided that the assumptions of the F-test hold, this means that the difference between the variances of the two samples is not statistically significant. Hence, it is reasonable to use a Student's t-test to test for a potential change in the rates of females participating in the labour force across both continents between 2000 and 2013. Since the same countries are observed in both years, we use the paired form of the test.
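To make the F-test less of a black box, the following sketch recomputes its statistic and two-sided p-value by hand and compares them with var.test. The vectors x and y are simulated and purely illustrative; only the last computation uses numbers taken from the output above (variance ratio 1.340303 with 45 and 45 degrees of freedom).

```r
# Simulated samples (illustrative only, not the workforce data).
set.seed(2)
x <- rnorm(46, mean = 51, sd = 9)
y <- rnorm(46, mean = 53, sd = 8)

# F statistic: ratio of the two sample variances.
f_stat <- var(x) / var(y)
df1 <- length(x) - 1
df2 <- length(y) - 1

# Two-sided p-value: twice the smaller tail probability.
p_lower <- pf(f_stat, df1, df2)
p_val <- 2 * min(p_lower, 1 - p_lower)

# Compare with the built-in test.
ft <- var.test(x, y)
print(c(by_hand = f_stat, var_test = unname(ft$statistic)))
print(c(by_hand = p_val, var_test = ft$p.value))

# Same formula applied to the ratio reported in the output above.
p_reported <- 2 * (1 - pf(1.340303, 45, 45))
print(p_reported)
```

The last value should be close to the p-value of 0.3294 printed by var.test for the real data.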
Conclusion using the Student Test
ttest1 <- t.test(df$F2000, df$F2013, alternative = "two.sided", paired = TRUE, conf.level = 0.95)
ttest1
##
## Paired t-test
##
## data: df$F2000 and df$F2013
## t = -4.3656, df = 45, p-value = 7.341e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.589846 -1.323198
## sample estimates:
## mean of the differences
## -2.456522
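Since p < 0.001, we reject the null hypothesis. The mean of the differences F2000 - F2013 is about -2.46 (95% confidence interval from -3.59 to -1.32), so we conclude that there is evidence of an increase in the rates of females participating in the labour force between 2000 and 2013 across both Europe and North America.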
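A paired t-test is equivalent to a one-sample t-test on the per-country differences. The sketch below shows this by computing the statistic, p-value and confidence interval by hand and comparing them with t.test. The data are simulated, and the names f2000 and f2013 are hypothetical stand-ins for the real columns.

```r
# Simulated paired data (illustrative only, not the workforce data).
set.seed(3)
n <- 46
f2000 <- rnorm(n, mean = 51, sd = 8)
f2013 <- f2000 + rnorm(n, mean = 2.5, sd = 3.8)

# Work on the per-country differences.
d <- f2000 - f2013
se <- sd(d) / sqrt(n)                     # standard error of the mean difference
t_stat <- mean(d) / se                    # t statistic with n - 1 degrees of freedom
p_val <- 2 * pt(-abs(t_stat), df = n - 1) # two-sided p-value
ci <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se  # 95% confidence interval

# Compare with the built-in paired test.
res <- t.test(f2000, f2013, paired = TRUE)
print(c(by_hand = t_stat, t_test = unname(res$statistic)))
print(rbind(by_hand = ci, t_test = as.numeric(res$conf.int)))
```

A negative mean difference with this ordering of the arguments means that the second sample (2013) is larger on average.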
Difference in female workforce between continents in 2013
Is there a difference in female labour force participation rates in 2013 between Europe and North America?
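To answer this question, we again use a t-test, this time for two independent samples. Since the var.equal argument is not set, R runs Welch's version of the test, which does not assume equal variances in the two groups.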
ttest2 <- t.test(df$F2013[df$Continent == 'Europe'], df$F2013[df$Continent != 'Europe'], alternative = "two.sided", conf.level = 0.95)
ttest2
##
## Welch Two Sample t-test
##
## data: df$F2013[df$Continent == "Europe"] and df$F2013[df$Continent != "Europe"]
## t = -1.8841, df = 13.893, p-value = 0.08065
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -10.9929599 0.7151821
## sample estimates:
## mean of x mean of y
## 52.36111 57.50000
The p-value is approximately 0.08. Hence, at the 0.05 level, we fail to reject the null hypothesis of equal mean rates. This does not show that the two continents have the same rate: one of the two groups contains the 2013 female labour force participation rates of only ten countries, which is rather few to conclude with confidence. To draw a firmer conclusion, we would need more data.
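The non-integer degrees of freedom in the output (13.893) come from the Welch-Satterthwaite approximation, which is driven mostly by the smaller group. The sketch below recomputes the Welch statistic and degrees of freedom by hand on simulated groups. The group sizes of 36 and 10, the means and the standard deviations are assumptions chosen for illustration, not values read from the dataset.

```r
# Simulated groups of unequal size (illustrative only, not the workforce data).
set.seed(4)
europe <- rnorm(36, mean = 52, sd = 6)
other  <- rnorm(10, mean = 57, sd = 8)

# Squared standard errors of the two group means.
vx <- var(europe) / length(europe)
vy <- var(other) / length(other)

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_stat <- (mean(europe) - mean(other)) / sqrt(vx + vy)
df_welch <- (vx + vy)^2 /
  (vx^2 / (length(europe) - 1) + vy^2 / (length(other) - 1))
p_val <- 2 * pt(-abs(t_stat), df = df_welch)

# Compare with t.test, which uses Welch's test by default (var.equal = FALSE).
w <- t.test(europe, other)
print(c(by_hand = t_stat, t_test = unname(w$statistic)))
print(c(by_hand = df_welch, t_test = unname(w$parameter)))
```

When one group is much smaller than the other, the degrees of freedom stay close to the size of the small group, which widens the confidence interval and limits the power of the test.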