1 Back to Basics

1.1 The Welch Test

Let us recall that the Welch test is an extension of Student's t-test: it tests whether two independent samples drawn from two Gaussian distributions have the same mean when their variances may differ.

Let us start with a simple application of the Welch test. We generate two datasets from two normal distributions that have the same mean but different variances.

library(ggplot2);
library('ramify');
## 
## Attaching package: 'ramify'
## The following object is masked from 'package:graphics':
## 
##     clip
s1 <- randn(200, mean=0, sd=1);
s2 <- randn(200, mean=0, sd=1.6);

We can inspect the two distributions with a histogram:

df <- data.frame(
  samples=append(s1,s2),
  origin=append(rep('sample1',200),rep('sample2',200))
  )
ggplot(df, aes(x=samples, color=origin, fill=origin)) +
  geom_histogram(aes(y=..density..), position="identity", alpha=0.5) +
  geom_density(alpha=0.2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Next, we run the Welch test to compare the means of these two samples (the Welch test rather than the Student test, since their variances differ).

tTest = t.test(s1, s2, var.equal = FALSE);
print(tTest);
## 
##  Welch Two Sample t-test
## 
## data:  s1 and s2
## t = 0.22867, df = 333.19, p-value = 0.8193
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.2200795  0.2779782
## sample estimates:
##  mean of x  mean of y 
## 0.07368490 0.04473557

The p-value (0.82) is far above 0.05, so we cannot reject the null hypothesis at the 0.05 level. Failing to reject does not prove that the means are equal, but the result is consistent with the way the samples were generated: both have mean 0.
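
The non-integer `df` in the output above comes from the Welch-Satterthwaite approximation of the degrees of freedom. As a minimal sketch of what the test computes, the statistic, the degrees of freedom and the p-value can be rebuilt by hand and compared with `t.test`. The use of base R `rnorm` in place of `randn`, the seed and the variable names are assumptions made for this illustration only.

```r
# Minimal sketch: recompute the Welch test by hand (base R only).
# rnorm() and the seed are assumptions made for this illustration.
set.seed(1)
x <- rnorm(200, mean = 0, sd = 1)
y <- rnorm(200, mean = 0, sd = 1.6)

# Squared standard error of each sample mean
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch statistic: difference of means over the unpooled standard error
t_manual <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation of the degrees of freedom
df_manual <- (vx + vy)^2 /
  (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value from the t distribution with df_manual degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Reference values from the built-in implementation
welch <- t.test(x, y, var.equal = FALSE)
print(c(t = t_manual, df = df_manual, p = p_manual))
print(all.equal(unname(welch$statistic), t_manual))
```

Unlike the Student test, no pooled variance appears here: each sample contributes its own variance estimate, which is why the test remains valid when the variances differ.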

1.2 The Kolmogorov-Smirnov Test (KS)

The KS test is a nonparametric test of whether two samples come from the same continuous distribution. Since the samples generated in the previous section have different standard deviations, we expect the KS test to reject the null.

ksTest <- ks.test(s1,s2);
print(ksTest);
## 
##  Two-sample Kolmogorov-Smirnov test
## 
## data:  s1 and s2
## D = 0.145, p-value = 0.02984
## alternative hypothesis: two-sided

The small p-value (0.03) reveals the difference: we reject the null hypothesis at the 0.05 level.
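
The statistic D reported by the test is the largest vertical gap between the empirical cumulative distribution functions of the two samples. The sketch below recomputes it by hand; as before, `rnorm`, the seed and the variable names are assumptions made for the illustration and are not part of the original analysis.

```r
# Minimal sketch: recompute the two-sample KS statistic by hand.
# rnorm() and the seed are assumptions made for this illustration.
set.seed(1)
x <- rnorm(200, mean = 0, sd = 1)
y <- rnorm(200, mean = 0, sd = 1.6)

# Empirical cumulative distribution functions of the two samples
Fx <- ecdf(x)
Fy <- ecdf(y)

# Both ECDFs are step functions that only jump at observed values,
# so the largest gap is attained at one of the pooled observations
pooled <- sort(c(x, y))
D_manual <- max(abs(Fx(pooled) - Fy(pooled)))

# Reference value from the built-in implementation
ks <- ks.test(x, y)
print(c(manual = D_manual, ks.test = unname(ks$statistic)))
```

Because D compares the whole distribution functions, it reacts to a change of spread as well as to a change of location, which is why the KS test can reject here although the means are equal.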

2 Power comparison between the Student or Welch Test, and the KS Test

2.1 Student VS KS

Now we consider two samples generated from two Gaussian distributions with the same variance but different means. We want to study the distribution of the p-values obtained from the Student test and the KS test in this case.

powercomparison <- function (mu1,mu2,sigma1,sigma2,nametest='Student Test'){
  n <- 200;     # size of each sample
  nrep <- 300;  # number of simulated pairs of samples
  p_ttest <- rep(0,nrep);
  p_kstest <- rep(0,nrep);

  for (i in 1:nrep){
    s1 <- randn(n, mean=mu1, sd=sigma1);
    s2 <- randn(n, mean=mu2, sd=sigma2);
    # the Student test assumes equal variances, the Welch test does not
    tTest <- t.test(s1,s2,var.equal=(nametest == 'Student Test'));
    p_ttest[i] <- tTest$p.value;
    ksTest <- ks.test(s1,s2);
    p_kstest[i] <- ksTest$p.value;
  }
  # one row per p-value, labelled by the test that produced it
  df <- data.frame(pvalue = c(p_ttest,p_kstest), test = rep(c(nametest,'KS Test'), each=nrep))
  boxplot(pvalue ~ test, data = df, varwidth = TRUE)
}


mu1 <- 1;
mu2 <- 1.2;
sigma1 <- 1;
sigma2 <- 1;
powercomparison(mu1,mu2,sigma1,sigma2);

The p-values of the Student test are typically smaller than those of the KS test, so at any fixed level the Student test rejects the null more often: its power is larger. This was predictable, since the data match the assumptions of the Student test exactly (Gaussian samples with equal variances). The KS test, by contrast, applies to any continuous distributions, and the no free lunch principle suggests that this generality must be paid for with a loss of power in some situations.
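
The boxplot compares whole distributions of p-values. Power at a given level can also be summarised as a single number: the fraction of simulated experiments in which the test rejects. The sketch below estimates it at level 0.05 for the same parameters; the helper `estimate_power`, its arguments, the seed and the use of base R `rnorm` are hypothetical choices made for this illustration.

```r
# Hypothetical helper: fraction of simulated experiments in which each test
# rejects the null at level alpha (an empirical estimate of the power).
estimate_power <- function(mu1, mu2, sigma1, sigma2,
                           n = 200, nrep = 300, alpha = 0.05,
                           var.equal = TRUE) {
  reject_t <- logical(nrep)
  reject_ks <- logical(nrep)
  for (i in seq_len(nrep)) {
    s1 <- rnorm(n, mean = mu1, sd = sigma1)
    s2 <- rnorm(n, mean = mu2, sd = sigma2)
    # var.equal = TRUE gives the Student test, FALSE the Welch test
    reject_t[i] <- t.test(s1, s2, var.equal = var.equal)$p.value < alpha
    reject_ks[i] <- ks.test(s1, s2)$p.value < alpha
  }
  c(t_test = mean(reject_t), ks_test = mean(reject_ks))
}

set.seed(1)  # arbitrary seed, only for reproducibility
power_est <- estimate_power(mu1 = 1, mu2 = 1.2, sigma1 = 1, sigma2 = 1)
print(power_est)
```

With only 300 replications these rates are noisy estimates; increasing `nrep` tightens them at the cost of a longer run.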

2.2 Welch VS KS

We now consider different variances for the two Gaussian distributions. In the following, we show that the power comparison between the KS test and the Welch test is less clear-cut in this case.

Based on the result of the previous section, it seems natural to think that if the two variances are close enough, the power of the Welch test will still be larger than that of the KS test. This is shown below on an example.

mu1 <- 1;
mu2 <- 1.2;
sigma1 <- 1;
sigma2 <- 1.02;
powercomparison(mu1,mu2,sigma1,sigma2,nametest='Welch Test');

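However, as the difference between the variances grows, the KS test can become more powerful: the Welch test only looks at the difference in means, whereas the KS test also reacts to the difference in spread.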

mu1 <- 1;
mu2 <- 1.2;
sigma1 <- 1;
sigma2 <- 1.25;
powercomparison(mu1,mu2,sigma1,sigma2,nametest='Welch Test');
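
To see how the comparison evolves with the variance gap, one can tabulate the rejection rates of both tests over a grid of values of `sigma2`. This is only a sketch: the helper `rejection_rates`, the grid (the values 1.5 and 2 go beyond those used in the text), the seed and the use of base R `rnorm` are assumptions made for the illustration.

```r
# Hypothetical helper: empirical rejection rates of the Welch and KS tests
# at level alpha for a given sigma2, other parameters as in the text.
rejection_rates <- function(sigma2, mu1 = 1, mu2 = 1.2, sigma1 = 1,
                            n = 200, nrep = 300, alpha = 0.05) {
  # 2 x nrep matrix of p-values, one row per test
  pvals <- replicate(nrep, {
    s1 <- rnorm(n, mean = mu1, sd = sigma1)
    s2 <- rnorm(n, mean = mu2, sd = sigma2)
    c(welch = t.test(s1, s2, var.equal = FALSE)$p.value,
      ks = ks.test(s1, s2)$p.value)
  })
  rowMeans(pvals < alpha)  # one rejection rate per test
}

set.seed(1)  # arbitrary seed, only for reproducibility
sigma2_grid <- c(1.02, 1.25, 1.5, 2)  # 1.5 and 2 are extra, illustrative values
rates <- t(sapply(sigma2_grid, rejection_rates))
rownames(rates) <- paste0("sigma2=", sigma2_grid)
print(rates)
```

Each row of the resulting table gives the estimated power of the two tests for one value of `sigma2`, so the crossover, if any, can be read directly from it.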
