Kruskal-Wallis and Power Analysis in R: Analysing Flight Delays

Comparing groups using the Kruskal-Wallis test

Michael Grogan
5 min read · Jun 23, 2023
Source: Image by Author — implementation in R using RStudio

The Kruskal-Wallis test is a non-parametric test of the null hypothesis that k independent samples are drawn from the same distribution.

The test is performed by means of a ranking mechanism, whereby the observations across all samples are pooled, ordered by size, and replaced with their corresponding ranks. The kruskal.test function in R performs this ranking automatically.
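To make the ranking step concrete, here is a minimal sketch that computes the test statistic by hand and compares it with kruskal.test. The two toy groups are hypothetical values chosen for illustration, not data from this article, and the hand formula shown omits the tie correction (the toy data contain no ties).

```r
# Hypothetical toy data (illustration only, not from the article)
values <- c(3, 1, 4, 10, 8, 9)
groups <- factor(c("a", "a", "a", "b", "b", "b"))

# Rank all observations jointly, ignoring group membership
r <- rank(values)

# H statistic without tie correction:
# H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
N <- length(values)
rank_sums <- tapply(r, groups, sum)      # R_i: sum of ranks per group
n_i <- tapply(r, groups, length)         # n_i: observations per group
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# The built-in test does the ranking internally
kt <- kruskal.test(values, groups)

print(H)
print(kt$statistic)
```

With no tied values, the hand-computed H and the statistic reported by kruskal.test should agree.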

For this example, let us see how the Kruskal-Wallis test can be used to determine differences in delays across flights.

Airline Delay Example

Let us consider the following scenario. We wish to analyse hypothetical flight data across three separate airlines to determine whether the delay in takeoff time differs across these airlines.

Suppose we have three flights, each operated by a different airline, with the takeoff delay recorded in minutes for each instance of the flight (a value of 0 means the flight took off on time).

> df
  flight1 flight2 flight3
1       0      41      28
2      12      31      16
3       5      30     242
4       7      35      63
5       8       3       0
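As a rough sketch of how the test could be run on this data, the snippet below rebuilds a data frame from only the five rows printed above and passes it to kruskal.test. The names long and result are illustrative, and reshaping with stack is one possible approach rather than necessarily the one used for the full analysis.

```r
# Rebuild the data frame from the five rows shown above
df <- data.frame(
  flight1 = c(0, 12, 5, 7, 8),
  flight2 = c(41, 31, 30, 35, 3),
  flight3 = c(28, 16, 242, 63, 0)
)

# Reshape to long format: 'values' holds the delay, 'ind' the flight label
long <- stack(df)

# Kruskal-Wallis test of delay by flight
result <- kruskal.test(values ~ ind, data = long)
print(result)
```

A small p-value would suggest that at least one flight's delay distribution differs from the others. With only five observations per group, any conclusion here should be treated with caution.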
