I recently had to write some PHP code for a ranking system based on quartiles. After an initial Google search, I could not find any code that reliably produced the results of Method 1 from the Wikipedia page on quartiles (take the median of the whole data set, then the medians of the lower and upper halves, leaving the median itself out of both halves when the number of values is odd). I suspect this is because there are many different formulas for quartiles. I also ran across an interesting article on why Excel has multiple quartile functions. Since I could not find existing code that worked for me, I wrote the following code. I hope it helps someone else with the same problem.
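function quartile($arr) {
    // Despite the name, this returns the median of $arr, formatted to two decimals.
    sort($arr); // work on a sorted, zero-indexed copy (PHP passes arrays by value)
    $count = count($arr);
    if ($count === 0) {
        return null; // no values, no median
    }
    $middleval = (int) floor(($count - 1) / 2); // index of the middle value, or of the lower middle value
    if ($count % 2) { // odd count: the middle value is the median
        $median = $arr[$middleval];
    } else { // even count: average the two middle values
        $low = $arr[$middleval];
        $high = $arr[$middleval + 1];
        $median = ($low + $high) / 2;
    }
    return number_format((float) $median, 2, '.', '');
}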
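// Sample data, already sorted in ascending order.
$values = array(1.22, 1.29, 2.00, 2.17, 2.38, 2.43, 2.44, 2.50, 2.56, 2.57, 3.00, 3.00, 3.00, 3.13, 3.38, 3.50, 3.71);
$second = quartile($values); // second quartile: the median of the whole set
$tmp = array('first' => array(), 'third' => array());
foreach ($values as $val) {
    if ($val > $second) {
        $tmp['third'][] = $val; // above the median: upper half
    } elseif ($val < $second) {
        $tmp['first'][] = $val; // below the median: lower half
    }
    // values equal to the median go into neither half
}
$first = quartile($tmp['first']); // first quartile: median of the lower half
$third = quartile($tmp['third']); // third quartile: median of the upper half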
So the function just returns the median of a sorted list of values. I first find the median of the entire array, which is the second quartile. I then loop through the array and build two new arrays, one for the values below the second quartile and one for the values above it. Once those arrays are built, I take the median of each to get the first and third quartiles. Note that values equal to the second quartile go into neither array, so data with several values tied at the median can give different results from a strict positional split.
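For comparison, here is a minimal sketch of the same idea that splits the sorted array by position instead of by value, so ties at the median are handled the way Method 1 describes. The helper names `median_sorted` and `quartiles_method1` are hypothetical and not part of the code above; the sketch assumes PHP 7 or later (for `intdiv` and type declarations) and returns unrounded floats rather than formatted strings.

```php
<?php
// Hypothetical helpers for illustration only; they are not from the original post.

// Median of a non-empty array that is already sorted and zero-indexed.
function median_sorted(array $sorted): float
{
    $n = count($sorted);
    $mid = intdiv($n, 2);
    return ($n % 2 === 1)
        ? (float) $sorted[$mid]                      // odd count: the middle value
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;   // even count: mean of the two middle values
}

// Quartiles by position: the lower and upper halves each hold floor(n / 2)
// values, so the median itself is left out of both halves when n is odd.
function quartiles_method1(array $values): array
{
    if (count($values) < 2) {
        throw new InvalidArgumentException('Need at least two values.');
    }
    sort($values); // sorts a local copy and reindexes it from 0
    $n = count($values);
    $half = intdiv($n, 2);
    $lower = array_slice($values, 0, $half);
    $upper = array_slice($values, $n - $half);
    return [
        'q1' => median_sorted($lower),
        'q2' => median_sorted($values),
        'q3' => median_sorted($upper),
    ];
}

// Data set used in Wikipedia's quartile examples; Method 1 gives 15, 40 and 43.
$q = quartiles_method1([6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]);
```

With tied values such as 1, 2, 2, 2, 3, this positional split keeps a 2 in each half (giving 1.5 and 2.5), whereas the value-based loop above would leave only 1 and 3 in the two halves.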