How to generate a histogram with Perl
I couldn't find a histogram library for Perl, so I had to write my own.
Save the following code in histogram.pl:
use POSIX qw(floor);
# No bugs, please
use strict;
use warnings;

# Perl has no built-in round(), so implement one (rounds half away from zero)
sub round
{
    my ($number) = @_;
    return int($number + .5 * ($number <=> 0));
}

# Print a text histogram of @list using bins of width $bin_width
sub histogram
{
    my ($bin_width, @list) = @_;

    # Count the values in each bin; bin $i covers
    # $i * $bin_width up to (but not including) ($i + 1) * $bin_width
    my %histogram;
    $histogram{floor($_ / $bin_width)}++ for @list;

    # Find the lowest and highest bin index that contains data
    my $max;
    my $min;
    while ( my ($key, $value) = each(%histogram) )
    {
        $max = $key if !defined($max) || $key > $max;
        $min = $key if !defined($min) || $key < $min;
    }

    # Print one row per bin, including empty bins between $min and $max
    for (my $i = $min; $i <= $max; $i++)
    {
        my $bin = sprintf("% 10d", $i * $bin_width);
        my $frequency = $histogram{$i} || 0;
        $frequency = "#" x $frequency;
        print $bin." ".$frequency."\n";
    }
    print "===============================\n\n";
    print " Width: ".$bin_width."\n";
    # Range is reported as bin indices, not data values
    print " Range: ".$min."-".$max."\n\n";
}
To generate a histogram for a data set, load histogram.pl and call the histogram subroutine with the desired bin width followed by the data as a list:
# Use an explicit ./ because Perl 5.26 and later no longer search the current directory
do('./histogram.pl');
histogram(10, (1,2,3,4,5,10,11,12,20,21,30));
The output of the above example is:
         0 #####
        10 ###
        20 ##
        30 #
===============================

 Width: 10
 Range: 0-3
The histogram shows that the data set contains 5 numbers in the range 0-9, 3 in 10-19, 2 in 20-29, and 1 in 30-39. The Range line reports the lowest and highest bin index (0 to 3), not data values.
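To double-check these counts, here is a minimal sketch that returns the frequencies as data instead of printing bars. The name bin_counts is hypothetical and not part of histogram.pl; it assumes each value belongs to bin floor(value / width) and keys each bin by its lower bound.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper (not part of histogram.pl): returns a hash reference
# mapping each bin's lower bound to the number of values in that bin.
sub bin_counts
{
    my ($bin_width, @list) = @_;
    my %counts;
    $counts{floor($_ / $bin_width) * $bin_width}++ for @list;
    return \%counts;
}

my $bin_width = 10;
my $counts = bin_counts($bin_width, 1, 2, 3, 4, 5, 10, 11, 12, 20, 21, 30);

# Print each bin as "lower-upper: count", sorted numerically by lower bound
for my $lower (sort { $a <=> $b } keys %$counts)
{
    printf("%d-%d: %d\n", $lower, $lower + $bin_width - 1, $counts->{$lower});
}
```

Returning a hash reference rather than printing makes it easy to compare the counts in a test or to feed them to another formatter.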