How to generate a histogram with Perl

I couldn't find a histogram library for Perl, so I had to write my own.

Save the following code in

use POSIX qw(ceil floor);

# No bugs, please
use strict;
use warnings;

# Perl has no built-in round, so implement one (rounds half away from zero)
sub round {
    my($number) = shift;
    return int($number + .5 * ($number <=> 0));
}

sub histogram {
  my ($bin_width, @list) = @_;

  # Count the values per bin; bin $i covers the values from
  # $i * $bin_width up to, but not including, ($i + 1) * $bin_width
  my %histogram;
  $histogram{floor($_ / $bin_width)}++ for @list;

  my $max;
  my $min;

  # Find the lowest and highest bin index that occur in the data
  while ( my ($key, $value) = each(%histogram) ) {
    $max = $key if !defined($max) || $key > $max;
    $min = $key if !defined($min) || $key < $min;
  }

  # Print one row per bin, including empty bins between min and max
  for (my $i = $min; $i <= $max; $i++) {
    my $bin       = sprintf("% 10d", ($i) * $bin_width);
    my $frequency = $histogram{$i} || 0;

    $frequency = "#" x $frequency;

    print $bin." ".$frequency."\n";
  }

  print "===============================\n\n";
  print "    Width: ".$bin_width."\n";
  print "    Range: ".$min."-".$max."\n\n";
}

To generate a histogram for a data set, include the histogram subroutine in your script and call it with the bin width as the first argument, followed by the values:


histogram(10, (1,2,3,4,5,10,11,12,20,21,30));

The output of the above example is:

         0 #####
        10 ###
        20 ##
        30 #
===============================

    Width: 10
    Range: 0-3

The histogram shows that the data set contains 5 numbers in the range 0-9, 3 in 10-19, 2 in 20-29, and 1 in 30-39. Note that the Range line reports bin indexes (0 to 3), not data values.
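
If you need the counts as data rather than as printed bars, the same binning idea can be sketched as a function that returns a hash. This is only an illustration: bin_counts is a hypothetical helper name that does not appear in the script above, and keying the hash by each bin's lower bound is an assumption made here for readability.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper (not part of the original script): returns a hash
# that maps each bin's lower bound to the number of values in that bin.
sub bin_counts {
    my ($bin_width, @values) = @_;
    die "bin width must be positive\n" unless $bin_width > 0;

    my %counts;
    $counts{ floor($_ / $bin_width) * $bin_width }++ for @values;
    return %counts;
}

# Same data set as in the example above
my $width  = 10;
my %counts = bin_counts($width, 1, 2, 3, 4, 5, 10, 11, 12, 20, 21, 30);

# Print the bins in ascending order as "low-high: count"
printf "%d-%d: %d\n", $_, $_ + $width - 1, $counts{$_}
    for sort { $a <=> $b } keys %counts;
```

Returning a hash keeps counting separate from printing, so the same counts can feed a report or a different output format.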

Perl script that can be used to calculate min, max, mean, mode, median and standard deviation for a set of log records

The best thing about this script is that it's easy to customize. As written, it expects semicolon-delimited records of the form timestamp;page name;runtime.

use strict;
use warnings;

# Import stdev, average, mean and other statistical functions
# A copy of
# Assumption: the original import line is missing. min, max, mean, mode,
# median and stddev must be in scope; Statistics::Lite is one CPAN module
# that exports functions with these names.
use Statistics::Lite qw(min max mean mode median stddev);

my %page_runtimes;
my $delimitor = ';';
my @columns = ("page", "samples", "min", "max", "mean", "mode", "median", "stddev\n");
my $line;
my ($first_timestamp, $last_timestamp);

# ==========================================
# Parse log file
# ==========================================

# Don't use foreach as it reads the whole file into memory: foreach $line (<>) { 
while ($line=<>) {
  # remove the newline from $line, otherwise the report will be corrupted.
  chomp($line);

  my @fields                = split($delimitor, $line);
  my $timestamp             = $fields[0];
  my $page_name             = $fields[1];
  my $page_runtime          = $fields[2];

  # remember the timestamp of the first record only
  if (!defined($first_timestamp)) {
    $first_timestamp = $timestamp;
  }

  # print what we find, once per page
  if (!exists($page_runtimes{$page_name})) {
    print "Found page '$page_name'\n";
  }
  # add page runtimes to one hash
  push(@{$page_runtimes{$page_name}}, $page_runtime);
  $last_timestamp = $timestamp;
}

# ==========================================
# Calculate and print page statistics
# ==========================================
open(my $page_report, '>', 'report.csv') or die("Could not open report.csv: $!");

print $page_report "First sample\n".$first_timestamp."\nLast sample\n".$last_timestamp."\n\n";
print $page_report join($delimitor, @columns);

for my $page_name (keys %page_runtimes ) {
  my @runtimes = @{$page_runtimes{$page_name}};
  my $samples = @runtimes;
  my $min     = min(@runtimes);
  my $max     = max(@runtimes);
  my $mean    = mean(@runtimes);
  my $mode    = mode(@runtimes);
  my $median  = median(@runtimes);
  my $stddev  = stddev(@runtimes);
  my @data = ($page_name, $samples, $min, $max, $mean, $mode, $median, $stddev);
  my $line = join($delimitor, @data);
  # Use a comma as the decimal separator (this also replaces dots in page names)
  $line =~ s/\./\,/g;
  print $page_report "$line\n";
}

close($page_report);

To use it, pipe some data into it like this:

grep "2008-31-12" silly-data.log | perl
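
The script relies on mean, median, mode and stddev being defined elsewhere. If no statistics module is at hand, they could be sketched in plain Perl roughly as follows. This is an illustration under stated assumptions, not the module the script was written against: stddev here is the sample standard deviation (dividing by n - 1), and mode breaks ties by picking the smallest value. Both choices may differ from what the original helpers did.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Arithmetic mean of a non-empty list
sub mean {
    my @values = @_;
    return sum(@values) / @values;
}

# Median: the middle value, or the average of the two middle values
sub median {
    my @sorted = sort { $a <=> $b } @_;
    my $mid    = int(@sorted / 2);
    return @sorted % 2
        ? $sorted[$mid]
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

# Mode: the most frequent value (assumption: ties go to the smallest value)
sub mode {
    my %seen;
    $seen{$_}++ for @_;
    my $highest = max(values %seen);
    my ($mode)  = sort { $a <=> $b } grep { $seen{$_} == $highest } keys %seen;
    return $mode;
}

# Sample standard deviation (assumption: n - 1 in the denominator);
# returns 0 when there are fewer than two values
sub stddev {
    my @values = @_;
    return 0 if @values < 2;
    my $mean   = mean(@values);
    my $sum_sq = sum(map { ($_ - $mean) ** 2 } @values);
    return sqrt($sum_sq / (@values - 1));
}

# Made-up runtimes, for illustration only
my @runtimes = (120, 80, 100, 80, 120, 100, 80);
printf "mean=%.2f median=%s mode=%s stddev=%.2f\n",
    mean(@runtimes), median(@runtimes), mode(@runtimes), stddev(@runtimes);
```

min and max are available from the core List::Util module, so only these four helpers need to be written by hand.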

How to use the Perl DBI module

Basic usage

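use strict;
use warnings;
use DBI;
use DBD::mysql;

my $host = 'localhost';
my $database = 'xxx';
my $user = 'xxx';
my $password = '';

my $dsn = "dbi:mysql:$database:$host:3306";
my $db = DBI->connect($dsn, $user, $password)
  or die("Could not connect: $DBI::errstr");

# Placeholder values for the row to insert
my $name = 'xxx';
my $instructions = 'xxx';

my $sql = q(
  INSERT INTO
    what (name, instructions)
  VALUES (?, ?)
);

my $p = $db->prepare($sql);

# execute returns the number of affected rows
my $result = $p->execute($name, $instructions);

print $result;

# id that MySQL generated for the inserted row
my $id = $db->{'mysql_insertid'};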

One-liner for selecting one row

my $c = 'Horse';
my ($id, $instructions) = $db->selectrow_array("select id, instructions from categories where name = ?", undef, $c);
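
When a query can return more than one row, a prepared statement with a fetch loop is the usual counterpart to selectrow_array. The following is a minimal sketch: categories_starting_with is a hypothetical helper name, and it assumes the categories table with the id, name and instructions columns used above, plus an already connected DBI handle passed in as $db.

```perl
use strict;
use warnings;

# Hypothetical helper: returns all categories whose name starts with $prefix.
# Assumes $db is a connected DBI database handle.
sub categories_starting_with {
    my ($db, $prefix) = @_;

    my $sth = $db->prepare(
        "select id, name, instructions from categories where name like ? order by name"
    );
    $sth->execute($prefix . '%');

    my @rows;
    while (my $row = $sth->fetchrow_hashref) {
        # Copy the row: the DBI documentation warns that a future version
        # may return the same hash reference for every row
        push @rows, { %$row };
    }
    return @rows;
}
```

With a live connection this could be called as: print "$_->{id}: $_->{name}\n" for categories_starting_with($db, 'Ho');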

How to pipe input to a Perl script

Let's say you want to pipe some input to a Perl script. First, you create this Perl script (

while (<>) {
  print $_;
}

Then you call the script like this:

less access.log | perl

The script prints the contents of access.log. To do some real work, extend it with your own code; for example, you might want to analyze an Apache access log.
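
As one possible extension, the loop can count HTTP status codes. This is a sketch only: count_statuses is a hypothetical helper name, the log lines are made up, and the regular expression assumes the Common Log Format, in which the status code follows the quoted request string.

```perl
use strict;
use warnings;

# Hypothetical helper: counts HTTP status codes in the lines read from $fh.
# Assumes the Common Log Format, e.g.
#   ... "GET /index.html HTTP/1.1" 200 2326
sub count_statuses {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        # The status is the three-digit number right after the closing quote
        $count{$1}++ if $line =~ /" (\d{3}) /;
    }
    return %count;
}

# Demo on two made-up log lines held in memory
my $sample = join '',
    qq(127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326\n),
    qq(127.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "GET /missing HTTP/1.1" 404 209\n);
open(my $fh, '<', \$sample) or die "Cannot open in-memory handle: $!";

my %count = count_statuses($fh);
print "$_: $count{$_}\n" for sort keys %count;
```

To process piped input instead of the sample, pass \*STDIN to count_statuses. Taking a filehandle as the argument keeps the counting logic usable with files, pipes and in-memory data.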

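You can also loop over the input with foreach. Note that foreach reads the entire input into memory before the first iteration, so prefer while for large files:

foreach my $line (<>) {
  print $line;
}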