Bulk renaming of files
Rename the files in a directory by replacing a space with an underscore. The rename program comes with most modern Linux distros.
rename 's/\ /_/g' *.*
Rename the files in a directory by replacing a space with an underscore. The rename program comes with most modern Linux distros.
rename 's/\ /_/g' *.*
Let's say you want to pipe some input to a Perl script. First, you create this Perl script (pipe_me.pl):
while (<>)
{
print $_;
}
Then you call the script like this:
less access.log | perl pipe_me.pl
The script outputs the contents of access.log. To do some real work extend it with your own code--you might want to, for example, analyze an Apache access log.
You can also read the input line by line like this:
foreach $line (<>)
{
print $line;
}
I couldn't find a histogram library for Perl, so I had to write my own.
Save the following code in histogram.pl:
use POSIX qw(ceil floor);
# No bugs, please
use strict;
use warnings;
# Perl doesn't have round, so let's implement it
sub round
{
my($number) = shift;
return int($number + .5 * ($number <=> 0));
}
sub histogram
{
my ($bin_width, @list) = @_;
# This calculates the frequencies for all available bins in the data set
my %histogram;
$histogram{ceil(($_ + 1) / $bin_width) -1}++ for @list;
my $max;
my $min;
# Calculate min and max
while ( my ($key, $value) = each(%histogram) )
{
$max = $key if !defined($min) || $key > $max;
$min = $key if !defined($min) || $key < $min;
}
for (my $i = $min; $i <= $max; $i++)
{
my $bin = sprintf("% 10d", ($i) * $bin_width);
my $frequency = $histogram{$i} || 0;
$frequency = "#" x $frequency;
print $bin." ".$frequency."\n";
}
print "===============================\n\n";
print " Width: ".$bin_width."\n";
print " Range: ".$min."-".$max."\n\n";
}
To generate a histogram for a set of data include the histogram subroutine and pass the desired width of the bins to the routine and the dataset as an array:
do('histogram.pl');
histogram(10, (1,2,3,4,5,10,11,12,20,21,30));
The output of the above example is:
0 #####
10 ###
20 ##
30 #
===============================
Width: 10
Range: 0-3
The generated histogram tells us that there are: 5 numbers between 0-9, 3 between 10-19, 2 between 20-29, 1 between 30-39
First add the following to your Nginx configuration file:
location / {
auth_basic "Restricted";
auth_basic_user_file /etc/nginx/htpasswd;
}
Then create the htpasswd file:
# this be passwords
thisbetheusername:thisbeencryptedpass:yercomment
To generate a htpasswd password without installing Apache you can use the following Perl or Ruby code:
Perl
perl -le 'print crypt("password", "salt")'
Ruby (run in irb)
"password".crypt("salt")
The crypt() method uses 56-bit DES encryption, which is used in /etc/passwd and htpasswd.
The best thing about this script is that it's easy to customize, right now it's optimized for comma delimited data.
use strict;
use warnings;
# Import stdev, average, mean and other statistical functions
# A copy of http://search.cpan.org/~brianl/Statistics-Lite-3.2/Lite.pm
do('stats.pl');
my %page_runtimes;
my $delimitor = ';';
my @columns = ("page", "samples", "min", "max", "mean", "mode", "median", "stddev\n");
my $line;
my $first_timestamp, my $last_timestamp;
# ==========================================
# Parse log file
# ==========================================
#
# Don't use foreach as it reads the whole file into memory: foreach $line (<>) {
#
while ($line=<>) {
# remove the newline from $line, otherwise the report will be corrupted.
chomp($line);
my @columns = split(';', $line);
my $timestamp = $columns[0];
my $page_name = $columns[1];
my $page_runtime = $columns[2];
if(!defined($first_timestamp))
{
$first_timestamp = $timestamp;
}
# print what we find
if(!defined(@{$page_runtimes{$page_name}}))
{
print "Found page '$page_name'\n";
}
# add page runtimes to one hash
push(@{$page_runtimes{$page_name}}, $page_runtime);
$last_timestamp = $timestamp;
}
# ==========================================
# Calculate and print page statistics
# ==========================================
open(PAGE_REPORT, ">report.csv") or die("Could not open report.csv.");
print PAGE_REPORT "First sample\n".$first_timestamp."\nLast sample\n".$last_timestamp."\n\n";
print PAGE_REPORT join($delimitor, @columns);
for my $page_name (keys %page_runtimes )
{
my @runtimes = @{$page_runtimes{$page_name}};
my $samples = @runtimes;
my $min = min(@runtimes);
my $max = max(@runtimes);
my $mean = mean(@runtimes);
my $mode = mode(@runtimes);
my $median = median(@runtimes);
my $stddev = stddev(@runtimes);
my @data = ($page_name, $samples, $min, $max, $mean, $mode, $median, $stddev);
my $line = join($delimitor, @data);
# Use comma instead of decimal
$line =~ s/\./\,/g;
print PAGE_REPORT "$line\n";
}
close(PAGE_REPORT);
To use it simply pipe some data into it like this:
grep "2008-31-12" silly-data.log | perl analyze.pl
use strict;
use DBI;
use DBD::mysql;
my $host = 'localhost';
my $database = 'xxx';
my $user = 'xxx';
my $password = '';
my $dsn = "dbi:mysql:$database:$host:3306";
my $db = DBI->connect($dsn, $user, $password);
my $sql = q(
INSERT INTO
what (name, instructions)
VALUES (?, ?)
);
my $p = $db->prepare($sql);
my $result = $p->execute($name, $instructions);
print $result;
my $id = $dbh->{'mysql_insertid'};
my $c = 'Horse';
my ($id, $instructions) = $db->selectrow_array("select id, instructions from categories where name = ?", undef, $c);
This example demonstrates how to inspect and filter POST parameters sent through nginx to a backend server with Perl:
nginx => filter (perl) => backend
This could also be done with the Lua module.
nginx.conf:
http {
perl_modules /tmp;
perl_require filter.pm;
server {
# The action we want to filter
location = /bananas/create {
perl filter::handler;
}
# The backend server
location = /backend {
internal; # Only allowed internally
proxy_pass http://backend/bananas/create;
}
# The action that handles blocked requests
location = /blocked {
internal;
perl filter::block;
}
}
/tmp/filter.pm:
package filter;
use nginx;
# Handle request
sub handler {
my $r = shift;
if ($r->has_request_body(\&checkRequest)) {
return OK;
}
return OK;
}
# Print a message
sub blocked {
my $r = shift;
$r->send_http_header("text/html");
$r->print("Blocked...\n<br/>");
$r->rflush;
return OK;
}
# Check request for invalid parameter
sub checkRequest {
my $r = shift;
my $body = $r->request_body;
# Try to detect invalid POST parameters
if ($body =~ /credit_card=/) {
$r->internal_redirect("/blocked"); # Redirect to /blocked
} else {
$r->internal_redirect("/backend"); # Redirect to /backend
}
return OK;
}
1;
__END__
find . -name "*.yaml" | xargs perl -p -i -e "s/foo/buzz/g"
find . -type f -name "*.yaml" -print0 | xargs -0 sed -i '' -e 's/foo/bar/g'
I recommend installing and using repren if you need a simpler solution: https://github.com/jlevy/repren
Reference: