regexp snippets

Bulk renaming of files

Tagged rename, regexp, bulk, filename, bash, perl, linux  Languages bash

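Rename the files in the current directory by replacing every space in their names with an underscore. This uses the Perl-based rename program, which ships with many Linux distros; some distros install the util-linux rename instead, which does not accept Perl expressions. Note that the *.* glob only matches names that contain a dot.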

rename 's/\ /_/g' *.*
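Where the Perl-based rename is not installed, the same substitution can be sketched in plain bash with parameter expansion. The function name rename_spaces and its "dry" argument are made up for this illustration; they are not part of any standard tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: renames every entry in the given directory whose name
# contains a space, replacing each space with an underscore.
# Pass "dry" as the second argument to only print the planned renames.
rename_spaces() {
  local dir=$1 mode=${2:-run} path base target
  for path in "$dir"/*' '*; do
    [[ -e $path ]] || continue          # the glob matched nothing
    base=${path##*/}                    # strip the directory part
    target=$dir/${base// /_}            # replace ALL spaces in the name
    if [[ $mode == dry ]]; then
      echo "$base -> ${target##*/}"
    else
      mv -n -- "$path" "$target"        # -n: never overwrite an existing file
    fi
  done
}
```

Calling rename_spaces . dry first shows what would change; rename_spaces . then performs the renames.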

Validating an email address in Java

Tagged email, validator, regexp, java, rfc 2822  Languages java

The regexp comes from the site Email unlimited. According to Wikipedia, the regexp on that page validates an email address against RFC 2822 (Internet Message Format). I have yet to write a comprehensive test suite, but the tests I do have for this validator pass.

import java.util.regex.Pattern;

public class EmailValidator {
  // Compiled once and reused; Pattern instances are immutable and thread-safe.
  private static final Pattern EMAIL_PATTERN = Pattern.compile(
      "^[-!#$%&'*+/0-9=?A-Z^_a-z{|}~](\\.?[-!#$%&'*+/0-9=?A-Z^_a-z{|}~])*@[a-zA-Z](-?[a-zA-Z0-9])*(\\.[a-zA-Z](-?[a-zA-Z0-9])*)+$");

  // Returns true if emailAddress matches the pattern; null is treated as invalid.
  public static boolean validate(final String emailAddress) {
    if (emailAddress == null) return false;
    return EMAIL_PATTERN.matcher(emailAddress).matches();
  }
}

Removing HTML tags from a string in Ruby

Tagged regexp, ruby, removing html tags  Languages ruby

I don't take credit for the regexp; it comes from Mastering Regular Expressions by Jeffrey E. F. Friedl.

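def remove_html_tags
  # Matches one tag; quoted attribute values may themselves contain '>'.
  re = /<("[^"]*"|'[^']*'|[^'">])*>/
  # gsub! edits the strings in place.
  self.title.gsub!(re, '')
  self.description.gsub!(re, '')
end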

How to extract numbers from a string in Bash scripts

Tagged bash, extract, number, regexp, rematch  Languages bash

To extract numbers from a string in a Bash script, you can use the =~ regex operator together with the BASH_REMATCH array. You don’t need grep, sed, or awk.

Add this to script.sh (remember to run chmod +x script.sh):

#!/usr/bin/env bash
string=$'COPY 23845\n3409'  # $'...' turns \n into a real newline
if [[ $string =~ ^COPY[[:space:]]([0-9]+) ]]; then
  echo "Match: ${BASH_REMATCH[1]}"
else
  echo "No match"
fi

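This prints Match: 23845. The 3409 is ignored because the pattern only captures the digits that directly follow COPY and one whitespace character. The captured text comes from the first capture group, which bash stores in BASH_REMATCH[1].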
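To pick up every number in the string rather than only the first, one possible approach is to loop, matching again on whatever text follows the previous match. This is a minimal sketch; the function name extract_numbers is invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints every run of digits in its argument, one per line.
extract_numbers() {
  local rest=$1
  # Keep the regex in a variable so bash treats it as a pattern, not a literal.
  local re='([0-9]+)(.*)'
  while [[ $rest =~ $re ]]; do
    echo "${BASH_REMATCH[1]}"     # the digits that were just matched
    rest=${BASH_REMATCH[2]}       # continue with the text after the match
  done
}
```

For example, extract_numbers "$string" with the string from the script above prints 23845 and 3409 on separate lines.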