
How to detect traffic from the most common search spiders with Ruby

Ruby posted about 1 month ago by christian
This snippet detects traffic from the following bots, which is enough for me:
  • Google – Googlebot/2.1 (+http://www.googlebot.com/bot.html)
  • Google Image – Googlebot-Image/1.0 (+http://www.googlebot.com/bot.html)
  • MSN Live – msnbot-Products/1.0 (+http://search.msn.com/msnbot.htm)
  • Yahoo – Mozilla/5.0 (compatible; Yahoo! Slurp;)

The code (via):

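   # to_s guards against requests that send no User-Agent header (nil)
   user_agent = request.user_agent.to_s.downcase
   # detect returns the first matching substring, or nil when none matches
   @bot = ['msnbot', 'yahoo! slurp', 'googlebot'].detect { |bot| user_agent.include?(bot) }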

When the Google bot visits your site, @bot will be set to the string ‘googlebot’. If none of the listed bots match, @bot is nil.

If you need to detect more bots than these, then the user-agents.org site contains a list of various user agents for both bots and browsers.
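
A minimal sketch of the same check as a plain method, so it can be tried outside a controller. The method name detect_bot and the constant BOT_SIGNATURES are made up for this example; the substrings are the ones used in the snippet above.

```ruby
# Hypothetical names for illustration; only the substrings come from the snippet.
BOT_SIGNATURES = ['msnbot', 'yahoo! slurp', 'googlebot'].freeze

# Returns the first matching signature, or nil if the user agent matches none.
def detect_bot(user_agent)
  ua = user_agent.to_s.downcase # to_s turns a missing (nil) header into ""
  BOT_SIGNATURES.detect { |signature| ua.include?(signature) }
end

puts detect_bot('Googlebot/2.1 (+http://www.googlebot.com/bot.html)').inspect
puts detect_bot('Mozilla/5.0 (compatible; Yahoo! Slurp;)').inspect
puts detect_bot(nil).inspect
```

Since the match is a plain substring test, any client can fake these user agents, so the result is probably best treated as a hint rather than proof.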

Tagged spider, web crawler, bot, search, user agent, detect

A simple Jabber/XMPP bot that uses the Jabber::Simple library

Ruby posted 7 months ago by christian

First install Jabber::Simple:

   $ sudo gem install xmpp4r-simple -y

On OS X you might get this error when installing xmpp4r-simple and the rdoc dependency:

   make
   gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I.  -fno-common -g -O2  -fno-common -pipe -fno-common  -c callsite.c
   gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I/usr/local/lib/ruby/1.8/i686-darwin8.10.3 -I.  -fno-common -g -O2  -fno-common -pipe -fno-common  -c rcovrt.c
   cc -dynamic -bundle -undefined suppress -flat_namespace  -L"/usr/local/lib" -o rcovrt.bundle callsite.o rcovrt.o  -lruby  -lpthread -ldl -lobjc  
   /usr/bin/ld: /usr/lib/gcc/i686-apple-darwin8/4.0.1/../../../libpthread.dylib unknown flags (type) of section 6 (__TEXT,__dof_plockstat) in load command 0
   /usr/bin/ld: /usr/lib/gcc/i686-apple-darwin8/4.0.1/../../../libdl.dylib unknown flags (type) of section 6 (__TEXT,__dof_plockstat) in load command 0
   /usr/bin/ld: /usr/lib/gcc/i686-apple-darwin8/4.0.1/../../../libobjc.dylib load command 9 unknown cmd field
   /usr/bin/ld: /usr/lib/gcc/i686-apple-darwin8/4.0.1/../../../libSystem.dylib unknown flags (type) of section 6 (__TEXT,__dof_plockstat) in load command 0
   /usr/bin/ld: /usr/lib/libSystem.B.dylib unknown flags (type) of section 6 (__TEXT,__dof_plockstat) in load command 0
   collect2: ld returned 1 exit status
   make: *** [rcovrt.bundle] Error 1

Installing Xcode 3 makes the error go away. Then run the code below to start the bot. Warning: the bot executes the body of every chat message it receives (for example “ls -la”) as a command on the system it runs on:

   require 'rubygems'
   require 'xmpp4r-simple'

   include Jabber
   #Jabber::debug = true

   jid = 'user@server.com'
   pass = 'password'

   jabber = Simple.new(jid, pass)

   loop do
     messages = jabber.received_messages
     messages.each do |message|
       # Skip anything that is not a chat message or has no body
       next unless message.type == :chat && message.body

       # WARNING: runs the message body as a command on this machine.
       # The block form closes the pipe once the output has been read.
       result = IO.popen(message.body) { |io| io.read }

       # Send the command output back as a single string
       jabber.deliver('some.user@gmail.com', result)
     end

     sleep 1
   end
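
The command-execution step can be tried on its own, without a Jabber connection. The sketch below assumes a Unix-like system where echo is available; run_command is a made-up helper name and is not part of Jabber::Simple.

```ruby
# Hypothetical helper for illustration, not part of Jabber::Simple.
# Runs a command and returns whatever it wrote to standard output.
def run_command(command)
  # The block form of IO.popen closes the pipe when the block returns.
  IO.popen(command) { |io| io.read }
rescue SystemCallError => e
  # Raised, for example, when the command cannot be started at all.
  "could not run #{command.inspect}: #{e.message}"
end

puts run_command('echo hello')
```

Returning an error string instead of raising keeps a polling loop like the one above alive when someone sends a command that does not exist. It does nothing to make running arbitrary commands safe.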

To use GTalk from a domain other than gmail, you need to edit the Jabber::Simple source code…

Tagged jabber, xmpp, gmail, gtalk, bot