
How to automatically ping search engines when your sitemap has changed

Ruby posted 2 months ago by christian

I prefer letting cron update sitemaps in the background, and at the end of the script I ping search engines to let them know it’s been updated:

    require 'open-uri'

    # Recreate sitemap goes here

    # Let search engines know about the update
    [ "http://www.google.com/webmasters/tools/ping?sitemap=http://xxx/sitemap.xml",
      "http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=http://xxx/sitemap.xml",
      "http://submissions.ask.com/ping?sitemap=http://xxx/sitemap.xml",
      "http://webmaster.live.com/ping.aspx?siteMap=http://xxx/sitemap.xml" ].each do |url|
      begin
        # URI.open needs Ruby 2.5+; older versions call plain open(url)
        URI.open(url) do |f|
          if f.status[0] == "200"
            puts "Sitemap successfully submitted to #{url}"
          else
            puts "Failed to submit sitemap to #{url}"
          end
        end
      rescue OpenURI::HTTPError, SocketError => e
        # open-uri raises on 4xx/5xx responses instead of returning them
        puts "Failed to submit sitemap to #{url}: #{e.message}"
      end
    end
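
The sitemap address is itself passed as a query parameter, so it is safer to URL-encode it before appending it to each ping URL. The following is a minimal sketch of that idea, not part of the original script: the constant PING_ENDPOINTS and the method ping_urls are hypothetical names, and the endpoints are simply copied from the snippet above (they may no longer be live).

```ruby
require 'uri'

# Ping endpoints copied from the snippet above (assumption: still valid).
PING_ENDPOINTS = [
  'http://www.google.com/webmasters/tools/ping?sitemap=',
  'http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=',
  'http://submissions.ask.com/ping?sitemap=',
  'http://webmaster.live.com/ping.aspx?siteMap='
].freeze

# Hypothetical helper: builds one ping URL per endpoint, with the
# sitemap address percent-encoded so it survives as a single parameter.
def ping_urls(sitemap_url)
  encoded = URI.encode_www_form_component(sitemap_url)
  PING_ENDPOINTS.map { |endpoint| endpoint + encoded }
end

ping_urls('http://xxx/sitemap.xml').each { |url| puts url }
```

The resulting list can be fed to the same each loop used in the script above.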

More about sitemaps: http://en.wikipedia.org/wiki/Sitemaps

Tagged sitemap, ruby, ping, search, google

How to optimize your MephistoBlog powered site's search engine ranking (SEO for MephistoBlog)

Plain Text posted 4 months ago by christian

At Aktagon we use MephistoBlog as a CMS, and I couldn’t find any information on how to optimize a MephistoBlog site for Google, so I’m sharing my notes here.

This tip shows you how to make your pages more search engine friendly.

First, add the title tag, plus the meta description and keywords tags, to your layout’s Liquid template, as shown here:

    <meta name="description" content="{% if article %} {{ article.excerpt }}  {% else %} YOUR DEFAULT SITE DESCRIPTION {% endif %}" />
    <meta name="keywords" content="{% if article %} {% for tag in article.tags %}{{ tag }}, {% endfor %} {% endif %} YOUR DEFAULT KEYWORDS" />
    <title>{% if article %} {{ article.title }} &raquo; {{ site.title }} {% else %} {{ site.title }} &raquo; {{ site.subtitle }} {% endif %}</title>

Remember to replace the default description and keywords placeholders in the meta tags’ content attributes.

Now, whenever you publish an article, simply add an excerpt and some tags to it. The excerpt is used as the meta description and the article’s tags as the meta keywords. Both make Google a bit happier, but the description is by far the more important of the two.
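
For readers who prefer to see the fallback logic outside of Liquid, here is a rough plain-Ruby sketch of what the two meta tags compute. It is not MephistoBlog code: the method names meta_description and meta_keywords are hypothetical, and, as a slight variation on the template, the description falls back to the default whenever the excerpt is blank rather than only when no article is present. It also HTML-escapes the values, which the template above does not do.

```ruby
require 'erb'

# Hypothetical helper: uses the excerpt when there is one,
# otherwise the site-wide default description.
def meta_description(excerpt, default_description)
  text = excerpt.to_s.strip
  text = default_description if text.empty?
  %(<meta name="description" content="#{ERB::Util.html_escape(text)}" />)
end

# Hypothetical helper: article tags first, default keywords last.
def meta_keywords(tags, default_keywords)
  keywords = (Array(tags) + [default_keywords]).join(', ')
  %(<meta name="keywords" content="#{ERB::Util.html_escape(keywords)}" />)
end

puts meta_description('Notes on SEO for MephistoBlog', 'YOUR DEFAULT SITE DESCRIPTION')
puts meta_keywords(%w[seo mephistoblog], 'YOUR DEFAULT KEYWORDS')
```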

Tagged seo, mephistoblog, meta, google, search, keywords

How to detect traffic from the most common search spiders with Ruby

Ruby posted 5 months ago by christian
This snippet detects traffic from the following bots, which is enough for me:
  • Google – Googlebot/2.1 ( http://www.googlebot.com/bot.html)
  • Google Image – Googlebot-Image/1.0 ( http://www.googlebot.com/bot.html)
  • MSN Live – msnbot-Products/1.0 (+http://search.msn.com/msnbot.htm)
  • Yahoo – Mozilla/5.0 (compatible; Yahoo! Slurp;)

The code (via):

    # to_s guards against requests that send no User-Agent header
    user_agent = request.user_agent.to_s.downcase
    @bot = [ 'msnbot', 'yahoo! slurp', 'googlebot' ].detect { |bot| user_agent.include? bot }

When the Google bot visits your site, @bot will contain ‘googlebot’. If none of the listed bots match, @bot is nil.

If you need to detect more bots than these, then the user-agents.org site contains a list of various user agents for both bots and browsers.
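
To try the detection outside of a Rails controller, the same logic can be wrapped in a small standalone method. This is a minimal sketch under that assumption: BOT_SIGNATURES and detect_bot are hypothetical names, and the signature list is the one from the snippet above.

```ruby
# Substrings to look for, taken from the snippet above.
BOT_SIGNATURES = ['msnbot', 'yahoo! slurp', 'googlebot'].freeze

# Hypothetical helper: returns the first matching signature,
# or nil when the user agent is missing or not a known bot.
def detect_bot(user_agent)
  ua = user_agent.to_s.downcase
  BOT_SIGNATURES.detect { |bot| ua.include?(bot) }
end

p detect_bot('Googlebot-Image/1.0')
p detect_bot('Mozilla/5.0 (compatible; Yahoo! Slurp;)')
p detect_bot(nil)
```

Because matching is by substring, Googlebot-Image is reported as ‘googlebot’ as well. Extending the list with entries from user-agents.org only requires adding lowercase substrings to the array.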

Tagged spider, web crawler, bot, search, user agent, detect

Sample thinking-sphinx configuration

Ruby posted 6 months ago by christian

First read this... then this

    draft

Tagged thinking-sphinx, sphinx, search

How to install and use the Sphinx search engine and acts_as_sphinx plugin on Debian Etch

Shell Script (Bash) posted 8 months ago by christian

Inspiration for this snippet was taken from this post on the Sphinx forum, plus this blog post.

Compiling Sphinx

First install the prerequisites:

    sudo aptitude install libmysql++-dev libmysqlclient15-dev checkinstall

Next, download Sphinx and libstemmer, then compile and install everything and the fish:

    cd /usr/local/src

    wget http://sphinxsearch.com/downloads/sphinx-0.9.8-rc2.tar.gz
    tar zxvf sphinx-0.9.8-rc2.tar.gz

    cd sphinx-0.9.8-rc2/

    # Add stemming support for Swedish, Finnish and other fun languages.
    wget http://snowball.tartarus.org/dist/libstemmer_c.tgz
    tar zxvf libstemmer_c.tgz

    ./configure --with-libstemmer
    make

    # Installing under /usr/local normally requires root
    sudo make install

Configure Sphinx

Create a sphinx.conf file in your Rails config directory, as described here, or use this template.

Install acts_as_sphinx plugin

    ./script/plugin install http://svn.datanoise.com/acts_as_sphinx

Add acts_as_sphinx to your model:

    class Documents < ActiveRecord::Base
      acts_as_sphinx
    end

Indexing content

    rake sphinx:index

    (in /var/www/xxx.com/releases/20080429144230)
    Sphinx 0.9.8-rc2 (r1234)
    Copyright (c) 2001-2008, Andrew Aksyonoff

    using config file './sphinx.conf'...
    indexing index 'xxx.com'...
    collected 5077 docs, 0.6 MB
    sorted 0.1 Mhits, 100.0% done
    total 5077 docs, 632096 bytes
    total 0.160 sec, 3950427.25 bytes/sec, 31729.86 docs/sec

Reindexing content

Don’t run rake sphinx:index while the searchd process is running. Use rake sphinx:rotate instead, which restarts the searchd process after indexing.
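
If indexing is triggered from a script, the choice between the two rake tasks can be automated by checking whether searchd is alive. The sketch below is only an illustration: the method names and the pid file path log/searchd.pid are assumptions (the real location depends on the pid_file setting in your sphinx.conf), while the task names are the ones used above.

```ruby
# Hypothetical helper: true when the pid file names a live process.
def searchd_running?(pid_file)
  return false unless File.exist?(pid_file)
  pid = File.read(pid_file).to_i
  return false if pid <= 0
  Process.kill(0, pid) # signal 0 only checks that the process exists
  true
rescue Errno::ESRCH
  false # stale pid file, no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

# Hypothetical helper: picks the rake task that is safe to run right now.
def sphinx_index_task(pid_file)
  searchd_running?(pid_file) ? 'sphinx:rotate' : 'sphinx:index'
end

# Assumed pid file location; adjust to match sphinx.conf.
puts "rake #{sphinx_index_task('log/searchd.pid')}"
```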

Starting the daemon

    # Directories need the execute bit to be entered, so use 775 rather than 664
    mkdir -m 775 /var/log/sphinx
    rake sphinx:start

    (in /var/www/xxx.com/releases/20080429144230)
    Sphinx 0.9.8-rc2 (r1234)
    Copyright (c) 2001-2008, Andrew Aksyonoff

    using config file './sphinx.conf'...
    Sphinx searchd server started.

Searching

    Documents.find_with_sphinx 'why did I write this'

Tagged sphinx, search, acts_as_sphinx, debian, etch, rails, install, libstemmer