
Bulk content download from sites that serve it through a database using attachment IDs

Shell Script (Bash) posted 6 months ago by marko

This needs to be customized for each site. The --content-disposition option is experimental. It may also cause files with the same name to be downloaded, in which case Wget names them abc.jpg, abc.jpg.1, abc.jpg.2 and so on. You can always rename them afterwards :)
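For the renaming afterwards, here is a minimal sketch of a clean-up step to run once the downloads have finished. The function name rename_duplicates and the abc_1.jpg naming scheme are my own assumptions, not part of the original snippet. It only handles names of the form name.ext.N, so check the results on your own files.

```shell
#!/bin/bash
# Hypothetical helper: turn Wget's numbered duplicates such as
# abc.jpg.1 into abc_1.jpg so the file extension ends up last again.
rename_duplicates() {
    local dir="$1" file name n base stem ext target
    for file in "$dir"/*.*.[0-9]*; do
        [ -e "$file" ] || continue          # glob matched nothing
        name=${file##*/}                    # abc.jpg.1
        n=${name##*.}                       # 1
        case $n in
            ''|*[!0-9]*) continue ;;        # suffix is not purely numeric
        esac
        base=${name%.*}                     # abc.jpg
        stem=${base%.*}                     # abc
        ext=${base##*.}                     # jpg
        target="$dir/${stem}_${n}.${ext}"
        [ -e "$target" ] && continue        # never overwrite an existing file
        mv -- "$file" "$target"
    done
}

# Run only when a directory is given on the command line.
if [ $# -gt 0 ]; then
    rename_duplicates "$1"
fi
```

Saved as, say, rename_dups.sh (a made-up file name), it would be called as ./rename_dups.sh "$HOME/thecontent". The download script itself follows.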

    #!/bin/bash
    # Adjust the URLs and the page range to match the target site.
    destination_dir="$HOME/thecontent"
    mkdir -p "$destination_dir"
    for page in 0 1 2 3 4 5 6 7 8 9; do
        # Fetch one thread page and pull the attachment ids out of its links.
        attachment_ids=$(wget -O - "http://www.xyz.com/showthread.php?page=77${page}" | grep attachmentid | cut -d'"' -f2 | cut -d'=' -f3 | cut -d'&' -f1)
        for attachment_id in $attachment_ids; do
            # Quote the URL so the shell does not treat ? as a glob character.
            wget --content-disposition --directory-prefix="$destination_dir" "http://www.xyz.com/attachment.php?attachmentid=${attachment_id}"
        done
    done
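
Since the cut fields are what usually needs customizing per site, it can help to test the extraction on a single line first. The sketch below runs the same grep/cut pipeline on a made-up link. The sample HTML and the helper name extract_ids are assumptions for illustration, and real pages will likely differ.

```shell
#!/bin/bash
# Hypothetical helper wrapping the same grep/cut pipeline as the script.
extract_ids() {
    grep attachmentid | cut -d'"' -f2 | cut -d'=' -f3 | cut -d'&' -f1
}

# Made-up example of a link line as it might appear in a thread page.
sample='<a href="attachment.php?s=abc123&amp;attachmentid=12345&amp;d=1300000000">abc.jpg</a>'

# Step by step:
#   cut -d'"' -f2  -> attachment.php?s=abc123&amp;attachmentid=12345&amp;d=1300000000
#   cut -d'=' -f3  -> 12345&amp;d
#   cut -d'&' -f1  -> 12345
printf '%s\n' "$sample" | extract_ids
```

This prints 12345 for the sample line. If a site's links carry the id in a different position, the -f numbers are the part to change.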

Tagged mass download content, wget