I’ve made a few website scrapers over the last few months, and have enjoyed it very much. Something I needed quite often was to extract all the objects (images, CSS files, etc.) that a website refers to.

This is the function I used to get images from an HTML document.

  function get_images($html)
  {
    $images = array();
    // Grab every img="..." or src="..." attribute, up to (not including) the closing quote.
    preg_match_all('/(img|src)\=(\"|\')[^\"\'\>]+/i', $html, $media);
    unset($html);
    // Strip the attribute name and opening quote so only the URL remains.
    $data = preg_replace('/(img|src)(\"|\'|\=\"|\=\')(.*)/i', '$3', $media[0]);
    foreach ($data as $url) {
      $info = pathinfo($url);
      // Keep only URLs with a known image extension (compared case-insensitively).
      if (isset($info['extension']) &&
          in_array(strtolower($info['extension']), array('jpg', 'jpeg', 'gif', 'png'))) {
        $images[] = $url;
      }
    }
    return $images;
  }

This function takes the HTML content as a string, which you can fetch using cURL or file_get_contents. It returns an array of the image URLs it found on that page, exactly as they are written in the markup.
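
For fetching the HTML string in the first place, here is a minimal sketch of both options mentioned above. The helper names (fetch_with_file_get_contents, fetch_with_curl) and the 10-second timeout are my own assumptions for illustration, not part of the original scraper, and error handling is reduced to returning an empty string.

```php
<?php
// Hypothetical helpers for illustration; the names are assumptions.

// Fetch a page with file_get_contents. Remote http(s) URLs need
// allow_url_fopen to be enabled; local file paths also work.
function fetch_with_file_get_contents($source)
{
    $context = stream_context_create(array(
        'http' => array('timeout' => 10), // give up after 10 seconds
    ));
    $html = @file_get_contents($source, false, $context);
    return $html === false ? '' : $html;
}

// Fetch a page with the cURL extension (must be installed and loaded).
function fetch_with_curl($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $html = curl_exec($ch);
    return $html === false ? '' : $html;
}
```

Either return value can be passed straight to get_images, for example get_images(fetch_with_curl($url)). Since the URLs come back as written in the page, relative paths will still need to be resolved against the page's address before downloading.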