If you’ve done the Thirty Day Challenge, you’ll know how tedious it is to download all the PDF files by hand. Norio wrote a cool download script that saved me a lot of time. I was too lazy to sit down and write something like it myself, so I’m stealing his!

<?php
// make sure our download doesn't time out or get interrupted by closing the browser
set_time_limit(0);
ignore_user_abort(1);
// destination to download to
$file_dir = "sites/default/files/30dc";
// create the destination directory (and any missing parent directories) if it doesn't exist
if (!is_dir($file_dir)) mkdir($file_dir, 0777, true);
// go through each day of training (1-31)
for ($i = 1; $i <= 31; $i++) {
  // download the HTML contents of the training page for that day
  if ($page = file_get_contents("http://www.thirtydaychallenge.com/training/2009day".sprintf("%02d", $i).".php")) {
    // provide some feedback on where we are
    echo "<b>Day $i:</b><br />";
    // flush output to browser - see php.net/flush
    flush();
    // directory to download the current day's PDFs to
    $daydir = $file_dir."/day$i";
    // create the directory if it doesn't exist
    if (!is_dir($daydir)) mkdir($daydir);
    // grab all the URLs to the PDFs (regular expressions are awesome!)
    // dots are escaped so they only match a literal "."
    preg_match_all('~(http://media\.thirtydaychallenge\.com\.s3\.amazonaws\.com/training09/([0-9A-Za-z_]+\.pdf))~', $page, $matches);
    // go through each url we grabbed above
    foreach ($matches[1] as $key => $filename) {
      // check if the file already exists in the day's directory (no use in re-downloading PDFs we have)
      if (!file_exists("{$daydir}/{$matches[2][$key]}")) {
        // provide some feedback on where we are
        echo "Downloading {$matches[2][$key]}.<br />";
        // flush output to browser
        flush();
        // download the pdf and store it locally (skip it if the download fails, so no empty file is left behind)
        if (($pdf = file_get_contents($matches[1][$key])) !== false) file_put_contents("{$daydir}/{$matches[2][$key]}", $pdf);
      }
    }
  }
}
?>
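
The link-grabbing step is the heart of the script, so here is a minimal sketch of it on its own. The function name extract_pdf_links and the sample markup are made up for illustration and are not part of Norio's script. It assumes PHP 7 or later and uses the same S3 URL pattern as above.

```php
<?php
// Hypothetical helper (not from the original script): returns the PDF links
// found in a chunk of HTML as an array of filename => full URL.
function extract_pdf_links(string $html): array
{
    // Same pattern as the script above, with the dots escaped
    $pattern = '~http://media\.thirtydaychallenge\.com\.s3\.amazonaws\.com/training09/([0-9A-Za-z_]+\.pdf)~';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $match) {
        // $match[0] is the full URL, $match[1] is the bare filename;
        // keying by filename drops links that appear twice on a page
        $links[$match[1]] = $match[0];
    }
    return $links;
}

// Made-up sample markup, for illustration only
$sample = '<a href="http://media.thirtydaychallenge.com.s3.amazonaws.com/training09/Day01_Notes.pdf">Notes</a>';
print_r(extract_pdf_links($sample));
```

Keying the result by filename means each PDF is only downloaded once per page, even if the page links to it more than once.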

Check out the full post at Boff.co.za