If you’ve done the Thirty Day Challenge, you’ll know how tedious it can be to download all the PDF files. Norio created a cool downloading script that saved me a lot of time. I was just too lazy to sit and write something like it, so I’m stealing his!
<?php
// make sure our download doesn't time out or get interrupted by closing the browser
set_time_limit(0);
ignore_user_abort(true);

// destination to download to
$file_dir = "sites/default/files/30dc";

// create the destination directory (and any missing parent directories) if it doesn't exist
if (!is_dir($file_dir)) mkdir($file_dir, 0777, true);

// go through each day of training (1-31)
for ($i = 1; $i <= 31; $i++) {
  // download the HTML contents of the training page for that day
  if ($page = file_get_contents("http://www.thirtydaychallenge.com/training/2009day".sprintf("%02d", $i).".php")) {
    // provide some feedback on where we are
    echo "<b>Day $i:</b><br />";
    // flush output to browser - see php.net/flush
    flush();

    // directory to download the current day's PDFs to
    $daydir = $file_dir."/day$i";
    // create the directory if it doesn't exist
    if (!is_dir($daydir)) mkdir($daydir);

    // grab all the URLs to the PDFs (regular expressions are awesome!)
    // the dots are escaped so they only match literal dots
    preg_match_all('~(http://media\.thirtydaychallenge\.com\.s3\.amazonaws\.com/training09/([0-9A-Za-z_]+\.pdf))~', $page, $matches);

    // go through each url we grabbed above
    foreach ($matches[1] as $key => $url) {
      // local path this PDF will be saved to
      $target = "{$daydir}/{$matches[2][$key]}";

      // check if the file already exists in the day's directory (no use in re-downloading PDFs we have)
      if (!file_exists($target)) {
        // provide some feedback on where we are
        echo "Downloading {$matches[2][$key]}.<br />";
        // flush output to browser
        flush();

        // download the pdf and store it locally, skipping it if the download failed
        $pdf = file_get_contents($url);
        if ($pdf !== false) file_put_contents($target, $pdf);
      }
    }
  }
}
?>
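If you want to see what the regular expression in the script actually picks up before letting it loose on the real site, here is a small sketch that runs the same kind of pattern against a hard-coded string. The function name `extract_pdf_links` and the sample file names are my own placeholders, not part of Norio's script or real course files.

```php
<?php
// Hypothetical helper (not in the original script): returns an array of
// [file name => full URL] for every training PDF link found in $html.
function extract_pdf_links(string $html): array {
    $pattern = '~http://media\.thirtydaychallenge\.com\.s3\.amazonaws\.com/training09/([0-9A-Za-z_]+\.pdf)~';
    preg_match_all($pattern, $html, $matches);
    // $matches[0] holds the full URLs, $matches[1] the bare file names.
    // array_combine also collapses links that appear more than once.
    return array_combine($matches[1], $matches[0]);
}

// Made-up sample markup; the PDF names are placeholders for illustration only.
$sample = '<a href="http://media.thirtydaychallenge.com.s3.amazonaws.com/training09/Day01_Notes.pdf">Notes</a>'
        . '<a href="http://media.thirtydaychallenge.com.s3.amazonaws.com/training09/Day01_Mindmap.pdf">Mindmap</a>'
        . '<a href="http://example.com/other.pdf">Unrelated</a>';

$links = extract_pdf_links($sample);
foreach ($links as $name => $url) {
    echo "$name <- $url\n";
}
```

Only the two links on the media host should come back; the unrelated PDF is ignored because it doesn't match the host and path in the pattern.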

