I’m always interested in new ways to scrape websites, and when I was recently given a project, I thought it would be the perfect time to test out a class I had found, Simple HTML DOM Parser. This is what’s so great about the PHP development community: they share.
So with a few lines of code, you are able to parse a complete HTML page and extract various information from it. Here I will look at getting the links on a website.
require_once 'simple_html_dom.php'; // the library file
$url = "http://www.phpdeveloping.co.za/";
$html = file_get_html($url); // false on failure
if ($html && ($links = $html->find('a'))) {
    foreach ($links as $link) {
        echo $link->href."\r\n";
        echo $link->title."\r\n";
    }
}
As simple as that!
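If the library is not available, the same idea can be sketched with PHP's built-in DOM extension instead. This is a swapped-in technique, not Simple HTML DOM Parser itself: the helper name extractLinks() and the sample markup are made up for illustration, and the snippet assumes the DOM extension is enabled (it usually is by default).

```php
<?php
// Hypothetical helper, not part of Simple HTML DOM Parser:
// collects the href and title of every <a> element in an HTML string.
// Assumes $html is non-empty (DOMDocument::loadHTML rejects an empty string).
function extractLinks(string $html): array
{
    $doc = new DOMDocument();

    // Real-world markup is rarely valid, so keep parser warnings internal.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // getAttribute() returns an empty string when the attribute is absent.
        $links[] = [
            'href'  => $anchor->getAttribute('href'),
            'title' => $anchor->getAttribute('title'),
        ];
    }
    return $links;
}

// Sample markup, invented for this example.
$sample = '<p><a href="/about" title="About us">About</a> <a href="/contact">Contact</a></p>';

foreach (extractLinks($sample) as $link) {
    echo $link['href'] . "\n";
    echo $link['title'] . "\n";
}
```

Working from a string rather than a URL keeps the parsing step separate from the download, so the page can be fetched however you like and the link extraction can be tried out on a small snippet first.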