Web "scraping" the easy way
Quick intro: say you go to the ABC website and want to get the contents of the alert at the top of the page, which is enclosed in a <div> with the class "alertContent".

Solution: phpQuery (http://code.google.com/p/phpquery/). It takes 3 lines of code:

    require_once 'phpQuery.php';

This will include all of the HTML from that div, so you may want to clean this up a little bit. The echo command becomes:

    if (ereg(".*<p>(.*)</p>.*",$doc['.alertContent'],$regs)) {

Based on their current content, this gives me:

    The Season 9 cast has been revealed for <i style="font-size:15px;">Dancing With The Stars!</i><br><a href="/shows/dancing-with-the-stars/cast-announcement/">See all 16 new stars</a>.

Still not 100%, since that URL is relative to the ABC website. I smell another regex coming on:

    if (ereg(".*<p>(.*)</p>.*",$doc['.alertContent'],$regs)) {
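For anyone on a newer PHP, here is a minimal sketch of the two clean-up steps (pull out the paragraph, then make the link absolute). It uses preg_match and preg_replace because ereg was removed in PHP 7.0. It is not the original code: the function names extractParagraph and absolutizeLinks are hypothetical, the base URL http://abc.go.com is an assumption, and the hard-coded $alertHtml string only stands in for whatever $doc['.alertContent'] would return, so the sketch runs without phpQuery.

```php
<?php
// Hypothetical stand-in for the HTML that $doc['.alertContent'] would return.
$alertHtml = '<div class="alertContent"><p>The Season 9 cast has been revealed for '
    . '<i style="font-size:15px;">Dancing With The Stars!</i><br>'
    . '<a href="/shows/dancing-with-the-stars/cast-announcement/">See all 16 new stars</a>.</p></div>';

// Assumed base URL for the ABC site; change it to match the site being scraped.
$baseUrl = 'http://abc.go.com';

/**
 * Return the inner HTML of the first <p> element, or null if there is none.
 */
function extractParagraph(string $html): ?string
{
    // The "s" flag lets "." match newlines; ".*?" stops at the first </p>.
    if (preg_match('#<p>(.*?)</p>#s', $html, $regs)) {
        return $regs[1];
    }
    return null;
}

/**
 * Prefix root-relative href values (starting with a single "/") with $baseUrl.
 */
function absolutizeLinks(string $html, string $baseUrl): string
{
    // (?!/) skips protocol-relative URLs such as //example.com/path.
    return preg_replace('#href="/(?!/)#', 'href="' . rtrim($baseUrl, '/') . '/', $html);
}

$paragraph = extractParagraph($alertHtml);
if ($paragraph !== null) {
    echo absolutizeLinks($paragraph, $baseUrl), "\n";
}
```

The regex only handles double-quoted href attributes that start with a single slash, which is all the alert above needs. Anything messier is probably better handled by walking the links with phpQuery itself.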

