As part of an update to the Five Filters Full-Text RSS service, I’ve been porting some JavaScript code (Arc90’s current version of Readability) to PHP. It contains a lot of DOM manipulation which translates very easily – thanks to PHP5’s DOM support. But one thing I wasn’t able to do was manipulate the DOM tree through the innerHTML property.
In JavaScript, it’s very easy to do. The Mozilla Developer Network’s page on innerHTML gives the following example:
var content = element.innerHTML;
// Returns a string containing the HTML syntax describing all
// of the element's descendants

element.innerHTML = content;
// Removes all of element's descendants, parses the content
// string and assigns the resulting nodes as descendants of
// the element.
Using PHP’s magic getter and setter methods (__get and __set), it’s possible to extend DOMElement to achieve this type of access and manipulation. My attempt at doing it is JSLikeHTMLElement. Here’s an example of how to use it:
require_once 'JSLikeHTMLElement.php';

$doc = new DOMDocument();
$doc->registerNodeClass('DOMElement', 'JSLikeHTMLElement');
$doc->loadHTML('<div><p>Para 1</p><p>Para 2</p></div>');
$elem = $doc->getElementsByTagName('div')->item(0);

// print innerHTML
echo $elem->innerHTML; // prints '<p>Para 1</p><p>Para 2</p>'

// set innerHTML
$elem->innerHTML = 'FF';

// print document (with our changes)
echo $doc->saveXML();
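For anyone curious how the magic methods fit together, here is a minimal sketch of the idea. It is not the actual JSLikeHTMLElement source: the class name InnerHtmlElement is made up for illustration, and the setter assumes the new markup is well-formed XML, because DOMDocumentFragment::appendXML() rejects anything else.

```php
<?php
// Hypothetical sketch of the technique; not the real JSLikeHTMLElement code.
class InnerHtmlElement extends DOMElement
{
    // Runs when an undefined property such as $elem->innerHTML is read.
    public function __get($name)
    {
        if ($name !== 'innerHTML') {
            trigger_error("Undefined property: $name", E_USER_NOTICE);
            return null;
        }
        // Serialize each child node and concatenate the results.
        $html = '';
        foreach ($this->childNodes as $child) {
            $html .= $this->ownerDocument->saveXML($child);
        }
        return $html;
    }

    // Runs when an undefined property such as $elem->innerHTML is assigned.
    public function __set($name, $value)
    {
        if ($name !== 'innerHTML') {
            trigger_error("Undefined property: $name", E_USER_NOTICE);
            return;
        }
        // Remove all existing children first.
        while ($this->firstChild !== null) {
            $this->removeChild($this->firstChild);
        }
        if ($value === '') {
            return;
        }
        // Parse the new markup into a fragment (assumes well-formed XML).
        $fragment = $this->ownerDocument->createDocumentFragment();
        if (!@$fragment->appendXML($value)) {
            trigger_error('innerHTML value is not well-formed XML', E_USER_WARNING);
            return;
        }
        if ($fragment->hasChildNodes()) {
            $this->appendChild($fragment);
        }
    }
}

$doc = new DOMDocument();
$doc->registerNodeClass('DOMElement', 'InnerHtmlElement');
$doc->loadHTML('<div><p>Para 1</p><p>Para 2</p></div>');
$div = $doc->getElementsByTagName('div')->item(0);

echo $div->innerHTML, "\n";          // <p>Para 1</p><p>Para 2</p>
$div->innerHTML = '<em>replaced</em>';
echo $div->innerHTML, "\n";          // <em>replaced</em>
```

Note that in this sketch, if the markup is rejected, the element is left empty. A more forgiving version could fall back to parsing the string with loadHTML() and importing the resulting nodes.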
Download: JSLikeHTMLElement.php. Feedback appreciated.
12 Comments
Hey Keyvan,
I’d be really interested to see your progress on porting the changes of Readability to PHP – I took a look at five filters’ source control (here: http://bazaar.launchpad.net/~keyvan/fivefilters/content-only/files ) but that source looks pretty out of date. Do you have any plans on putting your work up anytime soon? Great work so far!
Chris Dary – Tech Lead on Readability at Arc90
Thanks Chris! Readability 1.6.2 has been ported over and will be available soon (probably over the weekend). It’s being used on http://fivefilters.org/content-only/ right now. I’ll email you when it’s up.
Thanks again for the great work on Readability. I’ve just seen your changes to 1.7 and I’m tempted to squeeze a few of those changes in before I release. :)
Keyvan
Hi Keyvan,
This is exactly what I was looking for! Wasn’t looking forward to writing my own implementation! I’ve integrated it into a project I’m working on and will give all due credits.
Great work!
Kevin: that’s good to hear – glad it helped! :)
Oh great, exactly what I needed – I would have gone crazy if I hadn’t found this :)
Benny: glad it helped. :)
You just saved my life… :) Thanks!
Genius! Works like a charm. Thanks!
Thanks a lot, your function saved my life :)
Works very fine for me :)
Hi, Thanks for your post. I cannot download the script you provided. When I clicked the link it redirected me to the homepage. Can I have the script? Thank you.
Hi Terry, sorry about that. BitBucket removed domain mapping from their offering, which broke a bunch of our URLs. I’ve updated the links to point to the correct location.
Very nice! Much easier than other methods cited on the net.