SimplePie is neither simple, nor pie
As I mentioned before, Bloglines sucks. So tonight I was trying to build my own aggregator in an attempt to free myself. As it turns out, freedom can be quite painful. My first stop was to get the basics working. I whipped up a quick OPML parser so I could import my feeds from Bloglines. Once I had that done, I needed a way to fetch the feeds so I could display them. I didn’t care so much about read/unread state (I’ll handle that later); I just wanted to get the very basic guts of a feed reader put together.
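The post doesn't show the OPML parser itself, so here is only a minimal sketch of that import step, written in Python for illustration. The function name and the sample OPML are made up; the one real assumption is that each subscription is an `outline` element carrying an `xmlUrl` attribute, which is how OPML subscription lists are commonly laid out.

```python
import xml.etree.ElementTree as ET


def feeds_from_opml(opml_text):
    """Return (title, feed_url) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() walks nested outlines too, so folders are flattened away.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            # Fall back from title to text to the URL itself.
            title = outline.get("title") or outline.get("text") or url
            feeds.append((title, url))
    return feeds


# Hypothetical sample data, not an actual Bloglines export.
SAMPLE_OPML = """<?xml version="1.0"?>
<opml version="1.0">
  <head><title>Subscriptions</title></head>
  <body>
    <outline text="Tech">
      <outline text="Example Blog" type="rss" xmlUrl="http://example.com/feed.xml"/>
    </outline>
    <outline title="Another Feed" type="rss" xmlUrl="http://example.org/atom.xml"/>
  </body>
</opml>"""

for title, url in feeds_from_opml(SAMPLE_OPML):
    print(title, url)
```

Folder outlines (like "Tech" here) have no `xmlUrl`, so they are skipped and only the feeds themselves come back.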
So now that I’ve got my list of feeds, I need something to fetch the RSS/Atom and parse it. I’ve looked at a couple of PHP feed parsers in the past, but decided to give SimplePie a shot. As it turns out, not such a great idea. SimplePie touts their software as “So easy, even your grandmother could.” Sounded awesome, so I installed it. Seemed pretty simple, really. Download the tarball, unpack it, pull out a single PHP file, include that in your PHP script and then call their library. It was easy…a little too easy.
I first noticed something was up when I output feed items that had embedded images. None of them were showing up. I figured it must be some anti-hotlinking stuff in action, so I browsed over to my feed (which has no hotlink protection). Same result: the images still didn’t show up. I viewed the HTML source and noticed all of my image URLs had been URL encoded and prefixed by “?i=”. This seemed really odd, so I double checked to make sure that wasn’t showing up in the feed XML. It wasn’t. For some reason, SimplePie was doing it. So I did some searches on their site. After a bit of stumbling about, I found this page. Interesting…they built in a mechanism to work around hotlink blocking. As I dug further, it turned out I had to create a whole other page to handle all image requests. That page would act as a proxy: it would fetch the images itself and then serve them back to my feed reader.
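To make the “?i=” symptom concrete, here is a rough Python sketch of this kind of rewrite. It is not SimplePie's code; the function, the regex and the parameter names are hypothetical. It only shows what happens when every image `src` is URL-encoded and hung off a handler page, and what the markup looks like when that handler page is left empty.

```python
import re
from urllib.parse import quote

# Matches the src attribute of an <img> tag (double-quoted values only,
# which is enough for a sketch).
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def rewrite_image_urls(html, handler_page="", param="i"):
    """Point every image at handler_page?param=<url-encoded original URL>."""
    def repl(match):
        encoded = quote(match.group(2), safe="")
        return f"{match.group(1)}{handler_page}?{param}={encoded}{match.group(3)}"
    return IMG_SRC.sub(repl, html)


item = '<p><img src="http://example.com/a.png"></p>'

# No handler page configured: the browser resolves "?i=..." against the
# current page, which does not serve images, so the icon is broken.
print(rewrite_image_urls(item))

# Handler page configured: requests go to the proxy script instead.
print(rewrite_image_urls(item, handler_page="images.php"))
```

With an empty handler page the output contains `src="?i=http%3A%2F%2Fexample.com%2Fa.png"`, which matches what showed up in the page source.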
So I created my image serving page and made the necessary tweaks to the SimplePie configuration to let it know about this new page. Now I ought to be seeing “images.php?i=” instead of “?i=”. I saved my changes and reloaded my browser. Nothing; the images were still broken. So I once again viewed the source. When I looked at the image tags there was still only “?i=”. No reference at all to images.php. What the hell? So I went back to my source and double checked everything, making sure it looked exactly as it did in the SimplePie documentation. Looked fine. Hmm…maybe my browser cache still has the old version of the page. So I did a Shift-reload to bypass the cache. Still nothing. Hmm…cache, oh yeah…SimplePie has a cache. But…they wouldn’t dare cache the content transformations, would they? So I opened up the cache file and sure enough, they were caching the image tag transformations. I removed the cache file, reloaded my browser and voila! The images popped up perfectly. So as it turns out, the SimplePie installation documents are missing this all-important step for anybody who wants to look at something other than broken image icons.
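The stale-cache trap is easy to reproduce in miniature. The Python sketch below is purely illustrative (the class names and the `transform` stand-in are hypothetical, and it says nothing about how SimplePie's cache is actually built): a cache keyed only by feed URL keeps handing back the old transformed markup after the configuration changes, until it is cleared. The second class shows one possible way around that, by making the setting part of the cache key.

```python
def transform(html, handler_page):
    # Stand-in for the image rewriting step.
    return html.replace('src="', f'src="{handler_page}?i=')


class NaiveCache:
    """Caches the *transformed* HTML, keyed only by feed URL."""

    def __init__(self):
        self.store = {}

    def get(self, feed_url, raw_html, handler_page):
        if feed_url not in self.store:
            self.store[feed_url] = transform(raw_html, handler_page)
        return self.store[feed_url]

    def clear(self):
        self.store.clear()


class SettingsAwareCache(NaiveCache):
    """Same idea, but the setting that shapes the output is part of the key."""

    def get(self, feed_url, raw_html, handler_page):
        key = (feed_url, handler_page)
        if key not in self.store:
            self.store[key] = transform(raw_html, handler_page)
        return self.store[key]


raw = '<img src="a.png">'
cache = NaiveCache()
print(cache.get("feed", raw, ""))            # rewritten with no handler page
print(cache.get("feed", raw, "images.php"))  # config changed, output did not
cache.clear()
print(cache.get("feed", raw, "images.php"))  # correct only after clearing
```

Caching the raw feed and applying the transformation on output would be another option; either way, a configuration change would no longer need a manual cache wipe.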
As I thought about it, I realized I really didn’t want to bypass hotlink protection at all. If people were so uptight about hotlinking that they added protection to their images to prevent it, who was I to ignore their wishes? Additionally, serving the images through images.php meant that every image counted against my bandwidth. I wasn’t really crazy about that notion either. So I searched through the SimplePie documentation trying to find anything that would let me disable the hotlink juju. Nada. Evidently you get the hotlink stuff whether you want it or not.
So, in conclusion:
- simple = good
- pie = good
- SimplePie = not good
Up next I’ll check out Magpie. What’s up with using “pie” to name everything? Now I can’t stop thinking about pie.
August 6th, 2006 at 4:42 am
Oh, so you already started this project? Nice!
August 6th, 2006 at 8:07 am
i use feed on feeds (http://feedonfeeds.com/) which uses magpie to fetch feed information. the project hasn’t been updated in a while, but feed on feeds seems to work well as it is.
August 6th, 2006 at 9:30 am
This is Ryan, one of the co-developers of SimplePie.
I’m really sorry you ran into this issue. This seems to work on MOST configurations, but not all (as we’ve learned). We’ve been doing a lot of tweaking of this particular feature in the current development builds, and it will be disabled by default in our next Beta 3 build.
Is this an issue you reported to the support forums? Both of us developers make it a point to spend lots of time on the forums trying to help people work through issues. We also make it a point to listen to the feedback our users give us.
The resolution to your particular problem was to set a configuration option of $feed->bypass_image_hotlink(false);, then dumping the cache.
If you care enough to take the time, I’d love to get your feedback on what could be better in terms of documentation, features, how things should work, etc. It’s feedback like yours that lets us know what we need to work on, and it’s this kind of criticism that shows us what we can do to improve.
Again, I’m sorry your experience with SimplePie was negative, but I’d really like to take any opportunity I can to make a second experience much better. Please let me know your thoughts and feedback.
August 6th, 2006 at 12:15 pm
Ryan - it was your mistake. In the regex that was meant to rewrite the URLs, if $this->bypass_image_hotlink_page was set you still didn’t add it to the URL…
September 11th, 2006 at 9:33 am
Don’t need to change how the function gets called… just declare bypass_image_hotlink = false in the top lines instead of bypass_image_hotlink = 'i', that way everything works just the same.
September 28th, 2006 at 6:51 am
Simplepie vs. Magpie: A RSS Parser shootout…
I parse RSS a lot. My newsbot (that automatically finds new newssources) heavily filters results via bayes and hidden markov, and i need speed, because the scripts run on a 266mhz P2 machine.
After reading all the buzz about Simplepie (”Faster th…
February 5th, 2007 at 9:40 am
I’ve had troubles with SimplePie. But how did your Magpie experience go? I understand it’s all spaghetti under the hood.
February 5th, 2007 at 10:25 am
I don’t remember, actually. It’s been so long and I haven’t gone back to look at either.