SimplePie not parsing Google News RSS feed

This code works with every other RSS feed I have tried, but not with Google News feeds. I don't know what I'm doing wrong; it may be a bug. I keep getting this error when I try to read a Google News feed:

This XML document is invalid, likely due to invalid characters. XML error: SYSTEM or PUBLIC, the URI is missing at line 1, column 61

For example, the http://stackoverflow.com/feeds feed works nicely, but Google News feeds do not. Can someone give me a hint?

<?php
//get the simplepie library
require_once('simplepie.inc');

//grab the feed
$feed = new SimplePie();
$feed->set_feed_url("http://news.google.com/news?hl=en&gl=us&q=austria&ie=UTF-8&output=rss");
$feed->force_feed(true);
//$feed->encode_instead_of_strip(true);

//enable caching
$feed->enable_cache(true);

//provide the caching folder
$feed->set_cache_location('cache');

//set the amount of seconds you want to cache the feed
$feed->set_cache_duration(1800);

//init the process
$feed->init();

//let simplepie handle the content type (atom, RSS...)
$feed->handle_content_type();
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <title>simple</title>
</head>
<body>
<div id="page-wrap">
  <h1>News Finder</h1>

  <?php if ($feed->error): ?>
    <p><?php echo $feed->error; ?></p>
  <?php endif; ?>

  <?php foreach ($feed->get_items() as $item): ?>
    <div class="chunk">
      <h4 style="background:url(<?php $feed = $item->get_feed(); echo $feed->get_favicon(); ?>) no-repeat; text-indent: 25px; margin: 0 0 10px;"><a href="<?php echo $item->get_permalink(); ?>"><?php echo $item->get_title(); ?></a></h4>
      <p class="footnote">Source: <a href="<?php $feed = $item->get_feed(); echo $feed->get_permalink(); ?>"><?php $feed = $item->get_feed(); echo $feed->get_title(); ?></a> | <?php echo $item->get_date('j M Y | g:i a T'); ?></p>
    </div>
  <?php endforeach; ?>
</div>

Answer1:

Make sure you're using SimplePie 1.2.1; version 1.2 had a bug in URL parsing that can cause this type of error.

(I'm also the SimplePie lead developer, so feel free to shoot questions straight to my email)

If you are using 1.2.1, this appears to be a manifestation of bug #162, which is currently unconfirmed. I'll take an in-depth look at this, but it appears to be an error in SimplePie, not in your code.

(I'll also post back here with why this is occurring for the curious amongst you.)

Answer2:

I have no clue about SimplePie; however, the simple way in your case might be to just use SimpleXML:

$url = "http://news.google.com/news?hl=en&gl=us&q=austria&bav=on.2,or.r_gc.r_pw.,cf.osb&biw=1920&bih=973&um=1&ie=UTF-8&output=rss";

$feed = simplexml_load_file($url);

echo $feed->channel->title, "\n<", $feed->channel->link, ">\n\n";

foreach($feed->channel->item as $item)
{
    echo "* $item->title\n  <$item->link>\n";
}

SimpleXML is normally available with PHP out of the box, so you don't need to install any extra library.
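If the feed ever comes back malformed, the SimpleXML call returns false and emits warnings. A minimal sketch of a more defensive variant follows. The helper name `rss_items` and the sample feed are made up for illustration, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: parse RSS text and return title/link pairs,
// or null when the XML cannot be parsed or has no <channel>.
function rss_items(string $xml): ?array
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel)) {
        return null;
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
        ];
    }
    return $items;
}

// Made-up sample feed, used here so the sketch runs without network access.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><title>Sample</title><link>http://example.com/</link>
<item><title>First</title><link>http://example.com/1</link></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>
XML;

foreach (rss_items($sample) ?? [] as $item) {
    echo "* {$item['title']}\n  <{$item['link']}>\n";
}
```

For a live feed, the string passed to `rss_items()` would come from something like `file_get_contents($url)`.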


Answer3:

For Google News feeds, use:

$feed->set_raw_data(file_get_contents($rssurl));
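In context, this workaround might look roughly like the sketch below: the feed body is downloaded with `file_get_contents()` and handed to SimplePie, so SimplePie's own URL fetching is bypassed. The helper names `fetch_feed_body` and `print_feed_titles`, the timeout, and the User-Agent string are assumptions for illustration. Only `set_raw_data()`, `init()`, `get_items()`, `get_title()` and the `error` property come from the SimplePie usage shown in this thread.

```php
<?php
// Load SimplePie the same way the question does, if it is not loaded yet.
if (!class_exists('SimplePie') && is_file(__DIR__ . '/simplepie.inc')) {
    require_once __DIR__ . '/simplepie.inc';
}

// Hypothetical helper: download the feed body ourselves.
// Returns null when the download fails or the body is empty.
function fetch_feed_body(string $url, int $timeout = 10): ?string
{
    $context = stream_context_create([
        'http' => [
            'timeout'    => $timeout,
            // Assumption: a generic User-Agent; adjust or drop as needed.
            'user_agent' => 'Mozilla/5.0 (compatible; FeedReader/1.0)',
        ],
    ]);
    $body = @file_get_contents($url, false, $context);
    return ($body === false || trim($body) === '') ? null : $body;
}

// Hypothetical helper: feed the raw data to SimplePie and print the item titles.
function print_feed_titles(string $rssurl): void
{
    $raw = fetch_feed_body($rssurl);
    if ($raw === null) {
        echo "Could not download the feed.\n";
        return;
    }

    $feed = new SimplePie();
    $feed->set_raw_data($raw); // parse the downloaded string instead of a URL
    $feed->init();

    if ($feed->error) {
        echo $feed->error, "\n";
        return;
    }
    foreach ($feed->get_items() as $item) {
        echo $item->get_title(), "\n";
    }
}
```

Calling `print_feed_titles()` with the Google News URL from the question would then list the item titles, provided the download itself succeeds.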

Answer4:

Just wanted to add a note here for others who think the above answer doesn't work. If you're getting a null item title, check the feed source. There may be nothing wrong with SimplePie or your script; the browser may be rendering the title as empty because of HTML code inside the item's title tags.
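One way to check for this while debugging is to print the title escaped, so the browser shows the literal string instead of interpreting it. A minimal sketch, where the `$title` value is a made-up example of a title containing markup:

```php
<?php
// Hypothetical title containing markup, as a feed might deliver it.
$title = '<b>Austria</b> news';

// Escaped output: the browser displays the tags literally instead of rendering them.
echo htmlspecialchars($title, ENT_QUOTES, 'UTF-8'), "\n";

// Alternatively, strip the tags and keep only the text.
echo strip_tags($title), "\n";
```

Viewing the page source (rather than the rendered page) gives the same information without changing the script.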
