Showing posts with label computer programming. Show all posts

Monday, February 12, 2007

Creating an RSS Reader: the Reader

In this article we will build a PHP-based RSS reader. Some knowledge of XML is helpful but not essential. Each story in an RSS document is described by three main tags: title, link and description, and they do exactly what their names suggest. I will go into detail about these tags in my second article, which deals with “building an RSS file.” For now, we will focus only on the “reading” part.
A downloadable file is available for this article.
As an extra I will introduce a database aspect of the reader: we will use the database to store and retrieve the latest stories. To follow along you will need PHP 4 or higher and, optionally, MySQL.

Below is an example text from an RSS document:

Start example text

<item>
<title>First example</title>
<link>www.mylink.com/someplace.html</link>
<description>Some description, blah,blah,blah</description>
</item>

<item>
<title>Thousands set to attend today's celebration</title>
<link>http://www.mylink.com/someplace.html/NewsTopStories?m=318</link>
<description>blah,blah,blah</description>
</item>

End example text

Code

To create an RSS Reader in PHP, we need to:

  1. Create a function to read the start tag (start element).

  2. Create a function to read the end tag (end element).

  3. Create a function to read the text associated with the tags.


A typical RSS document will have the following structure:

<rss>
  <channel>
    <item>
    </item>
  </channel>
</rss>

A start tag is a tag without the “/” character, for example: <item>. An end tag is a tag with the “/” character, for example: </item>.

So the start and end tag functions will search for the “<item></item>” tags and once they have found those, it will be a simple matter of retrieving the text data from them to display.


Now, PHP provides us with several XML-related functions, a few of which we will be using here:


xml_parser_create() – Creates a new XML parser. xml_parser_create() is a function, not a class: it returns a handle to the new parser, which we then pass to every other XML function.

To create a new parser:

$xmlParser = xml_parser_create();

xml_set_element_handler() – Registers the functions that the parser calls whenever it meets a start tag or an end tag. It takes three arguments: the parser, the name of the start tag handler and the name of the end tag handler. The parser then calls the start tag handler with three parameters (the end tag handler receives only the first two):

  • The parser: references the parser that is calling the handler.

  • The tagname: contains the name of the element for which the handler is called.

  • The attributes: an array that contains the element's attributes.


The parameters are used later in this article.


xml_set_character_data_handler() – Registers the function that handles the text between the tags. It takes two arguments, the parser and the name of the handler. The handler itself is called with two parameters, the parser and the data:




  • The parser: references the parser that is calling the handler.

  • The data: contains the character data as a string.


You can get more information about these and other XML functions at:


http://uk2.php.net/manual/en/ref.xml.php
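Before building the reader itself, it may help to see the three handlers working together on a tiny piece of XML. The following is only a minimal sketch: the names $events, onStart, onEnd and onText are made up for this illustration and are not part of the reader developed below.

```php
<?php
// Hypothetical names for this sketch: $events, onStart, onEnd, onText.
$events = array();

// Called by the parser for every start tag.
function onStart($parser, $tagName, $attrs) {
    global $events;
    $events[] = "start:" . $tagName;
}

// Called by the parser for every end tag.
function onEnd($parser, $tagName) {
    global $events;
    $events[] = "end:" . $tagName;
}

// Called by the parser for the text between tags.
function onText($parser, $data) {
    global $events;
    // Ignore chunks that contain only whitespace.
    if (trim($data) !== '') {
        $events[] = "text:" . trim($data);
    }
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "onStart", "onEnd");
xml_set_character_data_handler($parser, "onText");

// Parse a complete document in one call (third argument: this is the final chunk).
$xml = "<item><title>First example</title></item>";
xml_parse($parser, $xml, true);

print_r($events);
```

The recorded tag names come out in upper case (ITEM, TITLE) because the parser folds element names to upper case by default. That is why the reader below compares against 'TITLE', 'LINK' and 'DESCRIPTION'.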

The first thing we do is set the global variables that are going to be used by the functions.


$GLOBALS['titletag'] = false;
$GLOBALS['linktag'] = false;
$GLOBALS['descriptiontag'] = false;
$GLOBALS['thetitletxt'] = null;
$GLOBALS['thelinktxt'] = null;
$GLOBALS['thedesctxt'] = null;

These variables are going to be used to read in tag information from the RSS file that is going to be used with this reader.


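The function below deals with the start tag. The parser calls it every time it meets an opening tag, and it flags which of the three tags we discussed earlier has just been opened. The tag names are compared in upper case because the parser upper-cases element names by default (case folding):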


function startTag( $parser, $tagName, $attrs ) {
    switch( $tagName ) {
        case 'TITLE':
            $GLOBALS['titletag'] = true;
            break;
        case 'LINK':
            $GLOBALS['linktag'] = true;
            break;
        case 'DESCRIPTION':
            $GLOBALS['descriptiontag'] = true;
            break;
    }
}

This next function deals with the end tag:

function endTag( $parser, $tagName ) {
    switch( $tagName ) {
        case 'TITLE':
            // Print the collected title, then reset the flag and the buffer
            echo "<p><b>" . $GLOBALS['thetitletxt'] . "</b><br/>";
            $GLOBALS['titletag'] = false;
            $GLOBALS['thetitletxt'] = "";
            break;
        case 'LINK':
            // The inner double quotes must be escaped inside a double-quoted string
            echo "Link: <a href=\"" . $GLOBALS['thelinktxt'] . "\">" .
                $GLOBALS['thelinktxt'] . "</a><br/>";
            $GLOBALS['linktag'] = false;
            $GLOBALS['thelinktxt'] = "";
            break;
        case 'DESCRIPTION':
            echo "Desc: " . $GLOBALS['thedesctxt'] . "</p>";
            $GLOBALS['descriptiontag'] = false;
            $GLOBALS['thedesctxt'] = "";
            break;
    }
}

This next function handles the text between the tags. The flags set by startTag() tell it which tag the text belongs to, and it appends the text to the matching global variable:

function txtTag( $parser, $text ) {
    if( $GLOBALS['titletag'] == true ) {
        $GLOBALS['thetitletxt'] .= htmlspecialchars( trim( $text ) );
    } else if( $GLOBALS['linktag'] == true ) {
        $GLOBALS['thelinktxt'] .= trim( $text );
    } else if( $GLOBALS['descriptiontag'] == true ) {
        $GLOBALS['thedesctxt'] .= htmlspecialchars( trim( $text ) );
    }
}
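The text handler appends with ".=" rather than assigning with "=" because the parser may deliver the text of a single element in several pieces, for example around an entity such as &amp;amp;. The sketch below illustrates this; the names $buffer, $chunks and collectText are assumptions made for this example only. It counts how often the handler runs instead of assuming a particular number of pieces, since that can differ between PHP builds.

```php
<?php
// Hypothetical names for this sketch: $buffer, $chunks, collectText.
$buffer = '';
$chunks = 0;

function collectText($parser, $data) {
    global $buffer, $chunks;
    $buffer .= $data;   // append, do not overwrite
    $chunks++;          // count how many times the handler was called
}

$parser = xml_parser_create();
xml_set_character_data_handler($parser, "collectText");
xml_parse($parser, "<title>Tom &amp; Jerry</title>", true);

// Trim once, after all the pieces have been joined.
echo trim($buffer), " (delivered in ", $chunks, " call(s))\n";
```

One design point to consider: if the text does arrive in several pieces, trimming every piece separately can remove the spaces next to the entity. Trimming once after the pieces have been joined, as in this sketch, avoids that.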


Now that we have created the required functions, let's continue with the meat of the code:


function parsefile( $RSSfile ) {
    // Create an xml parser
    $xmlParser = xml_parser_create();

    // Set up element handler
    xml_set_element_handler( $xmlParser, "startTag", "endTag" );

    // Set up character handler
    xml_set_character_data_handler( $xmlParser, "txtTag" );

    // Open connection to RSS XML file for parsing.
    $fp = fopen( $RSSfile, "r" )
        or die( "Cannot read RSS data file." );

    // Parse XML data from RSS file, 4096 bytes at a time.
    while( $data = fread( $fp, 4096 ) ) {
        xml_parse( $xmlParser, $data, feof( $fp ) )
            or die( sprintf( "XML error: %s at line %d",
                xml_error_string( xml_get_error_code( $xmlParser ) ),
                xml_get_current_line_number( $xmlParser ) ) );
    }

    // Close file open handler
    fclose( $fp );

    // Free xml parser from memory
    xml_parser_free( $xmlParser );
}

The above function reads the RSS file in 4096-byte chunks and passes each chunk to xml_parse(), which in turn calls startTag(), txtTag() and endTag() as it works through the XML; endTag() displays the contents.
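The reader above prints each tag as soon as it closes, and it reacts to every title, link and description tag, including those that belong to the channel itself. If you would rather collect the stories first (for example, to store them in a database later), a variation along the following lines may be useful. It is a sketch only: the names $items, $current, $field, itemStart, itemEnd and itemText are hypothetical, and the feed is supplied as a string to keep the example self-contained.

```php
<?php
// Hypothetical names for this sketch: $items, $current, $field,
// itemStart, itemEnd, itemText.
$items = array();   // finished stories
$current = null;    // story being built, or null when outside <item>
$field = null;      // 'title', 'link' or 'description' while inside one

function itemStart($parser, $tagName, $attrs) {
    global $current, $field;
    if ($tagName == 'ITEM') {
        // A new story begins: start with empty fields.
        $current = array('title' => '', 'link' => '', 'description' => '');
    } elseif ($current !== null
            && in_array($tagName, array('TITLE', 'LINK', 'DESCRIPTION'))) {
        $field = strtolower($tagName);
    }
}

function itemEnd($parser, $tagName) {
    global $items, $current, $field;
    if ($tagName == 'ITEM') {
        // The story is complete: trim each field once and store it.
        $items[] = array_map('trim', $current);
        $current = null;
    } elseif ($field !== null && $tagName == strtoupper($field)) {
        $field = null;
    }
}

function itemText($parser, $data) {
    global $current, $field;
    // Only keep text that sits inside a known field of an <item>.
    if ($current !== null && $field !== null) {
        $current[$field] .= $data;
    }
}

$rss = '<rss><channel><title>Feed title</title>'
     . '<item><title>First example</title>'
     . '<link>http://www.mylink.com/someplace.html</link>'
     . '<description>Some description</description></item>'
     . '</channel></rss>';

$parser = xml_parser_create();
xml_set_element_handler($parser, "itemStart", "itemEnd");
xml_set_character_data_handler($parser, "itemText");
xml_parse($parser, $rss, true) or die("XML error");

print_r($items);
```

Because text is stored only while an item is open, the channel's own title is skipped and $items ends up holding one array per story.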




While it is good to have an RSS reader that can read any RSS document, it would be even better if you could store that information in a database and read it at your leisure when you are not connected to the Internet. It would also be good to be able to update your RSS file from the database. This is relatively easy to achieve, so let's create a table to which we will add our data:


CREATE TABLE `rss_tbl` (
  `feed_id` int(5) NOT NULL auto_increment,
  `title` varchar(200) NOT NULL default '',
  `link` varchar(200) NOT NULL default '',
  `description` text NOT NULL,
  `the_date` date NOT NULL default '0000-00-00',
  PRIMARY KEY (`feed_id`)
) ENGINE=MyISAM AUTO_INCREMENT=1 ;

The table will store the individual links as they are read in by the RSS reader. Fill this table with data, using the following format:

  • Title – the title of your story.

  • Link – The link to your story.

  • Description – A short description of your story.


You can then use this data to write to your RSS file:

<?php

$fp = fopen("myrssfile", "w+");

if (!$fp) {
    echo "error opening file";
    exit;
} else {
    // Up to ten stories dated within the last thirty days, newest first
    $query1 = "SELECT *, DATE_FORMAT(the_date,'%W,%d %b %Y') AS thedate
               FROM rss_tbl
               WHERE the_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
               ORDER BY the_date DESC LIMIT 10";

    $result = mysql_query($query1);

    while ($row = mysql_fetch_assoc($result)) {
        fwrite($fp, $row['title'] . "\r\n");
        fwrite($fp, $row['link'] . "\r\n");
        fwrite($fp, $row['description'] . "\r\n");
        fwrite($fp, $row['thedate'] . "\r\n");
        fwrite($fp, " ");
    } // endwhile

    fclose($fp);
} // end else

This code does two things. First, it opens (or creates) a file called "myrssfile":


$fp = fopen("myrssfile", "w+");

The "w+" instructs PHP to create the file if it does not exist and to overwrite any contents that it might have. Then it checks to see if there are any problems opening the file:

if (!$fp) {
    echo "error opening file";
    exit;
}



If there are problems, the program displays a message and stops execution. If everything is okay, a SQL query is run that retrieves up to ten of the most recent articles from the database, limited to those dated within the last thirty days:

$query1 = "SELECT *, DATE_FORMAT(the_date,'%W,%d %b %Y') AS thedate
           FROM rss_tbl
           WHERE the_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
           ORDER BY the_date DESC LIMIT 10";


The DATE_FORMAT() function enables us to format the date column in whatever fashion we like. After this the code writes the database data to the file:

fwrite($fp, $row['title'] . "\r\n");
fwrite($fp, $row['link'] . "\r\n");
fwrite($fp, $row['description'] . "\r\n");
fwrite($fp, $row['thedate'] . "\r\n");
fwrite($fp, " ");

That’s it. A file called "myrssfile" should now be available and contain up to ten articles from the database. With small changes to the table you can expand the database usage and create an RSS aggregator, which is like an online "newspaper" made up entirely of RSS feeds from different websites.
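If you are more at home with PHP's date() than with MySQL's DATE_FORMAT(), the pattern '%W,%d %b %Y' used in the query corresponds roughly to the date() format shown in this sketch. The variable names and the sample date are assumptions for illustration; in the query above the value comes from the the_date column.

```php
<?php
// Assumed sample value; in the article the date comes from the the_date column.
$theDate = '2007-02-12';

// 'l' = full weekday name   (MySQL %W)
// 'd' = two-digit day       (MySQL %d)
// 'M' = short month name    (MySQL %b)
// 'Y' = four-digit year     (MySQL %Y)
$formatted = date('l,d M Y', strtotime($theDate));

echo $formatted, "\n";   // prints "Monday,12 Feb 2007"
```

Formatting in the query keeps the PHP code shorter. Formatting in PHP instead would let you store the raw date and choose the layout at display time.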


To actually enter the data into the database, you only need to create a form that will take the necessary input values and write them to the table. In one of the articles that I wrote about RSS, I discuss how to create and populate an RSS file through a form. Although in that particular article we transfer data from a form to a file, with some small changes you can transfer the data from a form to a database.


Conclusion

To use this code, make sure to include “xmlparser.php” in whatever page you are using. Then just call the parsefile("yourRSSfileLocation") function and your file data will be parsed. Also, you might have noticed that on some news sites the news headlines scroll from right to left across the screen. You can achieve this by using the <marquee> HTML tag; Google it to find out how to use it.


Download the xmlparser.php here. This is the same file we link to at the beginning of this article. Next, I will be discussing how to build an RSS file. Till then, have fun.




 







Monday, February 05, 2007

XML::Simple for Perl Developers

"XML has become pervasive in the computing world and is buried more and more deeply into modern applications and operating systems. It's imperative for the Perl programmer to develop a good understanding of how to use it. In a surprisingly large number of cases, you only need one tool to integrate XML into a Perl application, XML::Simple. This article tells you where to get it, how to use it, and where to go next."

Monday, January 29, 2007

Who Killed the Webmaster?

Back in the frontier days of the web, when flaming skulls, scrolling marquees, and rainbow divider lines dominated the landscape, “Webmaster” was a vaunted, almost mythical, title. The Webmaster was a techno-shaman versed in the black arts needed to make words and images appear on this new-fangled Information Superhighway. With the rise of the Webmaster coinciding with the explosive growth of the web, everyone predicted the birth of a new, well-paying, and in-demand profession. Yet in 2007, this person has somehow vanished; even the term is scarcely mentioned. What happened? A decade later I’m left wondering “Who killed the Webmaster?”

Suspect #1: The march of technology


By 2000, I think every person in the developed world had a brother-in-law who created websites on the side. Armed with FrontPage and a pirated copy of Photoshop, he’d charge a reasonable fee per page (though posting more than three images cost extra).

Eventually the web hit equilibrium and just having a website didn’t make a company hip and cutting-edge. Now management demanded that their website look better than the site ranked immediately above it in search results. And as expensive as the sites were, ought they not “do something” too? Companies increasingly wanted an exceptional website requiring a sophisticated combination of talent to pull off. HTML and FTP skills, as useful as they had been, were no longer sharp enough tools in the Webmaster’s toolbox. Technologies such as CSS and multi-tier web application development rapidly made WYSIWYG editors useless for all but ordinary websites. And with the explosion of competition and possibilities on the Internet, few businesses were willing to pay for “ordinary”.

In 1995, the “professional web design firm” was a single, talented person working from home. Today it’s a diverse team of back-end developers, front-end developers, graphic artists, UI designers, database and systems administrators, search engine marketing experts, analytics specialists, copywriters, editors, and project managers. The industry has simply grown so specialized, so quickly, that one person can hardly be a master of anything more than a single strand in the web.

Suspect #2: Is it the economy, stupid?


Then again, perhaps the disappearance of the Webmaster can better be explained by an underwhelming economy rather than overwhelming technology. Riding high on the bull market of the late ’90s, companies were increasingly willing to assume more risk to reach potential customers. This was especially true of small businesses, which traditionally have minuscule advertising and marketing budgets. Everyone wanted a piece of the Internet pie and each turned to the Webmaster to deliver. More than just a few Webmasters made a respectable living by cranking out a couple of $500 websites every week.

Once the bubble burst in early 2000, the dot-com hangover left many small businesses clutching their heads and checking their wallets. As companies braced to solely maintain what they already had, the first cut inevitably was to marketing and advertising. In-house Webmasters were summarily let go, their duties hastily transferred to an already overworked office manager. Freelance Webmasters were hit even harder as business owners struggled to first take care of their own. The gold rush had crumbled to fools’ gold even faster than it had started.

While a few Webmasters were able to weather the storm (mostly those with either extraordinary skills or a gainfully employed spouse), the majority were forced to abandon their budding profession and return to the world of the mundane.

Suspect #3: The rise of Web 2.0


Another strong possibility is that the Internet has simply evolved beyond the Webmaster. “Web 2.0” is the naked emperor of technological neologisms; we all nod our heads at the term but then stammer when pressed for a definition. As far as I can tell, Web 2.0 is mostly about rounded corners, low-contrast pastel colors, and domain names with missing vowels. But it also seems to be about an emphasis on social collaboration. This may seem like a no-brainer given the connectedness of the Internet itself; however, thinking back to Web 1.0 there was a distinct lack of this philosophy. Web 1.0 was more an arms race to build “mindshare” and “eyeballs” in order to make it to the top of the hill with the most venture capital. Even the Web 1.0 term of “portal” conjures up an image of Lewis Carroll’s Alice tumbling down a hole and into an experience wholly managed by the resident experts: the Webmasters. Despite the power and promises to be so much more, the web wasn’t much different than network television or print. Even the most interesting and successful business models of the Web 1.0 era could have been accomplished years prior with an automated telephone system.

It wasn’t until after the failure of the initial experiment that people began to rethink the entire concept of the Internet. Was the Webmaster as gatekeeper really necessary? If we all have a story to share, why can’t everyone contribute to the collective experience? Perhaps it was the overabundance of Herman Miller chairs, but Web 1.0 was inarguably about style over substance. Yet, as anyone who’s ever visited MySpace can attest, today content is king. With all of us simultaneously contributing and consuming on blogs, MySpace, YouTube, Flickr, Digg, and Second Life, who needs a Webmaster anymore?