
Parsing News: Where to Start

By Darrell Shifflett
Senior Editor, Linux-2000.org

Fetching news from other sites and publishing it as HTML

Steps:

  1. First we will write a small shell script to fetch the news.
  2. Then we will write a Perl script to insert the news into MySQL.
  3. Finally, using PHP, we will build a simple configurable table to display the news.

Step 1:

A simple shell script to get the news:

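# --------------- // cut // ---------------
#!/bin/sh

# Work in the news directory; stop if it does not exist.
cd "$HOME/News" || exit 1

# Remove the old copy so wget does not save the new one as ultramode.txt.1.
rm -f ultramode.txt

# Fetch the headlines and run the parser only if the download worked.
wget http://slashdot.org/ultramode.txt &&
    "$HOME/newsbin/slashnews.pl" < ultramode.txt

# --------------- // cut // ---------------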
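
Even when wget exits cleanly, the saved file can be empty or an error page, and slashnews.pl empties the slashdot table before it parses anything. A minimal sketch of a sanity check follows. It assumes the feed separates records with lines containing only "%%", which is what the Perl script looks for. The function name looks_like_feed is made up for this example.

```shell
#!/bin/sh
# looks_like_feed FILE (hypothetical helper, not part of the original script):
# succeed only if FILE is non-empty and has at least one line that is
# exactly "%%", the record separator slashnews.pl expects.
looks_like_feed() {
    [ -s "$1" ] && grep -q '^%%$' "$1"
}

# Usage sketch, with the same paths as the fetch script:
#   looks_like_feed "$HOME/News/ultramode.txt" &&
#       "$HOME/newsbin/slashnews.pl" < "$HOME/News/ultramode.txt"
```

With this guard in place, a bad download leaves the previous news in the table instead of wiping it.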

Step 2:

Writing the slashnews.pl script. Simple? Well let's see :)

-------------------------// cut //----------------------------

#!/usr/bin/perl
#
# slashnews.pl: read ultramode.txt on STDIN and load each story
# into the "slashdot" table of the "news" database.

use strict;
use warnings;
use DBI;

# Open your MySQL connection.
# If connect fails there is no handle yet, so report $DBI::errstr.
my $dbh = DBI->connect("DBI:mysql:news:localhost", "user_id", "password")
    or die "$DBI::errstr\n";

# Empty the table so it only holds the current stories.
$dbh->do(q{delete from slashdot})
    or die $dbh->errstr . "\n";

my $FALSE = 0;
my $TRUE  = 1;

my $haverecord = $FALSE;

# One input line per field, in the order they appear in each record.
my %record = ();
my @keys = ( "title", "link", "time", "author", "dept",
             "category",
             "numcomments", "storytype", "imagename" );

while (my $line = <STDIN>) {
    chomp $line;

    if ($haverecord) {
        # $line already holds the first field; read one more line
        # for each remaining field.
        foreach my $key (@keys) {
            $record{$key} = $line;

            if ($key ne $keys[-1]) {
                $line = <STDIN>;
                chomp $line if defined $line;   # undef if input ends mid-record
            }
        }

        $haverecord = $FALSE;

        print "Title: ", $record{title}, "\n";

        # The leading 0 fills the table's first column (presumably an id);
        # quote() escapes each value for use in the SQL statement.
        my $sql = "insert into slashdot values ( 0, ";

        foreach my $key (@keys) {
            $sql .= $dbh->quote($record{$key}) .
                    ($key eq $keys[-1] ? "" : ", ");
        }

        $sql .= ")";

        $dbh->do($sql) or die $dbh->errstr . "\n";
    }
    else {
        # A line containing only "%%" marks the start of the next record.
        if ($line =~ /^\%\%$/) {
            $haverecord = $TRUE;
        }
    }
}

# Done, now let's clean up.
$dbh->disconnect;

exit 0;

# EOF

# <--//// script end ////-->

-------------------------// cut //----------------------------
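
To see what the parsing loop expects, it can help to try the record layout on a small file first. The sketch below is only an illustration. The sample records are invented, the function name print_titles is made up, and the layout (a "%%" line followed by one line per field, title first) is inferred from the script's @keys list rather than taken from any Slashdot documentation.

```shell
#!/bin/sh
# Sketch: print the title of every record in an ultramode-style file.
# All sample data below is invented for illustration.

sample=$(mktemp)
trap 'rm -f "$sample"' EXIT

cat > "$sample" <<'EOF'
header text before the first separator is ignored
%%
First example story
http://example.com/1
2000-01-01 10:00:00
alice
from-the-example dept
news
12
articles
topic1.gif
%%
Second example story
http://example.com/2
2000-01-01 11:00:00
bob
another-example dept
linux
3
articles
topic2.gif
%%
EOF

# print_titles FILE (hypothetical helper): the line right after each
# "%%" separator is the title, the first entry in the script's @keys.
print_titles() {
    awk '
        prev_was_sep { print }              # line following a separator
        { prev_was_sep = ($0 == "%%") }     # remember for the next line
    ' "$1"
}

print_titles "$sample"
```

If the titles printed here match what slashnews.pl reports on its "Title:" lines for the same file, the field order in @keys lines up with the feed.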

