Solved

ICS feeds

Posted on 2013-11-04
Last Modified: 2013-11-05
I'm having problems with this failing: I get errors when importing records from ICS files.
I want to import several different feeds, so I need some sort of error checking and the correct character encoding (UTF-8). I believe that's the problem here, as it fails on text that contains UTF-8 characters.

I'm finding all sorts of errors and RSS/iCal format variations. How does one import info of this type globally from multiple feeds? Every feed I've tried so far produces errors, and none are consistent in anything from layout to character encoding. Special characters such as & and ' break the processing so badly that it's unusable.


Can anyone help me with the fact that it stops processing the ICS file?

Here is all my code and the places where I found things.

Thank you in advance for any code or help you may provide.


Parser base I worked from:
https://code.google.com/p/ics-parser/


Navy Pier's ICS file, one that stops with an error:
(I have not worked on this for a while, but if you really need the error, tell me and I'll try to reproduce it. It should error for you and stop processing.)

https://www.google.com/calendar/ical/navypierit%40gmail.com/public/basic.ics

My database structure (where I'm saving the info):
CREATE TABLE IF NOT EXISTS `iwia_events` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `event_title` varchar(255) NOT NULL DEFAULT '',
  `event_location` varchar(255) DEFAULT NULL,
  `lat` float(10,6) NOT NULL,
  `lng` float(10,6) NOT NULL,
  `type` text NOT NULL COMMENT 'used for google map icon',
  `event_price` varchar(20) DEFAULT NULL,
  `event_image` varchar(255) DEFAULT 'http://completelocal.info/images/iwia/default-event-avatar.png',
  `event_url` varchar(200) DEFAULT NULL,
  `event_start` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `event_end` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `event_text` text,
  `extra_info` text,
  `facebook` enum('0','1') NOT NULL DEFAULT '0',
  `twitter` enum('0','1') NOT NULL DEFAULT '0',
  `active` enum('0','1') NOT NULL DEFAULT '1',
  `categories` text NOT NULL,
  `tags` varchar(359) NOT NULL,
  `date_published` int(10) NOT NULL,
  `author_id` int(15) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=74 ;



My code to try to do all this:
<?php
// https://www.google.com/calendar/ical/navypierit%40gmail.com/public/basic.ics
include('../../Admin_Panel/dbc.php');
//mysql_set_charset('utf8');
header('Content-Type: text/html; charset=utf-8');
//mb_internal_encoding('utf-8');
//header("Content-type: text/html; charset=utf-8");

/**
 * This example demonstrates how the Ics-Parser should be used.
 *
 * PHP Version 5
 *
 * @category Example
 * @package  Ics-parser
 * @author   Martin Thoma <info@martin-thoma.de>
 * @license  http://www.opensource.org/licenses/mit-license.php  MIT License
 * @version  SVN: <svn_id>
 * @link     http://code.google.com/p/ics-parser/
 * @example  $ical = new ical('MyCal.ics');
 *           print_r( $ical->get_event_array() );
 */
require 'class.iCalReader.php';

// Do this for download convert
/*
echo mb_convert_encoding(
    file_get_contents('http://www.tvrage.com/quickinfo.php?show=Surviver&ep=20x02&exact=0'),
    "HTML-ENTITIES",
    "UTF-8"
  );
*/

//$ical   = new ICal('ical_navy_pier.ics');
$ical   = new ICal('basic2.ics');
$events = $ical->events();

$date = $events[0]['DTSTART'];
echo "The ical date: ";
echo $date;
echo "<br/>";

echo "The Unix timestamp: ";
echo $ical->iCalDateToUnixTimestamp($date);
echo "<br/>";

echo "The number of events: ";
echo $ical->event_count;
echo "<br/>";

echo "The number of todos: ";
echo $ical->todo_count;
echo "<br/>";
echo "<hr/><hr/>";
$num_rows=1;
foreach ($events as $event) {
	//utf-8
	$event['SUMMARY']=utf8_decode($event['SUMMARY']);
	$event['DESCRIPTION']=utf8_decode($event['DESCRIPTION']);
	$event['LOCATION']=utf8_decode($event['LOCATION']);

    echo "Row: ".$num_rows."<br>";    
    echo "SUMMARY: ".$event['SUMMARY']."<br/>";
    echo "DTSTART: ".$event['DTSTART']." - UNIX-Time: ".$ical->iCalDateToUnixTimestamp($event['DTSTART'])."<br/>";
    $dtStart = date('Y-m-d H:i:s',$ical->iCalDateToUnixTimestamp($event['DTSTART']));
    $dtEnd = date('Y-m-d H:i:s',$ical->iCalDateToUnixTimestamp($event['DTEND']));
    echo "formated date: dtstart: ".$dtStart." / dtend: ".$dtEnd."<br>";
    echo "DTEND: ".$event['DTEND']."<br/>";
    echo "DTSTAMP: ".$event['DTSTAMP']."<br/>";
    echo "UID: ".$event['UID']."<br/>";
    echo "CREATED: ".$event['CREATED']."<br/>";
    echo "DESCRIPTION: ".$event['DESCRIPTION']."<br/>";
    echo "LAST-MODIFIED: ".$event['LAST-MODIFIED']."<br/>";
    echo "LOCATION: ".$event['LOCATION']."<br/>";
    echo "SEQUENCE: ".$event['SEQUENCE']."<br/>";
    echo "STATUS: ".$event['STATUS']."<br/>";
    echo "TRANSP: ".$event['TRANSP']."<br/>";
    echo "<hr/>";

/*
 * iwia_feed_navy-pier_chicago-il
 * id
 * event_title
 * event_location
 * event_price
 * event_image
 * event_url
 * event_start
 * event_end
 * event_text
 * extra_info
 * active
 * categories
*/
//$event['DESCRIPTION']= str_replace("\"",'',$event['DESCRIPTION']);
$event['DESCRIPTION'] = preg_replace('/\s+/', ' ', $event['DESCRIPTION']);
$event['DESCRIPTION']= mysql_real_escape_string($event['DESCRIPTION']);

$sql="INSERT INTO iwia_events (event_title, event_location, lat, lng, type, event_url, event_image, event_start, event_end, event_text, extra_info, categories)
VALUES
 ('$event[SUMMARY]','Navy Pier, Inc 600 E Grand Ave, Chicago, IL 60611','41.8917374','-87.5998759','point_of_interest, amusement_park, establishment','http://blog.navypier.com','http://blog.navypier.com/wp-content/uploads/headway/header-uploads/navy-blog2_08.png','$dtStart','$dtEnd','$event[DESCRIPTION]','Location: $event[LOCATION]<br>Navy Pier, Inc <br>600 E Grand Ave, Chicago, IL 60611<br>Phone: (312) 595-7437','Navy Pier Event')";

//echo $sql."<br>";
$result = mysql_query($sql);
if (!$result)  {  die('Error: ' . mysql_error()); }
//$num_rows = mysql_num_rows($result);
$num_rows++;

}
echo "$num_rows Rows added <br> \n";
mysql_close();
?>

Question by:Johnny
LVL 108

Expert Comment

by:Ray Paseur
... how does one import info of this type globally from multiple feeds
I've wrestled with things like this before from multiple weather APIs and all I can tell you is that you're on the threshold of a long stretch of programming.  The general design is probably a polymorphic adapter.  Each extension of the parent class will be adapted to the specific characteristics of a given ICS feed, with the objective of creating an abstracted and normalized data set that represents the collected information from many different ICS streams.

As far as character sets go, this article will give you a leg up on the inevitable conversion from the western-European character sets to UTF-8.  I would set up the data base table in UTF-8 and convert the collision characters on the inbound side of the adapter.
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_11880-Unicode-PHP-and-Character-Collisions.html
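Converting on the inbound side could look something like the following. This is only a minimal sketch: the function name `toUtf8` and the `Windows-1252` fallback are assumptions for illustration, not something from the article or the parser. It leaves valid UTF-8 untouched and converts everything else toward UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): return $text as valid UTF-8.
// $fallback is an ASSUMED source encoding for feeds that are not UTF-8.
function toUtf8(string $text, string $fallback = 'Windows-1252'): string
{
    // Strip a UTF-8 byte order mark if one is present.
    if (strncmp($text, "\xEF\xBB\xBF", 3) === 0) {
        $text = substr($text, 3);
    }

    // Already valid UTF-8 (plain ASCII also passes): leave it alone.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Otherwise treat the bytes as the fallback single-byte encoding.
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}
```

Applied to a whole feed, this would be called once on the raw file contents before the text is handed to the ICS parser, so the parser and the database only ever see UTF-8.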

The advantages of creating the adapter design come to you in three principal ways.  First, you can create an adapter that fits a given ICS feed, and immediately begin using the information from that feed.  Second, you can create new adapters for different ICS feeds, and as they are deployed, there need not be any disruption of the existing ICS adapters.  And third, when you find that an existing adapter can serve a new ICS feed, you can immediately integrate the new feed.
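A rough sketch of what such an adapter hierarchy might look like follows. The class names, the `Windows-1252` example feed, and the choice of normalized fields (borrowed from the `iwia_events` table in the question) are all assumptions made for illustration.

```php
<?php
// Illustrative sketch only: class names and the normalized field names
// (borrowed from the iwia_events table) are assumptions.
abstract class IcsFeedAdapter
{
    // Encoding this feed is delivered in, e.g. 'UTF-8' or 'Windows-1252'.
    abstract protected function sourceEncoding(): string;

    // Fixed venue text for feeds that describe a single place.
    abstract protected function defaultLocation(): string;

    // Shared logic: map one parsed VEVENT array onto a normalized row.
    public function normalize(array $event): array
    {
        $location = $this->text($event['LOCATION'] ?? '');

        return [
            'event_title'    => $this->text($event['SUMMARY'] ?? ''),
            'event_text'     => $this->text($event['DESCRIPTION'] ?? ''),
            'event_location' => $location !== '' ? $location : $this->defaultLocation(),
        ];
    }

    // Convert inbound text to UTF-8 on the way in.
    protected function text(string $value): string
    {
        return mb_convert_encoding($value, 'UTF-8', $this->sourceEncoding());
    }
}

// One small subclass per feed; existing adapters stay untouched when a new one is added.
class NavyPierAdapter extends IcsFeedAdapter
{
    protected function sourceEncoding(): string { return 'UTF-8'; }
    protected function defaultLocation(): string { return 'Navy Pier, Chicago, IL'; }
}

// Hypothetical second feed that publishes Windows-1252 text.
class LegacyWindowsFeedAdapter extends IcsFeedAdapter
{
    protected function sourceEncoding(): string { return 'Windows-1252'; }
    protected function defaultLocation(): string { return ''; }
}
```

The shared `normalize()` method holds everything the feeds have in common, so each new feed only has to state how it differs.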

Good luck, ~Ray

Author Comment

by:Johnny
@Ray
Thanks as always. So what do I do, make a conversion table like this one?
http://www.laprbass.com/RAY_entitize_western_letters.php?charset=windows-1252
I thought that PHP had routines for that. As you can see in my code, I have experimented with a few ways to do that. So far nothing works; UTF-8 is such a pain.

Is there a way to at least fix my current problem with this feed? Did I miss something?

Your suggested Android reading is not only a bit too lengthy, but I also don't know Android (Java) well enough to follow it. As a path to follow, I understand I need to check for valid things. I can't believe there's not a way to fix this already. ICS and RSS have been around for a long time, and it seems there is no standard for them that anyone is following. And what really stinks is that the new site I'm involved with relies heavily on feeds.

I guess I'll just keep getting feeds one at a time, see how each one breaks, try to build a function to fix that problem, and see if I can check for faults first in processing and accumulate fixes as I go.

So, getting back to this UTF-8 problem and wonky characters: as you can see, I've tried many ways to fix it, even direct matching. What's the best way to fix this UTF-8 problem?

Thanks again

Author Comment

by:Johnny
What about converting the ICS file directly first, line by line, to a non-UTF-8 format, mainly for the summary and description entries?

So the conversion isn't done while it's trying to process.

Then process the pre-converted info via an ICS parser?
LVL 108

Accepted Solution

by:
Ray Paseur earned 500 total points
You want to be converting in the direction of UTF-8, not away from it.  The Navy Pier data set contains some characters that my text editor cannot display.  Unfortunately, Textpad thinks the file is Windows-1252 when it's actually already UTF-8.  Here is how to detect the encoding.

<?php // RAY_temp_pern.php
error_reporting(E_ALL);
$url = 'https://www.google.com/calendar/ical/navypierit%40gmail.com/public/basic.ics';
$str = file_get_contents($url);
$var = mb_detect_encoding($str);
var_dump($var); // UTF-8


The article wasn't specific to this problem, it was just guidance in understanding the concept of polymorphic things.  The IS-A rule was well explained, IIRC.  You don't need to understand Java to get the major concept, which is that you may have several ICS data sources, but they're all ICS, even if they have different encoding and different tags in some cases.  The really hard and frustrating way to address this problem is to try to write one large generalized script that can handle all kinds of ICS feeds.  The easier way is to tailor an adapter script for each ICS data source.  Your code will recognize the ICS feed (probably by its URL) and will auto-load the appropriate adapter script.
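Recognizing the feed by its URL can be as simple as a lookup on the host name. The sketch below assumes hypothetical adapter class names (`GoogleCalendarAdapter`, `GenericIcsAdapter`) and a hand-maintained map; none of these names come from the thread or from the ics-parser library.

```php
<?php
// Hypothetical registry: feed host name => name of the adapter class to load.
// The class names are placeholders for scripts you would write yourself.
$adapters = [
    'www.google.com' => 'GoogleCalendarAdapter',
];

// Pick the adapter name for a feed URL, falling back to a generic adapter.
function adapterFor(string $url, array $adapters, string $default = 'GenericIcsAdapter'): string
{
    $host = parse_url($url, PHP_URL_HOST);
    $host = is_string($host) ? strtolower($host) : '';

    return $adapters[$host] ?? $default;
}
```

With an autoloader registered, the returned name can be instantiated directly (`$adapter = new $class();`), so adding a feed means adding one map entry and one adapter script.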

Here is the source for the conversion table (but I would urge you to convert everything to UTF-8 instead).

<?php // RAY_entitize_western_letters.php
error_reporting(E_ALL);

// EXTENDED-ASCII CHARACTERS COLLIDE WITH UTF-8 ENCODINGS AND CANNOT BE RENDERED CORRECTLY
// DEMONSTRATE HOW TO TRANSLATE SOME WESTERN CHARACTERS INTO ENGLISH-PRINTABLE, UTF-8 OR ENTITIES
// SEE http://www.joelonsoftware.com/articles/Unicode.html

// CHOOSE A CHARSET VALUE FROM THE URL ARGUMENT utf-8, windows-1252, iso-8859-1, iso-8859-15, etc.
$charset = isset($_GET['charset']) ? $_GET['charset'] : 'ascii';

// START WITH HTML5 DOCTYPE AND WHATEVER CHARSET
$html5 = <<<ENDHTML5
<!DOCTYPE html>
<html dir="ltr" lang="en-US">
<head>
<meta charset="$charset" />
<title>CHARACTER SET $charset</title>
</head>
<body>
<pre>
YOU MIGHT WANT TO USE "VIEW SOURCE" TO LOOK AT THESE
THE ORIGINAL CHARACTER SET IS <b>$charset</b>
ENDHTML5;

echo $html5;


// TEST CASES
$arr
= array
( 'Françoise'
, 'Å-Ring'
, 'ßeta or Beta?'
, 'Öh löök, umlauts!'
, 'ENCYCLOPÆDIA'
, 'ça va! mon élève mi niña?'
, 'A stealthy ƒart'
, 'Ðe lónlí blú bojs'
)
;


// DISPLAY EACH TEST CASE USING ENTITIZED CHARACTERS
echo PHP_EOL . 'USING NUMERICALLY ENTITIZED CHARACTERS';
foreach ($arr as $str)
{
    echo PHP_EOL
    . $str
    . ' = '
    . '<strong>'
    . mungstring($str, 'ENT')
    . '</strong>'
    ;
}
echo PHP_EOL;


// DISPLAY EACH TEST CASE USING TEXT TRANSLATIONS
echo PHP_EOL . 'USING TEXT TRANSLATIONS';
foreach ($arr as $str)
{
    echo PHP_EOL
    . $str
    . ' = '
    . '<strong>'
    . mungstring($str, 'TXT')
    . '</strong>'
    ;
}
echo PHP_EOL;


// DISPLAY EACH TEST CASE USING UTF-8 TRANSLATIONS
echo PHP_EOL . 'USING UTF-8 CONVERSIONS';
foreach ($arr as $str)
{
    echo PHP_EOL
    . $str
    . ' = '
    . '<strong>'
    . mungstring($str, 'UTF')
    . '</strong>'
    ;
}
echo PHP_EOL;


// EXAMPLE SHOWING HOW TO TURN A PORTUGUESE NAME INTO PART OF A URL STRING
$str = 'Armação de Pêra';
$new = mungString($str);
$new = strtolower($new);
$new = str_replace(' ', '-', $new);

// SHOW THE URL STRING
echo PHP_EOL
. '<strong>'
. '<a target="blank" href="http://lmgtfy.com?q='
. $new
. '">'
. mungString($str, 'Ent')
. '</a>'
. '</strong>'
;


// EXAMPLE SHOWING HOW TO TURN A STRING INTO A NUMERICALLY ENTITIZED STRING
echo PHP_EOL;
$str = 'Armação de Pêra';
$new = mungString($str, 'ENTITIES');
echo PHP_EOL
. $new
. ' = '
. '<strong>'
. htmlentities($new)
. '</strong>'
;


// EXAMPLE SHOWING ALL THE ORIGINAL LETTERS
echo PHP_EOL;
print_r( mungstring(NULL, NULL) );


// A FUNCTION TO RETURN THE WESTERNIZED/ENTITIZED STRING
function mungString($str, $return='TEXT')
{
    // OUR REPLACEMENT ARRAY OF ENTITIES
    static
    $entity
    = array();

    // OUR REPLACEMENT ARRAY OF UTF-8 CHARACTERS
    static
    $utf8
    = array();

    // OUR REPLACEMENT ARRAY OF CHARACTERS (YOU MAY WANT SOME CHANGES HERE)
    static
    $normal
    = array
    ( 'ƒ' => 'f'  // http://en.wikipedia.org/wiki/%C6%91 florin
    , 'Š' => 'S'  // http://en.wikipedia.org/wiki/%C5%A0 S-caron (voiceless postalveolar fricative)
    , 'š' => 's'  // http://en.wikipedia.org/wiki/%C5%A0 s-caron
    , 'Ð' => 'Dh' // http://en.wikipedia.org/wiki/Eth (voiced dental fricative)
    , 'Ž' => 'Z'  // http://en.wikipedia.org/wiki/%C5%BD Z-caron (voiced postalveolar fricative)
    , 'ž' => 'z'  // http://en.wikipedia.org/wiki/%C5%BD z-caron
    , 'À' => 'A'
    , 'Á' => 'A'
    , 'Â' => 'A'
    , 'Ã' => 'A'
    , 'Ä' => 'A'
    , 'Å' => 'A'
    , 'Æ' => 'E'
    , 'Ç' => 'C'
    , 'È' => 'E'
    , 'É' => 'E'
    , 'Ê' => 'E'
    , 'Ë' => 'E'
    , 'Ì' => 'I'
    , 'Í' => 'I'
    , 'Î' => 'I'
    , 'Ï' => 'I'
    , 'Ñ' => 'N'
    , 'Ò' => 'O'
    , 'Ó' => 'O'
    , 'Ô' => 'O'
    , 'Õ' => 'O'
    , 'Ö' => 'O'
    , 'Ø' => 'O'
    , 'Ù' => 'U'
    , 'Ú' => 'U'
    , 'Û' => 'U'
    , 'Ü' => 'U'
    , 'Ý' => 'Y'
    , 'Þ' => 'Th' // http://en.wikipedia.org/wiki/Thorn_%28letter%29 (Capital Thorn is smaller)
    , 'ß' => 'Ss'
    , 'à' => 'a'
    , 'á' => 'a'
    , 'â' => 'a'
    , 'ã' => 'a'
    , 'ä' => 'a'
    , 'å' => 'a'
    , 'æ' => 'e'
    , 'ç' => 'c'
    , 'è' => 'e'
    , 'é' => 'e'
    , 'ê' => 'e'
    , 'ë' => 'e'
    , 'ì' => 'i'
    , 'í' => 'i'
    , 'î' => 'i'
    , 'ï' => 'i'
    , 'ð' => 'dh'  // http://en.wikipedia.org/wiki/Eth
    , 'ñ' => 'n'
    , 'ò' => 'o'
    , 'ó' => 'o'
    , 'ô' => 'o'
    , 'õ' => 'o'
    , 'ö' => 'o'
    , 'ø' => 'o'
    , 'ù' => 'u'
    , 'ú' => 'u'
    , 'û' => 'u'
    , 'ü' => 'u'
    , 'ý' => 'y'
    , 'þ' => 'th' // http://en.wikipedia.org/wiki/Thorn_%28letter%29
    , 'ÿ' => 'y'
    )
    ;

    // THE EXPECTED RETURN
    $r = strtoupper(substr($return,0,1));

    // RETURN THE "TRANSLATED" TEXT
    if ($r == 'T') return strtr($str, $normal);

    // RETURN THE "ENTITIZED" TEXT
    if ($r == 'E')
    {
        if (empty($entity))
        {
            foreach ($normal as $key => $nothing)
            {
                $entity[$key] = '&#' . ord($key) . ';';
            }
        }
        return strtr($str, $entity);
    }

    // RETURN THE UTF-8 TEXT
    if ($r == 'U')
    {
        if (empty($utf8))
        {
            foreach ($normal as $key => $nothing)
            {
                $utf8[$key] = utf8_encode($key);
            }
        }
        return strtr($str, $utf8);
    }

    // MIGHT BE USEFUL TO GET THE LIST OF ORIGINAL LETTERS
    return array_keys($normal);
}


Author Comment

by:Johnny
I really can't believe this is so horrid, that feeds and readers do not have a standard anyone is really following, and that UTF-8 is even used. It's the worst problem I have ever faced.

I guess I'm stuck with still looking for a way to fix this problem, as I'm really not going to take months to write code for each one, although it looks like that's the case. I'm sick of this stupid problem.

I really did not get my answer on how to fix my current problem. I have tried to convert it on both input and output display, but it still does not parse right and fails during processing. (Good to know it was not me; I too use Textpad a lot.) It seems like I'm going to have to see if I can find someplace else that parses the info differently or has a check for it. Yet more searching.

I did come across an ICS parser that has more routines; I was trying to keep it simple. Guess that's not going to be the case.

Thanks for the help. I'm going to accept the answer, as I do not believe this can be easily solved and you did point me in a direction, although it's far from a solution.

Author Closing Comment

by:Johnny
thx
LVL 108

Expert Comment

by:Ray Paseur
Thanks for the points.  Sorry it's like this, but I've found the same issues any time I needed to get similar data from diverse sources.  There's always something that keeps you from using the same code, and usually it's something fairly small.

A good design for this kind of multiple-API might be an abstract class that defined the core functionality and abstracted out the specifics of each API.  The script would detect the API and instantiate the concrete class that matched the API.  

Best of luck, ~Ray

Author Comment

by:Johnny
still a bit over my current understanding.

Author Comment

by:Johnny
@ray

Is there no way to fix this problem I currently have with Navy Pier, so that it processes the rest of the file? I have checked for UTF-8 and tried to account for it, but it does not work, unless I'm checking or parsing wrong.

thanks
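
Judging only from the code posted in the question, one likely reason the import stops part-way is that `SUMMARY` and `LOCATION` go into the SQL string without escaping (only `DESCRIPTION` is passed through `mysql_real_escape_string()`), so a title containing an apostrophe makes the query fail and the `die()` call ends the script. A minimal sketch of one way around that follows, assuming PDO is available and the table has been converted to a UTF-8 character set. The function name `importEvents` and the reduced column list are assumptions for illustration, and the real table has further NOT NULL columns that would need values.

```php
<?php
// Hypothetical helper: insert normalized rows one by one, skipping failures
// instead of stopping the whole import at the first bad row.
function importEvents(PDO $pdo, array $rows): array
{
    // Make PDO throw on errors so a bad row can be caught and skipped.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Placeholders keep quotes, ampersands, and similar characters out of the SQL text.
    $stmt = $pdo->prepare(
        'INSERT INTO iwia_events
            (event_title, event_location, event_start, event_end, event_text)
         VALUES (:title, :location, :start, :end, :text)'
    );

    $inserted = 0;
    $errors   = [];

    foreach ($rows as $i => $row) {
        try {
            $stmt->execute([
                ':title'    => $row['event_title'],
                ':location' => $row['event_location'],
                ':start'    => $row['event_start'],
                ':end'      => $row['event_end'],
                ':text'     => $row['event_text'],
            ]);
            $inserted++;
        } catch (PDOException $e) {
            // Record the problem and carry on with the next event.
            $errors[$i] = $e->getMessage();
        }
    }

    return ['inserted' => $inserted, 'errors' => $errors];
}
```

For MySQL, the connection would typically be opened with a DSN that includes `charset=utf8mb4`, so that the connection, the table, and the feed text all agree on UTF-8 and no `utf8_decode()` step is needed.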