Importing MySQL File with Special Characters

I am trying to transfer a WordPress installation to a new server and am experiencing errors bringing over the pages and posts.  Specifically, any special characters will escape the *.sql file I am trying to upload, so the full post is not transferred into the new database.  

Here is an example where the "Welcome to" is copied into the new page, but when it hits the double quotes for “Alternative Considerations of Jonestown and Peoples Temple,” it stops importing the content, and moves on:
INSERT INTO `wp_posts` (`ID`, `post_author`, `post_date`, `post_date_gmt`, `post_content`, `post_title`, `post_excerpt`, `post_status`, `comment_status`, `ping_status`, `post_password`, `post_name`, `to_ping`, `pinged`, `post_modified`, `post_modified_gmt`, `post_content_filtered`, `post_parent`, `guid`, `menu_order`, `post_type`, `post_mime_type`, `comment_count`) VALUES
(7, 1, '2013-01-05 18:33:49', '2013-01-06 02:33:49', '[caption id="attachment_69" align="alignright" width="240"]<a href="http://jonestown.sdsu.edu/wordpress/?attachment_id=69" rel="attachment wp-att-69"><img class="size-full wp-image-69" alt="Children in Peoples Temple" src="http://jonestown.sdsu.edu/wordpress/wp-content/uploads/2013/01/snapshot_children.jpg" width="240" height="213" /></a> Children in Peoples Temple[/caption]\r\n\r\nWelcome to “Alternative Considerations of Jonestown and Peoples Temple,” sponsored by the Department of Religious Studies at San Diego State University. This website is designed to give personal and scholarly perspectives on a major event in the history of religion in America. Its primary purpose is to present information about Peoples Temple as accurately and objectively as possible. In an effort to be impartial, we offer many diverse views and opinions about the Temple and the events in Jonestown.\r\n\r\nWe hope that visitors to the site will come away with an understanding that the story of Jonestown did not start or end on 18 November 1978.\r\n\r\nWhat is unique about this website are three main features:\r\n<ol>\r\n      <li>Memorialization of those who died and those who survived the tragedy of 18 November 1978 in order to remember their lives and humanize their deaths.</li>\r\n      <li>Documentation of the numerous government investigations into Peoples Temple and Jonestown through materials released under the Freedom of Information Act.</li>\r\n      <li>Presentation of Peoples Temple and its members in their own words: through articles, tapes, letters, photographs and other items. These materials let readers make their own judgments about the group and its end.</li>\r\n</ol>\r\nTape transcripts, summaries, some primary source documents, and photographs not otherwise designated as copyrighted on this site are free and available to the public for use by crediting: The Jonestown Institute, <a title="http://jonestown.sdsu.edu" href="http://jonestown.sdsu.edu">http://jonestown.sdsu.edu</a>. Other items require the author’s permission for re-publication.\r\n<table style="width: 100%; background-color: #f2e4c9; box-shadow: 0 0 3px rgba(0, 0, 0, 0.1);" border="0" cellspacing="0" cellpadding="0">\r\n<tbody>\r\n<tr>\r\n<td style="width: 22%;" valign="top"><a href="mailto:remoore@mail.sdsu.edu">Rebecca Moore</a>\r\nSite Manager</td>\r\n<td style="width: 28%;" valign="top"><a href="mailto:fieldingmcgehee@yahoo.com">Fielding M. McGehee III</a>\r\nResearch Director</td>\r\n<td style="width: 22%;" valign="top"><a href="mailto:ekshaules@gmail.com"> Elizabeth Shaules</a>\r\nSite Designer</td>\r\n<td style="width: 28%;" valign="top"><a href="mailto:Rikke@Wettendorff.net"><span>Rikke Wettendorff</span></a>\r\n<span class="MsoNormal">Peoples Temple\r\nDiscussion Forum</span></td>\r\n</tr>\r\n</tbody>\r\n</table>', 'Welcome!', '', 'publish', 'open', 'open', '', 'welcome', '', '', '2013-01-11 21:48:00', '2013-01-12 05:48:00', '', 0, 'http://jonestown.sdsu.edu/wordpress/?page_id=7', 0, 'page', '', 0),

How can I get around this?  

Thanks in advance!
maeve100Asked:
Who is Participating?
 
Ray PaseurConnect With a Mentor Commented:
ASCII is a subset of UTF-8, and no UTF-8 multibyte character appears in ASCII, so by definition a collision cannot occur between these encodings.  However Windows-1252 defines characters in the "dead zones" of both UTF-8 and ISO-8859-x.  The quote mark immediately before "Alternative" is hex 93, decimal 147 and the trailing quote mark is hex 94, decimal code point 148.  These are invalid codes unless your entire process is using Windows-1252 encoding.  See these links:
http://www.laprbass.com/RAY_temp_maeve100.php?charset=windows-1252 WORKS
http://www.laprbass.com/RAY_temp_maeve100.php?charset=utf-8 FAILS
http://www.laprbass.com/RAY_temp_maeve100.php?charset=ISO-8859-1 WORKS IN SOME BROWSERS

PHP internal encoding, absent any deliberate changes, is ISO-8859-1.  So my conclusion is that there is a character-encoding error somewhere along the line.  You might want to get the background knowledge from this article.
http://www.joelonsoftware.com/articles/Unicode.html

Check the original installation to see what character set was used in each of the HTML, the PHP, the MySQL.  If you use those same encodings consistently in the new installation you should be OK.

If you want to experiment with different encodings, this script will help.

<?php // RAY_temp_maeve100.php
error_reporting(E_ALL);


// SEE http://www.joelonsoftware.com/articles/Unicode.html


// CHOOSE A CHARSET VALUE FROM THE URL ARGUMENT utf-8, windows-1252, iso-8859-1, iso-8859-15, etc.
$charset = isset($_GET['charset']) ? $_GET['charset'] : 'windows-1252';

// SET THE INTERNAL ENCODING
mb_internal_encoding($charset);


// START WITH HTML5 DOCTYPE
$html5 = <<<ENDHTML5
<!DOCTYPE html>
<html dir="ltr" lang="en-US">
<head>
<meta charset="$charset" />
<title>CHARACTER SET $charset</title>
</head>
<body>
<pre>
THE ORIGINAL CHARACTER SET IS <b>$charset</b>

Welcome to “Alternative Considerations of Jonestown and Peoples Temple,” sponsored by 

ENDHTML5;

echo $html5;

Open in new window

HTH, ~Ray
0
 
gplanaCommented:
Have you identified which are the special characters that aren't recognized well? Maybe it's a problem of character encoding? See if your file is using ASCII and your database is using UTF-8, for example.
0
 
Beverley PortlockCommented:
The SQL above works just fine on a Wordpress import. In my test the table type was UTF8_general_ci however the SQL did fail until I removed the trailing comma

', 0, 'page', '', 0),  

changed to

', 0, 'page', '', 0)

(Last comma removed)
0
Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
gplanaCommented:
You should precede any single quote (') with a back-slash (\')

So this snipped:
               ', 0, 'page', '', 0),  
should be:
               \', 0, \'page\', \'\', 0),
0
 
Jason C. LevineConnect With a Mentor No oneCommented:
The "fancy quotes" can cause problems on import.  I always suggest doing a search and replace on those from their normal state to the HTML entity version.
0
 
Ray PaseurCommented:
This will replace the angle-quotes with numeric entities.  Use "View Source" to see the results.

<?php // RAY_temp_maeve100.php
error_reporting(E_ALL);


// THE STRING WITH MICROSOFT CHARACTERS
$str = 'Welcome to “Alternative Considerations of Jonestown and Peoples Temple,” sponsored by';

// TRANSFORM THE QUOTES
$arr = array
( '“' => '&#147;'
, '”' => '&#148;'
)
;
$new = strtr($str, $arr);

// SHOW THE NEW VALUES
echo $new;

Open in new window

HTH, ~Ray
0
 
maeve100Author Commented:
Unfortunately, I ended up having to comb the SQL files and replace all the special characters with their HTML counterpart.  I'm not sure why the import is so sensitive that it can't compensate for things like this - even an accented letter would break it.
0
 
Ray PaseurCommented:
Yeah, it's a nasty problem once the data base gets polluted!
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.