Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Importing MySQL File with Special Characters

Posted on 2013-06-23
8
Medium Priority
?
845 Views
Last Modified: 2013-06-24
I am trying to transfer a WordPress installation to a new server and am experiencing errors bringing over the pages and posts.  Specifically, any special characters will escape the *.sql file I am trying to upload, so the full post is not transferred into the new database.  

Here is an example where the "Welcome to" is copied into the new page, but when it hits the double quotes for “Alternative Considerations of Jonestown and Peoples Temple,” it stops importing the content, and moves on:
INSERT INTO `wp_posts` (`ID`, `post_author`, `post_date`, `post_date_gmt`, `post_content`, `post_title`, `post_excerpt`, `post_status`, `comment_status`, `ping_status`, `post_password`, `post_name`, `to_ping`, `pinged`, `post_modified`, `post_modified_gmt`, `post_content_filtered`, `post_parent`, `guid`, `menu_order`, `post_type`, `post_mime_type`, `comment_count`) VALUES
(7, 1, '2013-01-05 18:33:49', '2013-01-06 02:33:49', '[caption id="attachment_69" align="alignright" width="240"]<a href="http://jonestown.sdsu.edu/wordpress/?attachment_id=69" rel="attachment wp-att-69"><img class="size-full wp-image-69" alt="Children in Peoples Temple" src="http://jonestown.sdsu.edu/wordpress/wp-content/uploads/2013/01/snapshot_children.jpg" width="240" height="213" /></a> Children in Peoples Temple[/caption]\r\n\r\nWelcome to “Alternative Considerations of Jonestown and Peoples Temple,” sponsored by the Department of Religious Studies at San Diego State University. This website is designed to give personal and scholarly perspectives on a major event in the history of religion in America. Its primary purpose is to present information about Peoples Temple as accurately and objectively as possible. In an effort to be impartial, we offer many diverse views and opinions about the Temple and the events in Jonestown.\r\n\r\nWe hope that visitors to the site will come away with an understanding that the story of Jonestown did not start or end on 18 November 1978.\r\n\r\nWhat is unique about this website are three main features:\r\n<ol>\r\n      <li>Memorialization of those who died and those who survived the tragedy of 18 November 1978 in order to remember their lives and humanize their deaths.</li>\r\n      <li>Documentation of the numerous government investigations into Peoples Temple and Jonestown through materials released under the Freedom of Information Act.</li>\r\n      <li>Presentation of Peoples Temple and its members in their own words: through articles, tapes, letters, photographs and other items. These materials let readers make their own judgments about the group and its end.</li>\r\n</ol>\r\nTape transcripts, summaries, some primary source documents, and photographs not otherwise designated as copyrighted on this site are free and available to the public for use by crediting: The Jonestown Institute, <a title="http://jonestown.sdsu.edu" href="http://jonestown.sdsu.edu">http://jonestown.sdsu.edu</a>. Other items require the author’s permission for re-publication.\r\n<table style="width: 100%; background-color: #f2e4c9; box-shadow: 0 0 3px rgba(0, 0, 0, 0.1);" border="0" cellspacing="0" cellpadding="0">\r\n<tbody>\r\n<tr>\r\n<td style="width: 22%;" valign="top"><a href="mailto:remoore@mail.sdsu.edu">Rebecca Moore</a>\r\nSite Manager</td>\r\n<td style="width: 28%;" valign="top"><a href="mailto:fieldingmcgehee@yahoo.com">Fielding M. McGehee III</a>\r\nResearch Director</td>\r\n<td style="width: 22%;" valign="top"><a href="mailto:ekshaules@gmail.com"> Elizabeth Shaules</a>\r\nSite Designer</td>\r\n<td style="width: 28%;" valign="top"><a href="mailto:Rikke@Wettendorff.net"><span>Rikke Wettendorff</span></a>\r\n<span class="MsoNormal">Peoples Temple\r\nDiscussion Forum</span></td>\r\n</tr>\r\n</tbody>\r\n</table>', 'Welcome!', '', 'publish', 'open', 'open', '', 'welcome', '', '', '2013-01-11 21:48:00', '2013-01-12 05:48:00', '', 0, 'http://jonestown.sdsu.edu/wordpress/?page_id=7', 0, 'page', '', 0),

How can I get around this?  

Thanks in advance!
0
Comment
Question by:maeve100
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
8 Comments
 
LVL 15

Expert Comment

by:gplana
ID: 39269045
Have you identified which are the special characters that aren't recognized well? Maybe it's a problem of character encoding? See if your file is using ASCII and your database is using UTF-8, for example.
0
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 39269194
The SQL above works just fine on a Wordpress import. In my test the table type was UTF8_general_ci however the SQL did fail until I removed the trailing comma

', 0, 'page', '', 0),  

changed to

', 0, 'page', '', 0)

(Last comma removed)
0
 
LVL 15

Expert Comment

by:gplana
ID: 39269207
You should precede any single quote (') with a back-slash (\')

So this snipped:
               ', 0, 'page', '', 0),  
should be:
               \', 0, \'page\', \'\', 0),
0
Migrating Your Company's PCs

To keep pace with competitors, businesses must keep employees productive, and that means providing them with the latest technology. This document provides the tips and tricks you need to help you migrate an outdated PC fleet to new desktops, laptops, and tablets.

 
LVL 111

Accepted Solution

by:
Ray Paseur earned 1600 total points
ID: 39269429
ASCII is a subset of UTF-8, and no UTF-8 multibyte character appears in ASCII, so by definition a collision cannot occur between these encodings.  However Windows-1252 defines characters in the "dead zones" of both UTF-8 and ISO-8859-x.  The quote mark immediately before "Alternative" is hex 93, decimal 147 and the trailing quote mark is hex 94, decimal code point 148.  These are invalid codes unless your entire process is using Windows-1252 encoding.  See these links:
http://www.laprbass.com/RAY_temp_maeve100.php?charset=windows-1252 WORKS
http://www.laprbass.com/RAY_temp_maeve100.php?charset=utf-8 FAILS
http://www.laprbass.com/RAY_temp_maeve100.php?charset=ISO-8859-1 WORKS IN SOME BROWSERS

PHP internal encoding, absent any deliberate changes, is ISO-8859-1.  So my conclusion is that there is a character-encoding error somewhere along the line.  You might want to get the background knowledge from this article.
http://www.joelonsoftware.com/articles/Unicode.html

Check the original installation to see what character set was used in each of the HTML, the PHP, the MySQL.  If you use those same encodings consistently in the new installation you should be OK.

If you want to experiment with different encodings, this script will help.

<?php // RAY_temp_maeve100.php
error_reporting(E_ALL);


// SEE http://www.joelonsoftware.com/articles/Unicode.html


// CHOOSE A CHARSET VALUE FROM THE URL ARGUMENT utf-8, windows-1252, iso-8859-1, iso-8859-15, etc.
$charset = isset($_GET['charset']) ? $_GET['charset'] : 'windows-1252';

// SET THE INTERNAL ENCODING
mb_internal_encoding($charset);


// START WITH HTML5 DOCTYPE
$html5 = <<<ENDHTML5
<!DOCTYPE html>
<html dir="ltr" lang="en-US">
<head>
<meta charset="$charset" />
<title>CHARACTER SET $charset</title>
</head>
<body>
<pre>
THE ORIGINAL CHARACTER SET IS <b>$charset</b>

Welcome to “Alternative Considerations of Jonestown and Peoples Temple,” sponsored by 

ENDHTML5;

echo $html5;

Open in new window

HTH, ~Ray
0
 
LVL 70

Assisted Solution

by:Jason C. Levine
Jason C. Levine earned 400 total points
ID: 39269573
The "fancy quotes" can cause problems on import.  I always suggest doing a search and replace on those from their normal state to the HTML entity version.
0
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 39269632
This will replace the angle-quotes with numeric entities.  Use "View Source" to see the results.

<?php // RAY_temp_maeve100.php
error_reporting(E_ALL);


// THE STRING WITH MICROSOFT CHARACTERS
$str = 'Welcome to “Alternative Considerations of Jonestown and Peoples Temple,” sponsored by';

// TRANSFORM THE QUOTES
$arr = array
( '“' => '&#147;'
, '”' => '&#148;'
)
;
$new = strtr($str, $arr);

// SHOW THE NEW VALUES
echo $new;

Open in new window

HTH, ~Ray
0
 

Author Closing Comment

by:maeve100
ID: 39272541
Unfortunately, I ended up having to comb the SQL files and replace all the special characters with their HTML counterpart.  I'm not sure why the import is so sensitive that it can't compensate for things like this - even an accented letter would break it.
0
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 39273034
Yeah, it's a nasty problem once the data base gets polluted!
0

Featured Post

Looking for a new Web Host?

Lunarpages' assortment of hosting products and solutions ensure a perfect fit for anyone looking to get their vision or products to market. Our award winning customer support and 30-day money back guarantee show the pride we take in being the industry's premier MSP.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Containers like Docker and Rocket are getting more popular every day. In my conversations with customers, they consistently ask what containers are and how they can use them in their environment. If you’re as curious as most people, read on. . .
By, Vadim Tkachenko. In this article we’ll look at ClickHouse on its one year anniversary.
The purpose of this video is to demonstrate how to integrate Mailchimp with WordPress, by placing a Mailchimp signup form on a WordPress Page or Post. This will be demonstrated using a Windows 8 PC. Mailchimp will be used. Log into your Mailchi…
The purpose of this video is to demonstrate how to prevent comment spam on a WordPress Website. This will be demonstrated using a Windows 8 PC. Plugin Akismet will be used. Go to your WordPress login page. This will look like the following: myw…

662 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question