Solved

php postgresql convert mixed encoding to utf-8

Posted on 2014-12-08
Last Modified: 2014-12-08
I have user input/EDI that may arrive in different encodings (ISO, Windows, etc.).  When I insert it into the database, PostgreSQL complains: invalid byte sequence for encoding "UTF8".

PostgreSQL DB encoding is UTF-8.

I cannot control the user input/EDI.  The SQL query actually has mixed encoding, including UTF-8.

Is there a way to convert the non-UTF-8 strings/chars to UTF-8 and leave alone the strings/chars that are already UTF-8?

Thanks.
Question by:flowerbloom
6 Comments
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 40487960
This is a complicated area of work, and there is a lot to know in order to make it come out right.  I've researched it and run it to ground for both PHP and MySQL.  If PostgreSQL is using UTF8 and the issue is that the client input is incompatible with UTF8, this article will lead you in the right direction.
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_11880-Unicode-PHP-and-Character-Collisions.html
 
LVL 1

Author Comment

by:flowerbloom
ID: 40487983
Hi Ray,

Nice article.  It does not help me.  Let me provide an example.

$q = "insert into t1 (f1,f2,f3) values ('v1','v2','v3');";
pg_exec($q);

Where the database is PostgreSQL with UTF-8 encoding, v1 is encoded as UTF-8, v2 as ISO-8859-1, and v3 as Windows-1255.

Error message:  invalid byte sequence for encoding "UTF8".

I need something like:
$q = convert_to_utf8_ignoring_already_utf8($q);
pg_exec($q);

After which the update succeeds.


That is, a function along these lines:
function convert_to_utf8_ignoring_already_utf8($s) {
  // Do some magic with $s: break it apart, put it back together,
  // leave the UTF-8 parts alone, and make all of $s UTF-8 as $new_s.
  return $new_s;
}


Thanks.
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 40487992
There is the utf8_encode function (http://php.net/function.utf8-encode), which converts ISO-8859-1 to UTF-8, but it does not automatically recognize strings that are already UTF-8.  You have to know what you are feeding it.
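To illustrate why you have to know what you are feeding it, here is a minimal sketch (not part of the original comment). It uses iconv() with an explicit ISO-8859-1 source as a stand-in for utf8_encode, since both perform the same ISO-8859-1 to UTF-8 conversion; the sample strings are made up for the example.

```php
<?php
// "café" encoded two ways: é is one byte in ISO-8859-1, two bytes in UTF-8.
$latin1 = "caf\xE9";
$utf8   = "caf\xC3\xA9";

// Real ISO-8859-1 input converts to correct UTF-8.
$ok = iconv('ISO-8859-1', 'UTF-8', $latin1);

// Input that is already UTF-8 gets double-encoded: each of its two
// bytes is treated as a separate ISO-8859-1 character ("cafÃ©").
$doubled = iconv('ISO-8859-1', 'UTF-8', $utf8);

var_dump($ok === $utf8);          // bool(true)
echo bin2hex($doubled), PHP_EOL;  // 636166c383c2a9
```

The second conversion does not fail. It silently produces garbage, which is why blind conversion of mixed input is not safe.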
LVL 1

Author Comment

by:flowerbloom
ID: 40488032
Hi Dave.  This does not help. Thanks anyhow.
 
LVL 110

Accepted Solution

by:Ray Paseur
ID: 40488105
Where the database is PostgreSQL with UTF-8 encoding, v1 is encoded as UTF-8, v2 as ISO-8859-1, and v3 as Windows-1255.
This is something you must fix -- there is no automated solution.

$q = convert_to_utf8_ignoring_already_utf8($q);
The article explains why this is impossible.  Sorry, there are no unicorns.  Your script must detect the encoding of the existing data and must use the correct PHP functions to handle the process of conversion to UTF8, with a sensitivity to data that is already UTF8.  

If you want to go back to the article and read it carefully for comprehension, I'll be glad to help.  I wrote it as clearly as I could, but this is not a subject with magic bullets -- it requires detailed, step-by-step understanding of the issues.  If you read the article and think I might be able to help, please post your input data and the exact output you want to achieve.  By "input" I mean the data you get from the external sources, such as a client request.  By "output" I mean the data you want to put into the PostgreSQL query strings.
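As a rough sketch of what "detect, then convert, with a sensitivity to data that is already UTF8" might look like for a single value (this is an illustration, not code from the article): the helper name to_utf8 and the $sourceEncoding parameter are hypothetical, and the caller still has to supply the encoding the value would be in if it is not UTF-8. It assumes the PCRE and iconv extensions are available.

```php
<?php
/**
 * Hypothetical helper: return $value as UTF-8.
 * $sourceEncoding is the encoding to assume when $value is NOT valid UTF-8;
 * the caller has to know it, it cannot be guessed reliably.
 */
function to_utf8(string $value, string $sourceEncoding): string
{
    // With the /u modifier, preg_match returns false on invalid UTF-8,
    // so a result of 1 means the bytes already form valid UTF-8.
    if (preg_match('//u', $value) === 1) {
        return $value;
    }

    // Otherwise convert from the declared encoding (@ hides iconv's notice;
    // the failure is reported through the exception instead).
    $converted = @iconv($sourceEncoding, 'UTF-8', $value);
    if ($converted === false) {
        throw new InvalidArgumentException("Value is not valid $sourceEncoding");
    }
    return $converted;
}
```

Caveat: the validity check is only a heuristic. Plain ASCII passes either way, and some byte sequences from a single-byte encoding also happen to be valid UTF-8, in which case they are left unconverted. That is the reason no fully automatic solution exists.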
 
LVL 1

Author Closing Comment

by:flowerbloom
ID: 40488109
The perfect solution would be a function that breaks down the PostgreSQL query, goes over each input field/value, and converts it to UTF-8.  Oh well.  Thanks anyhow.
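A possible alternative to taking the finished query apart is to convert each value before the query is built. The sketch below assumes the encoding of every field is known per source (for example, per EDI partner); the function name values_to_utf8 and the $encodings array are hypothetical, and it relies on the PCRE and iconv extensions.

```php
<?php
/**
 * Hypothetical helper: convert each value from its declared encoding to UTF-8.
 * $values and $encodings share the same keys; values declared as UTF-8 are
 * only validated, never converted a second time.
 */
function values_to_utf8(array $values, array $encodings): array
{
    $out = [];
    foreach ($values as $key => $value) {
        $from = $encodings[$key];
        $converted = (strtoupper($from) === 'UTF-8')
            ? $value
            : @iconv($from, 'UTF-8', $value);

        // Reject anything that is still not valid UTF-8 after this step.
        if ($converted === false || preg_match('//u', $converted) !== 1) {
            throw new InvalidArgumentException("Field $key is not valid $from");
        }
        $out[$key] = $converted;
    }
    return $out;
}

// Example: the same word arriving in two different encodings.
$params = values_to_utf8(
    ["caf\xC3\xA9", "caf\xE9"],
    ['UTF-8', 'ISO-8859-1']
);
```

The resulting $params array could then be passed to pg_query_params() with placeholders such as $1 and $2 in the SQL text, which keeps the values out of the query string entirely.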