Solved

php postgresql convert mixed encoding to utf-8

Posted on 2014-12-08
Last Modified: 2014-12-08
I have user input/EDI that may be in different encodings (ISO, Win, etc.).  When I insert it into the database, PostgreSQL complains: invalid byte sequence for encoding "UTF8".

PostgreSQL DB encoding is UTF-8.

I cannot control the user input/EDI.  The SQL query actually has mixed encoding, including UTF-8.

Is there a way to convert the non-UTF-8 strings/chars to UTF-8 and leave the strings/chars that are already UTF-8 untouched?

Thanks.
Question by:flowerbloom
6 Comments
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 40487960
This is a complicated area of work, and there is a lot to know in order to make it come out right.  I've researched it and run it to ground for both PHP and MySQL.  If PostgreSQL is using UTF8 and the issue is that the client input is incompatible with UTF8, this article will lead you in the right direction.
http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/A_11880-Unicode-PHP-and-Character-Collisions.html
 
LVL 1

Author Comment

by:flowerbloom
ID: 40487983
Hi Ray,

Nice article.  It does not help me.  Let me provide an example.

$q = "insert into t1 (f1,f2,f3) values ('v1','v2','v3');";
pg_exec($q);

Where the database is PostgreSQL with UTF-8 encoding, v1 encoding is UTF-8, v2 encoding is ISO-8859-1, and v3 encoding is Windows-1255.

Error message:  invalid byte sequence for encoding "UTF8".

I need something like:
$q = convert_to_utf8_ignoring_already_utf8($q);
pg_exec($q);

Update successfully.


I need something like:
function convert_to_utf8_ignoring_already_utf8 ($s) {
  // Do some magic with $s: break it apart, put it together, ignore UTF-8.  Make all of $s UTF-8.
  $new_s = $s; // placeholder for the conversion
  return $new_s;
}


Thanks.
 
LVL 84

Expert Comment

by:Dave Baldwin
ID: 40487992
There is the function utf8_encode (http://php.net/function.utf8-encode), but it does not 'automatically' recognize UTF-8 strings; it assumes its input is ISO-8859-1.  You have to know what you are feeding it.
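
A minimal sketch of why the input encoding has to be known, assuming the mbstring extension is loaded. It uses mb_convert_encoding with an ISO-8859-1 source, which performs the same conversion as utf8_encode; the sample strings are illustrative and written as explicit bytes so the file's own encoding does not matter.

```php
<?php
// "café" in two encodings (sample data, not from the question).
$latin1 = "caf\xE9";      // ISO-8859-1: é is the single byte 0xE9
$utf8   = "caf\xC3\xA9";  // UTF-8: é is the two bytes 0xC3 0xA9

// Treat every input byte as ISO-8859-1, as utf8_encode does.
$fromLatin1 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$fromUtf8   = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// The ISO-8859-1 input comes out as correct UTF-8.
var_dump($fromLatin1 === $utf8);

// The input that was already UTF-8 is encoded a second time:
// 0xC3 0xA9 becomes 0xC3 0x83 0xC2 0xA9.
var_dump(bin2hex($fromUtf8));
```

The second result is valid UTF-8, so PostgreSQL would accept it, but it would store the wrong characters instead of é.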

 
LVL 1

Author Comment

by:flowerbloom
ID: 40488032
Hi Dave.  This does not help. Thanks anyhow.
 
LVL 111

Accepted Solution

by:
Ray Paseur earned 1500 total points
ID: 40488105
Where the database is PostgreSQL with UTF-8 encoding, v1 encoding is UTF-8, v2 encoding is ISO-8859-1, and v3 encoding is Windows-1255.
This is something you must fix -- there is no automated solution.

$q = convert_to_utf8_ignoring_already_utf8($q);
The article explains why this is impossible.  Sorry, there are no unicorns.  Your script must detect the encoding of the existing data and must use the correct PHP functions to handle the process of conversion to UTF8, with a sensitivity to data that is already UTF8.  

If you want to go back to the article and read it carefully for comprehension, I'll be glad to help.  I wrote it as clearly as I could, but this is not a subject with magic bullets -- it requires detailed, step-by-step understanding of the issues.  If you read the article and think I might be able to help, please post your input data and the exact output you want to achieve.  By "input" I mean the data you get from the external sources, such as a client request.  By "output" I mean the data you want to put into the PostgreSQL query strings.
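
The "detect, then convert" step described above might look like the following sketch, assuming the mbstring extension is loaded. The helper name to_utf8 is hypothetical, and the source encoding is supplied by the caller because it cannot be reliably guessed from the bytes alone.

```php
<?php
/**
 * Hypothetical helper: return $value as UTF-8.
 * Bytes that already form valid UTF-8 are returned unchanged; anything else
 * is converted from $sourceEncoding, which the caller has to know.
 */
function to_utf8(string $value, string $sourceEncoding): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8 (pure ASCII also lands here)
    }
    return mb_convert_encoding($value, 'UTF-8', $sourceEncoding);
}

// Sample values written as explicit bytes (illustrative, not the asker's data).
$v1 = to_utf8("caf\xC3\xA9", 'ISO-8859-1'); // valid UTF-8, left alone
$v2 = to_utf8("caf\xE9", 'ISO-8859-1');     // converted from ISO-8859-1
$v3 = to_utf8("\x80 5", 'Windows-1252');    // 0x80 is the euro sign in Windows-1252

var_dump(bin2hex($v1), bin2hex($v2), bin2hex($v3));
```

The validity check is only a heuristic: some ISO-8859-1 or Windows byte sequences also happen to be valid UTF-8 and would pass through unconverted, which is one reason a fully automatic fix is not possible. Windows-1255 is not used in the sample because mbstring may not support it; a converter that does would be needed for that field.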
 
LVL 1

Author Closing Comment

by:flowerbloom
ID: 40488109
The perfect solution would be a function that breaks down the PostgreSQL query, goes over each input field/value, and converts it to UTF-8.  Oh well.  Thanks anyhow.
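
One way to avoid breaking the query apart afterwards is to convert each value while it is still separate from the SQL text. The sketch below assumes the mbstring extension is loaded and that the encoding of each field is known from its source; fields_to_utf8 is a hypothetical helper, and the sample bytes are illustrative.

```php
<?php
/**
 * Hypothetical helper: convert each value from its known source encoding
 * to UTF-8. $fields is a list of [value, sourceEncoding] pairs.
 */
function fields_to_utf8(array $fields): array
{
    $out = [];
    foreach ($fields as [$value, $encoding]) {
        $out[] = mb_convert_encoding($value, 'UTF-8', $encoding);
    }
    return $out;
}

$params = fields_to_utf8([
    ["caf\xC3\xA9", 'UTF-8'],      // already UTF-8, passes through unchanged
    ["caf\xE9", 'ISO-8859-1'],
    ["\x80 5", 'Windows-1252'],
]);

// With an open connection $conn (assumed, not created here), the converted
// values could then be sent separately from the SQL text:
// pg_query_params($conn, 'INSERT INTO t1 (f1, f2, f3) VALUES ($1, $2, $3)', $params);
```

Passing the values as parameters also means the SQL text itself stays plain ASCII, so only the values need converting.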