Solved

http_referer encoding

Posted on 2005-04-19
Last Modified: 2013-12-25
What charset is used in the http_referer variable? I want to parse the keywords passed to a search engine, and they may contain Scandinavian and Cyrillic characters. These are replaced with some weird code in http_referer. I know it is a hexadecimal value, but...

Joonas Harjumäki

Question by:OnLinux
7 Comments
 
LVL 18

Expert Comment

by:kandura
ID: 13819119
use CGI qw/:standard/;

# referer() returns the Referer header as sent, still URL-encoded
print referer();

 
LVL 18

Expert Comment

by:kandura
ID: 13819133
or

    $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
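A minimal sketch of how that substitution might be used to pull the search keywords out of a referrer. The helper names `url_decode` and `referer_keywords`, the parameter name `q`, and the example URL are assumptions for illustration; search engines use different parameter names. Note that the result is a string of raw bytes, not yet characters in any particular charset.

```perl
use strict;
use warnings;

# Decode one URL-encoded query value into raw bytes.
# In query strings "+" stands for a space, so convert it before
# expanding %XX (an encoded literal plus arrives as %2B).
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $s;
}

# Return the decoded value of one query parameter, or nothing if absent.
# The default parameter name "q" is an assumption; engines differ.
sub referer_keywords {
    my ($referer, $param) = @_;
    $param = 'q' unless defined $param;
    # Everything after the leftmost "?" up to an optional "#fragment".
    my ($query) = $referer =~ /\?([^#]*)/ or return;
    for my $pair (split /[&;]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && defined $value;
        return url_decode($value) if url_decode($name) eq $param;
    }
    return;
}

# Hypothetical referrer with UTF-8 encoded Finnish keywords.
my $ref   = 'http://www.example.com/search?hl=fi&q=j%C3%A4rvi+kes%C3%A4';
my $bytes = referer_keywords($ref);
printf "%d bytes\n", length $bytes;   # prints "12 bytes"
```

The length is 12 rather than 10 because each "ä" arrived as two UTF-8 bytes; turning the bytes into characters is a separate step that depends on the charset the browser used.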
 
LVL 51

Expert Comment

by:ahoffmann
ID: 13819259
>  What charset is used in http_referer
"is used" can be anything you could imagine or not ;-)
if you mean "recommended to use according to the standards" then it is US-ASCII (7-bit) with special encoding for some characters (URL-encoding)

 
LVL 18

Expert Comment

by:kandura
ID: 13820975
digging a little deeper, turns up that you can even have utf-8 in there. url encoded, of course, and on the byte level, so it will be hell to figure out what the original encoding was.
I doubt Encode::Guess can help much with so little data.
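One common heuristic, sketched here under assumptions, is to treat the decoded bytes as UTF-8 when they form valid UTF-8 and otherwise fall back to a single-byte charset chosen by the caller. The helper name `bytes_to_text` and the `iso-8859-1` default are hypothetical; for Cyrillic traffic a fallback such as `cp1251` or `koi8-r` may be more appropriate. This is a guess, not a guarantee: short single-byte strings can occasionally happen to be valid UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Turn URL-decoded referrer bytes into a Perl character string.
# Strategy (a heuristic): accept the bytes as UTF-8 if they are valid
# UTF-8, otherwise decode them with a single-byte fallback charset.
sub bytes_to_text {
    my ($bytes, $fallback) = @_;
    $fallback = 'iso-8859-1' unless defined $fallback;   # assumed default
    my $copy = $bytes;   # decode() may modify its input when CHECK is set
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $text if defined $text;
    return decode($fallback, $bytes);
}

# "ä" sent as UTF-8 and as Latin-1 both end up as U+00E4.
printf "U+%04X\n", ord bytes_to_text("\xC3\xA4");
printf "U+%04X\n", ord bytes_to_text("\xE4");
```

Trying strict UTF-8 first works reasonably well because text in legacy 8-bit charsets rarely forms valid multi-byte UTF-8 sequences by accident.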
 
LVL 51

Accepted Solution

by:
ahoffmann earned 100 total points
ID: 13821947
> ..  turns up that you can even have utf-8 in there.
no,
or more exactly: not for a standard-compliant URL
It has to be [a-zA-Z0-9/:&?;|!_-] with anything else URL-encoded; keep in mind that encoding is different for URI and query string. Hence utf-8 needs to be URL-encoded (either %22 or %0022 for ")

This does not mean that a URL can not have other characters, you can write anything there even null bytes ;-)
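For reference, a rough sketch of what "UTF-8, URL-encoded on the byte level" looks like from the sending side. The helper name `url_encode_utf8` is hypothetical, and the set of characters left unescaped here is the RFC 3986 "unreserved" set (letters, digits, `-`, `.`, `_`, `~`), which is an assumption of this sketch and not the exact list quoted above.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Percent-encode a character string as UTF-8 bytes.
# Every byte outside the RFC 3986 "unreserved" set becomes %XX.
sub url_encode_utf8 {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/eg;
    return $bytes;
}

print url_encode_utf8("j\x{E4}rvi"), "\n";   # prints "j%C3%A4rvi"
print url_encode_utf8('"'), "\n";            # prints "%22"
```

Each non-ASCII character turns into one `%XX` triplet per UTF-8 byte, which is why the receiving side sees only ASCII and has to infer the original charset.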
 
LVL 18

Assisted Solution

by:kandura
kandura earned 100 total points
ID: 13823043
ahoffmann,
> and anything else URL-encoded

that's what i said: "you can even have utf-8 in there. url encoded, of course, and on the byte level".

ahoffmann,
> keep in mind that encoding is different for URI and query string

what's that supposed to mean? the query string _is part of the uri_.


 
LVL 51

Expert Comment

by:ahoffmann
ID: 13823619
oops, always inter-mixing uri and url, I meant the part left and right of the leftmost ? character
