"Dialectizer" in PHP.. fun project :D

OK, so I am trying to make my own Dialectizer type thing (based on the concept behind http://rinkworks.com/dialect/ )

I am usually pretty PHP competent, but this time I have failed.

Basically what I need it to do, is get a URL, and replace any instance of abc with xyz. I got that far.

Here are the things I can't figure out (not yet, anyway):
It must convert all relative image and CSS links to absolute links pointing to where they should go
It has to point all relative links through the dialectizer as well

So, I thought I would throw it out here and see if anyone could help me out with the code..

Here's my code so far:

<?php
// only accept http(s) URLs, so local files can't be requested
if (!empty($_GET['url']) && preg_match('!^https?://!i', $_GET['url'])) {
// grab the page as text; include() would run the remote output as PHP
$pagecontent = file_get_contents($_GET['url']);

// modify output (str_ireplace() stands in for eregi_replace(), which was removed in PHP 7)
$pagecontent = str_ireplace("http://" , 'index.php?url=http://', $pagecontent);
$pagecontent = str_ireplace("ing", "in'", $pagecontent);
$pagecontent = str_ireplace("and", "an'", $pagecontent);
$pagecontent = str_ireplace("for", "fo'", $pagecontent);
$pagecontent = str_ireplace("broken", "busted", $pagecontent);
$pagecontent = str_ireplace("what", "whut", $pagecontent);
$pagecontent = str_ireplace("ever", "evah", $pagecontent);
$pagecontent = str_ireplace("tion", "shun", $pagecontent);
$pagecontent = str_ireplace("potato", "tater", $pagecontent);
//etc..



// write output to page
echo $pagecontent;
} else {
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Dialectizer</title>
</head>

<body>
<form method="get" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
<input name="url" type="text" /><br />
<input name="submit" type="submit" value="Dialectize It!" />
</form>

</body>
</html>
<?php } ?>



Josh
Asked by: JoshPowell
 
Marcus Bointon commented:
For a start I'd suggest that you build arrays of match patterns and replacements so you can do it all in one call rather than one call per rule. It should be faster too:

$in = array('/ing/i', '/and/i', '/for/i'); //etc (preg patterns need delimiters; /i ignores case)
$out = array("in'", "an'", "fo'");
$translation = preg_replace($in, $out, $pagecontent);
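
One thing to watch with blanket replacements is that they also hit the markup itself, so <form> turns into <fo'm>. Here is a rough sketch of one way around that: split the page into tags and text, and only run the patterns over the text. dialectize_text() is a made-up name for illustration, and the contents of <script> and <style> blocks would still be altered, so treat it as a starting point only.

```php
<?php
// Hypothetical helper: apply the dialect rules only to text between tags.
function dialectize_text($html, array $in, array $out)
{
    // Split into tags and text; the capture group keeps the tags in the result.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // Chunks that start with "<" are tags and are left untouched.
        if ($part !== '' && $part[0] !== '<') {
            $parts[$i] = preg_replace($in, $out, $part);
        }
    }

    return implode('', $parts);
}

$in  = array('/ing/i', '/and/i', '/for/i');
$out = array("in'", "an'", "fo'");

echo dialectize_text('<form action="/landing">Fishing and hunting</form>', $in, $out);
// prints: <form action="/landing">Fishin' an' huntin'</form>
```

The tag and its "/landing" attribute come through unchanged, while the visible text is translated.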

You only need to redirect <a> tag links through your script, so your search for "http" etc is too wide. Try:

$pagecontent = preg_replace('/(<\s*a\s+[^>]*?href\s*=\s*")([^"]*)(")/i', '$1index.php?url=$2$3', $pagecontent);

This also copes with links where the href is not the first attribute in an a tag, with varying spacing, and it is case-insensitive.

You still need to convert images and CSS style or link tags to absolute URLs, along with form submission actions - I'll leave that for someone else, but you might find parse_url() useful. (realpath() only resolves paths on the local filesystem, so any ../ segments in a URL have to be collapsed by hand.)
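
For the relative-to-absolute part, a minimal sketch of how parse_url() could be put to work is below. resolve_url() is a hypothetical helper, not a built-in. It assumes the base is an absolute http(s) URL, that attributes are double-quoted, and PHP 5.3 or later for the closure. It has not been tried against real-world pages.

```php
<?php
// Hypothetical helper: make $rel absolute, relative to the page URL $base.
function resolve_url($base, $rel)
{
    if ($rel === '') {
        return $base;
    }
    // Already absolute (starts with a scheme such as "http:")? Leave it alone.
    if (preg_match('!^[a-z][a-z0-9+.-]*:!i', $rel)) {
        return $rel;
    }

    $b    = parse_url($base);
    $root = $b['scheme'] . '://' . $b['host'] . (isset($b['port']) ? ':' . $b['port'] : '');
    $path = isset($b['path']) ? $b['path'] : '/';

    if (substr($rel, 0, 2) === '//') {          // protocol-relative
        return $b['scheme'] . ':' . $rel;
    }
    if ($rel[0] === '?') {                      // query only: same document path
        return $root . $path . $rel;
    }
    if ($rel[0] === '#') {                      // fragment only: same document and query
        return $root . $path . (isset($b['query']) ? '?' . $b['query'] : '') . $rel;
    }

    if ($rel[0] === '/') {
        $full = $rel;                           // root-relative
    } else {
        // relative to the directory of the base document
        $full = substr($path, 0, strrpos($path, '/') + 1) . $rel;
    }

    // Set any query string or fragment aside before touching the path.
    $tail = '';
    if (preg_match('/^([^?#]*)([?#].*)$/s', $full, $m)) {
        $full = $m[1];
        $tail = $m[2];
    }

    // Collapse "." and ".." segments by hand.
    $out = array();
    foreach (explode('/', $full) as $seg) {
        if ($seg === '..') {
            if (count($out) > 1) {
                array_pop($out);                // never pop the leading empty segment
            }
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    $clean = implode('/', $out);

    return $root . ($clean === '' ? '/' : $clean) . $tail;
}

// Usage: make image, stylesheet and script URLs absolute.
$base = 'http://www.example.com/folder/page.php?arg1=fish&arg2=bob';
$html = '<img src="../img/logo.png"> <link rel="stylesheet" href="/css/site.css">';

$fixed = preg_replace_callback(
    '!(<(?:img|link|script)\b[^>]*?\b(?:src|href)\s*=\s*")([^"]*)(")!i',
    function ($m) use ($base) {
        return $m[1] . resolve_url($base, $m[2]) . $m[3];
    },
    $html
);

echo $fixed;
```

Form actions could be handled the same way by adding the form tag and the action attribute to the pattern.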
 
JoshPowell (Author) commented:
Thanks.. I will read up on those functions :)

Josh
 
JoshPowell (Author) commented:
So does anybody know how to convert relative style/image URLs into absolute ones? =P

Josh
 
krazywest commented:
Heh, I looked at this at work today and wrote a partial solution, but.. guess it doesn't matter now.

<?php
 
$scriptpath='href="dialect.php?url=';
 
$pagereading="http://www.example.com/folder/page.php?arg1=fish&arg2=bob";
 
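// folder of the current page: everything up to the last slash before any query string
preg_match("!^([^?#]*/)!",$pagereading,$regs);
 
$folder=$regs[1];
 
// scheme and host without a trailing slash, so "/index.php" doesn't end up with a double slash
preg_match("!^([a-z0-9_-]+://[^/]*)!i",$pagereading,$regs);
 
$domain=$regs[1];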
 
 
 
function callback_abhref($matches)
{
 global $scriptpath;
 
 echo "abhref: {$matches[1]}<br>";
 return $scriptpath . urlencode($matches[1]) . '"';
}
 
function callback_href($matches)
{
 global $folder,$scriptpath;
 echo "href: {$matches[1]}<br>";
 if (strpos($scriptpath,$matches[1])==1)
  return "FISH";
 return $scriptpath . urlencode($folder . $matches[1]) . '"';
}
 

function callback_roothref($matches)
{
 global $domain,$scriptpath;
 echo "roothref: {$matches[1]}<br>";
 return $scriptpath . urlencode($domain . $matches[1]) . '"';
}
 
$string='
 
<a href="http://google.com/search">Link text</a>
 
<a href="/index.php">Link text</a>
 
<a href="fish/monkey?page=fish">Link text</a>
';
 
$a=$string;
 
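// absolute URLs, e.g. href="http://..." (\s* allows spaces around "="; [^"] stops at the closing quote)
$a=preg_replace_callback('!href\s*=\s*"([a-z][a-z0-9+.-]*://[^"]+)"!i',"callback_abhref",$a);
// relative URLs; this pass also sees the links rewritten above, hence the check in callback_href
$a=preg_replace_callback('!href\s*=\s*"([^/"][^"]*)"!i',"callback_href",$a);
// root-relative URLs, e.g. href="/index.php"
$a=preg_replace_callback('!href\s*=\s*"(/[^"]*)"!i',"callback_roothref",$a);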
 

echo "<pre>" . htmlspecialchars($a);
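
The three passes can step on each other, because the second pattern also matches links the first pass has already rewritten. Below is a sketch of a single-pass variant that classifies each href once. rewrite_links() and its parameters are hypothetical names, and it assumes double-quoted attributes and PHP 5.3 or later. Like the code above it rewrites every href, including the ones on <link> tags. It does not collapse ../ segments, and it does not handle protocol-relative (//host/...) links.

```php
<?php
// Hypothetical single-pass variant: every href is classified and rewritten once.
function rewrite_links($html, $pageurl, $script = 'dialect.php?url=')
{
    // Scheme and host of the current page, without a trailing slash.
    preg_match('!^[a-z][a-z0-9+.-]*://[^/]+!i', $pageurl, $regs);
    $domain = $regs[0];

    // Folder of the current page: up to the last slash before any query string.
    $noquery = preg_replace('![?#].*$!s', '', $pageurl);
    $folder  = substr($noquery, 0, strrpos($noquery, '/') + 1);
    if (strlen($folder) <= strlen($domain)) {
        $folder = $domain . '/';            // page URL had no path at all
    }

    return preg_replace_callback(
        '!href\s*=\s*"([^"]*)"!i',
        function ($m) use ($folder, $domain, $script) {
            $url = $m[1];
            if ($url === '' || $url[0] === '#') {
                return $m[0];               // in-page anchor: leave as is
            }
            if (preg_match('!^[a-z][a-z0-9+.-]*:!i', $url)) {
                if (!preg_match('!^https?://!i', $url)) {
                    return $m[0];           // mailto:, javascript: etc. stay untouched
                }
                $abs = $url;                // already absolute
            } elseif ($url[0] === '/') {
                $abs = $domain . $url;      // root-relative
            } else {
                $abs = $folder . $url;      // relative to the page's folder
            }
            return 'href="' . $script . urlencode($abs) . '"';
        },
        $html
    );
}

$page = 'http://www.example.com/folder/page.php?arg1=fish&arg2=bob';
$html = '
<a href="http://google.com/search">Link text</a>
<a href="/index.php">Link text</a>
<a href="fish/monkey?page=fish">Link text</a>
';

echo "<pre>" . htmlspecialchars(rewrite_links($html, $page));
```

Because each link goes through the callback exactly once, there is no need to detect links that were rewritten earlier.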
 
JoshPowell (Author) commented:
Objection!


I didn't mean to abandon this.. I was just waiting for someone to answer :P

Thanks for your help, krazywest. I will play around with that code :)

Josh
 
krazywest commented:
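Just so you know, the problem I was having was that strpos("string","stringmonkeybob") was returning FALSE instead of 0. That is actually what it should do: strpos() takes the haystack first and the needle second, so the arguments need swapping. A match at the very start also has to be tested with ===, because 0 == FALSE.

I.e.
 if (strpos($scriptpath,$matches[1])==1)
  return "FISH";

Change that to:

 if (strpos($matches[0],$scriptpath)===0)
  return $matches[0];

That leaves a link alone if an earlier pass has already pointed it at the script. And then debug it. ;)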
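
If it helps to see the strpos() behaviour in isolation, here is a tiny sketch. starts_with() is a made-up helper name. On PHP 8 and later the built-in str_starts_with() does the same job.

```php
<?php
// strpos(haystack, needle) returns the position of needle, or false if it is absent.
var_dump(strpos('stringmonkeybob', 'string')); // int(0): found at the very start
var_dump(strpos('string', 'stringmonkeybob')); // bool(false): needle is longer than haystack
var_dump(strpos('monkeystring', 'string'));    // int(6)

// Loose comparison cannot tell "found at 0" from "not found".
var_dump(0 == false);                          // bool(true)
var_dump(0 === false);                         // bool(false)

// Hypothetical helper: true only when $haystack begins with $needle.
function starts_with($haystack, $needle)
{
    return strpos($haystack, $needle) === 0;
}

var_dump(starts_with('href="dialect.php?url=abc"', 'href="dialect.php?url=')); // bool(true)
```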
 
JoshPowell (Author) commented:
Ah, thanks :)

You have proven most helpful.. I shall start work on my script now :D

Josh