RT_
PHP preg_replace code convert to Delphi
The PHP code is the following:
$src = preg_replace("/\s+([\w]{3,18}:)/u","",$src);
Can anybody convert it to Delphi?
Thanks
Do you have a test case in PHP?
Sample input and output?
Technically it's just this:
unit uMain;
interface
uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
  Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls;
type
  TForm1 = class(TForm)
    mmoInput: TMemo;
    mmoOutput: TMemo;
    btnParse: TButton;
    procedure btnParseClick(Sender: TObject);
  end;
var
  Form1: TForm1;
implementation
uses System.RegularExpressions;
{$R *.dfm}
procedure TForm1.btnParseClick(Sender: TObject);
var R: TRegEx;
begin
  R.Create('/\s+([\w]{3,18}:)/u');
  mmoOutput.Text := R.Replace(mmoInput.Text, '');
end;
end.
That's besides any differences between PHP and Delphi regex.
There are slight differences... but you'd have to test to find out whether they have an impact.
http://www.regular-expressions.info/index.html
ASKER
Delphi 10.1 Berlin
This is the input:
'Alapadatok: Sebesség: 300Mbps Otthoni: Igen Irodai: Igen Szabvány : 802.11a/b/g/n 802.11AC : Igen QoS : Igen WMM : Igen Gigabit Mbps LAN : Igen IPv6 : Igen'
The output must be the following:
'igen szabvány 802.11a/b/g/n 802.11ac qos wmm gigabit mbps lan ipv6'
This is one of those questions that would benefit from having good test data!
I don't have test data or a Delphi system to make the tests, but I can explain the regular expression. HTH, ~Ray
<?php
/**
* http://php.net/manual/en/reference.pcre.pattern.modifiers.php
* http://php.net/manual/en/function.preg-replace.php
*/
error_reporting(E_ALL);
// FROM THE POST AT EE:
// $src = preg_replace("/\s+([\w]{3,18}:)/u","",$src);
// ANNOTATED REGULAR EXPRESSION
$rgx
= '/' // REGEX DELIMITER
. '\s+' // ONE OR MORE CHARACTERS OF WHITESPACE
. '(' // START CAPTURE GROUP
. '[' // START CHARACTER CLASS
. '\w' // CHARACTER CLASS = "WORDS"
. ']' // ENDOF CHARACTER CLASS
. '{3,18}' // LENGTH IS 3 TO 18, INCLUSIVE
. ':' // FOLLOWED BY A COLON
. ')' // ENDOF CAPTURE GROUP
. '/' // REGEX DELIMITER
. 'u' // FLAG MODIFIER - UTF-8
;
It looks like this regular expression does not work correctly in PHP. Maybe there is more to this process that we are not seeing here? Here's a test case using the inputs and outputs posted above.
https://www.iconoun.com/demo/temp_rt.php
<?php // demo/temp_rt.php
/**
* https://www.experts-exchange.com/questions/28993289/PHP-preg-replace-code-convert-to-Delphi.html
*
* http://php.net/manual/en/reference.pcre.pattern.modifiers.php
* http://php.net/manual/en/function.preg-replace.php
*
* From the post at E-E:
* $src = preg_replace("/\s+([\w]{3,18}:)/u","",$src);
*/
error_reporting(E_ALL);
// MAKE SURE THAT PHP WORKS WITH UTF-8
mb_internal_encoding('UTF-8');
mb_regex_encoding('UTF-8');
// ANNOTATED REGULAR EXPRESSION
$rgx
= '/' // REGEX DELIMITER
. '\s+' // ONE OR MORE CHARACTERS OF WHITESPACE
. '(' // START CAPTURE GROUP
. '[' // START CHARACTER CLASS
. '\w' // CHARACTER CLASS = "WORDS"
. ']' // ENDOF CHARACTER CLASS
. '{3,18}' // LENGTH IS 3 TO 18, INCLUSIVE
. ':' // FOLLOWED BY A COLON
. ')' // ENDOF CAPTURE GROUP
. '/' // REGEX DELIMITER
. 'u' // FLAG MODIFIER - UTF-8
;
// TEST STRING
$src = 'Alapadatok: Sebesség: 300Mbps Otthoni: Igen Irodai: Igen Szabvány : 802.11a/b/g/n 802.11AC : Igen QoS : Igen WMM : Igen Gigabit Mbps LAN : Igen IPv6 : Igen';
// DESIRED OUTPUT
$out = 'igen szabvány 802.11a/b/g/n 802.11ac qos wmm gigabit mbps lan ipv6';
// TRY
$new = preg_replace($rgx, '', $src);
// SHOW THE WORK
echo '<pre>';
echo PHP_EOL . $rgx;
echo PHP_EOL . $src;
echo PHP_EOL . $out;
echo PHP_EOL . $new;
Outputs:
/\s+([\w]{3,18}:)/u
Alapadatok: Sebesség: 300Mbps Otthoni: Igen Irodai: Igen Szabvány : 802.11a/b/g/n 802.11AC : Igen QoS : Igen WMM : Igen Gigabit Mbps LAN : Igen IPv6 : Igen
igen szabvány 802.11a/b/g/n 802.11ac qos wmm gigabit mbps lan ipv6
Alapadatok: 300Mbps Igen Igen Szabvány : 802.11a/b/g/n 802.11AC : Igen QoS : Igen WMM : Igen Gigabit Mbps LAN : Igen IPv6 : Igen
ASKER
The PHP solution is OK; only the first word (Alapadatok:) is not replaced, because there is no whitespace before it.
But the Delphi code does not replace any word.
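The first label survives because `\s+` demands at least one whitespace character before the word. A minimal sketch of one possible adjustment, assuming a label at the very start of the string should be removed as well: the alternation `(?:^|\s+)` is an assumption and not part of the original pattern, `$src` is a shortened excerpt of the sample input, and the `\u{...}` escape assumes PHP 7 or later.

```php
<?php
// Sketch only: (?:^|\s+) is an assumed tweak, not part of the original pattern.
// "\u{00E9}" is the letter é, written as an escape so the file encoding does not matter.
$src = "Alapadatok: Sebess\u{00E9}g: 300Mbps Otthoni: Igen";

// Original pattern: the first label has no whitespace before it, so it survives.
$original = preg_replace('/\s+([\w]{3,18}:)/u', '', $src);

// Variant: accept either the start of the string or whitespace before the label.
$variant = preg_replace('/(?:^|\s+)\w{3,18}:/u', '', $src);

echo $original, PHP_EOL; // Alapadatok: 300Mbps Igen
echo $variant, PHP_EOL;  // " 300Mbps Igen" (note the leading space)
```

The variant leaves a leading space behind, so a trim() on the result may be wanted. Since the Delphi regex unit is also PCRE-based, the same pattern text, without the delimiters and the modifier, should be a reasonable starting point there, but that needs testing.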
It won't replace anything in Delphi.
I was searching for the equivalent of the forward slash... but hadn't found it yet.
Based on Ray's text, I now know it's not required.
The /u at the end is not required either.
The character class brackets [ and ] can be skipped too.
The capture group... what do you want with that?
The forward slash regex delimiter is not required in some PHP work, too. The UTF-8 modifier may not be needed in this PHP regex because there are not any UTF-8 characters in the expression. If there are, PHP would recommend mb_ereg_ functions instead.
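Whether the UTF-8 modifier matters for this input can be checked directly. In PHP the u modifier controls how the subject string is read as well as the pattern, and the sample input contains "é". The following is a minimal sketch, assuming PHP 7 or later for the `\u{...}` escape; `$src` is a shortened excerpt of the sample input.

```php
<?php
// Sketch only: compares the same pattern with and without the u modifier.
$src = "Alapadatok: Sebess\u{00E9}g: 300Mbps";

// With u: pattern and subject are read as UTF-8, and \w can match "é".
$withU = preg_replace('/\s+\w{3,18}:/u', '', $src);

// Without u: the subject is read byte by byte; whether the two bytes of "é"
// count as word characters depends on the PCRE build and the active locale.
$withoutU = preg_replace('/\s+\w{3,18}:/', '', $src);

echo $withU, PHP_EOL;           // Alapadatok: 300Mbps
var_dump($withoutU === $withU); // shows whether the modifier made a difference here
```

If the second line prints false, the modifier is doing real work for this input, even though the pattern itself contains only ASCII characters.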
To have something to play with, just add an Edit1
and change the function to this:
procedure TForm1.btnParseClick(Sender: TObject);
var R: TRegEx;
begin
  R.Create(edit1.Text); // '\s+\w{3,18}:');
  mmoOutput.Text := R.Replace(mmoInput.Text, '');
end;
Your output sample doesn't match your regex.
Anyway...
In Delphi I get closest to it with this regex:
\s?\w{3,18}:
unit uMain;
interface
uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
  Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls, System.RegularExpressions;
type
  TForm1 = class(TForm)
    mmoInput: TMemo;
    mmoOutput: TMemo;
    btnParse: TButton;
    Edit1: TEdit;
    procedure btnParseClick(Sender: TObject);
  private
    function MatchReplace(const Match: TMatch): string;
  end;
var
  Form1: TForm1;
implementation
{$R *.dfm}
procedure TForm1.btnParseClick(Sender: TObject);
var R: TRegEx;
  mc: TMatchCollection;
begin
  R.Create(edit1.Text); // '\s+\w{3,18}:');
  mc := R.Matches(mmoInput.Text);
  mmoOutput.Text := R.Replace(mmoInput.Text, MatchReplace);
end;
function TForm1.MatchReplace(const Match: TMatch): string;
begin
  Result := '';
end;
end.
ASKER
Geert Gruwez
Your regex is very close to perfect.
The only problem is that
it cuts the words with no space before the ":", like this: Alapadatok:
and it also cuts the words with a space before the ":", like this: Szabvány :
I want to cut only the words with no space before the ":", like this: Alapadatok:
!!! Don't cut the words with a space before the ":", like this: Szabvány : !!!
Can you work on it, please?
If you can, provide an accurate sample of what your regex in PHP gives.
Actually, multiple samples would be nice.
You typed your output; you didn't just copy it:
- the case of the letters was wrong
- the 300Mb was missing
- etc ...
So if you can't give a valid output, it's just guessing.
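One way to produce samples that can be copied rather than retyped is to let PHP print each input next to the real result. This is a minimal sketch, assuming PHP 7 or later; the entries in `$inputs` are hypothetical excerpts from the sample input and can be swapped for real data.

```php
<?php
// Sketch only: prints each input next to the real preg_replace() result,
// so samples can be copied verbatim instead of retyped.
$pattern = '/\s+([\w]{3,18}:)/u';   // pattern from the question
$inputs = [                          // hypothetical sample lines
    "Alapadatok: Sebess\u{00E9}g: 300Mbps",
    "Szabv\u{00E1}ny : 802.11a/b/g/n",
    "QoS : Igen WMM : Igen",
];

$results = [];
foreach ($inputs as $in) {
    $results[$in] = preg_replace($pattern, '', $in);
}

foreach ($results as $in => $out) {
    // var_export() quotes the strings, which makes stray whitespace visible.
    echo 'in:  ', var_export($in, true), PHP_EOL;
    echo 'out: ', var_export($out, true), PHP_EOL, PHP_EOL;
}
```

With this pattern only the first line changes. The other two contain no whitespace-prefixed word that is immediately followed by a colon, so they pass through unmodified.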
As of a certain version, Delphi ships with the unit RegularExpressions.