EK 365
Python negative lookahead for aaa:bbb@hostname
Hi, the Python regular expression below extracts the username, password, hostname, and port from user input.
Python Re:
(?P<username>[^@:]*)(:?)(?P<password>.*)(?!\\)@(?P<hostname>[^:]*):?(?P<port>[0-9]*)
Target String:
aaa:bbb@hostname:22
Could you please explain the roles of the (:?) and (?!\\) parts (shown in bold in the original post)?
I am not sure why (:?) is in parentheses while the :? before the "port" group is not.
I am also not sure why (?!\\) is used. Why do we need a negative lookahead for '\'? The expression works without (?!\\).
Regular Expression tool: http://pythex.org/
Code used: https://github.com/pexpect/pexpect/blob/master/examples/hive.py
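For reference, here is a minimal sketch of what the pattern captures from the target string, using Python's re module. The names HOST_PATTERN and fields are illustrative assumptions and are not taken from hive.py.

```python
import re

# Pattern copied from the question. The raw string keeps "\\" intact so the
# regex engine sees an escaped (literal) backslash inside the lookahead.
HOST_PATTERN = re.compile(
    r"(?P<username>[^@:]*)(:?)(?P<password>.*)(?!\\)@(?P<hostname>[^:]*):?(?P<port>[0-9]*)"
)

match = HOST_PATTERN.match("aaa:bbb@hostname:22")
fields = match.groupdict()  # named groups only; the unnamed (:?) is not included

print(fields)
# {'username': 'aaa', 'password': 'bbb', 'hostname': 'hostname', 'port': '22'}
```

Note that the port comes back as a string, so a caller would still need to convert it with int() if a number is required.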
ASKER
Thank you, but I tried a Windows UNC path both with and without the negative lookahead and got the same result: it does not work with that regular expression.
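The exact input tried here is not given, so the sketch below assumes a UNC-style string of the form user:pass\\hostname. Because the pattern requires a literal @, such a string cannot match whether or not the lookahead is present. HEAD, TAIL, and unc_like are illustrative names.

```python
import re

# The pattern from the question, split around the lookahead so it can be
# included or left out.
HEAD = r"(?P<username>[^@:]*)(:?)(?P<password>.*)"
TAIL = r"@(?P<hostname>[^:]*):?(?P<port>[0-9]*)"

# Assumed UNC-style input with no "@" in it; the real input is unknown.
unc_like = r"user:pass\\hostname"

with_lookahead = re.match(HEAD + r"(?!\\)" + TAIL, unc_like)
without_lookahead = re.match(HEAD + TAIL, unc_like)

print(with_lookahead, without_lookahead)  # None None
```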
In reply to ":? means: 0 or more ':'":
No, it means zero or one.
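A quick way to check the quantifier is re.fullmatch, which only succeeds when the whole string matches. This is a small sketch; the variable names are illustrative.

```python
import re

# "?" applied to a literal colon means zero or one colon.
zero_colons = re.fullmatch(r":?", "")    # matches the empty string
one_colon = re.fullmatch(r":?", ":")     # matches a single colon
two_colons = re.fullmatch(r":?", "::")   # None: two colons is one too many

# "zero or more" is written with "*" instead.
star_two_colons = re.fullmatch(r":*", "::")

print(zero_colons is not None, one_colon is not None, two_colons, star_two_colons is not None)
# True True None True
```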
The negative lookahead is pointless. The very next character to be found is an @, so the lookahead will always be true.
Selected answer is not correct. If the lookahead were instead a lookbehind, then Dan Craciun would be correct.
You can see why using the following string:
user:p@ssword4\@host@2:11
By Dan's (and the referenced article's) logic, the password should be "p", but the result is "p@ssword4\\@host". As I said, the lookahead is pointless in this scenario.
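The result can be reproduced with a short sketch that runs the pattern with and without the lookahead on this string. HEAD, TAIL, and TEXT are illustrative names; the doubled backslash in the printed password is only Python's repr escaping of a single backslash.

```python
import re

# The pattern from the question, split around the lookahead.
HEAD = r"(?P<username>[^@:]*)(:?)(?P<password>.*)"
TAIL = r"@(?P<hostname>[^:]*):?(?P<port>[0-9]*)"

TEXT = r"user:p@ssword4\@host@2:11"

with_lookahead = re.match(HEAD + r"(?!\\)" + TAIL, TEXT)
without_lookahead = re.match(HEAD + TAIL, TEXT)

# The greedy ".*" backtracks to the last "@". The lookahead then inspects that
# "@", which is never a backslash, so the assertion cannot fail.
print(repr(with_lookahead.group("password")))  # 'p@ssword4\\@host'
print(with_lookahead.groupdict() == without_lookahead.groupdict())  # True
```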
ASKER
Thank you! I reached out to the author of that code and he said the same thing!
The first (:?) is the capturing group 2. The second :? is not in a capturing group.
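The group numbering can be inspected directly. This is a sketch with illustrative variable names; the second pattern is a hypothetical variant with the parentheses around the first :? removed, shown only for comparison.

```python
import re

original = re.compile(
    r"(?P<username>[^@:]*)(:?)(?P<password>.*)(?!\\)@(?P<hostname>[^:]*):?(?P<port>[0-9]*)"
)
# Hypothetical variant: same pattern, but the first ":?" is not captured.
unparenthesized = re.compile(
    r"(?P<username>[^@:]*):?(?P<password>.*)(?!\\)@(?P<hostname>[^:]*):?(?P<port>[0-9]*)"
)

target = "aaa:bbb@hostname:22"

# Named groups are numbered too, so (:?) becomes group 2 after "username".
print(original.match(target).groups())
# ('aaa', ':', 'bbb', 'hostname', '22')

# Without the parentheses the colon is still consumed but no longer reported.
print(unparenthesized.match(target).groups())
# ('aaa', 'bbb', 'hostname', '22')
```

Code that reads the fields by name through groupdict() gets the same result from both variants; only access by group number is affected.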
The only reason I can think of for (?!\\) is Windows UNC paths, which contain \.
So the regular expression will stop if the path is of the form: user:pass\\hostname
HTH,
Dan