Karthik Dhamodharan

Mongo Regex not considering '?' symbol.

My collection has a name attribute that holds brand names.

db.brand.find({'name':{'$regex' : '^Apple?$', '$options' : 'i'}})

This query returns the brand name 'Apple', but since ^ and $ anchor the pattern, it should not return 'Apple': 'Apple' is not equal to 'Apple?'.

Please let me know if there is any option in $regex to solve this.
Rgonzo1971

Hi,

Please try:
db.brand.find({'name':{'$regex' : '^Apple\?$', '$options' : 'i'}})

or

db.brand.find({'name':{'$regex' : '^Apple\.$', '$options' : 'i'}})

or

db.brand.find({'name':{'$regex' : '^Apple\.w$', '$options' : 'i'}})

Regards
Karthik Dhamodharan (ASKER)

Thank you for your response

'^Apple\?$' still returns the documents that have 'Apple'.

Are the other two options checking for the existence of a dot?
What are you trying to check?
I want to run a find query in Mongo that is case-insensitive. Whatever brand name I pass, it should return the matching brand details. The issue I am facing is that if the brand name contains a '?', the Mongo regex does not honor it and behaves as if it were not there.

For example, consider the brand name 'Apple?'. If we use '^Apple\?$', it returns the value 'Apple'. Ideally, 'Apple?' should not match 'Apple'.
What programming language do you use? It is likely that your backslash has to be doubled for regular expressions, as in '^Apple\\?$', or that you have to use a special kind of string literal that does not interpret escape sequences, such as r'^Apple\?$' or /^Apple\?$/.
then try
db.brand.find({'name':{'$regex' : '^Apple\\?$', '$options' : 'i'}})
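
The effect of doubling the backslash can be checked outside MongoDB. The following is a minimal sketch in plain JavaScript (the language of the mongo shell); it uses the built-in RegExp rather than MongoDB's own regex engine, so it only illustrates how the string literal is read, and the variable names are made up for the illustration.

```javascript
// In a JavaScript string literal, "\?" is not a recognized escape sequence,
// so the backslash is dropped before any regex engine sees the pattern.
const single = '^Apple\?$';    // same string as '^Apple?$'
const doubled = '^Apple\\?$';  // the five letters, a backslash, then ?

const loose = new RegExp(single, 'i');
const strict = new RegExp(doubled, 'i');

console.log(single === '^Apple?$');  // true
console.log(loose.test('Apple'));    // true: the trailing "e" is optional
console.log(loose.test('Apple?'));   // false
console.log(strict.test('Apple'));   // false
console.log(strict.test('Apple?'));  // true: "\?" is a literal question mark
```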

@pepr, I am using Java. \\? works from the Mongo client.

Thank you @pepr and @Rgonzo1971
Yes. Then this is the reason. Java is probably one of the few widespread languages that does not support raw string literals. C# has @"raw string literal", Python has r'raw string literal', even newer C++ has R"(multi-line raw string literal)", and Perl, awk, sed, and the other old friends have /raw string literals/ in some contexts. Not Java. Because of that, you may find writing regular expression patterns more error-prone in Java than in other languages (not mentally harder, just easier to get wrong).

Without raw string literals, the backslash in "\?" has no special meaning and is dropped. The result is treated simply as "?", which means non-greedy in a regular expression (which means nothing when used after a normal character).
It does not mean "nothing": it means 0 or 1 instances of the preceding group or character:

'apple?' matches 'appl' and 'apple'
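
A quick way to confirm this quantifier behavior is a short sketch with JavaScript's built-in RegExp (an assumption for illustration; MongoDB uses its own regex engine, but the meaning of `?` after a character or group is the same):

```javascript
// "?" makes the preceding character optional
const optionalE = /^apple?$/;
console.log(optionalE.test('appl'));    // true
console.log(optionalE.test('apple'));   // true
console.log(optionalE.test('apple?'));  // false: "?" is not matched literally

// After a group, "?" makes the whole group optional
const optionalWord = /^(apple)?$/;
console.log(optionalWord.test(''));       // true
console.log(optionalWord.test('apple'));  // true
console.log(optionalWord.test('appl'));   // false
```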

In this case 'apple\?' is internally interpreted as 'apple?', while 'apple\\?' is passed to the regular expression engine as 'apple\?', which produces the expected behavior. A simple way to get correctly escaped patterns is to escape the string on the fly.
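
Escaping on the fly could look like the sketch below. `escapeRegex` is a hypothetical helper written for this illustration, not a MongoDB or driver function, and the match is checked with JavaScript's RegExp rather than against a real collection. In Java, `java.util.regex.Pattern.quote` serves a similar purpose.

```javascript
// Hypothetical helper: put a backslash in front of every regex metacharacter
// so the input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const brandName = 'Apple?';                          // value supplied at run time
const pattern = '^' + escapeRegex(brandName) + '$';
console.log(pattern);                                // ^Apple\?$

// Filter document in the same shape as the queries in this thread
const filter = { name: { $regex: pattern, $options: 'i' } };

const check = new RegExp(pattern, 'i');
console.log(check.test('Apple?'));  // true
console.log(check.test('apple?'));  // true: case-insensitive
console.log(check.test('Apple'));   // false
```

Because the pattern is built at run time, there is no string literal for the backslash to get lost in.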
@skullnobrains Oh, my fault! You are right. The greedy vs. non-greedy behaviour applies only when the question mark is placed after a quantifier such as * or +.
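
For comparison, here is a small sketch of the greedy versus non-greedy case, again using JavaScript's RegExp with a made-up sample string:

```javascript
const text = '<b>one</b><b>two</b>';

// Greedy: ".+" takes as much as it can while still allowing a match
const greedy = text.match(/<b>.+<\/b>/)[0];
console.log(greedy);  // <b>one</b><b>two</b>

// Non-greedy: ".+?" stops at the first place the rest of the pattern fits
const lazy = text.match(/<b>.+?<\/b>/)[0];
console.log(lazy);    // <b>one</b>
```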