Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
Solved

How do i perform an arabic search in SQL Server ignoring diacritics?

Posted on 2008-06-22
9
2,298 Views
Last Modified: 2012-06-22
Hi,

I have SQL Server 2005 and using a database with arabic collation.  Certain fields have english and others have arabic (there are no mixed languages, strictly one language per column).

In the arabic field I would like to store words with diacritics.  So for those who know arabic, instead of storing #-E/ I will store #N-REN/Q .  For those who know english, perhaps the closest example is instead of storing Pierre in the database, I will stored PieNrre (perhaps not correct, but you get the idea).

So my question how do i do an SQL search that will exclude diacretics?  Perhaps in a Web form my user will type in a search for '#-E/', however SELECT * FROM tableName WHERE word LIKE '%#-E/%' doesn't work.

I have been searching around the web but can't really get a head start on this.  
0
Comment
Question by:almuallim
  • 5
  • 4
9 Comments
 

Author Comment

by:almuallim
ID: 21840403
It seems arabic characters did not appear properly in my post. I'm attaching an SQL sample query.
SELECT * 
FROM dbo.ArabicPlurals
WHERE Singular LIKE ('%#-E/%')

Open in new window

0
 

Author Comment

by:almuallim
ID: 21840425
Attaching a txt file to see if works!  Will need arabic support to view properly.
arabicdiacriticQuestion.txt
0
 
LVL 4

Expert Comment

by:cavehop
ID: 21842523
You will want to use "COLLATE <collation>" in your SQL query. This keyword follows the column whose collation you wish to explicitly specify.

So for example, if you want to use Arabic collation that is case insensitive and accent insensitive (i.e. don't pay attention to diacritics), you would do something like:

SELECT * FROM dbo.ArabicPlurals
WHERE Singular COLLATE Arabic_CI_AI LIKE ('some arabic string')

Similary, case sensitive, accent sensitive is Arabic_CS_AS.

You can find a list of the collations supported by SQL Server here: http://www.collation-charts.org/mssql/
0
Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

 
LVL 4

Expert Comment

by:cavehop
ID: 21842542
Oh, I should also mention that you can specify the collation for a particular column, so that collation will be used as the default for all queries.

To do this enter "COLLATE <collation" immediately after the datatype in the CREATE or ALTER TABLE query.

e.g. (obviously, use whatever datatype your column really is)
ALTER TABLE dbo.ArabicPlurals
ALTER COLUMN Singular NVARCHAR(40) COLLATE Arabic_CS_AI
0
 

Author Comment

by:almuallim
ID: 21843921
Hi cavehop,

Thanks for your response.  I tried using the accent and case insensitive arabic collation and although it makes a difference it only seems to ignore the end diacritic, so my word which has a diacritic on each letter is not found.

Changing the collation definitely makes a difference though....I'm not sure whether Arabic_CS_AI would be beneficial only when sorting as is implied by some articles.



 
0
 
LVL 4

Expert Comment

by:cavehop
ID: 21845956
Unfortunately, I only know the general principle of how the collation works; I'm not at all knowledgable about Arabic. The behavior you describe sounds awfully like a bug to me.

You probably have already thought to test this but, is it always just the end diacritic that is ignored? Or is it certain characters that are not being ignored when they should be (i.e. try swapping a couple of the letters and see how it behaves)?
0
 

Accepted Solution

by:
almuallim earned 0 total points
ID: 21853785
Thanks for your help anyway.  Would still like it if someone has an idea on how to go about this, even if it's complex.

Having said that, i found the following article http://www.experts-exchange.com/Database/Miscellaneous/Q_21276274.html.  The solution might just be to have one column with diacritics and one without.  
0
 
LVL 4

Expert Comment

by:cavehop
ID: 21858590
Here's something else I just came across that may or may not be pertinent, but seems like it might be.

According to an article on Arabic language support on the MSDN SQL Server Developer Center website:

"When dealing with Unicode string constants in SQL Server 2005, you must precede all Unicode strings with a capital letter N. The N prefix stands for National Language in the SQL-92 standard, and must be uppercase. If the N prefix is not used, SQL Server uses the non-Unicode code page of the current database before it uses the string."

This would mean your query should look something like:
SELECT * FROM dbo.ArabicPlurals
WHERE Singular COLLATE Arabic_CI_AI LIKE (N'some arabic string')

I don't know whether or not this will make any difference, but it's simple enough to try quickly anyway.

Also, the article this comes from has all kinds of information about support for Arabic in SQL Server: http://msdn.microsoft.com/en-us/library/cc295829.aspx
0
 

Author Comment

by:almuallim
ID: 21863473
Hi,

I tried it and it still doesn't work. I will take a look at that article and see if there is any mention of anything else.

Thanks again.
0

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
SQL Activity Monitor detail 2 24
Need help with T-SQL on SQL Server 2014 9 37
First Max value 3 28
MS SQL SERVER and ADODB.commands 8 20
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

856 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question