Link to home
Start Free TrialLog in
Avatar of BadKarma
BadKarma

asked on

Problem with 'not in' SQL command on string field - known error/bug?

All,

When I run the following query:

SELECT Sourcetable.question
FROM Sourcetable
WHERE (Sourcetable.index not In (select Answertable.index from Answertable))

(Question is a text column, index is an integer column).

I get the result I want - a list of questions that are in the sourcetable but not in the answertable.

However, when I run:

SELECT Sourcetable.question
FROM Sourcetable
WHERE (Sourcetable.question not In (select Answertable.question from Answertable))

I get an empty recordset as result - while I should be getting the exact same answer.

If I test the query:

SELECT Sourcetable.question
FROM Sourcetable
WHERE (Sourcetable.question In (select Answertable.question from Answertable))  // 'Not' has been removed

I do get a list of questions, where each question is both in the SourceTable and the Answertable.

Now, my conclusion is that for some reason, the engine is having difficulty executing the 'not in' command on a string field/column. Is this is a known bug / error? If yes, is there a patch or anything similar?

(Note that I can already solve my problem by adding a autonumeric primary column - which I will do anyway. If I hadn't been receiving those tables from an external source, it would've already been present. :) )
Avatar of szacks
szacks

I would go with a Left Outer Join rather then the not in subquery syntax. However, whenever I have tried "not" it has worked fine and I have done plenty of string comparisons.

Left Outer Join Syntax -
Select SourceTable.Question from
SourceTable a Left Join AnswerTable b on
a.question=b.question
where b.question is null

As an aside, are you sure that there are no questions in the sourcetable that are not in the answer table?
Your original query doesn't necessarily prove it because there could be the same question twice in the source table with a different index.

Avatar of BadKarma

ASKER

Actually, I have also done string comparisons with "not" before without trouble - but this time it's not working. There are some fairly long strings there, though, and some have special characters such as ' or " - which still shouldn't be a problem, but something's still rotten in the state of Denmark.

I wish the info wasn't proprietary, or I'd put up the tables.

As to your answer - I think I see where you're going, but the matter is a bit more complex. Your solution works to a degree, but not in my particular instance, I think. Allow me to sketch:

I basically have a user that can define a Department as a collection of Subdepartments, like so:

SubDepartments in Department
Department_Id    Subdepartment_Name
1                             Subdep1
1                             Subdep2
2                             Subdep3

In other words, department_Id 1  has subdepartments Subdep1 and Subdep2
Department_Id 2 has one subdepartment of type Subdep3

The source table defines whether a question is valid for a type of subdepartment.

Sourcetable
Index     Question    SubDepartment_Name
1              a?                Subdep1
2              b?                Subdep1
3              b?                Subdep2
4              d?                Subdep2
5              e?                Subdep3

So the question "b?" is valid for both Subdep_Name's Subdep1 and Subdep2. All the other questions are valid for just one Subdep each.

(Note: I am aware that in this example, the queries that I posted earlier will not give the same resulting recordsets. In that case, the question was simplified).

Now, the Answertable is basically a re-organised list of question for the appropriate departments. So for department_Id 1, which has subdepartments Subdep1 and Subdep2, the question "b?" should appear only once, which is handled by a Group command in the SQL. (I am aware distinct would also be a possibility, but in this case due to other concerns a GROUP is needed.)

This would be the 'proper' Answertable for the Departments:

Answertable
Index     Question   Department_Id
1               a?                1
2               b?                1
4               d?                1
5               e?                2

The user will then be able to select his department, and get an interface to answer the appropriate questions for him.

Now, at some point the user may redefine the subdepartments in a department, and in that case, I want the Answertable to adapt to this new definition. I wish to keep the answers that have already been given, but I want to add the new questions that have now become appropriate for that Department to the Answertable.

For instance, say that Department_Id 1 received the additional SubDep_Name Subdep3, then the question "e?" would have to be added to the Answertable for Department_Id 1. If Department_Id 2 received Subdep1, "a?" and "b?" would have to be added to the Answertable for Department_Id 2.

To do so, I have an insert query, with the earlier asked question being a subquery - namely, how do I find the (newly appropriate) questions from Sourcetable that aren't yet present in Answertable - for a specific Department_Id. The "Not in" construction would seem to be the simple and elegant thing to use, and not a left join with the "WHERE b.question is null"

This wasn't really my original question, of course, so if you want to give this one a crack, I'll up the amount of points when I accept your answer. :)

To the aside: yes, there are questions in the source table that are not in the answer table - that's the whole point. In fact, you are correct that sometimes the same question can be present in the source table with a different index, but I can solve that problem through judicious use of GROUP BY and MAX statements - I think.

And I'll still add that primary key, of course.


ASKER CERTIFIED SOLUTION
Avatar of szacks
szacks

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
'not real' - you mean just adding a record to the source table? Yes, I have. It's still not working.

I can understand you'd prefer the left join syntax, but the 'where b.question=null' won't work in the table construction above, I thought? I'll try to get it to work again (heck, I'll have to. :) )
Ok, I did manage to get it to work, like this (from the WHERE statement of the larger query this is a part of):

(Tekstentabel.Vraag in (SELECT Tekstentabel.Vraag
FROM [select * from Antwoorden where afdeling_Id=1]. AS Hulp RIGHT JOIN Tekstentabel ON Hulp.Vraag = Tekstentabel.Vraag
WHERE (((Tekstentabel.Vraag) Is Not Null) AND ((Hulp.Vraag) Is Null))))

I upped the point count and give you an A - since I learned something and it was quite useful. Ok?
thank you