Link to home
Start Free TrialLog in
Avatar of Mike Jacobs
Mike JacobsFlag for United Kingdom of Great Britain and Northern Ireland

asked on

SQL (MariaDb) search fails when looking for Integer, unless we treat the integer as a string

I have duplicated (by hand) the Maria database I have on my main workstation to a new dedicated one which runs about 5 times faster. On screen, (using the Heidi UI) every row definition looks identical (to the original on "Main"). 


ROWNUM is an AUTO_INCREMENTED unsigned INTEGER. 


My SQL string which finds any given Rownum is 


'SELECT ROWNUM, HASH_LINKED, HASH_CONFIRMED_BY, HASH_SALTED,HASH_SOURCE,BLOCK_PREV,BLOCK_NEXT,BLOCK_NUM FROM HASH_CHAIN WHERE ROWNUM=?ROW_NUM'


on the original database, I can set ROW_NUM to any integer and execute that string. It returns the relevant Row (assuming it exists) 


On the duplicate, if I try a ROW_NUM=(any integer) it fails UNLESS I wrap the integer in quotation marks, as though ROWNUM were a text field.  i.e. it won't find row 1 but will find row "1"


Given that it is clearly defined as above, how is this possible and what do I do to fix it?








Avatar of Pavel Celba
Pavel Celba
Flag of Czechia image

I am not using MariaDB but it is probably not good practice to name the column as the name of existing internal function. In addition to this, ROWNUM can be used w/o parenthesis in ORACLE mode... More info

OTOH, I don't have any explanation for the integer/string behavior difference because implicit conversion should work here.

What means your sentence "if I try a ROW_NUM=(any integer) it fails "?  Does it return no rows or is an error message displayed?

Could you please add a new INT column to your table, populate it with the values from the ROWNUM column and test your query again?

BTW, what MariaDB version do you use? I am asking because of the following note in the documentation:

  • If one argument is string and the other argument is integer, they are compared as decimals. This conversion was added in MariaDB 10.3.36. Prior to 10.3.36, this combination was compared as floating point values, which did not always work well for huge 64-bit integers because of a possible precision loss on conversion to double.


Avatar of skullnobrains
skullnobrains

my bet is you imported the row_num as text when moving data from the master to the copy.

can you provide the output of "show create table YOUR_table" on the replica ?

that said, for a simple integer i totally concur the implicit conversion just should work
Avatar of Mike Jacobs

ASKER

Pavel, greetings

What means your sentence "if I try a ROW_NUM=(any integer) it fails "?  Does it return no rows or is an error message displayed?


it returns no rows. No error

While I take your point about ROWNUM, its news to me that it is a reserved term. I'm quite surprised because I went looking for something like it when I set up the initial table and only added the column when I failed to find it!

However, what doesn't add up is that if it's a problem on the duplicate, why is it working on the original?

My Maria version is 10.8.

I may try the added column trick but the next comment gave me an idea which I just tried and it worked, ONCE, as I'm about to describe.

:Skullnobrains

You lose your bet. I didn't import anything. I created it manually, copying the template from the original. However, your request for the CREATE TABLE code made me realise I could have just pasted the code from the original to create the table, which is what I did, and it appeared to work. i.e. I was able to search ROWNUM by setting ROW_NUM to an integer, as intended. But

a) the warm glow of success evaporated almost as quickly as it had arrived.
b) I don't understand what made the difference and
c) why it actually ended up worse!  (now the search fails whether I assign ROW_NUM as string or number!)

When I first recreated the table using the original CREATE code, I ran my program to create the first hundred records on the table, which it duly did but failed when I went looking for Rownum=1.

However, this turned out to be because, when I recreated the table, I didn't pay any attention to the value of the AUTO-INCREMENT counter and it was set in the mid 16500s (presumably having carried that across from the original, though I didn't spot how that was possible either). My brief moment of triumph occured when I set ROW_NUM=16505 and retrieved the relevant row.

So I deleted that hundred records, reset the auto increment counter to 1 and ran the code again. This time I had 100 records numbered from 1 to 100

But this time I couldn't perform a successful search using ROW_NUM= 1 or ROWNUM= "1" (or other random values up to 100)

That leaves me utterly confused!

Here's the code from the Duplicate:

CREATE TABLE `hash_chain` (
    `ROWNUM` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `HASH_SOURCE` VARBINARY(32) NULL DEFAULT '0',
    `HASH_LINKED` VARBINARY(32) NOT NULL DEFAULT '0',
    `HASH_SALTED` VARBINARY(32) NULL DEFAULT '0',
    `HASH_PREVIOUS` VARBINARY(32) NULL DEFAULT '0',
    `HASH_CONFIRMED_BY` VARBINARY(32) NULL DEFAULT 'UNCONFIRMED',
    `BLOCK_NUM` INT(10) UNSIGNED NULL DEFAULT '0',
    `BLOCK_PREV` INT(10) NOT NULL DEFAULT '0',
    `BLOCK_NEXT` INT(10) UNSIGNED NULL DEFAULT '0',
    `CODEL_TIME` TIMESTAMP NULL DEFAULT NULL,
    `CHAIN_TIME` TIMESTAMP NULL DEFAULT current_timestamp(),
    PRIMARY KEY (`HASH_LINKED`) USING BTREE,
    UNIQUE INDEX `ROWNUM` (`ROWNUM`) USING BTREE,
    UNIQUE INDEX `HASH_SALTED` (`HASH_SALTED`) USING BTREE,
    INDEX `HASH_SOURCE` (`HASH_SOURCE`) USING BTREE,
    INDEX `HASH_PREVIOUS` (`HASH_PREVIOUS`) USING BTREE,
    INDEX `HASH_CONFIRMED_BY` (`HASH_CONFIRMED_BY`) USING BTREE,
    INDEX `BLOCK_NUM` (`BLOCK_NUM`) USING BTREE,
    INDEX `BLOCK_PREV` (`BLOCK_PREV`) USING BTREE,
    INDEX `BLOCK_NEXT` (`BLOCK_NEXT`) USING BTREE


I dropped the table, captured the code from original, pasted it into the duplicate Maria and ran it to recreate the table. (I know. I should have done that to start with, but I'm a novice on working with SQL)

And, as I said above. It worked. For about 5 minutes.

Not obvious why it worked, and even less obvious why it then failed when I tried again.

Here's the create code from the original:

CREATE TABLE `hash_chain` (
    `ROWNUM` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `HASH_SOURCE` VARBINARY(32) NULL DEFAULT '0',
    `HASH_LINKED` VARBINARY(32) NOT NULL DEFAULT '0',
    `HASH_SALTED` VARBINARY(32) NOT NULL DEFAULT '0',
    `HASH_PREVIOUS` VARBINARY(32) NOT NULL DEFAULT '0',
    `HASH_CONFIRMED_BY` VARBINARY(32) NULL DEFAULT 'UNCONFIRMED',
    `BLOCK_NUM` INT(10) UNSIGNED NOT NULL DEFAULT '0',
    `BLOCK_PREV` INT(11) NOT NULL DEFAULT '0',
    `BLOCK_NEXT` INT(10) UNSIGNED NOT NULL DEFAULT '0',
    `CODEL_TIME` TIMESTAMP NULL DEFAULT NULL,
    `CHAIN_TIME` TIMESTAMP NOT NULL DEFAULT current_timestamp(),
    UNIQUE INDEX `HASH_LINKED` (`HASH_LINKED`) USING BTREE,
    UNIQUE INDEX `HASH_SALTED` (`HASH_SALTED`) USING BTREE,
    UNIQUE INDEX `ROWNUM` (`ROWNUM`) USING BTREE,
    INDEX `HASH_SOURCE` (`HASH_SOURCE`) USING BTREE,
    INDEX `HASH_CONFIRMED_BY` (`HASH_CONFIRMED_BY`) USING BTREE,
    INDEX `BLOCK_NUM` (`BLOCK_NUM`) USING BTREE
)

which, as you can see, has some minor differences from the duplicate (because, when setting up the dupe, I decided I needed a couple more indexes, a primary key and I fixed an error or inconsistency or two when I created the dupe) but, crucially, as you can also see, the code for ROWNUM is identical in both versions; so I don't understand how the original code either initially worked or eventually failed catastrophically.


Finally, I took on board Pavel's point about ROWNUM being a reserved term and changed all reference from ROWNUM to ROWNUMBER in both the table definitions and throughout my own code.

so the first line of the CREATE Code now reads:

`ROWNUMBER` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,

and the unique index shares the same name.  

Which I presumed was enough to make Maria distinguish between the reserved term and the column name.

Made absolutely no difference. I'm now unable to find any target ROWNUMBER where ROW_NUM is either integer or string.

So we've actually gone backwards!
Hmmm.... this sounds like a negative karma.

What should be the next step?
Create a new table containing two columns (your ROWNUMBER + one more of any data type), insert some rows, and test some queries.

Queries should start with no WHERE clause, the try some constants in the WHERE clause, and then use the variable. Then you may add more columns/indexes and try again.
Additional option is to test it from a different client and/or to use different driver.
hmm actually rownumber IS indeed the reserved term (or both are) but that is not the reason for your issue. i suggest you use "id" for any autoincrement id and never use "id" for anything else. that is rather common practice and convenient.

given your above explanations i truly believe all this mess will be rather clear if you simply look at the data. try "select ROWNUMBER from table limit 10" so you assess the rownumbers that do exist.

i am unsure what your code is for copying the data but it would seem the source of your problems are there. the simple way to copy data from a db to another is to use mysqldump. if the machines can communicate together directly, you would use something like this :

mysqldump --add-drop-table DATABASE TABLE | mysql -u USER -p -h REMOTE_SERVER

obviously you need to replace the capital letters with adequate values

if they cannot communicate directly over the mysql port, you can setup the same kind of pipeline from a third machine ( mysqldump also features -u -p -h options ) or maybe pipe this through ssh

ssh source_host mysqldump .... | ssh destination_host mysql ...

or on the source

mysqldump ... | ssh destination mysql ...

or on the target

ssh source_host mysqldump ... | mysql ...
Ah. So ROWNUMBER is not the get out of jail card I thought it was.

Your advice looks useful. Unfortunately, I've been pulled off this problem in order to finish a paper I'm supposed to be preparing so I wont get a chance to play with this till next week. Will feed back then. Cheers
no sweat... only you might be in a hurry to solve that issue ;)... good luck with that paper
I'll try to install MariaDB over the weekend resulting in better investigation (hopefully)
MySQL isn't enough here... 
mariadb and mysql should display equivalent behaviors. afaik, there is no mariadb_dump tool as there would be no need for it and it would be backwards compatible anyways if such a tool was created.
My MariaDB installation on Windows includes all of the usual MySQL utility programs including 'mysqldump.exe'.
 like always. what is your point ?
greetings chaps.

Damn things obsessing me. So stole a quick break from writing papers in favour of rewriting code. Decided to drop, rebuild and rename (the already renamed) ROWNUMBER column to ROWUNQ. (Don't like ID as a column name as it doesn't really tell me what kind of data it represents. For me ROWUNQ is a better fit)

Then replaced all refs to ROWNUMBER with ROWUNQ, in the code and regenerated the first 100 rows.

Result: partial return of sanity but weird.
Still fails to find any ROWUNQ if the search is expressed as a Number but will find it if expressed as text.

But the record it returns displays ROWUNQ as a number and if I ask VFP to display VARTYPE(ROWUNQ) or TYPE("ROWUNQ") they confirm that it IS a number.

Frankly, if I was working with minimal data and a short term project, and it stays consistent, I could live with that, but, as a seasoned coder, I know that "symptoms" like this can come back to bite us and, given that, once I get the table stable, I need to pump about 500 million rows into it; which will later need to migrate into other SQL installations and, eventually, a Dynamo non sql database, I'd rather not take any risk that the data might be "tainted".

So, assuming that ROWUNQ isn't yet another reserved term, what's the next option?

if you pass it as a stored procedure or prepared query parameter and the parameter is a string, there is a simple explanation

where does ?ROW_NUM come from ??
ROW_NUM is just the arbitrary variable I assign as a search parameter.

in VFP, we initialise connection handle to the data source. Then assign an SQL string to the handle. When we subsequently need to run the query, we execute the query assigned to the handle.

The SQL string I use is the one I mentioned in the opening post:

'SELECT ROWNUM, HASH_LINKED, HASH_CONFIRMED_BY, HASH_SALTED,HASH_SOURCE,BLOCK_PREV,BLOCK_NEXT,BLOCK_NUM FROM HASH_CHAIN WHERE ROWNUM=?ROW_NUM'

Up to now, I've assigned ROW_NUM an integer value prior to any call to the query code. And, as I've explained above, it continues to work - as is - on my main workstation, DESPITE the continued use of the reserved term (ROWNUM)!


i am wondering whether he way you assign that variable might be the issue.

if it is cast to a string in the process, numbers would be casted to the empty string which would match ROWID = 0

can you try inserting an id directly in the query without using "?whatever" ?
"assign an SQL string to the handle" if you worded that literally, you pass a (string) parameter so it is casted at that moment
In my experience, the SQL query in MySQL and MariaDB should be:
"SELECT `ROWNUM`, `HASH_LINKED`, `HASH_CONFIRMED_BY`, `HASH_SALTED`, `HASH_SOURCE`, `BLOCK_PREV`, `BLOCK_NEXT`, `BLOCK_NUM` FROM `HASH_CHAIN` WHERE `ROWNUM` = '?ROW_NUM'"

Open in new window

The ` tells the server to consider the item as a column name or table name and not a reserved word.
reserved word is not the issue. if it were misinterpreted as one, we'd get an ugly parse error

Dave, tried your version, just out of curiousity, but it fails a syntax test in VFP and, in any case, the construction I've used is embedded into about 15 different SQL strings and they all work as expected (on the original workstation)

can you try inserting an id directly in the query without using "?whatever" ? 

think you're onto something. I tried it with
'SELECT ROWUNQ, HASH_LINKED, HASH_CONFIRMED_BY, HASH_SALTED,HASH_SOURCE,BLOCK_PREV,BLOCK_NEXT,BLOCK_NUM FROM HASH_CHAIN WHERE ROWUNQ=1'

and that duly found the first row. Which suggests you're right about the casting.

But I have to pass an arbitrary value in place of that digit 1, and I was schooled by an early precursor to this question that the way to do that was as I've illustrated. So how can I include the variable in the sqlstring without hitting the same problem. (and, of course, how and why is this NOT a problem on the other machine?)

I'm beginning to wonder if this behaviour is caused by something below the level I'm working on. Like, perhaps, the default character encoding format difference between the two machines. As far as I can tell, they're both using utf8 but I might be missing something...




unless you show the relevant part of the code, we won't know easily.

there are tons of way to insert a param in a query and you do not provide relevant information. is that a named param of a prepared query ? if it is, the type of the variable you pass to the execute statement is probably relevant.

charset would be a good hunch but in this case probably not since integers have no. charsets. but it may impact the conversion. that is actually what led me to figure it out.
is that a named param of a prepared query ? if it is, the type of the variable you pass to the execute statement is probably relevant.


precisely so. Forgive the lack of detail. As Pavel would know, because he's contributed to a number of them, I have raised multiple related questions over the past couple of years as I battle to complete this task and, within those prior questions, I have included the detail you refer to and, because, in my head, I've already explained those, I ignored the fact that newcomers to the saga might not have seen the previous episodes!

The VFP sequence for preparing SQL queries is

1 Declare a public variable to be assigned as a connection handle to the data source
i.e. PUBLIC findrow.

2 Get the new handle
FINDROW=NEWHANDLE('FINDROW')

(newhandle is my wrapper around the code
   &handle_name=SQLCONNECT('CODEL_MYSQL_CONNECT')
   
   'CODEL_MYSQL_CONNECT'  is the ODBC data source name

3 define the sql code and bind it to the handle

IF NOT PREPSQL('FINDROW','SELECT ROWNUM, HASH_LINKED, HASH_CONFIRMED_BY,                   HASH_SALTED,HASH_SOURCE,BLOCK_PREV,BLOCK_NEXT,BLOCK_NUM FROM HASH_CHAIN WHERE ROWNUM=?ROW_NUM','ROWFOUND')
   * this finds a row on the SQL table by looking for a target Row Number. If found, downloads the fields I need to      * retrieve from the relevant Row into a VFP cursor called 'ROWFOUND'
   
    stop('PREPSQL FAIL on FINDROW')   && my errortrap code
ENDIF

(PREPSQL is my wrapper around the VFP code
  SQLPREPARE(&handle_name,sqstring,cursorname)

4 if dynamic parameters will be required, declare those as public vars and assign appropriate values before executing the query

so "PUBLIC ROW_NUM" appears roughly the same place as PUBLIC FINDROW, in the opening lines of the code.

5 execute the query, passing the handle (and other approp parameters if required).

So, in this case, I have successfully inserted the first 100 rows into HASH-CHAIN  
I now need to confirm various attrributes of those records (chiefly that they're properly linked) so I run a FOR loop

here is a truncated version of the loop, which stops exactly where I'm hitting the problem we're discussing:


FOR cnt2=1 TO 100
    ROW_NUM=CNT2
     IF NOT SQLEXECUTE(FINDROW)  && TRAP any failure in the SQL code
          STOP('SQLX FAIL on FINDROW')
     ENDIF
     IF RECCOUNT('ROWFOUND')=0
            STOP('Why not finding Row: '+TRANSFORM(row_num),PROGRAM())
   [and this is where it is stopping because it IS passing ROW_NUM as a number but HASH-CHAIN doesn't want to deal with it unless I send it as a string, which is easy to do (just do what I do in the error trap and make ROW_NUM=TRANSFORM(cnt2)) but its a bodge I really need to avoid because this table is going to become enormous and I need to know it has rock solid foundations and this behaviour doesn't give me that confidence]


you got it wrong imho. and this is why you do not show the relevant part of the code.

the cast occurs before your query is executed. most likely when you create the handle or whatever variable that stores the row number before executing the query.

i know nothing of vfp but this is where you should be looking.

there is zero doubt mysql or mariadb can compare a numeric val with a number
btw, cnt2 might not be the same as CNT2.
VFP is "relaxed" about typing.

Unless we specify a public var as a specific type
(eg PUBLIC ROW_NUM AS INTEGER) it defaults to boolean. (.f.) and only acquires type when assigned a value.

(eg ROW_NUM=cnt2)  and it's equally relaxed about Case, so cnt2 is identical to CNT2.

We can even assign a numeric value to it in one expression, then a string, date, whatever in the next. Which makes it entirely plausible that variable type is the cause of the problem.

However, if that was the cause of the problem, we'd have to explain why it works on machine 1 and not, in otherwise identical circumstances, on machine 2.

Despite which, I've tried (on Machine 2) the explicit assignment of both ROW_NUM and cnt2 as INTEGER and it made no difference.

And, on reflection, I don't understand how casting can be the cause of the problem.

Although it's currently unstable and inconsistent, the most typical failure I'm getting is that the query returns a null result WHEN ROW_NUM IS NUMERIC.  I can then (sometimes) manually repeat the query by setting ROW_NUM as a string. And get back the relevant Row.

So if Casting IS the problem, I'd expect it to be that it is turning the initial Numeric var into a Character Var. But if so, it ought to be working because that form of ROW_NUM is the only one that does return a result.  But we know that, at the point of failure, ROW_NUM remains, as it was initialised, Numeric.

This is a very tangled mess!




an integer casted to a string produces the empty string in many languages.

the empty string in a mysql comparison with an int is equal to zero. any string that does not start with a digit, + or - is zero as well.

you do not have a row with id zero so you end up with an empty set. you can test that theory by creating a row with id 0.

you already know numeric ids work because you tried, so whatever your variable contains at that point is not the original integer.

--

i must say helping is really difficult with bits and pieces of code. given the above i would guess the convertion occurs when you call findrow.

btw, in vfp, you have loose types but you do have types. converting an int to a string may produce all stars or some string in scientific notation as per the str() function which may be called implicitely in various places.

maybe you can debug by printing the contents of the variable in different places, maybe you can post the whole concerned piece of code so we can actually see what is going on. you definitely can print the id's value in the loop.
can't dispute your logic.
Have obviously tried (and failed) to capture the point at which the type is changing. Everywhere I can test it, it remains numeric. The only place left for it to change (which I don't know how to debug) is within Maria's parsing of the incoming sql.

Intrigued by the notion of creating a row with id zero. How is that possible with auto-incrementing?
(and I can't turn that off as it would mean I'd have to feed a value to the field which would almost certainly disrupt this behaviour)

The question that stays prominent in my head is: how and why am I getting such different behaviour running the same software on two different machines? That surely suggests an "environmental" difference below the level of the code I'm running...


@Mike
Sorry, no time to install MariaDB yet. But I would like to know how are you accessing MariaDB? Is the query in your original question passed from FoxPro through ODBC?  
What ODBC connector version do you use? 
you can insert specific ids in an autoincrement column. if the id is gt than the last id, the last id is updated.

i assume you printed the id right before the select statement ?

i had rather debug and understand where the issue occurs before trying to figure why is does not happen on another machine.

it is fairly possible the cast occurs in a buggy wrapper driver in vfp
I would like to extend my previous question:  Could you please post ODBC connector version from both computers?
ASKER CERTIFIED SOLUTION
Avatar of Mike Jacobs
Mike Jacobs
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
thanks for sharing the details. good to know i wasn't sending you after some more red herrings ;)

as far as i am concerned, i do not really care about points but recognition and feedback are very welcome.

as an asker, i usually select all comments that either helped me or are likely to help a future reader facing a similar issue. but that is just me and i am unsure about the forums guidelines nowadays. this choice is slightly unfair because the number of comments does not represent their values but it allows to keep things simple.

there is a guideline regarding points awarding but i must admit i never went through it exhaustively. i usually ignore the step when i end up accepting an incomplete or mildly on spot answer and award the max when i really got what hoped for. but again, that is an interpretation based on my reluctancy to grade people or comments.