SybaseUk
asked on
Sybase IQ generating distinct rows for dummy table
Hi,
I am trying to generate a table containing 25 columns and populate with some sample data. My aim is to get around 300k rows. I have generated the table with 300 distinct rows of sample data using powerdesigner. Does anybody know a nice way of blowing up this table to 300k rows without duplicating any of the data?? I tried using the STUFF function to take the last 2 digits of the ID and append to each column as follows: -
insert into test1 (C1)
select stuff(C1,5,2,substring(con vert(char, ID), length(convert(varchar, ID))-1,2))
from test1
But this method gives me duplicate values when I execute distinct count. Can some1 please help?
But this
I am trying to generate a table containing 25 columns and populate with some sample data. My aim is to get around 300k rows. I have generated the table with 300 distinct rows of sample data using powerdesigner. Does anybody know a nice way of blowing up this table to 300k rows without duplicating any of the data?? I tried using the STUFF function to take the last 2 digits of the ID and append to each column as follows: -
insert into test1 (C1)
select stuff(C1,5,2,substring(con
from test1
But this method gives me duplicate values when I execute distinct count. Can some1 please help?
But this
create table TEST1 (
ID int null default autoincrement,
C1 char(6) null,
C2 char(6) null,
C3 char(6) null,
C4 char(6) null,
C5 char(6) null,
C6 char(6) null,
C7 char(6) null,
C8 char(6) null,
C9 char(6) null,
C10 char(6) null,
C11 char(6) null,
C12 char(6) null,
C13 char(6) null,
C14 char(6) null,
C15 char(6) null,
C16 char(6) null,
C17 char(6) null,
C18 char(6) null,
C19 char(6) null,
C20 char(6) null,
C21 char(6) null,
C22 char(6) null,
C23 char(6) null,
C24 char(6) null,
C25 char(6) null
);
insert into TEST1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21, C22, C23, C24, C25) values ('H0A9HT', 'SGYO53', '27V8L2', '9EM508', ' MX12I', '5KP34L', '2S40AO', '2XIYW6', '0G5COU', 'T2QLOI', '4AQXA8', 'YIA80X', ' 6L8S2', 'IFR7NN', '8223LK', 'XHEKTW', '304GVR', 'W EMVT', '08G1JX', 'LT3CVB', 'GT38TX', '21DHHX', '5MKAWW', 'BNCISN', '1XDDJ2');
insert into TEST1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21, C22, C23, C24, C25) values ('RU812M', 'M6 7WG', '3HUU4A', 'QU55G9', 'RJMNOY', 'WRDJMH', 'V8KGCT', 'UWONM1', '6C3O Q', '9Y 6G', '2H927L', 'TVUT U', 'R5RG0N', '9SUJ37', 'VTOO2Y', '4XPLMN', 'OLMPU8', 'WU6VLD', 'P70A3X', ' 12VPO', '0MI3TX', 'G2E1W2', 'NNBMI5', 'NOH5C9', '73MKPD');
ASKER
Hi,
Thanks for that info. How would I create the loop to insert new rows (i.e. double the rows with distinct values) instead of updating the existing column values?
Thanks!
Thanks for that info. How would I create the loop to insert new rows (i.e. double the rows with distinct values) instead of updating the existing column values?
Thanks!
Just as an example:
declare @max int
declare @cur int
select @cur = 1, @max = 300000 -- test with a smaller number first as I haven't tested it
while @cur < @max
begin
insert into TEST1 values (str(rand()), str(rand(@cur)), str(rand()), ....)
select @cur = @cur + 1
end
You can use different seed values to manipulate random generator as well, I just put @cur as seed as an example.
declare @max int
declare @cur int
select @cur = 1, @max = 300000 -- test with a smaller number first as I haven't tested it
while @cur < @max
begin
insert into TEST1 values (str(rand()), str(rand(@cur)), str(rand()), ....)
select @cur = @cur + 1
end
You can use different seed values to manipulate random generator as well, I just put @cur as seed as an example.
You have another problem as well. Doing 300K individual inserts in Sybase IQ will take a very long time. IQ simply is not made for fast individual record manipulation.
If you have to do all this inside IQ, create the table without any indexes (except the default FPs of course) and populate it. Start with a few thousand rows of random data. For now, leave the IDs blank. Then, use INSERT/SELECT to double the number of rows each time. If you start with 9375 seed rows and double it 5 times, you will wind up with 300K.
After you have all the rows you need, you can go back and assign the IDs in bulk. You can either use something like UPDATE table SET IDcol = NUMBER(*). Alternatively, you can use the ROWID with something like... UPDATE table SET IDcol = ROWID(table). Of course, if you need a character prefix or something else, you can add that into the expression you use in the UPDATE
Of course, you can always use scripts or another database (ASA or ASE) to generate the data, BCP it out and do a LOAD TABLE to get it into IQ.
Regards,
Bill
If you have to do all this inside IQ, create the table without any indexes (except the default FPs of course) and populate it. Start with a few thousand rows of random data. For now, leave the IDs blank. Then, use INSERT/SELECT to double the number of rows each time. If you start with 9375 seed rows and double it 5 times, you will wind up with 300K.
After you have all the rows you need, you can go back and assign the IDs in bulk. You can either use something like UPDATE table SET IDcol = NUMBER(*). Alternatively, you can use the ROWID with something like... UPDATE table SET IDcol = ROWID(table). Of course, if you need a character prefix or something else, you can add that into the expression you use in the UPDATE
Of course, you can always use scripts or another database (ASA or ASE) to generate the data, BCP it out and do a LOAD TABLE to get it into IQ.
Regards,
Bill
ASKER
Hi Bill,
Many thanks for your reply. As always, your help is very much valued & appreciated.
Its a little more tricky, because my table has 25 columns which are char(6). I designed it with powerdesigner and also used PD to generate 300 rows of sample data. I have been using INSERT/SELECT to double the rows within IQ but the problem is that I end up with duplicate rows. I then tried to use RAND() within my INSERT/SELECT statement but I am still ending up with duplicate rows.
insert into test1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21,
C22, C23, C24, C25)
select stuff(C1,5,2,substring(con vert(char, rand()), 2,2))
select stuff(C2,5,2,substring(con vert(char, rand()), 2,2))
select stuff(C3,5,2,substring(con vert(char, rand()), 2,2))
........................
etc
Running this statement twice now gives me 1200 rows with only 900 distinct values in each column.
select top 5 c1 from test1
c1
'H0A9HT'
'RU812M'
'FACK8O'
'UBC86S'
'3REOAK'
select top 5 c1 from test1 where ROWID("Test1") > 300
c1
'H0A9.5'
'RU81.5'
'FACK.5'
'UBC8.5'
'3REO.5'
select top 5 c1 from test1 where ROWID("Test1") > 600
c1
'H0A9.9'
'RU81.9'
'FACK.9'
'UBC8.9'
'3REO.9'
select top 5 c1 from test1 where ROWID("Test1") > 900
c1
'H0A9.9'
'RU81.9'
'FACK.9'
'UBC8.9'
'3REO.9'
Any ideas why my data is being duplicated and not giving me different values using the rand() function???
Thanks
Many thanks for your reply. As always, your help is very much valued & appreciated.
Its a little more tricky, because my table has 25 columns which are char(6). I designed it with powerdesigner and also used PD to generate 300 rows of sample data. I have been using INSERT/SELECT to double the rows within IQ but the problem is that I end up with duplicate rows. I then tried to use RAND() within my INSERT/SELECT statement but I am still ending up with duplicate rows.
insert into test1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21,
C22, C23, C24, C25)
select stuff(C1,5,2,substring(con
select stuff(C2,5,2,substring(con
select stuff(C3,5,2,substring(con
........................
etc
Running this statement twice now gives me 1200 rows with only 900 distinct values in each column.
select top 5 c1 from test1
c1
'H0A9HT'
'RU812M'
'FACK8O'
'UBC86S'
'3REOAK'
select top 5 c1 from test1 where ROWID("Test1") > 300
c1
'H0A9.5'
'RU81.5'
'FACK.5'
'UBC8.5'
'3REO.5'
select top 5 c1 from test1 where ROWID("Test1") > 600
c1
'H0A9.9'
'RU81.9'
'FACK.9'
'UBC8.9'
'3REO.9'
select top 5 c1 from test1 where ROWID("Test1") > 900
c1
'H0A9.9'
'RU81.9'
'FACK.9'
'UBC8.9'
'3REO.9'
Any ideas why my data is being duplicated and not giving me different values using the rand() function???
Thanks
ASKER
applmoon,
I need to use stuff to append the first 2 s.f's of the generated random value to the existing column C1, C2, C3 respectively.....
So I can't use literal values, I must use INSERT/SELECT. However, I am still getting duplicates........
I need to use stuff to append the first 2 s.f's of the generated random value to the existing column C1, C2, C3 respectively.....
So I can't use literal values, I must use INSERT/SELECT. However, I am still getting duplicates........
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
I agree that "Using a RAND function will not guarantee the uniqueness". But possibility of uniqueness depends on how that function is used. A dice has 6 sides, a two digit number has only 99 possibilities. If you use rand function 6 times to generate 6 different characters, you would have 308,915,776 possibilities and the chance of uniqueness would be high enough even though it is not guaranteed.
I think using some seed rows is not a good idea.
I think using some seed rows is not a good idea.
The problem is that the procedure SybaseUK is using does not generate six individual random characters for each of the 25 fields for each row. It looks as if he is simply stuffing a couple of characters in. Besides, if you did try calling RAND 150 times for each row, I suspect there would be a performance issue. I am pretty sure that RAND is an ASA function, not an IQ function, so all the data would have to pass back and forth between the two engines. This kind of thing should be better documented in IQ but as of now it is more experiencial than anything else.
Bottom line, if you need guaranteed uniqueness, you cannot use random functions.
Regards,
Bill
Bottom line, if you need guaranteed uniqueness, you cannot use random functions.
Regards,
Bill
I might be wrong, but my understanding is this is a one off process to run and prepare a sample data for a test. So performance shouldn't be an issue. Also data can be prepared on another server and then loaded into IQ. Also if it is the case and there are a few duplicate rows, they can be deleted.
You are right saying that instead of using seed rows generating all rows at once would be a better approach. I am not really sure whether using rand function to create six character data is easier or your method is easier. But, I think they are both valid methods.
There might be around 300 duplicate rows out of 300K rows when rand() is used, but they can be deleted easily. Also it depends on how many charaters are used. If small caps and numbers are used, the possibility of uniqueness would be much higher.
You are right saying that instead of using seed rows generating all rows at once would be a better approach. I am not really sure whether using rand function to create six character data is easier or your method is easier. But, I think they are both valid methods.
There might be around 300 duplicate rows out of 300K rows when rand() is used, but they can be deleted easily. Also it depends on how many charaters are used. If small caps and numbers are used, the possibility of uniqueness would be much higher.
ASKER
Hi alpmoon,
I have tried using the loop method you have suggested but I am still getting the same results as I did without the loop. Can you please look at my code and tell me whats wrong?
declare @max int
declare @cur int
declare @char char(2)
select @cur = 1, @max = 3
while @cur < @max
begin
select @char = substring(convert(char, rand(@cur)),2,2)
insert into Test1 (C1)
select
stuff(C1,5,2,@char) from Test1
select @cur = @cur + 1
end
As you can see I am using stuff to append the last 2 random numbers to replace the last 2 characters of C1. I cannot use literal values therefore I am using INSERT/SELECT...
Any help appreciated please....
I have tried using the loop method you have suggested but I am still getting the same results as I did without the loop. Can you please look at my code and tell me whats wrong?
declare @max int
declare @cur int
declare @char char(2)
select @cur = 1, @max = 3
while @cur < @max
begin
select @char = substring(convert(char, rand(@cur)),2,2)
insert into Test1 (C1)
select
stuff(C1,5,2,@char) from Test1
select @cur = @cur + 1
end
As you can see I am using stuff to append the last 2 random numbers to replace the last 2 characters of C1. I cannot use literal values therefore I am using INSERT/SELECT...
Any help appreciated please....
ASKER
Hi Bill,
alpmoon is right in that the sample data does not need to 100% unique and I agree with him. However, the loop method isn't working presently and I am stuck. I desperately need to get around 300k rows and as alpmoon suggested, I can easily remove any duplicates.
The problem I am facing is that the rand() function is not producing random values which I can then stuff into C1, C2, C3 etc....
alpmoon is right in that the sample data does not need to 100% unique and I agree with him. However, the loop method isn't working presently and I am stuck. I desperately need to get around 300k rows and as alpmoon suggested, I can easily remove any duplicates.
The problem I am facing is that the rand() function is not producing random values which I can then stuff into C1, C2, C3 etc....
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Bill you are correct that RAND() will not guarantee uniqueness but a combination of 25 columns with random values appended should be unique (across a single row)....
INSERT/SELECT is doing this process in batches of 300 and thats the reason I think I am not getting a true random value. I need a way of stuffing a new random value (even though there are only 99 possibilities) into the respective columns for every row.
Can you suggest any solution please?
P.S. I'm just trying to generate a modest size table for testing purposes...
INSERT/SELECT is doing this process in batches of 300 and thats the reason I think I am not getting a true random value. I need a way of stuffing a new random value (even though there are only 99 possibilities) into the respective columns for every row.
Can you suggest any solution please?
P.S. I'm just trying to generate a modest size table for testing purposes...
Now, I understand what you want better. If you only want to generate only two digit, you should multiply the result of rand() with at least 100:
insert into test1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21,
C22, C23, C24, C25)
select stuff(C1,5,2,substring(con vert(char, rand()*100), 2,2))
select stuff(C2,5,2,substring(con vert(char, rand()*100), 2,2))
select stuff(C3,5,2,substring(con vert(char, rand()*100), 2,2))
....
or use module additionally:
insert into test1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21,
C22, C23, C24, C25)
select stuff(C1,5,2,substring(con vert(char, rand()*100000 % 100), 2,2))
select stuff(C2,5,2,substring(con vert(char, rand()*100000 % 100), 2,2))
select stuff(C3,5,2,substring(con vert(char, rand()*100000 % 100), 2,2))
......
insert into test1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21,
C22, C23, C24, C25)
select stuff(C1,5,2,substring(con
select stuff(C2,5,2,substring(con
select stuff(C3,5,2,substring(con
....
or use module additionally:
insert into test1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21,
C22, C23, C24, C25)
select stuff(C1,5,2,substring(con
select stuff(C2,5,2,substring(con
select stuff(C3,5,2,substring(con
......
alpmoon,
Yeah, this gets a log clearer now that we know sybaseUK does not each column value to be unique; just the entire row.
That code will work 90 percent of the time but you still have a problem with single digit numbers. You need to ensure there is a leading zero or you will wind up with about 10 percent of the values being too short.
Try...
select stuff(C1,5,2,right('0'+con vert(varch ar,rand()* 100000 % 100),2))
Come to think of it, STUFF is slower than substring and concatenation. We could use...
select substring(C1,1,4)+right('0 '+convert( varchar,ra nd()*10000 0 % 100),2)
Finally, instead of 100000, I would choose a prime number in that vicinity. Something tells me you'll get better (more random) results than using a multiplier that is an even multiple of the modulo.
Regards,
Bill
Yeah, this gets a log clearer now that we know sybaseUK does not each column value to be unique; just the entire row.
That code will work 90 percent of the time but you still have a problem with single digit numbers. You need to ensure there is a leading zero or you will wind up with about 10 percent of the values being too short.
Try...
select stuff(C1,5,2,right('0'+con
Come to think of it, STUFF is slower than substring and concatenation. We could use...
select substring(C1,1,4)+right('0
Finally, instead of 100000, I would choose a prime number in that vicinity. Something tells me you'll get better (more random) results than using a multiplier that is an even multiple of the modulo.
Regards,
Bill
ASKER
hi guys,
i was trying to convert the rand() output to char and then take a substring of that i.e. 2 digits and stuff it into C1.
so for example:
select stuff('H0A9HT', 5,2,substring(convert(char , rand()), 2,2))
would produce the result:
H0A935
the last 2 digits being random values
i was trying to convert the rand() output to char and then take a substring of that i.e. 2 digits and stuff it into C1.
so for example:
select stuff('H0A9HT', 5,2,substring(convert(char
would produce the result:
H0A935
the last 2 digits being random values
ASKER
Hi,
I get a syntax error from executing the following statement: -
select stuff(C1,5,2,right('0'+con vert(varch ar,rand()* 100000 % 100),2))
If I remove % 100 then the statement executes but I dont think its doing what it should be????
select stuff(C1,5,2,right('0'+con vert(varch ar,rand()* 100000),2) )
I get a syntax error from executing the following statement: -
select stuff(C1,5,2,right('0'+con
If I remove % 100 then the statement executes but I dont think its doing what it should be????
select stuff(C1,5,2,right('0'+con
The text of the syntax error might be useful but I think I know what the issue is.
The modulo operator only works on integers. You will have to add a CONVERT(INT,..... to the expression to get it to function properly.
You know, since you are going to grab two digits by position and then convert them into a string anyway, why bother with the modulo. Try something like....
select stuff(C1,5,2,substring(con vert(varch ar,rand(), 3,2)))
Regards,
Bill
The modulo operator only works on integers. You will have to add a CONVERT(INT,..... to the expression to get it to function properly.
You know, since you are going to grab two digits by position and then convert them into a string anyway, why bother with the modulo. Try something like....
select stuff(C1,5,2,substring(con
Regards,
Bill
ASKER
Hi Bill,
This is what I began with!
I appreciate that I will get some column values which will be duplicates but the chances of the entire row being a duplicate (or identical) should be should be very less likely. But I am getting many rows which are completely identical so I know there is a bug in my code or the loop.
I was thinking the purpose of the loop is to do single row by row inserts and thus giving me a true random value each time. But I am not doing single row by row inserts, rather I am doubling the amount of rows.
How do I do single row by row inserts but only change/replace the last 2 characters of each value with a random value????
This is getting very frustrating now.
Thanks for all your help.
This is what I began with!
I appreciate that I will get some column values which will be duplicates but the chances of the entire row being a duplicate (or identical) should be should be very less likely. But I am getting many rows which are completely identical so I know there is a bug in my code or the loop.
I was thinking the purpose of the loop is to do single row by row inserts and thus giving me a true random value each time. But I am not doing single row by row inserts, rather I am doubling the amount of rows.
How do I do single row by row inserts but only change/replace the last 2 characters of each value with a random value????
This is getting very frustrating now.
Thanks for all your help.
ASKER
Hi Guys,
Please disregard all the previous messages. I have decided to scrap my 300 rows of sample data and start from scratch.
My table has an ID (auto-identity) column and 25 columns of char(6). I would like to populate this table with random data and have atleast 300k rows that are unique....please advise the best way to go about
many thanks!
Please disregard all the previous messages. I have decided to scrap my 300 rows of sample data and start from scratch.
My table has an ID (auto-identity) column and 25 columns of char(6). I would like to populate this table with random data and have atleast 300k rows that are unique....please advise the best way to go about
many thanks!
I am correcting the syntax there:
declare @max int
declare @cur int
declare @char6 char(6)
select @cur = 1, @max = 300
while @cur < @max
begin
select @char6 = char((rand()*100000 % 26) + 65)+char((rand()*100000 % 26) + 65) +
char((rand()*100000 % 26) + 65)+char((rand()*100000 % 26) + 65) +
char((rand()*100000 % 26) + 65)+char((rand()*100000 % 26) + 65)
insert into Test1 (C1)
select @char6
select @cur = @cur + 1
end
I have tested it for one column and it works fine(covers A to Z). You can repeat @char6 variable as many as you can.
declare @max int
declare @cur int
declare @char6 char(6)
select @cur = 1, @max = 300
while @cur < @max
begin
select @char6 = char((rand()*100000 % 26) + 65)+char((rand()*100000 % 26) + 65) +
char((rand()*100000 % 26) + 65)+char((rand()*100000 % 26) + 65) +
char((rand()*100000 % 26) + 65)+char((rand()*100000 % 26) + 65)
insert into Test1 (C1)
select @char6
select @cur = @cur + 1
end
I have tested it for one column and it works fine(covers A to Z). You can repeat @char6 variable as many as you can.
ASKER
alpmoon,
I have tried the above and I get a syntax error. If I execute this on its own: -
select char((rand()*100000 % 26) + 65)
I get syntax error near '(end of line)' message. Please help me resolve this.
I have tried the above and I get a syntax error. If I execute this on its own: -
select char((rand()*100000 % 26) + 65)
I get syntax error near '(end of line)' message. Please help me resolve this.
ASKER
Hi guys,
I now have the following and it works fine but it takes a very long time and also eventually runs out of space. Is there a quicker way of doing this?
declare @max int
declare @cur int
//declare @char6 char(6)
select @cur = 1, @max = 100000
while @cur < @max
begin
insert into Test1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21,
C22, C23, C24, C25)
select stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'A'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'B'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'C'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'D'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'E'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'F'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'G'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'H'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'I'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'J'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'K'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'L'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'M'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'N'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'O'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'P'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'Q'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'R'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'S'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'T'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'U'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'V'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'W'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'X'),
stuff(inttohex(convert(int ,rand()*10 0000)),1,3 ,'Y')
select @cur = @cur + 1
end
I now have the following and it works fine but it takes a very long time and also eventually runs out of space. Is there a quicker way of doing this?
declare @max int
declare @cur int
//declare @char6 char(6)
select @cur = 1, @max = 100000
while @cur < @max
begin
insert into Test1 (C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, C18, C19, C20, C21,
C22, C23, C24, C25)
select stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
stuff(inttohex(convert(int
select @cur = @cur + 1
end
ASKER
Do you think it would be easier to write a shell script to create a data file containing delimited rows and then load these into IQ???
Can anyone please help me write a C shell script to create a test data file??
Thanks,
Can anyone please help me write a C shell script to create a test data file??
Thanks,
ASKER
P.S. I need your help to write a shell script that will create the data (300K rows) each containing a unique id and random data made up of 25 columns of char(6).....
Please let me know whether I shud ask a new question and I would be happy to do that....
Many Thanks,
Please let me know whether I shud ask a new question and I would be happy to do that....
Many Thanks,
Actually the code I have tested was for ASE. If you have an ASE server. You can extend it for 25 columns and run for 300K rows. Then bcp out the data and insert into IQ. Inserting data into IQ one row at a time is not a good idea at all as Bill said before.
Or if you have ASA, you can do it in ASA as well. If you don't both, directing output to a text file and then eliminating unnecessary lines and tabs by sed might be a better option without writing a shell script.
Or if you have ASA, you can do it in ASA as well. If you don't both, directing output to a text file and then eliminating unnecessary lines and tabs by sed might be a better option without writing a shell script.
All I can do is repeat my suggestion from early in the thread....
So what I would do is create the list of numbers from 1 to 300,000, then generate each column value from that by adding the column prefix (A-Z) and calculating a hex number from the ID (1-300000) and the offset for that column.You can do this by creating a single column table with the IDs and a helper table with 26 columns of INT and the offset for each one. You can then generate the entire target table in an INSERT/SELECT by joining your ID table with the Column Offset table and using an expression to generate each column value, e.g.INSERT INTO TargetTableSELECT ID, 'A' + RIGHT(INTTOHEX(ID+CS.COL_A ),5), 'B'.........FROM ID_Table, Column_Offset CS
This will do all the inserts in one pass and be at least 100 and probably closer to 1000 times faster than the individual inserts you were doing.
How you create the ID_Table rows is irrelevant. You could create a list of 300000 numbers in a shell script, stuff it in a flat file, and then do a LOAD TABLE to get it in.
Regards,
Bill
So what I would do is create the list of numbers from 1 to 300,000, then generate each column value from that by adding the column prefix (A-Z) and calculating a hex number from the ID (1-300000) and the offset for that column.You can do this by creating a single column table with the IDs and a helper table with 26 columns of INT and the offset for each one. You can then generate the entire target table in an INSERT/SELECT by joining your ID table with the Column Offset table and using an expression to generate each column value, e.g.INSERT INTO TargetTableSELECT ID, 'A' + RIGHT(INTTOHEX(ID+CS.COL_A
This will do all the inserts in one pass and be at least 100 and probably closer to 1000 times faster than the individual inserts you were doing.
How you create the ID_Table rows is irrelevant. You could create a list of 300000 numbers in a shell script, stuff it in a flat file, and then do a LOAD TABLE to get it in.
Regards,
Bill
ASKER
Hi,
I have now inserted the data into ASE and I have unique rows of data...
I want to bcp the data out and have it pipe delimited and row delimited by new line. Can you please show me how I can format the data so it is pipe delimited and row delimited by new line? I am using windows.....
I will then load this flat file into IQ
Thanks,
I have now inserted the data into ASE and I have unique rows of data...
I want to bcp the data out and have it pipe delimited and row delimited by new line. Can you please show me how I can format the data so it is pipe delimited and row delimited by new line? I am using windows.....
I will then load this flat file into IQ
Thanks,
ASKER
Hi,
Can anybody explain why a simple query like the following takes around 6 seconds in ASE but is taking around 4 mins in IQ????????????
select C1, count(*)
from test1
group by C1
having count(*) > 1
I have loaded the IQ table by generating the random data in ASE and BCP'ing the data out. I am very confused why it is taking so long for this query to run.....
Can some1 please have a look at my query plan also and help me?
queryplan.doc
Can anybody explain why a simple query like the following takes around 6 seconds in ASE but is taking around 4 mins in IQ????????????
select C1, count(*)
from test1
group by C1
having count(*) > 1
I have loaded the IQ table by generating the random data in ASE and BCP'ing the data out. I am very confused why it is taking so long for this query to run.....
Can some1 please have a look at my query plan also and help me?
queryplan.doc
That one sounds like a whole new topic and question. Why don't you close off this one an open a new one. Give the points to Alpmoon.
Regards,
Bill
Regards,
Bill
ASKER
I am splitting the points 50/50 as both of you participated equally in the final solution..
http://www.sybase.com/detail?id=648
You can develop a similar loop to create rows.