Oneill0003
asked on
Parsing Data into Multiple Fields
Experts:
Got the following string and I need to parse it into 9 fields:
1236/14852/369215852/14785 2/3652/147 896/145639 8/14523/12 54896
I will like to parse this data into 9 fields, since the most the set of ids that this string can have is 9, but not all the time is 9 it can range between 1 to 9 sets of tickets.
I got some logic worked but Im only pulling the first and second set of numbers:
Select substring('123456/147852/3 69852/1478 52/369852/ 147896/145 6398/14523 /1254896/1 23654',1,C HARINDEX(' /','123456 /147852/36 9852/14785 2/369852/1 47896/1456 398/14523/ 1254896/12 3654
')- 1) AS ID1
I will really appreciate any tip or idea,
Thanks!
Got the following string and I need to parse it into 9 fields:
1236/14852/369215852/14785
I will like to parse this data into 9 fields, since the most the set of ids that this string can have is 9, but not all the time is 9 it can range between 1 to 9 sets of tickets.
I got some logic worked but Im only pulling the first and second set of numbers:
Select substring('123456/147852/3
')- 1) AS ID1
I will really appreciate any tip or idea,
Thanks!
ASKER
Hey Bill:
I'm running Sybase 9. I'm trying to come up with 9 fields.
Sound like is going to be a trip to get this done. I'm new to Sybase i work usually with MS SQL Server.
Do you think you can give me a hand on coming up with the loop? I have been working with CHARINDEX and actually LOCATE and SUBSTRING in sybase but with no results. So do I need more then 1 query for this?
Not to sure what you mean by "if you were willing to pad the ticket number out to a fixed width"?
The length of the ticket number right now seems to be always 10 characters but it could change.
Thanks Bill!
I'm running Sybase 9. I'm trying to come up with 9 fields.
Sound like is going to be a trip to get this done. I'm new to Sybase i work usually with MS SQL Server.
Do you think you can give me a hand on coming up with the loop? I have been working with CHARINDEX and actually LOCATE and SUBSTRING in sybase but with no results. So do I need more then 1 query for this?
Not to sure what you mean by "if you were willing to pad the ticket number out to a fixed width"?
The length of the ticket number right now seems to be always 10 characters but it could change.
Thanks Bill!
Oh, you are not running ASE, you are running Sybase ASA (Adaptive Server Anywhere).
Well the good news is that you have the LOCATE function and can create User Defined Functions so you have a fighting chance.
I don't have an ASA 9 database handy so I can't create the actual code for you but I can outline it and you can slog on from there.
First, you will use the CREATE FUNCTION syntax to create a function that takes the field/string you want to parse, the separator, and the number of the element you want to return. The function then returns that element as a string.
As I said before, you will build a WHILE loop that takes the first element off of the beginning of the string and then removes that portion and the delimiter from the front of the string. You will do that N times until you get the element you want out of the string.
To use this in a query, you will have to call the function 9 times, one for each of the elements you are trying to retrieve. This won't be particularly fast and you will have a lot of edge conditions to deal with in the code including the first element, the last element (since there is no trailing delimiter, a single element in the string, and an empty string.
The example you gave shows 5, 6, and 7 digit long ticket numbers, so I am not sure how you can say they are always 10 digits. It may be they are a maximum of 10 digits but that does you no good at all.
When I said you should pad them out, I meant that they should all be the same length in the string. Whether you pad them with leading zeros or spaces doesn't matter but it would make life so much easier you should really consider it.
0002345601/0000001234/0008 765432 --- etc. or
2345601/ 1234/ 8765432 ........
That way you could create a simple function that took the field, a field length, and an element number and pulled the element you wanted with a simple substring. Definitely the way to go.
Regards,
Bill
Well the good news is that you have the LOCATE function and can create User Defined Functions so you have a fighting chance.
I don't have an ASA 9 database handy so I can't create the actual code for you but I can outline it and you can slog on from there.
First, you will use the CREATE FUNCTION syntax to create a function that takes the field/string you want to parse, the separator, and the number of the element you want to return. The function then returns that element as a string.
As I said before, you will build a WHILE loop that takes the first element off of the beginning of the string and then removes that portion and the delimiter from the front of the string. You will do that N times until you get the element you want out of the string.
To use this in a query, you will have to call the function 9 times, one for each of the elements you are trying to retrieve. This won't be particularly fast and you will have a lot of edge conditions to deal with in the code including the first element, the last element (since there is no trailing delimiter, a single element in the string, and an empty string.
The example you gave shows 5, 6, and 7 digit long ticket numbers, so I am not sure how you can say they are always 10 digits. It may be they are a maximum of 10 digits but that does you no good at all.
When I said you should pad them out, I meant that they should all be the same length in the string. Whether you pad them with leading zeros or spaces doesn't matter but it would make life so much easier you should really consider it.
0002345601/0000001234/0008
2345601/ 1234/ 8765432 ........
That way you could create a simple function that took the field, a field length, and an element number and pulled the element you wanted with a simple substring. Definitely the way to go.
Regards,
Bill
ASKER
Bill:
Yes you are correct Im using ASA.
At this point I dont have enough privileges to create a User Defined Functions, our DBAs are very picky and to get that kind of access I have to go throw many hoops and that this point I dont have the time. I know this is not the most professional way to get this accomplish but Im going to have to do this just with brute force right in the query. I know Im going to be creating a lot of syntax but like I mention right now I dont have the level of access to create UDF. Im creating a DTS package and connecting to Sybase and generating a SQL task in it to accomplish this. I will love to do the UDF but cant right now.
I now that the example I provided shows 5, 6 and 7 digits all I was trying to do was to show that the length of each set of number can change but on the existing data they all seem to have a max of 10 digits but it can change eventually in the near future and I just wanted to display a worse case scenario.
Im not to sure how to do the padding, remember that they could be less then 10 or greater then 10 digits.
I know Im not giving you many options but many of the times they dont give me the right tools that I need or the level of access to get stuff accomplish and I just have to work with what I have.
I really appreciate all of your help Bill and trying to brain stormed a solution. Im going to keep playing with LOCATE and SUBSTRING and see were I get.
Thanks Again!
Yes you are correct Im using ASA.
At this point I dont have enough privileges to create a User Defined Functions, our DBAs are very picky and to get that kind of access I have to go throw many hoops and that this point I dont have the time. I know this is not the most professional way to get this accomplish but Im going to have to do this just with brute force right in the query. I know Im going to be creating a lot of syntax but like I mention right now I dont have the level of access to create UDF. Im creating a DTS package and connecting to Sybase and generating a SQL task in it to accomplish this. I will love to do the UDF but cant right now.
I now that the example I provided shows 5, 6 and 7 digits all I was trying to do was to show that the length of each set of number can change but on the existing data they all seem to have a max of 10 digits but it can change eventually in the near future and I just wanted to display a worse case scenario.
Im not to sure how to do the padding, remember that they could be less then 10 or greater then 10 digits.
I know Im not giving you many options but many of the times they dont give me the right tools that I need or the level of access to get stuff accomplish and I just have to work with what I have.
I really appreciate all of your help Bill and trying to brain stormed a solution. Im going to keep playing with LOCATE and SUBSTRING and see were I get.
Thanks Again!
Don't bother. With the constraints you have, there is literally no way to do what you need to do. You can not get LOCATE, CHAR_INDEX, or PATINDEX to give you the second through ninth elements. The parser will give up long before you manage to write the complex in-line expressions nested 9 levels deep. Again, the database just won't do it.
- You are stuck with a crappy data design
- You have no privileges, and
- It appears you are unable to change the application to store the ticket#s in a useful form
Game over!
When your DBA is the obstacle, I have found great success in publicly dumping the problem in their lap. First, just ask for help. If that doesn't work....
Write a sickeningly sweet e-mail to the DBA explaining what you need done, all the constraints (including the lack of privileges to actually use the database as it was intended), that you can't figure out how to do it in plain SQL and that you need his/her help. Carbon copy your boss and his (if they are not the same) and make it clear that you are dead in the water until this is resolved. Now you have made it his problem, not yours, and made him and the problem visible to management.
One of two things will happen; either the DBA will write the function you need or he will grant you the privileges to write stored procedures and functions. This setup your in is really kind of stupid since you should be working in a development environment where you can't really hurt anything anyway. If the DBA wants to review the code before it goes in the production database, that makes sense though I can guarantee you that trying to do everything in SQL statements alone leads to some very, very bad (undesirably complex, inefficient, difficult to maintain, and impossible to tune) code.
Best of luck,
Bill
- You are stuck with a crappy data design
- You have no privileges, and
- It appears you are unable to change the application to store the ticket#s in a useful form
Game over!
When your DBA is the obstacle, I have found great success in publicly dumping the problem in their lap. First, just ask for help. If that doesn't work....
Write a sickeningly sweet e-mail to the DBA explaining what you need done, all the constraints (including the lack of privileges to actually use the database as it was intended), that you can't figure out how to do it in plain SQL and that you need his/her help. Carbon copy your boss and his (if they are not the same) and make it clear that you are dead in the water until this is resolved. Now you have made it his problem, not yours, and made him and the problem visible to management.
One of two things will happen; either the DBA will write the function you need or he will grant you the privileges to write stored procedures and functions. This setup your in is really kind of stupid since you should be working in a development environment where you can't really hurt anything anyway. If the DBA wants to review the code before it goes in the production database, that makes sense though I can guarantee you that trying to do everything in SQL statements alone leads to some very, very bad (undesirably complex, inefficient, difficult to maintain, and impossible to tune) code.
Best of luck,
Bill
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Bret,
That works, but the updates to temp table are really expensive, particularly if there is any real data volume. It also assumes the developer even has privileges to create a temp table. We have both seen questions from the poor souls trying to do work against that kind of nonsense.
If he can't create a stored procedure, he could still create a block of T-SQL, declare variabales, a cursor against the input table, the while loop, and use the "pull one off the front and do it again" methodology.
I guess the usefulness of doing this in a block of T-SQL depends on what context it all has to happen in. If it is a report of some kine, this just might work. If you have to call it from a reporting tool, e.g. Cognos, Crystal, etc., I think it is going to be an uphill battle.
Oneill0003, what is the context for this effort?
Regards,
Bill
That works, but the updates to temp table are really expensive, particularly if there is any real data volume. It also assumes the developer even has privileges to create a temp table. We have both seen questions from the poor souls trying to do work against that kind of nonsense.
If he can't create a stored procedure, he could still create a block of T-SQL, declare variabales, a cursor against the input table, the while loop, and use the "pull one off the front and do it again" methodology.
I guess the usefulness of doing this in a block of T-SQL depends on what context it all has to happen in. If it is a report of some kine, this just might work. If you have to call it from a reporting tool, e.g. Cognos, Crystal, etc., I think it is going to be an uphill battle.
Oneill0003, what is the context for this effort?
Regards,
Bill
ASKER
Hi guys:
Bret thanks for your input on this.
Bill, this data will be exported to a delimiter file from my dts package, which it will be delivery to another user that will either import it into excel or an Access database to analyze it.
Thanks
Isaac
Bret thanks for your input on this.
Bill, this data will be exported to a delimiter file from my dts package, which it will be delivery to another user that will either import it into excel or an Access database to analyze it.
Thanks
Isaac
I think the question may be more on how much data will you be dealing with and how frequently? Are you having to just occasionally parse out a single input, or will you be parsing out batches (of what size?) of inputs at a time?
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi Guys:
Once again I really appreciate all of your input on this issue.
Talking to my boss we end up using one of our BIA tools BRIO(Hyperion) which has a beautiful functions that lets you return the position of the mth occurrence of the s2 in S1 starting at n (Thats the definition of the function). I used it and it let me parse that field with no problem. We were trying our best to stay away from BRIO because of some other requirements of the project plus we had to do some other stuff that we cant do in BRIO, but because of this issue we have to do part of it in my DTS package and then do the parsing and the export of the files through BRIO.
I will definitely keep all of your suggestions Bill and we still agree that is better to have a UDF that will do this for us in a much better and cleaner way.
I will double the points and split them.
Thanks again for all of your Help!!!!
Once again I really appreciate all of your input on this issue.
Talking to my boss we end up using one of our BIA tools BRIO(Hyperion) which has a beautiful functions that lets you return the position of the mth occurrence of the s2 in S1 starting at n (Thats the definition of the function). I used it and it let me parse that field with no problem. We were trying our best to stay away from BRIO because of some other requirements of the project plus we had to do some other stuff that we cant do in BRIO, but because of this issue we have to do part of it in my DTS package and then do the parsing and the export of the files through BRIO.
I will definitely keep all of your suggestions Bill and we still agree that is better to have a UDF that will do this for us in a much better and cleaner way.
I will double the points and split them.
Thanks again for all of your Help!!!!
Next question, are you trying to come up with 9 records, or just 9 fields?
In either case, the problem is that CHARINDEX (and PATINDEX) only return the the first instance of the character or pattern you are searching for. I have not been able to find a way around that. As a result, you will have to do this programmatically. Bottom line, unless you have UDFs available (either JAVA or T-SQL), you cannot accomplish this in a single query.
Now, if you were willing to pad the ticket number out to a fixed width, you can pull it apart with no problem with 9 substring operators. That would be my recommendation since the design of stuffing variable length delimited strings into a varchar is so highly non-relational. Just not a good way to go at all. The really relational thing to do is to normalize the ticket numbers into an additional table but that may not be practical at this point.
As far as an algorithm to pull the ticket numbers apart goes, you need to setup a WHILE loop to iterate 9 times. Then each time through the loop, you use CHARINDEX to find the first '/' character, substring out what you need off of the front of it, and then chop the front off the string. You can do that with either another substring or a STUFF function.
You can, of course, exit the WHILE loop early when the remaining length of the string is zero.
Sorry there was no magic way to make this work but the functionality does not exist in native T-SQL.
Regards,
Bill