Link to home
Start Free TrialLog in
Avatar of ankuratvb
ankuratvbFlag for United States of America

asked on

Best Way to parse SQL

Hi,

I was working on a SQL query generator, and was wondering what'd be the best way to parse SQL using ANSI C.

Right now what i do is:

Tokenize the query based on spaces.
Determine which SQL statement it is based on these tokens.
Evaluate it accordingly.I have different syntax check mechanisms for different statements,

I dont think this is the best way to do it.

What would be a better,faster method of parsing SQL?
ASKER CERTIFIED SOLUTION
Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
If you can parse it correctly you can execute it! (at least in theory), so I recommend you to use a genuine SQL parser from any SQL implementation.
Avatar of ankuratvb

ASKER

Hi Kent,

Basically,i look at the first word of the query string to determine which statement it is,then i extract the substrings required and process them.

For e.g.:
char query[]="insert into emp values('as',1234)";

By looking at the first word, i know that its an insert command,so i extract the content between "insert into" and "values" to get the table name,validate it and then move on to validating the values within the parentheses.

Right now,it works fine but i need to support sub-queries as well.Thats where the problem comes.

For instance,

char query[]="select * from emp where empno=(select empno from emp where empno=1)";


>Are you planning to support user functions and stored procedures, too?
I might, but sub-queries is first priority right now.

jaime,
If i use an SQL parser written by somebody else, half my work would've been done by somebody else.Yes,i can definitely learn from their parser's implementation.
Do you have links to any simple SQL parser's implementation?
I've seen MySQL's code but it was way too complicated for me.
I've downloaded SQLite's code which is supposed to be a very small,simple and fast database engine(adopted by PHP in place of MySQL).I'll be studying the code in the next few days.

Hi Ankur,

That's what I was referring to when I said that SQL is recursive.  Just like evaluating the expression (A*(B+C)), subqueries force you to process outward from the middle.

Unless you want to get REALLY ugly, you're going to have to utilize a stack and recursive evaluation techniques.  It adds a bunch of work to your project, but in the long run it's the only practical way to process a recursive language.


Kent
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of alikoank
alikoank

Parsers are generally written as "Finite State Machine" implementations. This is a very detailed subject. You may want to google abit.

SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi gseidman,

I was looking at lex and yacc for doing the job,i even have the language definition for SQL for them.

But,i was aiming for a portable solution, and i guess there arent such tools as lex and yacc in Windoze.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial