• Status: Solved
  • Priority: Medium
  • Security: Public

replacing characters within a delimited string (i.e. " quotes)

hello experts

Here is a problem that I have been struggling with. I am writing a shared library (SO) on Solaris to be used by a third-party application. The application accepts calls to its internal procedures, written in its own language, provided you pass them through its C function, which is run_code(arg);

So I have to pass in the internal call to this function run_code(arg). Thus I have a line that looks like

run_code(internalFunction/SubRoutine("arg1", "arg2", ..., "arg n"))

The problem is that if any of the arguments to the internal function call contains one of the following characters (i.e. " double quote, ' single quote, \ backslash, or ) right parenthesis), the call fails. The run_code function, which is vendor-proprietary, cannot handle these characters, so I need to replace each of them with an "entity reference" and then restore it once we get past this wall (run_code). Thus I want to do the following:

Given this call...

run_code(Class/Method("my argument with special chars is: )'"\", "next arg", "last arg"))

needs to be turned into this

run_code(Class/Method("my argument with special chars is: #rpar;#apos;#quot;#bksl;", "next arg", "last arg"))

where
) -> #rpar;
\ -> #bksl;
' -> #apos;
" -> #quot;

I have a wonderful substring replacement function that I wrote, which would work great if I could break this into tokens successfully. Replacing the single quote and backslash really isn't a problem because they are not special within the function call itself; I can just do a substring replacement for them over the whole string. The problem is the " double quote and the ) right parenthesis.

What I need is something that will read this string and give me a substring that I can do replacements on, where the substring is delimited by "...", or "..."))
That is, give me a substring if it begins with ", has some stuff, and is followed either by ", " (I include the comma because there may be a " in the argument) or by ")) with nothing after the two parentheses except white space (signaling the end of the call, i.e. the last argument).
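The rule described above can be sketched in C as a single pass with lookahead. This is only a sketch under stated assumptions, not a tested solution for the vendor's parser: the names escape_call, closes_argument, entity_of and skip_spaces are hypothetical, and the closing test is slightly looser than the rule in the question (optional white space around the comma, and one or more right parentheses at the end of the call rather than exactly two).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the entity for c, or NULL if c is not special. */
static const char *entity_of(char c)
{
    switch (c) {
    case ')':  return "#rpar;";
    case '\\': return "#bksl;";
    case '\'': return "#apos;";
    case '"':  return "#quot;";
    default:   return NULL;
    }
}

static const char *skip_spaces(const char *p)
{
    while (*p != '\0' && isspace((unsigned char)*p))
        p++;
    return p;
}

/* q points at a '"' seen inside an argument. Heuristic (an assumption):
 * it closes the argument if it is followed by a comma and another '"',
 * or by nothing but ')' characters and white space up to the end. */
static int closes_argument(const char *q)
{
    const char *p = skip_spaces(q + 1);
    if (*p == ',')
        return *skip_spaces(p + 1) == '"';
    if (*p != ')')
        return 0;
    while (*p == ')')
        p++;
    return *skip_spaces(p) == '\0';
}

/* Returns a malloc'd copy of call with the special characters inside each
 * quoted argument replaced by entities; the caller frees the result. */
char *escape_call(const char *call)
{
    assert(call != NULL);
    /* Every entity is 6 characters long, so 6x the input always fits. */
    char *out = malloc(strlen(call) * 6 + 1);
    if (out == NULL)
        return NULL;
    char *p = out;
    int in_arg = 0;
    for (const char *s = call; *s != '\0'; s++) {
        if (!in_arg) {
            if (*s == '"')
                in_arg = 1;          /* opening quote of an argument */
            *p++ = *s;
        } else if (*s == '"' && closes_argument(s)) {
            in_arg = 0;              /* real closing quote: keep it */
            *p++ = *s;
        } else {
            const char *ent = entity_of(*s);
            if (ent != NULL) {
                memcpy(p, ent, 6);
                p += 6;
            } else {
                *p++ = *s;
            }
        }
    }
    *p = '\0';
    return out;
}
```

The heuristic is inherently ambiguous: an argument whose own text contains ", " followed by a quote, or that ends in ") at the very end of the call, will be split in the wrong place. A literal # in an argument is also left alone, so text that already looks like #quot; cannot be told apart from an encoded quote after decoding.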

Sorry if that is confusing; it is confusing me too, which is why I need help. In simple words, I need a way to replace the characters " ' ) and \ when they are used as characters in the argument portion of the call below.

run_code(Class/Method("my argument with special chars is: )'"\", "next arg", "last arg"))

Any help, thoughts, or ideas...?? pseudo code, examples greatly welcomed... This should be highly flexible in case the user

Thanks in advance

rechard
 
jcaldwel Commented:
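// C sketch (was pseudo-C): split str on whitespace and decode the
// #rpar; #bksl; #apos; #quot; entities inside each token.
// addToken() is assumed to exist elsewhere and to copy the string it is given.

#include <stdbool.h>
#include <string.h>

#define MAXTOKEN     1024
#define MAXESCAPE    8
#define SYNTAX_ERROR (-1)

void addToken( const char* token );

int tokenize( const char* str )
{
  bool inEscape = false;
  bool collecting = false;
  char collector[ MAXTOKEN ];
  char escapeseq[ MAXESCAPE ];
  size_t collidx = 0;
  size_t escidx = 0;
  size_t len = strlen( str );

  for( size_t i = 0; i < len; i++ )
  {
    char c = str[i];

    if( inEscape )
    {
      if( c != ';' )
      {
        // Still inside "#...;": collect the entity name (without the '#')
        if( escidx >= MAXESCAPE - 1 )
          return SYNTAX_ERROR;
        escapeseq[ escidx++ ] = c;
        continue;
      }
      // End of the entity: NUL-terminate and decode it
      inEscape = false;
      escapeseq[ escidx ] = 0;
      if( strcmp( escapeseq, "rpar" ) == 0 )
        c = ')';
      else if( strcmp( escapeseq, "bksl" ) == 0 )
        c = '\\';
      else if( strcmp( escapeseq, "apos" ) == 0 )
        c = '\'';
      else if( strcmp( escapeseq, "quot" ) == 0 )
        c = '"';
      else
        return SYNTAX_ERROR;
    }
    else if( c == '#' )
    {
      // Start of an entity; the '#' itself is not stored
      inEscape = true;
      escidx = 0;
      continue;
    }
    else if( c == ' ' || c == '\t' || c == '\n' )
    {
      // Token delimiter: hand over the finished token, if any
      if( collecting )
      {
        collector[ collidx ] = 0;
        addToken( collector );
      }
      collecting = false;
      collidx = 0;
      continue;
    }

    // Ordinary or decoded character: append it to the current token
    if( collidx >= MAXTOKEN - 1 )
      return SYNTAX_ERROR;
    collector[ collidx++ ] = c;
    collecting = true;
  }

  if( inEscape )
    return SYNTAX_ERROR;   // unterminated "#..." at end of input
  if( collecting )
  {
    // Flush the last token
    collector[ collidx ] = 0;
    addToken( collector );
  }
  return 0;
}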
 
rechard (Author) Commented:
after thinking about this some more....

I am going to attempt to make this clearer...

assume the following function call...

SomeClassA/SomeMethodB("arg 1 with special characters " ' \ )", "arg2", "arg ...", "arg n")

In the above call the function I am feeding this call to has problems interpreting the special characters " ' \ ), double quote, single quote, backslash, and right parenthesis... thus I need to convert these "special characters" to a reference code until I can pass it through the function creating the problem...

where
) -> #rpar;
\ -> #bksl;
' -> #apos;
" -> #quot;

After I have made it "through" the problematic proprietary function call that has this special-character limitation, I will decode them back to their intended values.
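For the decoding step on the far side of run_code, a single pass over the string is enough. The following is a minimal sketch, assuming the four entities listed above are the only ones in use; decode_entities is a hypothetical name, and any other text starting with # is copied through unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd copy of src with #rpar; #bksl; #apos; #quot; turned
 * back into ) \ ' and ". The caller frees the result. */
char *decode_entities(const char *src)
{
    static const struct { const char *name; char ch; } table[] = {
        { "#rpar;", ')'  },
        { "#bksl;", '\\' },
        { "#apos;", '\'' },
        { "#quot;", '"'  },
    };
    const size_t count = sizeof table / sizeof table[0];

    assert(src != NULL);
    /* Decoding only ever shrinks the text, so the input length suffices. */
    char *out = malloc(strlen(src) + 1);
    if (out == NULL)
        return NULL;

    char *p = out;
    while (*src != '\0') {
        size_t k;
        for (k = 0; k < count; k++) {
            /* Each entity is exactly 6 characters long. */
            if (strncmp(src, table[k].name, 6) == 0)
                break;
        }
        if (k < count) {
            *p++ = table[k].ch;   /* known entity: emit the character */
            src += 6;
        } else {
            *p++ = *src++;        /* anything else is copied as-is */
        }
    }
    *p = '\0';
    return out;
}
```

Because a literal # is not itself encoded in this scheme, an argument that originally contained the text #quot; would come back as a double quote; encoding # as well would be one way to avoid that.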

Replacing the backslash and single quote is not a problem because they do not have any "significance" in the call. The real problem is with the ) right parenthesis and the " double quote, because both have significance in the function call (i.e. double quotes delimit arguments and the right parenthesis ends the function call).

Any ideas?

Thanks in advance

rechard
 
rechard (Author) Commented:
jcaldwel, thanks for the suggestion. I guess I am being a little dense. Is it possible to get a more in-depth code example?

anyone who responds to this please provide an example if possible...

thanks

rechard
 
rechard (Author) Commented:
jcaldwel, I think I get what you were showing, but the problem is that the " double quotes are argument delimiters and the right parenthesis is the end-of-function delimiter. I can't simply use these as delimiters because they can also occur inside each argument string, so I need a way to treat this like a normal function call. That is, when you make a function call in C, C is smart enough to know that the string "This is \"really cool\"" is supposed to have quotes around really cool, and thus leaves them there. I need something that is somewhat intuitive: something that knows what is intended to be treated as a string argument whether or not it contains special characters.

adding more points for whoever can help me here...

thanks
rechard
 
jcaldwel Commented:
If I were to do a language tokenizer from scratch, I would probably do a list of some sort (linked list or array) of a "token" struct. Each "token" would contain a string for the actual value of the token, and an int type for the token.

I would put an entire literal string into a single token with a type of STRING_LITERAL, sans the quotes. Then I am free to put the escaped characters into the value explicitly because the type already tells me that it is a literal string.

I would probably put "(" and ")" in tokens by themselves as types OPEN_FUNC_PRAM and CLOSE_FUNC_PRAM.
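The token list described here might look roughly like the following in C. This is an illustrative sketch only: the struct layout, the OTHER_TOKEN type, and the token_append / token_free_all helpers are assumptions, and an enum is used where the description says an int type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Token kinds; OTHER_TOKEN is an assumed catch-all for names, commas, etc. */
enum token_type {
    STRING_LITERAL,
    OPEN_FUNC_PRAM,
    CLOSE_FUNC_PRAM,
    OTHER_TOKEN
};

struct token {
    enum token_type type;
    char *value;            /* for STRING_LITERAL: the text without quotes */
    struct token *next;
};

/* Appends a copy of value to the end of the list. Returns the new token,
 * or NULL if memory could not be allocated. */
struct token *token_append(struct token **head, enum token_type type,
                           const char *value)
{
    assert(head != NULL && value != NULL);
    struct token *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->value = malloc(strlen(value) + 1);
    if (t->value == NULL) {
        free(t);
        return NULL;
    }
    strcpy(t->value, value);
    t->type = type;
    t->next = NULL;

    /* Walk to the last link and attach the new token there. */
    while (*head != NULL)
        head = &(*head)->next;
    *head = t;
    return t;
}

/* Frees every token in the list along with its value. */
void token_free_all(struct token *head)
{
    while (head != NULL) {
        struct token *next = head->next;
        free(head->value);
        free(head);
        head = next;
    }
}
```

Because the type says whether a value is a literal string, a STRING_LITERAL token can hold ) or " in its value without being confused with a CLOSE_FUNC_PRAM token or an argument delimiter.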
 
jmcg (Owner) Commented:
Nothing has happened on this question in more than 10 months. It's time for cleanup!

My recommendation, which I will post in the Cleanup topic area, is to
accept answer by jcaldwel [grade B] (good answer but asker seems unsatisfied).

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

jmcg
EE Cleanup Volunteer