Solved

Replace string with string in unsigned char array

Posted on 2008-10-23
Last Modified: 2013-12-14
Hi

I have a dynamic unsigned char array. I want a piece of code that can find, for example, the word Test in this array, replace it with a larger string like Test22222, and re-allocate the space for the unsigned char array.

Please advise.

Thanks in advance!
0
Question by:CSecurity
32 Comments
 
LVL 53

Expert Comment

by:Infinity08
>> Test word in this array and replace that with larger string

You cannot simply replace it. If the buffer is big enough to hold the extra bytes, then you can move the part after the word x bytes to the right, and then insert the replacement word.
If the buffer is not big enough, you'll have to either realloc it to a big enough size, or just create a new buffer, and copy the data into it.
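The in-place variant described above (grow the buffer, shift the tail to the right, drop in the replacement) could be sketched as below. This is only an illustration: splice_bytes is a hypothetical helper name, and it assumes the buffer came from malloc(), that the replacement bytes do not live inside that buffer, and that the caller already knows the offset of the word to replace.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replaces old_len bytes at offset pos in *pbuf with
 * the new_len bytes at rep, resizing the buffer with realloc().
 * *pbuf must come from malloc(); *plen is its length in bytes.
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
int splice_bytes(unsigned char **pbuf, size_t *plen, size_t pos,
                 size_t old_len, const unsigned char *rep, size_t new_len)
{
    if (pos > *plen || old_len > *plen - pos)
        return -1;
    size_t tail = *plen - pos - old_len;        /* bytes after the old word */
    size_t out_len = *plen - old_len + new_len;
    unsigned char *buf = *pbuf;

    if (new_len > old_len) {
        /* Grow first, then shift the tail to the right. */
        unsigned char *p = realloc(buf, out_len);
        if (p == NULL)
            return -1;                          /* original buffer is untouched */
        buf = p;
        memmove(buf + pos + new_len, buf + pos + old_len, tail);
    } else if (new_len < old_len) {
        /* Shift the tail to the left first, then shrink. */
        memmove(buf + pos + new_len, buf + pos + old_len, tail);
        if (out_len > 0) {
            unsigned char *p = realloc(buf, out_len);
            if (p != NULL)
                buf = p;                        /* a failed shrink keeps the old block */
        }
    }
    if (new_len > 0)
        memcpy(buf + pos, rep, new_len);
    *pbuf = buf;
    *plen = out_len;
    return 0;
}
```

memmove() is used for the shift because the source and destination ranges overlap; memcpy() is only safe for the replacement bytes, which come from a different buffer.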
0
 
LVL 45

Accepted Solution

by:
Kdo earned 250 total points
Hi CSecurity.

It's only a few steps to make the change, but it's critical that all of the steps occur, in order, and the proper cleanup takes place.  And you'll need to decide what to do if the string occurs more than once.

Given that you have a string called Old and you want to replace the first occurrence, you'll need to do this:

1)  Search the string for the target string.
2)  If the string does not occur, exit.
3)  Determine the length of string Old.
4)  Determine the length of the target string.
5)  Determine the length of the replacement string.
6)  Allocate a buffer large enough for the new string (after the replacement).
7)  Copy the Old string, up to where the target string starts, to the New string.
8)  Copy the replacement string to the New string.
9)  Copy the rest of the Old string, starting after the target string, to the New string.

Afterwards, you'll want to free the Old string and assign the New string to the variable that contained the pointer to the Old string.


Good Luck,
Kent
0
 
LVL 11

Assisted Solution

by:alexcohn
alexcohn earned 250 total points
To make the code cleaner, I used a cast from unsigned char to char before calling the replace() function.
#include <stdlib.h>
#include <string.h>

void replace(char** parray, const char* to_find, const char* replace_with);

unsigned char *array = (unsigned char *)strdup("string with word Test inside");

/* use the original array */

replace((char**)&array, "Test", "Test22222");

/* use the array after replacements */

free(array);

...

/* Replaces the first occurrence of to_find in the malloc'ed string *parray. */
void replace(char** parray, const char* to_find, const char* replace_with)
{
    const char* found = strstr(*parray, to_find);
    if (found)
    {
        size_t prefix = (size_t)(found - *parray); /* bytes before the match */
        /* +1 leaves room for the terminating '\0' */
        char *tmpbuf = (char*)malloc(strlen(*parray) - strlen(to_find) + strlen(replace_with) + 1);
        if (!tmpbuf)
            return; /* allocation failed; the original string is left untouched */
        memcpy(tmpbuf, *parray, prefix);
        strcpy(tmpbuf + prefix, replace_with); /* also writes the '\0' that strcat needs */
        strcat(tmpbuf, found + strlen(to_find)); /* or strcpy(tmpbuf + prefix + strlen(replace_with), found + strlen(to_find)); */
        free(*parray);
        *parray = tmpbuf;
    }
    return;
}


0
 
LVL 17

Author Comment

by:CSecurity
Thank you all, and thanks Alex for the great code. Just one problem: I have non-printing chars like char 157, and I can't cast those to char... Any ideas?
0
 
LVL 53

Expert Comment

by:Infinity08
You cannot use the string functions on binary data, as they would get confused by the null bytes that might be in there.

Use memcpy instead, for example.
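A small sketch of why this matters, assuming a made-up 10-byte buffer with a zero byte in the middle: strlen() stops at that byte, while memcpy() copies exactly the count it is given. The names packet, visible_to_string_functions and copy_all_bytes are invented for the example.

```c
#include <string.h>

/* Hypothetical demo buffer: 10 bytes, including a zero byte and byte 157. */
static const unsigned char packet[] = { 'a', 'b', 'c', 0, 'T', 'e', 's', 't', 157, 'z' };

/* What the string functions see: everything up to the first zero byte. */
size_t visible_to_string_functions(void)
{
    return strlen((const char *)packet);
}

/* What memcpy() copies: exactly as many bytes as it is told to.
 * dst must have room for sizeof packet bytes. */
size_t copy_all_bytes(unsigned char *dst)
{
    memcpy(dst, packet, sizeof packet);
    return sizeof packet;
}
```

Byte 157 survives the copy unchanged; only the zero byte is a problem, and only for functions that rely on a terminator.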
0
 
LVL 17

Author Comment

by:CSecurity
Thank you Infinity, can you modify Alex's code to use memcpy etc. as you say? Thank you so much
0
 
LVL 53

Expert Comment

by:Infinity08
Just follow Kdo's step-by-step plan, and you should be fine :)
0
 
LVL 17

Author Comment

by:CSecurity
Is there any ready code for it? Can anyone help me for this? Thanks
0
 
LVL 11

Expert Comment

by:alexcohn
You should not worry about unprintable characters, like 157. The only limitation of strcpy() and other functions is that they cannot handle strings that contain zero characters ('\0').
0
 
LVL 17

Author Comment

by:CSecurity
But I also have 0 chars in the data.
0
 
LVL 45

Expert Comment

by:Kdo
Hi CSecurity,

Is this classwork related?


0
 
LVL 11

Expert Comment

by:alexcohn
If you have '\0', how do you determine the word lengths? How do you determine the actual length of your original array, to start with?
0
 
LVL 17

Author Comment

by:CSecurity
I have the length in another variable... It's an unsigned char array, and I keep its length in a separate variable.

Kdo, no, I'm too old to have classwork :-)
0
 
LVL 45

Expert Comment

by:Kdo

"Too Old"  :)   A lot of that going on around here.   :)


Since the question was first asked, a bit more detail has been offered that might cloud things.

If the "string" that you want to examine has binary data that could contain bytes with a value of 0, the string functions won't work.  Similarly, there is no really good built-in search function that I know of to see if a "string" is contained within the buffer.

So let's get the last couple of details ironed out.

-  Is the buffer to be edited really a zero-terminated string or is it a binary buffer?
-  Is the object that you're trying to find in the buffer really a string or is it a binary buffer?
-  Is the object that you're trying to put into the buffer really a string or is it a binary buffer?


With these answers, the code's pretty easy to put together.  Alex already provided one example.


kent
0
 
LVL 17

Author Comment

by:CSecurity
Yes, it's a TCP packet; it contains all chars, including the zero char.
I'm going to find some words and replace them with other words, but the word I'm looking for consists entirely of printable chars, like "test", and I'll replace it with printable chars, like test2222.

That's all. I want code that works. I'm not so good at C++ coding; in another language I would have coded this myself, but when it comes to C++ and unsigned char arrays, on which strcpy, strcat etc. don't work, it's hard for me. If possible, please point me to example code on the internet that does this, or please modify Alex's code to work with unsigned chars...

Thank you all...
0
 
LVL 45

Expert Comment

by:Kdo
TCP packet -- one more complication.....

I'm assuming the packaging of the data within a packet will be external to this process?  It's asking an awful lot of a routine to modify embedded data AND maintain the integrity of the packet(s).

If the data to be edited is still in the packet, a semi-static buffer is in order to keep the integrity of the current packet.  If the data has already been unpacked to a buffer, programmer preference prevails.  :)

The last question is, what happens if the target string occurs more than once in the buffer?
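If every occurrence should be replaced, one possible approach is two passes: count the matches to size the new buffer, then copy. The sketch below assumes non-overlapping matches scanned left to right, non-NULL pointers, and a buffer whose length is tracked separately; find_bytes and replace_all are hypothetical names, not functions from this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: first occurrence of needle (nlen bytes) in hay (hlen bytes). */
static const unsigned char *find_bytes(const unsigned char *hay, size_t hlen,
                                       const unsigned char *needle, size_t nlen)
{
    if (nlen == 0 || nlen > hlen)
        return NULL;
    for (size_t i = 0; i + nlen <= hlen; i++)
        if (memcmp(hay + i, needle, nlen) == 0)
            return hay + i;
    return NULL;
}

/* Returns a new malloc'ed buffer in which every occurrence of needle is
 * replaced by rep; *out_len receives its length. The input is not modified.
 * Returns NULL if allocation fails. */
unsigned char *replace_all(const unsigned char *buf, size_t len,
                           const unsigned char *needle, size_t nlen,
                           const unsigned char *rep, size_t rlen,
                           size_t *out_len)
{
    /* Pass 1: count matches so the result can be sized exactly. */
    size_t count = 0;
    const unsigned char *p = buf, *end = buf + len, *hit;
    while ((hit = find_bytes(p, (size_t)(end - p), needle, nlen)) != NULL) {
        count++;
        p = hit + nlen;
    }
    size_t total = len - count * nlen + count * rlen;
    unsigned char *out = malloc(total ? total : 1);
    if (out == NULL)
        return NULL;

    /* Pass 2: copy each gap between matches, then the replacement. */
    unsigned char *w = out;
    p = buf;
    while ((hit = find_bytes(p, (size_t)(end - p), needle, nlen)) != NULL) {
        memcpy(w, p, (size_t)(hit - p));
        w += hit - p;
        memcpy(w, rep, rlen);
        w += rlen;
        p = hit + nlen;
    }
    memcpy(w, p, (size_t)(end - p));   /* whatever follows the last match */
    *out_len = total;
    return out;
}
```

Because the result is built in a fresh buffer, the caller decides when to free the old one, which fits the case where the original packet has to stay intact.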

Kent
0
 
LVL 11

Expert Comment

by:alexcohn
If you are working with binary data (a.k.a. a byte stream), the code I published above essentially holds. Instead of char*, pass structures that contain an unsigned char* and a length. Instead of strcpy(), use memcpy(). You cannot use strcat(); use the explicit-offset variant from the comments instead. And finally, you need a find() function to replace strstr().

The function is oversimplified to demonstrate the principle; it is very far from optimal.
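#include <string.h>

typedef struct
{
    unsigned char* buf;
    size_t len;
} bytestream;

/* Returns a pointer to the first occurrence of needle in haystack, or NULL. */
const unsigned char* find(bytestream haystack, bytestream needle)
{
    const unsigned char* candidate;
    if (needle.len > haystack.len)
        return NULL; /* avoids unsigned wrap-around in the bound below */
    /* <= so that a match ending exactly at the end of the buffer is found */
    for (candidate = haystack.buf; candidate <= haystack.buf + haystack.len - needle.len; candidate++)
    {
        if (0 == memcmp(candidate, needle.buf, needle.len))
            return candidate;
    }
    return NULL;
}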


0
 
LVL 11

Expert Comment

by:alexcohn
Oops, I only just saw your new details. I agree with Kdo that keeping packet integrity could pose a problem. But if the words to be looked for and replaced are printable (the only true limitation is that they must contain no '\0'), you can use my original code with just one simple change.
#include <stdlib.h>
#include <string.h>

void replace(char** parray, size_t* parraylen, const char* to_find, const char* replace_with);

static const char initial[] = "string with zeros\0 and word Test inside";
size_t arraylen = sizeof initial - 1; /* 39 bytes; the literal's own terminator is not counted */
unsigned char *array = (unsigned char *)malloc(arraylen);
memcpy(array, initial, arraylen);

/* use the original array */

replace((char**)&array, &arraylen, "Test", "Test22222");

/* use the array after replacements */

free(array);

...

/* Like strstr(), but bounded by arraylen instead of stopping at the first '\0' in array. */
const char* memstr(const char* array, size_t arraylen, const char* to_find)
{
    const char* candidate;
    if (strlen(to_find) > arraylen)
        return NULL;
    for (candidate = array; candidate <= array + arraylen - strlen(to_find); candidate++)
    {
        if (0 == strncmp(candidate, to_find, strlen(to_find)))
            return candidate;
    }
    return NULL;
}

void replace(char** parray, size_t* parraylen, const char* to_find, const char* replace_with)
{
    const char* found = memstr(*parray, *parraylen, to_find);
    if (found)
    {
        /* +1 because strcpy() also writes a terminating '\0' */
        char *tmpbuf = (char*)malloc(*parraylen - strlen(to_find) + strlen(replace_with) + 1);
        memcpy(tmpbuf, *parray, found - *parray);
        strcpy(tmpbuf + (found - *parray), replace_with);
        strcpy(tmpbuf + (found - *parray) + strlen(replace_with), found + strlen(to_find));
        free(*parray);
        *parray = tmpbuf;
        *parraylen += strlen(replace_with) - strlen(to_find);
    }
    return;
}


0
 
LVL 17

Author Comment

by:CSecurity
Don't worry about packet integrity; my code works properly. It's just that the code I wrote to do the replace can't re-allocate space for the extra chars, so it overwrites the next bytes. Everything else works...

Alex, thank you so much for your last comment, but you are casting the array to char again. Are you sure it will not cause data loss or the 0 char problem?
0
 
LVL 53

Expert Comment

by:Infinity08
>> Don't worry about packet integrity; my code works properly. It's just that the code I wrote to do the replace can't re-allocate space for the extra chars, so it overwrites the next bytes. Everything else works...

Did you update the checksums ?
0
 
LVL 17

Author Comment

by:CSecurity
Yes, everything works properly; as I said, my code just overwrites bytes.
0
 
LVL 53

Expert Comment

by:Infinity08
Ok, so what problem do you still have then ?
0
 
LVL 17

Author Comment

by:CSecurity
Alex's last code again casts the unsigned char array to char. I'm afraid it will lose the 0 char and maybe some other data... Is that correct?
0
 
LVL 11

Expert Comment

by:alexcohn
The cast to signed char in this case is absolutely legitimate and causes no side effects.
0
 
LVL 53

Expert Comment

by:Infinity08
But why do you need his code ? You said that your code already works ?
0
 
LVL 17

Author Comment

by:CSecurity
Alex, so you think I won't lose any data with your last code, right?

Infinity, my code overwrites extra bytes: when I replace test with test2222, the 2222 is written over the next bytes; that's why I need advice
0
 
LVL 53

Expert Comment

by:Infinity08
>> that's why I need advice

So, you SHOULD worry about data integrity, and what you said in http:#22796660 does not apply.

You don't want to overwrite; you want to insert. That means that not only will the checksums change, but the size of the packet will also increase, potentially to the point where it needs to be split up. And that causes a whole other series of problems, since the packet sequence IDs will no longer be valid, including those of the following packets.
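For reference, a sketch of the 16-bit one's-complement sum described in RFC 1071, which the IP and TCP checksums are built on. It assumes the whole range to be summed sits in one contiguous, packet-sized buffer; for a real TCP segment the sum also has to cover a pseudo-header, which is left out here. internet_checksum is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum in the style of RFC 1071 over len bytes.
 * The 32-bit accumulator is ample for packet-sized buffers (under 64 KB). */
uint16_t internet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]); /* big-endian 16-bit word */
    if (i < len)
        sum += (uint32_t)data[i] << 8;                   /* odd trailing byte, zero-padded */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);             /* fold the carries back in */
    return (uint16_t)~sum;
}
```

Any byte inserted into the payload changes this value, so it has to be recomputed after the replacement, along with the length fields in the headers.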
0
 
LVL 17

Author Comment

by:CSecurity
Don't worry about data integrity, I know what I'm saying... The data will not be corrupted... How about Alex's last code?
0
 
LVL 45

Expert Comment

by:Kdo
Hi CSecurity,

There seems to be only one small hitch.  Note a couple of key lines:

        memcpy(tmpbuf, *parray, found - *parray);
        strcpy(tmpbuf + (found - *parray), replace_with);
        strcpy(tmpbuf + (found - *parray) + strlen(replace_with), found + strlen(to_find));

The first line copies up to the target string, the second line copies the replacement string, and the last line copies the data after the target string.  But since the buffer could contain binary data, the copy could be "short": strcpy stops at the first zero byte.

        memcpy (tmpbuf + (found - *parray) + strlen (replace_with), found + strlen (to_find), *parraylen - (found - *parray) - strlen (to_find));

I believe that the line above should replace the last line in the block of code above.  The length is the number of bytes left in the old buffer after the prefix and the target string.

Kent

0
 
LVL 11

Expert Comment

by:alexcohn
Kdo, thanks for this reminder... I forgot this second memcpy when I was modifying my original code.

Apart from that, binary data in the input stream will be copied correctly. Remember: if your to_find and/or replace_with items may contain zero chars, the strcpy and strlen functions can no longer be used. Also, the code snippet should be considered an illustration: it does lots of unnecessary work and was not carefully debugged (see the flaw that Kdo just found).
0
 
LVL 17

Author Comment

by:CSecurity
I modified your code because I got about 5 errors. Here is my modification:

const char* memstr(const char* array, size_t arraylen, const char* to_find)
{
    const char* candidate;
    for (candidate = array; candidate < array + arraylen - strlen(to_find); candidate++)
    {
        if (0 == strncmp(candidate, to_find, strlen(to_find)))
            return candidate;
    }
    return NULL;
}
 
void replace(char** parray, size_t* parraylen, const char* to_find, const char* replace_with)
{
    const char* found = memstr(*parray, *parraylen, to_find);
    if (found)
    {
        char *tmpbuf = (char*)malloc(*parraylen - strlen(to_find) + strlen(replace_with));
        memcpy(tmpbuf, *parray, found - *parray);
        strcpy(tmpbuf + (found - *parray), replace_with);
       
        memcpy (tmpbuf + (found - *parray) + strlen (replace_with), found + strlen (to_find), *parraylen - (found - *parray) - strlen (replace_with));
        free(*parray);
        *parray = tmpbuf;
        *parraylen += strlen(replace_with) - strlen(to_find);
    }
    return;
}



I also do this:
size_t mylen = (size_t) tcplen;


because tcplen is an int, so I do this conversion.


I get a lot of runtime exceptions and access violations.
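A few spots in the listing above look like possible causes, though without the calling code this is only a guess. The tail memcpy() subtracts strlen(replace_with) where the number of bytes after the match is *parraylen - (found - *parray) - strlen(to_find). The strcpy() writes a '\0' one byte past the end of tmpbuf when the match sits at the very end of the buffer. The search loop's bound wraps around if the buffer is shorter than the word. And free(*parray) is only valid if the packet buffer itself came from malloc(). A sketch that avoids the first three by using lengths throughout follows; memfind and replace_first are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Bounded search: first occurrence of the NUL-terminated word to_find in the
 * first len bytes of buf, or NULL. Zero bytes inside buf do not stop it. */
static unsigned char *memfind(unsigned char *buf, size_t len, const char *to_find)
{
    size_t n = strlen(to_find);
    if (n == 0 || n > len)
        return NULL;
    for (size_t i = 0; i + n <= len; i++)
        if (memcmp(buf + i, to_find, n) == 0)
            return buf + i;
    return NULL;
}

/* Replaces the first occurrence of to_find in *pbuf, a malloc'ed buffer of
 * *plen bytes. Returns 1 if replaced, 0 if not found, -1 if malloc failed. */
int replace_first(unsigned char **pbuf, size_t *plen,
                  const char *to_find, const char *replace_with)
{
    unsigned char *found = memfind(*pbuf, *plen, to_find);
    if (found == NULL)
        return 0;

    size_t prefix = (size_t)(found - *pbuf);   /* bytes before the match */
    size_t flen = strlen(to_find);
    size_t rlen = strlen(replace_with);
    size_t tail = *plen - prefix - flen;       /* bytes after the match */
    size_t newlen = prefix + rlen + tail;

    unsigned char *tmp = malloc(newlen ? newlen : 1);
    if (tmp == NULL)
        return -1;                             /* original buffer is untouched */
    memcpy(tmp, *pbuf, prefix);
    memcpy(tmp + prefix, replace_with, rlen);  /* no '\0' is written */
    memcpy(tmp + prefix + rlen, found + flen, tail);

    free(*pbuf);
    *pbuf = tmp;
    *plen = newlen;
    return 1;
}
```

With an unsigned char** parameter the call site needs no cast. If tcplen can ever be negative, it should be checked before the conversion to size_t, since a negative int becomes a very large length.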
0
 
LVL 17

Author Comment

by:CSecurity
I wrote a piece of code and solved the problem but thank you all for your time and help
0
