Solved

Replace string with string in unsigned char array

Posted on 2008-10-23
Last Modified: 2013-12-14
Hi

I have a dynamically allocated unsigned char array. I want a piece of code that can find, for example, the word Test in this array, replace it with a longer string such as Test22222, and reallocate the unsigned char array to make room for it.

Please advise.

Thanks from now!
Question by:CSecurity
32 Comments
 
LVL 53

Expert Comment

by:Infinity08
ID: 22785425
>> Test word in this array and replace that with larger string

You cannot simply replace it. If the buffer is big enough to hold the extra bytes, then you can move the part after the word x bytes to the right, and then insert the replacement word.
If the buffer is not big enough, you'll have to either realloc it to a big enough size, or just create a new buffer, and copy the data into it.
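
The first case (the buffer already has spare room) can be sketched as follows. This is only an illustration under stated assumptions: the helper name replace_in_place and its parameters are hypothetical and do not come from the thread, and the caller is assumed to track both the used length and the total capacity of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: replace old_len bytes at offset pos with the new_len
   bytes at repl, inside a buffer that holds *len used bytes out of cap.
   Returns 0 on success, -1 if the arguments are inconsistent or the result
   would not fit (the caller then needs a bigger buffer). */
int replace_in_place(unsigned char *buf, size_t *len, size_t cap,
                     size_t pos, size_t old_len,
                     const unsigned char *repl, size_t new_len)
{
    if (*len > cap || pos > *len || old_len > *len - pos)
        return -1;
    if (new_len > cap - (*len - old_len))   /* result would exceed cap */
        return -1;

    size_t tail = *len - pos - old_len;     /* bytes after the old word */

    /* memmove, not memcpy: source and destination overlap */
    memmove(buf + pos + new_len, buf + pos + old_len, tail);
    memcpy(buf + pos, repl, new_len);

    *len = *len - old_len + new_len;
    return 0;
}
```

When the function returns -1 because of a capacity shortfall, the second case applies: allocate a larger buffer and copy the pieces into it.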
0
 
LVL 45

Accepted Solution

by:
Kdo earned 250 total points
ID: 22785793
Hi CSecurity.

It's only a few steps to make the change, but it's critical that all of the steps occur, in order, and the proper cleanup takes place.  And you'll need to decide what to do if the string occurs more than once.

Given that you have a string called Old and you want to replace the first occurrence, you'll need to do this:

1)  Search the string for the target string.
2)  If the string does not occur, exit.
3)  Determine the length of string Old.
4)  Determine the length of the target string.
5)  Determine the length of the replacement string.
6)  Allocate a buffer large enough for the new string (after the replacement).
7)  Copy the Old string, up to where the target string starts, to the New string.
8)  Copy the replacement string to the New string.
9)  Copy the rest of the Old string, starting after the target string, to the New string.

Afterwards, you'll want to free the Old string and assign the New string to the variable that contained the pointer to the Old string.
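
For the open question of what to do when the target occurs more than once, one option is to replace every occurrence. The sketch below assumes ordinary zero-terminated strings (so it is not suitable for binary data), and the name replace_all is hypothetical. It follows the same steps as the list above, with a counting pass first so the new buffer can be sized in one allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'ed copy of src with every
   non-overlapping occurrence of target replaced by repl, or NULL on
   allocation failure or an empty target. The caller frees the result. */
char *replace_all(const char *src, const char *target, const char *repl)
{
    size_t tlen = strlen(target), rlen = strlen(repl);
    if (tlen == 0)
        return NULL;

    /* Pass 1: count the occurrences to size the new buffer. */
    size_t count = 0;
    for (const char *q = src; (q = strstr(q, target)) != NULL; q += tlen)
        count++;

    char *out = (char *)malloc(strlen(src) - count * tlen + count * rlen + 1);
    if (!out)
        return NULL;

    /* Pass 2: copy the pieces between matches, inserting repl each time. */
    char *dst = out;
    const char *p = src, *hit;
    while ((hit = strstr(p, target)) != NULL) {
        memcpy(dst, p, (size_t)(hit - p));
        dst += hit - p;
        memcpy(dst, repl, rlen);
        dst += rlen;
        p = hit + tlen;
    }
    strcpy(dst, p);   /* remaining tail, including the terminating '\0' */
    return out;
}
```

Unlike the steps above, this version leaves the old string untouched and returns a new one, so the caller decides when to free the original.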


Good Luck,
Kent
0
 
LVL 11

Assisted Solution

by:alexcohn
alexcohn earned 250 total points
ID: 22785807
To make the code cleaner, I cast from unsigned char to char before calling the replace() function.

#include <stdlib.h>
#include <string.h>

void replace(char** parray, const char* to_find, const char* replace_with);

unsigned char *array = (unsigned char *)strdup("string with word Test inside");

/* use the original array */

replace((char**)&array, "Test", "Test22222");

/* use the array after replacements */

free(array);

...

/* Replace the first occurrence of to_find in *parray with replace_with.
   *parray must be a malloc'ed, zero-terminated string. */
void replace(char** parray, const char* to_find, const char* replace_with)
{
    const char* found = strstr(*parray, to_find);
    if (found)
    {
        size_t prefix = found - *parray; /* bytes before the match */
        /* +1 for the terminating '\0' */
        char *tmpbuf = (char*)malloc(strlen(*parray) - strlen(to_find) + strlen(replace_with) + 1);
        if (!tmpbuf)
            return;
        memcpy(tmpbuf, *parray, prefix);
        strcpy(tmpbuf + prefix, replace_with);
        strcpy(tmpbuf + prefix + strlen(replace_with), found + strlen(to_find));
        free(*parray);
        *parray = tmpbuf;
    }
    return;
}

0
 
LVL 17

Author Comment

by:CSecurity
ID: 22789114
Thank you all, and thanks Alex for the great code. Just one problem: I have non-printing chars like char 157, and I can't cast those to char... Any ideas?
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 22789136
You cannot use the string functions on binary data, as they would get confused by the null bytes that might be in there.

Use memcpy instead, for example.
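
A small sketch of the difference, using made-up sample data (the payload array and the function names below are hypothetical): strcpy stops at the first zero byte, while memcpy copies exactly the number of bytes it is told to.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample data: 8 payload bytes with a zero byte at index 2. */
static const unsigned char payload[8] = { 'a', 'b', 0x00, 'T', 'e', 's', 't', 0x9D };

/* Number of leading bytes of copy that equal payload. */
static size_t matching_prefix(const unsigned char *copy)
{
    size_t n = 0;
    while (n < sizeof payload && copy[n] == payload[n])
        n++;
    return n;
}

/* Copies payload once with strcpy and once with memcpy and reports how
   many leading bytes of each copy are correct. */
void compare_copies(size_t *str_ok, size_t *mem_ok)
{
    unsigned char a[8], b[8];
    memset(a, 0xFF, sizeof a);   /* filler so uncopied bytes are visible */
    memset(b, 0xFF, sizeof b);

    strcpy((char *)a, (const char *)payload);  /* stops after the zero at index 2 */
    memcpy(b, payload, sizeof payload);        /* copies all 8 bytes */

    *str_ok = matching_prefix(a);
    *mem_ok = matching_prefix(b);
}
```

With this data the strcpy copy is correct only for the first 3 bytes (up to and including the zero), while the memcpy copy matches all 8. The same applies to strlen, strcat and strstr: each of them treats the first zero byte as the end of the data.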
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22789588
Thank you Infinity, can you modify Alex's code to use memcpy etc. as you say? Thank you so much
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 22789616
Just follow Kdo's step-by-step plan, and you should be fine :)
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22790739
Is there any ready code for it? Can anyone help me for this? Thanks
0
 
LVL 11

Expert Comment

by:alexcohn
ID: 22792074
You should not worry about unprintable characters, like 157. The only limitation of strcpy() and other functions is that they cannot handle strings that contain zero characters ('\0').
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22792083
But I have 0 char also
0
 
LVL 45

Expert Comment

by:Kdo
ID: 22792147
Hi CSecurity,

Is this classwork related?


0
 
LVL 11

Expert Comment

by:alexcohn
ID: 22792171
If you have '\0', how do you determine the word lengths? How do you determine the actual length of your original array, to start with?
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22794038
I have length in another variable... It's unsigned char array and I have length of it in another variable.

Kdo, no, I'm too old to have classwork :-)
0
 
LVL 45

Expert Comment

by:Kdo
ID: 22795612

"Too Old"  :)   A lot of that going on around here.   :)


Since the question was first asked, a bit more detail has been offered that might cloud things.

If the "string" that you want to examine has binary data that could contain bytes with a value of 0, the string functions won't work.  Similarly, there is no really good built-in search function that I know of to see if a "string" is contained within the buffer.

So let's get the last couple of details ironed out.

-  Is the buffer to be edited really a zero-terminated string or is it a binary buffer?
-  Is the object that you're trying to find in the buffer really a string or is it a binary buffer?
-  Is the object that you're trying to put into the buffer really a string or is it a binary buffer?


With these answers, the code's pretty easy to put together.  Alex already provided one example.


kent
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22795902
Yes, it's a TCP packet, contains all chars including zero char.
I'm going to find some words and replace them with other words. The word I'm looking for consists entirely of printable chars, like "test", and I'll replace it with printable chars, like test2222.

That's all. I want code that works. I'm not so good at C++ coding; in another language I would have coded it myself, but when it comes to C++ and unsigned char arrays, on which strcpy, strcat, etc. don't work, it's hard for me. If possible, please point me to example code on the internet that does this, or please modify Alex's code to work with unsigned chars...

Thank you all...
0
 
LVL 45

Expert Comment

by:Kdo
ID: 22796055
TCP packet -- one more complication.....

I'm assuming the packaging of the data within a packet will be external to this process?  It's asking an awful lot of a routine to modify embedded data AND maintain the integrity of the packet(s).

If the data to be edited is still in the packet, a semi-static buffer is in order to keep the integrity of the current packet.  If the data has already been unpacked to a buffer, programmer preference prevails.  :)

The last question is, what happens if the target string occurs more than once in the buffer?

Kent
0
 
LVL 11

Expert Comment

by:alexcohn
ID: 22796061
If you are working with binary data (a.k.a. a byte stream), the code I published above essentially holds. Instead of char*, pass structures that contain an unsigned char* and a length. Instead of strcpy(), use memcpy(). You cannot use strcat(); copy to an explicit offset in the new buffer instead. And finally, you need a find() function to replace strstr().

The function is oversimplified to demonstrate the principle; it is very far from optimal.

#include <string.h>

typedef struct
{
    unsigned char* buf;
    size_t len;
} bytestream;

const unsigned char* find(bytestream haystack, bytestream needle)
{
    const unsigned char* candidate;
    if (needle.len > haystack.len)
        return NULL;
    /* <= so that a match at the very end of the haystack is also found */
    for (candidate = haystack.buf; candidate <= haystack.buf + haystack.len - needle.len; candidate++)
    {
        if (0 == memcmp(candidate, needle.buf, needle.len))
            return candidate;
    }
    return NULL;
}

0
 
LVL 11

Expert Comment

by:alexcohn
ID: 22796256
Oops, I only just saw your new details. I agree with Kdo that keeping packet integrity could pose a problem. But if the words to be looked for and replaced are printable (the only true limitation is that they must contain no '\0'), you can use my original code with one simple change: pass the array length explicitly.

#include <stdlib.h>
#include <string.h>

const char* memstr(const char* array, size_t arraylen, const char* to_find);
void replace(char** parray, size_t* parraylen, const char* to_find, const char* replace_with);

static const char data[] = "string with zeros\0 and word Test inside";
size_t arraylen = sizeof(data) - 1; /* length without the final '\0' */
unsigned char *array = (unsigned char *)malloc(arraylen);
memcpy(array, data, arraylen);

/* use the original array */

replace((char**)&array, &arraylen, "Test", "Test22222");

/* use the array after replacements */

free(array);

...

const char* memstr(const char* array, size_t arraylen, const char* to_find)
{
    const char* candidate;
    size_t findlen = strlen(to_find);
    if (findlen > arraylen)
        return NULL;
    for (candidate = array; candidate <= array + arraylen - findlen; candidate++)
    {
        if (0 == strncmp(candidate, to_find, findlen))
            return candidate;
    }
    return NULL;
}

void replace(char** parray, size_t* parraylen, const char* to_find, const char* replace_with)
{
    const char* found = memstr(*parray, *parraylen, to_find);
    if (found)
    {
        char *tmpbuf = (char*)malloc(*parraylen - strlen(to_find) + strlen(replace_with));
        memcpy(tmpbuf, *parray, found - *parray);
        strcpy(tmpbuf + (found - *parray), replace_with);
        /* caution: strcpy stops at the first '\0', so a binary tail is not
           copied correctly; see the follow-up comments below */
        strcpy(tmpbuf + (found - *parray) + strlen(replace_with), found + strlen(to_find));
        free(*parray);
        *parray = tmpbuf;
        *parraylen += strlen(replace_with) - strlen(to_find);
    }
    return;
}

0
 
LVL 17

Author Comment

by:CSecurity
ID: 22796660
Don't worry about packet integrity. My code works properly, except that the code I wrote to do the replace can't reallocate space for the extra chars, so it overwrites the next bytes. Everything else works properly...

Alex, thank you so much for your last comment, but you are casting the array to char again. Are you sure it will not cause data loss and the 0 char problem?
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 22796771
>> Don't worry about packet integrity. My code works properly, except that the code I wrote to do the replace can't reallocate space for the extra chars, so it overwrites the next bytes. Everything else works properly...

Did you update the checksums ?
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22796811
Yes, everything works properly, just as I said my code overwrites bytes
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 22796829
Ok, so what problem do you still have then ?
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22796882
Alex's last code again casts the unsigned char array to char. I'm afraid it will lose the 0 char and maybe some other data... Is that correct?
0
 
LVL 11

Expert Comment

by:alexcohn
ID: 22796963
The cast to signed char in this case is absolutely legitimate and causes no side effects.
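
A quick way to check this claim is sketched below (the function name is hypothetical). Casting the pointer only changes the type through which the bytes are viewed; the stored bytes, including 0 and 157, stay the same. Whether plain char is signed is implementation-defined, so reading byte 157 through a char* may yield a negative number, but converting it back to unsigned char gives 157 again.

```c
#include <string.h>

/* Returns 1 if copying through a char* view preserves every byte value. */
int cast_preserves_bytes(void)
{
    unsigned char src[256], dst[256];
    for (int i = 0; i < 256; i++)
        src[i] = (unsigned char)i;       /* every possible byte, 0 included */

    char *view = (char *)src;            /* same bytes, different type */
    memcpy(dst, view, sizeof dst);       /* length-based copy via the view */

    return memcmp(src, dst, sizeof src) == 0
        && (unsigned char)view[157] == 157;
}
```

The data loss being worried about comes from the str* functions, which stop at a zero byte, and not from the cast itself.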
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 22796967
But why do you need his code ? You said that your code already works ?
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22797003
Alex, so you think with your last code I'll not lose any data, right?

Infinity, my code overwrites extra bytes: when I replace test with test2222, the 2222 is written over the next bytes; that's why I need advice
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 22797037
>> that's why I need advice

So, you SHOULD worry about data integrity, and what you said in comment #22796660 does not apply.

You don't want to overwrite, but you want to insert. Which means that not only will the checksums change, but the size of the packet will also increase, potentially to the point where it needs to be split up. And that causes a whole other series of problems, since the packet sequence id's will no longer be valid, including those of the next packets.
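
To illustrate the checksum point: the building block of the IP and TCP checksums is the Internet checksum described in RFC 1071, sketched below under the name inet_checksum (a hypothetical name). This is only the generic algorithm over one byte range. A real TCP checksum additionally covers a pseudo-header (source and destination addresses, protocol and TCP length), and the length fields in the headers must be updated as well, none of which is shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over len bytes: the 16-bit one's complement
   of the one's complement sum of the data, read as big-endian 16-bit words. */
uint16_t inet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
        sum = (sum & 0xFFFF) + (sum >> 16);   /* fold the carry back in */
    }
    if (i < len)                              /* odd trailing byte, zero-padded */
        sum += (uint32_t)data[i] << 8;
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Because every payload byte feeds into the sum, inserting extra bytes changes the result, so the checksum has to be recomputed after the replacement.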
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22797157
Don't worry about data integrity, I know what I'm saying... The data will not be corrupted... How about Alex's last code?
0
 
LVL 45

Expert Comment

by:Kdo
ID: 22797306
Hi CSecurity,

There seems to be only one small hitch.  Note a couple of key lines:

        memcpy(tmpbuf, *parray, found - *parray);
        strcpy(tmpbuf + (found - *parray), replace_with);
        strcpy(tmpbuf + (found - *parray) + strlen(replace_with), found + strlen(to_find));

The first line copies up to the target string, the second line copies the replacement string, and the last line copies the data after the target string.  But since the buffer could contain binary data, the copy could be "short".

        memcpy (tmpbuf + (found - *parray) + strlen (replace_with), found + strlen (to_find), *parraylen - (found - *parray) - strlen (to_find));

I believe that the line above should replace the last line in the block of code above.

Kent

0
 
LVL 11

Expert Comment

by:alexcohn
ID: 22797485
Kdo, thanks for this reminder... I forgot this second memcpy when I was modifying my original code.

Apart from that, binary data in the input stream will be copied correctly. Remember: if your to_find and/or replace_with items may contain zero chars, the strcpy and strlen functions cannot be used anymore. Also, the code snippet should be considered an illustration; it does lots of unnecessary work and was not carefully debugged (see the flaw that Kdo just found).
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22802438
I modified your code because I got about 5 errors. Here is my modification:

const char* memstr(const char* array, size_t arraylen, const char* to_find)
{
    const char* candidate;
    for (candidate = array; candidate < array + arraylen - strlen(to_find); candidate++)
    {
        if (0 == strncmp(candidate, to_find, strlen(to_find)))
            return candidate;
    }
    return NULL;
}
 
void replace(char** parray, size_t* parraylen, const char* to_find, const char* replace_with)
{
    const char* found = memstr(*parray, *parraylen, to_find);
    if (found)
    {
        char *tmpbuf = (char*)malloc(*parraylen - strlen(to_find) + strlen(replace_with));
        memcpy(tmpbuf, *parray, found - *parray);
        strcpy(tmpbuf + (found - *parray), replace_with);
       
        memcpy (tmpbuf + (found - *parray) + strlen (replace_with), found + strlen (to_find), *parraylen - (found - *parray) - strlen (replace_with));
        free(*parray);
        *parray = tmpbuf;
        *parraylen += strlen(replace_with) - strlen(to_find);
    }
    return;
}



I also do this:
size_t mylen = (size_t) tcplen;

because tcplen is an int, so I do this conversion.

I get a lot of runtime exceptions and access violations.
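
The thread does not record the final fix, so the following is only a sketch of how the two functions above could be made length-based throughout. It keeps the names memstr and replace, but the int return value is a change introduced here. Two things in the posted snippet stand out. The last memcpy computes its length with strlen(replace_with) where the number of bytes after the match is *parraylen - (found - *parray) - strlen(to_find); when the match sits close to the end of the buffer, that size_t subtraction can wrap around to a huge value. Separately, free(*parray) is only valid if the buffer came from malloc(), which may not hold for a packet buffer handed over by a capture library. Whether either of these caused the reported access violations cannot be determined from the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Find the zero-terminated word to_find inside a buffer of arraylen bytes
   that may itself contain zero bytes. Returns NULL if not found. */
const char *memstr(const char *array, size_t arraylen, const char *to_find)
{
    size_t flen = strlen(to_find);
    if (flen == 0 || flen > arraylen)
        return NULL;
    for (size_t i = 0; i + flen <= arraylen; i++)
        if (memcmp(array + i, to_find, flen) == 0)
            return array + i;
    return NULL;
}

/* Replace the first occurrence of to_find. *parray must come from malloc().
   Returns 1 if replaced, 0 if not found, -1 on allocation failure. */
int replace(char **parray, size_t *parraylen,
            const char *to_find, const char *replace_with)
{
    const char *found = memstr(*parray, *parraylen, to_find);
    if (!found)
        return 0;

    size_t flen = strlen(to_find), rlen = strlen(replace_with);
    size_t prefix = (size_t)(found - *parray);   /* bytes before the match */
    size_t tail = *parraylen - prefix - flen;    /* bytes after the match */
    size_t newlen = prefix + rlen + tail;

    char *tmpbuf = (char *)malloc(newlen ? newlen : 1);
    if (!tmpbuf)
        return -1;
    memcpy(tmpbuf, *parray, prefix);
    memcpy(tmpbuf + prefix, replace_with, rlen); /* no '\0' is written */
    memcpy(tmpbuf + prefix + rlen, found + flen, tail);

    free(*parray);
    *parray = tmpbuf;
    *parraylen = newlen;
    return 1;
}
```

Using memcpy for all three pieces means no terminating '\0' is ever written past the end of the new buffer, and zero bytes before or after the match are copied like any other byte.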
0
 
LVL 17

Author Comment

by:CSecurity
ID: 22802618
I wrote a piece of code and solved the problem but thank you all for your time and help
0


Question has a verified solution.
