Link to home
Create AccountLog in
Avatar of Mike R.
Mike R.

asked on

Segfault 11 on long string, PRIOR to accessing string, only when string > 14

ANSI c on OSX 10.13.6 <br>
Apple LLVM version 9.1.0 (clang-902.0.39.2)
Target: x86_64-apple-darwin17.7.0
Thread model: posix

I'm learning `c`

This is a function that manually (character-by-character) adds two character strings representing large numbers (that exceed the unsigned long long or double size).

It functions fine with any two strings 14 or less characters long, but segmentation fault 11 with any strings greater than 14 chars.

Changing the string's memory allocation method seems to have no effect (I.e. from
char[15] addend1; // not a ptr

Open in new window

to
char *addend1 = (char *) malloc(sizeof(char) * (16) ); // pointer

Open in new window


One things that's curious, is that it seems to segfault on the ...
for (int j = maxlength - 1 ; j >= 0; j--)

Open in new window

... prior to accessing either of `addend1` or `addend2`, but I'm not able to find an error there or change it to prevent the segfault.  

Am I misreading where the error arises, or could it be related to the for loop?

MAIN.c

	#include <stdio.h>
	#include <stdlib.h>
	#include "../../c-library/include/addViaStrings.h"
	
	int main(void) {
		//	s[0] = 72; s[1] = 101; s[2] = 108; s[3] = 108; s[4] = 111; s[5] = 32; s[6] = 87; s[7] = 111; s[8] = 114; s[9] = 108; s[10] = 100; s[11] = 0;
	
		// WORKS
		// char s1[] = "14073748835532";
		// char s2[] = "14073748835532";
	
		// FAILS
		char s1[] = "140737488355328";
		char s2[] = "140737488355328";
	
		char *sum = addNumericStrings(&s1, &s2);
		printf("main.sum = %s\n", sum);
	}

Open in new window


addViaStrings.h

	char* addNumericStrings(char *s1, char *s2){
		// Find the length of the greater of the two
		int maxlength = findMaxLength(s1, s2);
		printf("maxlength = %d\n", maxlength); //333
	
		///////////////////////////////////////////////
		// Using malloc instead of char[maxlength] seems to have NO EFFECT on the issue
		// char addend1[maxlength]; // not a pointer
		char *addend1 = (char *) malloc(sizeof(char) * (maxlength + 1) );
		// char addend2[maxlength]; // not a pointer
		char *addend2 = (char *) malloc(sizeof(char) * (maxlength + 1) );
	
		// Allocate sum pointer
		printf("char *sum = (char *) malloc(sizeof(char) * (maxlength + 1) ) ... "); //333
		char *sum = (char *) malloc(sizeof(char) * (maxlength + 1) );
		printf("DONE\n"); //333
	
		// General use vars
		int a1, a2, total;
		int carry = 0;
	
		// Prepare the strings for manual addition. Pad the left with char 0s
		leftPad(addend1, s1, maxlength);
		leftPad(addend2, s2, maxlength);
	
		// Buffer sum with zeros
		sum[maxlength + 1] = 0; // end flag
		printf("for (int i = 0; i < (maxlength); i++) { sum[i] = '0'; } ... "); //333
		for (int i = 0; i < (maxlength); i++) { sum[i] = '0'; } // Fill w/ 0s
		printf("DONE\n"); //333
	
		// Run the manual addition
		// Start adding individual ints from end (right side)
		printf("Start adding individual ints from end (right side) ...\n"); //333
		for (int j = maxlength - 1 ; j >= 0; j--) {
			///////////////////////////////////////////
			// The segfault seems to happen BEFORE accessing addend1 or addend2
			printf("%d ..."); // 333 This DOES NOT print
			///////////////////////////////////////////
			a1 = addend1[j] - '0'; // Convert to int
			a2 = addend2[j] - '0'; // Convert to int
			total = (a1 + a2 + carry);
			carry = 0;
			if ( total >= 10){
				carry += 1;
				total -= 10;
			}
			sum[j + 1] = '0'+total; // convert to ascii value for numbers (adding 48)
		}
		sum[0] = '0' + carry; // add last carry to start of num always, even if 0
		// Before returning, truncate leading zeros
		char *returnsum = (char *) malloc(sizeof(char) * (strlen(sum) + 1) );
		int sum_i = 0;
		int returnsm_i = 0;
	//    bool truncate = true; // Find out why this wont compile
		int truncate = 1; // true
		while (1){
			// if order is important here
			if (sum[sum_i] == '\0') { break; } // we're done
			if (sum[sum_i] == '0' && truncate == 1) { sum_i += 1; continue; } // 1 is true
			// if a num, Stop truncating 0s but DO continue adding numbers
			if (sum[sum_i] != '0') { truncate = 0; } // 0 is false
			returnsum[returnsm_i] = sum[sum_i];
			returnsm_i += 1;
			sum_i += 1;
		}
		return returnsum;
	}
[\code]

[i][b]Successful run (less than 15 chars)[/b][/i]
[code]
    maxlength = 14
    char *sum = (char *) malloc(sizeof(char) * (maxlength + 1) ) ... DONE
    for (int i = 0; i < (maxlength); i++) { sum[i] = '0'; } ... DONE
    Start adding individual ints from end (right side) ...
    13 ...12 ...11 ...10 ...9 ...8 ...7 ...6 ...5 ...4 ...3 ...2 ...1 ...0 ...main.sum = 28147497671064

Open in new window


UNSuccessful run (15 chars)
    maxlength = 15
    char *sum = (char *) malloc(sizeof(char) * (maxlength + 1) ) ... DONE
    for (int i = 0; i < (maxlength); i++) { sum[i] = '0'; } ... DONE
    Start adding individual ints from end (right side) ...

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of phoffric
phoffric

Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
See answer
Avatar of Mike R.
Mike R.

ASKER

Yes. It was (technically) the result of maxlength.

Actually, formally, the allocation of of the memory for sum was
char *sum = (char *) malloc(sizeof(char) * (maxlength + 1) ) 

Open in new window

and it should have been
char *sum = (char *) malloc(sizeof(char) * (maxlength + 2) )

Open in new window

Normally we don't do this:
(maxlength + 2)

Instead we set maxlength to the correct string length and add 1.
SOLUTION
Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
Avatar of Mike R.

ASKER

@sarabande
Very nice. Thank you for taking the time to write that better code.
>> there is the rule that dynamic pointers should be freed in the same function where they have been created

This is a good rule to follow if you think of the function as being the owner of the allocated memory. But as with many rules there are exceptions. And on many projects I've seen so many exceptions, that I would call this rule more as a guide than a rule.

For example, if you have a C factory, then the function that allocates the memory passes the ownership to the caller and that function will likely take responsibility for freeing the memory.

There are other schemes where huge data chunks are allocated by a function that takes in null pointers and allocates the memory based on input arguments. Again, this allocator does not own the memory and will not free it. It may be responsible for touching each page before giving ownership to a higher level.

When documenting your design about memory allocations, you should note the ownership of the memory.
if you have a C factory, then the function that allocates the memory passes the ownership to the caller

Open in new window


yes, same is with malloc which creates a pointer or any other function where the main functionality is to provide a new pointer. it is easy to determine whether a function that returns a new pointer also passes ownership or not.

the function here was to calculate the sum of two string numbers and returns it as a string. if the function returns a char pointer which must be freed by the caller, the caller must know this what without doubt is bad design.

schemes where huge data chunks need to allocated based on parameters may have one function which does the allocation, but this function is only a helper that replaces a malloc call in C or a new operation in C++.  you easily could make this function only return the required size and do the allocation by the owner. or define the pointers as shared or global data what also clarifies the question of ownership.

Sara
ANSI c on OSX 10.13.6

i would assume you were not using ansi c compiler but a c++ compiler.

in ansi c all variables must be defined at top of a {} block without any normal c statements between. but in your original post there are printf statements above new char [] variables.

also a statement like

for (int j = maxlength - 1 ; j >= 0; j--) {

Open in new window


normally doesn't compile with an ansi c compiler.

in c++ you could use std::string class for strings and and not char pointers (or char buffers).

Sara
>> you easily could make this function only return the required size and do the allocation by the owner
Yes you could. As a designer you can do whatever you want, unless there are unecessary "rules" that force you to do it your way. Now all the designers on the team are forced to have the caller do the allocations and free the memory in that same function unless they get a waiver from the "rule".
as told it is a bad design if you return the result of  a function as a pointer that must be freed by the caller. i am a programmer since 40 years but this "rule" has nothing to do with paternalism but makes your life happier.  even for a factory or a private memory heap, you should provide a mechanism which disposes the caller from freeing memory. i don't know why you insist that this is unnecessary it is vital instead. give me any code where you ignore that and i will turn it to better code.

Sara
At this point I can't believe we are talking about the same general software engineering principles so we can stop.
Avatar of Mike R.

ASKER

@sarabande
"as told it is a bad design if you return the result of  a function as a pointer that must be freed by the caller."

I come from a Python background (so I'm used to creating an object once, towards the beginning - in Main() if you will - and passing it around as need be) ...

BUT, I'm learning C for embedded uses (not C++ for large software...but C for very small memory environments that need to be fast).

So, based on those two thoughts - What if you want this data value to be reusable by other functions? Should old data allocations be freed, and then recreated again, over and over?
SOLUTION
Link to home
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.