InquisitiveProgrammer

Help with substring function

Hello,

Here is my problem.

I have a string, let's say it is "Herman Newsome" and I want to convert it to first_name and a last_name strings.

I have already done the conversions so that the only upper case letters in the names are the first letter, so what I need to do now is capture the two values into different strings.

I can't just check for the first white space, because there might be a middle name in the string like "Herman von Newsome" so what I need to do is check for the next capital letter. I was thinking maybe checking unicode values, but I'm not sure if that is the best bet.

Any suggestions?

Thanks,
Jay
for_yan

I think something like this should work:

String s0 = "Herman von Newsome";

String[] ss = s0.split("\\s+");

String ssum = "";
ArrayList<String> ar = new ArrayList<String>();
for (String s1 : ss) {
    if (!Character.isLowerCase(s1.charAt(0))) {
        // Capitalized word: finish the current group and start a new one.
        ssum = ssum + " " + s1;
        ar.add(ssum);
        ssum = "";
    } else {
        // Lower-case word (e.g. "von"): hold it so it joins the next capitalized word.
        ssum = s1;
    }
}

System.out.println(ar);


I tested, and this works (see output):

String s300 = "Herman von Newsome";

String[] ss300 = s300.split("\\s+");

String ssum = "";
ArrayList<String> ar300 = new ArrayList<String>();
for (String s1300 : ss300) {
    if (!Character.isLowerCase(s1300.charAt(0))) {
        ssum = ssum + " " + s1300;
        ar300.add(ssum.trim());
        ssum = "";
    } else {
        ssum = s1300;
    }
}

System.out.println(ar300);


Output:
[Herman, von Newsome]

InquisitiveProgrammer

ASKER

I actually need for the two strings to be formatted like this:

First Name: Herman von
Last Name: Newsome
OK, we can change it
This will do it:

String s300 = "Herman von Newsome";

String[] ss300 = s300.split("\\s+");

String ssum = "";
ArrayList<String> ar300 = new ArrayList<String>();
for (String s1300 : ss300) {
    if (Character.isUpperCase(s1300.charAt(0))) {
        if (ssum.trim().length() != 0) {
            // A new capitalized word closes the group collected so far.
            ar300.add(ssum.trim());
            ssum = s1300;
        } else {
            ssum += " " + s1300;
        }
    } else {
        // Lower-case words stay attached to the preceding name.
        ssum += " " + s1300;
    }
}

// Add the last group, which the loop leaves pending.
if (ssum.trim().length() != 0) ar300.add(ssum.trim());

System.out.println(ar300);


Output:

[Herman von, Newsome]


Maybe this is a little easier to understand, though it works the same way; sometimes simple things require some not-so-simple logic.

       String s300 = "Herman von Newsome";

String [] ss300 = s300.trim().split("\\s+");

        ArrayList<String> ar302 = new ArrayList<String>();
        String ssum1 = "";

        for(int j=0; j<ss300.length; j++){
              if(Character.isUpperCase(ss300[j].charAt(0))){
                  if(ssum1.trim().length() != 0)ar302.add(ssum1.trim());
                  ssum1 = ss300[j];
              }   else {
                  ssum1 += " " + ss300[j];
              }

        }

        if(ssum1.trim().length() != 0)ar302.add(ssum1.trim());

        System.out.println("ar302: " + ar302);
        



Output:
ar302: [Herman von, Newsome]



Maybe another option is to use a regex look-ahead that matches only whitespace followed by an upper-case letter, but I'm not sure split will work with it.
I need for two strings to be output.

Maybe you can just tell me what I'm doing wrong with this code:

 
public static void divideString(String testString) {
		StringBuilder b = new StringBuilder(testString.length());
		
		for (char c : testString.toCharArray()) {
			if (c < 97) {
				firstNameIs = testString.substring(0,c).trim();
				lastNameIs = testString.substring(c+1).trim();
				System.out.println("The new First Name reads: " + firstNameIs);
				System.out.println("The new Last Name reads: " + lastNameIs);
			}
		}
	}


Outputting two strings is not a problem at all; see below:

        String s300 = "Herman von Newsome";

String [] ss300 = s300.trim().split("\\s+");

        ArrayList<String> ar302 = new ArrayList<String>();
        String ssum1 = "";

        for(int j=0; j<ss300.length; j++){
              if(Character.isUpperCase(ss300[j].charAt(0))){
                  if(ssum1.trim().length() != 0)ar302.add(ssum1.trim());
                  ssum1 = ss300[j];
              }   else {
                  ssum1 += " " + ss300[j];
              }

        }

        if(ssum1.trim().length() != 0)ar302.add(ssum1.trim());

        for(String s302 : ar302){
            System.out.println(s302);

        }



Herman von
Newsome


No, this has several problems.

You create a StringBuilder:
StringBuilder b = new StringBuilder(testString.length());
and never use it.

if (c < 97)  <-- what do you mean by this? (97 is the character code of 'a', so this test is also true for the space character, whose code is 32.)

And then:

testString.substring(0,c): substring takes two integers, so c will be interpreted as an int. Even if the compiler does not complain (a char can sometimes be used as an int), it will use the character's code as the index, and that will be far beyond the length of the string.
So no, this code will not work as written.

You can scan up to the first space with substring like this:

String firstStr = testString.substring(0, testString.indexOf(" "));

and you can split this way, but it gets more difficult when there is more than one space, or a tab instead of a space. No, split is definitely better in this respect.

Better to use the approach above.
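
If you do want to keep a character-by-character loop, here is a minimal sketch of how it could be repaired. The class name DivideStringSketch and the helper divide are made up for illustration, and the sketch assumes the last name is the final word that starts with an upper-case letter. The key change is that the loop variable is a position in the string, not the character itself, so it is always a valid argument for substring:

```java
public class DivideStringSketch {

    // Hypothetical helper (not from the original code): returns {firstName, lastName}.
    static String[] divide(String testString) {
        String s = testString.trim();
        // Walk backwards to find the last upper-case letter that starts a word.
        for (int i = s.length() - 1; i > 0; i--) {
            if (Character.isUpperCase(s.charAt(i))
                    && Character.isWhitespace(s.charAt(i - 1))) {
                // i is an index into the string, so substring accepts it safely.
                return new String[] { s.substring(0, i).trim(), s.substring(i) };
            }
        }
        // No second capitalized word found: treat the whole input as the first name.
        return new String[] { s, "" };
    }

    public static void main(String[] args) {
        String[] parts = divide("Herman von Newsome");
        System.out.println("The new First Name reads: " + parts[0]);
        System.out.println("The new Last Name reads: " + parts[1]);
    }
}
```

Character.isUpperCase replaces the c < 97 comparison, so spaces and digits are no longer mistaken for capital letters.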

ASKER CERTIFIED SOLUTION
for_yan
What is this doing here with the split?

String [] ss300 = s300.trim().split("\\s+");
The split method takes a regular expression as its argument.
The regular expression "\\s+" matches one or more consecutive whitespace characters (spaces, tabs, and so on).

So split finds every piece of the string that matches the regular expression and breaks the string into an array of the strings between those matches.

So in "Herman von Newsome" the spaces match as separators, and the string is split into three parts: "Herman" becomes array element 0, "von" element 1, and "Newsome" element 2.
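
To illustrate the point about "\\s+", here is a small sketch (the class name SplitWhitespaceSketch and the helper words are invented for this example) showing that runs of spaces and tabs are treated as a single separator:

```java
import java.util.Arrays;

public class SplitWhitespaceSketch {

    // Hypothetical helper: trims the input, then splits on runs of whitespace.
    static String[] words(String s) {
        return s.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Single spaces between the words.
        System.out.println(Arrays.toString(words("Herman von Newsome")));
        // Mixed tabs and repeated spaces yield the same three elements.
        System.out.println(Arrays.toString(words("  Herman \t von   Newsome ")));
    }
}
```

Both calls print [Herman, von, Newsome]. The trim() call matters: without it, leading whitespace would produce an empty first element.
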
Then I used a more sophisticated regex:
"\\s+(?=[A-Z])"


This matches only whitespace that is followed by an upper-case character, so it splits the line into two elements,
"Herman von"
and "Newsome", because the space before "von" does not match the regex and so does not split the string.
So this split immediately gives the result we want.

By the way, if we change it to:

"\\s+(?=[a-z])"



we'll just as easily get the split I did initially, into
"Herman" and "von Newsome"
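
A quick side-by-side sketch of the two look-ahead patterns, in case it helps to see them together. The class name LookaheadSplitSketch and the two helper methods are made up for this example; note that [A-Z] and [a-z] cover only unaccented ASCII letters:

```java
import java.util.Arrays;

public class LookaheadSplitSketch {

    // Split only at whitespace that is followed by an upper-case ASCII letter.
    static String[] splitBeforeUpper(String s) {
        return s.trim().split("\\s+(?=[A-Z])");
    }

    // Split only at whitespace that is followed by a lower-case ASCII letter.
    static String[] splitBeforeLower(String s) {
        return s.trim().split("\\s+(?=[a-z])");
    }

    public static void main(String[] args) {
        String name = "Herman von Newsome";
        System.out.println(Arrays.toString(splitBeforeUpper(name))); // [Herman von, Newsome]
        System.out.println(Arrays.toString(splitBeforeLower(name))); // [Herman, von Newsome]
    }
}
```

The look-ahead (?=...) only checks the next character without consuming it, which is why the letter after the separator stays in the result.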

How would I save my two results into separate strings?
What do you mean? They are already in separate strings:

        String s300 = "Herman von Newsome";

String [] ss300 = s300.trim().split("\\s+(?=[A-Z])");

// at this point: ss300[0] = "Herman von" and ss300[1] = "Newsome"

        for(String s303: ss300){
            System.out.println(s303);
        }



They are in two elements of the String array, so you can then write, for example:
String s1 = ss300[0];
String s2 = ss300[1];
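
One caveat: ss300[1] only exists if the split actually produced two elements. Below is a rough sketch of a guarded version. The class name NameParts and the method firstAndLast are hypothetical, it assumes Java 8 or later for String.join, and folding extra capitalized words into the first name is just one possible choice:

```java
import java.util.Arrays;

public class NameParts {

    // Hypothetical helper: returns {firstName, lastName} without risking
    // an ArrayIndexOutOfBoundsException on single-word input.
    static String[] firstAndLast(String fullName) {
        String[] parts = fullName.trim().split("\\s+(?=[A-Z])");
        if (parts.length == 1) {
            // Only one capitalized group: no last name to report.
            return new String[] { parts[0], "" };
        }
        // The last capitalized group is taken as the last name;
        // everything before it is joined back into the first name.
        String last = parts[parts.length - 1];
        String first = String.join(" ", Arrays.copyOfRange(parts, 0, parts.length - 1));
        return new String[] { first, last };
    }

    public static void main(String[] args) {
        String[] names = firstAndLast("Herman von Newsome");
        System.out.println("First Name: " + names[0]);
        System.out.println("Last Name: " + names[1]);
    }
}
```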


Thank you for your patience and help.

I appreciate it!
You are always welcome.