goodluck11
asked on
String.Split Method (Char()) problem
We download a file that can be in either tab-delimited or comma-delimited format.
We then split each line in C# and process the data.
Some fields themselves contain commas (,) and some contain tabs.
How can we avoid splitting in the wrong place inside a field, and how do we know where the next real field or record starts?
Sample, comma-delimited:
101,Penthouse ...Corner...Directly on intracoastal, with three BRs, 145 ne 1 ave, etcetc
The comma after "intracoastal" is not a correct split point.
This file does not follow the conventions of a delimited format file.
The delimiter should be a character or combination of characters that cannot be part of the data, so this has to be controlled at the point where the file is created.
Note that tabs and commas are used most of the time because they fit most uses, but you can use any character. Once again, this has to be defined at the point where the file is created.
You could use any of the following formats, in which the separators are respectively #, "," and xyz.
101#Penthouse ...Corner...Directly on intracoastal, with three BRs# 145 ne 1 ave# etcetc
101","Penthouse ...Corner...Directly on intracoastal, with three BRs"," 145 ne 1 ave"," etcetc
101xyzPenthouse ...Corner...Directly on intracoastal, with three BRsxyz 145 ne 1 avexyz etcetc
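If the producer of the file can be changed to use such a separator, the reading side becomes a plain split. A minimal sketch, assuming the "xyz" sample line above (the class and method names `DelimiterDemo` and `SplitOn` are made up for illustration):

```csharp
using System;

public static class DelimiterDemo
{
    // Split on a separator that may be longer than one character.
    // The string[] overload of Split is used because the char overload
    // cannot express a multi-character separator such as "xyz".
    public static string[] SplitOn(string line, string separator)
    {
        return line.Split(new[] { separator }, StringSplitOptions.None);
    }

    public static void Main()
    {
        string line = "101xyzPenthouse ...Corner...Directly on intracoastal, with three BRsxyz 145 ne 1 avexyz etcetc";

        // Commas inside the description are left alone; only "xyz" splits.
        foreach (string field in SplitOn(line, "xyz"))
        {
            Console.WriteLine("[" + field.Trim() + "]");
        }
    }
}
```

This only works if the chosen separator is guaranteed never to occur in the data, which is the point made above.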
This could be done if some rules exist on how the record is constructed.
For example:
- first field is three characters (as shown)
- second field is minimum twenty characters (as shown)
- third field contains between one and five digits (not shown)
- fourth field can be empty or hold up to four characters, a mix of digits and
uppercase letters (not shown)
- etc.
/gustav
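Rules of this kind can be encoded as one regular expression that pins down the well-defined fields and lets the free-text field absorb any embedded commas. A rough sketch, assuming exactly the four example rules above and a made-up record: the values `33062` and `F73` for the third and fourth fields are hypothetical, since the thread does not show them, and the name `RuleBasedSplit` is invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RuleBasedSplit
{
    // field 1: exactly three characters
    // field 2: at least twenty characters (may itself contain commas)
    // field 3: one to five digits
    // field 4: empty, or up to four digits / uppercase letters
    private static readonly Regex RecordPattern = new Regex(
        @"^(.{3}),(.{20,}),(\d{1,5}),([A-Z0-9]{0,4})$");

    // Returns the four fields, or null when the line breaks the rules.
    public static string[] Split(string line)
    {
        Match m = RecordPattern.Match(line);
        if (!m.Success)
        {
            return null;
        }
        return new[]
        {
            m.Groups[1].Value,
            m.Groups[2].Value,
            m.Groups[3].Value,
            m.Groups[4].Value
        };
    }

    public static void Main()
    {
        // Hypothetical record that satisfies the example rules.
        string line = "101,Penthouse ...Corner...Directly on intracoastal, with three BRs,33062,F73";
        string[] fields = Split(line);
        Console.WriteLine(fields == null ? "no match" : string.Join(" | ", fields));
    }
}
```

The approach stays ambiguous whenever the free text can end in something that itself looks like the trailing fields, so it is a fallback for files whose format cannot be fixed at the source.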
ASKER
How come that when we open the file in Excel, Excel is able to separate the columns correctly for both commas and tabs?
How can we do that in C#?
ASKER
So we have the code below, and we are going to call it from protected void Page_Load(object sender, EventArgs e).
Where exactly do we put it? The compiler reports:
Error 1 Extension method must be defined in a non-generic static class C:\Users\Documents\Visual Studio 2010\Projects\Default.aspx.cs 21 22
public static string[] SplitWithQualifier(this string text,
    char delimiter,
    char qualifier,
    bool stripQualifierFromResult)
{
    string pattern = string.Format(
        @"{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))",
        Regex.Escape(delimiter.ToString()),
        Regex.Escape(qualifier.ToString())
    );
    string[] split = Regex.Split(text, pattern);
    if (stripQualifierFromResult)
        return split.Select(s => s.Trim().Trim(qualifier)).ToArray();
    else
        return split;
}
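The compiler error quoted above arises because a method with a `this` parameter is an extension method, and C# only allows those inside a static, non-generic class; the page's code-behind class is not static. One way around it, sketched here under the assumption that a separate helper class is acceptable (the names `StringExtensions` and `QualifierDemo` are made up, and the sample line is a shortened, invented record):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Extension methods must live in a static, non-generic class,
// so the method goes here rather than inside the page class.
public static class StringExtensions
{
    public static string[] SplitWithQualifier(this string text,
        char delimiter,
        char qualifier,
        bool stripQualifierFromResult)
    {
        // Match a delimiter only when an even number of qualifiers follows,
        // i.e. when the delimiter is not inside a qualified (quoted) field.
        string pattern = string.Format(
            @"{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))",
            Regex.Escape(delimiter.ToString()),
            Regex.Escape(qualifier.ToString()));

        string[] split = Regex.Split(text, pattern);
        return stripQualifierFromResult
            ? split.Select(s => s.Trim().Trim(qualifier)).ToArray()
            : split;
    }
}

public static class QualifierDemo
{
    public static void Main()
    {
        // Invented, shortened record with quoted fields containing commas.
        string line = "601,140 NE 28 AVE # 601,POMPANO,33062,\"Directly on intracoastal, nice view\",\"3,500\",2";

        foreach (string field in line.SplitWithQualifier(',', '"', true))
        {
            Console.WriteLine(field);
        }
    }
}
```

With the helper class in its own .cs file in the project, Page_Load can call it on any string, for example `string[] fields = line.SplitWithQualifier(',', '"', true);`.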
> How come when we open the file on excel, excel is able to
> separate the columns correctly for , commas and tabs ..
That is not possible with the sample you provided:
> 101,Penthouse ...Corner...Directly on intracoastal, with three BRs, 145 ne 1 ave, etcetc
But thanks for your solution tested with your secret test data.
/gustav
ASKER
secret test data
601,140 NE 28 AVE # 601,POMPANO,33062,"Penthouse ...Corner...Directly on intracoastal, beautiful view of the intracoastal and ocean, nice building, walking distance to beach and food stores, restos etc....SPECIAL PRICE FROM MAY TO NOVEMBER $2,500 or Best Offer...LETS MAKE A DEAL....Short term ","3,500",2,2,0,1100,"Penthouse corner unit ...Directly on intracoastal,Walking distance to beach, Fantastic view from both sides,close to resto, food store, etc....",F738942,16,pompano yacht & beach clu
PH F,1390 S OCEAN BL # PH F,POMPANO,33062,"SPECTACULAR FOREVER OCEAN VIEW FROM THIS SE CORNER PENTHOUSE. BUILDING DIRECTLY ON THE SAND. EAT-IN-KITCHEN, NEW CARPET IN BEDROOMS, MARBLE IN FOYER, LIVING AREA, 3 BALCONIES (EAST, SOUTH AND WEST), NEARLY 3000 SQ. FT. 2 GARAGE PARKING SPACES. SEMI-PRIVAT","3,500",3,3,0,0,SPECTACULAR FOREVER OCEAN VIEW FROM THIS SE CORNER PENTHOUSE. GARAGE PARKING. SEMI PRIVATE ELEV. 24 HR SEC. FEELS LIKE A HOME IN THE SKY,F1008585,14,THE WITTINGTON CONDO
The best way to make people want to help you is to let them work for free for you, then come up with a stupid "secret" and not give the points to anybody.
"The best way to make people NOT want to help ..." - ha? :)
ASKER
solution found
One common way is to escape commas when creating the file; see http://stackoverflow.com/questions/769621/dealing-with-commas-in-a-csv-file .
Another way is to put comma-containing values between quotation marks. However, this raises other issues (how to handle quotation marks inside the values...).
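For the quotation-mark issue, the usual CSV convention is to write a literal quote inside a quoted field as two quotes in a row. A small hand-written parser for one line can honor that; the sketch below assumes this doubled-quote convention and a record that fits on a single line (the name `CsvLine` and the sample record are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLine
{
    // Splits one line on the delimiter, ignoring delimiters inside
    // double-quoted fields and turning "" inside a quoted field into ".
    public static List<string> Parse(string line, char delimiter)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote is a literal quote
                    i++;                 // skip the second quote
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);   // delimiters are plain text here
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == delimiter)
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());  // last field has no trailing delimiter
        return fields;
    }

    public static void Main()
    {
        string line = "601,\"Directly on intracoastal, nice view\",\"3,500\",\"called \"\"The Point\"\"\",2";
        foreach (string field in Parse(line, ','))
        {
            Console.WriteLine(field);
        }
    }
}
```

Passing '\t' as the delimiter handles the tab-delimited variant the same way. Quoted fields that span several lines are not covered by this sketch; for that, a ready-made reader such as Microsoft.VisualBasic.FileIO.TextFieldParser (with HasFieldsEnclosedInQuotes set to true) may be a better fit.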