Link to home
Start Free TrialLog in
Avatar of adnankh
adnankhFlag for Australia

asked on

how to Delete extra delimeter "," in text file

HI Expert,

Ive a situation where we have huge dump and we need to be import these dump text files into BI Systems, My problem is our dump from source systems have many extra delimiters "," in line, so when we importing we getting error extra columns. I need to delete these extra delimiters in any editor by using regular expression or any other method if you can suggest it. Please see the below examples. The first two records are good ones and third records has many extra ","
How to remove those extra "," ?

Many Thanks
167773206,	Book,		EN,	AUD,	1492565,	HTC,	2013-09-13 00:00:00,	2013-09-16 00:00:00,	2044201141,	2013-08-25 17:01:40.080000000,	False,	3.5367,	False,		True,			False,	400,	AU,	778.03,	AUD,	778.03,	0,	1,	True,	EN,	550474231,	701.4718,	0,	0,	0,	0,	{CCE23FC4-5F3D-44FE-B7DD-932810A91D86},	0,	252950,	11618,	4,		2,	0,	21840,	0,	0,				NULL

80840528,	Book,		EN,	GBP,	1546177,	HDE,	2012-08-03 00:00:00,	2012-08-06 00:00:00,	3155834540,	2012-07-05 13:31:55.960000000,	False,	1.3537,	True,		False,			False,	400,	RO,	142.05,	GBP,	142.05,	0,	1,	True,	,	337512825,	222.8647,	0,	0,	0,	0,	{775B2E0E-5625-4B99-B498-5D711FCF0623},	0,	0,	2302,	,		,	,	1203,	0,	0,				NULL


198730174,      Book,,          EN,     PHP,    1751671,        AGD,    2014-04-09 00:00:00,    2014-04-12 00:00:00,   1168894580,     2014-02-16 20:02:26.137000000,     False,      1.772,     True,,              False,,,                    False,       300,US,  12600,  PHP,    12600,  0, 9,          True,  EN,     -2143808383,    280.5012,     3,      0,      0,      0,  {A9EA2F36-8582-4E28-AD3E-38D3D8115F86}, 0,    72806,  1471,   4,,             20,     0,      0,      0,      0,,-228393454,4

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Terry Woods
Terry Woods
Flag of New Zealand image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Note that you don't really need a regex for this; just repeating a standard search (for ,,) and replace (with ,) will probably give you the same result with a few repetitions.
Avatar of adnankh

ASKER

Hi Terry,

Thanks for your prompt reply, I didn't know how to tag you this question.. Anyway.. I tried but I'm losing good data as well..that is why I added question here...

Regards
Just mention TerryAtOpus and I think it should trigger an alert I have set up (not sure how many experts have done this! I've never asked...)

That makes the problem trickier. Would it solve it to limit the replacement to particular fields (eg the 7th and 10th, or where ever the problem occurs)? A regex could target particular fields.
I think the best you can do is to write a script that will s{,,+}{,}g and then count the columns and report if it isn't the right number.  Unless, as TerryAtOpus suggests, the issue is always within a certain character range, then it may be possible to be more targeted in the fixes and not remove good data.

Do you have any examples of rows where s{,,+}{,}g is removing good data as well?
Off-topic - TerryAtOpus, how do you setup an alert on your name?  I haven't looked for it specifically but I've never seen a way to do that (I just have alerts on certain topics/tags).
@wilcoxon, use these settings in a Saved Search:
Keywords: Include -> All these terms -> In any part: TerryAtOpus
Members: Exclude -> Any activity by: TerryAtOpus
Topics: Exclude -> These topics only: Cleanup
Title: Mentions of me
Alerts: Email notifications -> Every result

Hmm, I might put that into an article... :-)
Avatar of adnankh

ASKER

Thank you Terry.