Regular Expressions (Java pattern syntax): Need a transform recipe for a tab delimited file

Techies,
 I want to transform these 2 sample lines in a tab delimited file:

77785027532971      02/05/2017      G7FD20D37B77
77785027533003      02/06/2017      G74420D29220

into 3 separate variable values (@EmpId, @StartDate, @Cube) which will later be written to a database. The end values will need to look like this:

77785027532971
2017-02-05
G7FD20D37B77

How would I transform this using a regular expression (Java pattern)?
Asked by Paula DiTallo (Integration developer)

CEHJ commented:
You're able to do this in a Unix-based OS?
Hanno P.S. (IT Consultant and Infrastructure Architect) commented:
sed should do?
field #1 is the EmpId, consisting of any digits [0-9]*, delimited by tab \t
field #2 is the month .. (2 characters), delimited by /
field #3 is the day .. (2 characters), delimited by /
field #4 is the year .... (4 characters), delimited by tab \t
field #5 is everything after the second tab

The matching part is (white space added for readability only):
  ^   [0-9]*  \t  ..   /  ..  /  ....  \t   .*
To store parts in numbered groups, "(" and ")" are used. In sed's basic regex syntax, don't forget to escape these with "\":
  ^\([0-9]*\)\t\(..\)/\(..\)/....\)\t\(.*\)
sed -e 's!^\([0-9]*\)\t\(..\)/\(..\)/....\)\t\(.*\)!\1\t\4-\2-\3\t\5!' 
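For comparison, the same five-field breakdown can be sketched in Java, where capture groups are written with plain parentheses. This is only a minimal sketch: the class name `FieldCapture` and the method `parse` are made up for illustration, and it assumes every line has exactly the EmpId, MM/DD/YYYY date, and Cube layout from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class FieldCapture {
    // Groups: 1 = EmpId, 2 = month, 3 = day, 4 = year, 5 = Cube.
    // In Java, ( ) capture without backslashes.
    private static final Pattern LINE =
            Pattern.compile("^(\\d+)\\t(\\d{2})/(\\d{2})/(\\d{4})\\t(.*)$");

    /** Returns {empId, startDate as yyyy-MM-dd, cube}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String empId = m.group(1);
        String startDate = m.group(4) + "-" + m.group(2) + "-" + m.group(3);
        String cube = m.group(5);
        return new String[] { empId, startDate, cube };
    }

    public static void main(String[] args) {
        // Prints the three values on separate lines.
        for (String value : parse("77785027532971\t02/05/2017\tG7FD20D37B77")) {
            System.out.println(value);
        }
    }
}
```

Reading the groups directly avoids a second split step when the three values are needed as separate variables.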

Rgonzo1971 commented:
Hi,

Please try:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TabTransform {
    public static void main(String[] args) {
        // Groups: 1 = EmpId, 2 = month, 3 = day, 4 = year, 5 = Cube
        final String regex = "^(\\d*)\\t(\\d{2})/(\\d{2})/(\\d{4})\\t(.*)";
        final String string = "77785027532971\t02/05/2017\tG7FD20D37B77\n"
                + "77785027533003\t02/06/2017\tG74420D29220";
        // Java replacement strings refer to groups as $1, $2, ...
        final String subst = "$1\t$4-$2-$3\t$5";

        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        // The substituted value will be contained in the result variable
        final String result = matcher.replaceAll(subst);

        System.out.println("Substitution result: " + result);
    }
}

Regards

Hanno P.S. (IT Consultant and Infrastructure Architect) commented:
Sorry, found a typo: an opening \( was missing. It should read:
sed -e 's!^\([0-9]*\)\t\(..\)/\(..\)/\(....\)\t\(.*\)!\1\t\4-\2-\3\t\5!'

Rgonzo1971 commented:
Splitting the result into an array:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TabTransformSplit {
    public static void main(String[] args) {
        final String regex = "^(\\d*)\\t(\\d{2})/(\\d{2})/(\\d{4})\\t(.*)";
        final String string = "77785027532971\t02/05/2017\tG7FD20D37B77\n"
                + "77785027533003\t02/06/2017\tG74420D29220";
        // Java replacement strings refer to groups as $1, $2, ...
        final String subst = "$1\t$4-$2-$3\t$5";

        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        // The substituted value will be contained in the result variable
        final String result = matcher.replaceAll(subst);

        System.out.println("Substitution result: " + result);
        // Split on tabs and newlines so every field of every line is its own element
        String[] resItems = result.split("[\\t\\n]");
    }
}

CEHJ commented:
awk -F'[\t/]' '{ printf("%s\n%s-%s-%s\n%s\n",$1,$4,$2,$3,$5); }' oldfile.csv >newfile.csv

And that's your whole file processed into 'newfile.csv'
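If the processing has to stay in Java, roughly the same idea (treat both tab and slash as field separators, then reorder the fields) might look like the sketch below. The class and method names are invented for illustration, and it assumes the Cube value never contains a slash, since a slash there would be split as well.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of any library.
class SplitTransform {
    /** Splits one line on tab or '/' and returns EmpId, yyyy-MM-dd date, Cube. */
    static List<String> transform(String line) {
        // Same separators as awk's -F'[\t/]': a tab or a slash ends a field.
        String[] f = line.split("[\\t/]");
        if (f.length != 5) {
            throw new IllegalArgumentException("Expected 5 fields, got " + f.length);
        }
        // f[0] = EmpId, f[1] = month, f[2] = day, f[3] = year, f[4] = Cube
        return Arrays.asList(f[0], f[3] + "-" + f[1] + "-" + f[2], f[4]);
    }

    public static void main(String[] args) {
        // Prints the three values on separate lines, like the awk printf.
        transform("77785027533003\t02/06/2017\tG74420D29220").forEach(System.out::println);
    }
}
```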
Paula DiTallo (Integration developer, Author) commented:
RGonzo,

Thanks so much for looking at this.  Here's what I'm seeing when testing
https://www.freeformatter.com/java-regex-tester.html#ad-output

[Attachment: ResultsTestingRegExPatternMatches]
Paula DiTallo (Integration developer, Author) commented:
CEH,
Yes, Linux. However, I am using a tool called Apache NiFi for the dataflow, and NiFi's default is the Java regular expression syntax.
CEHJ commented:
That web form cannot contain tab characters so what you're attempting to do won't work
Hanno P.S. (IT Consultant and Infrastructure Architect) commented:
Unix shell, sed, awk and Java use different regex formats.
Even the grep variants differ: grep uses basic regex, egrep uses extended regex, and fgrep matches fixed strings.
Therefore, a sed regex is not equivalent to a Java regex.
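A small sketch of how the Java side differs from the sed syntax used earlier in this thread: parentheses capture without a backslash, an escaped parenthesis is a literal character, and the replacement refers to groups as $1, $2, and so on. The class and method names here are only illustrative.

```java
// Hypothetical demo class, not part of any library.
class DialectDemo {
    /** Rewrites MM/DD/YYYY as YYYY-MM-DD. */
    static String reorderDate(String date) {
        // Unescaped ( ) capture; the replacement uses $3, $1, $2 (sed would use \3, \1, \2).
        return date.replaceAll("(\\d{2})/(\\d{2})/(\\d{4})", "$3-$1-$2");
    }

    /** True if the text contains a "(" followed later by a ")". */
    static boolean hasLiteralParens(String s) {
        // In Java, \( and \) match literal parentheses, the opposite of sed's basic syntax.
        return s.matches(".*\\(.*\\).*");
    }

    public static void main(String[] args) {
        System.out.println(reorderDate("02/05/2017"));
        System.out.println(hasLiteralParens("f(x)"));
    }
}
```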
CEHJ commented:
It might let you embed them escaped, so try

77785027532971\t02/05/2017\tG7FD20D37B77

Paula DiTallo (Integration developer, Author) commented:
Experts,

Thanks for the pointers on testing tabs in the online tester.

CEHJ,
I think you're on to something with awk. It looks like dealing with tabs may be somewhat problematic. It may be easier in the long run to transform the file by replacing tabs with commas.
CEHJ commented:
"It looks like dealing with tabs may be somewhat problematic."
Certainly if you're dealing with GUIs, tabs will be difficult.

sed -r 's/\t+/,/g' foo.csv

Once you're happy that it works, add the -i flag to edit the file in place.
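Should the same tab-to-comma replacement be needed in a Java regex context instead of sed, a minimal sketch could look like this. The class name `TabsToCommas` is made up for illustration. Like the sed expression, it collapses a run of consecutive tabs into a single comma, so empty fields between tabs would be lost.

```java
// Hypothetical helper class, not part of any library.
class TabsToCommas {
    /** Replaces each run of one or more tabs with a single comma. */
    static String convert(String line) {
        // \t+ matches one or more consecutive tab characters.
        return line.replaceAll("\\t+", ",");
    }

    public static void main(String[] args) {
        System.out.println(convert("77785027532971\t02/05/2017\tG7FD20D37B77"));
    }
}
```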
CEHJ commented:
:)
Hanno P.S. (IT Consultant and Infrastructure Architect) commented:
Interesting: The "accepted solution" is in no way an answer to the original question.
CEHJ commented:
Yes, maybe you meant to choose THIS Paula?