Solved

Import correctly a very large text file in Excel

Posted on 2013-01-11
10
253 Views
Last Modified: 2013-02-01
Hi all...

I have a very big text file containing data (space separated) about people.

I need to order all those people in excel but then I import it in excel I have no line break so I get one single very big row.

I saw that each record is separated by the letters RP but before RP there's a numeric code of different lenght.

So the final solution would be an automation to automatically put a ^p at the beginning of every word in the txt file that contains RP

How can I achieve that?

(if possible I'd love to do it using linux command line otherwise any kind of solution for any kind of operating system is ok: also including microsoft office and or other tools)
0
Comment
Question by:ltpitt
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 2
10 Comments
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 38768090
You can actually open the file in Word and do a find replace. Replace RP with ^pRP and it should do the trick.
0
 
LVL 95

Expert Comment

by:John Hurst
ID: 38768097
What I would do is open the text file in a good text editor and to add return (CR LF) at the end of each line. You may wish to remove paragraph formatting to do this. Make a copy of the file so you can test different approaches.

.... Thinkpads_User
0
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 38768100
For a linux command this should work
sed -i 's/RP/\nRP/g' yourfile.txt
0
Instantly Create Instructional Tutorials

Contextual Guidance at the moment of need helps your employees adopt to new software or processes instantly. Boost knowledge retention and employee engagement step-by-step with one easy solution.

 
LVL 1

Author Comment

by:ltpitt
ID: 38768164
Maybe I wasn't very clear:

There is no end of line...
The txt file is all a ultra-long line itself.
The data is separated by spaces.

I can't replace RP with ^pRP because RP is a string composed by various numbers e.g.

10293701273RP

000033RP

so I need to put the ^p at the beginning of the "word" that end with RP
0
 
LVL 95

Expert Comment

by:John Hurst
ID: 38768208
Try a trial copy of UltraEdit.

UltraEdit can take flowing text and break it at a column (say 72) and add hard returns.

To do this, the data would have to be regular (that is, say, RP at every 70 characters). If the data is free flowing with respect to implied line length, this will not work.

I have UltraEdit and have used it to convert unformatted lines into regular lines. It has worked for me, but may not for you.

Make sure you make a backup copy of the file.

.... Thinkpads_User
0
 
LVL 37

Accepted Solution

by:
TommySzalapski earned 500 total points
ID: 38768484
No problem. Just use this
sed 's/[0-9]*RP/\n&/g' yourfile.txt > yournewfile.txt

That will take any digits followed by RP and put a newline in front, keeping the digits and RP. Try it.

or sed -i 's/[0-9]*RP/\n&/g' yourfile.txt

The -i means to do it in place and wipe out the old version. The first option is nice because it leaves the backup in case it screws up.
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 500 total points
ID: 38768787
To break it down into parts because it annoys me when people don't do that:

sed is a tool that does a lot
the s/ means "substitute"
the /g at the end ("global") means to replace every occurrence
So sed s/foo/bar/g input.txt > output.txt means to replace every instance of foo in input.txt with bar and put the output into output.txt
The [0-9]*RP means to find any number of digits (including none) followed by RP
\n is newline and & is whatever was matched (in this case digits and RP)
0
 
LVL 1

Author Comment

by:ltpitt
ID: 38773689
@TommySzalapski

Thank you: really helpful! :)

The problem is that I have a various number of digits before RP so I can't tell for sure if sed has to place the newline 4 or 7 digits before RP
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 500 total points
ID: 38774643
The [0-9]* matches any number of digits and the & includes all of them. It doesn't matter how many digits there are, it will work.
0
 
LVL 1

Author Closing Comment

by:ltpitt
ID: 38843835
wow
0

Featured Post

PeopleSoft Has Never Been Easier

PeopleSoft Adoption Made Smooth & Simple!

On-The-Job Training Is made Intuitive & Easy With WalkMe's On-Screen Guidance Tool.  Claim Your Free WalkMe Account Now

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

If you’re thinking to yourself “That description sounds a lot like two people doing the work that one could accomplish,” you’re not alone.
This article will inform Clients about common and important expectations from the freelancers (Experts) who are looking at your Gig.
Viewers will learn how to properly install Eclipse with the necessary JDK, and will take a look at an introductory Java program. Download Eclipse installation zip file: Extract files from zip file: Download and install JDK 8: Open Eclipse and …
With the power of JIRA, there's an unlimited number of ways you can customize it, use it and benefit from it. With that in mind, there's bound to be things that I wasn't able to cover in this course. With this summary we'll look at some places to go…

737 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question