Solved

Import correctly a very large text file in Excel

Posted on 2013-01-11
10
254 Views
Last Modified: 2013-02-01
Hi all...

I have a very big text file containing data (space separated) about people.

I need to order all those people in excel but then I import it in excel I have no line break so I get one single very big row.

I saw that each record is separated by the letters RP but before RP there's a numeric code of different lenght.

So the final solution would be an automation to automatically put a ^p at the beginning of every word in the txt file that contains RP

How can I achieve that?

(if possible I'd love to do it using linux command line otherwise any kind of solution for any kind of operating system is ok: also including microsoft office and or other tools)
0
Comment
Question by:ltpitt
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 2
10 Comments
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 38768090
You can actually open the file in Word and do a find replace. Replace RP with ^pRP and it should do the trick.
0
 
LVL 95

Expert Comment

by:John Hurst
ID: 38768097
What I would do is open the text file in a good text editor and to add return (CR LF) at the end of each line. You may wish to remove paragraph formatting to do this. Make a copy of the file so you can test different approaches.

.... Thinkpads_User
0
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 38768100
For a linux command this should work
sed -i 's/RP/\nRP/g' yourfile.txt
0
NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

 
LVL 1

Author Comment

by:ltpitt
ID: 38768164
Maybe I wasn't very clear:

There is no end of line...
The txt file is all a ultra-long line itself.
The data is separated by spaces.

I can't replace RP with ^pRP because RP is a string composed by various numbers e.g.

10293701273RP

000033RP

so I need to put the ^p at the beginning of the "word" that end with RP
0
 
LVL 95

Expert Comment

by:John Hurst
ID: 38768208
Try a trial copy of UltraEdit.

UltraEdit can take flowing text and break it at a column (say 72) and add hard returns.

To do this, the data would have to be regular (that is, say, RP at every 70 characters). If the data is free flowing with respect to implied line length, this will not work.

I have UltraEdit and have used it to convert unformatted lines into regular lines. It has worked for me, but may not for you.

Make sure you make a backup copy of the file.

.... Thinkpads_User
0
 
LVL 37

Accepted Solution

by:
TommySzalapski earned 500 total points
ID: 38768484
No problem. Just use this
sed 's/[0-9]*RP/\n&/g' yourfile.txt > yournewfile.txt

That will take any digits followed by RP and put a newline in front, keeping the digits and RP. Try it.

or sed -i 's/[0-9]*RP/\n&/g' yourfile.txt

The -i means to do it in place and wipe out the old version. The first option is nice because it leaves the backup in case it screws up.
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 500 total points
ID: 38768787
To break it down into parts because it annoys me when people don't do that:

sed is a tool that does a lot
the s/ means "substitute"
the /g at the end ("global") means to replace every occurrence
So sed s/foo/bar/g input.txt > output.txt means to replace every instance of foo in input.txt with bar and put the output into output.txt
The [0-9]*RP means to find any number of digits (including none) followed by RP
\n is newline and & is whatever was matched (in this case digits and RP)
0
 
LVL 1

Author Comment

by:ltpitt
ID: 38773689
@TommySzalapski

Thank you: really helpful! :)

The problem is that I have a various number of digits before RP so I can't tell for sure if sed has to place the newline 4 or 7 digits before RP
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 500 total points
ID: 38774643
The [0-9]* matches any number of digits and the & includes all of them. It doesn't matter how many digits there are, it will work.
0
 
LVL 1

Author Closing Comment

by:ltpitt
ID: 38843835
wow
0

Featured Post

Online Training Solution

Drastically shorten your training time with WalkMe's advanced online training solution that Guides your trainees to action. Forget about retraining and skyrocket knowledge retention rates.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Since upgrading to Office 2013 or higher installing the Smart Indenter addin will fail. This article will explain how to install it so it will work regardless of the Office version installed.
In this post we will learn how to make Android Gesture Tutorial and give different functionality whenever a user Touch or Scroll android screen.
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF's Info Dictionary, as well as some other information, including the page count. We show how to isolate the page count in a…
In this fifth video of the Xpdf series, we discuss and demonstrate the PDFdetach utility, which is able to list and, more importantly, extract attachments that are embedded in PDF files. It does this via a command line interface, making it suitable …

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question