?
Solved

Import correctly a very large text file in Excel

Posted on 2013-01-11
10
Medium Priority
?
255 Views
Last Modified: 2013-02-01
Hi all...

I have a very big text file containing data (space separated) about people.

I need to order all those people in excel but then I import it in excel I have no line break so I get one single very big row.

I saw that each record is separated by the letters RP but before RP there's a numeric code of different lenght.

So the final solution would be an automation to automatically put a ^p at the beginning of every word in the txt file that contains RP

How can I achieve that?

(if possible I'd love to do it using linux command line otherwise any kind of solution for any kind of operating system is ok: also including microsoft office and or other tools)
0
Comment
Question by:ltpitt
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 2
10 Comments
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 38768090
You can actually open the file in Word and do a find replace. Replace RP with ^pRP and it should do the trick.
0
 
LVL 97

Expert Comment

by:Experienced Member
ID: 38768097
What I would do is open the text file in a good text editor and to add return (CR LF) at the end of each line. You may wish to remove paragraph formatting to do this. Make a copy of the file so you can test different approaches.

.... Thinkpads_User
0
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 38768100
For a linux command this should work
sed -i 's/RP/\nRP/g' yourfile.txt
0
Free Backup Tool for VMware and Hyper-V

Restore full virtual machine or individual guest files from 19 common file systems directly from the backup file. Schedule VM backups with PowerShell scripts. Set desired time, lean back and let the script to notify you via email upon completion.  

 
LVL 1

Author Comment

by:ltpitt
ID: 38768164
Maybe I wasn't very clear:

There is no end of line...
The txt file is all a ultra-long line itself.
The data is separated by spaces.

I can't replace RP with ^pRP because RP is a string composed by various numbers e.g.

10293701273RP

000033RP

so I need to put the ^p at the beginning of the "word" that end with RP
0
 
LVL 97

Expert Comment

by:Experienced Member
ID: 38768208
Try a trial copy of UltraEdit.

UltraEdit can take flowing text and break it at a column (say 72) and add hard returns.

To do this, the data would have to be regular (that is, say, RP at every 70 characters). If the data is free flowing with respect to implied line length, this will not work.

I have UltraEdit and have used it to convert unformatted lines into regular lines. It has worked for me, but may not for you.

Make sure you make a backup copy of the file.

.... Thinkpads_User
0
 
LVL 37

Accepted Solution

by:
TommySzalapski earned 2000 total points
ID: 38768484
No problem. Just use this
sed 's/[0-9]*RP/\n&/g' yourfile.txt > yournewfile.txt

That will take any digits followed by RP and put a newline in front, keeping the digits and RP. Try it.

or sed -i 's/[0-9]*RP/\n&/g' yourfile.txt

The -i means to do it in place and wipe out the old version. The first option is nice because it leaves the backup in case it screws up.
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 2000 total points
ID: 38768787
To break it down into parts because it annoys me when people don't do that:

sed is a tool that does a lot
the s/ means "substitute"
the /g at the end ("global") means to replace every occurrence
So sed s/foo/bar/g input.txt > output.txt means to replace every instance of foo in input.txt with bar and put the output into output.txt
The [0-9]*RP means to find any number of digits (including none) followed by RP
\n is newline and & is whatever was matched (in this case digits and RP)
0
 
LVL 1

Author Comment

by:ltpitt
ID: 38773689
@TommySzalapski

Thank you: really helpful! :)

The problem is that I have a various number of digits before RP so I can't tell for sure if sed has to place the newline 4 or 7 digits before RP
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 2000 total points
ID: 38774643
The [0-9]* matches any number of digits and the & includes all of them. It doesn't matter how many digits there are, it will work.
0
 
LVL 1

Author Closing Comment

by:ltpitt
ID: 38843835
wow
0

Featured Post

Get 15 Days FREE Full-Featured Trial

Benefit from a mission critical IT monitoring with Monitis Premium or get it FREE for your entry level monitoring needs.
-Over 200,000 users
-More than 300,000 websites monitored
-Used in 197 countries
-Recommended by 98% of users

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This is about my first experience with programming Arduino.
In this post we will learn how to connect and configure Android Device (Smartphone etc.) with Android Studio. After that we will run a simple Hello World Program.
An introduction to basic programming syntax in Java by creating a simple program. Viewers can follow the tutorial as they create their first class in Java. Definitions and explanations about each element are given to help prepare viewers for future …
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF's Info Dictionary, as well as some other information, including the page count. We show how to isolate the page count in a…
Suggested Courses

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question