Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Import correctly a very large text file in Excel

Posted on 2013-01-11
10
Medium Priority
?
256 Views
Last Modified: 2013-02-01
Hi all...

I have a very big text file containing data (space separated) about people.

I need to order all those people in excel but then I import it in excel I have no line break so I get one single very big row.

I saw that each record is separated by the letters RP but before RP there's a numeric code of different lenght.

So the final solution would be an automation to automatically put a ^p at the beginning of every word in the txt file that contains RP

How can I achieve that?

(if possible I'd love to do it using linux command line otherwise any kind of solution for any kind of operating system is ok: also including microsoft office and or other tools)
0
Comment
Question by:ltpitt
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 2
10 Comments
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 38768090
You can actually open the file in Word and do a find replace. Replace RP with ^pRP and it should do the trick.
0
 
LVL 98

Expert Comment

by:John Hurst
ID: 38768097
What I would do is open the text file in a good text editor and to add return (CR LF) at the end of each line. You may wish to remove paragraph formatting to do this. Make a copy of the file so you can test different approaches.

.... Thinkpads_User
0
 
LVL 37

Expert Comment

by:TommySzalapski
ID: 38768100
For a linux command this should work
sed -i 's/RP/\nRP/g' yourfile.txt
0
Free learning courses: Active Directory Deep Dive

Get a firm grasp on your IT environment when you learn Active Directory best practices with Veeam! Watch all, or choose any amount, of this three-part webinar series to improve your skills. From the basics to virtualization and backup, we got you covered.

 
LVL 1

Author Comment

by:ltpitt
ID: 38768164
Maybe I wasn't very clear:

There is no end of line...
The txt file is all a ultra-long line itself.
The data is separated by spaces.

I can't replace RP with ^pRP because RP is a string composed by various numbers e.g.

10293701273RP

000033RP

so I need to put the ^p at the beginning of the "word" that end with RP
0
 
LVL 98

Expert Comment

by:John Hurst
ID: 38768208
Try a trial copy of UltraEdit.

UltraEdit can take flowing text and break it at a column (say 72) and add hard returns.

To do this, the data would have to be regular (that is, say, RP at every 70 characters). If the data is free flowing with respect to implied line length, this will not work.

I have UltraEdit and have used it to convert unformatted lines into regular lines. It has worked for me, but may not for you.

Make sure you make a backup copy of the file.

.... Thinkpads_User
0
 
LVL 37

Accepted Solution

by:
TommySzalapski earned 2000 total points
ID: 38768484
No problem. Just use this
sed 's/[0-9]*RP/\n&/g' yourfile.txt > yournewfile.txt

That will take any digits followed by RP and put a newline in front, keeping the digits and RP. Try it.

or sed -i 's/[0-9]*RP/\n&/g' yourfile.txt

The -i means to do it in place and wipe out the old version. The first option is nice because it leaves the backup in case it screws up.
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 2000 total points
ID: 38768787
To break it down into parts because it annoys me when people don't do that:

sed is a tool that does a lot
the s/ means "substitute"
the /g at the end ("global") means to replace every occurrence
So sed s/foo/bar/g input.txt > output.txt means to replace every instance of foo in input.txt with bar and put the output into output.txt
The [0-9]*RP means to find any number of digits (including none) followed by RP
\n is newline and & is whatever was matched (in this case digits and RP)
0
 
LVL 1

Author Comment

by:ltpitt
ID: 38773689
@TommySzalapski

Thank you: really helpful! :)

The problem is that I have a various number of digits before RP so I can't tell for sure if sed has to place the newline 4 or 7 digits before RP
0
 
LVL 37

Assisted Solution

by:TommySzalapski
TommySzalapski earned 2000 total points
ID: 38774643
The [0-9]* matches any number of digits and the & includes all of them. It doesn't matter how many digits there are, it will work.
0
 
LVL 1

Author Closing Comment

by:ltpitt
ID: 38843835
wow
0

Featured Post

Veeam Disaster Recovery in Microsoft Azure

Veeam PN for Microsoft Azure is a FREE solution designed to simplify and automate the setup of a DR site in Microsoft Azure using lightweight software-defined networking. It reduces the complexity of VPN deployments and is designed for businesses of ALL sizes.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Whether you’re a college noob or a soon-to-be pro, these tips are sure to help you in your journey to becoming a programming ninja and stand out from the crowd.
Today, the web development industry is booming, and many people consider it to be their vocation. The question you may be asking yourself is – how do I become a web developer?
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…
Six Sigma Control Plans

618 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question