?
Solved

Parsing text files and putting it in a table or CSV

Posted on 2007-11-28
5
Medium Priority
?
458 Views
Last Modified: 2013-11-15
I am trying to parse a public domain dictionary.  The text files look at this.

Kamala (n.) The red dusty hairs of the capsules of an East Indian tree (Mallotus Philippinensis) used for dyeing silk. It is violently emetic, and is used in the treatment of tapeworm.

Kame (n.) A low ridge.

Kami (n. pl.) A title given to the celestial gods of the first mythical dynasty of Japan and extended to the demigods of the second dynasty, and then to the long line of spiritual princes still represented by the mikado.

All the words are broken up like 3 examples above: word (type of word) meaning.  There would be 3 columns parsed.   Everything after the (type of word) is the definitions and might or might not contain ().

Thanks for the help.
S
0
Comment
Question by:StaceyShindo
5 Comments
 
LVL 6

Expert Comment

by:cottsak
ID: 20371407
in summary, i'd use regex to capture the different parts of each def and then iterate the groups collection for the columns.
use named/numbered regex groups to get the parts:
http://msdn2.microsoft.com/en-au/library/bs2twtah.aspx

the regex expression might look like this:
^(?<word>\w*)\s\((?<type>[a-zA-Z_. ]*)\)\s(?<meaning>.*)
ive tested the above on ur samples. it works assuming (1) that u dont break across lines in the meaning section and (2) that you have the .NET MultiLine Regex option enabled (just a parameter u pass in)
it accounts for ur last requirement with the () in the meaning section. u may need to expand the character class for the type section if further type sections contain other chatacters - do some Regex reading on Character Classes if necessary.

because i've explicitly named my groups in the expression u can reference the group collection item with the string name
hav fun ;D
0
 
LVL 18

Expert Comment

by:Jinesh Kamdar
ID: 20371434
So where is it that you want to put them after parsing? In some DB?
0
 

Author Comment

by:StaceyShindo
ID: 20371492
I want to ultimately put them in a db, but if we can get them in a Excel sheet or CSV file, that would be fine.  At the end of the day, we are going to put them in MS-SQL.

Thanks,
S
0
 
LVL 4

Accepted Solution

by:
origfla earned 500 total points
ID: 20371558
Here is a solution - it will work in Excel as a quick way to do the job.
Paste your dictionary into a column in an excel spreadsheet - say column C. and say your first entry is in cell "C4".

TO GET THE   "WORD"   PART:
in cell D4, paste the following function:
=LEFT(C4,FIND("(",C4)-2)

TO GET THE   "TYPE OF WORD"   PART:
in cell E4, paste the following function:
=MID(C4,FIND("(",C4),FIND(")",C4)-FIND("(",C4)+1)

TO GET THE   "DEFINITION"   PART:
in cell F4, paste the following function:
=MID(C4,FIND(")",C4)+2,9999)

Now, all you have to do is copy the formula down along the D,E and F columns. You can do this by:
 - Highlighting cells D4,E4 & F4 all at once.
 - Place the cursor over the small square at the right bottom of cell F4 (cursor will become a "+" shape)
 - Double click while cursor is the "+" shape, it will copy the formulas all the way down the list of definitions.

FINALLY, if you want the columns D,E & F as vlaues instead of formulas, so you can use them as you wish, simply select the entire columns D,E & F and go Edit-> Copy.
Then, select any other Column that is empty - Say Column I, then go Edit->Paste Special
and paste Values only.

This is my first reply, so I hope this helps and that this gives you what you're looking for.
0
 

Author Closing Comment

by:StaceyShindo
ID: 31411584
This is a great solution that is easy and works.  It will take some more work to copy and paste the dictionary, but that is fine, I am not afraid of a little work.  Thank you so much for your help.

S
0

Featured Post

Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Instead of error trapping or hard-coding for non-updateable fields when using QODBC, let VBA automatically disable them when forms open. This way, users can view but not change the data. Part 1 explained how to use schema tables to do this. Part 2 h…
Windows Explorer let you handle zip folders nearly as any other folder: Copy, move, change, and delete, etc. In VBA you can also handle normal files and folders, but zip folders takes a little more - and that you'll find here.
This Micro Tutorial will demonstrate in Google Sheets how to use the HYPERLINK function to create live links inside your spreadsheet.
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…

809 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question