Solved

Python/CSV string help

Posted on 2014-09-21
Last Modified: 2014-10-03
I am working with a CSV that has a many-to-one relationship, and I have two problems I need help solving. The first is that I have a string set up like this:

thisismystr=thisisanemail@addy.com,blah,blah,blah, startnewCSVcol

I need to split the string twice, once on = and once on , because I am trying to get just the portion that is an e-mail address (thisisanemail@addy.com). So far I have figured out how to split the string on the = using something like this:

str = "thisismystr=thisisanemail@addy.com,blah,blah,blah"

print(str.split("="))

That gives me "thisisanemail@addy.com,blah,blah,blah" as the second piece. However, this still leaves the ,blah,blah,blah portion to be removed. After a ton of research I am stumped, as nothing explains how to remove text from the middle, only the first part or the last part. Does anyone know how to do this?

For the second part, I need to do this for multiple lines, so this is more of an advice question. Is it best to plug this into a variable and loop through (something like i = 1, for i, #endofCSV do splitcmd), or is there a more efficient way to do this? I am more familiar with Lua, and I am learning that the more I work with Python, the more it differs from Lua.

Please help. Thanks!
Question by:shellee1983
 
LVL 25

Expert Comment

by:clockwatcher
ID: 40335936
It would be best if you gave us some real example strings (a few lines from your file) rather than a mock-up.  

From your example, it's difficult to tell whether the commas within your "blah,blah,blah" are actually there or whether they're part of your mock-up. I'm guessing, based on the ", startnewCSVcol", that they might just be part of your mock-up rather than actually in your string. If the commas are actually part of your string, then it's a simple matter of doing the CSV split first and then the "=" split on your first column.

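import csv

# Let the csv module split each line into columns first,
# then split the first column on "=" and keep the right-hand side.
with open("myfile.csv", "r", newline="") as infile:
    for row in csv.reader(infile):
        email_field = row[0]
        email = email_field.split("=")[1]
        print(email)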

If there really is stuff tagged onto the end of your email address, all within the first field of your CSV, then it would be best to see a real example of your data before I make a suggestion.
 

Author Comment

by:shellee1983
ID: 40335949
Clockwatcher, the commas are actually there. If you were to view the CSV file in Notepad, the line would look like this:

"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol, etc.

which is why I was having a bit of difficulty in doing this without converting it to tab delimited.
 
LVL 25

Accepted Solution

by:clockwatcher (earned 500 total points)
ID: 40335966
So you have a CSV embedded within your CSV? If the embedded CSV is simple (it doesn't require true CSV parsing because it doesn't itself contain quotes), then it's just a simple matter of doing a "," split and then an "=" split on your first element.

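import csv

# Outer parse: csv.reader returns the quoted first column as one string.
with open("myfile.csv", "r", newline="") as infile:
    for row in csv.reader(infile):
        email_field = row[0]
        # Inner parse: keep the text before the first "," and after the "=".
        email = email_field.split(",")[0].split("=")[1]
        print(email)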

e.g.,

import csv
import io

# io.StringIO stands in for the file so the example runs without one.
f = io.StringIO('"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol,')
for row in csv.reader(f):
    email_field = row[0]
    email = email_field.split(",")[0].split("=")[1]
    print(email)

That's if the embedded CSV is simple. If the embedded CSV isn't always simple (it can contain quotes), then you'd be better off passing it through a CSV parser as well.

import csv
import io

f = io.StringIO('"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol,')
for row in csv.reader(f):
    # Treat the first column as its own one-line CSV document.
    email_csv_field = csv.reader(io.StringIO(row[0]))
    email_field = next(email_csv_field)[0]
    email = email_field.split("=")[1]
    print(email)
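
To illustrate when the second approach matters, here is a minimal sketch comparing the two on a field that contains a quoted comma. The helper names (first_value_simple, first_value_parsed) and the "name=Smith, John" sample are made up for illustration; only the thisismystr line comes from this thread.

```python
import csv
import io

def first_value_simple(field):
    # Plain string splitting: fine when the embedded CSV has no quotes.
    return field.split(",")[0].split("=")[1]

def first_value_parsed(field):
    # Run the embedded text through csv.reader so quoted commas survive.
    inner_row = next(csv.reader(io.StringIO(field)))
    # partition() yields an empty value instead of raising IndexError
    # when the "=" is missing.
    _key, _sep, value = inner_row[0].partition("=")
    return value

plain = "thisismystr=thisisanemail@addy.com,blah,blah,blah"
quoted = '"name=Smith, John",blah,blah'  # hypothetical field with a quoted comma

print(first_value_simple(plain))   # thisisanemail@addy.com
print(first_value_parsed(plain))   # thisisanemail@addy.com
print(first_value_simple(quoted))  # Smith  (cut off at the quoted comma)
print(first_value_parsed(quoted))  # Smith, John
```

Both helpers agree on the simple case, so the plain split is enough unless the embedded values can themselves contain quoted commas.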

 

Author Closing Comment

by:shellee1983
ID: 40360401
Thank you. I still need to figure out how in the world I can append this back into the CSV.
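
One possible way to do that, assuming "append" means adding the extracted address as a new last column on every row, is to read with csv.reader and write each extended row with csv.writer. This is only a sketch: add_email_column is a made-up helper name, and io.StringIO stands in for the real input and output files.

```python
import csv
import io

def add_email_column(reader, writer):
    # Copy every row to the writer with the extracted email as a new last column.
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Text before the first "," of column 0, then whatever follows the "=".
        email = row[0].split(",")[0].partition("=")[2]
        writer.writerow(row + [email])

source = io.StringIO(
    '"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol\n'
)
result = io.StringIO()
add_email_column(csv.reader(source), csv.writer(result))
print(result.getvalue())
```

With real files, the same function could be given csv.reader(open(..., newline="")) and csv.writer(open(..., "w", newline="")) objects. Writing to a separate output file rather than over the input avoids losing data if something goes wrong partway through.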