Solved

Python/CSV string help

Posted on 2014-09-21
Last Modified: 2014-10-03
So I am working with a CSV that has a many-to-one relationship, and I have two problems I need help solving. The first is that I have a string set up like this:

thisismystr=thisisanemail@addy.com,blah,blah,blah, startnewCSVcol

So I need to split the string twice, once on = and once on a comma, because I am trying to extract the portion that is an e-mail address (thisisanemail@addy.com). So far I have figured out how to split the string on the = using something like this:

s = "thisismystr=thisisanemail@addy.com,blah,blah,blah"

print s.split("=")[1]

Which returns "thisisanemail@addy.com,blah,blah,blah". However, this leaves the ,blah,blah,blah portion still to be removed. After a ton of research I am stumped, as nothing I found explains how to remove from the middle, just the first part or the last part. Does anyone know how to do this?

For the second part, I need to do this for multiple lines, so this is more of an advice question. Is it best to plug this into a variable and loop through (something like i = 1, for i, #endofCSV do splitcmd), or is there a more efficient way to do this? I am more familiar with Lua, and the more I work with Python, the more I learn how much it differs from Lua.

Please help. Thanks!
Question by:shellee1983
LVL 25

Expert Comment

by:clockwatcher
ID: 40335936
It would be best if you gave us some real example strings (a few lines from your file) rather than a mock-up.  

From your example, it's difficult to tell whether the commas within your "blah,blah,blah" are actually there or whether they're part of your mock-up.  I'm guessing, based on the ", startnewCSVcol", that they might just be part of your mock-up rather than actually in your string.  If the commas are actually part of your string, then it's a simple matter of doing the CSV split first and then the "=" split on your first column.

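import csv

# Let the csv module split each line into columns, then split the first
# column on "=" and keep the part after it (the e-mail address).
with open("myfile.csv", "r") as f:
    for row in csv.reader(f):
        email_field = row[0]
        email = email_field.split("=")[1]
        print email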

If there really is stuff tagged onto the end of your email address, all within the first field of your CSV, then in order to make a helpful suggestion it would be best to see a real example of your data.

Author Comment

by:shellee1983
ID: 40335949
Clockwatcher, the commas are actually there. If you were to view the CSV file in Notepad, the string would look like this:

"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol, etc.

which is why I was having a bit of difficulty in doing this without converting it to tab delimited.
LVL 25

Accepted Solution

by:
clockwatcher earned 500 total points
ID: 40335966
So you have a CSV embedded within your CSV?  If the embedded CSV is simple (it doesn't require true CSV parsing, i.e. it doesn't itself contain quotes), then it's just a simple matter of doing a "," split and then an "=" split on your first element.

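import csv

# The outer csv.reader strips the quotes around the first column; the
# first "," split drops the trailing ",blah,blah,blah", and the "=" split
# keeps what follows the label.
with open("myfile.csv", "r") as f:
    for row in csv.reader(f):
        email_field = row[0]
        email = email_field.split(",")[0].split("=")[1]
        print email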

e.g.,

import StringIO
import csv
f = StringIO.StringIO('"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol,')
f.seek(0)
f_csv = csv.reader(f)
for row in f_csv:
    email_field = row[0]
    email = email_field.split(",")[0].split("=")[1]
    print email

That's if the embedded CSV is simple.  If the embedded CSV isn't always simple (it can contain quotes), then you'd be better off passing it through a CSV parser as well.

import StringIO
import csv
f = StringIO.StringIO('"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol,')
f.seek(0)
f_csv = csv.reader(f)
for row in f_csv:
    email_csv_field = csv.reader(StringIO.StringIO(row[0]))
    email_field = email_csv_field.next()[0]
    email = email_field.split("=")[1]
    print email
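
For anyone reading this on Python 3: the snippets above are Python 2 (the StringIO module, the print statement, .next()). A rough Python 3 equivalent of the same two approaches might look like the sketch below. The function names email_simple, email_nested and extract_emails are made up for illustration, and the sample line is the one from this thread. It also passes a maxsplit of 1 to split("=", 1), which is a small change from the code above so that a later "=" in the field would not matter.

```python
import csv
import io

# Sample line copied from the thread; real data would come from a file.
SAMPLE = '"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol,\n'


def email_simple(first_field):
    """Plain string splits: fine when the embedded CSV has no quotes."""
    return first_field.split(",")[0].split("=", 1)[1]


def email_nested(first_field):
    """Parse the first field as CSV again, then split its first item on '='."""
    inner_row = next(csv.reader(io.StringIO(first_field)))
    return inner_row[0].split("=", 1)[1]


def extract_emails(lines, extractor=email_simple):
    """Return one e-mail per non-empty row of the outer CSV."""
    return [extractor(row[0]) for row in csv.reader(lines) if row]


emails = extract_emails(io.StringIO(SAMPLE))
print(emails)
```

With a real file, the io.StringIO(SAMPLE) argument would be replaced by a file object opened with open("myfile.csv", newline=""), as the csv module documentation recommends.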


Author Closing Comment

by:shellee1983
ID: 40360401
Thank you. I still need to figure out how in the world I can append this back into the CSV.
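
The thread does not cover writing the result back, so the following is only a sketch under assumptions: it takes "append" to mean adding the extracted e-mail as a new last column, written out with csv.writer (Python 3). The helper name add_email_column is hypothetical, and in-memory buffers stand in for the real files.

```python
import csv
import io


def add_email_column(src, dst):
    """Copy every row from src to dst with the extracted e-mail appended."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if not row:
            continue  # skip blank lines
        # Same "," then "=" split as in the accepted solution.
        email = row[0].split(",")[0].split("=", 1)[1]
        writer.writerow(row + [email])


# In-memory stand-ins for the input and output files.
src = io.StringIO(
    '"thisismystr=thisisanemail@addy.com,blah,blah,blah", startnewCSVcol, nextnewCol\n'
)
dst = io.StringIO()
add_email_column(src, dst)
result = dst.getvalue()
print(result)
```

With real files, it is usually safer to write to a second file opened with newline="" and replace the original afterwards than to modify the CSV in place.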