We help IT Professionals succeed at work.
Private
Troubleshooting Question

python query

tech
tech asked
on
285 Views
Last Modified: 2020-09-24
There is 1 column named x in csv file. It has values like this.
{'MAX': 6.0, 'AVG': -3.0, 'MIN': -1.0}
so x =  {'MAX': 6.0, 'AVG': -3.0, 'MIN': -1.0} (curly braces included in a row value).
1.csv

I am using pandas with python.

I wanted to get rid of curly braces and extract the values of max, avg, min and store in 3 different columns named x_max, x_min, x_avg, so I should be left with
x_max = 6.0
x_min = -3.0
x_avg = -1.0

2.csv
Comment
Watch Question

Analyst Assistant
CERTIFIED EXPERT
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION

Author

Commented:
Works for a single row, thanks.
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
Do you need it to work for multiple rows?

Author

Commented:
Yes please.. new csv file attached.
1.csv

Author

Commented:
multiple rows and multiple columns.

I have 2 million rows and 8 columns.
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
Could you post some sample data and expected results?

Obviously not millions of row but a representative sample.

Actually, it might be an idea to open another question for that and include a link back to this for reference.
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
This works with the sample data and should work with larger data sets.
import pandas as pd

dfIn = pd.read_csv('1.csv')
dfOut = pd.DataFrame()

for col in dfIn.columns:
    prefix = col
    dict = [eval(row) for row in dfIn[col]]
    dfCol = pd.DataFrame(dict)
    dfCol.columns = [prefix+'_' + str(oldCol) for oldCol in dfCol.columns]
    if dfOut.empty:
        dfOut = dfCol
    else:
        dfOut = pd.concat([dfOut,dfCol], axis = 1)
        
# convert all columns to integer - not sure if really required
dfOut = dfOut.astype(int)

dfOut.to_csv('2.csv', index=False)

Open in new window

Author

Commented:
Thanks a million. I tried on the sample code, works fine. I will try with my dataset. Thanks again and Cheers.

Author

Commented:
Error at this line of code:  dict = [eval(row) for row in dfIn[col]]

TypeError: eval() arg 1 must be a string, bytes or code object

Error is coming because I have a date column and id column to which I want to keep as such.
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
Do you know what row is when you get the error?

Author

Commented:
when the code reaches this line dict = [eval(row) for row in dfIn[col]], error comes because I have id, date column and then rest of the columns. eval does not work on id and date column.
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
There was no id and date column in the data you posted.

Can you post some sample data that has those columns?

Author

Commented:
I just need to do 1 thing, escaping the first 2 columns i.e id and date column, then code will work as it should.

Author

Commented:
ok,

Author

Commented:
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
Do you need the id and date in the final result?

Author

Commented:
yes, I would need that.

Author

Commented:
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
Is the 2nd row of the sample data correct?

Author

Commented:
no, it wasn't correct. Sorry.

uploaded again.
1.csv
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
This should take care of the id and date columns.
import pandas as pd

dfIn = pd.read_csv('1.csv')
dfOut =dfIn[['date', 'id']]

dfIn = dfIn.drop(['date', 'id'], axis=1)

for col in dfIn.columns:
    prefix = col
    dict = [eval(row) for row in dfIn[col]]
    dfCol = pd.DataFrame(dict)
    dfCol.columns = [prefix+'_' + str(oldCol) for oldCol in dfCol.columns]
    dfCol = dfCol.astype(int)
    if dfOut.empty:
        dfOut = dfCol
    else:
        dfOut = pd.concat([dfOut,dfCol], axis = 1)
        
dfOut.to_csv('2.csv', index=False)

Open in new window

Author

Commented:
---> 14         dfOut = pd.concat([dfOut,dfCol], axis = 1)

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
Anything different about the data when you get that error?

Author

Commented:
I am still wondering why this error come as there are no null values. There is no change in the dataset. I come to know while researching that this error can be resolved by creating a new index df.reset_index. trying if this can resolve this.


screenshot

Author

Commented:
pd.concat requires that the indices be unique. may be that's why this error is coming. trying with each column to locate the error.
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
The index should be unique, it should start at 0 and goto n-1 where n is the no of rows of data.


Any chance you could post a larger data set?

Author

Commented:
here, the unique count of id and the total number of rows differ which means for same ids, the dataset has multiple rows. but id+date make a unique key.
Not permitted for that, sorry.
NorieAnalyst Assistant
CERTIFIED EXPERT

Commented:
That's nothing to do with the index.

Try printing out dfIn, or a subset of it, as soon as it's loaded.

Gain unlimited access to on-demand training courses with an Experts Exchange subscription.

Get Access
Why Experts Exchange?

Experts Exchange always has the answer, or at the least points me in the correct direction! It is like having another employee that is extremely experienced.

Jim Murphy
Programmer at Smart IT Solutions

When asked, what has been your best career decision?

Deciding to stick with EE.

Mohamed Asif
Technical Department Head

Being involved with EE helped me to grow personally and professionally.

Carl Webster
CTP, Sr Infrastructure Consultant
Empower Your Career
Did You Know?

We've partnered with two important charities to provide clean water and computer science education to those who need it most. READ MORE

Ask ANY Question

Connect with Certified Experts to gain insight and support on specific technology challenges including:

  • Troubleshooting
  • Research
  • Professional Opinions
Unlock the solution to this question.
Join our community and discover your potential

Experts Exchange is the only place where you can interact directly with leading experts in the technology field. Become a member today and access the collective knowledge of thousands of technology experts.

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.