ocean O asked:

Save a dataframe to a CSV file in Python

Hi, I have a question about Python. I used cursor.fetchmany() to fetch the data; it fetches multiple times, depending on the number of rows. I want to generate one CSV file after all the fetching is done. How can I do it? Code is below. Thanks

datafile = pd.DataFrame([])

while True:
    results = cursor.fetchmany(7000)
    if not results:
        break

    col_Names = ['market', 'site_id', 'latitude', 'longitude', 'state', 'zip', 'structure_type']
    data = pd.DataFrame(results)
    data.to_csv(filename, index=False, header=col_Names)  # this is the first fetch
    # how can I append the data from the next fetch to the csv file?

cursor.close()

Ryan Chong

Have you heard about the to_csv method?

pandas.DataFrame.to_csv
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html#pandas-dataframe-to-csv

Some examples:

Saving a Pandas Dataframe as a CSV
https://www.geeksforgeeks.org/saving-a-pandas-dataframe-as-a-csv/

How to Export Pandas DataFrame to a CSV File
https://datatofish.com/export-dataframe-to-csv/
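
For instance, a minimal call looks like this (the rows here are made up purely for illustration):

import pandas as pd

# Two illustrative rows; index=False keeps the row index out of the file.
df = pd.DataFrame({'site_id': [101, 102], 'state': ['TX', 'CA']})
df.to_csv('sites.csv', index=False)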
David Favor

Just open the .csv file outside your loop. Write to the .csv file as desired. Close the .csv file when your SQL looping is complete.
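
A sketch of that idea, assuming the DB-API cursor and column list from the question (the output file name is illustrative). Opening the handle once before the loop means the header is written exactly once and every batch lands in the same file:

import csv

col_names = ['market', 'site_id', 'latitude', 'longitude',
             'state', 'zip', 'structure_type']

# Open the file once, outside the fetch loop; it closes when the loop is done.
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(col_names)       # header goes in exactly once
    while True:
        results = cursor.fetchmany(7000)
        if not results:
            break
        writer.writerows(results)    # each batch is appended to the same handle
cursor.close()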
ASKER
ocean O

I can't put all the data in one dataframe, since the data is too big: about 7,241,764 rows. If I fetch 5,000 at a time, it will take over 1,400 fetches. So I am thinking that each time I fetch 5,000 rows, I will write them to a CSV file, then append the CSV from the next fetch to the previous CSV file. But I am not sure how to append a CSV file in Python.

if execution_type == _READ:
    i = 0
    while True:
        results = cursor.fetchmany(5000)
        if not results:
            break

        if i == 0:
            data = pd.DataFrame(results)
            data.to_csv(filename, index=False, header=col_Names)
        if i >= 1:
            data = pd.DataFrame(results)
            data.to_csv(filename1, index=False)
            append = filename.join(filename1)  # how to append csv file filename1 to csv file filename???
        i += 1

    cursor.close()
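
One common way to do this without a second file, sketched here with the question's own names (cursor, filename, col_Names): to_csv accepts mode='a', so every batch after the first can be appended to the same file with the header suppressed.

import pandas as pd

col_Names = ['market', 'site_id', 'latitude', 'longitude',
             'state', 'zip', 'structure_type']
first = True
while True:
    results = cursor.fetchmany(5000)
    if not results:
        break
    data = pd.DataFrame(results)
    # First batch: overwrite and write the header; later batches: append, no header.
    data.to_csv(filename, mode='w' if first else 'a',
                header=col_Names if first else False, index=False)
    first = False
cursor.close()

Note that str.join() is plain string manipulation; it never touches either file, which is why the original attempt had no effect.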
Ryan Chong

Since the data is too big: about 7,241,764 rows. If I fetch 5,000 at a time, it will take over 1,400 fetches.

What database are you using? If we are talking about exporting 7.2 million records, we probably need a better approach here.
ASKER CERTIFIED SOLUTION
Louis LIETAER

ASKER
ocean O

Ryan:
We are using PostgreSQL.
Ryan Chong

@ocean O

Sorry, I haven't used PostgreSQL much, so I can't really advise on that.

Perhaps with such a volume of data, some off-the-shelf software + RPA would be a more reliable solution for you?

dbForge Studio for PostgreSQL
https://www.devart.com/dbforge/postgresql/studio/postgresql-migration.html
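
As an aside, PostgreSQL itself can stream a query result as CSV via COPY, which skips the client-side fetch loop entirely. A sketch assuming psycopg2; the connection string, table name, and file name are placeholders:

import psycopg2

# Placeholder connection string; substitute real credentials.
conn = psycopg2.connect("dbname=mydb user=me")
with conn.cursor() as cur, open('sites.csv', 'w') as f:
    # COPY ... TO STDOUT writes the result set as CSV straight to the file object.
    cur.copy_expert(
        "COPY (SELECT market, site_id, latitude, longitude, state, zip, "
        "structure_type FROM sites) TO STDOUT WITH CSV HEADER",
        f,
    )
conn.close()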