Avatar of ocean O
ocean O

asked on

Python SQL

Hi, I have a question about Python.
I have a line of code in a .py file: results = cursor.fetchall()
But I need to change it to retrieve the rows in chunks (e.g. 5000 rows) instead of using "fetchall".

For example:     rows = cursor.fetchmany(5000)

How can I do it?
Thanks
Avatar of John Tsioumpris
John Tsioumpris

cursor.fetchmany(size) is probably what you want
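A minimal sketch of that loop, assuming the psycopg2 driver; the connection string, table name, and process() are hypothetical placeholders:

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection string
cursor = conn.cursor()
cursor.execute("SELECT * FROM my_table")  # hypothetical table

while True:
    rows = cursor.fetchmany(5000)  # at most 5000 rows per call
    if not rows:                   # empty list means the result set is exhausted
        break
    for row in rows:
        process(row)               # placeholder for per-row handling

cursor.close()
conn.close()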
Avatar of Norie
Norie

fetchmany should do what you want. Isn't it working?
Avatar of ocean O

ASKER

Thanks for the reply!
Actually I need to do something like this: if there are 11000 records in the DB, I need to fetch three times (the first two fetches 5000 rows each, the third 1000) and put all 11000 rows into one CSV file. How can I do it? Thanks
Why not fetch all the records in one go?

Are you processing them before putting the data in the CSV file?
I need to fetch three times (the first two fetches 5000 rows each, the third 1000) and put all 11000 rows into one CSV file. How can I do it?

What's the time gap between these 3 fetches? You also haven't mentioned how frequently you wish to do this.

If there's no time gap, then try fetching all the records in one go, as mentioned.

If there is a time gap, perhaps you can try saving the output in 3 separate CSVs and then combining them in your final step, as sketched below.
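A minimal sketch of that final combining step with pandas, assuming the three chunk files are named part1.csv, part2.csv and part3.csv (hypothetical names) and share the same header:

import pandas as pd

# Hypothetical chunk file names; adjust to the actual outputs.
parts = ["part1.csv", "part2.csv", "part3.csv"]

# Read each chunk and stack them into one DataFrame.
combined = pd.concat((pd.read_csv(p) for p in parts), ignore_index=True)

# Write the single merged CSV, dropping the DataFrame index column.
combined.to_csv("combined.csv", index=False)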
Avatar of ocean O

ASKER

Thanks for the reply.
I am querying a PostgreSQL DB from Python. Since the data in some tables is very big, it can affect performance. So I am thinking of using something like a "cursor fetch batch", so I can fetch three times instead of one time, then combine all the data and return one CSV file.

Thanks
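If the driver is psycopg2 (an assumption; the thread doesn't name it), a server-side "named" cursor is another way to keep memory down: PostgreSQL then streams rows to the client in batches instead of sending the whole result at once. A minimal sketch with hypothetical connection details and table name:

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection details

# Giving the cursor a name makes it a server-side cursor in psycopg2.
cursor = conn.cursor(name="big_table_cursor")
cursor.itersize = 5000  # rows pulled from the server per round trip

cursor.execute("SELECT * FROM big_table")  # hypothetical table

for row in cursor:  # iteration fetches itersize rows at a time
    pass            # placeholder for per-row processing

cursor.close()
conn.close()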
I don't really have hands-on experience with a PostgreSQL + Python implementation, but to fetch a large amount of data from a very large table,

perhaps you could try to:

1. Replicate the data that needs to be exported into a temporary table.
2. Build a FIFO process model: while your ETL (or extraction Python code) is running, mark the records that have been exported.
3. When a certain number of records have been exported, combine the CSVs into a single CSV.
4. Delete the marked records from the temporary table when necessary.

In short, you may need more than one Python script to handle these processes; a rough sketch of the marking step follows.
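This sketch of step 2's marking idea assumes a hypothetical temp_export table with id, payload and exported columns, and psycopg2 as the driver:

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection details
cursor = conn.cursor()

# Pull one batch of not-yet-exported rows from the temporary table.
cursor.execute(
    "SELECT id, payload FROM temp_export "
    "WHERE exported = FALSE ORDER BY id LIMIT 5000"
)
rows = cursor.fetchall()

# ... write this batch out as one CSV chunk here ...

# Mark the batch as exported so the next run continues where this one stopped.
ids = [r[0] for r in rows]
cursor.execute("UPDATE temp_export SET exported = TRUE WHERE id = ANY(%s)", (ids,))
conn.commit()

cursor.close()
conn.close()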
ASKER CERTIFIED SOLUTION
Avatar of Norie
Norie

Avatar of ocean O

ASKER

Hi Norie:
Thanks for the suggestion.
Do you have an example that fetches the data 3 times and then puts it into a dataframe?

Thanks
Avatar of ocean O

ASKER

I have written a function like the one below, but I am not sure how to put all the data into one dataframe. For example, if it fetches 3 times, how can I append all the data into one dataframe? Thanks

import pandas as pd

def db_execute(sql, execution_type):

    data = pd.DataFrame()
    conn = get_connection_handle()
    try:
        conn.autocommit = True
        cursor = conn.cursor()
        cursor.execute(sql)
        if execution_type == _READ:
            chunks = []
            while True:
                # fetchmany returns an empty list once the result set
                # is exhausted, which ends the loop.
                results = cursor.fetchmany(4000)
                if not results:
                    break
                # Keep each batch as its own DataFrame instead of
                # overwriting `data` on every iteration.
                chunks.append(pd.DataFrame(results))
            if chunks:
                # Stack all the batches into a single DataFrame.
                data = pd.concat(chunks, ignore_index=True)

        cursor.close()
    finally:
        close_connection_handle(conn)
    return data
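If the end goal is one CSV file rather than an in-memory dataframe, each batch can also be appended straight to the file as it is fetched, so all 11000 rows never sit in memory at once. A hedged sketch in the style of the function above, reusing its get_connection_handle and close_connection_handle helpers:

import pandas as pd

def db_to_csv(sql, path, batch_size=5000):
    conn = get_connection_handle()  # same helpers as in the function above
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        first = True
        while True:
            results = cursor.fetchmany(batch_size)
            if not results:
                break
            frame = pd.DataFrame(results)
            # Write the header only for the first batch, then append.
            frame.to_csv(path, mode="w" if first else "a", header=first, index=False)
            first = False
        cursor.close()
    finally:
        close_connection_handle(conn)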