Errang Genevre

asked on

Is it possible to reuse file handle while changing the file name/location?

Hello,

I was wondering if it was possible to reuse a file handle in Python and change where the file handle points to?

Basically what I want to do is, instead of...
for x in range(0, 10000):
    file = open('file_' + str(x) + '.txt', 'w')
    file.write('some text\n')
    file.close()



I want to do something like...
file = open('file_0.txt', 'w')
for x in range(0, 10000):
    file.write('some text\n')
    <magic code to change file location so I don't need to create a new file object>
file.close()



I'm asking because I'll be creating about 500k files, and creating new file objects seems to be slowing down the script.

Appreciate any help on this!
ASKER CERTIFIED SOLUTION
arnold (United States)
SOLUTION
Errang Genevre (ASKER)
Alright, thanks.

Didn't expect it to be possible, but just thought I'd check.

I'll try experimenting with the shell output redirection; hopefully that'll have a smaller footprint.
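For what it's worth, one hypothetical shape for that redirection experiment (every name below is made up, not from the thread): the Python side emits one `filename<TAB>text` record per output line, and the shell splits the stream into files.

```python
import io
import sys

def emit(n, out=sys.stdout):
    """Write one 'filename<TAB>text' record per output line."""
    for x in range(n):
        out.write(f'file_{x}.txt\tsome text\n')

# Small demo into a string buffer instead of real stdout:
buf = io.StringIO()
emit(3, out=buf)
```

The stream can then be split by something like `python gen.py | awk -F'\t' '{ print $2 > $1 }'`; awk keeps its output files open, so the per-file open/close cost moves out of the Python loop.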
The comparative delay of closing/opening a file handle is significantly smaller than the time spent processing your data.

Depending on what you are doing, you could use syslog/rsyslog as the destination for your output and have it manage writing the output into files.

Pushing control out also means pushing out the means of detecting errors in the process.
That's true.

The current process runs for 1.5+ hours, but it's writing 500k files and also creating 2-3 times as many directories. So I'm just trying to see if it's possible to use a combination of scripts to get it done quicker.

Still haven't found a decent enough solution for the file creation part; but yeah, I still need to consider the error handling part.
If you comment out the file creation part and time the processing, does the runtime decrease?

Multiple directories suggest there is some additional logic around directories/file handles?
Since there are 500k files (file names are based on a numeric range), the folders are a way to group similar files, and not have Unix scream at us for having that many files in one location.
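A minimal sketch of that grouping idea, assuming a bucket size of 1000 files per directory and an `out/NNN/` naming scheme (both are assumptions, not from the actual script):

```python
import os

def path_for(index, root='out', bucket_size=1000):
    """Return a bucketed path like out/001/file_1234.txt,
    creating the bucket directory if it does not exist yet."""
    bucket = index // bucket_size  # e.g. files 1000-1999 land in bucket 1
    directory = os.path.join(root, f'{bucket:03d}')
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, f'file_{index}.txt')
```

With these assumptions, `path_for(1234)` returns `out/001/file_1234.txt` on Unix, and no directory ever holds more than 1000 files.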
Not sure whether running a daemon charged with file creation, to which your processing app sends its output, is a worthwhile thing.

One thing to look at in your existing script is whether it is buffering the output.
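To illustrate the buffering point (the file name, temp directory, and 64 KiB buffer size are arbitrary choices, not from the original script): `open` accepts an explicit `buffering` argument, so many small writes coalesce into a few system calls per file.

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'file_0.txt')

# An explicit buffer keeps the small writes in memory until it fills
# (or the file is closed), instead of hitting the OS write-by-write.
with open(path, 'w', buffering=64 * 1024) as f:
    for _ in range(100):
        f.write('some text\n')
```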

I still think if the processing of the data can be sped up, that will likely be the way to go.
Alright, cool; thanks.