Errang Genevre
asked on
Is it possible to reuse file handle while changing the file name/location?
Hello,
I was wondering if it was possible to reuse a file handle in Python and change where the file handle points to?
Basically what I want to do is, instead of...
for x in range(0, 10000):
    file = open('file_' + str(x) + '.txt', 'w')
    file.write('some text\n')
    file.close()
I want to do something like...
file = open('file_0.txt', 'w')
for x in range(0, 10000):
    file.write('some text\n')
    # <magic code to change file location so I don't need to create a new file object>
file.close()
I'm asking this because I'll be creating about 500k files; and creating new file objects seems to be slowing down the script.
Appreciate any help on this!
ASKER CERTIFIED SOLUTION
SOLUTION
The comparative delay of closing/opening a file handle is significantly smaller than your processing of the data.
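For completeness: a Python file object can't be retargeted at a new path once created, but if the open/close overhead itself matters, the lower-level `os.open`/`os.write`/`os.close` calls skip some of the object machinery around the builtin `open()`. A rough sketch, mirroring the question's example:

```python
import os

# Sketch: the file object itself can't be repointed, but os.open()
# returns a bare integer descriptor, skipping some of the per-file
# object machinery of the builtin open().
for x in range(10000):
    fd = os.open('file_' + str(x) + '.txt',
                 os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    os.write(fd, b'some text\n')
    os.close(fd)
```

Whether this saves meaningful time depends on the workload; it's worth benchmarking against the plain `open()` loop before committing to it.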
depending on what you are doing, you could use syslog/rsyslog as the destination for your output and have it manage the writeout into files.
pushing control out means you also are pushing the means of detecting errors in the process.
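A minimal sketch of that idea using the standard library's `SysLogHandler` (the logger name and the UDP address here are illustrative; a real setup would point at the local daemon, e.g. `/dev/log`, and add matching rsyslog routing rules):

```python
import logging
import logging.handlers

# Sketch: hand each record to syslog/rsyslog and let its routing
# rules split the stream into files, instead of opening 500k files
# from Python. The 'filewriter' name and localhost:514 address are
# illustrative placeholders.
logger = logging.getLogger('filewriter')
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.SysLogHandler(address=('localhost', 514)))

for x in range(3):
    # rsyslog could route on the leading token into per-group files
    logger.info('file_%d some text', x)
```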
ASKER
That's true.
The current process runs for 1.5+ hours, but it's writing 500k files and also creating 2-3 times that number of directories. So I'm just trying to see if it's possible to use a combination of scripts to get it done quicker.
Still haven't found a decent enough solution for the file-creation part; but yeah, I still need to consider the error-handling part.
If you comment out the file creation part and time the processing, does it decrease?
Multiple directories suggests there is some additional logic to directories/file handles?
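The timing suggestion can be sketched like this, with `generate_line` as a hypothetical stand-in for the real per-file work:

```python
import time

# Sketch: time the per-file processing and the file writes
# separately, to see which phase dominates. generate_line is a
# hypothetical stand-in for whatever produces each file's contents.
def generate_line(x):
    return 'some text %d\n' % x

t0 = time.perf_counter()
lines = [generate_line(x) for x in range(10000)]
t1 = time.perf_counter()

for x, line in enumerate(lines):
    with open('file_%d.txt' % x, 'w') as f:
        f.write(line)
t2 = time.perf_counter()

print('processing: %.3fs, writing: %.3fs' % (t1 - t0, t2 - t1))
```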
ASKER
Since there are 500k files (file names are based on a numeric range), the folders are a way to group similar files, and not have Unix scream at us for having that many files in one location.
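A grouping scheme like that can be sketched as follows (the `group_` naming and the 1000-files-per-directory bucket size are illustrative, not the asker's actual layout):

```python
import os

# Sketch: bucket numbered files into subdirectories so no single
# directory ends up holding all 500k entries. The group_ naming and
# the per_dir=1000 bucket size are illustrative choices.
def path_for(x, per_dir=1000):
    subdir = 'group_%03d' % (x // per_dir)
    os.makedirs(subdir, exist_ok=True)
    return os.path.join(subdir, 'file_%d.txt' % x)

with open(path_for(12345), 'w') as f:
    f.write('some text\n')
```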
Not sure whether running a daemon charged with file creation, to which your processing app would send its output, is a worthwhile thing.
One thing to look at in your existing script is whether it is buffering the output.
I still think if the processing of the data can be sped up, that will likely be the way to go.
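On the buffering point, a sketch: the `buffering` argument to `open()` controls how many small writes get batched into one syscall (the 1 MiB size here is illustrative):

```python
# Sketch: make the output buffering explicit. With a large buffer,
# many small write() calls are batched into far fewer syscalls; the
# 1 MiB buffer size is an illustrative choice, not a recommendation.
with open('output.txt', 'w', buffering=1024 * 1024) as f:
    for x in range(10000):
        f.write('some text\n')
```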
ASKER
Alright, cool; thanks.
ASKER
Didn't expect it to be possible, but just thought I'd check.
I'll try experimenting with the shell output redirection; hopefully that'll have a smaller footprint.
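For the record, one way to push the file creation out to the shell is `split(1)`, which turns a single output stream into numbered files; the `python3 -c` one-liner below is a stand-in for the real generator script, and `-d`/`-a` are GNU coreutils options:

```shell
# Sketch: let split(1) from GNU coreutils create the numbered files
# from one stream. The python3 -c generator is a stand-in for the
# real script; -d gives numeric suffixes, -a 6 gives 6-digit width.
python3 -c 'print("some text\n" * 9999 + "some text")' \
  | split -l 1 -a 6 -d - file_
```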