Giambattista asked:

Problems with CFFILE APPEND

I run a SELECT over about 1000 records, create a file, and append each record to that file.

In brief:

1) select 1000 records
2) create file.csv
3) loop query
4) for each record ==> append a line to file.csv and do some simple calculations
5) output the result of the calculations

Sometimes the script runs and runs and never stops. Watching file.csv, I see it grow normally, but sometimes the file size decreases and then starts growing again.

I have also noticed that sometimes the script stops running and gives me the results, but file.csv continues to grow for tens of minutes ...

If I skip building file.csv, the script runs really fast and gives me the results almost immediately.

Do you have any clue what the problem could be?
_agx_

At a guess - are you writing to the same file from multiple threads? Can you post some code? In particular, the section that "creates" and "appends to" the file.

As an aside, it's often more efficient to build a string containing the content and do a single write/append, rather than doing thousands of appends.
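
A minimal sketch of that single-write idea, under stated assumptions: the query name `invoices`, the columns `invoice_no` and `amount`, and the `directory` variable are placeholders, not names taken from the actual script.

```cfml
<!--- Assumptions: "invoices" is the already-selected query, "invoice_no" and
      "amount" are hypothetical columns, and "directory" ends with a path separator. --->
<cfset lines = []>
<cfloop query="invoices">
    <!--- Collect each line in memory instead of appending it to the file. --->
    <cfset arrayAppend(lines, "#invoices.invoice_no#,#invoices.amount#")>
</cfloop>
<!--- Join the lines with a line feed and write the file in one operation. --->
<cffile action="write" file="#directory#file.csv" output="#arrayToList(lines, chr(10))#" addnewline="yes">
```

With roughly 1000 records this keeps the whole file in memory, which should be fine at that size, and replaces thousands of file operations with one.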
Giambattista (ASKER)

@_agx_

I'm afraid you guessed right. I simplified my explanation, but yes, I create 5 different strings and append them to the same file. I need to append 5 lines for each record.

If this is the problem, how could I avoid it? Maybe using CFLOCK around each append?

_agx_

Appending is just less efficient, but it shouldn't cause this kind of problem unless another thread is trying to write to the same file at the same time. Is your script called by multiple threads? And is it writing to a constant file name or a unique file name?

Giambattista (ASKER)

@_agx_

This is the script, very simplified:

<loop query>

calculate string1
append string1 to #directory#file.csv
calculate string2
append string2 to #directory#file.csv
calculate string3
append string3 to #directory#file.csv
calculate string4
append string4 to #directory#file.csv
calculate string5
append string5 to #directory#file.csv

</loop query>

Sorry, I can't tell whether the script is called by multiple threads, or whether it writes to a constant file name or a unique one.

_agx_

What's the purpose of the script in plain English?

Can you at least post the <cffile> code that does what you're calling a "create" and one that does an "append"? I understand what you're doing, but it's how you're doing it in code that will tell us what's causing the problem.

Giambattista (ASKER)

The code that builds the strings is very long, but I can assure you it usually works fine.
I'm 100% sure the error is not there.

This is the append:

<cffile action = "append" file = "#directory#file.csv" output = "#string1#" addnewline="no">

_agx_

Do you ever do a "write" (not an append) to the file?

> This is the append:

OK, so you're using a hard-coded file name. That answers one question.

How is the script used, i.e. what's its purpose?

Giambattista (ASKER)

Yes, of course. I initialize the file once at the start by creating it with a write:

<cffile action = "write" file = "#directory#file.csv" output = "" addnewline="no">

The purpose of the script is to create a file with a list of invoices in a metafile, so my accountant can import it into his system every month.

_agx_

Then it's probably what I suspected in the beginning: multiple threads writing to the same file.

> file = "#directory#file.csv"

If two users try to run this script at the same time (or a single user runs the script and refreshes the page before it has finished), you'll run into something called a race condition. That is when multiple threads race to modify the same resource (i.e. a file) at the same time and end up with corrupted results:

- user A kicks off the process and overwrites "file.csv" with an empty string
- user A's process appends 5 lines to the file
- user B kicks off the process and overwrites "file.csv", erasing all data added by user A
- user B's process appends 3 lines to the file
- user A's process (unaware of what's happened) appends another 10 lines to the file
- user C kicks off the process and overwrites "file.csv", erasing user A's and user B's data
- user A's process appends another 15 lines
- user C's process appends 5 lines to the file
- user B's process appends 8 lines to the file

Because all 3 threads are fighting to modify the same file, the final result is a jumbled mix of partial data from all 3 threads. It also explains why the file sometimes shrinks and then grows again: every new run truncates it with the initial "write".

The best solution is to use a unique file name for each thread, so they cannot conflict. I wouldn't use cflock unless there's a good reason why all threads must use the same file name.
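
As a rough sketch of the unique-file-name approach, assuming the same `directory` and `string1` variables as in the snippets above (the `uniqueFile` variable and the "invoices_" prefix are made up for illustration):

```cfml
<!--- createUUID() gives each request its own file name, so concurrent runs
      never touch the same file. "uniqueFile" is a hypothetical variable name. --->
<cfset uniqueFile = directory & "invoices_" & createUUID() & ".csv">

<!--- Initialize this request's own file once. --->
<cffile action="write" file="#uniqueFile#" output="" addnewline="no">

<!--- Inside the query loop, append to the per-request file instead of file.csv. --->
<cffile action="append" file="#uniqueFile#" output="#string1#" addnewline="no">
```

If the file name really must stay fixed, a named exclusive lock is one possible fallback. Note that it would have to wrap the initial "write" and the whole loop, not each individual append, otherwise another request could still truncate the file mid-run. The lock name and timeout below are arbitrary choices:

```cfml
<!--- Only one request at a time may hold the lock named "invoiceExportFile". --->
<cflock name="invoiceExportFile" type="exclusive" timeout="60">
    <cffile action="write" file="#directory#file.csv" output="" addnewline="no">
    <!--- ...query loop with all the appends goes here... --->
</cflock>
```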
ASKER CERTIFIED SOLUTION
_agx_

(This solution is only available to members of Experts Exchange.)

Giambattista (ASKER)

That was a smart workaround for my stupid solution! ;-)

Thank you!

_agx_

Welcome :)