Problems with CFFILE APPEND

I select about 1000 records, create a file, and append each record to it.

In brief:

1) select 1000 records
2) create file.csv
3) loop query
4) for each record ==> append a line to the file.csv and do some simple calculations
5) output the result of calculation

Sometimes the script runs and runs and never stops. Watching file.csv, I notice that it grows normally, but sometimes the file size decreases and then starts growing again.

I have also noticed that sometimes the script stops running and returns its results, but file.csv continues to grow for tens of minutes.

If I skip building file.csv, the script runs really fast and gives me the results almost immediately.

Do you have any clue what the problem could be?

_agx_

At a guess: are you writing to the same file from multiple threads? Can you post some code? In particular, the sections that "create" and "append to" the file.

As an aside, it's often more efficient to build a string containing the content and do a single write/append, rather than doing thousands of appends.
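
A minimal sketch of that idea, not the asker's actual code. The query `invoices`, its columns `invoiceNo` and `amount`, and the `;` separator are all hypothetical, and `getTempDirectory()` merely stands in for the thread's `directory` variable. Every line is collected in an array and the file is written once at the end:

```cfml
<!--- Demo data: a hypothetical two-row query standing in for the real SELECT --->
<cfset invoices = queryNew("invoiceNo,amount", "varchar,decimal")>
<cfset queryAddRow(invoices, 2)>
<cfset querySetCell(invoices, "invoiceNo", "A-001", 1)>
<cfset querySetCell(invoices, "amount", 100.50, 1)>
<cfset querySetCell(invoices, "invoiceNo", "A-002", 2)>
<cfset querySetCell(invoices, "amount", 75.00, 2)>

<!--- Assumption: "directory" ends with a path separator, as in the thread --->
<cfset directory = getTempDirectory()>
<cfset crlf = chr(13) & chr(10)>

<!--- Collect every line in memory instead of touching the disk per line --->
<cfset lines = []>
<cfloop query="invoices">
    <cfset arrayAppend(lines, "#invoices.invoiceNo#;#invoices.amount#")>
</cfloop>

<!--- One write for the whole file --->
<cffile action="write" file="#directory#file.csv" output="#arrayToList(lines, crlf)#" addnewline="yes">
```

This only cuts the number of file operations from thousands to one. It still writes to a fixed file name, so by itself it does not protect against two requests writing the same file at once.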

Giambattista

ASKER

@_agx_

I'm afraid you guessed right. I simplified my explanation, but yes, I create 5 different strings and append them to the same file. I need to append 5 lines for each record.

If this is the problem, how could I avoid it? Maybe using CFLOCK around each append?

_agx_

Appending is just less efficient. But it shouldn't cause this kind of problem unless maybe another thread is trying to write to the same file at the same time. Is your script called by multiple threads? And is it writing to a constant file name or a unique file name?

Giambattista

ASKER

@_agx_

This is the script, very simplified:

<loop query>

calculate string1
append string1 to #directory#file.csv
calculate string2
append string2 to #directory#file.csv
calculate string3
append string3 to #directory#file.csv
calculate string4
append string4 to #directory#file.csv
calculate string5
append string5 to #directory#file.csv

</loop query>

Sorry, I can't tell whether the script is called by multiple threads, or whether it writes to a constant file name or a unique one.

_agx_

What's the purpose of the script in plain English?

Can you at least post the <cffile> code that does what you're calling a "create" and one that does an "append"? I understand what you're doing, but it's how you're doing it in code that will tell us what's causing the problem.

Giambattista

ASKER

The code that builds the strings is very long, but I can assure you it usually works fine.
I'm 100% sure the error is not there.

This is the append:

<cffile action = "append" file = "#directory#file.csv" output = "#string1#" addnewline="no">

_agx_

Do you ever do a "write" (not an append) to the file?

> This is the append:

OK, so you're using a hard-coded file name. That answers one question.

How is the script used, i.e. what's its purpose?

Giambattista

ASKER

Yes, of course. I initialize the file once at the start by creating it with a write:

<cffile action = "write" file = "#directory#file.csv" output = "" addnewline="no">

The purpose of the script is to create a metafile containing a list of invoices, so that my accountant can import it into his system every month.

_agx_

Then it's probably what I suspected in the beginning: multiple threads writing to the same file.

> file = "#directory#file.csv"

If two users try to run this script at the same time (or a single user runs the script and refreshes before it has finished), you'll run into something called a race condition. That's when multiple threads race to modify the same resource (i.e. a file) at the same time and end up with corrupted results:

- user A kicks off the process and overwrites "file.csv" with an empty string
- user A's process appends 5 lines to the file
- user B kicks off the process and overwrites "file.csv". This erases all data added by user A
- user B's process appends 3 lines to the file
- user A's process (unaware of what's happened) appends another 10 lines to the file
- user C kicks off the process and overwrites "file.csv", erasing user A's and user B's data
- user A's process appends another 15 lines
- user C's process appends 5 lines to the file
- user B's process appends 8 lines to the file

Because all 3 threads are fighting to modify the same file, the final result is a jumbled mix of partial data from all 3 threads.

The best solution is to use a unique file name for each thread, so they will not conflict. I wouldn't use cflock unless there's a good reason you must use the same file name for all threads.
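
As a rough sketch of the unique-name approach (the `invoices_` prefix and the `uniqueFile` variable are made-up names, and `getTempDirectory()` stands in for your `directory` variable), each request could build its own file name with `createUUID()` and use it for both the initial write and every append:

```cfml
<!--- Assumption: "directory" ends with a path separator, as in the thread --->
<cfset directory = getTempDirectory()>

<!--- Hypothetical naming scheme: one file per request, so requests never share a file --->
<cfset uniqueFile = directory & "invoices_" & createUUID() & ".csv">

<!--- Same write/append pattern as before, but against the per-request file --->
<cffile action="write" file="#uniqueFile#" output="" addnewline="no">
<cffile action="append" file="#uniqueFile#" output="first line" addnewline="yes">
<cffile action="append" file="#uniqueFile#" output="second line" addnewline="yes">
```

Because no two requests share a path, one request's "write" can no longer wipe out another request's appends. If the fixed name really were required, the alternative would be a named exclusive cflock around the whole write-and-append sequence, not around each individual append.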

ASKER CERTIFIED SOLUTION

_agx_

Giambattista

ASKER

That was a smart workaround for my stupid solution! ;-)

Thank you!

_agx_

Welcome :)