weklica asked:

Insert Log file contents to a database

I have a large .log file. I want to add each new line (everything written since the last processing run) to a simple database. The database can have just a field or two; I will structure it however best suits the project.

Here is a sample of the file:
05/03/11 16:49:51 U I - Ambassador- cmd=38238
05/03/11 16:49:52 U I - Ambassador- cmd=38017
05/03/11 16:49:57 U I - Ambassador- cmd=38013
05/03/11 16:49:57 S I - Ambassador- before exam display: DropRule = Memory Load(75 to 80), Tuning(+-0); ImageMem = 1055.5 MB; PhysicalMem = 1,276,993,536/3,220,701,184; Pagefile = 5,805,072,384/7,385,329,664 Virtual Avail =1,219,174,400/2,621,308,928 Memory load = 60
05/03/11 16:50:07 U I - Ambassador- cmd=38238
05/03/11 16:50:08 U I - Ambassador- cmd=38026
05/03/11 16:50:10 U I - Ambassador- cmd=41920
05/03/11 16:50:12 U I - Ambassador- cmd=38264
05/03/11 16:50:16 U I - Ambassador- cmd=38223
05/03/11 16:50:26 U I - Ambassador- cmd=38223
05/03/11 16:50:28 U I - Ambassador- cmd=41903
05/03/11 16:50:28 U I - Ambassador- cmd=38223
05/03/11 16:52:26 U I - Ambassador- cmd=38238
05/03/11 16:52:27 U I - Ambassador- cmd=38152
05/03/11 16:52:32 S I - (Begin) Save Image Edits.




============== Now, there is much more to the log file, but each new record begins with a timestamp. So, how can I have a script loop through the file and insert each line as a record in the database? It would be nice if it inserted "PROCESSED" or something similar on its own new line in the log file and, on the next cycle, picked up from the last "PROCESSED" marker.

This would allow me to loop the script and process only the new records.
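A minimal sketch of the "each new record begins with a timestamp" rule, in Python rather than VBScript. The regular expression is an assumption based only on the sample lines above, and `iter_records` is a hypothetical helper name. Lines that do not start with a timestamp are treated as continuations of the previous record.

```python
import re

# Assumed timestamp layout, taken from the sample: "05/03/11 16:49:51 "
TIMESTAMP = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")


def iter_records(lines):
    """Yield one string per record; a record starts at a timestamped line."""
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if TIMESTAMP.match(line):
            # A new timestamp closes the record collected so far.
            if current is not None:
                yield current
            current = line
        elif current is not None and line.strip():
            # No leading timestamp: continuation of the previous record.
            current += "\n" + line
    if current is not None:
        yield current
```

Anything that appears before the first timestamped line is skipped, which may or may not be what you want for a log that has been trimmed mid-record.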
VB Script, Microsoft DOS

Last Comment
weklica

8/22/2022 - Mon
ASKER CERTIFIED SOLUTION
franked_it

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
weklica

ASKER
The log file I sent was a small snippet. The only thing consistent about a new line is that it starts with a timestamp. There are tons of other variations of data after the timestamp, but some contain information that I can't paste.

Java would work too. I was just assuming those would be the easiest options.

I can't really move the log file out, as it is constantly being written to, so I would probably rather just stick in a new line so I know where to start off from.
franked_it

So does the file just grow and grow?  Until when?  I'm assuming it has to stop at some point and delete old data or roll to a new file?

I understand about not being able to post some information.  It's more helpful in a DB environment to have the data in fields instead of a text field as you can query it easier later.  Having all the text in a single field turns it into partially unstructured data which is harder to manage and work with.
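As a rough illustration of splitting a line into fields, here is a Python sketch that stores the result in SQLite. The field names (`code1`, `code2`, `message`), the table name `log_entries`, and the regular expression are all assumptions inferred from the posted sample; the thread does not say what the one-letter codes mean, and the timestamp is kept as text because the sample does not show whether it is month-first or day-first.

```python
import re
import sqlite3

# Assumed layout from the sample: "<timestamp> <code> <code> - <message>"
LINE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<code1>\S+) (?P<code2>\S+) - (?P<message>.*)$",
    re.DOTALL,
)


def parse_line(line):
    """Return (ts, code1, code2, message), or None if the line does not fit."""
    m = LINE.match(line)
    if m is None:
        return None
    return (m["ts"], m["code1"], m["code2"], m["message"])


def store(conn, lines):
    """Insert every parsable line into log_entries; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log_entries ("
        "ts TEXT, code1 TEXT, code2 TEXT, message TEXT)"
    )
    rows = [r for r in map(parse_line, lines) if r is not None]
    conn.executemany("INSERT INTO log_entries VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

With separate columns, a query such as `SELECT * FROM log_entries WHERE code1 = 'S'` becomes possible without pattern-matching on one big text field.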

You'll want to consider performance and efficiency when figuring out where to start in the file.  If you have to test every line with an "If" statement to determine whether you have already processed it, that may waste compute cycles and make the program run longer than it needs to.  You may be able to record the last line number you processed, and then jump straight to that line on the next run.
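One way to "jump straight to" the unprocessed part is to remember a byte offset in a small side file instead of a line number. The Python sketch below is only that, a sketch: `read_new_lines` and the state-file path are hypothetical names, and it assumes the log only grows by appending between runs. If the file is ever shorter than the saved offset, it starts again from the beginning. Whether the log can be opened for reading while the other program is writing depends on how that program opens it.

```python
import os


def read_new_lines(log_path, state_path):
    """Return complete lines appended since the last call; never writes to the log."""
    try:
        with open(state_path, "r", encoding="ascii") as f:
            offset = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0  # first run, or unreadable state file

    # If the log is now shorter than the saved offset, start over.
    if os.path.getsize(log_path) < offset:
        offset = 0

    lines = []
    with open(log_path, "rb") as log:
        log.seek(offset)
        for raw in log:
            if not raw.endswith(b"\n"):
                break  # last line is still being written; pick it up next run
            offset += len(raw)
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))

    # Save the new position in the side file, leaving the log untouched.
    with open(state_path, "w", encoding="ascii") as f:
        f.write(str(offset))
    return lines
```

Because the position lives in a separate file, the log itself stays exactly as the logging program wrote it.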

I get nervous adding a comment or extra line to the log file for two reasons:
1 - It means the log data is no longer the original data.  If something goofs up, you may end up with corrupted source data.
2 - Two programs usually cannot both have the log file open for writing at the same time.  So if this is being logged by another program, normally you would not be able to start a new program/script and write to the same file at the same time.  Even if it almost always works, at some point the two processes may try to write at the same moment, and you may run into unexpected results.
SOLUTION
weklica

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
franked_it

Sounds good. 5 MB in size means what... about 70,000 lines or so?  You can probably afford to process the whole file each time the script/program runs then, right?

My only concern with changing that file is, again, what happens when two processes try to open the file for writing at the same time.  You might be able to store a hash of each line as part of your DB.  So you'd have the timestamp, log line, and hash.  That way you can check whether any given line already exists: see if there is a matching timestamp and, if so, see if the hash is the same.  Seems like that may drive up CPU, but maybe not too badly.
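The timestamp-plus-hash check could look roughly like the following Python sketch, which lets SQLite do the "already exists?" test through a `UNIQUE` constraint and `INSERT OR IGNORE`. The table name `log_lines`, the helper names, and the choice of SHA-256 are assumptions for illustration, not something specified in this thread.

```python
import hashlib
import sqlite3


def open_db(path):
    """Open the database and make sure the (hypothetical) table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log_lines ("
        "ts TEXT NOT NULL, line TEXT NOT NULL, line_hash TEXT NOT NULL, "
        "UNIQUE (ts, line_hash))"
    )
    return conn


def insert_new(conn, lines):
    """Insert lines not yet stored; return how many were actually added."""
    before = conn.total_changes
    for line in lines:
        ts = line[:17]  # the timestamp in the sample is 17 characters long
        digest = hashlib.sha256(line.encode("utf-8")).hexdigest()
        # Rows whose (ts, line_hash) pair already exists are silently skipped.
        conn.execute(
            "INSERT OR IGNORE INTO log_lines (ts, line, line_hash) "
            "VALUES (?, ?, ?)",
            (ts, line, digest),
        )
    conn.commit()
    return conn.total_changes - before
```

With this in place the whole file can be fed in on every run and only unseen lines are added. One caveat: two genuinely identical lines logged within the same second would be stored once.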

Since the file purges itself to stay near 5 MB, I'm guessing you can't rely on line number.  For example you can't read up to line 5,432 then pick up at that line number at the next run because the lines could have all shifted due to the purging mechanism.
franked_it

I did provide suggestions for each question that was brought up and ended up not getting a final answer from the requester.  I didn't provide working code, but rather suggestions and a possible execution flow that could be used to help write the script.  If there is any more help I can offer or specific needs, please feel free to let me know.  Thanks!
weklica

ASKER
Appreciate the help. Sorry for the delayed response.  I knew what needed to be done from a workflow standpoint, but was struggling with the code.  I just hired it out.  Thanks