We help IT Professionals succeed at work.

PHP / MYSQL restore broken link script in table...using file name with appended ID

m-higgins
m-higgins used Ask the Experts™
on
Hello Experts...
I have a directory full of CV's and cover letters that carry regular expressions and an ApplicantID
eg:
CoveringLetterforVacancyRef:005_CoveringLetter_4_6663.doc
CV_CV_4_6074.doc
The file extention may also be .docx or .pdf

I also have a table in my database with the Applicants and their ID's that match.

My problem is (I didn't do a backup before performing a Structural sysnc on phpMyAdmin)  I have lost just under 3K rows from another table that linked the ApplicantID to the filename located in "documents/applicant/cv"

Does anyone have any sugestions how I can fix this?
Here is the table info:
Table1: applications
ID
CandidateID
DateAdded
EncryptID


Table2: docs
ID
ApplicantID
Description
FileName
DateAdded

I'm thinking:
 - 1st find the missing info between the two tables on 'CandidateID' AS 'ApplicantID'.

 - Then use the PHP function if_file_exist ../documents/applicant/cv/"CV-{ApplicantID}.doc" (or .docx, or .pdf etc)

 - If it DOES, then insert a new row in the (docs) table:
copy "ApplicantID", "DateAdded", "EncryptID" from the (applications) table a pre-defind string for the "Description"
then add the filename in the "FileName" column thus restoring the link and we can download the documents again...

 - And if there's any left maybe display a list??

I'm desperate right now as I seem to be fumbling along with no real results....

Can anyone please help
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®

Commented:
So data got from deleted from docs, but applications still has the original data? If so, this query should show you all records in "applications" that don't have a corresponding row in "docs":

SELECT * FROM applications t1 WHERE t1.CandidateID NOT IN (SELECT ApplicantID FROM docs)

So start by running the below script. It should be a dry run. It won't change anything in the docs or the database - it will just check to make sure it is possible to do the linking the way that has been suggested in your question and log the results to a file called fixlog.txt. This way you can be certain we're on the right track or if we need to adjust the script first.

Also need you to verify all the possible filename conventions used. You used both dashes - and underscores _ in your description of how the files are named, so we need to be certain of the potential results. The log should also contain a list of files that it was not able to figure out, but it's currently checking for the candidate ID at the end of the filename after an underscore and before the extension, like this: *_[CANDIDATE ID].[extension].

<?php

// Start logging
file_put_contents("fixlog.txt",date("c") . " :: Starting log...\n");

// Define dir holding the docs and change into it
$docDir = "../documents/applicant/cv/";
if(file_exists($docDir)) { chdir($docDir); } else { die("That doc dir doesn't exist! The script is currently running from " . getcwd()); }

// Scan for PDFs and DOC(X)s
file_put_contents("fixlog.txt",date("c") . " :: Scanning for files...\n",FILE_APPEND);
$cvFiles = array();
$dh = opendir(".");
$counter = 0;
while(($file = readdir($dh)) !== false)
{
	if(($file == ".") || ($file == "..")) { continue; }

	if(preg_match("/_([0-9]+).(doc|pdf|docx)/",$file,$matches))
	{
                $counter++;
		$cvFiles[$matches[1]][] = $file;
	}
        else
        {
		file_put_contents("fixlog.txt",date("c") . " :: Unmatched filename: {$file}\n",FILE_APPEND);
        }
}
closedir($dh);
file_put_contents("fixlog.txt",date("c") . " :: {$counter} files found across " . count($cvFiles) . " candidates...\n",FILE_APPEND);

// Connect to DB 
file_put_contents("fixlog.txt",date("c") . " :: Connecting to the database...\n",FILE_APPEND);
mysql_connect("hostname","user","pass");
mysql_select_db("the database name");

// Find rows in applications that are missing corresponding rows from docs
file_put_contents("fixlog.txt",date("c") . " :: Running missing docs query...\n",FILE_APPEND);
$rs = mysql_query("SELECT * FROM applications t1 WHERE t1.CandidateID NOT IN (SELECT ApplicantID FROM docs)");
while($row = mysql_fetch_assoc($rs))
{
  extract($row);
  $logEntry = "Candidate {$CandidateID} has " . (isset($cvFiles[$CandidateID]) ? count($cvFiles[$CandidateID]) : "0") . " matching files.";
   
  file_put_contents("fixlog.txt",date("c") . " :: {$logEntry}\n",FILE_APPEND);
}

?>

Open in new window

Commented:
You might be sleeping now, so I'll finish out the second part of the solution in case I'm not around later.

In the database loop, you just need to construct queries like this (basically replace the old database loop with the one below):

while($row = mysql_fetch_assoc($rs))
{
  extract($row);
  if( isset($cvFiles[$CandidateID]) )
  {
    file_put_contents("fixlog.txt",date("c") . " :: Candidate {$CandidateID} has " . count($cvFiles[$CandidateID]) . " matching files.\n",FILE_APPEND);
    foreach($cvFiles[$CandidateID] as $filename)
    {
       file_put_contents("fixlog.txt",date("c") . " :: Linking {$filename} to Candidate {$CandidateID}.\n",FILE_APPEND);
       mysql_query("INSERT INTO docs (ApplicantID,Description,Filename,DateAdded) VALUES ($CandidateID,'predefined description','{$filename}','{$DateAdded}')");
    }
  }
  else
  {
    file_put_contents("fixlog.txt",date("c") . " :: Candidate {$CandidateID} has no matching files.\n",FILE_APPEND);
  }    
}

Open in new window


However, do NOT use this code until you've run the original script first and looked at fixlog.txt to make sure that it is going to do what you want it to do!!! You don't want to screw up the database further (and make a backup of the current state while you're at it!)

Commented:
Checking to see if you've had a chance to try this out...

Author

Commented:
Excellent gr8gonzo! sorry for the late reply....but thanks to you and your help I have managed to put something together that has sorted my situation.

Thank you again.

The fixlog was a nice touch! defo marking this one for another time...