anoyes asked:

Setting up distributed PHP application

I have an existing PHP application that has run quite nicely for a while on just a single web/db server.  It's gotten to the point where it needs to be broken up into servers for each service (Apache, MySQL, and batch processing).  It's the batch processing that's stumped me the most.  We're currently using the CodeIgniter framework for the app, so it'd be preferable to continue to use this for the batch processing (if possible), but this batch processing needs to take place on a separate server from where the front-end web application lives and runs.

I know this is kind of a broad topic, but can anyone offer any pointers?  Should I be storing the code on one server and mounting it from the others to access the scripts?  Should I just keep separate copies of the code on both the batch and web servers?

The more insight or specific examples or use cases the better.  I'll gladly provide more specific info if you tell me what you need to know.  Thanks in advance!
ASKER CERTIFIED SOLUTION
Richard Quadling (United Kingdom)

[The accepted solution is only available to Experts Exchange members.]
anoyes (ASKER):
Thanks for your comments.  A couple of questions:

1) Is the "small PHP command line app" that you use to sync files something that's home grown, or is it some open-source tool that you're using?  When you deploy do you just deploy to one server and then it takes care of the replication?

2) Are the tasks that you'll be branching off to the new server scheduled tasks, or are they monitoring a queue of some sort (like sending emails), or both?  If they're monitoring a queue, how are you queuing tasks? Reading / writing to the DB? In-memory?

I think I had another question but I can't think of it at the moment...
Richard Quadling:

1) It is home-grown. The script allows me to examine the differences and then decide what to commit to all the other servers. I could go down the version-control-server route, but I haven't, as I manage quite well as I am and I can easily show others what the situation is if required.

2) The new tasks system will allow me to create mini-processes. Say, take a file from a location, operate on it, log the results in a DB and then perform the destructive part of the process (the point-of-no-return sort of thing).

The processes are handling faxes and scans from multiple sources. Each of the mini-steps will perform a single action on the file. Each mini-step will allow for failure and will have only a single destructive call - normally a file rename - which can either succeed completely or fail having done nothing. That way the filenames and locations indicate each file's state, so files can easily be recovered.
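Something along these lines - untested, and the directory layout and the ocr()/logToDb() helpers are made up for illustration:

<?php
// Sketch of one mini-step: all work before the rename is
// non-destructive, so a crash leaves the file in its previous state.
function miniStepOcr($file)
{
    $inbox = 'C:/faxes/02_deskewed/';   // hypothetical stage folders
    $done  = 'C:/faxes/03_ocr_done/';

    $text = ocr($inbox . $file);        // hypothetical: operate on the file
    logToDb($file, strlen($text));      // hypothetical: log the result

    // The single destructive call. rename() either succeeds, and the
    // file's new location records its new state, or it fails and
    // nothing has changed - easy to recover either way.
    if (!rename($inbox . $file, $done . $file)) {
        throw new RuntimeException("Could not commit $file");
    }
}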

I'm in the process of learning to compile the win32service extension for 5.3. This will allow me to have a PHP-based service manager for my scripts. I'll have a pool class to allow me to have multiple instances of a single process (some processes take a LOT longer to run and hold up the queue - by having multiple "threads", I can process more files in this particular mini-step).
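A rough sketch of the pool idea (not my actual class - the worker command is a placeholder):

<?php
// Keep up to $size copies of a worker script alive for one mini-step.
class Pool
{
    private $cmd;
    private $size;
    private $procs = array();

    public function __construct($cmd, $size)
    {
        $this->cmd  = $cmd;    // e.g. 'php ocr_step.php'
        $this->size = $size;
    }

    public function tick()
    {
        // Reap any workers that have finished.
        foreach ($this->procs as $i => $proc) {
            $status = proc_get_status($proc);
            if (!$status['running']) {
                proc_close($proc);
                unset($this->procs[$i]);
            }
        }

        // Top the pool back up. No pipes are requested, because the
        // blocking pipe reads are what ruled pipes out (see below).
        while (count($this->procs) < $this->size) {
            $proc = proc_open($this->cmd, array(), $pipes);
            if (!is_resource($proc)) {
                break;  // couldn't spawn; try again on the next tick
            }
            $this->procs[] = $proc;
        }
    }
}

// Manager loop: keep four instances of a slow mini-step running.
$pool = new Pool('php ocr_step.php', 4);
while (true) {
    $pool->tick();
    sleep(1);
}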

I initially wanted to use proc_open() and communicate using pipes, but the win32 version of PHP currently does not support non-blocking file access, so I'm having to use either a DB or file I/O for thread/pool/manager communication.

I've seen how other scripting languages have dealt with the non-blocking problem (there seem to be two ways - a looping-thread approach and using an alternative file I/O library). Integrating either of these is currently outside my skill set.
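One possible shape for the file-I/O fallback - paths are illustrative only. Each worker owns a status file that the manager polls; writing to a temp file and then renaming it means the manager never reads a half-written status:

<?php
function writeStatus($workerId, array $status)
{
    $dir  = 'C:/pool/status/';
    $tmp  = $dir . $workerId . '.tmp';
    $live = $dir . $workerId . '.txt';

    file_put_contents($tmp, serialize($status));
    @unlink($live);        // rename() won't reliably overwrite on Windows
    rename($tmp, $live);   // the single "commit", as with the mini-steps
}

function readStatuses()
{
    $all = array();
    foreach ((array) glob('C:/pool/status/*.txt') as $file) {
        $all[basename($file, '.txt')] = unserialize(file_get_contents($file));
    }
    return $all;   // the manager calls this on every tick
}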

anoyes (ASKER):
Sorry for taking so long to get back to you.  Would it be possible to take a look at that script that does the file syncs for you?
Richard Quadling:

The syncs are based upon having a $ID tag in a comment (just like a CVS comment). These are updated when the file is used (post-use, via an auto_prepend script).

I extract the $ID tag and compare it against the OS last-modified datetime: if the tag matches, the file is "good"; if not, it is "bad". I can then compare version numbers and datetimes across the multiple machines to determine the latest version. When I need to manually resolve the code from different machines, I pass it through my editor and then push it to the machines that need it.

It is VERY much tuned to our setup and is a little ropey ('cause it is a short-term solution).
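The core of it is roughly this shape, though - untested sketch, and the tag format below is invented for the example, not the real one:

<?php
// Each file carries a comment like:
//   // $ID: somefile.php 123 2009-11-05 14:02:00 $
function checkFile($path)
{
    $src = file_get_contents($path);

    if (!preg_match('/\$ID: \S+ (\d+) ([\d-]+ [\d:]+) \$/', $src, $m)) {
        return 'bad';                 // no tag at all
    }

    $tagTime = strtotime($m[2]);      // datetime recorded in the tag
    $osTime  = filemtime($path);      // what the OS says

    // If the file changed after the tag was written, the tag is stale.
    return ($osTime <= $tagTime) ? 'good' : 'bad';
}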


If you absolutely want a single source, then use a VCS (CVS, SVN, Git, etc.) and have a simple push to all the machines when you want to. Ultimately, that's what I'll be doing, but I've got to get all the sources synced first.
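The push itself can be as dumb as a loop - server names, paths, and the availability of svn/rsync here are all placeholder assumptions:

<?php
$servers = array('web1', 'web2', 'batch1');

// Export a clean copy (no .svn folders) to a staging directory.
exec('svn export --force http://repo/app/trunk /tmp/app-release');

// Mirror the staging copy out to every machine.
foreach ($servers as $server) {
    exec("rsync -az --delete /tmp/app-release/ {$server}:/var/www/app/");
}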
anoyes (ASKER):
Thanks for the input - good stuff, and plenty to think about and tinker with.