server failure recovery

I am trying to set up a server failure recovery strategy.

How does it sound?

1. I will use multiple A records so that the same domain points to 2 different servers (see the discussion here: http://webmasters.stackexchange.com/questions/10927/using-multiple-a-records-for-my-domain-do-web-browsers-ever-try-more-than-one)
2. Under normal conditions, server B forwards its traffic to server A programmatically, to avoid any problems with multiple A records.
3. Server B checks whether server A is running properly. If B sees that A is not running, it stops forwarding the traffic.
4. Server B also starts requesting a page on server A that stops it, to avoid temporary recoveries (e.g. every 5 seconds server B requests the page IP_of_server_A/stoprunning.php).
5. Thanks to the multiple A records, the browser will send its requests to B when A is not running.
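
Steps 2 to 4 amount to a one-way switch on server B: forward while A looks healthy, and once A is judged down, stay failed over until someone resets it by hand. A minimal Python sketch of that decision logic follows. The class name `FailoverGate`, the injected `probe_a` callable (for example an HTTP request to a status page on A) and the `max_failures` threshold are all assumptions made for illustration; the thread does not specify how many failed checks count as "down".

```python
from typing import Callable


class FailoverGate:
    """Decides whether server B forwards requests to A or serves them itself."""

    def __init__(self, probe_a: Callable[[], bool], max_failures: int = 3) -> None:
        self.probe_a = probe_a            # hypothetical health check for server A
        self.max_failures = max_failures  # assumed threshold, not from the thread
        self.consecutive_failures = 0
        self.failed_over = False          # latched: stays True until reset()

    def check(self) -> None:
        """Run one health check against A and update the latch."""
        if self.failed_over:
            return  # never fall back to A automatically
        try:
            healthy = self.probe_a()
        except Exception:
            healthy = False  # a probe that raises counts as a failed check
        if healthy:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_failures:
            self.failed_over = True

    def target(self) -> str:
        """Return "A" while forwarding is active, "B" after failover."""
        return "B" if self.failed_over else "A"

    def reset(self) -> None:
        """Manual step, to be called once A is stable and back in sync."""
        self.consecutive_failures = 0
        self.failed_over = False
```

In this sketch `check()` would run on a timer, and the point where `failed_over` becomes true is where B would begin requesting the stop page on A described in step 4.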

Qlemo

Why do you want to prevent fallback to A? You are not concerned about sync, I suppose, as B isn't in sync anyway.
Other than that, it seems to be a working concept.

myyis

ASKER

There are continuous DB insert transactions. I will use one-way sync, which means A will write to B continuously. Therefore, when A is down, I do not want anybody to insert data into A. When A is online and stable again, I will write the new DB records to A.

How does that sound?

gheist

Where is the DB insert uniqueness handled? In PHP or in database?

myyis

ASKER

Sorry, I don't understand what you mean by DB insert uniqueness. Can you give an example?

gheist

How do you know that only one server can be up and running?
Normally you can have 2 or 200 servers inserting new records into a database (how do you think Facebook works?)

myyis

ASKER

1. Under normal conditions both servers are up (multiple A records), and all requests to the backup server (B) are forwarded to the main server (A). So only A receives requests.
2. Every insert query at A is also triggered at B.
3. The inserted ID is always checked after the transaction at B; it should be equal to the inserted ID of the related transaction at A.
Therefore both servers always have the same records.
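
The insert-then-compare-IDs idea can be sketched as follows. This is only an illustration under stated assumptions: it uses two in-memory SQLite databases to stand in for the databases on A and B, and the `orders` table and the function name `mirrored_insert` are hypothetical.

```python
import sqlite3

# Hypothetical table used on both servers for this sketch.
SCHEMA = (
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT NOT NULL)"
)


def mirrored_insert(db_a: sqlite3.Connection, db_b: sqlite3.Connection, item: str) -> int:
    """Insert on A, repeat the insert on B, and verify both generated the same ID."""
    cur_a = db_a.execute("INSERT INTO orders (item) VALUES (?)", (item,))
    cur_b = db_b.execute("INSERT INTO orders (item) VALUES (?)", (item,))
    if cur_a.lastrowid != cur_b.lastrowid:
        # The two databases have drifted apart: undo both inserts and report it.
        db_a.rollback()
        db_b.rollback()
        raise RuntimeError(
            f"ID mismatch: A produced {cur_a.lastrowid}, B produced {cur_b.lastrowid}"
        )
    db_a.commit()
    db_b.commit()
    return cur_a.lastrowid
```

Note that the two commits are separate operations, so a crash between them can still leave A and B different; the ID comparison detects drift on the next insert but does not prevent it.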

When server A is down:
1. It is taken down permanently to avoid temporary recoveries (explained above).
2. Forwarding from B to A is stopped and the requests are accepted at B.

kenfcamp

What happens when server B goes down?

myyis

ASKER

For DB consistency:
All the insert (and also update and delete) queries generated at A are written to a table. If a query is executed successfully at B, it is deleted from the table. If it cannot be executed, that means B is down, and the row is not deleted. When B recovers, it will execute the queries that are still in the table (the queue).
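
The queue described above can be sketched like this. It is a minimal illustration, assuming SQLite for the queue storage; the table name `pending_queries`, the function names and the `run_on_b` callable (whatever actually executes a query on B) are hypothetical.

```python
import json
import sqlite3
from typing import Callable, Sequence


def setup_queue(db_a: sqlite3.Connection) -> None:
    """Create the queue table on A (hypothetical name: pending_queries)."""
    db_a.execute(
        "CREATE TABLE IF NOT EXISTS pending_queries ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, sql TEXT NOT NULL, params TEXT NOT NULL)"
    )
    db_a.commit()


def enqueue(db_a: sqlite3.Connection, sql: str, params: Sequence) -> None:
    """Record a write query that still has to be executed on B."""
    db_a.execute(
        "INSERT INTO pending_queries (sql, params) VALUES (?, ?)",
        (sql, json.dumps(list(params))),
    )
    db_a.commit()


def flush_queue(db_a: sqlite3.Connection, run_on_b: Callable[[str, list], None]) -> int:
    """Replay queued queries on B in insertion order; return how many succeeded."""
    rows = db_a.execute(
        "SELECT id, sql, params FROM pending_queries ORDER BY id"
    ).fetchall()
    done = 0
    for row_id, sql, params in rows:
        try:
            run_on_b(sql, json.loads(params))
        except Exception:
            # B is assumed to be down: keep this row and every later one.
            break
        db_a.execute("DELETE FROM pending_queries WHERE id = ?", (row_id,))
        db_a.commit()
        done += 1
    return done
```

Stopping at the first failure keeps the remaining queries in their original order, which matters when a later update or delete depends on an earlier insert.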

On the user side: when B is down, the browser will look for the working server (A) thanks to the multiple A records (http://webmasters.stackexchange.com/questions/10927/using-multiple-a-records-for-my-domain-do-web-browsers-ever-try-more-than-one).
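
The client-side fallback this relies on can be illustrated with a short Python sketch: resolve every A record for a host, then try the addresses in turn until one accepts a connection. This shows the general idea only and is not a description of how any particular browser implements it; the function names are made up for the example.

```python
import socket
from typing import Callable, Iterable, List, Optional


def resolve_all(host: str, port: int) -> List[str]:
    """Return every IPv4 address the resolver gives for host, without duplicates."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for info in infos:
        ip = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def can_connect(ip: str, port: int, timeout: float = 2.0) -> bool:
    """Check whether a TCP connection to ip:port can be opened within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(
    addresses: Iterable[str],
    port: int,
    probe: Callable[[str, int], bool] = can_connect,
) -> Optional[str]:
    """Try each address in order and return the first that accepts a connection."""
    for ip in addresses:
        if probe(ip, port):
            return ip
    return None
```

A caller would combine the two as `first_reachable(resolve_all("example.com", 80), 80)`; the `probe` parameter exists so the selection logic can be exercised without a network.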


gheist

Could it be a mirrored database that fails over, with two or 10 decoupled web servers that connect to whichever highly available replication master is active at the moment? Then you just need a failover script for the DB and a sensible config for the web servers.

myyis

ASKER

Hi gheist,
If you're suggesting the DB's replication features (master-slave), I can't do that because I am on a shared environment and I don't have root access to the server.

gheist

I am just saying that you can use primitive DNS load balancing to balance between two Apache app servers. There is no need to complicate the picture with kill scripts that ensure only one application is running.

They both access the database you have at the same time.

Problem solved, no radical customization. If one server goes down, the users' browsers handle the failover until you get back to work on Monday. Maybe add a 3rd Apache server for the Christmas holiday, but "works for me" like this.

myyis

ASKER

Hi gheist,
I have checked this link (https://www.digitalocean.com/community/tutorials/how-to-configure-dns-round-robin-load-balancing-for-high-availability).

At the end of the article it mentions that I have to have DB replication. But I cannot use the replication feature, which is why I am thinking of a more complicated solution.
