Monday, August 5, 2013

Fault Tolerance in TIBCO BusinessWorks

Fault Tolerance Configuration in TIBCO BW

Fault tolerance is the ability of a component to provide continuous service even when unexpected failures occur. In fault tolerant systems, backup systems stand ready to take over when a running system fails. Fault tolerance is closely related to graceful degradation. This document explains the steps for deploying a BW process in fault tolerant mode and for testing the deployment.

Fault tolerance can be achieved by deploying multiple instances of the same EAR on the same machine or on different machines.

Fault tolerance parameters

o    FT weight – a parameter available in TIBCO Administrator when deploying any engine in fault tolerant mode. It defines the relationship between the fault tolerant engines and can be set to ‘peer’ or ‘primary/secondary’.

o    FT weight – peer – if all the engines in the fault tolerant group are peers, then when the machine hosting the currently active process engine fails, another peer process engine takes over processing from the first engine and continues until its own machine fails.

o    FT weight – primary/secondary – if the engines are configured as primary (master) and secondary, the secondary engine takes over processing when the master fails and continues until the master recovers. Once the master recovers, the secondary engine returns to standby and the master takes over processing again.
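The difference between the two FT weight settings can be illustrated with a small simulation. This is only a sketch and not TIBCO code: the `Engine` class, the `elect_active` function, and the numeric weights (100 and 200) are hypothetical names and values chosen for illustration. It assumes peers carry equal weights and a primary carries a higher weight than its secondary.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    name: str
    weight: int          # hypothetical numeric FT weight
    alive: bool = True


def elect_active(engines, current=None):
    """Return the engine that should be active, or None if none is alive."""
    live = [e for e in engines if e.alive]
    if not live:
        return None
    top = max(e.weight for e in live)
    # The current engine keeps running if it is alive and nothing outranks it.
    if current is not None and current.alive and current.weight == top:
        return current
    # Otherwise the first live engine with the highest weight takes over.
    return next(e for e in live if e.weight == top)


def demo(weights):
    """Start two engines, fail engine A, then recover it; record who is active."""
    a, b = Engine("A", weights[0]), Engine("B", weights[1])
    group = [a, b]
    history = []
    active = elect_active(group)
    history.append(active.name)
    a.alive = False                      # machine hosting A fails
    active = elect_active(group, active)
    history.append(active.name)
    a.alive = True                       # A recovers
    active = elect_active(group, active)
    history.append(active.name)
    return history


if __name__ == "__main__":
    print("peer:             ", demo((100, 100)))
    print("primary/secondary:", demo((200, 100)))
```

With equal weights, engine B keeps processing after A recovers, which matches the peer description. With a higher weight on A, processing returns to A once it recovers, which matches the primary/secondary description.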

Configuring fault tolerance in TIBCO Administrator

After the EAR is ready, it has to be deployed in TIBCO Administrator.

1.    In Administrator, save the EAR that has to be deployed. For a detailed explanation of deploying in Administrator, refer to appendix – section 4.

2.    Click the application name in the right panel if global variables need to be changed before deployment. On the next screen, click Advanced to display the global variables.


Figure 1 – Configuration builder

3.    To set the application in fault tolerant mode, click the process archive.


Figure 2 – Process archive

4.    Under the General tab, click ‘add to additional machines’.


Figure 3 – Add to additional machines

5.    All the available machines in the domain are listed. Select the machine on which the fault tolerant process needs to run (it can be the same machine or a different one) and click OK. The figure below shows a fault tolerant engine being added to the same machine.


Figure 4 – Bind to container

6.    Repeat steps 3 to 5 if more fault tolerant processes should run on additional machines. Here we use 2 instances of the BW process in fault tolerant mode.

7.    Select the FT weight and ensure the ‘run fault tolerant’ check box is enabled. The figure shows engines being deployed with FT weight set to ‘peer’. Click Save.


Figure 5 – Edit service configuration

Note: FT weight can also be set to primary for one service instance and secondary for the other. Refer to the fault tolerance parameters section above for an explanation of the different FT weights.

8.    The configuration now shows 2 instances under the process archive. After all configuration is done, click the ‘deploy’ button.


Figure 6 – Deploy

9.    From the left panel, select ‘service instance’.


Figure 7 – Service instance

10. On clicking ‘service instance’, the right panel initially shows both service instances in the ‘starting up’ state.


Figure 8 – Service starting up

11. After some time, the state changes from ‘starting up’ to ‘standing by’ for both instances, as shown below.


Figure 9 – Service standing by

12. After some time, the state of one of the instances changes from ‘standing by’ to ‘running’, as shown below. The other remains in ‘standing by’. The fault tolerant engines are now set up. If the running instance fails, the ‘standing by’ instance is activated, so that there is no loss of service.


Figure 10 – Service running

13. Ensure that the EAR is deployed without errors and is in the ‘running’ state. [For a detailed explanation, refer to appendix – section 4.]

14. For the running instance, the details under the Tracing tab show that the instance first started in back-up mode and was then activated, as shown in the figure below. The same information is available in its log file.


Figure 11 – Tracing – running instance

15. For the standing by instance, the details under the Tracing tab show that the instance is in back-up mode, as shown in the figure below. The same information is available in its log file.


Figure 12 – Tracing – standing by instance

Now the EAR is deployed in fault tolerant mode.
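The state changes seen in steps 10 to 12 can be written down as a small transition table, which is handy when checking a recorded sequence of states from a deployment. This is a minimal sketch: the state names are the ones shown in this walkthrough, but the table lists only the transitions observed here and may not cover every transition the product allows. The function names are hypothetical.

```python
# Transitions observed in this walkthrough only; an assumption, not a complete
# description of the engine lifecycle.
TRANSITIONS = {
    "starting up": {"standing by"},
    "standing by": {"running"},
    "running": {"shutting down"},
    "shutting down": {"stopped"},
    "stopped": set(),
}


def is_valid_sequence(states):
    """True if each consecutive pair of states is an observed transition."""
    return all(
        nxt in TRANSITIONS.get(cur, set())
        for cur, nxt in zip(states, states[1:])
    )


def group_is_healthy(instance_states):
    """True if exactly one instance is running and the rest are standing by."""
    running = [s for s in instance_states if s == "running"]
    others = [s for s in instance_states if s != "running"]
    return len(running) == 1 and all(s == "standing by" for s in others)


if __name__ == "__main__":
    print(is_valid_sequence(["starting up", "standing by", "running"]))
    print(group_is_healthy(["running", "standing by"]))
```

After a successful deployment with two instances, `group_is_healthy(["running", "standing by"])` describes the situation shown in Figure 10.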

2.1   Testing fault tolerance

Here we test whether the deployed service instances can cope with failures.

1.    In fault tolerant mode, all the files/input messages are processed by the running instance.

2.    Select the running instance and stop it.


Figure 13 – Stop instance

3.    The state changes from ‘running’ to ‘shutting down’.


Figure 14 – Shutting down

4.    After some time, the state changes from ‘shutting down’ to ‘stopped’. The ‘standing by’ instance is then activated, and its state changes to ‘running’.


Figure 15 – Activation of standing by instance
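The manual check above could also be automated by polling the instance states until the stopped instance reports ‘stopped’ and another instance reports ‘running’. The sketch below shows only the polling logic. `get_states` is a hypothetical callable that returns a mapping of instance name to state; how those states are obtained (for example from the Administrator GUI or from log files) is left open and is not a TIBCO API.

```python
import time


def wait_for_failover(get_states, stopped_name, timeout=60.0, poll=1.0,
                      sleep=time.sleep):
    """Poll until stopped_name is 'stopped' and another instance is 'running'.

    get_states is a hypothetical callable returning {instance_name: state}.
    Returns the name of the instance that took over, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        states = get_states()
        took_over = [
            name for name, state in states.items()
            if name != stopped_name and state == "running"
        ]
        if states.get(stopped_name) == "stopped" and took_over:
            return took_over[0]
        if time.monotonic() >= deadline:
            raise TimeoutError("no standing by instance became active in time")
        sleep(poll)


if __name__ == "__main__":
    # Canned snapshots standing in for real observations of two instances.
    snapshots = iter([
        {"A": "shutting down", "B": "standing by"},
        {"A": "stopped", "B": "standing by"},
        {"A": "stopped", "B": "running"},
    ])
    print(wait_for_failover(lambda: next(snapshots), "A", poll=0.0))
```

The timeout keeps the check from hanging forever if the standing by instance never activates, which is itself a useful test failure.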
