[activemq-dev] Singleton pattern in KeepAliveDaemon (reliable transport channel) cause "other" reliable connections in same JVM fail for no reason

Hakan Guleryuz-2
Hi All,

We have a client that needs to connect to multiple servers.
This client is a server-side business service and needs to stay connected
to all machines at all times, so it uses the reliable transport channel
with the following attributes:
        private final static String MAXIMUM_RETRIES = "0";
        private final static String INCREMENT_TIMEOUT = "false";
        private final static String FAILURE_SLEEP_TIME = "5000";
        private final static String ESTABLISH_CONNECTION_TIMEOUT = "0";
        private final static String KEEPALIVE_TIMEOUT = "20000";

This means these connections never give up trying to connect to the
server(s).

- A client has 4 connections to 4 different servers with these aggressive
(always trying to reconnect) reliable transport channel parameters.
- One of the servers goes down.
- The KeepAliveDaemon gets stuck in this loop trying to reconnect to it:
                        for (Iterator i = monitoredChannels.iterator(); i.hasNext();) {
                                ReliableTransportChannel channel = (ReliableTransportChannel) i.next();
                                if (!zombieChannelSuspects.contains(channel))
                                        examineChannel(channel);
                        }

The reason it gets stuck is that there is only one KeepAliveDaemon for
all 4 connections, and the "examineChannel" method blocks indefinitely
because it keeps trying to reconnect to the server that is down.
Meanwhile, it never sends keepalive packets to the other "healthy" connections.
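
To illustrate the starvation, here is a minimal, self-contained sketch. It does not use the real ActiveMQ classes: the class name, the method name and the string "channels" are hypothetical stand-ins, and the endless reconnect is simulated with a latch. It only shows that a single thread walking all channels cannot service the ones that come after a blocking call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class, not part of ActiveMQ.
public class SharedDaemonStarvationDemo {

    /**
     * Runs one pass of a single shared "daemon" thread over all channels and
     * returns the channels that were serviced while the down channel was
     * still blocking the loop.
     */
    public static List<String> servicedWhileBlocked(List<String> channels, String downChannel) {
        List<String> serviced = new CopyOnWriteArrayList<>();
        CountDownLatch reconnectStarted = new CountDownLatch(1);
        CountDownLatch serverBack = new CountDownLatch(1);

        Thread daemon = new Thread(() -> {
            try {
                for (String channel : channels) {
                    if (channel.equals(downChannel)) {
                        reconnectStarted.countDown();
                        // Stands in for the reconnect attempt that never gives up.
                        serverBack.await();
                    }
                    serviced.add(channel); // stands in for sending a keepalive
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                // Release the caller even if no channel was down.
                reconnectStarted.countDown();
            }
        });
        daemon.setDaemon(true);
        daemon.start();

        try {
            reconnectStarted.await();
            // Taken while the daemon thread is blocked on the down channel.
            List<String> snapshot = new ArrayList<>(serviced);
            serverBack.countDown(); // the server "comes back"
            daemon.join();
            return snapshot;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the demo thread", e);
        }
    }

    public static void main(String[] args) {
        // prints [A]
        System.out.println(servicedWhileBlocked(Arrays.asList("A", "B", "C", "D"), "B"));
    }
}
```

With channel "B" down, only "A" is serviced before the loop blocks. "C" and "D" wait until "B" comes back, which corresponds to the healthy connections missing their keepalives.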

After the faulty connection recovers, the other 3 connections are all marked
as timed out, and they are effectively closed by the client itself
(since they are thought to be missing their KeepAlive packets, when in fact the
client never sent them!).
After this failure the "normal" connections recover as well, since there was
nothing wrong with them in the first place.

The only resolution we could find was to break the singleton pattern
and create a KeepAliveDaemon per reliable connection,
i.e. we changed getInstance()
FROM:
        public static synchronized KeepAliveDaemon getInstance() {
                if (instance == null)
                        instance = new KeepAliveDaemon();
                return instance;
        }

TO:

        public static synchronized KeepAliveDaemon getInstance() {
            return new KeepAliveDaemon();
        }
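
As a rough sketch of why one daemon per connection helps, the following self-contained example gives each channel its own thread. It assumes that each daemon instance runs its checks on its own thread. The class name, method name and string "channels" are hypothetical and are not ActiveMQ APIs. The blocked reconnect is again simulated with a latch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class, not part of ActiveMQ.
public class PerChannelDaemonDemo {

    /**
     * Starts one "daemon" thread per channel and returns the channels that
     * were serviced while the down channel was still blocked.
     */
    public static Set<String> servicedWhileBlocked(List<String> channels, String downChannel) {
        Set<String> serviced = new ConcurrentSkipListSet<>();
        int healthyCount = 0;
        for (String channel : channels) {
            if (!channel.equals(downChannel)) {
                healthyCount++;
            }
        }
        CountDownLatch healthyServiced = new CountDownLatch(healthyCount);
        CountDownLatch serverBack = new CountDownLatch(1);

        List<Thread> daemons = new ArrayList<>();
        for (String channel : channels) {
            Thread daemon = new Thread(() -> {
                try {
                    if (channel.equals(downChannel)) {
                        serverBack.await(); // only this thread blocks on the reconnect
                        serviced.add(channel);
                    } else {
                        serviced.add(channel); // stands in for sending a keepalive
                        healthyServiced.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            daemon.setDaemon(true);
            daemon.start();
            daemons.add(daemon);
        }

        try {
            healthyServiced.await();
            // Taken while the down channel's thread is still blocked.
            Set<String> snapshot = new TreeSet<>(serviced);
            serverBack.countDown(); // the server "comes back"
            for (Thread daemon : daemons) {
                daemon.join();
            }
            return snapshot;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the demo threads", e);
        }
    }

    public static void main(String[] args) {
        // prints [A, C, D]
        System.out.println(servicedWhileBlocked(Arrays.asList("A", "B", "C", "D"), "B"));
    }
}
```

The trade-off is one extra thread per reliable connection, which is negligible for a handful of connections but worth keeping in mind for larger numbers.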

If anyone is having similar issues with multiple reliable connections in
the same JVM instance, this could be the cause.

Hakan.