Static network of brokers with not all brokers running (expected)


pypen
Hi,
I have a network of 4 brokers, statically configured (<networkConnector
uri="static:(tcp://IP1:61617,tcp://IP2:61617,tcp://IP3:61617)"
networkTTL="-1" decreaseNetworkConsumerPriority="true"  />). Each broker has
a list of IPs, excluding its own.
It is expected that not all brokers are running at the same time (they can,
but don't have to).
I have (java) clients that run on the same machines as the brokers and
connect to their localhost broker and I have clients that are running on
separate machines, connecting to one of the 4 (remote) brokers.

1. The activemq log files are being spammed with exceptions saying that the
connections to the brokers that are down have failed (see below). This was
not the case with older versions (it was only a warning, I believe). I cannot
use dynamic discovery because multicast is not allowed on the network. I am
not sure if my setup is incorrect or if I just need to change the log4j
configuration to not log these exceptions. Like I said, it's expected that
not all brokers are running at all times. Suppressing the exceptions via
log4j does not seem quite right to me.

2. The remote clients use a list of IPs with
failover:(IPs)?randomize=true&backup=true...
This seems to be working fine, except when one of the IPs is not
reachable. This does not always happen, but if I, for example, add some
bogus IP as the first IP, the remote client will not communicate with the
remote broker at all. Let's say I have BOGUS_IP, CORRECT_IP in the list: I
can see it says it connects to the CORRECT_IP, but it does not send or
receive any messages. The clients have multiple consumers and producers, and
on some producers the session seems to be null and other weird behavior is
happening.
In the admin console of the broker that is running, I do see the connections
from the remote client though. Any ideas what that could be?

Thanks in advance.

<code>
 WARN | Could not start network bridge between: vm://SRV1 and: tcp://IP3:61617 due to: Connection timed out: connect
 INFO | Establishing network connection from vm://SRV1 to tcp://IP2:61617
 INFO | Connector vm://SRV1 started
 INFO | Establishing network connection from vm://SRV1 to tcp://IP1:61617
 INFO | Establishing network connection from vm://SRV1 to tcp://IP3:61617
 INFO | SRV1 Shutting down NC
 WARN | Could not start network bridge between: vm://SRV1 and: tcp://IP1:61617 due to: Connection refused: connect
 INFO | SRV1 bridge to Unknown stopped
 INFO | error with pending local brokerInfo on: vm://SRV1#222
org.apache.activemq.transport.TransportDisposedIOException: peer (vm://SRV1#223) stopped.
        at org.apache.activemq.transport.vm.VMTransport.stop(VMTransport.java:233)[activemq-broker-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.TransportFilter.stop(TransportFilter.java:72)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.TransportFilter.stop(TransportFilter.java:72)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.ResponseCorrelator.stop(ResponseCorrelator.java:132)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.broker.TransportConnection.doStop(TransportConnection.java:1193)[activemq-broker-5.15.2.jar:5.15.2]
        at org.apache.activemq.broker.TransportConnection$4.run(TransportConnection.java:1159)[activemq-broker-5.15.2.jar:5.15.2]
        at java.lang.Thread.run(Thread.java:748)[:1.8.0_152]
 INFO | SRV1 Shutting down NC
 WARN | Could not start network bridge between: vm://SRV1 and: tcp://IP2:61617 due to: Connection refused: connect
 INFO | SRV1 bridge to Unknown stopped
 INFO | error with pending local brokerInfo on: vm://SRV1#224
org.apache.activemq.transport.TransportDisposedIOException: peer (vm://SRV1#225) stopped.
        at org.apache.activemq.transport.vm.VMTransport.stop(VMTransport.java:233)[activemq-broker-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.TransportFilter.stop(TransportFilter.java:72)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.TransportFilter.stop(TransportFilter.java:72)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.ResponseCorrelator.stop(ResponseCorrelator.java:132)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.broker.TransportConnection.doStop(TransportConnection.java:1193)[activemq-broker-5.15.2.jar:5.15.2]
        at org.apache.activemq.broker.TransportConnection$4.run(TransportConnection.java:1159)[activemq-broker-5.15.2.jar:5.15.2]
        at java.lang.Thread.run(Thread.java:748)[:1.8.0_152]
 INFO | SRV1 Shutting down NC
 INFO | SRV1 bridge to Unknown stopped
 INFO | error with pending local brokerInfo on: vm://SRV1#220
org.apache.activemq.transport.TransportDisposedIOException: peer (vm://SRV#221) stopped.
        at org.apache.activemq.transport.vm.VMTransport.stop(VMTransport.java:233)[activemq-broker-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.TransportFilter.stop(TransportFilter.java:72)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.TransportFilter.stop(TransportFilter.java:72)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.transport.ResponseCorrelator.stop(ResponseCorrelator.java:132)[activemq-client-5.15.2.jar:5.15.2]
        at org.apache.activemq.broker.TransportConnection.doStop(TransportConnection.java:1193)[activemq-broker-5.15.2.jar:5.15.2]
        at org.apache.activemq.broker.TransportConnection$4.run(TransportConnection.java:1159)[activemq-broker-5.15.2.jar:5.15.2]
        at java.lang.Thread.run(Thread.java:748)[:1.8.0_152]
 INFO | Connector vm://SRV1 stopped
</code>



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: Static network of brokers with not all brokers running (expected)

pypen
Figured out the problem with 2. The client had a network timeout of 30
seconds. Every producer and consumer creates its own connection and session
and that caused the problems. Reducing the timeout solved issue 2.
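
For anyone hitting the same thing, here is a minimal sketch of how a shorter connect timeout could be attached to each address in the failover list. It assumes the timeout in question is the TCP transport's `connectionTimeout` option (milliseconds, default 30000), which the post does not state explicitly. The class name `FailoverUri`, port 61616 and the 3000 ms value are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds a failover URI in which every nested tcp://
// address carries its own connect timeout.
public class FailoverUri {

    public static String build(List<String> hosts, int port, int connectTimeoutMs) {
        // Assumption: connectionTimeout is the timeout that needed reducing.
        String nested = hosts.stream()
                .map(h -> "tcp://" + h + ":" + port + "?connectionTimeout=" + connectTimeoutMs)
                .collect(Collectors.joining(","));
        // randomize and backup are the options quoted in the original post.
        return "failover:(" + nested + ")?randomize=true&backup=true";
    }

    public static void main(String[] args) {
        // Placeholder addresses, as in the original post.
        System.out.println(build(Arrays.asList("BOGUS_IP", "CORRECT_IP"), 61616, 3000));
    }
}
```

The resulting string would be passed to the connection factory in place of the hand-written URI. With one connection per producer and consumer, a long per-address timeout is paid once per connection, which fits the behavior described above.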




Re: Static network of brokers with not all brokers running (expected)

Tim Bain
For #1, if I'm reading your log right, it's coming out at INFO, and you
said it used to be WARN. So you're saying we lowered the logging level;
where's the problem?

And either way, I'm not seeing why you're trying to avoid tweaking your
Log4J configuration. You have a non-standard environment (one where
unavailability of brokers is normal), so it doesn't hurt my head that the
default Log4J configuration might not be the ideal one for you and that
you'd benefit from adjusting it slightly.

Tim

On Jan 30, 2018 7:57 AM, "pypen" <[hidden email]> wrote:

[original post quoted in full]

Re: Static network of brokers with not all brokers running (expected)

pypen
Bad day? Relax Tim. Not trying to offend anyone here.

It's not about me not wanting to tweak the log4j. I just didn't want to
suppress the TransportDisposedIOException (or DemandForwardingBridgeSupport)
if it happens for a different reason.
And I was not clear enough about the previous versions: it used to be a
warning without an exception.
EDIT: I assume the default log level might have changed to INFO, because
changing it back to WARN behaves as it used to. It still logs every n
seconds "WARN | Could not start network bridge between...".

Also, I don't understand why this is a non-standard environment if the
multicast discovery setup seems to be fine with brokers joining and leaving
the network. That's why I asked to see if my setup is correct (maybe there
is a parameter for the static configuration that makes the brokers
"optional").





Re: Static network of brokers with not all brokers running (expected)

Tim Bain
On Jan 30, 2018 2:20 PM, "pypen" <[hidden email]> wrote:

> Bad day? Relax Tim. Not trying to offend anyone here.


I wasn't intending for the response to come across sharply; sorry that it
did.

> It's not about me not wanting to tweak the log4j. I just didn't want to
> suppress the TransportDisposedIOException (or DemandForwardingBridgeSupport)
> if it happens for a different reason.


Unfortunately I don't know of a way to prevent that; the only thing I can
think of that even comes close is to use a custom Log4J stack trace
renderer to strip out the lines if they match any exception you've seen
recently (you'd have to keep a cache of recent exceptions to compare
against), something like
https://www.igorkromin.net/index.php/2017/08/21/filtering-exception-stack-trace-logging-with-log4j-and-a-custom-throwablerenderer/
with some tweaks like the cache.
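
To make the cache idea concrete, here is a rough sketch of just the "have I seen this recently" part, with no Log4J wiring at all. The class name `RecentExceptionCache` and its method are hypothetical; a custom renderer would have to call something like this before deciding whether to print the stack trace. It keys on the exception class plus the top stack frame rather than the message, since the messages in the log above contain changing connection IDs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers which exceptions had their stack trace
// logged within the last windowMillis milliseconds.
public class RecentExceptionCache {
    private final long windowMillis;
    private final Map<String, Long> loggedAt = new HashMap<>();

    public RecentExceptionCache(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Returns true if the full stack trace should be logged this time. */
    public synchronized boolean shouldLogStackTrace(Throwable t, long nowMillis) {
        // Forget entries that are older than the window.
        loggedAt.values().removeIf(seenAt -> nowMillis - seenAt >= windowMillis);
        // Key on class and top frame; the message varies between occurrences.
        StackTraceElement[] frames = t.getStackTrace();
        String key = t.getClass().getName() + "@" + (frames.length > 0 ? frames[0] : "?");
        // Only the first occurrence inside the window gets its stack trace.
        return loggedAt.putIfAbsent(key, nowMillis) == null;
    }
}
```

An exception thrown from a different place, or of a different type, gets a different key, so it would still be logged in full the first time it shows up.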

> And I was not clear enough about the previous versions: It used to be a
> warning without exception.
>
> Also, I don't understand why this is a non-standard environment if the
> multi-cast discovery setup seems to be fine with brokers joining and leaving
> the network.


The reason this works differently between those two configurations is that
multicast discovery networks have a broadcast mechanism that allows new
brokers to announce their presence, which means that the existing ones
don't have to poll looking for them. But with static TCP connections, the
only reliable way for the existing brokers to check whether a missing
broker has come up is to poll for it by trying to connect repeatedly, and
that generates these log messages.
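
As a rough illustration of what polling means here (this is not ActiveMQ's actual bridge code, just a sketch of the idea), checking a statically configured peer boils down to attempting a TCP connect with a timeout and trying again later if it fails. The class name `BrokerProbe`, the 2000 ms timeout and the retry count are made up for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical probe: one connect attempt per call, no ActiveMQ classes involved.
public class BrokerProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers both "Connection refused" and "Connection timed out".
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        String host = args.length > 0 ? args[0] : "localhost";
        // Every failed attempt is an event that something has to decide
        // whether to log, which is where the repeated messages come from.
        for (int attempt = 1; attempt <= 3; attempt++) {
            System.out.println("attempt " + attempt + ": reachable="
                    + isReachable(host, 61617, 2000));
            Thread.sleep(1000);
        }
    }
}
```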

As for the non-standardness of this configuration, that's simply a
statement of my perception that very few people seem to do what you're
doing (a mesh network where it's normal for some brokers in the network to
be down for extended periods of time), but not to imply that it's
unsupported or wrong. But it does mean that the default logging settings,
which are streamlined for the more common cases, might not be optimal right
out of the box in your environment.

> That's why I asked to see if my setup is correct (maybe there
> is parameter for the static configuration that makes the brokers
> "optional").


There's nothing about the pieces of your configuration that you've posted
that's wrong, and no option I'm aware of to make brokers "optional" when
defined statically. So I believe your only option is adjusting the Log4J
configuration to make the logging more closely meet your needs.

Tim

Re: Static network of brokers with not all brokers running (expected)

pypen
Thanks, Tim. I thought maybe the way I phrased the questions came across as
offensive or something.
And thanks for the clarification about polling vs. broadcasting.

I will have to see if we can maybe enable multicast. I don't see the
"backup" property on the discovery transport though, and I am not sure how
remote clients will behave if their current broker goes down. But I can
try it out.
EDIT: Just noticed I can nest discovery in failover...







Re: Static network of brokers with not all brokers running (expected)

Tim Bain
Are you looking to use the discovery transport for the client connections,
the broker-to-broker connections, or both? I've never used it, but my
understanding is that discovery is like failover (connect to one broker in
this list) rather than like static (connect to all the brokers in this
list), so it would be reasonable to use on the clients but probably not
what you want for connecting the brokers to each other.

Tim