Thousands of "Transport Connection failed" exceptions

Frizz
My clients connect to AMQ with this connection string:
(tcp://amq1:61616,tcp://amq2:61616)?randomize=false&priorityBackup=true

It works for some time, but sooner or later my AMQ server becomes
unresponsive because the host it runs on runs out of resources (threads).
The AMQ server basically kills the host.

activemq.log shows lots of entries like this:
...
2018-01-29 16:50:50,800 | WARN  | Transport Connection to: tcp://172.13.2.145:45958 failed: java.net.SocketException: Connection reset | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///172.13.2.145:45958@61616
...
2018-01-29 16:50:52,894 | WARN  | Failed to register MBean
org.apache.activemq:type=Broker,brokerName=amq1,connector=clientConnectors,connectorName=default,connectionViewType=clientId,connectionName=ID_e325fbc8d9c2-41743-1517236130397-0_22
...
...

And then I get spammed with thousands of lines like this:

2018-01-29 18:14:40,374 | WARN  | Transport Connection to: tcp://172.13.2.150:51089 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///172.13.2.150:51089@61616
2018-01-29 18:14:40,455 | WARN  | Transport Connection to: tcp://172.13.2.150:51091 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///172.13.2.150:51091@61616
2018-01-29 18:14:40,537 | WARN  | Transport Connection to: tcp://172.13.2.150:51093 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///172.13.2.150:51093@61616
2018-01-29 18:14:40,617 | WARN  | Transport Connection to: tcp://172.13.2.150:51095 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///172.13.2.150:51095@61616
2018-01-29 18:14:40,698 | WARN  | Transport Connection to: tcp://172.13.2.150:51097 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///172.13.2.150:51097@61616
2018-01-29 18:14:40,780 | WARN  | Transport Connection to: tcp://172.13.2.150:51099 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///172.13.2.150:51099@61616
2018-01-29 18:14:40,860 | WARN  | Transport Connection to: tcp://172.13.2.150:51101 failed:
...

Why is the AMQ server trying to connect to every single port on one of the
client machines?
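
A note on reading these entries: the address in "Transport Connection to:" is the remote end of a connection the broker accepted, so a steadily changing port most likely reflects new inbound connections from that client, each using a fresh source port, rather than the broker dialing out. To see which client is responsible, the failures can be counted per remote host. The following is only a minimal sketch: the class name TransportFailureCounter and its method are hypothetical, and it assumes each log entry occupies a single line in activemq.log in the format quoted in this post.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ActiveMQ.
final class TransportFailureCounter {
    // Matches the "Transport Connection to: tcp://host:port failed" part of an entry.
    private static final Pattern FAILED = Pattern.compile(
            "Transport Connection to: tcp:///?([0-9.]+):(\\d+) failed");

    /** Counts failed transport connections per remote host address. */
    static Map<String, Integer> countByHost(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = FAILED.matcher(line);
            if (m.find()) {
                // group(1) is the remote host; group(2), the source port, is ignored here
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Feeding it the lines of activemq.log (for example via java.nio.file.Files.readAllLines) gives one counter per client address; a host with a count in the thousands is the one opening connections in a loop.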

RE: Thousands of "Transport Connection failed" exceptions

Andrei Shakirin
Hi Frizz,

I see exactly the same symptoms, described in http://activemq.2283324.n4.nabble.com/Excessive-number-of-connections-by-failover-transport-td4735849.html

I am also using randomize=false&priorityBackup=true. After running for some time, the client floods the server with connections, and shortly afterwards the server runs out of resources.
Client and server are both version 5.14.5.

@activemq community: any ideas, workarounds?

Regards,
Andrei.

RE: Thousands of "Transport Connection failed" exceptions

Andrei Shakirin
Hi,

The cause of the issue has been identified.
A JMS handler was throwing an exception with a very large message string,
and the AMQ client tried to send that string in the dlqDeliveryFailureCause property.

The issue can easily be reproduced with this handler code:

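public void onMessage(Message message) {
    ...
    // Buffer sized for a very long exception message
    StringBuffer bigBuffer = new StringBuffer(Short.MAX_VALUE);
    ...
    // The whole buffer becomes the exception message
    throw new RuntimeException(bigBuffer.toString());
}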

A client with this handler kills the server when it uses the failover protocol.

I see two AMQ problems here:
1) The exception message has to be checked and length-limited before it is set in dlqDeliveryFailureCause: some exceptions come from third-party code and are not under the control of the client handler.
2) Failover reconnecting on EVERY IOException is IMO very dangerous.

Details in https://issues.apache.org/jira/browse/AMQ-6894
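
Until problem 1) is addressed in AMQ itself, one possible client-side workaround is to cap the exception message inside the handler before rethrowing. The following is only a sketch under stated assumptions: the class BoundedFailure, its methods and the 1024-character cap are hypothetical names and values chosen for illustration, not an ActiveMQ API and not the fix discussed in the JIRA issue.

```java
import java.util.function.Consumer;

// Hypothetical helper, not part of ActiveMQ.
final class BoundedFailure {
    // Assumed cap on the exception message length; adjust as needed.
    static final int MAX_CAUSE_LENGTH = 1024;

    /** Returns the message cut to at most maxLength characters (null becomes ""). */
    static String truncate(String message, int maxLength) {
        if (message == null) {
            return "";
        }
        return message.length() <= maxLength ? message : message.substring(0, maxLength);
    }

    /** Runs the handler body and rethrows any RuntimeException with a bounded message. */
    static <T> void handle(T message, Consumer<T> body) {
        try {
            body.accept(message);
        } catch (RuntimeException e) {
            String summary = e.getClass().getName() + ": "
                    + truncate(e.getMessage(), MAX_CAUSE_LENGTH);
            // The original exception is deliberately not chained as the cause,
            // on the assumption that a chained cause could carry the long message along.
            throw new RuntimeException(summary);
        }
    }
}
```

Inside onMessage the handler body would then be wrapped as BoundedFailure.handle(message, m -> process(m)), where process stands for the existing handler logic. Logging the original exception locally before rethrowing keeps the full detail available on the client.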

Regards,
Andrei.