Network of Brokers - Duplicate message add attempt rejected

oseymen
Hi all,

I tried the simple network-of-brokers test below with ActiveMQ 5.5, Activemq-5.5.0-fuse-00-43 and Activemq-5.5.0-fuse-00-27, and I get the same behavior with all of them.

I deployed ActiveMQ (the versions above) to a virtual machine and to my local box. I made no changes to the local instance. I only added the following network connector to the virtual machine instance, which forwards messages in Q1 from the virtual machine instance to the local instance:

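<networkConnectors>
        <!-- Replace LOCAL_MACHINE_IP with the IP address of the local machine -->
        <networkConnector
                name="nc"
                uri="static:(tcp://LOCAL_MACHINE_IP:61616)">
                <!-- Only Q1 is forwarded across the bridge -->
                <staticallyIncludedDestinations>
                        <queue physicalName="Q1" />
                </staticallyIncludedDestinations>
        </networkConnector>
</networkConnectors>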

That's it...

I start a producer on the virtual machine which sends 10000 transactional messages to Q1 on the virtual machine instance. When I disable the network interface and then re-enable it (simulating a network glitch), I get this on the local instance:

 WARN | Duplicate message add attempt rejected. Destination: Q1, Message id: ID:MSGTEST02-56791-634499717928681758-1:0:1:1:1003
 WARN | Duplicate message add attempt rejected. Destination: Q1, Message id: ID:MSGTEST02-56791-634499717928681758-1:0:1:1:1004
 WARN | Duplicate message add attempt rejected. Destination: Q1, Message id: ID:MSGTEST02-56791-634499717928681758-1:0:1:1:1005
 WARN | Duplicate message add attempt rejected. Destination: Q1, Message id: ID:MSGTEST02-56791-634499717928681758-1:0:1:1:1006

Messages are not lost and successfully reach the local machine, but the queue statistics are all wrong. If I get x "duplicate..." warnings, I end up with (10000 + x) messages reported on the local machine. When I attach a consumer to Q1 on the local instance, it can consume 10000 messages, but x messages still remain reported in the admin console and in QueueSize in JConsole. Those x messages cannot be consumed. The statistics drop back to 0 when the local broker is restarted.

Have you experienced this problem yourselves? Is this a known/unknown bug or am I doing something wrong?

There is an old JIRA which explains this issue (https://issues.apache.org/jira/browse/AMQ-2803), but this was apparently fixed for v5.4. I've reported this in https://issues.apache.org/jira/browse/AMQ-3469 for a more complex case (I didn't know it was this easy to demonstrate!!). It was also reported by someone else in https://issues.apache.org/jira/browse/AMQ-3473.

I've tried adding the failoverProducersAuditDepth="0" and maxFailoverProducersToTrack="0" settings to kahadb, but I still get duplicate suppression and still get incorrect statistics. Are there any other settings that I can try?

I really need your comments here urgently, as this is a blocker for us from a health and activity monitoring point of view. I just can't tell by looking at the QueueSize whether there are really x messages pending (consumers/network experiencing problems), or whether everything is fine but zombie messages are being reported.

Thanks in advance.
Ozan
Re: Network of Brokers - Duplicate message add attempt rejected

gtully
When you get the WARN message, it is expected that the stats are wrong
because the store should not be getting duplicates. At the point that
the store recognizes the duplicate, the stats and cursors may have
already processed the message.
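
To make the ordering problem concrete, here is a minimal sketch in plain Java. It is a toy model and not ActiveMQ's code: the class QueueStatsDriftSketch and its methods are hypothetical, and it simply assumes that the size counter is incremented before the store checks for a duplicate id.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: hypothetical names, not ActiveMQ's internals.
class QueueStatsDriftSketch {
    private final Set<String> store = new HashSet<>(); // stands in for the store's message id index
    private long reportedQueueSize = 0;                // stands in for the QueueSize statistic
    private long rejectedDuplicates = 0;

    // Assumption: the counter is bumped before the store gets a chance to reject the duplicate.
    void send(String messageId) {
        reportedQueueSize++;
        if (!store.add(messageId)) {
            rejectedDuplicates++; // plays the role of the "Duplicate message add attempt rejected" warning
        }
    }

    // A consumer can only take messages that really exist in the store.
    boolean consume(String messageId) {
        if (store.remove(messageId)) {
            reportedQueueSize--;
            return true;
        }
        return false;
    }

    long reportedQueueSize() { return reportedQueueSize; }
    long rejectedDuplicates() { return rejectedDuplicates; }
    int storedMessages() { return store.size(); }

    // Sends `total` messages, replays the first `replayed` of them, then consumes everything stored.
    static QueueStatsDriftSketch run(int total, int replayed) {
        QueueStatsDriftSketch q = new QueueStatsDriftSketch();
        for (int i = 0; i < total; i++) q.send("msg-" + i);
        for (int i = 0; i < replayed; i++) q.send("msg-" + i); // replay after the network glitch
        for (int i = 0; i < total; i++) q.consume("msg-" + i);
        return q;
    }

    public static void main(String[] args) {
        QueueStatsDriftSketch q = run(10000, 4);
        System.out.println("rejected duplicates:  " + q.rejectedDuplicates());
        System.out.println("stored messages left: " + q.storedMessages());
        System.out.println("reported queue size:  " + q.reportedQueueSize());
    }
}
```

With 10000 messages sent and 4 replayed, the model ends with an empty store but a reported size of 4, which has the same shape as the (10000 + x) symptom described in the first post.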

The producerAudit should eliminate the duplicates at an earlier stage,
so this needs to be enabled, and may need to be large to accommodate
larger transactions.
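
One way to picture why the audit window has to cover the transaction size is the sketch below. It assumes a per-producer sliding window over sequence numbers that cannot recognise anything older than the window. ProducerAuditSketch and the window sizes used here are illustrative assumptions, not ActiveMQ's implementation or its defaults.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Simplified model of a per-producer sliding-window audit; hypothetical names, not ActiveMQ classes.
class ProducerAuditSketch {
    private final int windowSize;
    private final Map<String, TreeSet<Long>> seenByProducer = new HashMap<>();

    ProducerAuditSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    // Returns true only when the audit can prove the (producer, sequence) pair was seen before.
    boolean isDuplicate(String producerId, long sequence) {
        TreeSet<Long> seen = seenByProducer.computeIfAbsent(producerId, id -> new TreeSet<>());
        if (!seen.isEmpty() && sequence <= seen.last() - windowSize) {
            return false; // older than the window: the audit has forgotten it and lets it through
        }
        if (!seen.add(sequence)) {
            return true; // still inside the window and already recorded
        }
        // Forget everything that has fallen out of the window.
        seen.headSet(seen.last() - windowSize, true).clear();
        return false;
    }

    // Sends sequences 1..txSize, replays the whole transaction, and counts replays the audit misses.
    static int missedDuplicates(int windowSize, int txSize) {
        ProducerAuditSketch audit = new ProducerAuditSketch(windowSize);
        for (long seq = 1; seq <= txSize; seq++) {
            audit.isDuplicate("producer-1", seq);
        }
        int missed = 0;
        for (long seq = 1; seq <= txSize; seq++) {
            if (!audit.isDuplicate("producer-1", seq)) {
                missed++;
            }
        }
        return missed;
    }

    public static void main(String[] args) {
        System.out.println("window 2048:  missed " + missedDuplicates(2048, 10000));
        System.out.println("window 16384: missed " + missedDuplicates(16384, 10000));
    }
}
```

In this model, replaying a 10000-message transaction against a 2048-entry window lets 7952 replays through unnoticed, while a 16384-entry window catches all of them.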

Can you build a simple JUnit test case and attach it to one of the
existing JIRAs, or create your own?

There is a SocketProxy that you can use to simulate a network glitch.

For example usage, see a test that sounds very like your use case:
org.apache.activemq.usecases.BrokerQueueNetworkWithDisconnectTest

Maybe you can use a variant of that test to validate your configuration
and/or easily reproduce the problem.

In reply to oseymen's post of 26 August 2011, 17:01.

--
http://fusesource.com
http://blog.garytully.com
Re: Network of Brokers - Duplicate message add attempt rejected

oseymen
Thanks for your reply Gary.

>> When you get the WARN message, it is expected that the stats are wrong
>> because the store should not be getting duplicates.

OS - Agreed.

>> At the point that the store recognizes the duplicate, the stats and cursors may have
>> already processed the message.

>> The producerAudit should eliminate the duplicates at an earlier stage,
>> so this needs to be enabled, and may need to be large to accommodate
>> larger transactions.

OS - How do I enable producerAudit? If I don't set "maxFailoverProducersToTrack=0" or "enableAudit=false", isn't it enabled by default? So in the default AMQ configuration (as it ships), I am guessing producerAudit is enabled.

Can you please elaborate on the "need to be large to accommodate larger transactions" part? I think this is the magic that I am missing. Otherwise I just took the default configuration and added a networkConnector to it. Even though I understand why duplicate messages are arriving at the second broker (unacked messages replayed by the first broker), I just feel that I am missing something that will get duplicates eliminated before the stats and cursors process them. Otherwise nobody could monitor queues properly, right?

>> Can you build a simple junit tests case and attach it to one of the
>> existing jiras or create your own.

OS - Sure, I will give that a go. I am not a Java guy, so I might not be able to come back very soon.

Thanks again,
Ozan
Re: Network of Brokers - Duplicate message add attempt rejected

oseymen
In reply to this post by gtully
Hi Gary,

I downloaded the source, compiled and ran the test - indeed it works perfectly fine. In order to understand where the problem is, I started to make changes one at a time:

1. I replaced HUB with a clean deployment of AMQ 5.5 on a remote virtual machine - no modifications to activemq.xml. So SPOKE is running locally in the unit test and HUB is a physical broker. The only reason I did this was to be able to start/stop the network myself as well (together with SocketProxy).
2. Made the necessary (simple) changes to the attached classes in order to bridge SPOKE with the new HUB.

The test always runs successfully when I am using SocketProxy alone. However, when I start/stop the network manually while the test is running, I see two different results:

If ConnectionFactory.AlwaysSyncSend is true, duplicate messages are reported on the HUB.
If ConnectionFactory.AlwaysSyncSend is false, the number of messages that arrive at the HUB is less than the number of messages sent.

The Windows commands I use to start/stop the network are:
netsh interface set interface "Local Area Connection" DISABLED
netsh interface set interface "Local Area Connection" ENABLED

I have no idea why I don't get the same results with SocketProxy but I can reproduce this whenever I run the test and play with network.

I am attaching the test files that I modified here for your perusal. My changes are really simple and only to bridge a physical broker with internal SPOKE. I am happy to provide any other information you require or try any setting you suggest.

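BrokerQueueNetworkWithDisconnectTest.java (http://activemq.2283324.n4.nabble.com/file/n3790910/BrokerQueueNetworkWithDisconnectTest.java)
JmsMultipleBrokersTestSupport.java (http://activemq.2283324.n4.nabble.com/file/n3790910/JmsMultipleBrokersTestSupport.java)
MessageIdList.java (http://activemq.2283324.n4.nabble.com/file/n3790910/MessageIdList.java)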

Regards,
Ozan
Re: Network of Brokers - Duplicate message add attempt rejected

gtully
With non-persistent messages, the default producer send mode is async.
This non-guaranteed delivery mode is honored by the network connector;
it also does an async send when it forwards the message.
The thinking is that if it is OK to lose messages on a network
disconnect between client and broker, it is also OK to lose messages
broker to broker.

Using AlwaysSyncSend, the producer waits for a broker ack, so it does
a sync send, which is in turn honored by the network connector, so any
inflight messages will be redelivered. They will be duplicates if it
is just the broker ack that is lost.
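
The two outcomes reported in the test (duplicates with sync send, missing messages with async send) can be modelled with a small sketch. This is a toy simulation under stated assumptions, not ActiveMQ code: ForwardingModeSketch, the Glitch enum and the single-glitch link are all hypothetical, and "redelivery" is reduced to sending the same id once more.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of forwarding messages over a link that glitches once; hypothetical names throughout.
class ForwardingModeSketch {
    enum Glitch { MESSAGE_LOST, ACK_LOST }

    // Returns the ids handed to the receiving broker, in arrival order.
    static List<Integer> deliver(int count, boolean syncSend, int glitchAt, Glitch glitch) {
        List<Integer> received = new ArrayList<>();
        for (int id = 0; id < count; id++) {
            boolean glitchNow = (id == glitchAt);
            boolean arrived = !(glitchNow && glitch == Glitch.MESSAGE_LOST);
            if (arrived) {
                received.add(id);
            }
            if (!syncSend) {
                continue; // fire and forget: the sender never learns about the glitch
            }
            boolean acked = arrived && !(glitchNow && glitch == Glitch.ACK_LOST);
            if (!acked) {
                received.add(id); // no ack seen, so the sender redelivers after reconnecting
            }
        }
        return received;
    }

    // Counts how many arrivals repeat an id that was already received.
    static long duplicates(List<Integer> received) {
        return received.size() - received.stream().distinct().count();
    }

    public static void main(String[] args) {
        List<Integer> sync = deliver(10, true, 3, Glitch.ACK_LOST);
        List<Integer> async = deliver(10, false, 3, Glitch.MESSAGE_LOST);
        System.out.println("sync, ack lost:      " + sync.size() + " arrivals, " + duplicates(sync) + " duplicate(s)");
        System.out.println("async, message lost: " + async.size() + " arrivals, " + duplicates(async) + " duplicate(s)");
    }
}
```

In this model, a sync send whose ack is lost produces one extra arrival of the same id, while an async send whose write is lost simply leaves the receiver one message short.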

With the socket proxy, the connection closure is more controlled than
with netsh, so the possibility of writes buffering up on a dead
connection is reduced.


In reply to oseymen's post of 5 September 2011, 11:17.
