ARTEMIS-2894 Message redistribution don't work anymore when paging start

Gaetan.c
Hello,

I opened the following ticket at the beginning of September: https://issues.apache.org/jira/browse/ARTEMIS-2894

My goal is to replace our ActiveMQ 5 network of brokers with Artemis.

I started working on it with Artemis 2.9.0 and then 2.11.0, and successfully configured a network of brokers with the equivalent of replayWhenNoConsumers (redistribution with redistributionDelay=0).
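For readers following along, here is a minimal sketch of the kind of broker.xml fragment such a setup implies. It is not the configuration attached to the ticket. The cluster name my-cluster is taken from the queue name quoted in this post; the connector names (netty-connector, broker2-connector), the use of static connectors, and the max-hops value are assumptions made for illustration, and the connectors themselves would still have to be declared elsewhere in the file.

```xml
<!-- Sketch only: fragment of broker.xml on broker1, with assumed connector names -->
<core xmlns="urn:activemq:core">

  <cluster-connections>
    <cluster-connection name="my-cluster">
      <!-- connector that other cluster members use to reach this broker (hypothetical name) -->
      <connector-ref>netty-connector</connector-ref>
      <!-- ON_DEMAND forwards messages only to nodes with matching consumers;
           redistribution relies on this mode -->
      <message-load-balancing>ON_DEMAND</message-load-balancing>
      <max-hops>1</max-hops>
      <static-connectors>
        <!-- connector pointing at the other broker (hypothetical name) -->
        <connector-ref>broker2-connector</connector-ref>
      </static-connectors>
    </cluster-connection>
  </cluster-connections>

  <address-settings>
    <address-setting match="#">
      <!-- 0 = redistribute immediately once a queue has no local consumers;
           the default of -1 disables redistribution -->
      <redistribution-delay>0</redistribution-delay>
    </address-setting>
  </address-settings>

</core>
```

With redistribution-delay set to 0, messages sitting on a queue with no local consumer become eligible to move as soon as a matching consumer appears on another cluster member, which is the closest equivalent to the ActiveMQ 5 behavior mentioned above.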

The problem is that since 2.12, this no longer works when many messages accumulate on a broker and trigger paging.
When a consumer connects to another member of the network of brokers, redistribution starts, but not all messages are transferred and most of them disappear.
They seem to be stuck in a destination used for clustering once paging occurs on it too (internal.sf.my-cluster.335036cc-edc0-11ea-bae1-005056a7672b, for example).

I can't find what has caused this since 2.12, so if someone with more knowledge of paging and clustering could help me, it would be great.

Regards,
Gaetan



Re: ARTEMIS-2894 Message redistribution don't work anymore when paging start

clebertsuconic
Redistribution should still work, but it only happens as messages
are consumed from the broker.

If you have a bunch of messages stuck on the current broker, it won't
keep scanning for redistribution unless you free up space in memory.

I think that's documented. You should probably avoid a slow or stuck
consumer in this case.

On Wed, Oct 14, 2020 at 11:17 AM gaetan caumartin
<[hidden email]> wrote:

> [...]


--
Clebert Suconic

RE: ARTEMIS-2894 Message redistribution don't work anymore when paging start

Gaetan.c
It should still work, but it doesn't (at least for me).

I tested it properly in 2.11 and 2.12 with the same configuration. With 2.11 it works perfectly, even with millions of messages stored before I try to consume them on another member of the cluster. With 2.12 and newer versions it fails as soon as paging is triggered.

I'm using two brokers installed on different servers in a cluster (let's call them broker1 and broker2). I produced messages on broker1 until paging started, then I stopped the producer and connected a consumer to broker2.
As long as the global-max-size is not reached on broker1, I have no problem with redistribution on 2.12+; it works correctly.
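To make the trigger concrete, here is a sketch of the memory settings that decide when broker1 starts paging. All values are placeholders chosen for illustration, not the ones used in these tests; only the element names come from the Artemis configuration schema.

```xml
<!-- Sketch only: memory limits on broker1, with placeholder values -->
<core xmlns="urn:activemq:core">

  <!-- broker-wide memory limit for messages; once it is reached,
       addresses using the PAGE policy start paging to disk -->
  <global-max-size>100Mb</global-max-size>

  <address-settings>
    <address-setting match="#">
      <!-- per-address limit; reaching it also starts paging for that address -->
      <max-size-bytes>10Mb</max-size-bytes>
      <!-- size of each page file; kept smaller than max-size-bytes -->
      <page-size-bytes>1Mb</page-size-bytes>
      <address-full-policy>PAGE</address-full-policy>
    </address-setting>
  </address-settings>

</core>
```

Lowering these limits should make the scenario quicker to reproduce, since paging then starts after far fewer messages have been produced on broker1.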

I can't have a slow or stuck consumer on broker1, as I never connect one to it, and I can see redistribution starting when the consumer connects to broker2. I consume some messages before paging is triggered on the "clustering" destination; after that, the messages seem to be stuck in this destination on broker1.

I tested it with different memory configurations (global-max-size and Xmx), but this has no impact on the described scenario.
I tried with OpenWire clients (which I used with ActiveMQ 5) and with the clients included with the artemis script, and the behavior is the same.
I also tried on different servers.
I repeated this test again and again with 2.11 and 2.12, trying different configurations, and I can only conclude that there is a regression in 2.12.

I saw nothing about memory usage or paging in the documentation related to redistribution.
________________________________
From: Clebert Suconic <[hidden email]>
Sent: Thursday, 15 October 2020 03:06
To: [hidden email] <[hidden email]>
Subject: Re: ARTEMIS-2894 Message redistribution don't work anymore when paging start

[...]