Question about rollbackOnlyOnAsyncException (AMQ-3166)

alprausch77
Hello.
I have a question about the AMQ-3166 issue and the
'rollbackOnlyOnAsyncException' flag it introduced.
In the comments on the AMQ-3166 issue, Gary Tully says:
    async exceptions on transactional ops - message send and message ack
    will result in the transaction being marked rollback-only. Commit will
    fail with an exception.



But if I send the message in async mode, then it's likely that the
client-side transaction is already committed before(!) an exception occurs on
the AMQ side. In that case the TransportConnection class will no longer find a
transaction for the message (as it's already committed), and of course it
can't be rolled back anymore.
Or do I misunderstand the behaviour?


My current problem is that we switched from async to sync sends because
we've seen some messages getting lost without any error showing up on the
client side. For our use case message loss is not acceptable, so we
switched to sync sends.
But this comes with a high performance impact.
Most messages are sent in 0-3ms, but we see occasional spikes in the send
times (in the ActiveMQMessageProducer) of 30-500ms.

The 'producerFlowControl' is already disabled in the configuration; the
storage is mKahaDB.
And we already tested a lot of configuration combinations with no real
(positive) effect on the performance.

Is there a way to avoid such spikes in the sync send?
(Tests have shown that the spikes are most likely caused by I/O issues:
when using a RAM disk as the location for the KahaDB I don't see any
such spikes.)

Or in async mode: is there a way to have a synchronization on transaction
commit which verifies that the message has been processed?
This way the message processing could be triggered in async mode, the
client could do further work, and it would only synchronize on the
transaction commit.

Thanks.
Joachim

btw: we are running AMQ 5.14.5 (unfortunately on Windows) in combination with Wildfly, so we are on a JEE stack with JTA.



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: Question about rollbackOnlyOnAsyncException (AMQ-3166)

Tim Bain
One thing that could cause pauses like that is the JVM's garbage collector.
What GC strategy are you using, and can you see if the times at which the
higher latency occurs correlate with the times at which full GCs are
occurring?
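
One way to check that correlation without reading GC logs by hand, as a rough sketch only: read the JVM's cumulative GC counters before and after each timed call and see whether a slow call overlapped a collection. `GcCorrelation`, `timed`, and the 30ms threshold are made-up names and values for illustration, and the `Runnable` stands in for the JMS send. The counters describe only the JVM this code runs in, so it would have to run wherever the pause is suspected (client or broker).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcCorrelation {

    /** One timed call: elapsed time plus the GC activity that completed meanwhile. */
    static final class Sample {
        final long elapsedMillis;
        final long gcCount;   // collections that finished during the call
        final long gcMillis;  // collection time accumulated during the call

        Sample(long elapsedMillis, long gcCount, long gcMillis) {
            this.elapsedMillis = elapsedMillis;
            this.gcCount = gcCount;
            this.gcMillis = gcMillis;
        }

        boolean overlappedGc() {
            return gcCount > 0;
        }
    }

    /** Sums collection count and time over all collectors of this JVM. */
    private static long[] gcTotals() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new long[] {count, millis};
    }

    /** Runs the operation and reports how long it took and how much GC happened. */
    static Sample timed(Runnable operation) {
        long[] before = gcTotals();
        long start = System.nanoTime();
        operation.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        long[] after = gcTotals();
        return new Sample(elapsedMillis, after[0] - before[0], after[1] - before[1]);
    }

    public static void main(String[] args) {
        // Placeholder for the real call, e.g. a producer.send(message).
        Sample s = timed(() -> { });
        if (s.elapsedMillis > 30) {
            System.out.println("slow call: " + s.elapsedMillis + "ms, GCs during call: "
                    + s.gcCount + " (" + s.gcMillis + "ms)");
        }
    }
}
```

If the slow calls regularly show `gcCount == 0`, the collector is probably not the cause.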

Tim

In reply to alprausch77's post of Feb 9, 2018, 1:53 AM.

Re: Question about rollbackOnlyOnAsyncException (AMQ-3166)

alprausch77
Hello Tim.
Thank you for your suggestion, but I already checked the GC (I forgot to
mention it).
The times of the slow JMS sends don't correlate with the garbage collector
events.
Btw: we use CMS as the collector.

Any other ideas?

Thanks.
Joachim




Re: Question about rollbackOnlyOnAsyncException (AMQ-3166)

gtully
Transaction commit is a sync operation... if an async send in a transaction
fails, then the commit will roll back via rollbackOnlyOnAsyncException.

W.r.t. the spikes in send: have you enabled the preallocation strategy on
KahaDB? It may be worth toggling to see the effect in your env. Also,
ackCompaction may be in the way. Getting to the bottom of it would need some
thread dumps or profiling... but those are the big events that affect I/O.

In reply to alprausch77's post of Mon, 12 Feb 2018, 05:51.

Re: Question about rollbackOnlyOnAsyncException (AMQ-3166)

gtully
In addition, try the latest Artemis release. There has been a bunch of work
on latency w.r.t. GC for exactly that flat send latency (reliable enqueue)
use case.

In reply to Gary Tully's post of Wed, 14 Feb 2018, 14:29.

Re: Question about rollbackOnlyOnAsyncException (AMQ-3166)

alprausch77
In reply to this post by gtully
Hello Gary.
Thanks for your reply.
I already tried rollbackOnlyOnAsyncException. But it only works if
the transaction still exists when the error in the async send happens.
If the TX is already finished, then nothing happens, because the
code which looks up the transaction returns null (naturally) and therefore
can't do a rollback.

That's why I asked if there is some way to make the TX commit
synchronize on the async outcome...
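
A client-side version of such a synchronization could look roughly like this. It is only a sketch under assumptions: `AsyncSendBarrier`, `track`, and `awaitAll` are hypothetical names, not ActiveMQ API, and each future would have to be completed from whatever completion callback the client library offers for async sends. With container-managed JTA, `awaitAll` would have to run before the transaction manager starts the commit, for example from a `Synchronization.beforeCompletion` callback, and a `false` result would mark the transaction rollback-only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class AsyncSendBarrier {
    private final List<CompletableFuture<Void>> pending = new ArrayList<>();

    /** Registers one in-flight send; complete the returned future from the send callback. */
    synchronized CompletableFuture<Void> track() {
        CompletableFuture<Void> outcome = new CompletableFuture<>();
        pending.add(outcome);
        return outcome;
    }

    /**
     * Blocks until every tracked send has finished.
     * Returns true only if all of them succeeded within the timeout.
     */
    boolean awaitAll(long timeoutMillis) {
        CompletableFuture<?>[] snapshot;
        synchronized (this) {
            snapshot = pending.toArray(new CompletableFuture<?>[0]);
            pending.clear();
        }
        try {
            CompletableFuture.allOf(snapshot).get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (ExecutionException | TimeoutException e) {
            return false; // at least one send failed or did not finish in time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        AsyncSendBarrier barrier = new AsyncSendBarrier();
        CompletableFuture<Void> first = barrier.track();
        CompletableFuture<Void> second = barrier.track();

        // Simulated callbacks: one send succeeds, one is rejected after the fact.
        first.complete(null);
        second.completeExceptionally(new IllegalStateException("broker rejected send"));

        boolean safeToCommit = barrier.awaitAll(1_000);
        System.out.println(safeToCommit ? "commit" : "rollback"); // prints "rollback"
    }
}
```

The client keeps the latency benefit of async sends during the unit of work and pays for one wait at the end, instead of one round trip per message.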

Some time ago I also tested the preallocation strategies but found no
real performance differences between them. I think this has to do with the
Windows file system which we are using; on Unix-based systems this seems to
have more of an effect.

ackCompaction defaults to true on AMQ 5.14.x.
But we also encounter such latencies on AMQs with 5.13.x versions, so this
property shouldn't have an impact.
Also, in my tests I don't have many data files; typically only one data file
is used.

But I will have a look at this and also at Artemis.





Re: Question about rollbackOnlyOnAsyncException (AMQ-3166)

gtully
XA transactions?
Normally the transaction is on the session and connection... so any async
call will complete before the commit... unless maybe NIO is mixing this up
some.
Do you have some code that shows the rollbackOnlyOnAsyncException case
where the tx is null?

There is a test case that can provide the basis for some experiments:
AMQ3166Test



On Wed, 14 Feb 2018 at 14:38 alprausch77 wrote:

> I haven't tried the ackCompaction yet. But if I understand it correctly, I
> don't think this will have much influence either, because we aren't doing
> much bulk processing. In each transaction only a few messages are
> received / sent.

Re: Question about rollbackOnlyOnAsyncException (AMQ-3166)

alprausch77
Yes, we use XA transactions.

I don't think that NIO mixes this up somehow. We can run our system with
a standalone ActiveMQ using tcp or nio, but also in an embedded mode inside
Wildfly using the VM protocol.

I just ran our tests again with enableAckCompaction set to false for the
KahaDB. It looked a lot better: we now only have a few message sends
taking ~10ms or ~20ms (which is also a long time...), but no send took longer
than that.

I'll run further tests in that direction and will keep this post updated.

Thanks


Update: we still encounter message sends >100ms when running the tests over a longer period of time. But far fewer such long-running sends occur if the enableAckCompaction property is set to false.





Re: Question about rollbackOnlyOnAsyncException (AMQ-3166)

Tim Bain
I'd echo Gary's suggestion that you capture thread dumps or perform
sampling during the slow events. That would make it clear where the time is
being spent.
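
If attaching a profiler is impractical, one possible approach is a small watchdog that takes a thread dump once a call has been running longer than a threshold. A rough sketch using only the JDK's `ThreadMXBean`; `SlowCallWatchdog`, `run`, and the thresholds are made-up names and values, and the `Runnable` stands in for the JMS send. It only sees threads of the JVM it runs in, so to see what the broker is doing it would have to run inside the broker JVM (or a tool such as `jstack` would be pointed at the broker process instead).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class SlowCallWatchdog implements AutoCloseable {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "slow-call-watchdog");
                t.setDaemon(true);
                return t;
            });
    private final long thresholdMillis;

    SlowCallWatchdog(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    /**
     * Runs the operation. Returns a thread dump taken once the threshold was
     * exceeded, or null if the operation finished sooner. A dump still being
     * written when the operation returns is not waited for.
     */
    String run(Runnable operation) {
        AtomicReference<String> dump = new AtomicReference<>();
        ScheduledFuture<?> task = timer.schedule(
                () -> dump.set(dumpAllThreads()), thresholdMillis, TimeUnit.MILLISECONDS);
        try {
            operation.run();
        } finally {
            task.cancel(false); // fast call: the dump is never taken
        }
        return dump.get();
    }

    /** Formats the full stack of every live thread in this JVM. */
    private static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }

    public static void main(String[] args) {
        try (SlowCallWatchdog watchdog = new SlowCallWatchdog(50)) {
            String dump = watchdog.run(() -> {
                try {
                    Thread.sleep(200); // stands in for a slow send
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            System.out.println(dump == null ? "call was fast" : dump);
        }
    }
}
```

Dumps from several slow sends that show the same frames (for example a journal write or a lock wait) would point at where the time goes.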

Tim


In reply to alprausch77's post of Wed, Feb 14, 2018, 11:16 PM.