ActiveMQ performance during bulk queue GC.

ActiveMQ performance during bulk queue GC.

Kevin Burton
I’m still tracking down this bulk queue GC bug I’ve been dealing with.

Right now I’m working on a test to duplicate the problem.

However, in the meantime I noticed this:

http://pastebin.com/PwPJQLNu

… in RegionBroker.

Essentially, the code holds the write lock for the entire duration of the GC of *all* the queues.

In my experience, GCing a single queue takes about 100ms on disk (at least under load).

It seems like a better strategy here would be to make the GC interval very low, say 5 seconds, but also keep the getMaxPurgedDestinationsPerSweep value low, say 1-5.

The problem is that there is no way to change this value anywhere.  There aren’t any configuration directives for it; from the source, it looks like it’s only used within the tests.

Perhaps this should become a config directive?
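In the meantime the only way I can see to tune it is programmatic.  A rough sketch, assuming BrokerService.getRegionBroker().getAdaptor(RegionBroker.class) hands back the underlying RegionBroker and that a setter exists alongside getMaxPurgedDestinationsPerSweep():

    import org.apache.activemq.broker.BrokerService;
    import org.apache.activemq.broker.region.RegionBroker;

    public class PurgeTuning {
        public static void main(String[] args) throws Exception {
            BrokerService brokerService = new BrokerService();

            // sweep for inactive destinations every 5 seconds
            brokerService.setSchedulePeriodForDestinationPurge(5000);
            brokerService.start();

            // there is no XML directive for the per-sweep cap today, so reach
            // into the RegionBroker directly; this assumes the setter paired
            // with getMaxPurgedDestinationsPerSweep() is accessible
            RegionBroker regionBroker = (RegionBroker)
                    brokerService.getRegionBroker().getAdaptor(RegionBroker.class);
            regionBroker.setMaxPurgedDestinationsPerSweep(5);
        }
    }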

Kevin

--

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>

Re: ActiveMQ performance during bulk queue GC.

Kevin Burton
Also, I realized there’s another bug here: the write lock is acquired even when no queues actually need to be GCd.  It’s not massive in the grand scheme of things, but it probably results in brief moments where the broker isn’t serving any messages.


Re: ActiveMQ performance during bulk queue GC.

Tim Bain
What if we acquired the lock inside the "for (Destination d : map.values())"
loop and immediately GCed that destination, rather than gathering them up
to do at the end as the code you posted currently does?  That allows the
broker to service other requests in the meantime (especially if we
Thread.yield() after releasing the lock at the end of the loop), but still
ensures that a destination isn't allowed to become active between when we
decide it's inactive and when we GC it.  We'd probably need to make a
defensive copy of the map to make sure that calling
getRoot().removeDestination() doesn't cause a ConcurrentModificationException,
but that's not expensive, and I don't see anything else that requires the
lock to be held from one destination to the next.  Am I overlooking anything
here?
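Roughly like this (a sketch only; destinationsLock, destinations, canGc(), context, and timeout are stand-in names for whatever the purge code actually uses):

    // sketch of the reworked purge loop with per-destination locking
    private void purgeInactiveDestinations() throws Exception {
        // defensive copy under the read lock so removeDestination() can't
        // cause a ConcurrentModificationException while we iterate
        List<Destination> candidates;
        destinationsLock.readLock().lock();
        try {
            candidates = new ArrayList<Destination>(destinations.values());
        } finally {
            destinationsLock.readLock().unlock();
        }

        for (Destination d : candidates) {
            destinationsLock.writeLock().lock();
            try {
                // re-check under the write lock so a destination can't become
                // active between the inactivity decision and the purge
                if (canGc(d)) {
                    getRoot().removeDestination(context, d.getActiveMQDestination(), timeout);
                }
            } finally {
                destinationsLock.writeLock().unlock();
            }
            Thread.yield(); // let producers/consumers grab the lock between purges
        }
    }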

None of that explicitly addresses the fact that we acquire the lock before we
know there's work to do, but by releasing the lock between iterations the
consequences of doing that should be negligible, so I think that if we
refactor when we hold the lock we don't need to worry about it.

Tim


Re: ActiveMQ performance during bulk queue GC.

Kevin Burton
On Sun, Feb 22, 2015 at 2:30 PM, Tim Bain <[hidden email]> wrote:

> What if we acquired the lock inside the "for (Destination d :
> map.values())"
> loop and immediately GCed that destination, rather than gathering them up
> to do at the end as the code you posted currently does?


I thought of this too, and while it helps, it would suffer from the problem of only allowing one new producer or consumer in between queue GCs.

So say you have 5000 queues that need to be GCd, and this takes 5-15
minutes.

You only have small windows of time, between GCs, in which new producers can acquire the read lock.

So you would GC one queue, then allow maybe a few producers to come in and do some work, then immediately lock out all producers/consumers again.

Additionally, the ReentrantReadWriteLock is not currently in fair mode.  This would mean the writer *could* always win, which would put us back to square one.  It’s improbable but still possible.
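For reference, fairness is just a constructor flag on the JDK lock (destinationsLock here is an illustrative name):

    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // fair mode: the longest-waiting thread gets the lock next, so the GC
    // writer can't keep barging in ahead of waiting readers
    ReentrantReadWriteLock destinationsLock = new ReentrantReadWriteLock(true);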

This is why I was suggesting making the lock granular on a per-queue-name basis.  That way you can GC to your heart’s content, but you only block if you try to fetch from a queue/topic with the *same* name.

>
> getRoot().removeDestination() doesn't cause a concurrent modification
> exception, but that's not expensive and I don't see anything else that
> requires the lock to be held from one destination to the next.  Am I
> overlooking anything here?
>
>
It’s better, but it still suffers from the above, and I think in practice it would still be unusable and would still show significant latencies under production load.


> All of that doesn't explicitly address the fact that we acquire the lock
> before we know there's work we have to do, but by releasing the lock
> between iterations the consequences of doing that should be negligible, so
> I think if we refactor when we hold the lock then we don't need to worry
> about it.
>

It’s definitely the *easiest* solution though. I’ll give you that.

I just think that the best strategy would be to use a granular lock based on the queue name and then remove the lock when it’s no longer needed.

It would require some tricky concurrency work, though probably nothing too bad: just a concurrent hash map from string (physical queue name) to read/write lock.

You would want to remove the granular lock when you’re done with it though, because if you don’t, you end up with a memory leak of locks that are referenced once and then never used again for VERY ephemeral queue names.
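Something like this, roughly (DestinationLocks and its methods are made up for illustration; they aren’t existing ActiveMQ classes):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // rough sketch of a per-queue-name lock registry
    class DestinationLocks {

        private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
                new ConcurrentHashMap<String, ReentrantReadWriteLock>();

        ReentrantReadWriteLock lockFor(String physicalName) {
            ReentrantReadWriteLock lock = locks.get(physicalName);
            if (lock == null) {
                ReentrantReadWriteLock candidate = new ReentrantReadWriteLock();
                ReentrantReadWriteLock existing = locks.putIfAbsent(physicalName, candidate);
                lock = (existing != null) ? existing : candidate;
            }
            return lock;
        }

        // remove the entry once the destination is gone so VERY ephemeral
        // queue names don't leak lock objects; only safe if nothing can
        // still be waiting on the lock, which is the tricky part
        void remove(String physicalName) {
            locks.remove(physicalName);
        }
    }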

… you know, ANOTHER idea, and this one would be easier to implement, is to just break this out into, say, 8/16/32 locks.

Then you mod the hash of the physical queue name by the number of locks.  It would be easier to implement and more of a ‘concurrent’ read/write lock.

In practice we DO NOT need the concurrency of 5000 locks.  We probably only need 2-4x the number of cores (I think).

It’s easier to implement and reduces the probability of bottlenecks, but it still isn’t perfect.  Algorithmically, it’s more correct to lock on the queue/topic name.
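A striped version along those lines might look like this (again just a sketch with made-up names):

    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // sketch of the striped variant: a small fixed pool of read/write locks
    // indexed by the hash of the physical queue name
    class StripedDestinationLocks {

        private final ReentrantReadWriteLock[] stripes;

        StripedDestinationLocks(int nStripes) { // e.g. 8, 16, or 32
            stripes = new ReentrantReadWriteLock[nStripes];
            for (int i = 0; i < nStripes; i++) {
                stripes[i] = new ReentrantReadWriteLock();
            }
        }

        ReentrantReadWriteLock lockFor(String physicalName) {
            // force a non-negative index; two different names can land on the
            // same stripe, which is why this isn't algorithmically perfect
            int idx = (physicalName.hashCode() & 0x7fffffff) % stripes.length;
            return stripes[idx];
        }
    }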

Kevin
