ActiveMQ 5.10.0 queue slowed down, restart helped

machinery
Hello,
We're using ActiveMQ 5.10.0.
About a month ago we noticed that one of our queues started overflowing. There was no apparent reason: the consumers didn't seem slower than normal, and messages from the producers were not arriving any faster than normal. Still, messages were being produced faster than they were being consumed.
Simply restarting the broker fixed the issue: the big backlog of waiting messages was consumed quite quickly and the problem disappeared.

Since it already happened some time ago, I probably won't be able to provide many more details. However, are you aware of any similar problems or known issues that could have caused such behavior?

Thanks in advance for any help.

Best regards,
Piotr

sbarlabanov
Which environment do you have? Are you using an application server with ActiveMQ RAR or are the clients running standalone?

We recently had a problem with a message-driven bean (Glassfish + ActiveMQ RAR) that stopped acknowledging messages, so they accumulated in the queue. All the messages were hanging at the consumer: its MessageCountAwaitingAcknowledge JMX attribute was >> 0. So for some reason the consumer was not properly closed.
Maybe you have a similar situation.

Best regards,
Sergiy

artnaseef
In reply to this post by machinery
In my experience, there are many possible causes of this type of problem, and the majority are problems in the application.

Without more details, it will be very hard to track down the cause that impacted your solution.

A review of the client code, with a focus on verifying ActiveMQ interactions, is often a good way to find and fix this kind of problem.

James Carman
In reply to this post by machinery
Obvious question here, but have there been any changes in the application
code recently?

mo
In reply to this post by machinery
Hi,

I work with Piotr on this issue. Let me try to provide some additional
information on our slow-down issue:

Storage is a PostgreSQL Server 9.3.2 on a Debian Wheezy / Kernel 3.2.51-1
System.

We use JDBC and the PGPoolingDataSource
(org.postgresql.ds.PGPoolingDataSource).

This is the persistenceAdapter configuration:
         <persistenceAdapter>
             <jdbcPersistenceAdapter dataDirectory="activemq-data"
                                     dataSource="#postgres-ds"
                                     lockKeepAlivePeriod="0"
                                     createTablesOnStartup="false" />
         </persistenceAdapter>

We have 2 destination interceptors set up. We also run the demo code
(jetty-demo) because we have some applications using the HTTP/REST
interface it provides. We don't run Camel.

Other than that it's a pretty mundane setup. We also run two
instances at the same time as a sort of fail-over. Because of the
JDBC backend, only one of them is active, and we use the failover
protocol on the client side to reach the active one. We use haproxy to
serve the web interface from the active instance. Both ActiveMQ
instances run on the same Linux box, with different service IP
addresses (they use the same binaries; only configuration and data
directory are separate). The reason we run two instances is that we
had big stability issues before, with the ActiveMQ process more or
less hanging itself up. We could move away from that setup, because
with 5.10 this hasn't happened.

Like the database server, the linux box that runs the activemq instance
is a Debian Wheezy Linux, but with Kernel 3.2.60-1+deb7u1.

Problem description: Once in a while we see 100% cpu load on the database.
We can isolate that to sql statements of the style:

SELECT ID, PRIORITY FROM ACTIVEMQ_MSGS WHERE
MSGID_PROD='ID:tomcat10-XXX-41356-1422538681150-1:95156:1:1' AND
MSGID_SEQ='1' AND  CONTAINER='queue://XXX_export'

These SQL statements take more than 500 ms. We've had scenarios where
they took more than 3 seconds to complete. When they took about 500 ms,
the queue size was ~1200 messages across all queues (concentrated in
one queue), with a production rate of about 2-3 messages per second and
a consumption rate of about 2 messages per second. IMHO the query time
scales linearly with the queue size.

We were able to "resolve" the issue by restarting both activemq
instances. After that, the load on the database drops dramatically,
instead of 100% cpu usage we see less than 10% on the database and a
very fast recovery. The ActiveMQ-Processes look fine too.

My first guess was a missing database index, but the indexes look fine.
Besides, restarting the ActiveMQ instances resolves the issue, which
is very weird to me. I don't think it's a database lock either,
because we couldn't see any, and additionally we see 100% CPU usage for
the process executing the statement (postgres spawns a process per
connection). IMHO (but I'm no database expert) that should not happen
in a lock situation either.

We're at a loss. Do you guys have an idea?

And one more thing: Once every two or three hours a lot of (several
thousand) messages are created. But the above described problem is
happening irregularly, every one or two weeks or so.

Best regards,
Mark

xabhi
Hi,
I am also facing a somewhat similar issue where ActiveMQ seems not to be delivering any messages to queue consumers (only for persistent messages, though). The consumers are not slow, and there aren't any fast producers either.

http://activemq.2283324.n4.nabble.com/Broker-not-delivering-persistent-messages-to-consumer-on-queue-td4691245.html
 and
http://stackoverflow.com/questions/28556049/activemq-not-delivering-dispatching-persistent-messages-on-queues?noredirect=1#comment45424344_28556049

I have also posted this on SO because nobody replied to the previous post. Please check out the links and see if any of the observations relate to your situation.

Thanks,
Abhi

Tim Bain
In reply to this post by mo
Mark,

You say the indices are OK; can you describe them for us, and can you find
out the execution plan for the query?  Also, if you issue the same query
directly against the database when this is happening, is that also slow?
I'm looking for whether the query itself is slow or the query is fast but
the surrounding ActiveMQ code is slow.

Also, have you looked to see if any computing resources (CPU, disk I/O,
network I/O, etc.) are heavily taxed on any of the machines involved (the
broker and the database server; any others?)?  Getting an idea of the
limiting resource might help figure out the problem.

Tim

Tim Bain
In reply to this post by xabhi
Abhi,

What part of your problem seems related to Mark's problem to you?  Are you
using JDBC and PostgreSQL?  Are you seeing a slowdown in throughput (but
not a complete stoppage)?  Have you confirmed that the delay is linear with
the number of queued messages?

Other than the fact that you're both using ActiveMQ, I don't see anything
at all that would have made you think that your problem relates to Mark's
and would be relevant to helping Mark find a solution to his.

Tim

mo
In reply to this post by Tim Bain
Hi Tim,

thanks for taking an interest.

This is the table's description:

amq=> \d activemq_msgs
          Table "public.activemq_msgs"
   Column   |          Type          | Modifiers
------------+------------------------+-----------
 id         | bigint                 | not null
 container  | character varying(250) |
 msgid_prod | character varying(250) |
 msgid_seq  | bigint                 |
 expiration | bigint                 |
 msg        | bytea                  |
 priority   | bigint                 |
 xid        | character varying(250) |
Indexes:
    "activemq_msgs_pkey" PRIMARY KEY, btree (id)
    "activemq_msgs_cidx" btree (container)
    "activemq_msgs_eidx" btree (expiration)
    "activemq_msgs_idx" btree (msgid_prod)
    "activemq_msgs_midx" btree (msgid_prod, msgid_seq)
    "activemq_msgs_pidx" btree (priority)
    "activemq_msgs_xidx" btree (xid)

Running an explain I get...

amq=> explain SELECT ID, PRIORITY FROM ACTIVEMQ_MSGS WHERE
MSGID_PROD='ID:tomcat10-XXX-41356-1422538681150-1:95156:1:1' AND
MSGID_SEQ='1' AND  CONTAINER='queue://XXX_export';
                                QUERY PLAN
------------------------------------------------------------------------
 Index Scan using activemq_msgs_cidx on activemq_msgs  (cost=0.42..8.45 rows=1 width=16)
   Index Cond: ((container)::text = 'queue://XXX_export'::text)
   Filter: (((msgid_prod)::text = 'ID:tomcat10-XXX-41356-1422538681150-1:95156:1:1'::text) AND (msgid_seq = 1::bigint))
(3 rows)

I think the Filter here could be problematic. Though I'm not sure why it
is not using activemq_msgs_idx or activemq_msgs_midx.

When I issue the same type of query against the database during a
slow-down, I get similarly slow results as the ActiveMQ process does.
However, after restarting ActiveMQ and then issuing the same type of
query (of course changing some parameters so no caching occurs) we see
very fast responses.

On the database we always see 100% cpu usage on one core, by one
process. There's no I/O issue as far as I can tell.

One more hint: We have two queues that usually get very big during these
slow-downs, and the response times of the above statements scale roughly
linearly with their size. Just to give you an idea: queue "R" might have
3000 messages and take 3 seconds per statement, while queue "B" might
have 2000 messages and take about 2 seconds per statement. So it does
look very much like the filter is the issue, but the thing still
throwing me off is simply that an ActiveMQ restart fixes the issue.
After that, the very same statements run fast.
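
As a rough check of the linear-scaling observation, here is a minimal
sketch that turns the figures quoted in this thread into a per-message
lookup cost. The function names are hypothetical, and the throughput
ceiling assumes, purely for illustration, that each consumed message
costs one such lookup and that lookups run strictly one at a time; the
thread does not confirm that this is how the broker behaves.

```python
# Figures quoted in this thread: (queued messages, seconds per lookup).
# The 2000/3000 values were given as rough examples, not measurements.
observations = [(1200, 0.5), (2000, 2.0), (3000, 3.0)]


def per_message_cost_ms(queue_size, seconds):
    """Milliseconds of lookup time per queued message."""
    return seconds * 1000.0 / queue_size


def max_serial_rate(seconds):
    """Upper bound on messages/second if lookups run one at a time (assumption)."""
    return 1.0 / seconds


for size, secs in observations:
    print(f"{size} msgs: {per_message_cost_ms(size, secs):.2f} ms per queued message, "
          f"at most {max_serial_rate(secs):.2f} msg/s")
```

Under that assumption, a 500 ms lookup caps consumption at 2 messages
per second, which is in the same range as the consumption rate reported
earlier, and the cap drops further as the queue grows. That would be
consistent with a backlog that feeds on itself, but it is only a
plausibility check, not a diagnosis.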

best regards,
Mark



--
Kind regards

Mark Schmitt
--
intratop UG (haftungsbeschränkt)

Lise-Meitner-Straße 9
89081 Ulm

Phone: +49-731-146603-70
Direct line: +49-731-146603-79
Fax: +49-731-146603-72

E-mail: [hidden email]

Represented by: Mr. Mark Oliver Schmitt


Register entry:
Registered in the commercial register.
Register court: Amtsgericht Ulm
Registration number: HRB 727676

xabhi
In reply to this post by Tim Bain
Hi Tim,
The reason I think it might be related (though I use KahaDB for persistence) is that this issue specifically affected persistent messages in my case, and it happens every week or two.
Restarting helped with the consumption of messages.
In my case message consumption was delayed multiple times, for some 30 minutes or so and sometimes for about an hour.
Resource consumption is also normal; nothing looks wrong there.
I haven't been able to reproduce it.

I am asking for pointers so that I can debug this issue better the next time it happens. For the same reason, I asked @mo to take a look at my observations and see whether they also occur in their setup.

Sorry if at any point I diverted the discussion into wrong direction.

Thanks,
Abhi

mo
Hi Abhi,

I can't see any similarities to my problem. I think we're having very
different issues.

best regards,
Mark


Tim Bain
In reply to this post by mo
Mo,

Sorry for the long delay in getting back to you; maybe you've already
figured out your problem, but if not hopefully this will help.

My understanding of how RDBMSes use indices is based on Oracle, so take
this with a grain of salt since it might not apply to PostgreSQL.

As I understand it, most RDBMSes will only pick a single index (the one
they think will result in the most efficient query execution) for a simple
SELECT query like yours; the index will be used to identify rows matching
as many of the criteria in the query as possible, and then those rows will
be loaded to see if they match any criteria that can't be evaluated via the
index and to retrieve any values in the SELECT clause that aren't in the
index.  Using an index that contains only some of the columns in your
WHERE clause is better than doing a full-table scan, but it's not great
because you load lots of rows that you don't need to load.  If all of the
columns in your SELECT and WHERE clauses are in the index, it can skip the
row retrieval entirely (which is the fastest scenario); if all of the
columns in your WHERE clause are in the index, it will only do retrievals
of the rows that actually match (which is still pretty fast and should be
your goal).

In your case, you have indices on CONTAINER, on MSGID_PROD, and on
(MSGID_PROD, MSGID_SEQ), but no index that covers all three columns
together.  I believe that your performance should get better if you add
that additional index (or modify one of those existing indices to add the
missing fields; you'll have to evaluate which approach is better for your
needs).

Tim
