KAHADB clean up old log files

classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

KAHADB clean up old log files

Hill, Steve
Hi fellow ActiveMQ users!

I have searched all over the internet and can not find a solution to a problem we are experiencing where old log files are not removed from KAHADB.  Needless to say we uncovered this problem when we exceeded the size of our persistent store in production :-)

We are using client side acknowledgement for all reads from the Queue and have restarted all clients and also ActiveMQ numerous times, yet the messages do not get removed.  We are also using the network connector to move messages to the queue that has an active consumer.  We were using 5.13.0 but moved to 5.14.0 on the hope it might address this problem.

Follows are some more detailed messages; does anyone have an idea how we can resolve this problem?

KAHADB logging is enabled and we see the following messages in the log file.


 2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1278 as contained ack(s) refer to referenced file: [1278, 1277]

 2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1279 as contained ack(s) refer to referenced file: [674]

 2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1280 as contained ack(s) refer to referenced file: [691]

 2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1281 as contained ack(s) refer to referenced file: [699]

 2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1282 as contained ack(s) refer to referenced file: [699]

 2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1283 as contained ack(s) refer to referenced file: [705]

 2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1284 as contained ack(s) refer to referenced file: [711]

 2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1288 as contained ack(s) refer to referenced file: [674]

 2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1289 as contained ack(s) refer to referenced file: [790]

 2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1290 as contained ack(s) refer to referenced file: [1290, 1278]

 2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase                - not removing data file: 1291 as contained ack(s) refer to referenced file: [790]

The files at this point are months old.


-rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log

-rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log

-rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log

-rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log

-rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log

-rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log

Thanks!

Steve Hill

ARMADA




The information contained in this e-mail and any attachments is confidential and
intended only for the recipient. If you are not the intended recipient, the
information contained in this message may not be used, copied, or forwarded to
third parties or otherwise distributed for any other purpose. Please notify the
sender if you received this e-mail in error and delete the e-mail and its
attachments promptly.  Nothing in this e-mail may be used or deemed to form the
basis of a contractual or any other legally binding obligation unless separately
confirmed in writing by an authorized representative of ARMADA.
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

christopher.l.shannon
Do you have old messages laying around?  KahaDB can't clean up old journal
files if there are messages scattered throughout the logs.  Even a single
unacked message means that a journal file (and the future files) can't be
cleaned up. This situation tends to happen either when there are old
messages not being acked quick enough or there are different destinations
with different speeds.  Ie if one queue has a very slow message rate and
another queue has a high message rate.

Assuming there isn't some sort of bug with the GC process then one way to
help lower the usage would be to split up destinations into different
stores using multiKahaDB.  Take a look at Gary's blog post on this:
http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html


On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:

> Hi fellow ActiveMQ users!
>
> I have searched all over the internet and can not find a solution to a
> problem we are experiencing where old log files are not removed from
> KAHADB.  Needless to say we uncovered this problem when we exceeded the
> size of our persistent store in production :-)
>
> We are using client side acknowledgement for all reads from the Queue and
> have restarted all clients and also ActiveMQ numerous times, yet the
> messages do not get removed.  We are also using the network connector to
> move messages to the queue that has an active consumer.  We were using
> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>
> Follows are some more detailed messages; does anyone have an idea how we
> can resolve this problem?
>
> KAHADB logging is enabled and we see the following messages in the log
> file.
>
>
>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1278 as contained ack(s) refer to
> referenced file: [1278, 1277]
>
>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1279 as contained ack(s) refer to
> referenced file: [674]
>
>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1280 as contained ack(s) refer to
> referenced file: [691]
>
>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1281 as contained ack(s) refer to
> referenced file: [699]
>
>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1282 as contained ack(s) refer to
> referenced file: [699]
>
>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1283 as contained ack(s) refer to
> referenced file: [705]
>
>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1284 as contained ack(s) refer to
> referenced file: [711]
>
>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1288 as contained ack(s) refer to
> referenced file: [674]
>
>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1289 as contained ack(s) refer to
> referenced file: [790]
>
>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1290 as contained ack(s) refer to
> referenced file: [1290, 1278]
>
>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1291 as contained ack(s) refer to
> referenced file: [790]
>
> The files at this point are months old.
>
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>
> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>
> Thanks!
>
> Steve Hill
>
> ARMADA
>
>
>
>
> The information contained in this e-mail and any attachments is
> confidential and
> intended only for the recipient. If you are not the intended recipient, the
> information contained in this message may not be used, copied, or
> forwarded to
> third parties or otherwise distributed for any other purpose. Please
> notify the
> sender if you received this e-mail in error and delete the e-mail and its
> attachments promptly.  Nothing in this e-mail may be used or deemed to
> form the
> basis of a contractual or any other legally binding obligation unless
> separately
> confirmed in writing by an authorized representative of ARMADA.
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Tim Bain
The most likely place for you to have old messages laying around is in the
DLQ.  The next most likely place is on a topic for an offline (and never
coming back) durable subscriber.

Tim

On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
[hidden email]> wrote:

Do you have old messages laying around?  KahaDB can't clean up old journal
files if there are messages scattered throughout the logs.  Even a single
unacked message means that a journal file (and the future files) can't be
cleaned up. This situation tends to happen either when there are old
messages not being acked quick enough or there are different destinations
with different speeds.  Ie if one queue has a very slow message rate and
another queue has a high message rate.

Assuming there isn't some sort of bug with the GC process then one way to
help lower the usage would be to split up destinations into different
stores using multiKahaDB.  Take a look at Gary's blog post on this:
http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html


On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:

> Hi fellow ActiveMQ users!
>
> I have searched all over the internet and can not find a solution to a
> problem we are experiencing where old log files are not removed from
> KAHADB.  Needless to say we uncovered this problem when we exceeded the
> size of our persistent store in production :-)
>
> We are using client side acknowledgement for all reads from the Queue and
> have restarted all clients and also ActiveMQ numerous times, yet the
> messages do not get removed.  We are also using the network connector to
> move messages to the queue that has an active consumer.  We were using
> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>
> Follows are some more detailed messages; does anyone have an idea how we
> can resolve this problem?
>
> KAHADB logging is enabled and we see the following messages in the log
> file.
>
>
>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1278 as contained ack(s) refer to
> referenced file: [1278, 1277]
>
>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1279 as contained ack(s) refer to
> referenced file: [674]
>
>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1280 as contained ack(s) refer to
> referenced file: [691]
>
>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1281 as contained ack(s) refer to
> referenced file: [699]
>
>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1282 as contained ack(s) refer to
> referenced file: [699]
>
>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1283 as contained ack(s) refer to
> referenced file: [705]
>
>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1284 as contained ack(s) refer to
> referenced file: [711]
>
>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1288 as contained ack(s) refer to
> referenced file: [674]
>
>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1289 as contained ack(s) refer to
> referenced file: [790]
>
>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1290 as contained ack(s) refer to
> referenced file: [1290, 1278]
>
>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1291 as contained ack(s) refer to
> referenced file: [790]
>
> The files at this point are months old.
>
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>
> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>
> Thanks!
>
> Steve Hill
>
> ARMADA
>
>
>
>
> The information contained in this e-mail and any attachments is
> confidential and
> intended only for the recipient. If you are not the intended recipient,
the

> information contained in this message may not be used, copied, or
> forwarded to
> third parties or otherwise distributed for any other purpose. Please
> notify the
> sender if you received this e-mail in error and delete the e-mail and its
> attachments promptly.  Nothing in this e-mail may be used or deemed to
> form the
> basis of a contractual or any other legally binding obligation unless
> separately
> confirmed in writing by an authorized representative of ARMADA.
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Steve Hill
In reply to this post by Hill, Steve
Chris/Tim:

Thanks for the ideas.  Unfortunately neither are the case in our situation.  The web client shows 0 messages in any of our queues and we are not using topics.  We did set up the remote debug port and downloaded the source code if there is anything useful to further troubleshoot that could be gleaned from there?

Thanks!
Steve.


The most likely place for you to have old messages laying around is in the
DLQ.  The next most likely place is on a topic for an offline (and never
coming back) durable subscriber.

Tim

On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
[hidden email]> wrote:

Do you have old messages laying around?  KahaDB can't clean up old journal
files if there are messages scattered throughout the logs.  Even a single
unacked message means that a journal file (and the future files) can't be
cleaned up. This situation tends to happen either when there are old
messages not being acked quick enough or there are different destinations
with different speeds.  Ie if one queue has a very slow message rate and
another queue has a high message rate.

Assuming there isn't some sort of bug with the GC process then one way to
help lower the usage would be to split up destinations into different
stores using multiKahaDB.  Take a look at Gary's blog post on this:
http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html


On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:

> Hi fellow ActiveMQ users!
>
> I have searched all over the internet and can not find a solution to a
> problem we are experiencing where old log files are not removed from
> KAHADB.  Needless to say we uncovered this problem when we exceeded the
> size of our persistent store in production :-)
>
> We are using client side acknowledgement for all reads from the Queue and
> have restarted all clients and also ActiveMQ numerous times, yet the
> messages do not get removed.  We are also using the network connector to
> move messages to the queue that has an active consumer.  We were using
> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>
> Follows are some more detailed messages; does anyone have an idea how we
> can resolve this problem?
>
> KAHADB logging is enabled and we see the following messages in the log
> file.
>
>
>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1278 as contained ack(s) refer to
> referenced file: [1278, 1277]
>
>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1279 as contained ack(s) refer to
> referenced file: [674]
>
>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1280 as contained ack(s) refer to
> referenced file: [691]
>
>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1281 as contained ack(s) refer to
> referenced file: [699]
>
>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1282 as contained ack(s) refer to
> referenced file: [699]
>
>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1283 as contained ack(s) refer to
> referenced file: [705]
>
>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1284 as contained ack(s) refer to
> referenced file: [711]
>
>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1288 as contained ack(s) refer to
> referenced file: [674]
>
>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1289 as contained ack(s) refer to
> referenced file: [790]
>
>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1290 as contained ack(s) refer to
> referenced file: [1290, 1278]
>
>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1291 as contained ack(s) refer to
> referenced file: [790]
>
> The files at this point are months old.
>
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>
> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>
> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>
> Thanks!
>
> Steve Hill
>
> ARMADA
>
>
>
>
> The information contained in this e-mail and any attachments is
> confidential and
> intended only for the recipient. If you are not the intended recipient,
the

> information contained in this message may not be used, copied, or
> forwarded to
> third parties or otherwise distributed for any other purpose. Please
> notify the
> sender if you received this e-mail in error and delete the e-mail and its
> attachments promptly.  Nothing in this e-mail may be used or deemed to
> form the
> basis of a contractual or any other legally binding obligation unless
> separately
> confirmed in writing by an authorized representative of ARMADA.
>


Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Steve Hill
After the comment about the TOPIC I decided to take a quick look at that
in the web interface (just to make sure nothing had gone awry)  In deed
one of our "queues" had been created as a topic, not sure why as we use
the same code to connect to the queues however once removed most of the
data files vanished.

Under the topics there are several activemq.advisory.xyz topics. Could
these be the source of the remaining kahadb log files that still remain?

I am now down to three data files that should not be there. (db-980,
db-1329, and db-1438)

-rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
-rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
-rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
-rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
-rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
-rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
-rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log

 From kahadb logs

  2016-09-22 08:31:07,503 [eckpoint Worker] TRACE
MessageDatabase                - gc candidates: [1329]
  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE
MessageDatabase                - ackMessageFileMap: {1329=[980],
1438=[], 1456=[]}
  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE
MessageDatabase                - not removing data file: 1329 as
contained ack(s) refer to referenced file: [980]
  2016-09-22 08:31:07,505 [eckpoint Worker] TRACE
MessageDatabase                - Not yet time to check for compaction: 7
of 10 cycles
  2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG
MessageDatabase                - Checkpoint done.

Thanks!
Steve.

On 9/22/16 8:14 AM, Steve Hill wrote:

> Chris/Tim:
>
> Thanks for the ideas.  Unfortunately neither are the case in our
> situation.  The web client shows 0 messages in any of our queues and
> we are not using topics.  We did set up the remote debug port and
> downloaded the source code if there is anything useful to further
> troubleshoot that could be gleaned from there?
>
> Thanks!
> Steve.
>
>
> The most likely place for you to have old messages laying around is in
> the
> DLQ.  The next most likely place is on a topic for an offline (and never
> coming back) durable subscriber.
>
> Tim
>
> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
> [hidden email]> wrote:
>
> Do you have old messages laying around?  KahaDB can't clean up old
> journal
> files if there are messages scattered throughout the logs.  Even a single
> unacked message means that a journal file (and the future files) can't be
> cleaned up. This situation tends to happen either when there are old
> messages not being acked quick enough or there are different destinations
> with different speeds.  Ie if one queue has a very slow message rate and
> another queue has a high message rate.
>
> Assuming there isn't some sort of bug with the GC process then one way to
> help lower the usage would be to split up destinations into different
> stores using multiKahaDB.  Take a look at Gary's blog post on this:
> http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html
>
>
> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:
>
>> Hi fellow ActiveMQ users!
>>
>> I have searched all over the internet and can not find a solution to a
>> problem we are experiencing where old log files are not removed from
>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
>> size of our persistent store in production :-)
>>
>> We are using client side acknowledgement for all reads from the Queue
>> and
>> have restarted all clients and also ActiveMQ numerous times, yet the
>> messages do not get removed.  We are also using the network connector to
>> move messages to the queue that has an active consumer.  We were using
>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>>
>> Follows are some more detailed messages; does anyone have an idea how we
>> can resolve this problem?
>>
>> KAHADB logging is enabled and we see the following messages in the log
>> file.
>>
>>
>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1278 as contained ack(s) refer to
>> referenced file: [1278, 1277]
>>
>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1279 as contained ack(s) refer to
>> referenced file: [674]
>>
>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1280 as contained ack(s) refer to
>> referenced file: [691]
>>
>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1281 as contained ack(s) refer to
>> referenced file: [699]
>>
>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1282 as contained ack(s) refer to
>> referenced file: [699]
>>
>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1283 as contained ack(s) refer to
>> referenced file: [705]
>>
>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1284 as contained ack(s) refer to
>> referenced file: [711]
>>
>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1288 as contained ack(s) refer to
>> referenced file: [674]
>>
>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1289 as contained ack(s) refer to
>> referenced file: [790]
>>
>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1290 as contained ack(s) refer to
>> referenced file: [1290, 1278]
>>
>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1291 as contained ack(s) refer to
>> referenced file: [790]
>>
>> The files at this point are months old.
>>
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>>
>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>>
>> Thanks!
>>
>> Steve Hill
>>
>> ARMADA
>>
>>
>>
>>
>> The information contained in this e-mail and any attachments is
>> confidential and
>> intended only for the recipient. If you are not the intended recipient,
> the
>> information contained in this message may not be used, copied, or
>> forwarded to
>> third parties or otherwise distributed for any other purpose. Please
>> notify the
>> sender if you received this e-mail in error and delete the e-mail and
>> its
>> attachments promptly.  Nothing in this e-mail may be used or deemed to
>> form the
>> basis of a contractual or any other legally binding obligation unless
>> separately
>> confirmed in writing by an authorized representative of ARMADA.
>>
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

tabish121@gmail.com
On 09/22/2016 08:39 AM, Steve Hill wrote:
> After the comment about the TOPIC I decided to take a quick look at
> that in the web interface (just to make sure nothing had gone awry)  
> In deed one of our "queues" had been created as a topic, not sure why
> as we use the same code to connect to the queues however once removed
> most of the data files vanished.
>
> Under the topics there are several activemq.advisory.xyz topics. Could
> these be the source of the remaining kahadb log files that still remain?

No, Advisory destinations do not allow durable subscriptions so if
there's no consumer the message sent to that Topic essentially goes to
/dev/null

>
> I am now down to three data files that should not be there. (db-980,
> db-1329, and db-1438)
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
> -rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
> drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
> -rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
> -rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
> drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
> -rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
> -rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
> -rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log
>
> From kahadb logs
>
>  2016-09-22 08:31:07,503 [eckpoint Worker] TRACE
> MessageDatabase                - gc candidates: [1329]
>  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE
> MessageDatabase                - ackMessageFileMap: {1329=[980],
> 1438=[], 1456=[]}
>  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE
> MessageDatabase                - not removing data file: 1329 as
> contained ack(s) refer to referenced file: [980]
>  2016-09-22 08:31:07,505 [eckpoint Worker] TRACE
> MessageDatabase                - Not yet time to check for compaction:
> 7 of 10 cycles
>  2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG
> MessageDatabase                - Checkpoint done.
>
> Thanks!
> Steve.
>
> On 9/22/16 8:14 AM, Steve Hill wrote:
>> Chris/Tim:
>>
>> Thanks for the ideas.  Unfortunately neither are the case in our
>> situation.  The web client shows 0 messages in any of our queues and
>> we are not using topics.  We did set up the remote debug port and
>> downloaded the source code if there is anything useful to further
>> troubleshoot that could be gleaned from there?
>>
>> Thanks!
>> Steve.
>>
>>
>> The most likely place for you to have old messages laying around is
>> in the
>> DLQ.  The next most likely place is on a topic for an offline (and never
>> coming back) durable subscriber.
>>
>> Tim
>>
>> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
>> [hidden email]> wrote:
>>
>> Do you have old messages laying around?  KahaDB can't clean up old
>> journal
>> files if there are messages scattered throughout the logs.  Even a
>> single
>> unacked message means that a journal file (and the future files)
>> can't be
>> cleaned up. This situation tends to happen either when there are old
>> messages not being acked quick enough or there are different
>> destinations
>> with different speeds.  Ie if one queue has a very slow message rate and
>> another queue has a high message rate.
>>
>> Assuming there isn't some sort of bug with the GC process then one
>> way to
>> help lower the usage would be to split up destinations into different
>> stores using multiKahaDB.  Take a look at Gary's blog post on this:
>> http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html 
>>
>>
>>
>> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:
>>
>>> Hi fellow ActiveMQ users!
>>>
>>> I have searched all over the internet and can not find a solution to a
>>> problem we are experiencing where old log files are not removed from
>>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
>>> size of our persistent store in production :-)
>>>
>>> We are using client side acknowledgement for all reads from the
>>> Queue and
>>> have restarted all clients and also ActiveMQ numerous times, yet the
>>> messages do not get removed.  We are also using the network
>>> connector to
>>> move messages to the queue that has an active consumer.  We were using
>>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>>>
>>> Follows are some more detailed messages; does anyone have an idea
>>> how we
>>> can resolve this problem?
>>>
>>> KAHADB logging is enabled and we see the following messages in the log
>>> file.
>>>
>>>
>>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1278 as contained ack(s) refer to
>>> referenced file: [1278, 1277]
>>>
>>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1279 as contained ack(s) refer to
>>> referenced file: [674]
>>>
>>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1280 as contained ack(s) refer to
>>> referenced file: [691]
>>>
>>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1281 as contained ack(s) refer to
>>> referenced file: [699]
>>>
>>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1282 as contained ack(s) refer to
>>> referenced file: [699]
>>>
>>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1283 as contained ack(s) refer to
>>> referenced file: [705]
>>>
>>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1284 as contained ack(s) refer to
>>> referenced file: [711]
>>>
>>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1288 as contained ack(s) refer to
>>> referenced file: [674]
>>>
>>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1289 as contained ack(s) refer to
>>> referenced file: [790]
>>>
>>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1290 as contained ack(s) refer to
>>> referenced file: [1290, 1278]
>>>
>>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1291 as contained ack(s) refer to
>>> referenced file: [790]
>>>
>>> The files at this point are months old.
>>>
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>>>
>>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>>>
>>> Thanks!
>>>
>>> Steve Hill
>>>
>>> ARMADA
>>>
>>>
>>>
>>>
>>> The information contained in this e-mail and any attachments is
>>> confidential and
>>> intended only for the recipient. If you are not the intended recipient,
>> the
>>> information contained in this message may not be used, copied, or
>>> forwarded to
>>> third parties or otherwise distributed for any other purpose. Please
>>> notify the
>>> sender if you received this e-mail in error and delete the e-mail
>>> and its
>>> attachments promptly.  Nothing in this e-mail may be used or deemed to
>>> form the
>>> basis of a contractual or any other legally binding obligation unless
>>> separately
>>> confirmed in writing by an authorized representative of ARMADA.
>>>
>>
>>
>>
>
>


--
Tim Bish
twitter: @tabish121
blog: http://timbish.blogspot.com/

Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Tim Bain
In reply to this post by Steve Hill
Topics will only hold onto messages if 1) a consumer is currently connected
but hasn't consumed the messages, or 2) a consumer subscribed durably but
is now disconnected.  #1 indicates a slow (or intentionally misbehaving)
consumer, and isn't likely to be a problem over the span of time you're
describing.  It also would result in the messages being discarded when the
broker is restarted, so if you've done a restart and the files weren't
cleaned up, this isn't the problem.  #2 would be unusual for advisory topic
subscriptions, but not impossible (especially since you apparently had this
situation with your mis-created non-advisory topic).  I'd use JConsole
(because it gives more detail than the web console) to go through each
queue looking for a non-zero number of messages, and each topic looking for
offline durable subscribers that have unconsumed messages.

Alternatively, you all could run the KahaDB file analysis tool (
http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html)
with a debugger attached to try to see what destination has the message(s)
that are keeping 980 and 1438 alive.  (You say that 1328 shouldn't be
there, but the log shows that it's there because it contains acks for
messages in 980, so it should be there, at least until 980 is deleted.)

Tim

On Sep 22, 2016 6:39 AM, "Steve Hill" <[hidden email]> wrote:

After the comment about the TOPIC I decided to take a quick look at that in
the web interface (just to make sure nothing had gone awry)  In deed one of
our "queues" had been created as a topic, not sure why as we use the same
code to connect to the queues however once removed most of the data files
vanished.

Under the topics there are several activemq.advisory.xyz topics. Could
these be the source of the remaining kahadb log files that still remain?

I am now down to three data files that should not be there. (db-980,
db-1329, and db-1438)


-rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
-rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
-rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
-rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
-rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
-rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
-rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log

From kahadb logs

 2016-09-22 08:31:07,503 [eckpoint Worker] TRACE MessageDatabase
    - gc candidates: [1329]
 2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
    - ackMessageFileMap: {1329=[980], 1438=[], 1456=[]}
 2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
    - not removing data file: 1329 as contained ack(s) refer to referenced
file: [980]
 2016-09-22 08:31:07,505 [eckpoint Worker] TRACE MessageDatabase
    - Not yet time to check for compaction: 7 of 10 cycles
 2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG MessageDatabase
    - Checkpoint done.

Thanks!
Steve.


On 9/22/16 8:14 AM, Steve Hill wrote:

> Chris/Tim:
>
> Thanks for the ideas.  Unfortunately neither are the case in our
> situation.  The web client shows 0 messages in any of our queues and we are
> not using topics.  We did set up the remote debug port and downloaded the
> source code if there is anything useful to further troubleshoot that could
> be gleaned from there?
>
> Thanks!
> Steve.
>
>
> The most likely place for you to have old messages laying around is in the
> DLQ.  The next most likely place is on a topic for an offline (and never
> coming back) durable subscriber.
>
> Tim
>
> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
> [hidden email]> wrote:
>
> Do you have old messages laying around?  KahaDB can't clean up old journal
> files if there are messages scattered throughout the logs.  Even a single
> unacked message means that a journal file (and the future files) can't be
> cleaned up. This situation tends to happen either when there are old
> messages not being acked quick enough or there are different destinations
> with different speeds.  Ie if one queue has a very slow message rate and
> another queue has a high message rate.
>
> Assuming there isn't some sort of bug with the GC process then one way to
> help lower the usage would be to split up destinations into different
> stores using multiKahaDB.  Take a look at Gary's blog post on this:
> http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html
>
>
> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:
>
> Hi fellow ActiveMQ users!
>>
>> I have searched all over the internet and can not find a solution to a
>> problem we are experiencing where old log files are not removed from
>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
>> size of our persistent store in production :-)
>>
>> We are using client side acknowledgement for all reads from the Queue and
>> have restarted all clients and also ActiveMQ numerous times, yet the
>> messages do not get removed.  We are also using the network connector to
>> move messages to the queue that has an active consumer.  We were using
>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>>
>> Follows are some more detailed messages; does anyone have an idea how we
>> can resolve this problem?
>>
>> KAHADB logging is enabled and we see the following messages in the log
>> file.
>>
>>
>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1278 as contained ack(s) refer to
>> referenced file: [1278, 1277]
>>
>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1279 as contained ack(s) refer to
>> referenced file: [674]
>>
>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1280 as contained ack(s) refer to
>> referenced file: [691]
>>
>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1281 as contained ack(s) refer to
>> referenced file: [699]
>>
>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1282 as contained ack(s) refer to
>> referenced file: [699]
>>
>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1283 as contained ack(s) refer to
>> referenced file: [705]
>>
>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1284 as contained ack(s) refer to
>> referenced file: [711]
>>
>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1288 as contained ack(s) refer to
>> referenced file: [674]
>>
>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1289 as contained ack(s) refer to
>> referenced file: [790]
>>
>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1290 as contained ack(s) refer to
>> referenced file: [1290, 1278]
>>
>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1291 as contained ack(s) refer to
>> referenced file: [790]
>>
>> The files at this point are months old.
>>
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>>
>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>>
>> Thanks!
>>
>> Steve Hill
>>
>> ARMADA
>>
>>
>>
>>
>> The information contained in this e-mail and any attachments is
>> confidential and
>> intended only for the recipient. If you are not the intended recipient,
>>
> the
>
>> information contained in this message may not be used, copied, or
>> forwarded to
>> third parties or otherwise distributed for any other purpose. Please
>> notify the
>> sender if you received this e-mail in error and delete the e-mail and its
>> attachments promptly.  Nothing in this e-mail may be used or deemed to
>> form the
>> basis of a contractual or any other legally binding obligation unless
>> separately
>> confirmed in writing by an authorized representative of ARMADA.
>>
>>
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Tim Bain
Apparently you can scratch the "but not impossible" bit from my reply.
Thanks Tim, I didn't know it wasn't allowed.

On Sep 22, 2016 7:07 AM, "Tim Bain" <[hidden email]> wrote:

> Topics will only hold onto messages if 1) a consumer is currently
> connected but hasn't consumed the messages, or 2) a consumer subscribed
> durably but is now disconnected.  #1 indicates a slow (or intentionally
> misbehaving) consumer, and isn't likely to be a problem over the span of
> time you're describing.  It also would result in the messages being
> discarded when the broker is restarted, so if you've done a restart and the
> files weren't cleaned up, this isn't the problem.  #2 would be unusual for
> advisory topic subscriptions, but not impossible (especially since you
> apparently had this situation with your mis-created non-advisory topic).
> I'd use JConsole (because it gives more detail than the web console) to go
> through each queue looking for a non-zero number of messages, and each
> topic looking for offline durable subscribers that have unconsumed messages.
>
> Alternatively, you all could run the KahaDB file analysis tool (
> http://activemq.apache.org/why-do-kahadb-log-files-
> remain-after-cleanup.html) with a debugger attached to try to see what
> destination has the message(s) that are keeping 980 and 1438 alive.  (You
> say that 1328 shouldn't be there, but the log shows that it's there because
> it contains acks for messages in 980, so it should be there, at least until
> 980 is deleted.)
>
> Tim
>
> On Sep 22, 2016 6:39 AM, "Steve Hill" <[hidden email]> wrote:
>
> After the comment about the TOPIC I decided to take a quick look at that
> in the web interface (just to make sure nothing had gone awry)  In deed one
> of our "queues" had been created as a topic, not sure why as we use the
> same code to connect to the queues however once removed most of the data
> files vanished.
>
> Under the topics there are several activemq.advisory.xyz topics. Could
> these be the source of the remaining kahadb log files that still remain?
>
> I am now down to three data files that should not be there. (db-980,
> db-1329, and db-1438)
>
>
> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
> -rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
> drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
> -rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
> -rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
> drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
> -rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
> -rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
> -rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log
>
> From kahadb logs
>
>  2016-09-22 08:31:07,503 [eckpoint Worker] TRACE MessageDatabase
>       - gc candidates: [1329]
>  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
>       - ackMessageFileMap: {1329=[980], 1438=[], 1456=[]}
>  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
>       - not removing data file: 1329 as contained ack(s) refer to
> referenced file: [980]
>  2016-09-22 08:31:07,505 [eckpoint Worker] TRACE MessageDatabase
>       - Not yet time to check for compaction: 7 of 10 cycles
>  2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG MessageDatabase
>       - Checkpoint done.
>
> Thanks!
> Steve.
>
>
> On 9/22/16 8:14 AM, Steve Hill wrote:
>
>> Chris/Tim:
>>
>> Thanks for the ideas.  Unfortunately neither are the case in our
>> situation.  The web client shows 0 messages in any of our queues and we are
>> not using topics.  We did set up the remote debug port and downloaded the
>> source code if there is anything useful to further troubleshoot that could
>> be gleaned from there?
>>
>> Thanks!
>> Steve.
>>
>>
>> The most likely place for you to have old messages laying around is in the
>> DLQ.  The next most likely place is on a topic for an offline (and never
>> coming back) durable subscriber.
>>
>> Tim
>>
>> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
>> [hidden email]> wrote:
>>
>> Do you have old messages laying around?  KahaDB can't clean up old journal
>> files if there are messages scattered throughout the logs.  Even a single
>> unacked message means that a journal file (and the future files) can't be
>> cleaned up. This situation tends to happen either when there are old
>> messages not being acked quick enough or there are different destinations
>> with different speeds.  Ie if one queue has a very slow message rate and
>> another queue has a high message rate.
>>
>> Assuming there isn't some sort of bug with the GC process then one way to
>> help lower the usage would be to split up destinations into different
>> stores using multiKahaDB.  Take a look at Gary's blog post on this:
>> http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html
>>
>>
>> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:
>>
>> Hi fellow ActiveMQ users!
>>>
>>> I have searched all over the internet and can not find a solution to a
>>> problem we are experiencing where old log files are not removed from
>>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
>>> size of our persistent store in production :-)
>>>
>>> We are using client side acknowledgement for all reads from the Queue and
>>> have restarted all clients and also ActiveMQ numerous times, yet the
>>> messages do not get removed.  We are also using the network connector to
>>> move messages to the queue that has an active consumer.  We were using
>>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>>>
>>> Follows are some more detailed messages; does anyone have an idea how we
>>> can resolve this problem?
>>>
>>> KAHADB logging is enabled and we see the following messages in the log
>>> file.
>>>
>>>
>>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1278 as contained ack(s) refer to
>>> referenced file: [1278, 1277]
>>>
>>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1279 as contained ack(s) refer to
>>> referenced file: [674]
>>>
>>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1280 as contained ack(s) refer to
>>> referenced file: [691]
>>>
>>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1281 as contained ack(s) refer to
>>> referenced file: [699]
>>>
>>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1282 as contained ack(s) refer to
>>> referenced file: [699]
>>>
>>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1283 as contained ack(s) refer to
>>> referenced file: [705]
>>>
>>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1284 as contained ack(s) refer to
>>> referenced file: [711]
>>>
>>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1288 as contained ack(s) refer to
>>> referenced file: [674]
>>>
>>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1289 as contained ack(s) refer to
>>> referenced file: [790]
>>>
>>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1290 as contained ack(s) refer to
>>> referenced file: [1290, 1278]
>>>
>>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1291 as contained ack(s) refer to
>>> referenced file: [790]
>>>
>>> The files at this point are months old.
>>>
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>>>
>>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>>>
>>> Thanks!
>>>
>>> Steve Hill
>>>
>>> ARMADA
>>>
>>>
>>>
>>>
>>> The information contained in this e-mail and any attachments is
>>> confidential and
>>> intended only for the recipient. If you are not the intended recipient,
>>>
>> the
>>
>>> information contained in this message may not be used, copied, or
>>> forwarded to
>>> third parties or otherwise distributed for any other purpose. Please
>>> notify the
>>> sender if you received this e-mail in error and delete the e-mail and its
>>> attachments promptly.  Nothing in this e-mail may be used or deemed to
>>> form the
>>> basis of a contractual or any other legally binding obligation unless
>>> separately
>>> confirmed in writing by an authorized representative of ARMADA.
>>>
>>>
>>
>>
>>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

lichtin
In reply to this post by Tim Bain
Changing trace levels and using the debugger is not always an option in production.
I'm wondering why we could not revive the "KahaDBJournalReader" tool
from https://issues.jboss.org/browse/MB-756 and ship it with ActiveMQ.

Tim Bain wrote
Alternatively, you all could run the KahaDB file analysis tool (
http://activemq.apache.org/why-do-kahadb-log-files-remain-after-cleanup.html)
with a debugger attached to try to see what destination has the message(s)
that are keeping 980 and 1438 alive.  (You say that 1328 shouldn't be
there, but the log shows that it's there because it contains acks for
messages in 980, so it should be there, at least until 980 is deleted.)

Tim

On Sep 22, 2016 6:39 AM, "Steve Hill" <[hidden email]> wrote:

After the comment about the TOPIC I decided to take a quick look at that in
the web interface (just to make sure nothing had gone awry)  In deed one of
our "queues" had been created as a topic, not sure why as we use the same
code to connect to the queues however once removed most of the data files
vanished.

Under the topics there are several activemq.advisory.xyz topics. Could
these be the source of the remaining kahadb log files that still remain?

I am now down to three data files that should not be there. (db-980,
db-1329, and db-1438)


-rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
-rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
-rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
-rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
-rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
-rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
-rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log

From kahadb logs

 2016-09-22 08:31:07,503 [eckpoint Worker] TRACE MessageDatabase
    - gc candidates: [1329]
 2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
    - ackMessageFileMap: {1329=[980], 1438=[], 1456=[]}
 2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
    - not removing data file: 1329 as contained ack(s) refer to referenced
file: [980]
 2016-09-22 08:31:07,505 [eckpoint Worker] TRACE MessageDatabase
    - Not yet time to check for compaction: 7 of 10 cycles
 2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG MessageDatabase
    - Checkpoint done.

Thanks!
Steve.


On 9/22/16 8:14 AM, Steve Hill wrote:

> Chris/Tim:
>
> Thanks for the ideas.  Unfortunately neither are the case in our
> situation.  The web client shows 0 messages in any of our queues and we are
> not using topics.  We did set up the remote debug port and downloaded the
> source code if there is anything useful to further troubleshoot that could
> be gleaned from there?
>
> Thanks!
> Steve.
>
>
> The most likely place for you to have old messages laying around is in the
> DLQ.  The next most likely place is on a topic for an offline (and never
> coming back) durable subscriber.
>
> Tim
>
> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
> [hidden email]> wrote:
>
> Do you have old messages laying around?  KahaDB can't clean up old journal
> files if there are messages scattered throughout the logs.  Even a single
> unacked message means that a journal file (and the future files) can't be
> cleaned up. This situation tends to happen either when there are old
> messages not being acked quick enough or there are different destinations
> with different speeds.  Ie if one queue has a very slow message rate and
> another queue has a high message rate.
>
> Assuming there isn't some sort of bug with the GC process then one way to
> help lower the usage would be to split up destinations into different
> stores using multiKahaDB.  Take a look at Gary's blog post on this:
> http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html
>
>
> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:
>
> Hi fellow ActiveMQ users!
>>
>> I have searched all over the internet and can not find a solution to a
>> problem we are experiencing where old log files are not removed from
>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
>> size of our persistent store in production :-)
>>
>> We are using client side acknowledgement for all reads from the Queue and
>> have restarted all clients and also ActiveMQ numerous times, yet the
>> messages do not get removed.  We are also using the network connector to
>> move messages to the queue that has an active consumer.  We were using
>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>>
>> Follows are some more detailed messages; does anyone have an idea how we
>> can resolve this problem?
>>
>> KAHADB logging is enabled and we see the following messages in the log
>> file.
>>
>>
>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1278 as contained ack(s) refer to
>> referenced file: [1278, 1277]
>>
>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1279 as contained ack(s) refer to
>> referenced file: [674]
>>
>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1280 as contained ack(s) refer to
>> referenced file: [691]
>>
>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1281 as contained ack(s) refer to
>> referenced file: [699]
>>
>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1282 as contained ack(s) refer to
>> referenced file: [699]
>>
>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1283 as contained ack(s) refer to
>> referenced file: [705]
>>
>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1284 as contained ack(s) refer to
>> referenced file: [711]
>>
>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1288 as contained ack(s) refer to
>> referenced file: [674]
>>
>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1289 as contained ack(s) refer to
>> referenced file: [790]
>>
>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1290 as contained ack(s) refer to
>> referenced file: [1290, 1278]
>>
>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>>       - not removing data file: 1291 as contained ack(s) refer to
>> referenced file: [790]
>>
>> The files at this point are months old.
>>
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>>
>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>>
>> Thanks!
>>
>> Steve Hill
>>
>> ARMADA
>>
>>
>>
>>
>> The information contained in this e-mail and any attachments is
>> confidential and
>> intended only for the recipient. If you are not the intended recipient,
>>
> the
>
>> information contained in this message may not be used, copied, or
>> forwarded to
>> third parties or otherwise distributed for any other purpose. Please
>> notify the
>> sender if you received this e-mail in error and delete the e-mail and its
>> attachments promptly.  Nothing in this e-mail may be used or deemed to
>> form the
>> basis of a contractual or any other legally binding obligation unless
>> separately
>> confirmed in writing by an authorized representative of ARMADA.
>>
>>
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Steve Hill
In reply to this post by Tim Bain
Ok, so I hooked up jconsole and looked at the queues, all of the queues
were showing zero.  Right underneath the queues there was a section that
said "RecoveredXaTransaction".  It had one entry
XID_[1096044365_globalId&amp;3132362e392e312e3133392e746..... There
really is not very much more information available in the attributes.  I
ended up hitting the heuristicRollback operation and shortly afterwards
the old KahaDB files were removed.  Great news!

I checked the queues afterwards and rolling back the XA transaction did
not cause the message to go anywhere.  Does anyone know what this might
have been and how to avoid it from occurring in the future?

Thanks!
Steve.

On 9/22/16 9:09 AM, Tim Bain wrote:

> Apparently you can scratch the "but not impossible" bit from my reply.
> Thanks Tim, I didn't know it wasn't allowed.
>
> On Sep 22, 2016 7:07 AM, "Tim Bain" <[hidden email]> wrote:
>
>> Topics will only hold onto messages if 1) a consumer is currently
>> connected but hasn't consumed the messages, or 2) a consumer subscribed
>> durably but is now disconnected.  #1 indicates a slow (or intentionally
>> misbehaving) consumer, and isn't likely to be a problem over the span of
>> time you're describing.  It also would result in the messages being
>> discarded when the broker is restarted, so if you've done a restart and the
>> files weren't cleaned up, this isn't the problem.  #2 would be unusual for
>> advisory topic subscriptions, but not impossible (especially since you
>> apparently had this situation with your mis-created non-advisory topic).
>> I'd use JConsole (because it gives more detail than the web console) to go
>> through each queue looking for a non-zero number of messages, and each
>> topic looking for offline durable subscribers that have unconsumed messages.
>>
>> Alternatively, you all could run the KahaDB file analysis tool (
>> http://activemq.apache.org/why-do-kahadb-log-files-
>> remain-after-cleanup.html) with a debugger attached to try to see what
>> destination has the message(s) that are keeping 980 and 1438 alive.  (You
>> say that 1328 shouldn't be there, but the log shows that it's there because
>> it contains acks for messages in 980, so it should be there, at least until
>> 980 is deleted.)
>>
>> Tim
>>
>> On Sep 22, 2016 6:39 AM, "Steve Hill" <[hidden email]> wrote:
>>
>> After the comment about the TOPIC I decided to take a quick look at that
>> in the web interface (just to make sure nothing had gone awry)  In deed one
>> of our "queues" had been created as a topic, not sure why as we use the
>> same code to connect to the queues however once removed most of the data
>> files vanished.
>>
>> Under the topics there are several activemq.advisory.xyz topics. Could
>> these be the source of the remaining kahadb log files that still remain?
>>
>> I am now down to three data files that should not be there. (db-980,
>> db-1329, and db-1438)
>>
>>
>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>> -rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
>> drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
>> -rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
>> -rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
>> drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
>> -rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
>> -rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
>> -rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log
>>
>>  From kahadb logs
>>
>>   2016-09-22 08:31:07,503 [eckpoint Worker] TRACE MessageDatabase
>>        - gc candidates: [1329]
>>   2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
>>        - ackMessageFileMap: {1329=[980], 1438=[], 1456=[]}
>>   2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
>>        - not removing data file: 1329 as contained ack(s) refer to
>> referenced file: [980]
>>   2016-09-22 08:31:07,505 [eckpoint Worker] TRACE MessageDatabase
>>        - Not yet time to check for compaction: 7 of 10 cycles
>>   2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG MessageDatabase
>>        - Checkpoint done.
>>
>> Thanks!
>> Steve.
>>
>>
>> On 9/22/16 8:14 AM, Steve Hill wrote:
>>
>>> Chris/Tim:
>>>
>>> Thanks for the ideas.  Unfortunately neither are the case in our
>>> situation.  The web client shows 0 messages in any of our queues and we are
>>> not using topics.  We did set up the remote debug port and downloaded the
>>> source code if there is anything useful to further troubleshoot that could
>>> be gleaned from there?
>>>
>>> Thanks!
>>> Steve.
>>>
>>>
>>> The most likely place for you to have old messages laying around is in the
>>> DLQ.  The next most likely place is on a topic for an offline (and never
>>> coming back) durable subscriber.
>>>
>>> Tim
>>>
>>> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
>>> [hidden email]> wrote:
>>>
>>> Do you have old messages laying around?  KahaDB can't clean up old journal
>>> files if there are messages scattered throughout the logs.  Even a single
>>> unacked message means that a journal file (and the future files) can't be
>>> cleaned up. This situation tends to happen either when there are old
>>> messages not being acked quick enough or there are different destinations
>>> with different speeds.  Ie if one queue has a very slow message rate and
>>> another queue has a high message rate.
>>>
>>> Assuming there isn't some sort of bug with the GC process then one way to
>>> help lower the usage would be to split up destinations into different
>>> stores using multiKahaDB.  Take a look at Gary's blog post on this:
>>> http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html
>>>
>>>
>>> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:
>>>
>>> Hi fellow ActiveMQ users!
>>>> I have searched all over the internet and can not find a solution to a
>>>> problem we are experiencing where old log files are not removed from
>>>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
>>>> size of our persistent store in production :-)
>>>>
>>>> We are using client side acknowledgement for all reads from the Queue and
>>>> have restarted all clients and also ActiveMQ numerous times, yet the
>>>> messages do not get removed.  We are also using the network connector to
>>>> move messages to the queue that has an active consumer.  We were using
>>>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>>>>
>>>> Follows are some more detailed messages; does anyone have an idea how we
>>>> can resolve this problem?
>>>>
>>>> KAHADB logging is enabled and we see the following messages in the log
>>>> file.
>>>>
>>>>
>>>>   2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1278 as contained ack(s) refer to
>>>> referenced file: [1278, 1277]
>>>>
>>>>   2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1279 as contained ack(s) refer to
>>>> referenced file: [674]
>>>>
>>>>   2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1280 as contained ack(s) refer to
>>>> referenced file: [691]
>>>>
>>>>   2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1281 as contained ack(s) refer to
>>>> referenced file: [699]
>>>>
>>>>   2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1282 as contained ack(s) refer to
>>>> referenced file: [699]
>>>>
>>>>   2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1283 as contained ack(s) refer to
>>>> referenced file: [705]
>>>>
>>>>   2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1284 as contained ack(s) refer to
>>>> referenced file: [711]
>>>>
>>>>   2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1288 as contained ack(s) refer to
>>>> referenced file: [674]
>>>>
>>>>   2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1289 as contained ack(s) refer to
>>>> referenced file: [790]
>>>>
>>>>   2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1290 as contained ack(s) refer to
>>>> referenced file: [1290, 1278]
>>>>
>>>>   2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>>>>        - not removing data file: 1291 as contained ack(s) refer to
>>>> referenced file: [790]
>>>>
>>>> The files at this point are months old.
>>>>
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>>>>
>>>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>>>>
>>>> Thanks!
>>>>
>>>> Steve Hill
>>>>
>>>> ARMADA
>>>>
>>>>
>>>>
>>>>
>>>> The information contained in this e-mail and any attachments is
>>>> confidential and
>>>> intended only for the recipient. If you are not the intended recipient,
>>>>
>>> the
>>>
>>>> information contained in this message may not be used, copied, or
>>>> forwarded to
>>>> third parties or otherwise distributed for any other purpose. Please
>>>> notify the
>>>> sender if you received this e-mail in error and delete the e-mail and its
>>>> attachments promptly.  Nothing in this e-mail may be used or deemed to
>>>> form the
>>>> basis of a contractual or any other legally binding obligation unless
>>>> separately
>>>> confirmed in writing by an authorized representative of ARMADA.
>>>>
>>>>
>>>
>>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Tim Bain
In reply to this post by lichtin
I was going to write a JIRA enhancement request for that when I had some
free time, but if you have time to write it before I get to it, that would
be fine.  Please put the link here if you do.

On Sep 22, 2016 9:00 AM, "lichtin" <[hidden email]> wrote:

> Changing trace levels and using the debugger is not always an option in
> production.
> I'm wondering why we could not revive the "KahaDBJournalReader" tool
> from https://issues.jboss.org/browse/MB-756 and ship it with ActiveMQ.
>
>
> Tim Bain wrote
> > Alternatively, you all could run the KahaDB file analysis tool (
> > http://activemq.apache.org/why-do-kahadb-log-files-
> remain-after-cleanup.html)
> > with a debugger attached to try to see what destination has the
> message(s)
> > that are keeping 980 and 1438 alive.  (You say that 1328 shouldn't be
> > there, but the log shows that it's there because it contains acks for
> > messages in 980, so it should be there, at least until 980 is deleted.)
> >
> > Tim
> >
> > On Sep 22, 2016 6:39 AM, "Steve Hill" <
>
> > shill@.co
>
> > > wrote:
> >
> > After the comment about the TOPIC I decided to take a quick look at that
> > in
> > the web interface (just to make sure nothing had gone awry)  In deed one
> > of
> > our "queues" had been created as a topic, not sure why as we use the same
> > code to connect to the queues however once removed most of the data files
> > vanished.
> >
> > Under the topics there are several activemq.advisory.xyz topics. Could
> > these be the source of the remaining kahadb log files that still remain?
> >
> > I am now down to three data files that should not be there. (db-980,
> > db-1329, and db-1438)
> >
> >
> > -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
> > -rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
> > drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
> > -rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
> > -rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
> > drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
> > -rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
> > -rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
> > -rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log
> >
> > From kahadb logs
> >
> >  2016-09-22 08:31:07,503 [eckpoint Worker] TRACE MessageDatabase
> >     - gc candidates: [1329]
> >  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
> >     - ackMessageFileMap: {1329=[980], 1438=[], 1456=[]}
> >  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
> >     - not removing data file: 1329 as contained ack(s) refer to
> referenced
> > file: [980]
> >  2016-09-22 08:31:07,505 [eckpoint Worker] TRACE MessageDatabase
> >     - Not yet time to check for compaction: 7 of 10 cycles
> >  2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG MessageDatabase
> >     - Checkpoint done.
> >
> > Thanks!
> > Steve.
> >
> >
> > On 9/22/16 8:14 AM, Steve Hill wrote:
> >
> >> Chris/Tim:
> >>
> >> Thanks for the ideas.  Unfortunately neither are the case in our
> >> situation.  The web client shows 0 messages in any of our queues and we
> >> are
> >> not using topics.  We did set up the remote debug port and downloaded
> the
> >> source code if there is anything useful to further troubleshoot that
> >> could
> >> be gleaned from there?
> >>
> >> Thanks!
> >> Steve.
> >>
> >>
> >> The most likely place for you to have old messages laying around is in
> >> the
> >> DLQ.  The next most likely place is on a topic for an offline (and never
> >> coming back) durable subscriber.
> >>
> >> Tim
> >>
> >> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
> >>
>
> > christopher.l.shannon@
>
> >> wrote:
> >>
> >> Do you have old messages laying around?  KahaDB can't clean up old
> >> journal
> >> files if there are messages scattered throughout the logs.  Even a
> single
> >> unacked message means that a journal file (and the future files) can't
> be
> >> cleaned up. This situation tends to happen either when there are old
> >> messages not being acked quick enough or there are different
> destinations
> >> with different speeds.  Ie if one queue has a very slow message rate and
> >> another queue has a high message rate.
> >>
> >> Assuming there isn't some sort of bug with the GC process then one way
> to
> >> help lower the usage would be to split up destinations into different
> >> stores using multiKahaDB.  Take a look at Gary's blog post on this:
> >> http://blog.garytully.com/2011/11/activemq-multiple-
> kahadb-instances.html
> >>
> >>
> >> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <
>
> > SHill@
>
> > > wrote:
> >>
> >> Hi fellow ActiveMQ users!
> >>>
> >>> I have searched all over the internet and can not find a solution to a
> >>> problem we are experiencing where old log files are not removed from
> >>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
> >>> size of our persistent store in production :-)
> >>>
> >>> We are using client side acknowledgement for all reads from the Queue
> >>> and
> >>> have restarted all clients and also ActiveMQ numerous times, yet the
> >>> messages do not get removed.  We are also using the network connector
> to
> >>> move messages to the queue that has an active consumer.  We were using
> >>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
> >>>
> >>> Follows are some more detailed messages; does anyone have an idea how
> we
> >>> can resolve this problem?
> >>>
> >>> KAHADB logging is enabled and we see the following messages in the log
> >>> file.
> >>>
> >>>
> >>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1278 as contained ack(s) refer to
> >>> referenced file: [1278, 1277]
> >>>
> >>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1279 as contained ack(s) refer to
> >>> referenced file: [674]
> >>>
> >>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1280 as contained ack(s) refer to
> >>> referenced file: [691]
> >>>
> >>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1281 as contained ack(s) refer to
> >>> referenced file: [699]
> >>>
> >>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1282 as contained ack(s) refer to
> >>> referenced file: [699]
> >>>
> >>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1283 as contained ack(s) refer to
> >>> referenced file: [705]
> >>>
> >>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1284 as contained ack(s) refer to
> >>> referenced file: [711]
> >>>
> >>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1288 as contained ack(s) refer to
> >>> referenced file: [674]
> >>>
> >>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1289 as contained ack(s) refer to
> >>> referenced file: [790]
> >>>
> >>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1290 as contained ack(s) refer to
> >>> referenced file: [1290, 1278]
> >>>
> >>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1291 as contained ack(s) refer to
> >>> referenced file: [790]
> >>>
> >>> The files at this point are months old.
> >>>
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
> >>>
> >>> Thanks!
> >>>
> >>> Steve Hill
> >>>
> >>> ARMADA
> >>>
> >>>
> >>>
> >>>
> >>> The information contained in this e-mail and any attachments is
> >>> confidential and
> >>> intended only for the recipient. If you are not the intended recipient,
> >>>
> >> the
> >>
> >>> information contained in this message may not be used, copied, or
> >>> forwarded to
> >>> third parties or otherwise distributed for any other purpose. Please
> >>> notify the
> >>> sender if you received this e-mail in error and delete the e-mail and
> >>> its
> >>> attachments promptly.  Nothing in this e-mail may be used or deemed to
> >>> form the
> >>> basis of a contractual or any other legally binding obligation unless
> >>> separately
> >>> confirmed in writing by an authorized representative of ARMADA.
> >>>
> >>>
> >>
> >>
> >>
>
>
>
>
>
> --
> View this message in context: http://activemq.2283324.n4.
> nabble.com/KAHADB-clean-up-old-log-files-tp4716661p4716804.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

lichtin
https://issues.apache.org/jira/browse/AMQ-6440

Tim Bain wrote
I was going to write a JIRA enhancement request for that when I had some
free time, but if you have time to write it before I get to it, that would
be fine.  Please put the link here if you do.

On Sep 22, 2016 9:00 AM, "lichtin" <[hidden email]> wrote:

> Changing trace levels and using the debugger is not always an option in
> production.
> I'm wondering why we could not revive the "KahaDBJournalReader" tool
> from https://issues.jboss.org/browse/MB-756 and ship it with ActiveMQ.
>
>
> Tim Bain wrote
> > Alternatively, you all could run the KahaDB file analysis tool (
> > http://activemq.apache.org/why-do-kahadb-log-files-
> remain-after-cleanup.html)
> > with a debugger attached to try to see what destination has the
> message(s)
> > that are keeping 980 and 1438 alive.  (You say that 1328 shouldn't be
> > there, but the log shows that it's there because it contains acks for
> > messages in 980, so it should be there, at least until 980 is deleted.)
> >
> > Tim
> >
> > On Sep 22, 2016 6:39 AM, "Steve Hill" <
>
> > shill@.co
>
> > > wrote:
> >
> > After the comment about the TOPIC I decided to take a quick look at that
> > in
> > the web interface (just to make sure nothing had gone awry)  In deed one
> > of
> > our "queues" had been created as a topic, not sure why as we use the same
> > code to connect to the queues however once removed most of the data files
> > vanished.
> >
> > Under the topics there are several activemq.advisory.xyz topics. Could
> > these be the source of the remaining kahadb log files that still remain?
> >
> > I am now down to three data files that should not be there. (db-980,
> > db-1329, and db-1438)
> >
> >
> > -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
> > -rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
> > drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
> > -rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
> > -rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
> > drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
> > -rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
> > -rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
> > -rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log
> >
> > From kahadb logs
> >
> >  2016-09-22 08:31:07,503 [eckpoint Worker] TRACE MessageDatabase
> >     - gc candidates: [1329]
> >  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
> >     - ackMessageFileMap: {1329=[980], 1438=[], 1456=[]}
> >  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
> >     - not removing data file: 1329 as contained ack(s) refer to
> referenced
> > file: [980]
> >  2016-09-22 08:31:07,505 [eckpoint Worker] TRACE MessageDatabase
> >     - Not yet time to check for compaction: 7 of 10 cycles
> >  2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG MessageDatabase
> >     - Checkpoint done.
> >
> > Thanks!
> > Steve.
> >
> >
> > On 9/22/16 8:14 AM, Steve Hill wrote:
> >
> >> Chris/Tim:
> >>
> >> Thanks for the ideas.  Unfortunately neither are the case in our
> >> situation.  The web client shows 0 messages in any of our queues and we
> >> are
> >> not using topics.  We did set up the remote debug port and downloaded
> the
> >> source code if there is anything useful to further troubleshoot that
> >> could
> >> be gleaned from there?
> >>
> >> Thanks!
> >> Steve.
> >>
> >>
> >> The most likely place for you to have old messages laying around is in
> >> the
> >> DLQ.  The next most likely place is on a topic for an offline (and never
> >> coming back) durable subscriber.
> >>
> >> Tim
> >>
> >> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
> >>
>
> > christopher.l.shannon@
>
> >> wrote:
> >>
> >> Do you have old messages laying around?  KahaDB can't clean up old
> >> journal
> >> files if there are messages scattered throughout the logs.  Even a
> single
> >> unacked message means that a journal file (and the future files) can't
> be
> >> cleaned up. This situation tends to happen either when there are old
> >> messages not being acked quick enough or there are different
> destinations
> >> with different speeds.  Ie if one queue has a very slow message rate and
> >> another queue has a high message rate.
> >>
> >> Assuming there isn't some sort of bug with the GC process then one way
> to
> >> help lower the usage would be to split up destinations into different
> >> stores using multiKahaDB.  Take a look at Gary's blog post on this:
> >> http://blog.garytully.com/2011/11/activemq-multiple-
> kahadb-instances.html
> >>
> >>
> >> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <
>
> > SHill@
>
> > > wrote:
> >>
> >> Hi fellow ActiveMQ users!
> >>>
> >>> I have searched all over the internet and can not find a solution to a
> >>> problem we are experiencing where old log files are not removed from
> >>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
> >>> size of our persistent store in production :-)
> >>>
> >>> We are using client side acknowledgement for all reads from the Queue
> >>> and
> >>> have restarted all clients and also ActiveMQ numerous times, yet the
> >>> messages do not get removed.  We are also using the network connector
> to
> >>> move messages to the queue that has an active consumer.  We were using
> >>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
> >>>
> >>> Follows are some more detailed messages; does anyone have an idea how
> we
> >>> can resolve this problem?
> >>>
> >>> KAHADB logging is enabled and we see the following messages in the log
> >>> file.
> >>>
> >>>
> >>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1278 as contained ack(s) refer to
> >>> referenced file: [1278, 1277]
> >>>
> >>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1279 as contained ack(s) refer to
> >>> referenced file: [674]
> >>>
> >>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1280 as contained ack(s) refer to
> >>> referenced file: [691]
> >>>
> >>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1281 as contained ack(s) refer to
> >>> referenced file: [699]
> >>>
> >>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1282 as contained ack(s) refer to
> >>> referenced file: [699]
> >>>
> >>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1283 as contained ack(s) refer to
> >>> referenced file: [705]
> >>>
> >>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1284 as contained ack(s) refer to
> >>> referenced file: [711]
> >>>
> >>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1288 as contained ack(s) refer to
> >>> referenced file: [674]
> >>>
> >>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1289 as contained ack(s) refer to
> >>> referenced file: [790]
> >>>
> >>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1290 as contained ack(s) refer to
> >>> referenced file: [1290, 1278]
> >>>
> >>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1291 as contained ack(s) refer to
> >>> referenced file: [790]
> >>>
> >>> The files at this point are months old.
> >>>
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
> >>>
> >>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
> >>>
> >>> Thanks!
> >>>
> >>> Steve Hill
> >>>
> >>> ARMADA
> >>>
> >>>
> >>>
> >>>
> >>> The information contained in this e-mail and any attachments is
> >>> confidential and
> >>> intended only for the recipient. If you are not the intended recipient,
> >>>
> >> the
> >>
> >>> information contained in this message may not be used, copied, or
> >>> forwarded to
> >>> third parties or otherwise distributed for any other purpose. Please
> >>> notify the
> >>> sender if you received this e-mail in error and delete the e-mail and
> >>> its
> >>> attachments promptly.  Nothing in this e-mail may be used or deemed to
> >>> form the
> >>> basis of a contractual or any other legally binding obligation unless
> >>> separately
> >>> confirmed in writing by an authorized representative of ARMADA.
> >>>
> >>>
> >>
> >>
> >>
>
>
>
>
>
> --
> View this message in context: http://activemq.2283324.n4.
> nabble.com/KAHADB-clean-up-old-log-files-tp4716661p4716804.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Johan Carlquist
In reply to this post by Steve Hill
I will hijack this thread since I have almost the same problem. Log
files the filling up disk and TRACE telling me:
"not removing data file: XXX as contained ack(s) refer to
referenced file: [XXX]"
Running ActiveMQ 5.13.4.

I hacked'n'slashed KahaDBJournalReader (would be great if the source
code could be found somewhere else were collaboration and contribution
where easier…) to work with the current ActiveMQ.
And the output ended up like this:
---
2016-09-26 16:21:27,659 INFO main - -- inflight transactions---
2016-09-26 16:21:27,669 INFO main - adds[1251093], acks[1270663] for: null
2016-09-26 16:21:27,669 INFO main -

2016-09-26 16:21:27,669 INFO main - Persistent destination statistics:
2016-09-26 16:21:27,670 INFO main - ------------------------
2016-09-26 16:21:27,672 INFO main - QUEUE: queue1.test: -38136 msgs.
2016-09-26 16:21:27,673 INFO main - QUEUE: queue1.dev: 5883 msgs.
2016-09-26 16:21:27,673 INFO main - QUEUE: ActiveMQ.DLQ: 3 msgs.
2016-09-26 16:21:27,673 INFO main - QUEUE: aliveness-test: -33 msgs.
2016-09-26 16:21:27,674 INFO main - QUEUE: ActiveMQ.DLQ.Queue.queue1.dev: -636 msgs.
2016-09-26 16:21:27,674 INFO main - QUEUE: ActiveMQ.DLQ.Queue.queue1.test: 907 msgs.
---

If comparing with the web console the only thing that seems correct is
the amount of messages in ActiveMQ.DLQ.Queue.queue1.test, everything
else seems wrong.

What can I read from this? Are all the negative values indicating that
my KahaDB is corrupt?

Could enableAckCompaction found in 5.14.0 be a solution to this problem?

--
jocar

> On 22 Sep 2016, at 17:40, Steve Hill <[hidden email]> wrote:
>
> Ok, so I hooked up jconsole and looked at the queues, all of the queues were showing zero.  Right underneath the queues there was a section that said "RecoveredXaTransaction".  It had one entry XID_[1096044365_globalId&amp;3132362e392e312e3133392e746..... There really is not very much more information available in the attributes.  I ended up hitting the heuristicRollback operation and shortly afterwards the old KahaDB files were removed.  Great news!
>
> I checked the queues afterwards and rolling back the XA transaction did not cause the message to go anywhere.  Does anyone know what this might have been and how to avoid it from occurring in the future?
>
> Thanks!
> Steve.
>
> On 9/22/16 9:09 AM, Tim Bain wrote:
>> Apparently you can scratch the "but not impossible" bit from my reply.
>> Thanks Tim, I didn't know it wasn't allowed.
>>
>> On Sep 22, 2016 7:07 AM, "Tim Bain" <[hidden email]> wrote:
>>
>>> Topics will only hold onto messages if 1) a consumer is currently
>>> connected but hasn't consumed the messages, or 2) a consumer subscribed
>>> durably but is now disconnected.  #1 indicates a slow (or intentionally
>>> misbehaving) consumer, and isn't likely to be a problem over the span of
>>> time you're describing.  It also would result in the messages being
>>> discarded when the broker is restarted, so if you've done a restart and the
>>> files weren't cleaned up, this isn't the problem.  #2 would be unusual for
>>> advisory topic subscriptions, but not impossible (especially since you
>>> apparently had this situation with your mis-created non-advisory topic).
>>> I'd use JConsole (because it gives more detail than the web console) to go
>>> through each queue looking for a non-zero number of messages, and each
>>> topic looking for offline durable subscribers that have unconsumed messages.
>>>
>>> Alternatively, you all could run the KahaDB file analysis tool (
>>> http://activemq.apache.org/why-do-kahadb-log-files-
>>> remain-after-cleanup.html) with a debugger attached to try to see what
>>> destination has the message(s) that are keeping 980 and 1438 alive.  (You
>>> say that 1328 shouldn't be there, but the log shows that it's there because
>>> it contains acks for messages in 980, so it should be there, at least until
>>> 980 is deleted.)
>>>
>>> Tim
>>>
>>> On Sep 22, 2016 6:39 AM, "Steve Hill" <[hidden email]> wrote:
>>>
>>> After the comment about the TOPIC I decided to take a quick look at that
>>> in the web interface (just to make sure nothing had gone awry)  In deed one
>>> of our "queues" had been created as a topic, not sure why as we use the
>>> same code to connect to the queues however once removed most of the data
>>> files vanished.
>>>
>>> Under the topics there are several activemq.advisory.xyz topics. Could
>>> these be the source of the remaining kahadb log files that still remain?
>>>
>>> I am now down to three data files that should not be there. (db-980,
>>> db-1329, and db-1438)
>>>
>>>
>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>>> -rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
>>> drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
>>> -rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
>>> -rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
>>> drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
>>> -rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
>>> -rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
>>> -rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log
>>>
>>> From kahadb logs
>>>
>>>  2016-09-22 08:31:07,503 [eckpoint Worker] TRACE MessageDatabase
>>>       - gc candidates: [1329]
>>>  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
>>>       - ackMessageFileMap: {1329=[980], 1438=[], 1456=[]}
>>>  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
>>>       - not removing data file: 1329 as contained ack(s) refer to
>>> referenced file: [980]
>>>  2016-09-22 08:31:07,505 [eckpoint Worker] TRACE MessageDatabase
>>>       - Not yet time to check for compaction: 7 of 10 cycles
>>>  2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG MessageDatabase
>>>       - Checkpoint done.
>>>
>>> Thanks!
>>> Steve.
>>>
>>>
>>> On 9/22/16 8:14 AM, Steve Hill wrote:
>>>
>>>> Chris/Tim:
>>>>
>>>> Thanks for the ideas.  Unfortunately neither are the case in our
>>>> situation.  The web client shows 0 messages in any of our queues and we are
>>>> not using topics.  We did set up the remote debug port and downloaded the
>>>> source code if there is anything useful to further troubleshoot that could
>>>> be gleaned from there?
>>>>
>>>> Thanks!
>>>> Steve.
>>>>
>>>>
>>>> The most likely place for you to have old messages laying around is in the
>>>> DLQ.  The next most likely place is on a topic for an offline (and never
>>>> coming back) durable subscriber.
>>>>
>>>> Tim
>>>>
>>>> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
>>>> [hidden email]> wrote:
>>>>
>>>> Do you have old messages laying around?  KahaDB can't clean up old journal
>>>> files if there are messages scattered throughout the logs.  Even a single
>>>> unacked message means that a journal file (and the future files) can't be
>>>> cleaned up. This situation tends to happen either when there are old
>>>> messages not being acked quick enough or there are different destinations
>>>> with different speeds.  Ie if one queue has a very slow message rate and
>>>> another queue has a high message rate.
>>>>
>>>> Assuming there isn't some sort of bug with the GC process then one way to
>>>> help lower the usage would be to split up destinations into different
>>>> stores using multiKahaDB.  Take a look at Gary's blog post on this:
>>>> http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html
>>>>
>>>>
>>>> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]> wrote:
>>>>
>>>> Hi fellow ActiveMQ users!
>>>>> I have searched all over the internet and can not find a solution to a
>>>>> problem we are experiencing where old log files are not removed from
>>>>> KAHADB.  Needless to say we uncovered this problem when we exceeded the
>>>>> size of our persistent store in production :-)
>>>>>
>>>>> We are using client side acknowledgement for all reads from the Queue and
>>>>> have restarted all clients and also ActiveMQ numerous times, yet the
>>>>> messages do not get removed.  We are also using the network connector to
>>>>> move messages to the queue that has an active consumer.  We were using
>>>>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
>>>>>
>>>>> Follows are some more detailed messages; does anyone have an idea how we
>>>>> can resolve this problem?
>>>>>
>>>>> KAHADB logging is enabled and we see the following messages in the log
>>>>> file.
>>>>>
>>>>>
>>>>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1278 as contained ack(s) refer to
>>>>> referenced file: [1278, 1277]
>>>>>
>>>>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1279 as contained ack(s) refer to
>>>>> referenced file: [674]
>>>>>
>>>>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1280 as contained ack(s) refer to
>>>>> referenced file: [691]
>>>>>
>>>>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1281 as contained ack(s) refer to
>>>>> referenced file: [699]
>>>>>
>>>>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1282 as contained ack(s) refer to
>>>>> referenced file: [699]
>>>>>
>>>>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1283 as contained ack(s) refer to
>>>>> referenced file: [705]
>>>>>
>>>>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1284 as contained ack(s) refer to
>>>>> referenced file: [711]
>>>>>
>>>>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1288 as contained ack(s) refer to
>>>>> referenced file: [674]
>>>>>
>>>>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1289 as contained ack(s) refer to
>>>>> referenced file: [790]
>>>>>
>>>>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1290 as contained ack(s) refer to
>>>>> referenced file: [1290, 1278]
>>>>>
>>>>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
>>>>>       - not removing data file: 1291 as contained ack(s) refer to
>>>>> referenced file: [790]
>>>>>
>>>>> The files at this point are months old.
>>>>>
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
>>>>>
>>>>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Steve Hill
>>>>>
>>>>> ARMADA
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The information contained in this e-mail and any attachments is
>>>>> confidential and
>>>>> intended only for the recipient. If you are not the intended recipient,
>>>>>
>>>> the
>>>>
>>>>> information contained in this message may not be used, copied, or
>>>>> forwarded to
>>>>> third parties or otherwise distributed for any other purpose. Please
>>>>> notify the
>>>>> sender if you received this e-mail in error and delete the e-mail and its
>>>>> attachments promptly.  Nothing in this e-mail may be used or deemed to
>>>>> form the
>>>>> basis of a contractual or any other legally binding obligation unless
>>>>> separately
>>>>> confirmed in writing by an authorized representative of ARMADA.
>>>>>
>>>>>
>>>>
>>>>
>>>
>

Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Tim Bain
Do the stats you see in JConsole line up with KahaDBJournalReader (what's
that?), or with the web console?  And do you see an open XA transaction in
JConsole, like Steve did?

I'd guess that the negative values indicate that whatever code you
hacked'n'slashed has a bug, either in the code you started with or in the
code you changed/added.  But that's pure guesswork.

The interesting part of your email (for me) is the sentence "the only thing
that seems correct is the amount of messages in
ActiveMQ.DLQ.Queue.queue1.test,"
which implies that you know you have messages in the DLQ and aren't doing
anything about it.  By any chance, are one or more of those messages
exactly as old as your oldest data file?  If you purge that queue (and all
other DLQs), do your data files disappear?

enableAckCompaction only compacts acks, not unconsumed messages.  It will
eliminate any data file being kept solely for acks, but by definition
that's not your first data file (that one must have an unconsumed message)
and if you have unconsumed messages in later files it won't compact them
either.  It might be a problem if your goal is to have fewer journal files,
but you're OK with still having some, but if you want to get rid of all the
old ones, you need to make sure you're not holding old unconsumed messages.

Tim

On Mon, Sep 26, 2016 at 8:45 AM, Johan Carlquist <[hidden email]> wrote:

> I will hijack this thread since I have almost the same problem. Log
> files the filling up disk and TRACE telling me:
> "not removing data file: XXX as contained ack(s) refer to
> referenced file: [XXX]"
> Running ActiveMQ 5.13.4.
>
> I hacked'n'slashed KahaDBJournalReader (would be great if the source
> code could be found somewhere else were collaboration and contribution
> where easier…) to work with the current ActiveMQ.
> And the output ended up like this:
> ---
> 2016-09-26 16:21:27,659 INFO main - -- inflight transactions---
> 2016-09-26 16:21:27,669 INFO main - adds[1251093], acks[1270663] for: null
> 2016-09-26 16:21:27,669 INFO main -
>
> 2016-09-26 16:21:27,669 INFO main - Persistent destination statistics:
> 2016-09-26 16:21:27,670 INFO main - ------------------------
> 2016-09-26 16:21:27,672 INFO main - QUEUE: queue1.test: -38136 msgs.
> 2016-09-26 16:21:27,673 INFO main - QUEUE: queue1.dev: 5883 msgs.
> 2016-09-26 16:21:27,673 INFO main - QUEUE: ActiveMQ.DLQ: 3 msgs.
> 2016-09-26 16:21:27,673 INFO main - QUEUE: aliveness-test: -33 msgs.
> 2016-09-26 16:21:27,674 INFO main - QUEUE: ActiveMQ.DLQ.Queue.queue1.dev:
> -636 msgs.
> 2016-09-26 16:21:27,674 INFO main - QUEUE: ActiveMQ.DLQ.Queue.queue1.test:
> 907 msgs.
> ---
>
> If comparing with the web console the only thing that seems correct is
> the amount of messages in ActiveMQ.DLQ.Queue.queue1.test, everything
> else seems wrong.
>
> What can I read from this? Are all the negative values indicating that
> my KahaDB is corrupt?
>
> Could enableAckCompaction found in 5.14.0 be a solution to this problem?
>
> --
> jocar
>
> > On 22 Sep 2016, at 17:40, Steve Hill <[hidden email]> wrote:
> >
> > Ok, so I hooked up jconsole and looked at the queues, all of the queues
> were showing zero.  Right underneath the queues there was a section that
> said "RecoveredXaTransaction".  It had one entry
> XID_[1096044365_globalId&amp;3132362e392e312e3133392e746..... There
> really is not very much more information available in the attributes.  I
> ended up hitting the heuristicRollback operation and shortly afterwards the
> old KahaDB files were removed.  Great news!
> >
> > I checked the queues afterwards and rolling back the XA transaction did
> not cause the message to go anywhere.  Does anyone know what this might
> have been and how to avoid it from occurring in the future?
> >
> > Thanks!
> > Steve.
> >
> > On 9/22/16 9:09 AM, Tim Bain wrote:
> >> Apparently you can scratch the "but not impossible" bit from my reply.
> >> Thanks Tim, I didn't know it wasn't allowed.
> >>
> >> On Sep 22, 2016 7:07 AM, "Tim Bain" <[hidden email]> wrote:
> >>
> >>> Topics will only hold onto messages if 1) a consumer is currently
> >>> connected but hasn't consumed the messages, or 2) a consumer subscribed
> >>> durably but is now disconnected.  #1 indicates a slow (or intentionally
> >>> misbehaving) consumer, and isn't likely to be a problem over the span
> of
> >>> time you're describing.  It also would result in the messages being
> >>> discarded when the broker is restarted, so if you've done a restart
> and the
> >>> files weren't cleaned up, this isn't the problem.  #2 would be unusual
> for
> >>> advisory topic subscriptions, but not impossible (especially since you
> >>> apparently had this situation with your mis-created non-advisory
> topic).
> >>> I'd use JConsole (because it gives more detail than the web console)
> to go
> >>> through each queue looking for a non-zero number of messages, and each
> >>> topic looking for offline durable subscribers that have unconsumed
> messages.
> >>>
> >>> Alternatively, you all could run the KahaDB file analysis tool (
> >>> http://activemq.apache.org/why-do-kahadb-log-files-
> >>> remain-after-cleanup.html) with a debugger attached to try to see what
> >>> destination has the message(s) that are keeping 980 and 1438 alive.
> (You
> >>> say that 1328 shouldn't be there, but the log shows that it's there
> because
> >>> it contains acks for messages in 980, so it should be there, at least
> until
> >>> 980 is deleted.)
> >>>
> >>> Tim
> >>>
> >>> On Sep 22, 2016 6:39 AM, "Steve Hill" <[hidden email]> wrote:
> >>>
> >>> After the comment about the TOPIC I decided to take a quick look at
> that
> >>> in the web interface (just to make sure nothing had gone awry)  In
> deed one
> >>> of our "queues" had been created as a topic, not sure why as we use the
> >>> same code to connect to the queues however once removed most of the
> data
> >>> files vanished.
> >>>
> >>> Under the topics there are several activemq.advisory.xyz topics. Could
> >>> these be the source of the remaining kahadb log files that still
> remain?
> >>>
> >>> I am now down to three data files that should not be there. (db-980,
> >>> db-1329, and db-1438)
> >>>
> >>>
> >>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
> >>> -rw-r--r--. 1 activemq activemq   527645 Sep 19 19:11 db-1329.log
> >>> drwxr-xr-x. 5 activemq activemq     4096 Sep 20 14:57 ..
> >>> -rw-r--r--. 1 activemq activemq        8 Sep 21 13:17 lock
> >>> -rw-r--r--. 1 activemq activemq 33553518 Sep 21 21:52 db-1438.log
> >>> drwxr-xr-x. 2 activemq activemq    12288 Sep 22 08:20 .
> >>> -rw-r--r--. 1 activemq activemq  3287296 Sep 22 08:34 db.redo
> >>> -rw-r--r--. 1 activemq activemq 12357632 Sep 22 08:34 db.data
> >>> -rw-r--r--. 1 activemq activemq 33554432 Sep 22 08:34 db-1456.log
> >>>
> >>> From kahadb logs
> >>>
> >>>  2016-09-22 08:31:07,503 [eckpoint Worker] TRACE MessageDatabase
> >>>       - gc candidates: [1329]
> >>>  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
> >>>       - ackMessageFileMap: {1329=[980], 1438=[], 1456=[]}
> >>>  2016-09-22 08:31:07,504 [eckpoint Worker] TRACE MessageDatabase
> >>>       - not removing data file: 1329 as contained ack(s) refer to
> >>> referenced file: [980]
> >>>  2016-09-22 08:31:07,505 [eckpoint Worker] TRACE MessageDatabase
> >>>       - Not yet time to check for compaction: 7 of 10 cycles
> >>>  2016-09-22 08:31:07,505 [eckpoint Worker] DEBUG MessageDatabase
> >>>       - Checkpoint done.
> >>>
> >>> Thanks!
> >>> Steve.
> >>>
> >>>
> >>> On 9/22/16 8:14 AM, Steve Hill wrote:
> >>>
> >>>> Chris/Tim:
> >>>>
> >>>> Thanks for the ideas.  Unfortunately neither are the case in our
> >>>> situation.  The web client shows 0 messages in any of our queues and
> we are
> >>>> not using topics.  We did set up the remote debug port and downloaded
> the
> >>>> source code if there is anything useful to further troubleshoot that
> could
> >>>> be gleaned from there?
> >>>>
> >>>> Thanks!
> >>>> Steve.
> >>>>
> >>>>
> >>>> The most likely place for you to have old messages laying around is
> in the
> >>>> DLQ.  The next most likely place is on a topic for an offline (and
> never
> >>>> coming back) durable subscriber.
> >>>>
> >>>> Tim
> >>>>
> >>>> On Sep 20, 2016 2:02 PM, "Christopher Shannon" <
> >>>> [hidden email]> wrote:
> >>>>
> >>>> Do you have old messages laying around?  KahaDB can't clean up old
> journal
> >>>> files if there are messages scattered throughout the logs.  Even a
> single
> >>>> unacked message means that a journal file (and the future files)
> can't be
> >>>> cleaned up. This situation tends to happen either when there are old
> >>>> messages not being acked quick enough or there are different
> destinations
> >>>> with different speeds.  Ie if one queue has a very slow message rate
> and
> >>>> another queue has a high message rate.
> >>>>
> >>>> Assuming there isn't some sort of bug with the GC process then one
> way to
> >>>> help lower the usage would be to split up destinations into different
> >>>> stores using multiKahaDB.  Take a look at Gary's blog post on this:
> >>>> http://blog.garytully.com/2011/11/activemq-multiple-
> kahadb-instances.html
> >>>>
> >>>>
> >>>> On Tue, Sep 20, 2016 at 1:04 PM, Hill, Steve <[hidden email]>
> wrote:
> >>>>
> >>>> Hi fellow ActiveMQ users!
> >>>>> I have searched all over the internet and can not find a solution to
> a
> >>>>> problem we are experiencing where old log files are not removed from
> >>>>> KAHADB.  Needless to say we uncovered this problem when we exceeded
> the
> >>>>> size of our persistent store in production :-)
> >>>>>
> >>>>> We are using client side acknowledgement for all reads from the
> Queue and
> >>>>> have restarted all clients and also ActiveMQ numerous times, yet the
> >>>>> messages do not get removed.  We are also using the network
> connector to
> >>>>> move messages to the queue that has an active consumer.  We were
> using
> >>>>> 5.13.0 but moved to 5.14.0 on the hope it might address this problem.
> >>>>>
> >>>>> Follows are some more detailed messages; does anyone have an idea
> how we
> >>>>> can resolve this problem?
> >>>>>
> >>>>> KAHADB logging is enabled and we see the following messages in the
> log
> >>>>> file.
> >>>>>
> >>>>>
> >>>>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1278 as contained ack(s) refer to
> >>>>> referenced file: [1278, 1277]
> >>>>>
> >>>>>  2016-09-20 08:59:10,207 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1279 as contained ack(s) refer to
> >>>>> referenced file: [674]
> >>>>>
> >>>>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1280 as contained ack(s) refer to
> >>>>> referenced file: [691]
> >>>>>
> >>>>>  2016-09-20 08:59:10,208 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1281 as contained ack(s) refer to
> >>>>> referenced file: [699]
> >>>>>
> >>>>>  2016-09-20 08:59:10,209 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1282 as contained ack(s) refer to
> >>>>> referenced file: [699]
> >>>>>
> >>>>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1283 as contained ack(s) refer to
> >>>>> referenced file: [705]
> >>>>>
> >>>>>  2016-09-20 08:59:10,210 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1284 as contained ack(s) refer to
> >>>>> referenced file: [711]
> >>>>>
> >>>>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1288 as contained ack(s) refer to
> >>>>> referenced file: [674]
> >>>>>
> >>>>>  2016-09-20 08:59:10,211 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1289 as contained ack(s) refer to
> >>>>> referenced file: [790]
> >>>>>
> >>>>>  2016-09-20 08:59:10,212 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1290 as contained ack(s) refer to
> >>>>> referenced file: [1290, 1278]
> >>>>>
> >>>>>  2016-09-20 08:59:10,213 [eckpoint Worker] TRACE MessageDatabase
> >>>>>       - not removing data file: 1291 as contained ack(s) refer to
> >>>>> referenced file: [790]
> >>>>>
> >>>>> The files at this point are months old.
> >>>>>
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 14 21:23 db-674.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 16 01:57 db-691.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 17 02:10 db-699.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 18 03:55 db-705.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 19 01:32 db-711.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 25 15:15 db-790.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 26 10:32 db-858.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 00:19 db-865.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 27 03:03 db-866.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 28 01:45 db-877.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 29 01:18 db-888.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 30 02:10 db-899.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Jul 31 01:11 db-904.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 00:19 db-909.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  1 04:42 db-910.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  2 01:07 db-918.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  3 01:38 db-930.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 00:19 db-939.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  4 03:16 db-940.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  5 00:23 db-949.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  6 02:42 db-960.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  7 02:18 db-968.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 00:44 db-975.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 14:00 db-980.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  8 15:41 db-981.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug  9 01:43 db-1051.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 00:20 db-1066.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 11 03:01 db-1067.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 00:19 db-1074.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 12 03:05 db-1075.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 00:19 db-1082.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 13 03:04 db-1083.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 14 01:06 db-1093.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 15 01:18 db-1111.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 16 02:38 db-1126.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 00:19 db-1137.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 17 02:32 db-1138.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 18 00:21 db-1155.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Aug 19 00:51 db-1167.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq 33554432 Sep 19 13:42 db-1277.log
> >>>>>
> >>>>> -rw-r--r--. 1 activemq activemq   597139 Sep 19 14:59 db-1279.log
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>> Steve Hill
> >>>>>
> >>>>> ARMADA
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> The information contained in this e-mail and any attachments is
> >>>>> confidential and
> >>>>> intended only for the recipient. If you are not the intended
> recipient,
> >>>>>
> >>>> the
> >>>>
> >>>>> information contained in this message may not be used, copied, or
> >>>>> forwarded to
> >>>>> third parties or otherwise distributed for any other purpose. Please
> >>>>> notify the
> >>>>> sender if you received this e-mail in error and delete the e-mail
> and its
> >>>>> attachments promptly.  Nothing in this e-mail may be used or deemed
> to
> >>>>> form the
> >>>>> basis of a contractual or any other legally binding obligation unless
> >>>>> separately
> >>>>> confirmed in writing by an authorized representative of ARMADA.
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Johan Carlquist
Thanks for your reply.

> On 27 Sep 2016, at 14:59, Tim Bain <[hidden email]> wrote:
>
> Do the stats you see in JConsole line up with KahaDBJournalReader (what's
> that?), or with the web console?  And do you see an open XA transaction in
> JConsole, like Steve did?

No traces of XA transactions.

> I'd guess that the negative values indicate that whatever code you
> hacked'n'slashed has a bug, either in the code you started with or in the
> code you changed/added.  But that's pure guesswork.

Highly possible.
The java isn't strong in this one…

> The interesting part of your email (for me) is the sentence "the only thing
> that seems correct is the amount of messages in
> ActiveMQ.DLQ.Queue.queue1.test,"
> which implies that you know you have messages in the DLQ and aren't doing
> anything about it.  By any chance, are one or more of those messages
> exactly as old as your oldest data file?  If you purge that queue (and all
> other DLQs), do your data files disappear?

Yes. The DLQs haven't been reciving a lot of love since this is a
staging environment.
I thought I tested to purge the DLQs, but apperantly not. Did it now and
all the log files disappeard and freed a lot of disk. Maybe I didn't
wait until the gc before checking the disk usage last time.

This might be a dumb question, but it feels like the messages in DLQ
took a lot more storage then "regular" messages. Is this since they are
spread over many log files? Is it possible to "defrag" the log files
without purging the queues?


> enableAckCompaction only compacts acks, not unconsumed messages.  It will
> eliminate any data file being kept solely for acks, but by definition
> that's not your first data file (that one must have an unconsumed message)
> and if you have unconsumed messages in later files it won't compact them
> either.  It might be a problem if your goal is to have fewer journal files,
> but you're OK with still having some, but if you want to get rid of all the
> old ones, you need to make sure you're not holding old unconsumed messages.

Ok! My goal is to not fill disk. So it seems like handling the DLQ is a
better way to do it.

--
jocar

Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Tim Bain
KahaDB's data files are not compacted, so a tiny message can keep alive a
32-MB (or whatever size you're using) log file.  It's not that the messages
are larger, it's just that lots of clutter gets kept along with the actual
content you're keeping.  And if you're not using 5.14.0, then chains of
files can be kept alive where only the first file contains live messages
and the later files contain just acks for messages in the older files
(which can't be discarded, or else the messages they ack will come back to
life).

Either way, you should either implement a strategy for addressing DLQed
messages in a timely manner or if that's not possible then configure your
broker to set a JMSExpirationTime on each message or use the discarding
dead letter strategy (as described in
http://activemq.apache.org/message-redelivery-and-dlq-handling.html) or
something similar to automatically dispose of DLQ messages if you're sure
you're not going to use them anyway.

Tim

On Tue, Sep 27, 2016 at 8:27 AM, Johan Carlquist <[hidden email]> wrote:

> Thanks for your reply.
>
> > On 27 Sep 2016, at 14:59, Tim Bain <[hidden email]> wrote:
> >
> > Do the stats you see in JConsole line up with KahaDBJournalReader (what's
> > that?), or with the web console?  And do you see an open XA transaction
> in
> > JConsole, like Steve did?
>
> No traces of XA transactions.
>
> > I'd guess that the negative values indicate that whatever code you
> > hacked'n'slashed has a bug, either in the code you started with or in the
> > code you changed/added.  But that's pure guesswork.
>
> Highly possible.
> The java isn't strong in this one…
>
> > The interesting part of your email (for me) is the sentence "the only
> thing
> > that seems correct is the amount of messages in
> > ActiveMQ.DLQ.Queue.queue1.test,"
> > which implies that you know you have messages in the DLQ and aren't doing
> > anything about it.  By any chance, are one or more of those messages
> > exactly as old as your oldest data file?  If you purge that queue (and
> all
> > other DLQs), do your data files disappear?
>
> Yes. The DLQs haven't been reciving a lot of love since this is a
> staging environment.
> I thought I tested to purge the DLQs, but apperantly not. Did it now and
> all the log files disappeard and freed a lot of disk. Maybe I didn't
> wait until the gc before checking the disk usage last time.
>
> This might be a dumb question, but it feels like the messages in DLQ
> took a lot more storage then "regular" messages. Is this since they are
> spread over many log files? Is it possible to "defrag" the log files
> without purging the queues?
>
>
> > enableAckCompaction only compacts acks, not unconsumed messages.  It will
> > eliminate any data file being kept solely for acks, but by definition
> > that's not your first data file (that one must have an unconsumed
> message)
> > and if you have unconsumed messages in later files it won't compact them
> > either.  It might be a problem if your goal is to have fewer journal
> files,
> > but you're OK with still having some, but if you want to get rid of all
> the
> > old ones, you need to make sure you're not holding old unconsumed
> messages.
>
> Ok! My goal is to not fill disk. So it seems like handling the DLQ is a
> better way to do it.
>
> --
> jocar
>
>
Reply | Threaded
Open this post in threaded view
|

Re: KAHADB clean up old log files

Shobhana
Sorry about adding to somebody else's thread.... I am adding here since I see same issue in our production server.

We use 5.14.0 version and I have already enabled DLQ to discard all expired messages and also set expiry of 1 day to all messages (MQTT) and offline durable consumers using the following configurations :

a) I enabled following plug-in to set expiry of 1 day for every message (we use only MQTT messages) :
<plugins>
  <discardingDLQBrokerPlugin dropAll="true" dropTemporaryTopics="true" dropTemporaryQueues="true" />
        <timeStampingBrokerPlugin ttlCeiling="86400000" zeroExpirationOverride="86400000"/>
</plugins>

b) I enabled DLQ to drop all expired messages for all topics and queues :
    <deadLetterStrategy>
          <sharedDeadLetterStrategy processExpired="false" />
    </deadLetterStrategy>

<discardingDLQBrokerPlugin dropAll="true" dropTemporaryTopics="true" dropTemporaryQueues="true" />

c) I enabled offline durable subscribers to timeout after 1 day :
offlineDurableSubscriberTimeout="86400000" offlineDurableSubscriberTaskSchedule="3600000"

After these changes, I could see a few log files got deleted, but majority of the log files remain in the Kahadb folder which consume lot of disk space.

I observed this in our production server where I cannot enable trace or JMX. Is there any other way to identify what is causing this issue?

TIA,
Shobhana