Failover very slow with kahadb while restart of master is fast

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Failover very slow with kahadb while restart of master is fast

cschneider
At a customer we are experiencing a very strange behaviour of activemq.

We have two brokers on separate machines A and B. Storage is on a third
machine and imported using nfsv4.
Kahadb contains about 18 GB of journal files.

Scenario 1:
- Stop B
- Stop A (master now)
- Start A

The start of A takes about 50 seconds.

Scnario 2:
- Stop A
- Stop B (master now)
- Start A

The start of A takes about 5 minutes.

In the slow case the log at debug level shows a big gap between adding
the last queue (12:04) and running the ActiveMQ Journal Checkpoint
Worker again after the queue creation (12:09).
I have no idea what activemq is doing in the mean time but it seems to
take several minutes.

Can anyone explain this behaviour or explain what activemq does in the
mean time? I would also be happy about pointer to the code.
What I do not understand is why it makes a difference if the master is
started again or if the slave is becoming master. As they both share the
same kahadb over nfs I would expect the same startup
behaviour.

Christian

-----

Slow start
2016-02-15 12:03:01,943 | INFO  | Refreshing
org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date
[Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy |
org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
...
2016-02-15 12:03:03,018 | INFO  |
PListStore:[/path/activemq/data/A/tmp_storage] started |
org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
2016-02-15 12:03:03,033 | INFO  | ignoring zero length, partially
initialised journal data file: db-1.log number = 1 , length = 0 |
org.apache.activemq.store.kahadb.disk.journal.Journal | main
2016-02-15 12:03:03,177 | INFO  | JobSchedulerStore:path/scheduler
started |
org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl | main
2016-02-15 12:03:03,180 | INFO  | JobScheduler using directory:
path/scheduler | org.apache.activemq.broker.BrokerService | main
2016-02-15 12:03:03,335 | INFO  | Using Persistence Adapter:
KahaDBPersistenceAdapter[/path] |
org.apache.activemq.broker.BrokerService | main
2016-02-15 12:03:03,350 | INFO  | JMX consoles can connect to
service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi |
org.apache.activemq.broker.jmx.ManagementContext | JMX connector
2016-02-15 12:03:05,198 | INFO  | Corrupt journal records found in
'path/db-1122.log' between offsets: 17893102..17893732 |
org.apache.activemq.store.kahadb.disk.journal.Journal | main
2016-02-15 12:03:47,925 | INFO  | KahaDB is version 5 |
org.apache.activemq.store.kahadb.MessageDatabase | main
....
2016-02-15 12:03:49,233 | INFO  | Recovering from the journal ... |
org.apache.activemq.store.kahadb.MessageDatabase | main
2016-02-15 12:03:49,237 | INFO  | Recovery replayed 20 operations from
the journal in 0.094 seconds. |
org.apache.activemq.store.kahadb.MessageDatabase | main
2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. |
org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
Checkpoint Worker
2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. |
org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
Checkpoint Worker
2016-02-15 12:04:26,940 | INFO  | Apache ActiveMQ 5.10.2 (A,
ID:A-59167-1455534183070-1:1) is starting |
org.apache.activemq.broker.BrokerService | main
...
2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ |
org.apache.activemq.broker.region.AbstractRegion | main
2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages .. |
org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0,
pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0,
dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue |
ActiveMQ Broker[A] Scheduler
2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. |
org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
Checkpoint Worker
2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. |
org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
Checkpoint Worker
2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages done.
| org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler


--
Christian Schneider
http://www.liquid-reality.de

Open Source Architect
http://www.talend.com

Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

Jean-Baptiste Onofré
Hi Christian,

what's the option on the NFS ?

Most of the time, I use I use sync and I increase the rsize and wsize.

Anyway, the difference is huge. The problem could be related to the
journal checksum check. Can you try checkForCorruptJournalFiles="false"
on the kahadb configuration ?

Maybe increasing the kahadb jobs could help too:

             <kahaDB directory="${karaf.data}/activemq/activemq/kahadb"
                 indexWriteBatchSize="1000"
                 indexCacheSize="2000"
                 journalMaxFileLength="32mb"
                 checkForCorruptJournalFiles="false"
                 maxAsyncJobs="5000"
                 concurrentStoreAndDispatchQueues="true"
                 concurrentStoreAndDispatchTopics="true"
                 enableJournalDiskSyncs="true"
                 enableIndexWriteAsync="true"/>


Regards
JB

On 02/15/2016 01:32 PM, Christian Schneider wrote:

> At a customer we are experiencing a very strange behaviour of activemq.
>
> We have two brokers on separate machines A and B. Storage is on a third
> machine and imported using nfsv4.
> Kahadb contains about 18 GB of journal files.
>
> Scenario 1:
> - Stop B
> - Stop A (master now)
> - Start A
>
> The start of A takes about 50 seconds.
>
> Scnario 2:
> - Stop A
> - Stop B (master now)
> - Start A
>
> The start of A takes about 5 minutes.
>
> In the slow case the log at debug level shows a big gap between adding
> the last queue (12:04) and running the ActiveMQ Journal Checkpoint
> Worker again after the queue creation (12:09).
> I have no idea what activemq is doing in the mean time but it seems to
> take several minutes.
>
> Can anyone explain this behaviour or explain what activemq does in the
> mean time? I would also be happy about pointer to the code.
> What I do not understand is why it makes a difference if the master is
> started again or if the slave is becoming master. As they both share the
> same kahadb over nfs I would expect the same startup
> behaviour.
>
> Christian
>
> -----
>
> Slow start
> 2016-02-15 12:03:01,943 | INFO  | Refreshing
> org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date
> [Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy |
> org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
> ...
> 2016-02-15 12:03:03,018 | INFO  |
> PListStore:[/path/activemq/data/A/tmp_storage] started |
> org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
> 2016-02-15 12:03:03,033 | INFO  | ignoring zero length, partially
> initialised journal data file: db-1.log number = 1 , length = 0 |
> org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2016-02-15 12:03:03,177 | INFO  | JobSchedulerStore:path/scheduler
> started |
> org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl | main
> 2016-02-15 12:03:03,180 | INFO  | JobScheduler using directory:
> path/scheduler | org.apache.activemq.broker.BrokerService | main
> 2016-02-15 12:03:03,335 | INFO  | Using Persistence Adapter:
> KahaDBPersistenceAdapter[/path] |
> org.apache.activemq.broker.BrokerService | main
> 2016-02-15 12:03:03,350 | INFO  | JMX consoles can connect to
> service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi |
> org.apache.activemq.broker.jmx.ManagementContext | JMX connector
> 2016-02-15 12:03:05,198 | INFO  | Corrupt journal records found in
> 'path/db-1122.log' between offsets: 17893102..17893732 |
> org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2016-02-15 12:03:47,925 | INFO  | KahaDB is version 5 |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> ....
> 2016-02-15 12:03:49,233 | INFO  | Recovering from the journal ... |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> 2016-02-15 12:03:49,237 | INFO  | Recovery replayed 20 operations from
> the journal in 0.094 seconds. |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> 2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:04:26,940 | INFO  | Apache ActiveMQ 5.10.2 (A,
> ID:A-59167-1455534183070-1:1) is starting |
> org.apache.activemq.broker.BrokerService | main
> ...
> 2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ |
> org.apache.activemq.broker.region.AbstractRegion | main
> 2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages .. |
> org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
> 2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0,
> pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0,
> dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue |
> ActiveMQ Broker[A] Scheduler
> 2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages done.
> | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
>
>

--
Jean-Baptiste Onofré
[hidden email]
http://blog.nanthrax.net
Talend - http://www.talend.com
Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

Tim Bain
You could use a profiler such as JVisualVM to tell you what work is being
done.  You could get some of that information from thread dumps via jstack,
but I think a profiler would give much more useful and detailed information
for less effort.

Tim
On Feb 15, 2016 5:37 AM, "Jean-Baptiste Onofré" <[hidden email]> wrote:

> Hi Christian,
>
> what's the option on the NFS ?
>
> Most of the time, I use I use sync and I increase the rsize and wsize.
>
> Anyway, the difference is huge. The problem could be related to the
> journal checksum check. Can you try checkForCorruptJournalFiles="false" on
> the kahadb configuration ?
>
> Maybe increasing the kahadb jobs could help too:
>
>             <kahaDB directory="${karaf.data}/activemq/activemq/kahadb"
>                 indexWriteBatchSize="1000"
>                 indexCacheSize="2000"
>                 journalMaxFileLength="32mb"
>                 checkForCorruptJournalFiles="false"
>                 maxAsyncJobs="5000"
>                 concurrentStoreAndDispatchQueues="true"
>                 concurrentStoreAndDispatchTopics="true"
>                 enableJournalDiskSyncs="true"
>                 enableIndexWriteAsync="true"/>
>
>
> Regards
> JB
>
> On 02/15/2016 01:32 PM, Christian Schneider wrote:
>
>> At a customer we are experiencing a very strange behaviour of activemq.
>>
>> We have two brokers on separate machines A and B. Storage is on a third
>> machine and imported using nfsv4.
>> Kahadb contains about 18 GB of journal files.
>>
>> Scenario 1:
>> - Stop B
>> - Stop A (master now)
>> - Start A
>>
>> The start of A takes about 50 seconds.
>>
>> Scnario 2:
>> - Stop A
>> - Stop B (master now)
>> - Start A
>>
>> The start of A takes about 5 minutes.
>>
>> In the slow case the log at debug level shows a big gap between adding
>> the last queue (12:04) and running the ActiveMQ Journal Checkpoint
>> Worker again after the queue creation (12:09).
>> I have no idea what activemq is doing in the mean time but it seems to
>> take several minutes.
>>
>> Can anyone explain this behaviour or explain what activemq does in the
>> mean time? I would also be happy about pointer to the code.
>> What I do not understand is why it makes a difference if the master is
>> started again or if the slave is becoming master. As they both share the
>> same kahadb over nfs I would expect the same startup
>> behaviour.
>>
>> Christian
>>
>> -----
>>
>> Slow start
>> 2016-02-15 12:03:01,943 | INFO  | Refreshing
>> org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date
>> [Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy |
>> org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
>> ...
>> 2016-02-15 12:03:03,018 | INFO  |
>> PListStore:[/path/activemq/data/A/tmp_storage] started |
>> org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
>> 2016-02-15 12:03:03,033 | INFO  | ignoring zero length, partially
>> initialised journal data file: db-1.log number = 1 , length = 0 |
>> org.apache.activemq.store.kahadb.disk.journal.Journal | main
>> 2016-02-15 12:03:03,177 | INFO  | JobSchedulerStore:path/scheduler
>> started |
>> org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl | main
>> 2016-02-15 12:03:03,180 | INFO  | JobScheduler using directory:
>> path/scheduler | org.apache.activemq.broker.BrokerService | main
>> 2016-02-15 12:03:03,335 | INFO  | Using Persistence Adapter:
>> KahaDBPersistenceAdapter[/path] |
>> org.apache.activemq.broker.BrokerService | main
>> 2016-02-15 12:03:03,350 | INFO  | JMX consoles can connect to
>> service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi |
>> org.apache.activemq.broker.jmx.ManagementContext | JMX connector
>> 2016-02-15 12:03:05,198 | INFO  | Corrupt journal records found in
>> 'path/db-1122.log' between offsets: 17893102..17893732 |
>> org.apache.activemq.store.kahadb.disk.journal.Journal | main
>> 2016-02-15 12:03:47,925 | INFO  | KahaDB is version 5 |
>> org.apache.activemq.store.kahadb.MessageDatabase | main
>> ....
>> 2016-02-15 12:03:49,233 | INFO  | Recovering from the journal ... |
>> org.apache.activemq.store.kahadb.MessageDatabase | main
>> 2016-02-15 12:03:49,237 | INFO  | Recovery replayed 20 operations from
>> the journal in 0.094 seconds. |
>> org.apache.activemq.store.kahadb.MessageDatabase | main
>> 2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. |
>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>> Checkpoint Worker
>> 2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. |
>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>> Checkpoint Worker
>> 2016-02-15 12:04:26,940 | INFO  | Apache ActiveMQ 5.10.2 (A,
>> ID:A-59167-1455534183070-1:1) is starting |
>> org.apache.activemq.broker.BrokerService | main
>> ...
>> 2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ |
>> org.apache.activemq.broker.region.AbstractRegion | main
>> 2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages .. |
>> org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
>> 2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0,
>> pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0,
>> dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue |
>> ActiveMQ Broker[A] Scheduler
>> 2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. |
>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>> Checkpoint Worker
>> 2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. |
>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>> Checkpoint Worker
>> 2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages done.
>> | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
>>
>>
>>
> --
> Jean-Baptiste Onofré
> [hidden email]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

artnaseef
In reply to this post by cschneider
Does the shutdown of the broker complete cleanly in both cases?

A slow shutdown will lead to the shutdown script using kill -9, which eliminates any possibility of the process continuing to ensure a clean shutdown.  In that case, the contents of the file may be left in a dirty state, which may lead to recovery when the broker starts up.

Also, in the "Stop master then stop second master" scenario, is the second master (B) completely stopped before restarting the first master (A)?

One way to help track this down - take periodic stack dumps (jstack is good for this) during the startup (perhaps one every 15 seconds), and look at the stack trace of the main broker startup thread.  That should help determine the cause.
Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

cschneider
In reply to this post by Tim Bain
I now took thread dumps of activemq while it was taking so long to
start  up.

This is where it seemed to spend most of the time:
http://apaste.info/H1c

So it looks like it is reading the kahadb journals. Does anyone have an
idea why it seems to matter if the master restarts (fast) or the dlave
takes over (slow).
After all the files are in the same remote location in both cases.

Christian

# at
org.apache.activemq.util.RecoverableRandomAccessFile.readFully(RecoverableRandomAccessFile.java:75)
# at
org.apache.activemq.store.kahadb.disk.journal.DataFileAccessor.readFully(DataFileAccessor.java:98)
# at
org.apache.activemq.store.kahadb.disk.journal.Journal.checkBatchRecord(Journal.java:302)
# at
org.apache.activemq.store.kahadb.disk.journal.Journal.recoveryCheck(Journal.java:210)
# at
org.apache.activemq.store.kahadb.disk.journal.Journal.start(Journal.java:170)
# -locked <0x000000070086c700>(a
org.apache.activemq.store.kahadb.disk.journal.Journal)
# at
org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:378)



On 15.02.2016 15:15, Tim Bain wrote:

> You could use a profiler such as JVisualVM to tell you what work is being
> done.  You could get some of that information from thread dumps via jstack,
> but I think a profiler would give much more useful and detailed information
> for less effort.
>
> Tim
> On Feb 15, 2016 5:37 AM, "Jean-Baptiste Onofré" <[hidden email]> wrote:
>
>> Hi Christian,
>>
>> what's the option on the NFS ?
>>
>> Most of the time, I use I use sync and I increase the rsize and wsize.
>>
>> Anyway, the difference is huge. The problem could be related to the
>> journal checksum check. Can you try checkForCorruptJournalFiles="false" on
>> the kahadb configuration ?
>>
>> Maybe increasing the kahadb jobs could help too:
>>
>>              <kahaDB directory="${karaf.data}/activemq/activemq/kahadb"
>>                  indexWriteBatchSize="1000"
>>                  indexCacheSize="2000"
>>                  journalMaxFileLength="32mb"
>>                  checkForCorruptJournalFiles="false"
>>                  maxAsyncJobs="5000"
>>                  concurrentStoreAndDispatchQueues="true"
>>                  concurrentStoreAndDispatchTopics="true"
>>                  enableJournalDiskSyncs="true"
>>                  enableIndexWriteAsync="true"/>
>>
>>
>> Regards
>> JB
>>
>> On 02/15/2016 01:32 PM, Christian Schneider wrote:
>>
>>> At a customer we are experiencing a very strange behaviour of activemq.
>>>
>>> We have two brokers on separate machines A and B. Storage is on a third
>>> machine and imported using nfsv4.
>>> Kahadb contains about 18 GB of journal files.
>>>
>>> Scenario 1:
>>> - Stop B
>>> - Stop A (master now)
>>> - Start A
>>>
>>> The start of A takes about 50 seconds.
>>>
>>> Scnario 2:
>>> - Stop A
>>> - Stop B (master now)
>>> - Start A
>>>
>>> The start of A takes about 5 minutes.
>>>
>>> In the slow case the log at debug level shows a big gap between adding
>>> the last queue (12:04) and running the ActiveMQ Journal Checkpoint
>>> Worker again after the queue creation (12:09).
>>> I have no idea what activemq is doing in the mean time but it seems to
>>> take several minutes.
>>>
>>> Can anyone explain this behaviour or explain what activemq does in the
>>> mean time? I would also be happy about pointer to the code.
>>> What I do not understand is why it makes a difference if the master is
>>> started again or if the slave is becoming master. As they both share the
>>> same kahadb over nfs I would expect the same startup
>>> behaviour.
>>>
>>> Christian
>>>
>>> -----
>>>
>>> Slow start
>>> 2016-02-15 12:03:01,943 | INFO  | Refreshing
>>> org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date
>>> [Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy |
>>> org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
>>> ...
>>> 2016-02-15 12:03:03,018 | INFO  |
>>> PListStore:[/path/activemq/data/A/tmp_storage] started |
>>> org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
>>> 2016-02-15 12:03:03,033 | INFO  | ignoring zero length, partially
>>> initialised journal data file: db-1.log number = 1 , length = 0 |
>>> org.apache.activemq.store.kahadb.disk.journal.Journal | main
>>> 2016-02-15 12:03:03,177 | INFO  | JobSchedulerStore:path/scheduler
>>> started |
>>> org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl | main
>>> 2016-02-15 12:03:03,180 | INFO  | JobScheduler using directory:
>>> path/scheduler | org.apache.activemq.broker.BrokerService | main
>>> 2016-02-15 12:03:03,335 | INFO  | Using Persistence Adapter:
>>> KahaDBPersistenceAdapter[/path] |
>>> org.apache.activemq.broker.BrokerService | main
>>> 2016-02-15 12:03:03,350 | INFO  | JMX consoles can connect to
>>> service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi |
>>> org.apache.activemq.broker.jmx.ManagementContext | JMX connector
>>> 2016-02-15 12:03:05,198 | INFO  | Corrupt journal records found in
>>> 'path/db-1122.log' between offsets: 17893102..17893732 |
>>> org.apache.activemq.store.kahadb.disk.journal.Journal | main
>>> 2016-02-15 12:03:47,925 | INFO  | KahaDB is version 5 |
>>> org.apache.activemq.store.kahadb.MessageDatabase | main
>>> ....
>>> 2016-02-15 12:03:49,233 | INFO  | Recovering from the journal ... |
>>> org.apache.activemq.store.kahadb.MessageDatabase | main
>>> 2016-02-15 12:03:49,237 | INFO  | Recovery replayed 20 operations from
>>> the journal in 0.094 seconds. |
>>> org.apache.activemq.store.kahadb.MessageDatabase | main
>>> 2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. |
>>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>>> Checkpoint Worker
>>> 2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. |
>>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>>> Checkpoint Worker
>>> 2016-02-15 12:04:26,940 | INFO  | Apache ActiveMQ 5.10.2 (A,
>>> ID:A-59167-1455534183070-1:1) is starting |
>>> org.apache.activemq.broker.BrokerService | main
>>> ...
>>> 2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ |
>>> org.apache.activemq.broker.region.AbstractRegion | main
>>> 2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages .. |
>>> org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
>>> 2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0,
>>> pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0,
>>> dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue |
>>> ActiveMQ Broker[A] Scheduler
>>> 2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. |
>>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>>> Checkpoint Worker
>>> 2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. |
>>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>>> Checkpoint Worker
>>> 2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages done.
>>> | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
>>>
>>>
>>>
>> --
>> Jean-Baptiste Onofré
>> [hidden email]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>


--
Christian Schneider
http://www.liquid-reality.de

Open Source Architect
http://www.talend.com

Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

cschneider
In reply to this post by cschneider
We have found the main cause for this issue. Current Master and new
Master seem to do the exact same things on startup.
They read all the journals.

It seems that the OS is caching the files in memory in some cases.
The master seems to have all the jorunal files in the cache. So when it
restarts it is fast.
The new master seems to not have the journals in cache so they have to
be transfered over the network (~10MB/s) in our case. So this
causes the difference in startup time.

One big question remains:
Why are the whole journals read? Shouldn't a master that is shut down in
a controlled way leave a clean state, so on startup only the meta data
index is read?

Christian

On 15.02.2016 13:32, Christian Schneider wrote:

> At a customer we are experiencing a very strange behaviour of activemq.
>
> We have two brokers on separate machines A and B. Storage is on a
> third machine and imported using nfsv4.
> Kahadb contains about 18 GB of journal files.
>
> Scenario 1:
> - Stop B
> - Stop A (master now)
> - Start A
>
> The start of A takes about 50 seconds.
>
> Scnario 2:
> - Stop A
> - Stop B (master now)
> - Start A
>
> The start of A takes about 5 minutes.
>
> In the slow case the log at debug level shows a big gap between adding
> the last queue (12:04) and running the ActiveMQ Journal Checkpoint
> Worker again after the queue creation (12:09).
> I have no idea what activemq is doing in the mean time but it seems to
> take several minutes.
>
> Can anyone explain this behaviour or explain what activemq does in the
> mean time? I would also be happy about pointer to the code.
> What I do not understand is why it makes a difference if the master is
> started again or if the slave is becoming master. As they both share
> the same kahadb over nfs I would expect the same startup
> behaviour.
>
> Christian
>
> -----
>
> Slow start
> 2016-02-15 12:03:01,943 | INFO  | Refreshing
> org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date
> [Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy |
> org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
> ...
> 2016-02-15 12:03:03,018 | INFO  |
> PListStore:[/path/activemq/data/A/tmp_storage] started |
> org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
> 2016-02-15 12:03:03,033 | INFO  | ignoring zero length, partially
> initialised journal data file: db-1.log number = 1 , length = 0 |
> org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2016-02-15 12:03:03,177 | INFO  | JobSchedulerStore:path/scheduler
> started |
> org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl | main
> 2016-02-15 12:03:03,180 | INFO  | JobScheduler using directory:
> path/scheduler | org.apache.activemq.broker.BrokerService | main
> 2016-02-15 12:03:03,335 | INFO  | Using Persistence Adapter:
> KahaDBPersistenceAdapter[/path] |
> org.apache.activemq.broker.BrokerService | main
> 2016-02-15 12:03:03,350 | INFO  | JMX consoles can connect to
> service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi |
> org.apache.activemq.broker.jmx.ManagementContext | JMX connector
> 2016-02-15 12:03:05,198 | INFO  | Corrupt journal records found in
> 'path/db-1122.log' between offsets: 17893102..17893732 |
> org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2016-02-15 12:03:47,925 | INFO  | KahaDB is version 5 |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> ....
> 2016-02-15 12:03:49,233 | INFO  | Recovering from the journal ... |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> 2016-02-15 12:03:49,237 | INFO  | Recovery replayed 20 operations from
> the journal in 0.094 seconds. |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> 2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:04:26,940 | INFO  | Apache ActiveMQ 5.10.2 (A,
> ID:A-59167-1455534183070-1:1) is starting |
> org.apache.activemq.broker.BrokerService | main
> ...
> 2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ
> | org.apache.activemq.broker.region.AbstractRegion | main
> 2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages ..
> | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
> 2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0,
> pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount:
> 0, dequeueCount: 0, memUsage:0 |
> org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
> 2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages
> done. | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A]
> Scheduler
>
>


--
Christian Schneider
http://www.liquid-reality.de

Open Source Architect
http://www.talend.com

Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

Tim Bain
By reading the code (
http://www.grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-kahadb-store/5.11.1/org/apache/activemq/store/kahadb/disk/journal/Journal.java#Journal.recoveryCheck%28org.apache.activemq.store.kahadb.disk.journal.DataFile%29),
I found that we're reading the data files to confirm that they're not
corrupted.  That's the beauty of an open-source project: all of the source
code is available, so you can read it yourself and see what's being done.

You should confirm that your OS will properly invalidate its cache entry
for a given file when that file changes; having KahaDB read stale cache
entries would be a very bad thing.

Tim
We have found the main cause for this issue. Current Master and new Master
seem to do the exact same things on startup.
They read all the journals.

It seems that the OS is caching the files in memory in some cases.
The master seems to have all the jorunal files in the cache. So when it
restarts it is fast.
The new master seems to not have the journals in cache so they have to be
transfered over the network (~10MB/s) in our case. So this
causes the difference in startup time.

One big question remains:
Why are the whole journals read? Shouldn't a master that is shut down in a
controlled way leave a clean state, so on startup only the meta data index
is read?

Christian

On 15.02.2016 13:32, Christian Schneider wrote:

> At a customer we are experiencing a very strange behaviour of activemq.
>
> We have two brokers on separate machines A and B. Storage is on a third
> machine and imported using nfsv4.
> Kahadb contains about 18 GB of journal files.
>
> Scenario 1:
> - Stop B
> - Stop A (master now)
> - Start A
>
> The start of A takes about 50 seconds.
>
> Scnario 2:
> - Stop A
> - Stop B (master now)
> - Start A
>
> The start of A takes about 5 minutes.
>
> In the slow case the log at debug level shows a big gap between adding the
> last queue (12:04) and running the ActiveMQ Journal Checkpoint Worker again
> after the queue creation (12:09).
> I have no idea what activemq is doing in the mean time but it seems to
> take several minutes.
>
> Can anyone explain this behaviour or explain what activemq does in the
> mean time? I would also be happy about pointer to the code.
> What I do not understand is why it makes a difference if the master is
> started again or if the slave is becoming master. As they both share the
> same kahadb over nfs I would expect the same startup
> behaviour.
>
> Christian
>
> -----
>
> Slow start
> 2016-02-15 12:03:01,943 | INFO  | Refreshing
> org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date
> [Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy |
> org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
> ...
> 2016-02-15 12:03:03,018 | INFO  |
> PListStore:[/path/activemq/data/A/tmp_storage] started |
> org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
> 2016-02-15 12:03:03,033 | INFO  | ignoring zero length, partially
> initialised journal data file: db-1.log number = 1 , length = 0 |
> org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2016-02-15 12:03:03,177 | INFO  | JobSchedulerStore:path/scheduler started
> | org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl | main
> 2016-02-15 12:03:03,180 | INFO  | JobScheduler using directory:
> path/scheduler | org.apache.activemq.broker.BrokerService | main
> 2016-02-15 12:03:03,335 | INFO  | Using Persistence Adapter:
> KahaDBPersistenceAdapter[/path] | org.apache.activemq.broker.BrokerService
> | main
> 2016-02-15 12:03:03,350 | INFO  | JMX consoles can connect to
> service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi |
> org.apache.activemq.broker.jmx.ManagementContext | JMX connector
> 2016-02-15 12:03:05,198 | INFO  | Corrupt journal records found in
> 'path/db-1122.log' between offsets: 17893102..17893732 |
> org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2016-02-15 12:03:47,925 | INFO  | KahaDB is version 5 |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> ....
> 2016-02-15 12:03:49,233 | INFO  | Recovering from the journal ... |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> 2016-02-15 12:03:49,237 | INFO  | Recovery replayed 20 operations from the
> journal in 0.094 seconds. |
> org.apache.activemq.store.kahadb.MessageDatabase | main
> 2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:04:26,940 | INFO  | Apache ActiveMQ 5.10.2 (A,
> ID:A-59167-1455534183070-1:1) is starting |
> org.apache.activemq.broker.BrokerService | main
> ...
> 2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ |
> org.apache.activemq.broker.region.AbstractRegion | main
> 2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages .. |
> org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
> 2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0,
> pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0,
> dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue |
> ActiveMQ Broker[A] Scheduler
> 2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. |
> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
> Checkpoint Worker
> 2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages done. |
> org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
>
>
>

--
Christian Schneider
http://www.liquid-reality.de

Open Source Architect
http://www.talend.com
Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

Tim Bain
Also, your question makes the assumption that files your broker stumbles
upon when it becomes master were written by a previous master that shutdown
cleanly, which is a rather optimistic assumption to make.
On Apr 29, 2016 7:01 AM, "Tim Bain" <[hidden email]> wrote:

> By reading the code (
> http://www.grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-kahadb-store/5.11.1/org/apache/activemq/store/kahadb/disk/journal/Journal.java#Journal.recoveryCheck%28org.apache.activemq.store.kahadb.disk.journal.DataFile%29),
> I found that we're reading the data files to confirm that they're not
> corrupted.  That's the beauty of an open-source project: all of the source
> code is available, so you can read it yourself and see what's being done.
>
> You should confirm that your OS will properly invalidate its cache entry
> for a given file when that file changes; having KahaDB read stale cache
> entries would be a very bad thing.
>
> Tim
> We have found the main cause for this issue. Current Master and new Master
> seem to do the exact same things on startup.
> They read all the journals.
>
> It seems that the OS is caching the files in memory in some cases.
> The master seems to have all the jorunal files in the cache. So when it
> restarts it is fast.
> The new master seems to not have the journals in cache so they have to be
> transfered over the network (~10MB/s) in our case. So this
> causes the difference in startup time.
>
> One big question remains:
> Why are the whole journals read? Shouldn't a master that is shut down in a
> controlled way leave a clean state, so on startup only the meta data index
> is read?
>
> Christian
>
> On 15.02.2016 13:32, Christian Schneider wrote:
>
>> At a customer we are experiencing a very strange behaviour of activemq.
>>
>> We have two brokers on separate machines A and B. Storage is on a third
>> machine and imported using nfsv4.
>> Kahadb contains about 18 GB of journal files.
>>
>> Scenario 1:
>> - Stop B
>> - Stop A (master now)
>> - Start A
>>
>> The start of A takes about 50 seconds.
>>
>> Scnario 2:
>> - Stop A
>> - Stop B (master now)
>> - Start A
>>
>> The start of A takes about 5 minutes.
>>
>> In the slow case the log at debug level shows a big gap between adding
>> the last queue (12:04) and running the ActiveMQ Journal Checkpoint Worker
>> again after the queue creation (12:09).
>> I have no idea what activemq is doing in the mean time but it seems to
>> take several minutes.
>>
>> Can anyone explain this behaviour or explain what activemq does in the
>> mean time? I would also be happy about pointer to the code.
>> What I do not understand is why it makes a difference if the master is
>> started again or if the slave is becoming master. As they both share the
>> same kahadb over nfs I would expect the same startup
>> behaviour.
>>
>> Christian
>>
>> -----
>>
>> Slow start
>> 2016-02-15 12:03:01,943 | INFO  | Refreshing
>> org.apache.activemq.xbean.XBeanBrokerFactory$1@17b14695: startup date
>> [Mon Feb 15 12:03:01 CET 2016]; root of context hierarchy |
>> org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
>> ...
>> 2016-02-15 12:03:03,018 | INFO  |
>> PListStore:[/path/activemq/data/A/tmp_storage] started |
>> org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
>> 2016-02-15 12:03:03,033 | INFO  | ignoring zero length, partially
>> initialised journal data file: db-1.log number = 1 , length = 0 |
>> org.apache.activemq.store.kahadb.disk.journal.Journal | main
>> 2016-02-15 12:03:03,177 | INFO  | JobSchedulerStore:path/scheduler
>> started | org.apache.activemq.store.kahadb.scheduler.JobSchedulerStoreImpl
>> | main
>> 2016-02-15 12:03:03,180 | INFO  | JobScheduler using directory:
>> path/scheduler | org.apache.activemq.broker.BrokerService | main
>> 2016-02-15 12:03:03,335 | INFO  | Using Persistence Adapter:
>> KahaDBPersistenceAdapter[/path] | org.apache.activemq.broker.BrokerService
>> | main
>> 2016-02-15 12:03:03,350 | INFO  | JMX consoles can connect to
>> service:jmx:rmi:///jndi/rmi://localhost:1617/jmxrmi |
>> org.apache.activemq.broker.jmx.ManagementContext | JMX connector
>> 2016-02-15 12:03:05,198 | INFO  | Corrupt journal records found in
>> 'path/db-1122.log' between offsets: 17893102..17893732 |
>> org.apache.activemq.store.kahadb.disk.journal.Journal | main
>> 2016-02-15 12:03:47,925 | INFO  | KahaDB is version 5 |
>> org.apache.activemq.store.kahadb.MessageDatabase | main
>> ....
>> 2016-02-15 12:03:49,233 | INFO  | Recovering from the journal ... |
>> org.apache.activemq.store.kahadb.MessageDatabase | main
>> 2016-02-15 12:03:49,237 | INFO  | Recovery replayed 20 operations from
>> the journal in 0.094 seconds. |
>> org.apache.activemq.store.kahadb.MessageDatabase | main
>> 2016-02-15 12:04:26,863 | DEBUG | Checkpoint started. |
>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>> Checkpoint Worker
>> 2016-02-15 12:04:26,932 | DEBUG | Checkpoint done. |
>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>> Checkpoint Worker
>> 2016-02-15 12:04:26,940 | INFO  | Apache ActiveMQ 5.10.2 (A,
>> ID:A-59167-1455534183070-1:1) is starting |
>> org.apache.activemq.broker.BrokerService | main
>> ...
>> 2016-02-15 12:04:30,289 | DEBUG | A adding destination: queue://q1.DLQ |
>> org.apache.activemq.broker.region.AbstractRegion | main
>> 2016-02-15 12:04:57,026 | DEBUG | queue://q1.DLQ expiring messages .. |
>> org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
>> 2016-02-15 12:04:57,029 | DEBUG | q1.DLQ toPageIn: 5, Inflight: 0,
>> pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0,
>> dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue |
>> ActiveMQ Broker[A] Scheduler
>> 2016-02-15 12:09:20,440 | DEBUG | Checkpoint started. |
>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>> Checkpoint Worker
>> 2016-02-15 12:09:20,476 | DEBUG | Checkpoint done. |
>> org.apache.activemq.store.kahadb.MessageDatabase | ActiveMQ Journal
>> Checkpoint Worker
>> 2016-02-15 12:09:20,488 | DEBUG | queue://q1.DLQ expiring messages done.
>> | org.apache.activemq.broker.region.Queue | ActiveMQ Broker[A] Scheduler
>>
>>
>>
>
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Open Source Architect
> http://www.talend.com
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

cschneider
I do not assume that the restart will be clean when the master dies.

In our case the brokers are shutdown orderly during a regular
maintenance where some of the customer software is updated on the system.
So in the case were we experienced the slow start it was a clean shutdown.

Christian

On 29.04.2016 15:03, Tim Bain wrote:
> Also, your question makes the assumption that files your broker stumbles
> upon when it becomes master were written by a previous master that shutdown
> cleanly, which is a rather optimistic assumption to make.
> On Apr 29, 2016 7:01 AM, "Tim Bain" <[hidden email]> wrote:
>

--
Christian Schneider
http://www.liquid-reality.de

Open Source Architect
http://www.talend.com

Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

Tim Bain
Sure, YOU know it was a clean shutdown, but how does your slave broker know
which type it was?

Methods to signal that fact could be added, but none currently exists in
the code.
On Apr 29, 2016 7:14 AM, "Christian Schneider" <[hidden email]>
wrote:

> I do not assume that the restart will be clean when the master dies.
>
> In our case the brokers are shutdown orderly during a regular maintenance
> where some of the customer software is updated on the system.
> So in the case were we experienced the slow start it was a clean shutdown.
>
> Christian
>
> On 29.04.2016 15:03, Tim Bain wrote:
>
>> Also, your question makes the assumption that files your broker stumbles
>> upon when it becomes master were written by a previous master that
>> shutdown
>> cleanly, which is a rather optimistic assumption to make.
>> On Apr 29, 2016 7:01 AM, "Tim Bain" <[hidden email]> wrote:
>>
>>
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Open Source Architect
> http://www.talend.com
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Failover very slow with kahadb while restart of master is fast

akhil
In reply to this post by cschneider
Hi Chris ,

How did you solve this caching issue while your brokers standing on master
slave ..?

Did you put out any settings in activemq xml to startup the broker in quick
time or making it quick failover ?

Thanks,
Akhil.



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html