replicatedLevelDB errors after failover

kal123
I am using the latest changes from trunk and I see the following error after failover (is the store still getting corrupted?)

2013-11-15 06:44:53,606 | INFO  | jolokia-agent: Using access restrictor classpath:/jolokia-access.xml | /hawtio | main
2013-11-15 06:44:53,743 | INFO  | ActiveMQ WebConsole available at http://localhost:8161/ | org.apache.activemq.web.WebConsoleStarter | main
2013-11-15 06:44:53,754 | INFO  | Initializing Spring FrameworkServlet 'dispatcher' | /admin | main
2013-11-15 06:44:56,969 | INFO  | Stopping BrokerService[largeamq] due to exception, java.io.IOException | org.apache.activemq.util.DefaultIOExceptionHandler | LevelDB IOException handler.
java.io.IOException
        at org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:39)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:554)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail_using_index(LevelDBClient.scala:1021)
        at org.apache.activemq.leveldb.LevelDBClient.collectionCursor(LevelDBClient.scala:1320)
        at org.apache.activemq.leveldb.LevelDBClient.queueCursor(LevelDBClient.scala:1244)
        at org.apache.activemq.leveldb.DBManager.cursorMessages(DBManager.scala:708)
        at org.apache.activemq.leveldb.LevelDBStore$LevelDBMessageStore.recoverNextMessages(LevelDBStore.scala:756)
        at org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:106)
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:258)
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.hasNext(AbstractStoreCursor.java:145)
        at org.apache.activemq.broker.region.cursors.StoreQueueCursor.hasNext(StoreQueueCursor.java:131)
        at org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1876)
        at org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:2086)
        at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1581)
        at org.apache.activemq.broker.region.Queue.wakeup(Queue.java:1803)
        at org.apache.activemq.broker.region.Queue.addSubscription(Queue.java:464)
        at org.apache.activemq.broker.region.AbstractRegion.addConsumer(AbstractRegion.java:314)
        at org.apache.activemq.broker.region.RegionBroker.addConsumer(RegionBroker.java:400)
        at org.apache.activemq.broker.jmx.ManagedRegionBroker.addConsumer(ManagedRegionBroker.java:230)
        at org.apache.activemq.broker.BrokerFilter.addConsumer(BrokerFilter.java:97)
        at org.apache.activemq.advisory.AdvisoryBroker.addConsumer(AdvisoryBroker.java:102)
        at org.apache.activemq.broker.BrokerFilter.addConsumer(BrokerFilter.java:97)
        at org.apache.activemq.broker.BrokerFilter.addConsumer(BrokerFilter.java:97)
        at org.apache.activemq.broker.MutableBrokerFilter.addConsumer(MutableBrokerFilter.java:102)
        at org.apache.activemq.broker.TransportConnection.processAddConsumer(TransportConnection.java:587)
        at org.apache.activemq.command.ConsumerInfo.visit(ConsumerInfo.java:349)
        at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)

Re: replicatedLevelDB errors after failover

kal123
The issue seems to happen when the master broker that is being stopped (to cause the failover) generates the following exception.
Is the store getting corrupted because of the way the stop is handled?

2013-11-15 09:26:52,881 | WARN  | Transport Connection to: tcp://10.44.173.146:47163 failed: java.io.IOException: Connection reset by peer | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ NIO Worker 3
2013-11-15 09:27:04,959 | INFO  | Apache ActiveMQ 5.10-SNAPSHOT (largeamq, ID:dl360x2832-38143-1384525531423-0:1) is shutting down | org.apache.activemq.broker.BrokerService | ActiveMQ ShutdownHook
2013-11-15 09:27:04,982 | INFO  | Stopping BrokerService[largeamq] due to exception, java.io.IOException: Store replication stopped | org.apache.activemq.util.DefaultIOExceptionHandler | LevelDB IOException handler.
java.io.IOException: Store replication stopped
        at org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:39)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:554)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail_using_index(LevelDBClient.scala:1021)
        at org.apache.activemq.leveldb.LevelDBClient.store(LevelDBClient.scala:1352)
        at org.apache.activemq.leveldb.DBManager$$anonfun$drainFlushes$1.apply$mcV$sp(DBManager.scala:600)
        at org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:357)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.IllegalStateException: Store replication stopped
        at org.apache.activemq.leveldb.replicated.MasterLevelDBStore.wal_sync_to(MasterLevelDBStore.scala:420)
        at org.apache.activemq.leveldb.replicated.MasterLevelDBClient$$anon$2$$anon$1.force(MasterLevelDBClient.scala:147)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$store$1$$anonfun$apply$mcV$sp$14.apply(LevelDBClient.scala:1358)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$store$1$$anonfun$apply$mcV$sp$14.apply(LevelDBClient.scala:1353)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$appender$1.apply(RecordLog.scala:479)
        at org.apache.activemq.leveldb.util.TimeMetric.apply(TimeMetric.scala:43)
        at org.apache.activemq.leveldb.RecordLog.appender(RecordLog.scala:478)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$store$1.apply$mcV$sp(LevelDBClient.scala:1353)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$store$1.apply(LevelDBClient.scala:1352)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$store$1.apply(LevelDBClient.scala:1352)
        at org.apache.activemq.leveldb.LevelDBClient.usingIndex(LevelDBClient.scala:1015)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$might_fail_using_index$1.apply(LevelDBClient.scala:1021)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:551)
        ... 7 more
2013-11-15 09:27:05,011 | INFO  | Stopped LevelDB[/solidstate/ldb] | org.apache.activemq.leveldb.LevelDBStore | LevelDB IOException handler.
2013-11-15 09:27:06,879 | INFO  | Connector openwire stopped | org.apache.activemq.broker.TransportConnector | ActiveMQ ShutdownHook
2013-11-15 09:27:07,768 | INFO  | Connector amqp stopped | org.apache.activemq.broker.TransportConnector | ActiveMQ ShutdownHook

Re: replicatedLevelDB errors after failover

chirino
In reply to this post by kal123
Hi, that exception should have also printed out a cause.  Could you
include that too?  If you built it yourself, which git commit did you
build?  If you downloaded a snapshot, what's the file name of the
download?

--
Hiram Chirino

Engineering | Red Hat, Inc.

[hidden email] | fusesource.com | redhat.com

skype: hiramchirino | twitter: @hiramchirino

blog: Hiram Chirino's Bit Mojo

Re: replicatedLevelDB errors after failover

kal123
I believe there were no other errors prior to that exception. I will try to reproduce it and update you.

I downloaded the source from git yesterday. The commit I built (parent: e57aeb3):
See AMQ-4886.  Updated tearDown so it can't hang, reduced timeouts, updated to JUnit4




Re: replicatedLevelDB errors after failover

kal123

Today, after the exception occurred, I updated the activemq start script to redirect output to a file instead of /dev/null, and at startup I see the following error.  Does this help?

INFO | Attaching... Downloaded 295.12/295.12 kb and 4/4 files
 INFO | Attached
java.io.IOException: invalid record position 314575339 (file: /solidstate/ldb/000000000c80071d.log, offset: 104858318)
        at org.apache.activemq.leveldb.RecordLog$LogReader.read(RecordLog.scala:316)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$read$1.apply(RecordLog.scala:560)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$read$1.apply(RecordLog.scala:560)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$get_reader$1.apply(RecordLog.scala:552)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$get_reader$1.apply(RecordLog.scala:534)
        at scala.Option.map(Option.scala:133)
        at org.apache.activemq.leveldb.RecordLog.get_reader(RecordLog.scala:534)
        at org.apache.activemq.leveldb.RecordLog.read(RecordLog.scala:560)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_from$1.apply$mcV$sp(LevelDBClient.scala:734)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_from$1.apply(LevelDBClient.scala:701)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_from$1.apply(LevelDBClient.scala:701)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:551)
        at org.apache.activemq.leveldb.LevelDBClient.replay_from(LevelDBClient.scala:700)
        at org.apache.activemq.leveldb.replicated.SlaveLevelDBStore$$anonfun$send_wal_ack$1.apply$mcV$sp(SlaveLevelDBStore.scala:175)
        at org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:357)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java
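
The numbers in the "invalid record position" message can be cross-checked. In the messages posted in this thread, the log file name read as a hexadecimal number, plus the reported offset, equals the reported position. The sketch below encodes that reading; it is an inference from the values shown here, not a documented format, and the function name `check_invalid_record` is made up for illustration.

```python
import posixpath
import re

# Matches messages such as:
#   invalid record position 314575339 (file: /solidstate/ldb/000000000c80071d.log, offset: 104858318)
MESSAGE = re.compile(r"invalid record position (\d+) \(file: (\S+), offset: (\d+)\)")


def check_invalid_record(message):
    """Return (file_start, offset, position, consistent).

    Assumption: the log file name is the hexadecimal position at which
    that file starts, so file_start + offset should equal position.
    """
    match = MESSAGE.search(message)
    if match is None:
        raise ValueError("not an 'invalid record position' message")
    position = int(match.group(1))
    path = match.group(2)
    offset = int(match.group(3))
    # Strip the directory and the ".log" suffix, then parse the name as hex.
    name = posixpath.splitext(posixpath.basename(path))[0]
    file_start = int(name, 16)
    return file_start, offset, position, file_start + offset == position


if __name__ == "__main__":
    line = ("java.io.IOException: invalid record position 314575339 "
            "(file: /solidstate/ldb/000000000c80071d.log, offset: 104858318)")
    print(check_invalid_record(line))
```

If the last element is True, the position and offset agree with each other, which suggests the reader was pointed at an offset where no valid record starts in that file rather than at a miscomputed position.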

Re: replicatedLevelDB errors after failover

chirino
Yep, that helps a little.  So that seems like an error on the slave.
Were there any errors on the master?


Re: replicatedLevelDB errors after failover

kal123
Ran the failover test again, and after about 25 failovers I got the errors below on the master. I did not see any other errors.  I will try to run with debug logging turned on next week to see if it helps with debugging.

Note: even after this error I was able to do a few more failovers before things stopped working properly.
After a few more failovers, the cluster kept failing over automatically and continuously, and the producer and consumer stopped working.



ACTIVEMQ_DATA: /activemq_5_10/apache-activemq-5.10-SNAPSHOT/data
Loading message broker from: xbean:activemq.xml
java.io.IOException: invalid record position 1809753339 (file: /solidstate/ldb/000000006a404a30.log, offset: 27155147)
        at org.apache.activemq.leveldb.RecordLog$LogReader.read(RecordLog.scala:316)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$read$1.apply(RecordLog.scala:560)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$read$1.apply(RecordLog.scala:560)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$get_reader$1.apply(RecordLog.scala:552)
        at org.apache.activemq.leveldb.RecordLog$$anonfun$get_reader$1.apply(RecordLog.scala:534)
        at scala.Option.map(Option.scala:133)
        at org.apache.activemq.leveldb.RecordLog.get_reader(RecordLog.scala:534)
        at org.apache.activemq.leveldb.RecordLog.read(RecordLog.scala:560)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_from$1.apply$mcV$sp(LevelDBClient.scala:734)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_from$1.apply(LevelDBClient.scala:701)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_from$1.apply(LevelDBClient.scala:701)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:551)
        at org.apache.activemq.leveldb.LevelDBClient.replay_from(LevelDBClient.scala:700)
        at org.apache.activemq.leveldb.replicated.SlaveLevelDBStore$$anonfun$send_wal_ack$1.apply$mcV$sp(SlaveLevelDBStore.scala:175)
        at org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:357)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
Restarting broker
Loading message broker from: xbean:activemq.xml


I also see the following during failover:
2013-11-22 11:55:32,192 | INFO  | Stopping BrokerService[largeamq] due to exception, java.io.IOException | org.apache.activemq.util.DefaultIOExceptionHandler | LevelDB IOException handler.
java.io.IOException
        at org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:39)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:554)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail_using_index(LevelDBClient.scala:1021)
        at org.apache.activemq.leveldb.LevelDBClient.collectionCursor(LevelDBClient.scala:1320)
        at org.apache.activemq.leveldb.LevelDBClient.queueCursor(LevelDBClient.scala:1244)
        at org.apache.activemq.leveldb.DBManager.cursorMessages(DBManager.scala:708)
        at org.apache.activemq.leveldb.LevelDBStore$LevelDBMessageStore.recoverNextMessages(LevelDBStore.scala:756)
        at org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:106)
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:258)
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.hasNext(AbstractStoreCursor.java:145)
        at org.apache.activemq.broker.region.cursors.StoreQueueCursor.hasNext(StoreQueueCursor.java:131)
        at org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1876)
        at org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:2086)
        at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1581)
        at org.apache.activemq.broker.region.Queue.wakeup(Queue.java:1803)
        at org.apache.activemq.broker.region.PrefetchSubscription.acknowledge(PrefetchSubscription.java:409)
        at org.apache.activemq.broker.region.AbstractRegion.acknowledge(AbstractRegion.java:412)
        at org.apache.activemq.broker.region.RegionBroker.acknowledge(RegionBroker.java:457)
        at org.apache.activemq.broker.BrokerFilter.acknowledge(BrokerFilter.java:82)
        at org.apache.activemq.broker.BrokerFilter.acknowledge(BrokerFilter.java:82)
        at org.apache.activemq.broker.TransactionBroker.acknowledge(TransactionBroker.java:277)
        at org.apache.activemq.broker.MutableBrokerFilter.acknowledge(MutableBrokerFilter.java:92)
        at org.apache.activemq.broker.TransportConnection.processMessageAck(TransportConnection.java:476)
        at org.apache.activemq.command.MessageAck.visit(MessageAck.java:236)
        at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)
        at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:149)
        at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
        at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
        at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:270)
        at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
        at org.apache.activemq.transport.nio.NIOTransport.serviceRead(NIOTransport.java:138)
        at org.apache.activemq.transport.nio.NIOTransport$1.onSelect(NIOTransport.java:69)
        at org.apache.activemq.transport.nio.SelectorSelection.onSelect(SelectorSelection.java:94)
        at org.apache.activemq.transport.nio.SelectorWorker$1.run(SelectorWorker.java:119)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.NullPointerException
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$queueCursor$1.apply(LevelDBClient.scala:1248)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$queueCursor$1.apply(LevelDBClient.scala:1244)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1$$anonfun$apply$mcV$sp$12.apply(LevelDBClient.scala:1322)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1$$anonfun$apply$mcV$sp$12.apply(LevelDBClient.scala:1321)
        at org.apache.activemq.leveldb.LevelDBClient$RichDB.check$4(LevelDBClient.scala:326)
        at org.apache.activemq.leveldb.LevelDBClient$RichDB.cursorRange(LevelDBClient.scala:328)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1.apply$mcV$sp(LevelDBClient.scala:1321)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1.apply(LevelDBClient.scala:1321)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$collectionCursor$1.apply(LevelDBClient.scala:1321)
        at org.apache.activemq.leveldb.LevelDBClient.usingIndex(LevelDBClient.scala:1015)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$might_fail_using_index$1.apply(LevelDBClient.scala:1021)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:551)



Re: replicatedLevelDB errors after failover

kal123
Tried testing with the Dec. 5 snapshot and I am getting the following errors after about 10 failovers:

2013-12-06 12:06:39,673 | WARN  | Could not load message seq: 45760 from DataLocator(630f8b1, 2262) | org.apache.activemq.leveldb.LevelDBClient | ActiveMQ NIO Worker 2
2013-12-06 12:06:39,673 | WARN  | No reader available for position: 63101c1, log_infos: {104859611=LogInfo(/solidstate/ldb/00000000064007db.log,104859611,104859172), 209718783=LogInfo(/solidstate/ldb/000000000c800dff.log,209718783,104859220), 314578003=LogInfo(/solidstate/ldb/0000000012c01453.log,314578003,104858557), 419436560=LogInfo(/solidstate/ldb/0000000019001810.log,419436560,104858088), 524294648=LogInfo(/solidstate/ldb/000000001f4019f8.log,524294648,0)} | org.apache.activemq.leveldb.RecordLog | ActiveMQ NIO Worker 2
2013-12-06 12:06:39,673 | WARN  | Could not load message seq: 45761 from DataLocator(63101c1, 2262) | org.apache.activemq.leveldb.LevelDBClient | ActiveMQ NIO Worker 2
2013-12-06 12:06:39,673 | WARN  | No reader available for position: 6310ad1, log_infos: {104859611=LogInfo(/solidstate/ldb/00000000064007db.log,104859611,104859172), 209718783=LogInfo(/solidstate/ldb/000000000c800dff.log,209718783,104859220), 314578003=LogInfo(/solidstate/ldb/0000000012c01453.log,314578003,104858557), 419436560=LogInfo(/solidstate/ldb/0000000019001810.log,419436560,104858088), 524294648=LogInfo(/solidstate/ldb/000000001f4019f8.log,524294648,0)} | org.apache.activemq.leveldb.RecordLog | ActiveMQ NIO Worker 2
2013-12-06 12:06:39,673 | WARN  | Could not load message seq: 45762 from DataLocator(6310ad1, 2262) | org.apache.activemq.leveldb.LevelDBClient | ActiveMQ NIO Worker 2
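
One way to read these warnings is to compare each failing position against the ranges listed in log_infos. The sketch below assumes that each entry is LogInfo(path, start, length) with decimal values, that the position in the warning is hexadecimal, and that the last file (length 0) is the one still being appended to. Those readings are inferred from the numbers in this post (each start plus length equals the next start); the helper names are made up for illustration.

```python
import re

# The log_infos map copied from the warnings above.
WARNING_LOG_INFOS = (
    "{104859611=LogInfo(/solidstate/ldb/00000000064007db.log,104859611,104859172), "
    "209718783=LogInfo(/solidstate/ldb/000000000c800dff.log,209718783,104859220), "
    "314578003=LogInfo(/solidstate/ldb/0000000012c01453.log,314578003,104858557), "
    "419436560=LogInfo(/solidstate/ldb/0000000019001810.log,419436560,104858088), "
    "524294648=LogInfo(/solidstate/ldb/000000001f4019f8.log,524294648,0)}"
)

LOG_INFO = re.compile(r"LogInfo\(([^,]+),(\d+),(\d+)\)")


def parse_log_infos(text):
    """Return (start, length, path) tuples sorted by start position."""
    return sorted(
        (int(start), int(length), path)
        for path, start, length in LOG_INFO.findall(text)
    )


def find_log(position, infos):
    """Return the path of the log file covering position, or None.

    Assumption: the last file is open-ended because it is still being
    written, so its reported length is not used as an upper bound.
    """
    for index, (start, length, path) in enumerate(infos):
        is_last = index == len(infos) - 1
        if position >= start and (is_last or position < start + length):
            return path
    return None


if __name__ == "__main__":
    infos = parse_log_infos(WARNING_LOG_INFOS)
    for hex_position in ("630f8b1", "63101c1", "6310ad1"):
        position = int(hex_position, 16)
        print(hex_position, position, find_log(position, infos))
```

Under these assumptions all three positions fall below 104859611, the start of the oldest listed log file, so no listed file covers them. That would be consistent with the index still referring to a log file that is no longer present on this node.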

Re: replicatedLevelDB errors after failover

chirino
I suspect those 'Could not load message seq' warnings can be ignored.
Can you verify that there has been no message loss?
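
A simple way to check for loss is to have the test producer record an identifier for every message it sends, have the consumer record every identifier it receives, and compare the two lists after the failover run. The sketch below is only an illustration of that comparison: the identifiers and the function name `compare_ids` are hypothetical, and it is not part of ActiveMQ.

```python
from collections import Counter


def compare_ids(sent_ids, received_ids):
    """Compare producer and consumer records of message identifiers.

    Returns three sorted lists:
      missing    - sent but never received (possible message loss)
      duplicates - received more than once (redelivery after failover)
      unexpected - received but never recorded as sent
    """
    sent = set(sent_ids)
    received = Counter(received_ids)
    missing = sorted(sent - set(received))
    duplicates = sorted(i for i, count in received.items() if count > 1)
    unexpected = sorted(set(received) - sent)
    return missing, duplicates, unexpected


if __name__ == "__main__":
    # Hypothetical identifiers recorded by a test producer and consumer.
    sent = ["m1", "m2", "m3", "m4"]
    received = ["m1", "m2", "m2", "m4"]
    missing, duplicates, unexpected = compare_ids(sent, received)
    print("missing:", missing)
    print("duplicates:", duplicates)
    print("unexpected:", unexpected)
```

Duplicates are expected to some degree when a client reconnects after a failover, so the list that matters for this question is the missing one.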


Re: replicatedLevelDB errors after failover

kal123
The LevelDB store seems to be corrupted after 10 failovers: the broker was not able to load records from LevelDB. This is similar to the issue described in my earlier post, but now I see these additional error messages.

Re: replicatedLevelDB errors after failover

mtod
Was this ever resolved?

I'm testing a LevelDB cluster and was performing some failovers, and I now have the same problem.

 Activemq log