Reliability issues with Apollo?

rjrizzuto
I am testing Apollo 1.7.1 on Java 1.8.0_45 under Windows 7, using Apache.NMS with a .NET producer app and a .NET consumer app. The producer publishes to the topic FOO.BAR, and the consumer subscribes through the durable subscription rizzuto:test. Here is the virtual host configuration from apollo.xml:

  <virtual_host id="RaysBroker" purge_on_startup="true">
    <host_name>RaysBroker</host_name>
    <host_name>localhost</host_name>
    <host_name>127.0.0.1</host_name>

    <access_rule allow="users" action="connect create destroy send receive consume"/>

    <leveldb_store directory="${apollo.base}/data"/>

    <topic id="FOO.BAR" auto_delete_after="0"/>

    <dsub id="*:test" topic="FOO.BAR" quota="2gb" full_policy="drop head"/>
  </virtual_host>

With both the producer and the consumer running, I see no issues. I have run them that way for about an hour, sending ~20K msg/sec on a PC with a 1st gen i7.

If I kill and restart the consumer, it resumes from where it left off. After a while, however, it stops receiving messages, and I start getting various warnings on the broker console. I have seen all of the following at different times:

WARN  | Could not snapshot the index: java.io.FileNotFoundException: c:\Users\rizzuto\Downloads\RaysBroker\data\dirty.index\000097.log (The system cannot find the file specified)
WARN  | Could not snapshot the index: java.io.IOException: link failed
WARN  | DB operation failed. (entering recovery mode): java.io.EOFException: File 'c:\Users\rizzuto\Downloads\RaysBroker\data\0000000000000000.log' offset: 1865684455
WARN  | java.lang.Error: java.io.EOFException: File 'c:\Users\rizzuto\Downloads\RaysBroker\data\0000000000000000.log' offset: 1871425261
WARN  | java.lang.AssertionError: Node is not linked
WARN  | Queue 'rizzuto:test' detected store dropped message at seq: 734760

My durable subscription has a 2gb quota and is configured to drop messages from the head when the quota is reached.
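
To make the expected behavior concrete, here is a minimal sketch of what "drop head" is assumed to mean: once the quota is reached, the oldest messages are discarded so that new ones fit. It is a toy in-memory model in Python, not Apollo's implementation; the class name `DropHeadQueue` and the 10-byte quota are made up for illustration.

```python
from collections import deque


class DropHeadQueue:
    """Toy model of a byte-quota queue that evicts the oldest entries when full.

    Hypothetical illustration only; this is not how Apollo stores messages.
    """

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.used_bytes = 0
        self.dropped = 0
        self._items = deque()

    def enqueue(self, payload):
        # A message larger than the whole quota can never fit, so drop it outright.
        if len(payload) > self.quota_bytes:
            self.dropped += 1
            return
        # Evict from the head (oldest first) until the new message fits.
        while self.used_bytes + len(payload) > self.quota_bytes:
            oldest = self._items.popleft()
            self.used_bytes -= len(oldest)
            self.dropped += 1
        self._items.append(payload)
        self.used_bytes += len(payload)

    def dequeue(self):
        # Return the oldest retained message, or None when the queue is empty.
        if not self._items:
            return None
        payload = self._items.popleft()
        self.used_bytes -= len(payload)
        return payload


q = DropHeadQueue(quota_bytes=10)  # tiny quota so the eviction is easy to see
for i in range(5):
    q.enqueue(b"msg%d" % i)  # each payload is 4 bytes

print(q.dropped)    # 3 (msg0, msg1 and msg2 were evicted)
print(q.dequeue())  # b'msg3'
```

Under this model a slow consumer only ever loses the oldest messages; the queue itself stays consistent, which is the behavior expected from the broker here.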

Any suggestions on why this is happening or how to debug it?

Re: Reliability issues with Apollo?

rjrizzuto
Also I get:

WARN  | java.lang.NullPointerException

I think all of these issues happen when the producer is producing faster than the consumer can consume. Initially the consumer keeps up, but after I kill and restart it, it slows down, likely because it is then consuming from the persistent files, which is slower.
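
A rough back-of-the-envelope sketch of how quickly such a backlog could reach the quota. The 20K msg/sec rate and the 2gb quota come from this thread; the consumer rate, the 200-byte message size, and reading "2gb" as 2 * 1024^3 bytes are assumptions chosen only for illustration.

```python
def seconds_until_quota_full(produce_rate, consume_rate, msg_bytes, quota_bytes):
    """Return the seconds until the backlog reaches the quota, or None if it never does."""
    growth = (produce_rate - consume_rate) * msg_bytes  # backlog growth in bytes/sec
    if growth <= 0:
        return None  # the consumer keeps up, so the backlog does not grow
    return quota_bytes / growth


QUOTA_BYTES = 2 * 1024**3  # assumed meaning of quota="2gb"

# Assumed numbers: consumer at 15K msg/sec after restart, 200-byte messages.
t = seconds_until_quota_full(20_000, 15_000, 200, QUOTA_BYTES)
print(f"{t / 60:.1f} minutes")  # 35.8 minutes
```

With these assumed numbers the quota would be reached after roughly half an hour, at which point the "drop head" policy would start discarding messages.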

For my application, I am fine with losing older messages, which is why I have the "drop head" policy.  Having the broker become unreliable is a big issue, however.