[Artemis] Shared Store on NFSv4 share: Locking issue?

Benjamin Buehlmann
We have an HA setup with shared storage on an NFSv4 share. During the startup
of the second server I get the following exception and the startup process
stops:

17:32:12,900 INFO  [org.apache.activemq.artemis.integration.bootstrap] AMQ101000: Starting ActiveMQ Artemis Server
Exception in thread "main" java.io.IOException: Input/output error
        at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
        at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:741)
        at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
        at org.apache.activemq.artemis.core.server.NodeManager.createNodeId(NodeManager.java:209)
        at org.apache.activemq.artemis.core.server.NodeManager.setUpServerLockFile(NodeManager.java:195)
        at


I can reproduce the behaviour when I try to cat the file from the console:
[foo@broker bin]$ cat /data/artemis_shared_store/journal/server.lock
cat: /data/artemis_shared_store/journal/server.lock: Input/output error
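
The same check can be scripted so that it runs on each node without starting the broker. The following is only a minimal sketch: it performs a positional read (the `pread` call that appears at the top of the stack trace) and reports the error name instead of raising. The function name `probe_read` and its defaults are made up for illustration; the path is the one from this post.

```python
import errno
import os


def probe_read(path, length=16, offset=0):
    """Try one positional read on `path` and report the outcome without raising.

    `probe_read` is a hypothetical helper, not part of Artemis.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return ("open-failed", errno.errorcode.get(exc.errno, str(exc.errno)))
    try:
        data = os.pread(fd, length, offset)
        return ("ok", len(data))
    except OSError as exc:
        # EIO is the errno that the shell prints as "Input/output error".
        return ("read-failed", errno.errorcode.get(exc.errno, str(exc.errno)))
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Path taken from the post above; adjust it for your own share.
    print(probe_read("/data/artemis_shared_store/journal/server.lock"))
```

A result of `("read-failed", "EIO")` on one node and `("ok", ...)` on another would suggest that the difference lies in the NFS client or mount of the failing node rather than in the broker.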

Environment: RHEL 7.3, Artemis 2.4.0
Btw: I tried ASYNCIO and NIO and got the same result.
Btw2: The same NFSv4 share works perfectly fine for an HA setup with two
embedded Artemis instances in JBoss EAP 7.0 (Artemis 1.1.0.SP16-redhat-1).

As I understand it, NFSv4 is not recommended because of performance issues,
but it should still work without errors?



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html
Re: [Artemis] Shared Store on NFSv4 share: Locking issue?

Justin Bertram
It should work, but like you said will be slow.  Artemis 2.4.0 had a change
in the way it handles the lock file for the NFSv4 use-case [1], but the
code throwing the exception you're seeing wasn't touched.  In fact, that
bit of code hasn't had any semantic changes in years (just some style
changes).  The fact that you can reproduce the error with 'cat' indicates
to me that there's an environmental issue unrelated to Artemis.



Justin

[1] https://issues.apache.org/jira/browse/ARTEMIS-1417

(In reply to Benjamin Buehlmann's message of Thu, Nov 9, 2017 at 11:12 AM.)

Re: [Artemis] Shared Store on NFSv4 share: Locking issue?

Benjamin Buehlmann
You are right. I installed Artemis 2.4.0 on the same host where EAP 7.0 is
installed and it worked as expected, which means the problem is not on the
broker side. The only difference is the Linux version (RHEL 7.3 vs. 7.4).
Anyway, I will post a reply when we have found the problem.




Re: [Artemis] Shared Store on NFSv4 share: Locking issue?

Benjamin Buehlmann
Could the problem be related to this issue?
https://bugzilla.redhat.com/show_bug.cgi?id=1486132
I'm just wondering why it works on RHEL 7.3 but not on 7.4.




Re: [Artemis] Shared Store on NFSv4 share: Locking issue?

Justin Bertram
The Bugzilla you cited is related to the problem I mentioned previously
that Artemis had regarding the locking.  I really have no idea why your
particular problem only manifests on 7.4 vs 7.3.  That would probably need
to be addressed by an expert in the NFS community.
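
Since the Bugzilla concerns locking, one way to narrow things down before involving NFS experts is to test file locking on the share in isolation. The sketch below is an illustration only: it assumes the broker's file locks end up as POSIX byte-range locks on Linux, and `probe_lock` is a made-up name. Run it against a scratch file in the shared directory rather than against the live `server.lock`.

```python
import errno
import fcntl
import os


def probe_lock(path):
    """Take and release a non-blocking exclusive lock on the first byte of `path`.

    Hypothetical helper for testing the mount; it creates `path` if missing.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # Lock 1 byte starting at offset 0, without waiting.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
        except OSError as exc:
            # EAGAIN/EACCES: someone else holds the lock.
            # Anything else (for example EIO or ENOLCK) points at the mount.
            return ("lock-failed", errno.errorcode.get(exc.errno, str(exc.errno)))
        fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0, os.SEEK_SET)
        return ("ok", None)
    finally:
        os.close(fd)
```

Calling `probe_lock` with a scratch path on the share from both nodes in turn should return `("ok", None)` each time if locking works; a differing result on the RHEL 7.4 node would be useful detail to include in a report to the NFS maintainers.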


Justin

(In reply to Benjamin Buehlmann's message of Mon, Nov 13, 2017 at 6:16 AM.)

Re: [Artemis] Shared Store on NFSv4 share: Locking issue?

Benjamin Buehlmann
Could a faulty cluster configuration lead to this problem?




Re: [Artemis] Shared Store on NFSv4 share: Locking issue?

Justin Bertram
In terms of shared-storage there's really not much to configure.  You
simply point both nodes at the same journal.  I don't see how this issue
could be caused by a faulty configuration, especially given that you see
the problem outside of Artemis as well when simply using 'cat'.


Justin

(In reply to Benjamin Buehlmann's message of Tue, Nov 14, 2017 at 1:58 AM.)

Re: [Artemis] Shared Store on NFSv4 share: Locking issue?

Benjamin Buehlmann
We could not find a solution within two days and finally switched to
replication as our HA strategy.


