[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
28 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

asfgit
GitHub user franz1981 opened a pull request:

    https://github.com/apache/activemq-artemis/pull/1638

    ARTEMIS-1499 ArtemisExecutor and ProcessorBase needs support of recursive blocking executions

    Submitting a CountDownLatch::countDown task and CountDownLatch::await right after in the event loop will block ArtemisExecutor.
    This fix provide:
     - a way to validate blocking actions (ie failing flushes on event loop)
     - thread local sequential execution for task submissions while on event loop (ie execute).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/franz1981/activemq-artemis recursive_ordered_executor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/activemq-artemis/pull/1638.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1638
   
----
commit ee7c5db9c1f7a72fd29b791dfce4c7fba3b17543
Author: Francesco Nigro <[hidden email]>
Date:   2017-11-03T13:40:25Z

    ARTEMIS-1499 ArtemisExecutor and ProcessorBase needs support of recursive blocking executions
   
    Submitting a CountDownLatch::countDown task and CountDownLatch::await right after in the event loop will block ArtemisExecutor.
    This fix provide:
     - a way to validate blocking actions (ie failing flushes on event loop)
     - thread local sequential execution for task submissions while on event loop (ie execute).

----


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @gemmellr @clebertsuconic I'm not very used to "even loop" patterns hence I call for help :)
   
    I've found that our `ArtemisExecutor` deadlock with the (fresh new) test `shouldSubmitBlockingTasksFromEventLoopMaintainingOrder` TL;DR if a `CountDownLatch::await` is blocking the event loop while waiting a submitted `CoundDownLatch::countDown`.
    Enforcing any code rule to avoid these patterns to be used it not simple, hence I've provided a fix to it.
    Just as a note, the same pattern (verified) blocks the Netty's `EventLoop` too.
    This fix come at the cost of a potential bigger calls stack, but providing a good boost for "in event loop" executions too, but I'm not 100% to have messed up with others things.
    Wdyt?



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @clebertsuconic Re `ArtemisExecutor::flush` I've provided the most defensive/conservative way to manage it, but I'm sure that it can be solved (like `execute`) in a more permissive and simpler way if called inside the `event loop`.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148796005
 
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -109,12 +125,49 @@ public final boolean isFlushed() {
           return stateUpdater.get(this) == STATE_NOT_RUNNING;
        }
     
    -   protected void task(T command) {
    --- End diff --
   
    What about just fix the offending flush than add this complex piece?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148796392
 
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -109,12 +125,49 @@ public final boolean isFlushed() {
           return stateUpdater.get(this) == STATE_NOT_RUNNING;
        }
     
    -   protected void task(T command) {
    -      tasks.add(command);
    --- End diff --
   
    -1.... lets just fix the offending part?


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

asfgit
In reply to this post by asfgit
Github user franz1981 commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148800347
 
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -109,12 +125,49 @@ public final boolean isFlushed() {
           return stateUpdater.get(this) == STATE_NOT_RUNNING;
        }
     
    -   protected void task(T command) {
    -      tasks.add(command);
    --- End diff --
   
    It is not only the flush that has issues but the execute too: take a look at the tests I've added :)


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    throwing an executor call, and waiting for a .wait is clearly a bug...
   
    Adding this complex code on such a hot path has more risks in itself than eventually fixing issues.
   
    I wouldn't merge this PR


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @franz1981  shouldSubmitBlockingTasksFromEventLoopMaintainingOrder is a buggy code... it's an anti-pattern we shouldn't have in our code.
   
    Sometimes we must do such thing, but then we must make sure we are using proper executions.. or proper callbacks (even better)


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    i understand it and I agree, but please reconsider it with tests and a longer review: as I've said we can't prohibit that kind of behaviours with ease and if it works (as I've tested/expected) it could make everything more stable/solid.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148801546
 
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -109,12 +125,49 @@ public final boolean isFlushed() {
           return stateUpdater.get(this) == STATE_NOT_RUNNING;
        }
     
    -   protected void task(T command) {
    -      tasks.add(command);
    --- End diff --
   
    you've added tests that are simulating bugs at the code.. I would rather fix bugs than protect the executor from them.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @franz1981 I disagree.. it would hide bugs.. which could end up in being more dangerous.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user andytaylor commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    Clebert is correct, lets fix rather than re engineer


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @andytaylor @clebertsuconic Got it :+1:
    I will refactor the current PR handling only the fix on `flush` while leaving the improvement for recursive executions to be proposed in a different PR :)


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @clebertsuconic Just a couple questions (so I can add some doc/tests):
   
    1. it is allowed/legal to submit new tasks` from the event loop thread?
    2. it is allowed call `flush` from the event loop thread? What is the expected behaviour while doing it?  


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @franz1981 what fix on flush? where do you see it being used.
   
   
    1. it is legal to submit tasks...
    2. you shouldn't call flush within a thread runnable. That's broken behaviour.
   
    Instead of fixing what's not broken... just remove any calls to flush within executors.. I don't think there's any now.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @franz1981 changing flush is still re-engineering..
   
    if there's any code calling flush within the OrderedExecutor, that's a classical starvation problem.. it's a bug that needs to be removed.
   
    Instead of changing the semantic of flush, lets just avoid the bug... this PR should be closed IMO.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @clebertsuconic
    > Instead of changing the semantic of flush, lets just avoid the bug... this PR should be closed IMO.
   
    I'm not sure about it TBH: help to understand why we can't check if `flush` is being called for within the event loop?
    It is a wrong use of it as you said and we can check/validate it with ease (or at least we could log it as a warning/error) and without any perf impact.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @franz1981 ProcessorBase has no knowledge of Netty. just make the fix simple.. flush is only called in 3 places.. fix those places instead of rewriting the whole thing.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    @clebertsuconic I've mentioned Netty just as an example of a event loop framework that doesn't handle well recursive submits into the event loop/blocking call issued from without the event loop.
    The fix (if i'll remove the other changes) is simple: i'm just checking if the call will happen inside the event loop: let me push it and you can see it by yourself :+1:


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

asfgit
In reply to this post by asfgit
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
 
    I don't think we should have any thread locals on these executors... they are quite hot on the path.. I would leave it alone.. if there's anything broken.. just change the caller.


---
12