Clustering ActiveMQ in an Amazon ECS/Docker/Weave environment


Jeroen van Ooststroom
Hi all,

I’m stuck with getting an ActiveMQ cluster going within a Docker
environment using Weave. The environment runs on an Amazon AWS ECS
Cluster of 2 EC2 Instances running in the same Region, but in different
Availability Zones. The clustering of ActiveMQ must rely on multicast
due to the scaling nature of ECS.

I used Tomcat’s SimpleTcpCluster to confirm that multicast is working at
the Weave level. (The only changes I made for this were to add a unique
jvmRoute to each Tomcat instance and to uncomment the SimpleTcpCluster
configuration.) However, clustering ActiveMQ has been unsuccessful so
far. I took
[active-mq-home]/examples/conf/activemq-dynamic-network-broker2.xml as a
basis and changed the brokerName of each ActiveMQ instance to something
unique. However, after trying both tcp://0.0.0.0:61618 and
tcp://10.40.0.1:61618 for the openwire transportConnector, the two
instances still won’t form a cluster.

I suspect that ActiveMQ is trying to use a network interface that does
not support multicast. The container ActiveMQ runs in has two network
interfaces at its disposal: eth0 (Amazon AWS ECS) and ethwe (Weave). The
latter does support multicast. The IP address mentioned earlier
(10.40.0.1) is the one assigned to the ethwe network interface. How can
I get multicast-based clustering for ActiveMQ to work in this
environment?
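
One way to check what the JVM itself sees inside the container is a small standalone program like the sketch below. It uses only java.net.NetworkInterface from the JDK; the class name ListMulticastInterfaces is made up for illustration and is not part of ActiveMQ.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper class, not part of ActiveMQ.
class ListMulticastInterfaces {

    /** Returns one descriptive line per network interface visible to this JVM. */
    static List<String> describeInterfaces() {
        List<String> lines = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return lines; // no interfaces reported at all
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                lines.add(String.format(
                        "%s up=%b loopback=%b multicast=%b addresses=%s",
                        nic.getName(), nic.isUp(), nic.isLoopback(),
                        nic.supportsMulticast(),
                        Collections.list(nic.getInetAddresses())));
            }
        } catch (SocketException e) {
            lines.add("could not enumerate interfaces: " + e);
        }
        return lines;
    }

    public static void main(String[] args) {
        describeInterfaces().forEach(System.out::println);
    }
}
```

Run inside the broker's container, this should print a line for eth0 and one for ethwe, which shows whether the JVM reports multicast support on each.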

Thanks,
Jeroen...



PS: I'm using ActiveMQ 5.13.4.


Re: Clustering ActiveMQ in an Amazon ECS/Docker/Weave environment

Jeroen van Ooststroom
Hi all,

I also tried the latest ActiveMQ, 5.15.1, but unfortunately I get the same
result: the brokers still won't cluster automatically, while this does work
locally using VMs.

Thanks,
Jeroen...



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: Clustering ActiveMQ in an Amazon ECS/Docker/Weave environment

Tim Bain
Leave the transportConnector's uri property at its original value of
tcp://0.0.0.0:61618; that's the address brokers use when talking to each
other, but it plays no part in how they find out about one another.

The way this is supposed to work (follow along at
https://github.com/apache/activemq/blob/master/activemq-client/src/main/java/org/apache/activemq/transport/discovery/multicast/MulticastDiscoveryAgent.java)
is that in order to bind to the ethwe NIC, you would set the discoveryUri
property to "multicast://default" (which would join 239.255.2.3:6155, see
the constants at the top of that class) and you would set the
mcNetworkInterface field to "ethwe" to tell the socket to bind to that
particular NIC instead of the default one chosen by
MulticastDiscoveryAgent.findNetworkInterface(). Unfortunately, there's no
attribute in the XSD
<http://activemq.apache.org/schema/core/activemq-core-5.15.1.xsd> that
corresponds to the mcNetworkInterface property.
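
To see whether discovery packets arrive on ethwe at all, independently of ActiveMQ, a standalone listener along these lines may help. This is a sketch, not ActiveMQ code: the class name MulticastProbe and its resolve() helper are made up for illustration, and the handling of "default" simply mirrors the description above (group 239.255.2.3, port 6155) rather than being copied from the ActiveMQ source.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic class, not part of ActiveMQ.
class MulticastProbe {
    // Defaults for multicast://default as quoted in this thread.
    static final String DEFAULT_GROUP = "239.255.2.3";
    static final int DEFAULT_PORT = 6155;

    /** Maps a discovery URI to a group address and port, filling in the defaults. */
    static InetSocketAddress resolve(String discoveryUri) {
        URI uri = URI.create(discoveryUri);
        String host = "default".equals(uri.getHost()) ? DEFAULT_GROUP : uri.getHost();
        int port = uri.getPort() < 0 ? DEFAULT_PORT : uri.getPort();
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) throws Exception {
        // Interface name is an assumption; pass another name as the first argument.
        String nicName = args.length > 0 ? args[0] : "ethwe";
        InetSocketAddress target = resolve("multicast://default");
        InetSocketAddress group = new InetSocketAddress(
                InetAddress.getByName(target.getHostString()), target.getPort());

        NetworkInterface nic = NetworkInterface.getByName(nicName);
        if (nic == null) {
            throw new IllegalArgumentException("no such interface: " + nicName);
        }

        try (MulticastSocket socket = new MulticastSocket(group.getPort())) {
            socket.joinGroup(group, nic); // join the group on the chosen NIC only
            byte[] buffer = new byte[8192];
            while (true) {
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet); // blocks until a datagram arrives
                System.out.println(packet.getAddress() + " -> " + new String(
                        packet.getData(), 0, packet.getLength(), StandardCharsets.UTF_8));
            }
        }
    }
}
```

If this prints nothing while a broker is running in the other container, the packets are probably not crossing the Weave network on that group and port, which would point away from the broker configuration.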

You have some options (which are not necessarily mutually exclusive):

   1. Submit an enhancement request in JIRA asking that the
   mcNetworkInterface field be mapped to an XSD attribute so you can set it
   via the config file.
   2. Modify the ActiveMQ code to add an XSD attribute yourself, then
   compile the code and use that patched version of ActiveMQ. (If you do this,
   hopefully you'll contribute the change back so that we maintain the fix
   instead of you.)
   3. Modify the ActiveMQ code to hard-code the mcNetworkInterface field
   (or some other local variable, however you choose to do it) to "ethwe", or
   to the value of a system property or environment variable of your choosing
   (to which you can then pass in "ethwe"), then compile the code and use that
   patched version of ActiveMQ. This is slightly easier to implement, but then
   you're stuck maintaining your patched version until someone else implements
   a permanent fix for the mainline ActiveMQ codebase.
   4. Use some form of runtime bytecode manipulation to modify the value of
   the mcNetworkInterface field without modifying the ActiveMQ code. Maybe it
   would be possible to do this via AOP, or a mocking framework, or...?
   5. In Eclipse, set a conditional breakpoint on
   MulticastDiscoveryAgent:279 whose condition is
   mcNetworkInterface="ethwe"; return false;  and then attach your Eclipse
   debugger to each running broker process. This one is probably a non-starter
   in operations, but could very quickly and easily let you confirm that the
   fix in question will actually work before you implement option 2/3/4.
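
For option 3, the lookup could be as small as the sketch below. The property name and environment variable name are invented for illustration; ActiveMQ does not define them. The idea is that a patched MulticastDiscoveryAgent would call resolveInterfaceName() and only override its default NIC choice when a value comes back.

```java
// Hypothetical helper for a locally patched build, not part of ActiveMQ.
class McInterfaceConfig {
    // Both names are assumptions made up for this example.
    static final String PROPERTY = "activemq.discovery.mcNetworkInterface";
    static final String ENV_VAR = "ACTIVEMQ_MC_NETWORK_INTERFACE";

    /**
     * Returns the interface name from the system property if set, otherwise
     * from the environment variable, otherwise null (meaning: keep the
     * interface the discovery agent would pick by default).
     */
    static String resolveInterfaceName() {
        String fromProperty = System.getProperty(PROPERTY);
        if (fromProperty != null && !fromProperty.trim().isEmpty()) {
            return fromProperty.trim();
        }
        String fromEnv = System.getenv(ENV_VAR);
        if (fromEnv != null && !fromEnv.trim().isEmpty()) {
            return fromEnv.trim();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("multicast interface override: " + resolveInterfaceName());
    }
}
```

With this in place, starting the broker JVM with -Dactivemq.discovery.mcNetworkInterface=ethwe (or setting the environment variable in the task definition) would select the Weave interface without another rebuild.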

Tim


Re: Clustering ActiveMQ in an Amazon ECS/Docker/Weave environment

Jeroen van Ooststroom
Hi Tim,

Thank you for your extensive reply. If I can get it to work by
hardcoding the mcNetworkInterface value first, I'm more than willing to
spend some additional time on a nicer permanent solution and on
contributing it back to the ActiveMQ project.

However, I still don't have much luck...

So I edited the MulticastDiscoveryAgent.java file as follows:

    ...
    // Changed: hard-coded to the Weave interface for testing.
    private String mcNetworkInterface = "ethwe";
    ...
    public void setNetworkInterface(String mcNetworkInterface) {
        // Changed: disabled so the hard-coded value cannot be overridden.
        // this.mcNetworkInterface = mcNetworkInterface;
    }
    ...
    if (mcNetworkInterface != null) {
        // Added: log which interface the name resolves to.
        LOG.info("[Jeroen] NetworkInterface: '"
                + NetworkInterface.getByName(mcNetworkInterface)
                + "'.  (mcNetworkInterface: '" + mcNetworkInterface + "')");
        mcast.setNetworkInterface(NetworkInterface.getByName(mcNetworkInterface));
    }
    ...

But checking the logs, I don't see the automatic clustering happening
just yet:

    2017-10-20 14:44:45,212 | INFO  | Refreshing
    org.apache.activemq.xbean.XBeanBrokerFactory$1@cb5822: startup date
    [Fri Oct 20 14:44:45 UTC 2017]; root of context hierarchy |
    org.apache.activemq.xbean.XBeanBrokerFactory$1 | main
    2017-10-20 14:44:59,685 | INFO  | Using Persistence Adapter:
    KahaDBPersistenceAdapter[/activemq/data/dynamic-broker2/kahadb] |
    org.apache.activemq.broker.BrokerService | main
    2017-10-20 14:45:00,776 | INFO  |
    PListStore:[/activemq/data/node-10.40.0.3/tmp_storage] started |
    org.apache.activemq.store.kahadb.plist.PListStoreImpl | main
    2017-10-20 14:45:01,153 | INFO  | JMX consoles can connect to
    service:jmx:rmi:///jndi/rmi://localhost:1100/jmxrmi |
    org.apache.activemq.broker.jmx.ManagementContext | JMX connector
    2017-10-20 14:45:01,537 | INFO  | Apache ActiveMQ 5.16.0-SNAPSHOT
    (node-10.40.0.3, ID:msgs.weave.local-43127-1508510701118-0:1) is
    starting | org.apache.activemq.broker.BrokerService | main
    2017-10-20 14:45:01,605 | INFO  | Listening for connections at:
    tcp://msgs.weave.local:61618 |
    org.apache.activemq.transport.TransportServerThreadSupport | main
    2017-10-20 14:45:01,630 | INFO  | [Jeroen] NetworkInterface:
    'name:ethwe (ethwe)'.  (mcNetworkInterface: 'ethwe') |
    org.apache.activemq.transport.discovery.multicast.MulticastDiscoveryAgent
    | main
    2017-10-20 14:45:01,631 | INFO  | Connector openwire started |
    org.apache.activemq.broker.TransportConnector | main
    2017-10-20 14:45:01,633 | INFO  | [Jeroen] NetworkInterface:
    'name:ethwe (ethwe)'.  (mcNetworkInterface: 'ethwe') |
    org.apache.activemq.transport.discovery.multicast.MulticastDiscoveryAgent
    | main
    2017-10-20 14:45:01,633 | INFO  | Network Connector
    DiscoveryNetworkConnector:NC:BrokerService[node-10.40.0.3] started |
    org.apache.activemq.network.NetworkConnector | main
    2017-10-20 14:45:01,634 | INFO  | Apache ActiveMQ 5.16.0-SNAPSHOT
    (node-10.40.0.3, ID:msgs.weave.local-43127-1508510701118-0:1)
    started | org.apache.activemq.broker.BrokerService | main
    2017-10-20 14:45:01,634 | INFO  | For help or more information
    please see: http://activemq.apache.org |
    org.apache.activemq.broker.BrokerService | main

Just to be more complete I'll include a couple of other configuration
details we have in our environment:

  * activemq.xml snippets:

    <transportConnectors>
        <transportConnector name="openwire"
            uri="tcp://0.0.0.0:61618" discoveryUri="multicast://default" />
        <transportConnector name="amqp"
            uri="amqp://0.0.0.0:5672?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
        <transportConnector name="stomp"
            uri="stomp://0.0.0.0:61613?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
        <transportConnector name="mqtt"
            uri="mqtt://0.0.0.0:1883?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600&amp;transport.subscriptionStrategy=mqtt-virtual-topic-subscriptions"/>
        <transportConnector name="ws"
            uri="ws://0.0.0.0:61614?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
    </transportConnectors>

  * Dockerfile exposed ports:

    # MQTT
    EXPOSE 1883

    # AMQP
    EXPOSE 5672

    # STOMP
    EXPOSE 61613

    # WS
    EXPOSE 61614

    # OpenWire
    EXPOSE 61618

  * Amazon AWS ECS' Task Definition snippet:

         {
           "volumesFrom": [],
           "memory": 1024,
           "extraHosts": null,
           "linuxParameters": null,
           "dnsServers": null,
           "disableNetworking": false,
           "dnsSearchDomains": null,
           "portMappings": [
             {
               "hostPort": 0,
               "containerPort": 1883,
               "protocol": "tcp"
             },
             {
               "hostPort": 0,
               "containerPort": 5672,
               "protocol": "tcp"
             },
             {
               "hostPort": 0,
               "containerPort": 61613,
               "protocol": "tcp"
             },
             {
               "hostPort": 0,
               "containerPort": 61614,
               "protocol": "tcp"
             },
             {
               "hostPort": 0,
               "containerPort": 61618,
               "protocol": "tcp"
             }
           ],
           "hostname": "msgs.weave.local",
           "essential": true,
           "entryPoint": null,
           "mountPoints": [],
           "name": "msgs",
           "ulimits": null,
           "dockerSecurityOptions": null,
           "environment": [],
           "links": null,
           "workingDirectory": null,
           "readonlyRootFilesystem": null,
           "image": "voyent/msgs-service-activemq:latest",
           "command": null,
           "user": null,
           "dockerLabels": null,
           "logConfiguration": null,
           "cpu": 0,
           "privileged": null,
           "memoryReservation": 256
         }

On a final note, each relevant Instance within the ECS Cluster uses the
same Security Group, which has all ports for both TCP and UDP open for
incoming traffic from other Instances in the same Security Group and
allows all outgoing traffic.

Is there something else I can try in the ActiveMQ code to track what is
going wrong?

Thanks,
Jeroen...
