HA features & limitations

superuser
Hi, I was trying to set up an HA broker config via Master/Slave according to the following:

http://www.activemq.org/site/clustering.html

The fact that there is no automatic fail-back makes using this option *very* difficult to support in a production environment. Specifically, once the cluster has failed over to the secondary broker (which, as documented, seems to work fine), the cluster cannot be recovered without the downtime required to stop the SECONDARY, copy the data files, and start the PRIMARY. This makes routine maintenance of the PRIMARY impossible: as soon as it is taken down you are no longer HA, and you cannot become HA again without the downtime and interruption of service required to fail back to the primary.

I am not intimately familiar with the internals of ActiveMQ, but it seems to me that the easiest way to fix this would be some way of allowing the primary to turn into the secondary (and vice versa) when it comes back online. This alone should be enough to allow the cluster to continue servicing requests. If the new primary (old secondary) then needs to be taken down (maintenance, failure), it swaps back to the secondary role.
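
To make the proposed role swap concrete, here is a toy Python model of it. It is only an illustration of the idea in this post, not ActiveMQ code: the BrokerPair class and its method names are hypothetical, and the replication of message data between the brokers is not modelled at all.

```python
class BrokerPair:
    """Toy model of the proposed role swap (hypothetical, not an ActiveMQ API)."""

    def __init__(self, primary, secondary):
        self.active = primary      # broker currently serving clients
        self.standby = secondary   # broker ready to take over
        self.down = set()          # brokers that are currently offline

    def take_down(self, name):
        """Simulate a failure or maintenance stop of one broker."""
        if name == self.active:
            # The standby (if there is one) is promoted so service continues.
            self.active, self.standby = self.standby, None
        elif name == self.standby:
            self.standby = None
        else:
            raise ValueError(f"{name} is not running in this pair")
        self.down.add(name)

    def bring_up(self, name):
        """Simulate a broker coming back online."""
        if name not in self.down:
            raise ValueError(f"{name} is not down")
        self.down.discard(name)
        if self.active is None:
            self.active = name
        else:
            # Proposed behaviour: rejoin as the standby instead of
            # requiring a manual fail-back to the old primary.
            self.standby = name

    def is_serving(self):
        return self.active is not None

    def is_ha(self):
        return self.active is not None and self.standby is not None


pair = BrokerPair("PRIMARY", "SECONDARY")
pair.take_down("PRIMARY")   # SECONDARY takes over, pair is no longer HA
pair.bring_up("PRIMARY")    # PRIMARY rejoins as standby, pair is HA again
```

In this model the pair never has to stop serving clients in order to become HA again, which is the property the post is asking for.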

Thoughts? Are there any current JIRA issues I should be tracking WRT this functionality, or should I create a new request?

TIA for any help.

Re: HA features

James Strachan-2
On 8/7/06, superuser <[hidden email]> wrote:
>
> Hi, I was trying to set up an HA broker config via Master/Slave according to
> the following:
>
> http://www.activemq.org/site/clustering.html
>
> The fact that there is no automatic fail-back makes using this option *very*
> difficult to support in a production environment.

There is automatic failover support if you have a shared database or
shared file system...

http://incubator.apache.org/activemq/shared-file-system-master-slave.html
http://incubator.apache.org/activemq/jdbc-master-slave.html


> Specifically, once the
> cluster has failed over to the secondary broker (which, as documented, seems
> to work fine), the cluster cannot be recovered without the downtime required
> to stop the SECONDARY, copy the data files, and start the PRIMARY. This makes
> routine maintenance of the PRIMARY impossible: as soon as it is taken down
> you are no longer HA, and you cannot become HA again without the downtime
> and interruption of service required to fail back to the primary.

You can have multiple master/slave pairs. So you can take down one
master/slave pair and clients will fail over to another master/slave
pair.


> I am not intimately familiar with the internals of ActiveMQ, but it seems to
> me that the easiest way to fix this would be some way of allowing the
> primary to turn into the secondary (and vice versa) when it comes back
> online. This alone should be enough to allow the cluster to continue
> servicing requests. If the new primary (old secondary) then needs to be
> taken down (maintenance, failure), it swaps back to the secondary role.
>
> Thoughts?

Sounds great - we just need some volunteers to design it, develop it
and test it :)


> Are there any current JIRA issues I should be tracking WRT this
> functionality, or should I create a new request?

I just did a quick query to check and no, it has not been raised in JIRA
yet. I've created one for you here...

http://issues.apache.org/activemq/browse/AMQ-869

--

James
-------
http://radio.weblogs.com/0112098/

Re: HA features

superuser
We would heavily prefer a shared-nothing architecture, and so for this reason are not considering a "shared backend" scenario.

I had not considered a network of Master/Slave brokers. Would this look, in a two-machine configuration, something like:

MACHINE1:
   MQ-A-PRIMARY
   MQ-B-BACKUP

MACHINE2:
   MQ-A-BACKUP
   MQ-B-PRIMARY

Where backups would be configured as normal and all servers would be configured as a network of brokers. Clients would have a connection string like "failover://(tcp://MQ-A-PRIMARY:PORT,tcp://MQ-B-PRIMARY:PORT,tcp://MQ-A-BACKUP:PORT,tcp://MQ-B-BACKUP:PORT)"

Would this be a minimal HA cluster?
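
As a rough sketch, the connection string above can be assembled from the list of brokers like this. The failover_uri helper is hypothetical, and the post leaves PORT unspecified; 61616 is used below only as an assumed example value.

```python
def failover_uri(hosts, port):
    """Build a failover URI of the form quoted in the post.

    hosts: broker host names, in the order clients should list them.
    port:  TCP port the brokers listen on (assumed identical for all).
    """
    if not hosts:
        raise ValueError("at least one broker host is required")
    endpoints = ",".join(f"tcp://{host}:{port}" for host in hosts)
    return f"failover://({endpoints})"


# Broker names taken from the two-machine layout above.
BROKERS = ["MQ-A-PRIMARY", "MQ-B-PRIMARY", "MQ-A-BACKUP", "MQ-B-BACKUP"]

# 61616 is an assumption; substitute whatever PORT the brokers really use.
uri = failover_uri(BROKERS, 61616)
print(uri)
```

Listing all four brokers means a client has an address to try regardless of which broker in either pair is currently accepting connections.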

Incidentally, the "topology" section of the site was not necessarily crystal clear as to what a typical configuration would look like. I would think something like a list of use cases, each with a diagram of the topology and some links to .conf files, would be VERY beneficial. I.e., if you are looking for HA, do X; if you are looking for highly scalable read infrastructure, do Y; etc. Just a suggestion, ignore at will ;-)

Thanks for the help regardless.

Re: HA features

James Strachan-2
On 8/8/06, superuser <[hidden email]> wrote:

>
> We would heavily prefer a shared-nothing architecture, and so for this reason
> are not considering a "shared backend" scenario.
>
> I had not considered a network of Master/Slave brokers. Would this look, in
> a two-machine configuration, something like:
>
> MACHINE1:
>    MQ-A-PRIMARY
>    MQ-B-BACKUP
>
> MACHINE2:
>    MQ-A-BACKUP
>    MQ-B-PRIMARY
>
> Where backups would be configured as normal and all servers would be
> configured as a network of brokers. Clients would have a connection string
> like
> "failover://(tcp://MQ-A-PRIMARY:PORT,tcp://MQ-B-PRIMARY:PORT,tcp://MQ-A-BACKUP:PORT,tcp://MQ-B-BACKUP:PORT)"
>
> Would this be a minimal HA cluster?

Yes - 4 brokers in 2 master-slave pairs with the broker M/S pairs
store-and-forwarding to each other.
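
One way to sanity-check a layout like this is to confirm that losing any single machine still leaves at least one broker from every master/slave pair running. The sketch below does only that placement check. The function names are hypothetical, and it assumes the pair name is the broker name minus its -PRIMARY/-BACKUP suffix; it says nothing about how the brokers themselves behave on failover.

```python
# Broker placement from the quoted two-machine layout.
TOPOLOGY = {
    "MACHINE1": ["MQ-A-PRIMARY", "MQ-B-BACKUP"],
    "MACHINE2": ["MQ-A-BACKUP", "MQ-B-PRIMARY"],
}


def pair_of(broker):
    """Map a broker name to its pair, e.g. 'MQ-A-PRIMARY' -> 'MQ-A'."""
    return broker.rsplit("-", 1)[0]


def surviving_pairs(topology, failed_machine):
    """Pairs that still have at least one broker once a machine is lost."""
    return {
        pair_of(broker)
        for machine, brokers in topology.items()
        if machine != failed_machine
        for broker in brokers
    }


def survives_any_single_failure(topology):
    """True if every pair keeps a broker no matter which one machine fails."""
    all_pairs = {pair_of(b) for brokers in topology.values() for b in brokers}
    return all(surviving_pairs(topology, m) == all_pairs for m in topology)


ok = survives_any_single_failure(TOPOLOGY)
```

The check passes for the layout above because each pair is split across the two machines; it would fail if both brokers of a pair were placed on the same machine.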


> Incidentally, the "topology" section of the site was not necessarily crystal
> clear as to what a typical configuration would look like. I would think
> something like a list of use-cases, with a diagram of the topology and some
> links to .conf files for each would be VERY beneficial.

Agreed. We welcome contributions :)
http://incubator.apache.org/activemq/contributing.html

The website is a wiki, so anyone can contribute documentation and diagrams:
http://incubator.apache.org/activemq/how-does-the-website-work.html

--

James
-------
http://radio.weblogs.com/0112098/