Unexplained slow sending

James Green-3
Hi all,

I've been busy shifting an existing workload into AWS recently, and a load
test shows a serious performance drop when sending to ActiveMQ which I
could use some advice on.

Quick architecture summary: We send requests via a webserver that are
forwarded as messages to a queue. A backend receives these messages and
forwards them onward to another queue. Spring Boot with Camel powers the
show within Docker containers. Messages are persistent.

Story so far:

Tests show this first queue builds rapidly with pending messages yet
monitoring of our existing production environment shows no such backlog.

Our existing production environment has everything in a single DC so it's
super low latency. Our AWS environment uses Fargate with AmazonMQ. I
understand send latency will be higher and AmazonMQ will store the messages
across three AZs.

So I launched a small EC2 instance to run some comparison tests:

- Receiving via a Camel route is super quick. This is not a problem.
- Sending via a minimal Camel route is super slow: 14 messages per second.
  We appear to be doing at least 20-30 per second in production, which is
  enough of a difference to matter.
- Sending via PHP with stomp-php, with both persistence and receipt headers
  on, is substantially faster than sending via Camel: 55 messages per
  second.
- Each test sent 10K messages with small payloads.

At this point I'm thinking that both Camel and PHP should be sending with
the same properties - synchronously and with persistence. The messages on
the queue are flagged persistent when viewed by the web console.
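
One way to read the two rates, assuming each send blocks until the broker acknowledges it (as a synchronous persistent send does): a single producer's throughput is roughly the reciprocal of the per-message round trip. A minimal Python sketch of that arithmetic follows; the function name is illustrative, and the rates are the ones quoted above.

```python
def implied_latency_ms(messages_per_second: float) -> float:
    """Per-message round trip implied by a blocking, one-at-a-time producer."""
    if messages_per_second <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / messages_per_second


# Rates reported in this post for a single producer.
for label, rate in [("Camel", 14), ("PHP/STOMP", 55)]:
    print(f"{label}: {rate}/s -> about {implied_latency_ms(rate):.0f} ms per send")
```

Under that assumption the Camel producer is spending roughly 71 ms per message and the PHP producer roughly 18 ms, so the gap is about 50 ms of extra waiting per send rather than a difference in raw bandwidth.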

Can anyone provide further suggestions to try?

Thanks,

James

Re: Unexplained slow sending

Tim Bain
Can you create a minimal producer via the OpenWire protocol in Java or
another language of your choice, to determine if your Camel producer is
slow because it's OpenWire or because it's Camel? I suspect you'll find
that OpenWire is the culprit, not Camel, but let's confirm that.

All of these numbers sound tiny compared to what the ActiveMQ product is
capable of (though I don't have any insight into how Amazon has configured
the brokers, nor into any code customizations they might have made). If you
run multiple minimal producers in parallel, does throughput increase
linearly?
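
The parallel-producer question can be sketched with a small harness. This is a hedged illustration only: `fake_send` is a hypothetical stand-in that sleeps instead of talking to a broker, so it shows the shape of the measurement, not real ActiveMQ behaviour. Swap in a real blocking send to run the actual experiment.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_send(latency_s: float) -> None:
    """Hypothetical blocking send: sleeps for the assumed round-trip time."""
    time.sleep(latency_s)


def run_producers(producers: int, messages_each: int, latency_s: float) -> float:
    """Run `producers` senders in parallel; return aggregate messages/second."""

    def worker() -> int:
        for _ in range(messages_each):
            fake_send(latency_s)
        return messages_each

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=producers) as pool:
        futures = [pool.submit(worker) for _ in range(producers)]
        sent = sum(f.result() for f in futures)
    elapsed = time.perf_counter() - start
    return sent / elapsed


# Assumed 10 ms per send; compare aggregate rate as producers are added.
for n in (1, 2, 4):
    print(f"{n} producer(s): {run_producers(n, 20, 0.01):.0f} msg/s")
```

If the aggregate rate grows roughly in step with the producer count, each producer is latency-bound and the broker has headroom. If it flattens, something shared (broker, disk, network) is the limit.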

Also, you say you're testing with small payloads; are they small enough
that you might be running into the Nagle algorithm on your TCP sockets? If
you use larger (e.g. 1KB) payloads, what does that do to your throughput on
a single producer?
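
For reference, Nagle's algorithm delays small writes while earlier data is still unacknowledged, and it is switched off per socket with the `TCP_NODELAY` option. The sketch below shows only the generic socket-level setting in Python; how a particular messaging client exposes this option is not covered in this thread and is not assumed here.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
enabled = bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
print("TCP_NODELAY enabled:", enabled)
sock.close()
```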

Tim

On Thu, Aug 22, 2019, 2:54 AM James Green <[hidden email]> wrote:


Re: Unexplained slow sending

James Green-3
Hi,

Following up as I've run more tests.

My minimal producer suffered the same bug as our main application: we had
Spring Boot ActiveMQ thread pooling turned on as a property, but the
library (referenced in the main docs) was not included. Looking back at my
rather sparse notes from the time, when I activated this my sends of 10K
messages went from taking 11m47s to between 2m56s and 3m31s, which is a
marked improvement.
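
To put those durations on the same scale as the messages-per-second figures elsewhere in the thread, here is a small Python conversion. The helper name is illustrative; the message count and timings are the ones quoted above.

```python
def rate_per_second(messages: int, minutes: int, seconds: int) -> float:
    """Convert a run of `messages` taking minutes:seconds into messages/second."""
    total_seconds = minutes * 60 + seconds
    if total_seconds <= 0:
        raise ValueError("duration must be positive")
    return messages / total_seconds


runs = {
    "before (11m47s)": (11, 47),
    "after, slowest (3m31s)": (3, 31),
    "after, fastest (2m56s)": (2, 56),
}
for label, (m, s) in runs.items():
    print(f"{label}: {rate_per_second(10_000, m, s):.1f} msg/s")
```

That works out to about 14.1 msg/s before the fix and between roughly 47.4 and 56.8 msg/s after it, which lines up with the 14/s Camel figure and the 55/s PHP figure from the original post.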

To my chagrin, this has made little difference to our real-world
application, and so I have modified my minimal producer to be capable of
sending to the main application via its queues.

Allow me to elaborate at this point as it's important to understand what
I'm looking at...

The messages follow a small path through a series of queues as they are
processed. Queue A -> B -> C.

If my minimal producer sends to Queue C (skipping A and B) I'm able to
produce at 49/s which is "quick enough".
If my minimal producer sends to Queue B (skipping A) I'm able to produce at
28/s - 38/s which is variable but most of the tests reached 38/s.
If my minimal producer sends to Queue A I'm able to produce at 28/s - 42/s
- again variable.

Now Queues A and B are consumed by separate Camel routes inside the same
application. Queue C is entirely separate.

Looking at throughput graphs of the consumption of Queue C, when first
going through (A,B) for 10K messages, then going through (B), I can see
(A,B) is twice as slow.

I'm left wondering if there's contention somehow within the application
consuming from (A,B) that is only showing up during load testing on AWS. I
was not expecting it to be 2x slower unless the producer thread is
shared - you might imagine a thread pool would solve that!
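
The "2x slower" observation is what a simple shared-capacity model would predict. The sketch below is only a model of that hypothesis, not a diagnosis: it assumes the application has one shared budget of blocking sends per second, and the figure of 50 sends/s is a made-up illustration.

```python
def end_to_end_rate(shared_sends_per_second: float, hops_in_app: int) -> float:
    """Messages/second through the app if every hop draws on one send budget."""
    if hops_in_app < 1:
        raise ValueError("a message must take at least one hop")
    return shared_sends_per_second / hops_in_app


budget = 50.0  # hypothetical shared send capacity of the application
print("entering at B (one hop in app): ", end_to_end_rate(budget, 1), "msg/s")
print("entering at A (two hops in app):", end_to_end_rate(budget, 2), "msg/s")
```

A message entering at A costs the application two sends (A to B, then B to C), while one entering at B costs only one, so a shared budget halves the end-to-end rate. If the routes had independent send capacity, the model would not apply and the two paths should run at similar rates.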

At this point I have ensured that there are 4 instances of each application
and they can happily deal with about 50 messages per second across the
queues with persistence on. I am uncertain whether I should be expecting
more.

If anyone has insights on why the two routes within the same application
appear contended and indeed on whether overall throughput should be a lot
higher I'd love to hear it.

James


On Thu, 22 Aug 2019 at 14:02, Tim Bain <[hidden email]> wrote:


Re: Unexplained slow sending

alan protasio
Hi,

I think you can try disabling concurrentStoreAndDispatchQueues and
rerunning the tests.

Alan Diego


On Wed, Aug 28, 2019 at 9:42 AM James Green <[hidden email]>
wrote:


Re: Unexplained slow sending

Tim Bain
Might the choke point be the NIC on the EC2 instance? If you run the
consumers for A and B on different EC2s, how does that throughput compare
to what you're seeing?

Also, I'd recommend you use JVisualVM or similar to capture a CPU sampling
(not profiling!) snapshot of your producer program to see where it's
spending its time. If there's a significant amount of time spent anywhere
except making the network call to send the bytes of the payload, then dig
into that.

Tim

On Wed, Aug 28, 2019, 12:03 PM alan protasio <[hidden email]> wrote:


Re: Unexplained slow sending

James Green-3
Tim,

The NIC is a potential issue. We're trying to deploy via Fargate
where possible to shift the operational burden away from our developer
staff, so noisy neighbours are entirely possible, as is cross-AZ latency.

Touch wood, I seem to be in a place where throughput is at least as good as
existing production. I've yet to "liven" all the potential traffic patterns
to simulate the more complex loads but it may be good enough for now.


On Thu, 29 Aug 2019 at 13:40, Tim Bain <[hidden email]> wrote:


Re: Unexplained slow sending

Tim Bain
You're right that the network might be congested (noisy neighbors), but I
was asking if you were saturating your host's NIC with your own traffic.
Both could be a problem, but the latter is possible to investigate without
support from Amazon's technical support team.

Tim

On Thu, Aug 29, 2019, 11:13 AM James Green <[hidden email]> wrote:

> Tim,
>
> The NIC issue is a potential issue. We're trying to deploy via Fargate
> where possible to shift the operational burden away from our developer
> staff, so noisy neighbours is entirely possible, as is cross-AZ latency.
>
> Touch wood, I seem to be in a place where throughput is at least as good as
> existing production. I've yet to "liven" all the potential traffic patterns
> to simulate the more complex loads but it may be good enough for now.
>
>
> On Thu, 29 Aug 2019 at 13:40, Tim Bain <[hidden email]> wrote:
>
> > Might the choke point be the NIC on the EC2 instance? If you run the
> > consumers for A and B on different EC2s, how does that throughput compare
> > to what you're seeing?
> >
> > Also, I'd recommend you use JVisualVM or similar to capture a CPU
> sampling
> > (not profiling!) snapshot of your producer program to see where it's
> > spending its time. If there's a significant amount of time spent anywhere
> > except making the network call to send the bytes of the payload, then dig
> > into that.
> >
> > Tim
> >
> > On Wed, Aug 28, 2019, 12:03 PM alan protasio <[hidden email]> wrote:
> >
> > > Hi,
> > >
> > > I think you can try disable concurrentStoreAndDispatchQueues and rerun
> > the
> > > tests.
> > >
> > > Alan Diego
> > >
> > >
> > > On Wed, Aug 28, 2019 at 9:42 AM James Green <[hidden email]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Following-up as I've run more tests.
> > > >
> > > > My minimal producer suffered the same bug as our main application: we
> > had
> > > > spring boot activemq thread pooling turned on as a property, but the
> > > > library (referenced in the main docs) was not included. Looking back
> at
> > > my
> > > > rather sparse notes at the time I activated this my sends of 10K
> > messages
> > > > went from taking 11m47s to between 2m56s - 3m31s which is a marked
> > > > improvement.
> > > >
> > > > To my chagrin, this has made little different to our real-world
> > > > application, and so I have modified my minimal producer to be capable
> > of
> > > > sending to the main application via it's queues.
> > > >
> > > > Allow me to elaborate at this point as it's important to understand
> > what
> > > > I'm looking at...
> > > >
> > > > The messages follow a small path through a series of queues as they
> are
> > > > processed. Queue A -> B -> C.
> > > >
> > > > If my minimal producer sends to Queue C (skipping A and B) I'm able
> to
> > > > produce at 49/s which is "quick enough".
> > > > If my minimal producer sends to Queue B (skipping A) I'm able to
> > produce
> > > at
> > > > 28/s - 38/s which is variable but most of the tests reached 38/s.
> > > > If my minimal producer sends to Queue A I'm able to produce at 28/s -
> > > 42/s
> > > > - again variable.
> > > >
> > > > Now Queues A and B are consumed by separate Camel routes inside the
> > same
> > > > application. Queue C is entirely separate.
> > > >
> > > > Looking at throughput graphs of the consumption of Queue C, when
> first
> > > > going through (A,B) for 10K messages, then going through (B), I can
> see
> > > > (A,B) is twice as slow.
> > > >
> > > > I'm left wondering if there's contention somehow within the
> > > > application consuming from (A,B) that is only showing up during
> > > > load testing on AWS. I was not expecting it to be 2x slower unless
> > > > the producer thread is shared; you might imagine a thread pool
> > > > would solve that!
> > > >
> > > > At this point I have ensured that there are 4 instances of each
> > > > application and they can happily deal with about 50 messages per
> > > > second across the queues with persistence on. I am uncertain
> > > > whether I should be expecting more.
> > > >
> > > > If anyone has insights on why the two routes within the same
> > > > application appear contended and indeed on whether overall
> > > > throughput should be a lot higher I'd love to hear it.
> > > >
> > > > James
> > > >
> > > >
> > > > On Thu, 22 Aug 2019 at 14:02, Tim Bain <[hidden email]> wrote:
> > > >
> > > > > Can you create a minimal producer via the OpenWire protocol in
> > > > > Java or another language of your choice, to determine if your
> > > > > Camel producer is slow because it's OpenWire or because it's
> > > > > Camel? I suspect you'll find that OpenWire is the culprit, not
> > > > > Camel, but let's confirm that.
> > > > >
> > > > > All of these numbers sound tiny compared to what the ActiveMQ
> > > > > product is capable of (though I don't have any insight into how
> > > > > Amazon has configured the brokers, nor into any code
> > > > > customizations they might have made). If you run multiple minimal
> > > > > producers in parallel, does throughput increase linearly?
> > > > >
> > > > > Also, you say you're testing with small payloads; are they small
> > > > > enough that you might be running into the Nagle algorithm on your
> > > > > TCP sockets? If you use larger (e.g. 1KB) payloads, what does
> > > > > that do to your throughput on a single producer?
> > > > >
> > > > > Tim
> > > > >
> > > > > On Thu, Aug 22, 2019, 2:54 AM James Green <[hidden email]> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I've been busy shifting an existing workload into AWS recently,
> > > > > > and a load test shows a serious performance drop when sending to
> > > > > > ActiveMQ which I could use some advice on.
> > > > > >
> > > > > > Quick architecture summary: We send requests via a webserver
> > > > > > that are forwarded as messages to a queue. A backend receives
> > > > > > these messages and forwards them onward to another queue. Spring
> > > > > > Boot with Camel powers the show within Docker containers.
> > > > > > Messages are persistent.
> > > > > >
> > > > > > Story so far:
> > > > > >
> > > > > > Tests show this first queue builds rapidly with pending messages
> > > > > > yet monitoring of our existing production environment shows no
> > > > > > such backlog.
> > > > > >
> > > > > > Our existing production environment has everything in a single
> > > > > > DC so it's super low latency. Our AWS environment uses Fargate
> > > > > > with AmazonMQ. I understand send latency will be higher and
> > > > > > AmazonMQ will store the messages across three AZs.
> > > > > >
> > > > > > So I launched a small EC2 instance to run some comparison tests:
> > > > > >
> > > > > > Receiving via a Camel route is super quick. This is not a
> > > > > > problem.
> > > > > > Sending via a minimal Camel route is super slow. 14 messages per
> > > > > > second. We appear to be doing at least 20-30 per second in
> > > > > > production but it's enough of a difference.
> > > > > > Sending via PHP with stomp-php setting both persistence on and
> > > > > > receipt headers on is substantially faster than sending via
> > > > > > Camel. 55 messages per second.
> > > > > > Tests have been with 10K small payloads.
> > > > > >
> > > > > > At this point I'm thinking that both Camel and PHP should be
> > > > > > sending with the same properties - synchronously and with
> > > > > > persistence. The messages on the queue are flagged persistent
> > > > > > when viewed by the web console.
> > > > > >
> > > > > > Can anyone provide further suggestions to try?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > James
> > > > > >
> > > > >
> > > >
> > >
> >
>
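
For anyone comparing the figures quoted above: the thread mixes wall-clock
times for 10K sends with messages-per-second rates, so here is a rough
sketch for putting them on the same scale. The helper names
(parse_duration, rate) are made up for illustration and are not part of
any ActiveMQ or Camel tooling; the only inputs are the 10K message count
and the three timings from James's follow-up.

```python
import re


def parse_duration(text: str) -> int:
    """Convert a duration such as '11m47s' into whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


def rate(messages: int, duration: str) -> float:
    """Average messages per second for a run of the given length."""
    return messages / parse_duration(duration)


if __name__ == "__main__":
    # Timings quoted in the thread for sending 10K messages.
    runs = [
        ("pooling property set, library missing", "11m47s"),
        ("pooling active, fastest run", "2m56s"),
        ("pooling active, slowest run", "3m31s"),
    ]
    for label, took in runs:
        print(f"{label}: {rate(10_000, took):.1f} msg/s")
```

By this arithmetic the 11m47s run works out to roughly 14 msg/s, which
lines up with the figure in the original post, and the runs with pooling
actually active come to roughly 47-57 msg/s, in the same range as the 55
msg/s reported for stomp-php.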