How to Tune M3 Java Broker Performance
Problem Statement
During destructive testing of the Qpid M3 Java Broker, we tested
tuning techniques and deployment changes to improve the broker's
capacity to maintain high levels of throughput, particularly when
the consumer is slower than the producer (i.e. a growing backlog).
The focus of this page is to detail the results of tuning &
deployment changes trialled.
The successful tuning changes are applicable for any deployment
expecting to see bursts of high volume throughput (1000s of
persistent messages in large batches). Any user wishing to use
these options must test them thoroughly in their own
environment with representative volumes.
Successful Tuning Options
The key scenario targeted by these changes is a broker under heavy
load (processing a large batch of persistent messages) that can be
seen to perform slowly when filling up with an influx of high-volume
transient messages queued behind the persistent backlog. However,
the changes suggested are equally applicable to general heavy-load
scenarios.
The easiest way to address this is to separate the streams of
messages, allowing each stream to be processed independently and
preventing a backlog from building up behind a particular slow
consumer.
These strategies have been successfully tested to mitigate this
problem:
Strategy: Separate connections to one broker for separate streams
of messages.
Result: Messages processed successfully, no problems experienced.

Strategy: Separate brokers for transient and persistent messages.
Result: Messages processed successfully, no problems experienced.
Separate Connections
Using separate connections means that the two streams of data are
not processed via the same buffer, so the broker receives and
processes the transient messages while still processing the
persistent messages. Any build-up of unprocessed data is therefore
minimal and transitory.
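As a minimal sketch of the separate-connections strategy, the snippet
below builds one Qpid connection URL per stream, assuming the M3-era
Qpid Java client URL format; the host, port, credentials and client
ids are placeholders and must match your deployment.

```java
// Sketch: give persistent and transient streams their own connections,
// so each has its own socket and buffers on the client side.
// Host, port and credentials below are placeholders, not a recommendation.
public class SeparateConnections {

    // Qpid Java client connection URL format (M3-era):
    // amqp://<user>:<pass>@<clientid>/<virtualhost>?brokerlist='tcp://<host>:<port>'
    static String connectionUrl(String clientId) {
        return "amqp://guest:guest@" + clientId
                + "/test?brokerlist='tcp://localhost:5672'";
    }

    public static void main(String[] args) {
        // One connection per stream: a persistent backlog on one
        // connection cannot then stall the transient stream.
        String persistentUrl = connectionUrl("persistent-stream");
        String transientUrl  = connectionUrl("transient-stream");

        // With the Qpid client on the classpath, each URL would be passed
        // to its own connection factory to create a separate Connection.
        System.out.println(persistentUrl);
        System.out.println(transientUrl);
    }
}
```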
Separate Brokers
Using separate brokers may mean more work, both in changing client
connection details and from an operational perspective. However, it
is certainly the most clear-cut way of isolating the two streams of
messages and the heaps they impact.
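A sketch of the separate-brokers layout is below. The -p port option
and the QPID_WORK working-directory variable are assumed from the Java
broker's launcher; check qpid-server --help for your version. Clients
for each stream then point their brokerlist at the matching port.

```shell
# Sketch: one broker instance per message stream, each with its own
# working directory, port and JVM heap. Paths and ports are placeholders.
QPID_WORK=/var/qpid/persistent qpid-server -p 5672 &
QPID_WORK=/var/qpid/transient  qpid-server -p 5673 &
```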
Additional Tuning
It is worth testing whether changing the size of the Qpid read/write
thread pool improves performance (e.g. by setting
JAVA_OPTS="-Damqj.read_write_pool_size=32" before running
qpid-server). By default this is equal to the number of CPU cores,
but a higher number may show better performance with some workloads.
It is also important to give the Qpid broker plenty of memory: for
any serious application, an -Xmx of at least 3 GB. If you are
deploying on a 64-bit platform, a larger heap is definitely worth
testing. We will be testing tuning options around a larger heap
shortly.
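Putting the two settings together, a tuned launch might look like the
fragment below. The pool-size property and the 3 GB heap are the
starting points suggested above; both must be validated against your
own representative volumes.

```shell
# Sketch: start the broker with a larger read/write thread pool and a
# 3 GB maximum heap. Values are starting points, not recommendations.
JAVA_OPTS="-Damqj.read_write_pool_size=32 -Xmx3g" qpid-server
```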
Next Steps
These two options have been tested using a Qpid test case, which
demonstrated that for a profile of heavy persistent load followed by
constant high-volume transient traffic they provide a significant
improvement.
However, the deploying project must complete its own testing, using
the same destructive test cases with representative message
paradigms and volumes, in order to verify the proposed mitigation
options.
The deploying programme should then choose the option most
applicable to its deployment and perform BAU testing before any
implementation in a production or pilot environment.