Luxun is a high-throughput, distributed, pub-sub messaging system tailored for big data collection and analytics. Luxun is inspired by Apache Kafka, and the two have a similar architecture. To compare the performance of Luxun and Kafka, I ran a round of performance tests. The results are summarized and analyzed below.
Analysis & Conclusion
Overall, Luxun outperforms Kafka in all but one of the tested scenarios:
- In async producing mode, no matter whether flush is enabled or not, the throughput of Luxun is at least twice the throughput of Kafka.
- In sync producing mode, if flush is disabled, Luxun performs much better than Kafka.
- In sync producing mode, if flush is enabled, Luxun performs worse than Kafka.
- In all consuming tests, Luxun performs much better than Kafka.
The only case where Luxun loses is sync producing mode with flush enabled. Luxun uses memory mapped files internally, and we are not yet sure why explicitly flushing a memory mapped buffer performs poorly in Java; this will be a future optimization of Luxun. However, the following unique features of memory mapped files make explicit flush unnecessary (or not recommended) on a Luxun system:
- The OS ensures message persistence even if the process crashes and there was no explicit flush before the crash.
- A message appended by a producer thread is immediately visible to consumer threads, even if the producer thread has not flushed the message explicitly.
Also, Luxun's internal paging and swapping mechanism automatically flushes a cached page when it is swapped out, which makes explicit flush unnecessary on a Luxun system.
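The visibility property described above can be illustrated with plain Java NIO. The following is a minimal sketch, not Luxun's actual code: the class name `MappedVisibilityDemo`, the `PAGE_SIZE` value, and the length-prefixed message layout are all assumptions made for illustration. It writes a message through one mapping of a file and reads it back through a second, independent mapping of the same file before any explicit flush, then calls `force()` to show where the explicit flush would go. Both views are used from a single thread here; real producer and consumer threads would still need their own coordination on the write position.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedVisibilityDemo {

    // Hypothetical page size for this sketch; not Luxun's actual setting.
    static final int PAGE_SIZE = 4096;

    /**
     * Writes a length-prefixed message through a "producer" mapping and reads
     * it back through a separate "consumer" mapping of the same file region,
     * without calling force() in between.
     */
    public static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mapped-sketch", ".dat");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping READ_WRITE beyond the current size grows the file.
                MappedByteBuffer producerView =
                        channel.map(FileChannel.MapMode.READ_WRITE, 0, PAGE_SIZE);
                MappedByteBuffer consumerView =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, PAGE_SIZE);

                // Producer side: append the message, no explicit flush yet.
                producerView.putInt(payload.length);
                producerView.put(payload);

                // Consumer side: on mainstream operating systems both mappings
                // share the same page-cache pages, so the bytes are already there.
                int length = consumerView.getInt();
                byte[] received = new byte[length];
                consumerView.get(received);

                // Explicit flush: asks the OS to write dirty pages to storage.
                // This is the step the sync-with-flush benchmark pays for.
                producerView.force();

                return new String(received, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Note that the Java specification leaves the timing of such cross-mapping visibility operating-system dependent, so this sketch demonstrates common behavior rather than a language-level guarantee.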
Regarding internal implementation, Luxun and Kafka have two main differences:
- The Luxun queue is based on memory mapped files, while the Kafka queue is based on the filesystem and OS page cache.
- Luxun uses Thrift RPC as its communication layer, while Kafka built its own custom NIO communication layer.
We believe the performance difference between Luxun and Kafka is mainly caused by memory mapped files, while the choice between Thrift RPC and a custom NIO communication layer does not make much difference here.
Detailed performance test results can be found here