I've been looking at traditional message queues like RabbitMQ and ActiveMQ - which started as transactional, producer-pushed, point-to-point queues ... and have since branched out into pub-sub, streaming, and topics with multiple consumers.
Apache Kafka seems to have come from the other direction: it started with streaming, non-point-to-point topics, and later added point-to-point consumption and exactly-once delivery options.
At the start of our project, a traditional, very simple ActiveMQ/RabbitMQ message queue (where one producer sends one message to one consumer) will work fine. It is as yet undecided where the use cases will expand to in the future.
To me, it seems that Apache Kafka can do everything that ActiveMQ/RabbitMQ can do. The biggest difference is that the *MQs are producer pushed (producer controlled) whereas Kafka is consumer controlled.
So, based on the above ...
What can ActiveMQ/RabbitMQ do that Kafka CAN'T do?
What benefits would I get from RabbitMQ/ActiveMQ that I CAN'T get from Kafka?
Kafka 0.9 had only the Producer and Consumer APIs; the Streams API arrived with Kafka 0.10. So I'm not sure you checked the Kafka site.
There are more MQs than those. The reason there are so many is that several software vendors each built their own version, all working to the same basic characteristics.
And with a considerable installed base, they all have their place.
I think you have to look at what you already know, and possibly do a performance test too.
AMQ and RabbitMQ also support MQTT; I didn't see that with Kafka, so there may be differences in support for endpoints.
You have to decide what you need: make a shortlist of requirements, then of the products that support them (performance test?), and then choose.
I am not sure what you mean by consumer controlled / producer controlled. Without producers, consumers get nothing. Same the other way around: if there are no consumers, sooner or later the queues will overflow.
Thanks for the response.
Consumer controlled / producer controlled: basically, I mean that Kafka uses a "consumer pull" model, where the consumer is responsible for tracking which messages it has successfully received and Kafka does not track that, whereas with traditional MQs (like Rabbit and Active) the broker (i.e. the RabbitMQ or ActiveMQ infrastructure) pushes messages to consumers and tracks whether each message was delivered.
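The push vs. pull distinction can be sketched in plain Python. This is purely illustrative (no real broker or client API is involved; every class and method name below is made up for the sketch):

```python
class PushBroker:
    """Rabbit/Active-style sketch: the broker pushes and tracks delivery."""
    def __init__(self):
        self.unacked = {}   # message id -> message, tracked by the broker
        self.next_id = 0

    def publish(self, consumer, message):
        msg_id = self.next_id
        self.next_id += 1
        self.unacked[msg_id] = message
        consumer.receive(msg_id, message)   # broker initiates delivery

    def ack(self, msg_id):
        del self.unacked[msg_id]            # broker forgets acknowledged messages


class PullLog:
    """Kafka-style sketch: an append-only log; consumers pull at their own pace."""
    def __init__(self):
        self.log = []

    def append(self, message):
        self.log.append(message)

    def poll(self, offset, max_records=10):
        return self.log[offset:offset + max_records]  # consumer picks the position


class Consumer:
    def __init__(self):
        self.seen = []
        self.offset = 0                     # consumer-side cursor, not broker-side

    def receive(self, msg_id, message):     # used with PushBroker
        self.seen.append(message)

    def pull_from(self, log):               # used with PullLog
        batch = log.poll(self.offset)
        self.seen.extend(batch)
        self.offset += len(batch)           # the consumer, not the broker, advances


# Push: delivery state lives in the broker.
broker, c1 = PushBroker(), Consumer()
broker.publish(c1, "hello")
broker.ack(0)
assert broker.unacked == {}

# Pull: two consumers read the same log independently at their own offsets.
log, c2, c3 = PullLog(), Consumer(), Consumer()
log.append("a"); log.append("b")
c2.pull_from(log)
c3.pull_from(log)
assert c2.seen == c3.seen == ["a", "b"]
```

The point of the sketch: in the push model the broker holds the delivery state (`unacked`), while in the pull model the only cursor is each consumer's own `offset`, which is why the same log can feed any number of consumers.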
AMQ, RabbitMQ, IBM's MQ, DEC/MQ, etc. all have two message classes: persistent and non-persistent messages.
Non-persistent messages can be lost. A persistent message, once accepted by the local broker, will eventually be delivered, provided the intermediate and final brokers and apps are running: guaranteed delivery. If needed, messages are stored on disk by a broker. Producers just shove messages at their local broker until the buffers fill up.
How would consumers even know whether a message had been sent or not?
The difference is this: Kafka will delete a message after some time if it is not picked up, so it cannot do guaranteed delivery.
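The persistent vs. non-persistent split can be illustrated with a toy broker in plain Python (a sketch only; real brokers mark persistence per message, e.g. AMQP's delivery-mode flag, but nothing below is a real client API):

```python
class ToyBroker:
    """Illustrative broker: persistent messages survive a restart, others don't."""
    def __init__(self):
        self.memory = []   # non-persistent: held in RAM, lost on crash
        self.disk = []     # persistent: "accepted" means stored until delivered

    def publish(self, message, persistent=False):
        if persistent:
            self.disk.append(message)
        else:
            self.memory.append(message)

    def crash_and_restart(self):
        self.memory = []   # only the in-memory (non-persistent) messages vanish


b = ToyBroker()
b.publish("order-1", persistent=True)
b.publish("metric-tick")            # non-persistent
b.crash_and_restart()
assert b.disk == ["order-1"]        # persistent message survived the restart
assert b.memory == []               # non-persistent message was lost
```

This is the sense of "guaranteed delivery" above: once the broker accepts a persistent message onto disk, a crash does not lose it, and it will still be delivered when everything comes back up.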
Durability — Kafka does not track which messages were read by each consumer. Kafka keeps all messages for a finite amount of time, and it is consumers' responsibility to track their location per topic, i.e. offsets.
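To make the retention point concrete, here is a plain-Python sketch (illustrative names only, not Kafka's API) of a log that retires old records after a fixed window regardless of consumer progress:

```python
class RetentionLog:
    """Sketch of a Kafka-style log with a fixed retention window."""
    def __init__(self, retention=5):
        self.records = {}        # absolute offset -> record
        self.next_offset = 0
        self.retention = retention

    def append(self, record):
        self.records[self.next_offset] = record
        self.next_offset += 1
        # enforce retention: drop records older than the window,
        # whether or not any consumer has read them
        cutoff = self.next_offset - self.retention
        for off in [o for o in self.records if o < cutoff]:
            del self.records[off]

    def poll(self, offset):
        # a consumer reads from its own saved offset; retired offsets
        # are silently skipped
        return [self.records[o]
                for o in range(offset, self.next_offset)
                if o in self.records]


log = RetentionLog(retention=3)
for i in range(6):
    log.append(f"m{i}")

assert log.poll(5) == ["m5"]                  # a consumer that kept up
assert log.poll(0) == ["m3", "m4", "m5"]      # a lagging consumer: m0-m2 are gone
```

A consumer that resumes within the retention window loses nothing; one that stays away longer than the window silently misses the retired records, which is the sense in which delivery is not guaranteed.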
I had noticed this too. I couldn't decide if this was a negative or a positive. The Kafka documentation presents it as a benefit: consumers are in control of their position in the log, and (since messages can be retained for a configurable, even unlimited, amount of time) they will never "miss" a message should they go down. This is part of Kafka's mindset of "dumb pipes and smart consumers."
Others see this as an issue. Rabbit and Active are both "smart pipes with dumb consumers" - meaning the pipes track the messages, so the consumers just consume. BUT, if a consumer fails too many times, or the MQ service fails, lots of messages can be lost.
Do you have any opinion on this idea? I really don't know which way to fall :)