Using this architectural solution, the microservice, that writes on Mongo, queues the correct changes on Kafka. MongoStream2Kafka has several advantages:
It’s a simple and easy to use component since it can be efficient in limited budget and time situations.
It’s the perfect solution for you if you are using CRUD since MongoStream2Kafka allows to execute operations at each data modification.
Be Careful: in the following situations, MongoStream2Kafka is not the optimal component to be used
When MongoChangeStream is not used in a cluster configuration: MongoChangeStream use the oplog internal collection and it is used by Mongo only for updating the replicas. If your Mongo cluster is not configured as clustered, you cannot use MongoChangeStream and MongoChange2Kafka.
When you want to scale at MongoDB side: MongoChangeStream is not partitioned/sharded and, so, it does not scale at MongoDB side.
To solve this issue, two solutions are possible:
- You can deploy multiple MongoStream2Kafka instances by setting a query to the MongoChangeStream. In this way you are partitioning the event, but your application does not scale up if a spike of traffic occurs. Moreover ,the partition key has to be a good one: a spike cannot follow your key since one partition is used more than other.
- You can shard your cluster. But be aware from sharding your cluster, you cannot revert this choice!
In the situations in which MongoStream2Kafka is not the most suitable component to be used, you can ask to our specialists for architectural advices about other solutions.
In any case, MongoStream2Kafka can be exposed to limitations, that sometimes can be solved by some workarounds:
Delete event sends only “_id”: since the full document is lost and you cannot access to its properties, the Delete event cannot send other information than “_id”.
Kafka events are not sent to topic in the right order: the resumetoken is used as kafka message key and it is an unique identifier used by Mongo for tracking the stream cursor.
To solve this issue, you can choose to create topic with only one partition to allow messages to arrive in the correct order, but you may occur in performance problem at consumer-side.
Changes are not atomic: MongoChangeStream sends you the last document version. So if multiple changes occur on the same document, Mongo may choose to send you only a change that contains only last document version since. In this case you loose the granularity of “each change”.
Lost event on spike traffic: the oplog collection is a capped one and, so, it is limited in time and space. On spike traffic, this collection may contains too many changes, so mongo may decides to trash some old changes. If MongoChange2Kafka de-queues slowly, some changes will be lost.
To solve this issue, you can increase the oplog window in order to avoid that Mongo trashes any old change.
- Kafka authentication errors: if the kafka producer is not able to successfully authenticate, it will keep retrying. The service will not go down, but it will not work until the authentication is successful.
To solve this issue, you can set an alarm on the logs of the service. The expected error log in case of authentication failure is:
MongoChangeStream is built to be resumable: each change stream response includes a resume token, that represents a specific operation on the Oplog collection. This token is persisted by Mongo2Kafka and whenever the connection between the service and the database is lost or paused, the service will send the last resume token it received so that the change stream can pick up right where the application left off.
However, it may happen that the Oplog grows out a saved resume token causing and error when the service tries to pick up the watching operations.
Mongo2Kafka implements a recovery strategy for this event. When an error related to an invalid resume token occurs, the
service will delete the token and automatically restart watching the relevant collection. By default, the service handles
2 errors of this kind, after which the process is stopped. You can configure the maximum number of tolerated errors
MAX_ALLOWED_TOKEN_ERRORS environment variable.
Please, note that even with the recovery strategy, in the event of an invalid resume token some data may be lost.