If you haven’t read the post describing the feature’s highlights, you
should, but today I’d like to focus on how the <batch:commit> block
interacts with Anypoint™ Connectors and, more specifically, how you can leverage your own connectors to take advantage of the feature.
<batch:commit> overview
In a nutshell, you can use a Batch Commit block to collect a subset of records for bulk upsert to an external source or service. For
example, rather than upserting each individual contact (i.e. record) to Google Contacts,
you can configure a Batch Commit to collect, let’s say, 100 records, and
then upsert all of them to Google Contacts in one chunk. Within a batch
step – the only place you can apply it – you can use a Batch Commit to
wrap an outbound message processor. See the example below:
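Something along these lines, sketched in Mule XML; the Google Contacts operation and config names are illustrative assumptions:

```xml
<!-- Sketch: a batch step whose commit block collects 100 records
     and upserts them in one chunk. The google-contacts operation
     and config-ref are illustrative; check the connector's docs. -->
<batch:job name="syncContactsJob">
    <batch:process-records>
        <batch:step name="upsertContactsStep">
            <batch:commit size="100">
                <google-contacts:batch-contacts config-ref="GoogleContacts"/>
            </batch:commit>
        </batch:step>
    </batch:process-records>
</batch:job>
```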


Mixing in Connectors
This is all great, but what do connectors have to do with this? Well, the only reason why the example above makes any sense at all is that
the Google Connector
is capable of doing bulk operations. If the connector only supported
updating records one at a time, then there would be no reason for
<batch:commit> to exist.
But wait! The batch module was only released two months ago, yet connectors like Google Contacts, Salesforce, NetSuite, etc. have had bulk operations for years! True that. But what we didn’t have until Batch came along was a construct allowing us to do record-level error handling.
Suppose you’re upserting 200 records into Salesforce. In the past,
if 100 of them failed and the other 100 were successful, it was up to
you to parse the connector response, tell the failed records apart from
the successful ones, and take appropriate action. If you wanted to do the same with
Google Contacts, you found yourself doing all of that work again, with the extra complexity that you couldn’t reuse your code
because the Google and Salesforce APIs use completely different representations to report an operation’s result.
Our goal with the batch module is clear: make this stuff simple. We no longer want you struggling to figure out each API’s
representation for a bulk result and handling each failed record
independently – from now on, you can rely on <batch:commit> to do
that for you automatically.
It’s not magic
“A Kind of Magic” is one of my favorite songs by Queen, especially
the live performance at Wembley Stadium in ’86. Although the magic
described in that song doesn’t apply to the batch and connector
mechanics, there’s one phrase in it that accurately describes
the problem here: “There can be only one”.
If we want the Batch module to understand every flavor of bulk
operation result, we need to start by defining a canonical way of
representing them. We did so in a class called BulkOperationResult, which
defines the following contract:
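Here’s a sketch of that contract, reconstructed from the points below; the real class ships with the batch module, so exact member names may differ:

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Sketch of the canonical bulk result contract. BulkOperationResult is
 * the master: it aggregates one BulkItem per record, in order.
 */
public final class BulkOperationResult<T> implements Serializable
{
    private final boolean successful;      // true only if every record succeeded
    private final List<BulkItem<T>> items; // one detail entry per record, in the original order

    private BulkOperationResult(boolean successful, List<BulkItem<T>> items)
    {
        this.successful = successful;
        // defensive, unmodifiable copy keeps the class immutable
        this.items = Collections.unmodifiableList(new ArrayList<BulkItem<T>>(items));
    }

    public boolean isSuccessful()
    {
        return successful;
    }

    public List<BulkItem<T>> getItems()
    {
        return items;
    }

    public static <T> Builder<T> builder()
    {
        return new Builder<T>();
    }

    public static final class Builder<T>
    {
        private final List<BulkItem<T>> items = new ArrayList<BulkItem<T>>();

        public Builder<T> addItem(BulkItem<T> item)
        {
            items.add(item);
            return this;
        }

        public BulkOperationResult<T> build()
        {
            // the whole operation is successful only if every item is
            boolean successful = true;
            for (BulkItem<T> item : items)
            {
                successful &= item.isSuccessful();
            }
            return new BulkOperationResult<T>(successful, items);
        }
    }
}
```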
- BulkOperationResult represents the operation as a whole, playing the role of the master
- BulkItem represents the result for each individual record, playing the role of the detail
- Both classes are immutable
- There’s an ordering relationship between the master and the detail.
The first item in the BulkItem list has to correspond to the first
record in the original bulk. The second has to correspond to the second
one, and so forth.
In case you’re curious, this is what BulkItem’s contract looks like:
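Again, a sketch along the lines of the description above rather than the literal class:

```java
import java.io.Serializable;

/**
 * Sketch of the per-record detail contract. One BulkItem exists for each
 * record in the bulk, carrying that record's individual outcome.
 */
public final class BulkItem<T> implements Serializable
{
    private final Serializable recordId; // the record's id, if the API reports one
    private final T payload;             // the record as it was sent in the bulk
    private final boolean successful;    // did this particular record make it?
    private final String message;        // status or error message from the API
    private final Exception exception;   // the failure cause, if any

    private BulkItem(Serializable recordId, T payload, boolean successful,
                     String message, Exception exception)
    {
        this.recordId = recordId;
        this.payload = payload;
        this.successful = successful;
        this.message = message;
        this.exception = exception;
    }

    public Serializable getRecordId() { return recordId; }
    public T getPayload() { return payload; }
    public boolean isSuccessful() { return successful; }
    public String getMessage() { return message; }
    public Exception getException() { return exception; }

    public static <T> Builder<T> builder() { return new Builder<T>(); }

    public static final class Builder<T>
    {
        private Serializable recordId;
        private T payload;
        private boolean successful = true;
        private String message;
        private Exception exception;

        public Builder<T> setRecordId(Serializable recordId) { this.recordId = recordId; return this; }
        public Builder<T> setPayload(T payload) { this.payload = payload; return this; }
        public Builder<T> setSuccessful(boolean successful) { this.successful = successful; return this; }
        public Builder<T> setMessage(String message) { this.message = message; return this; }
        public Builder<T> setException(Exception exception) { this.exception = exception; return this; }

        public BulkItem<T> build()
        {
            return new BulkItem<T>(recordId, payload, successful, message, exception);
        }
    }
}
```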
So, that’s it? We just modify all connectors to return a
BulkOperationResult object on all bulk operations and we’re done? Not
quite. That would be the recommended practice for new connectors moving
forward, but for existing connectors it would break backwards
compatibility with any Mule application written before the release of the Batch module that manually handles the output of bulk operations.
What we did in those cases is have the connectors register a
Transformer. Since it’s each connector’s responsibility to understand
its API’s domain, it also makes sense to ask each connector to
translate its own bulk operation representation into a
BulkOperationResult object.
Let’s see an example. This is the signature for an operation in the Google Contacts connector which performs a bulk operation:
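It looked roughly like this (a DevKit-style @Processor method; the operation and parameter names are assumptions for illustration):

```java
// Sketch of a DevKit bulk operation; what matters is the return type.
@Processor
public List<BatchResult> batchContacts(String batchId, List<NestedProcessor> operations) throws Exception
```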
Let’s forget about the implementation of the method for now. The
takeaway from the above snippet is that the operation returns a
List of BatchResult objects. Let’s see how to register a transformer
that goes from that to a BulkOperationResult:
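One way to do that from within the connector is a lifecycle hook that drops the transformer into the registry (a sketch; it assumes a DevKit @Start method and access to the MuleContext):

```java
@Start
public void init() throws MuleException
{
    // make the transformer available so the batch module can discover it
    muleContext.getRegistry().registerTransformer(new BatchResultToBulkOperationTransformer());
}
```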
And for the big finale, the code of the transformer itself:
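The sketch below shows the shape such a transformer can take. The accessors on BatchResult (and its import) are assumptions for illustration; map whatever your API actually reports:

```java
import java.util.List;

import org.mule.api.transformer.TransformerException;
import org.mule.transformer.AbstractDiscoverableTransformer;
import org.mule.transformer.types.DataTypeFactory;

public class BatchResultToBulkOperationTransformer extends AbstractDiscoverableTransformer
{
    public BatchResultToBulkOperationTransformer()
    {
        // declare the source and target data types so the lookup can match us
        registerSourceType(DataTypeFactory.create(List.class));
        setReturnDataType(DataTypeFactory.create(BulkOperationResult.class));
    }

    @Override
    @SuppressWarnings("unchecked")
    protected Object doTransform(Object src, String encoding) throws TransformerException
    {
        // BatchResult is the Google API's own bulk result representation
        List<BatchResult> results = (List<BatchResult>) src;
        BulkOperationResult.Builder<BatchResult> builder = BulkOperationResult.builder();

        if (results != null)
        {
            for (BatchResult result : results)
            {
                // accessor names on BatchResult are illustrative assumptions
                builder.addItem(BulkItem.<BatchResult>builder()
                    .setRecordId(result.getId())
                    .setPayload(result)
                    .setSuccessful(result.getStatus().getCode() == 200)
                    .setMessage(result.getStatus().getReason())
                    .build());
            }
        }

        return builder.build();
    }
}
```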
Important things to notice about the above transformer:
- It extends AbstractDiscoverableTransformer. This is so that the batch module can dynamically find it at runtime.
- It defines the source and target data types in its constructor
- The doTransform() method does “the magic”
- Notice how BulkOperationResult and BulkItem classes provide
convenient Builder objects to decouple their inner representations from
your connector’s code
And that’s pretty much it! One last consideration: what
happens if I use a bulk operation in a <batch:commit> with a
connector that doesn’t support reporting a BulkOperationResult? Well, in
that case you have two options:
- Write the transformer and register it yourself at an application level (see the sketch after this list)
- Just let it be; in case of an exception, batch will fail all of the records in the commit block alike
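For the first option, a plain declaration in your application’s config is enough to get the transformer into the registry (a sketch; the class name is hypothetical):

```xml
<!-- Registers your own discoverable transformer at application level.
     The class name is hypothetical; point it at your implementation. -->
<custom-transformer name="batchResultToBulkOperationResult"
                    class="org.acme.transformer.BatchResultToBulkOperationTransformer"/>
```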
Wrapping it up
In this article we discussed why it’s important for connectors to
support bulk operations whenever possible (some APIs just can’t do it;
that’s not your fault). For new connectors, we advise always returning
instances of the canonical BulkOperationResult class. If you want to add
batch support to an existing connector without breaking backwards
compatibility, we covered how to register discoverable transformers to
do the trick.
As always, I hope you enjoyed the read, and since we spoke about
magic so much, this time I’ll say farewell with some music. Enjoy!