If you haven’t read the post describing the feature’s highlights, you
should, but today I’d like to focus on how the <batch:commit> block
interacts with Anypoint™ Connectors and, more specifically, how you can leverage your own connectors to take advantage of the feature.
<batch:commit> overview
In a nutshell, you can use a Batch Commit block to collect a subset of records for bulk upsert to an external source or service. For
example, rather than upserting each individual contact (i.e. record) to Google Contacts,
you can configure a Batch Commit to collect, let’s say, 100 records, and
then upsert all of them to Google Contacts in one chunk. Within a batch
step – the only place you can apply it – you can use a Batch Commit to
wrap an outbound message processor. See the example below:
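Something along these lines, sketched in Mule XML; the Google Contacts operation and config names are illustrative assumptions:

```xml
<!-- Sketch: a batch step whose commit block collects 100 records
     and upserts them in one chunk. The google-contacts operation
     and config-ref are illustrative; check the connector's docs. -->
<batch:job name="syncContactsJob">
    <batch:process-records>
        <batch:step name="upsertContactsStep">
            <batch:commit size="100">
                <google-contacts:batch-contacts config-ref="GoogleContacts"/>
            </batch:commit>
        </batch:step>
    </batch:process-records>
</batch:job>
```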


Mixing in Connectors
This is all great, but what do connectors have to do with this? Well, the only reason why the example above makes any sense at all is that
the Google Connector
is capable of doing bulk operations. If the connector only supported
updating records one at a time, then there would be no reason for
<batch:commit> to exist.
But wait! The batch module was only released two months ago, yet connectors like Google Contacts, Salesforce, NetSuite, etc. have had bulk operations for years! True that. But what we didn’t have until Batch came along was a construct allowing us to do record-level error handling.
Suppose you’re upserting 200 records into Salesforce. In the past,
if 100 of them failed and the other 100 were successful, it was up to
you to parse the connector response, tell the failed records apart from
the successful ones, and take appropriate action. If you wanted to do the same with
Google Contacts, you found yourself doing all of that work again, with the extra complexity that you couldn’t reuse your code
because the Google and Salesforce APIs use completely different representations to report an operation’s result.
Our goal with the batch module is clear: make this stuff simple. We no longer want you struggling to figure out each API’s
representation for a bulk result and handling each failed record
independently – from now on, you can rely on <batch:commit> to do
that for you automatically.
It’s not magic
“A Kind of Magic” is one of my favorite songs by Queen, especially
the live performance at Wembley Stadium in ’86. Although the magic
described in that song doesn’t apply to the batch and connector
mechanics, there’s one phrase in it that accurately describes
the problem here: “There can be only one”.
If we want the Batch module to understand every flavor of bulk
operation result, we need to start by defining a canonical way of
representing them. We did so in a class called BulkOperationResult, which
defines the following contract:
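Here’s a sketch of that contract, reconstructed from the points below; the real class ships with the batch module, so exact member names may differ:

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Sketch of the canonical bulk result contract. BulkOperationResult is
 * the master: it aggregates one BulkItem per record, in order.
 */
public final class BulkOperationResult<T> implements Serializable
{
    private final boolean successful;      // true only if every record succeeded
    private final List<BulkItem<T>> items; // one detail entry per record, in the original order

    private BulkOperationResult(boolean successful, List<BulkItem<T>> items)
    {
        this.successful = successful;
        // defensive, unmodifiable copy keeps the class immutable
        this.items = Collections.unmodifiableList(new ArrayList<BulkItem<T>>(items));
    }

    public boolean isSuccessful()
    {
        return successful;
    }

    public List<BulkItem<T>> getItems()
    {
        return items;
    }

    public static <T> Builder<T> builder()
    {
        return new Builder<T>();
    }

    public static final class Builder<T>
    {
        private final List<BulkItem<T>> items = new ArrayList<BulkItem<T>>();

        public Builder<T> addItem(BulkItem<T> item)
        {
            items.add(item);
            return this;
        }

        public BulkOperationResult<T> build()
        {
            // the whole operation is successful only if every item is
            boolean successful = true;
            for (BulkItem<T> item : items)
            {
                successful &= item.isSuccessful();
            }
            return new BulkOperationResult<T>(successful, items);
        }
    }
}
```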
- BulkOperationResult represents the operation as a whole, playing the role of the master
- BulkItem represents the result for each individual record, playing the role of the detail
- Both classes are immutable
- There’s an ordering relationship between the master and the detail.
The first item in the BulkItem list has to correspond to the first
record in the original bulk. The second has to correspond to the second
one, and so forth.
In case you’re curious, this is what BulkItem’s contract looks like:
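Again, a sketch along the lines of the description above rather than the literal class:

```java
import java.io.Serializable;

/**
 * Sketch of the per-record detail contract. One BulkItem exists for each
 * record in the bulk, carrying that record's individual outcome.
 */
public final class BulkItem<T> implements Serializable
{
    private final Serializable recordId; // the record's id, if the API reports one
    private final T payload;             // the record as it was sent in the bulk
    private final boolean successful;    // did this particular record make it?
    private final String message;        // status or error message from the API
    private final Exception exception;   // the failure cause, if any

    private BulkItem(Serializable recordId, T payload, boolean successful,
                     String message, Exception exception)
    {
        this.recordId = recordId;
        this.payload = payload;
        this.successful = successful;
        this.message = message;
        this.exception = exception;
    }

    public Serializable getRecordId() { return recordId; }
    public T getPayload() { return payload; }
    public boolean isSuccessful() { return successful; }
    public String getMessage() { return message; }
    public Exception getException() { return exception; }

    public static <T> Builder<T> builder() { return new Builder<T>(); }

    public static final class Builder<T>
    {
        private Serializable recordId;
        private T payload;
        private boolean successful = true;
        private String message;
        private Exception exception;

        public Builder<T> setRecordId(Serializable recordId) { this.recordId = recordId; return this; }
        public Builder<T> setPayload(T payload) { this.payload = payload; return this; }
        public Builder<T> setSuccessful(boolean successful) { this.successful = successful; return this; }
        public Builder<T> setMessage(String message) { this.message = message; return this; }
        public Builder<T> setException(Exception exception) { this.exception = exception; return this; }

        public BulkItem<T> build()
        {
            return new BulkItem<T>(recordId, payload, successful, message, exception);
        }
    }
}
```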
So, that’s it? We just modify all connectors to return a
BulkOperationResult object on all bulk operations and we’re done? Not
quite. That would be the recommended practice for new connectors moving
forward, but for existing connectors it would break backwards
compatibility with any Mule application written before the release of the Batch module that manually handles the output of bulk operations.
What we did in those cases is have the connectors register a
Transformer. Since it’s each connector’s responsibility to understand
its API’s domain, it also makes sense to ask each connector to
translate its own bulk operation representation into a
BulkOperationResult object.
Let’s see an example. This is the signature for an operation in the Google Contacts connector which performs a bulk operation:
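It looked roughly like this (a DevKit-style @Processor method; the operation and parameter names are assumptions for illustration):

```java
// Sketch of a DevKit bulk operation; what matters is the return type.
@Processor
public List<BatchResult> batchContacts(String batchId, List<NestedProcessor> operations) throws Exception
```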
Let’s forget about the implementation of the method for now. The
takeaway from the above snippet is that the operation returns a
List of BatchResult objects. Let’s see how to register a transformer
that goes from that to a BulkOperationResult:
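One way to do that from within the connector is a lifecycle hook that drops the transformer into the registry (a sketch; it assumes a DevKit @Start method and access to the MuleContext):

```java
@Start
public void init() throws MuleException
{
    // make the transformer available so the batch module can discover it
    muleContext.getRegistry().registerTransformer(new BatchResultToBulkOperationTransformer());
}
```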
And for the big finale, the code of the transformer itself:
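The sketch below shows the shape such a transformer can take. The accessors on BatchResult (and its import) are assumptions for illustration; map whatever your API actually reports:

```java
import java.util.List;

import org.mule.api.transformer.TransformerException;
import org.mule.transformer.AbstractDiscoverableTransformer;
import org.mule.transformer.types.DataTypeFactory;

public class BatchResultToBulkOperationTransformer extends AbstractDiscoverableTransformer
{
    public BatchResultToBulkOperationTransformer()
    {
        // declare the source and target data types so the lookup can match us
        registerSourceType(DataTypeFactory.create(List.class));
        setReturnDataType(DataTypeFactory.create(BulkOperationResult.class));
    }

    @Override
    @SuppressWarnings("unchecked")
    protected Object doTransform(Object src, String encoding) throws TransformerException
    {
        // BatchResult is the Google API's own bulk result representation
        List<BatchResult> results = (List<BatchResult>) src;
        BulkOperationResult.Builder<BatchResult> builder = BulkOperationResult.builder();

        if (results != null)
        {
            for (BatchResult result : results)
            {
                // accessor names on BatchResult are illustrative assumptions
                builder.addItem(BulkItem.<BatchResult>builder()
                    .setRecordId(result.getId())
                    .setPayload(result)
                    .setSuccessful(result.getStatus().getCode() == 200)
                    .setMessage(result.getStatus().getReason())
                    .build());
            }
        }

        return builder.build();
    }
}
```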
Important things to notice about the above transformer:
- It extends AbstractDiscoverableTransformer. This is so that the batch module can dynamically find it at runtime.
- It defines the source and target data types in its constructor
- The doTransform() method does “the magic”
- Notice how BulkOperationResult and BulkItem classes provide
convenient Builder objects to decouple their inner representations from
your connector’s code
And that’s pretty much it! One last consideration: what
happens if I use a bulk operation in a <batch:commit> with a
connector that doesn’t support reporting a BulkOperationResult? Well, in
that case you have two options:
- Write the transformer and register it yourself at an application level (see the sketch after this list)
- Just let it be; in case of an exception, batch will fail all of the records in the commit block alike
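For the first option, a plain declaration in your application’s config is enough to get the transformer into the registry (a sketch; the class name is hypothetical):

```xml
<!-- Registers your own discoverable transformer at application level.
     The class name is hypothetical; point it at your implementation. -->
<custom-transformer name="batchResultToBulkOperationResult"
                    class="org.acme.transformer.BatchResultToBulkOperationTransformer"/>
```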
Wrapping it up
In this article we discussed why it’s important for connectors to
support bulk operations whenever possible (some APIs just can’t do it;
that’s not your fault). For new connectors, we advise always returning
instances of the canonical BulkOperationResult class. If you want to add
batch support to an existing connector without breaking backwards
compatibility, we covered how to register discoverable transformers to
do the trick.
As always, I hope you enjoyed the read, and since we spoke about
magic so much, this time I’ll say farewell with some music. Enjoy!