Salesforce Apex – How to Parse YAML Object

I receive a YAML string in the response from a REST service in Apex code, and I'm looking for a way to parse it and convert it to a collection of Salesforce objects.

For instance, I receive YAML like this as a string:

---
item:
  - 
    title: A Grief Observed
    author: C. S. Lewis
    author_by_last: Lewis, C. S.
    isbn: "0060652381"
    publisher: ZOND
    on_hand: "5"
    in_pub_date: 2001-01-01
  - 
    title: "Grief Sanctified: From Sorrow to Eternal Hope: Including Richard Baxter's Timeles"
    author: J. I. Packer
    author_by_last: Packer, J. I.
    isbn: "1581344406"
    publisher: CROSS
    on_hand: "5"
    in_pub_date: 2002-09-01

I'd like to convert it to a list of Book objects. When I worked with JSON, I used the JSON class, which provides deserialization functionality. Does Salesforce provide something similar for YAML processing?

Best Answer

No. At the time of writing (API v41.0), Salesforce has no built-in functionality for creating or parsing YAML. That means you'd need to write your own parser (or port an existing one from another language, such as Java, to Apex).

If you aren't looking for a general-purpose parser (something that can handle any valid YAML) and you don't expect the format of the data you're receiving to change, writing a domain-specific parser shouldn't be too much work.

When I find myself working with XML, I like to break the incoming schema into individual Apex classes and build a parsing method into each one. Doing so keeps the parser manageable. A similar approach can be taken with YAML.

// Yes, the need for this BookCollection object is debatable (it's mainly just storing
//   a List).
// Encapsulating the parsing makes it worth being made into a class (in my mind).
public class BookCollection{
    // The class variables for each level mimic the data stored on each level
    //   of the schema of your incoming data.
    // This will become more apparent later.
    public List<Book> item;

    public BookCollection(String input){
        item = new List<Book>();

        // At this level, all we're concerned about is finding the individual books.
        // Once we find a book, we pass it down to the next level of parsing (and
        //   add the result to our list)
        // YAML uses whitespace to denote structure, so we need to take that into
        //   account when splitting.
        // The regex here looks for two spaces, a hyphen, one space, a newline.
        // Everything after that (up to the next '  - \n' or EOF) is book data.

        // String.split() will return 'item:' as the first part.
        // That isn't part of the data for a book, so we'll want to remove that.
        List<String> bookStringsList = input.split('  - \n');
        bookStringsList.remove(0);

        for(String bookString :bookStringsList){
            Book currentBook = new Book(bookString);
            item.add(currentBook);
        }
    }
}

public class Book{
    // Now it should be more apparent that we're mimicking the structure of the
    //   incoming data.
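    // Apex members are private by default, so these are declared public to let
    //   calling code read the parsed values.
    public String title;
    public String author;
    public String author_by_last;
    public String isbn;
    public String publisher;
    public Integer on_hand;
    public Date in_pub_date;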

    public Book(String input){
        // On this level of parsing, we have actual data to work with.
        // Our job here is to find all of the key:value pairs, and cast them to
        //   their appropriate types.
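        for(String keyAndValue :input.split('\n')){
            // Skip blank lines and anything that isn't a key: value pair.
            if(String.isBlank(keyAndValue) || !keyAndValue.contains(':')){
                continue;
            }

            // Split on the first colon only; values (like titles) can contain colons.
            List<String> kvSplit = keyAndValue.split(':', 2);
            // Trim away the leading indentation and the space after the colon.
            String key = kvSplit[0].trim();
            // Double quotes are likely to mess things up, so remove them.
            String value = kvSplit[1].replace('"', '').trim();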

            // There's probably a more elegant way to handle this than a big ol'
            //   if/else if chain...but this'll work
            if(key == 'title'){
                this.title = value;
            } else if(key == 'author'){
                this.author = value;
            } else if(key == 'author_by_last'){
                this.author_by_last = value;
            } else if(key == 'isbn'){
                this.isbn = value;
            } else if(key == 'publisher'){
                this.publisher = value;
            } else if(key == 'on_hand'){
                // String -> Integer is pretty easy, we can use Integer.valueOf()
                this.on_hand = Integer.valueOf(value);
            } else if(key == 'in_pub_date'){
                // Dates are a bit tricky.
                // Date.valueOf() expects 'yyyy-MM-dd HH:mm:ss' in the local time zone,
                //   and Date.parse() expects the format used in your locale.
                // Given our data, it's easiest to simply generate a new date instance.
                // Date.newInstance() takes Integers, so convert each part first.
                List<String> dateParts = value.split('-');

                this.in_pub_date = Date.newInstance(
                    Integer.valueOf(dateParts[0]),
                    Integer.valueOf(dateParts[1]),
                    Integer.valueOf(dateParts[2])
                );
            }
        }
    }
}
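
A quick way to sanity-check the splitting logic is a small Apex unit test. The sketch below is only an illustration: the class name BookCollectionTest and the method name are made up, the sample string reuses a few values from the question, and it checks nothing beyond the number of books found.

```apex
@isTest
private class BookCollectionTest {
    @isTest
    static void findsEachBookInTheDocument() {
        // A single document (the leading '---' already stripped), with two books.
        String yamlDocument =
            'item:\n' +
            '  - \n' +
            '    title: A Grief Observed\n' +
            '    isbn: "0060652381"\n' +
            '  - \n' +
            '    author: J. I. Packer\n' +
            '    isbn: "1581344406"\n';

        BookCollection books = new BookCollection(yamlDocument);

        // One Book should be created per '  - ' entry under 'item:'.
        System.assertEquals(2, books.item.size());
    }
}
```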

Using those classes is pretty simple. It does, however, require some additional setup.

// Your YAML, from some source
String myYAML = blackBox.getData();

// Break up your YAML's documents before attempting to parse each one.
List<String> documents = myYAML.split('---\n');

// The first result string will likely always be empty, so we can remove that.
documents.remove(0);

// This is the simple case where we know we're only dealing with a single document.
// If you had multiple documents in a single YAML string, you would (hopefully)
//   be able to tell which type of document you were working with (and you'd
//   need additional logic to determine which parser class to send the data to).
BookCollection myBooks = new BookCollection(documents[0]);
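
If a single response could contain several documents, the routing described in the comments above might look roughly like the following. This is a sketch under one assumption: that each document type can be recognized by its top-level key ('item:' for a list of books). YamlDispatchSketch and parseAll are illustrative names, not part of any Salesforce API.

```apex
public class YamlDispatchSketch {
    // Hypothetical helper: returns one BookCollection per 'item:' document found.
    public static List<BookCollection> parseAll(String yaml) {
        List<BookCollection> collections = new List<BookCollection>();

        for (String doc : yaml.split('---\n')) {
            // Skip the empty string that precedes the first '---' marker.
            if (String.isBlank(doc)) {
                continue;
            }

            // Assumption: the top-level key identifies the document type.
            if (doc.trim().startsWith('item:')) {
                collections.add(new BookCollection(doc));
            }
            // Other document types would be routed to their own parser classes here.
        }

        return collections;
    }
}
```

Calling YamlDispatchSketch.parseAll(myYAML) would then replace the manual split and remove(0) steps for the multi-document case.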