[SalesForce] How to query for the first matching related records

Given a list of master record Ids, I would like to get the first related child record for each master record. Assume that I can't control the initial inputs to simply include the child records in the first place. Is there a way to do this in a single SOQL query?

(This is, in essence, the bulkified version of this question: get First contact showed in a related list)

Here are the options I already know about:

1) Sub-query on the parent object:

List<Id> masterIds = fromSomewhereElse();

// Re-query for the masters, including the first matching child
List<Master__c> masters = [
    SELECT Id, 
        (SELECT Id, Master__c, LastModifiedDate 
         FROM Children__r
         ORDER BY LastModifiedDate DESC
         LIMIT 1)
    FROM Master__c
    WHERE Id IN :masterIds];

// Iterate through and extract the children
// Yes - this is contrived; doing it for the sake of the example
List<Child__c> children = new List<Child__c>();
for (Master__c master : masters) {
    if (!master.Children__r.isEmpty()) {
       children.add(master.Children__r.get(0));
    }
}

Pro: Easy to get just one child record for each parent record.

Con: You have to re-query for the master records, which counts as one extra SOQL query plus twice as many rows returned.

2) Query for all child objects, and then choose first to keep

List<Id> masterIds = fromSomewhereElse();

// Query for all of the masters' child records
List<Child__c> allChildren = [
    SELECT Id, Master__c, LastModifiedDate
    FROM Child__c
    WHERE Master__c IN :masterIds
    ORDER BY Master__c DESC, LastModifiedDate DESC];

// Iterate through and save the first child for each master
List<Child__c> children = new List<Child__c>();
Id currentMasterId;
for (Child__c child : allChildren) {
    if (child.Master__c != currentMasterId) {
       currentMasterId = child.Master__c;
       children.add(child);
    }
}

Pro: Counts as a single SOQL query
Con: There are potentially LOTS of extra rows queried, and the time to check the limit in Apex.

What I'd Like

Either a syntax for doing this without the extra query or extra rows, or insights into how to choose which of these approaches makes sense in a given context.

Best Answer

You may be able to utilize GROUP BY to do this. Something like:

SELECT
    Master__c, MAX(LastModifiedDate)
FROM
    Child__c
WHERE
    Master__c IN :masterIds
ORDER BY
    Master__c DESC, LastModifiedDate DESC
GROUP BY
    Master__c

This will only return a single record per Master__c object. The only issue you may have is that you need to utilize aggregate functions which may cause you some issues depending on what data you want.

To give an example, I ran this query in my test org. My test org has 3 accounts with BillingState set to CA:

SELECT
    BillingState
FROM 
    Account
WHERE
    BillingState = 'CA'
GROUP BY
    BillingState

When this query is run, it only returns a single record since it grouped the results together into a single record. I did not need to use an aggregate function on BillingState because it was part of the GROUP BY. If it was not part of the GROUP BY, I would need to use an aggregate function on it. However, I wouldn't be able to pull out the Id of the account, because technically it is an aggregate of all 3 accounts. There is no aggregate function to get only a single Id from it.

If you need the Ids for your records, your first approach is best in my opinion.