[SalesForce] Efficiency when querying data with dynamic SOQL vs. formula fields

I am coding a search functionality where I have several parameters to search for (Account Name, Location Name, several kinds of dates, etc.

My question is, when filtering the data in the SOQL query string doing something like the following:

if(accountName != null) queryString += ' AND Location_Account_Name__c LIKE \'%' + accountName+ '%\' ';

Is it better (for robustness and performance) to filter by Location_Account_Name__c being this a formula that points to Site__r.Account__r.Name. or better to actually reference Site__r.Account__r.Name in the query string? no difference?

I would appreciate any insight on this, as the complexity of the query is growing quite fast…

Thanks!

Best Answer

Based on the documentation, the reference to the actual field Site__r.Account__r.Name looks the way to go but it is not clear-cut. (And with small numbers of records you probably won't see any difference.)

This is because (from Working with Very Large SOQL Queries):

primary keys (Id, Name and Owner fields), foreign keys (lookup or master-detail relationship fields), audit dates (such as LastModifiedDate), and custom fields marked as External ID or Unique

and (from Force.com SOQL Best Practices: Nulls and Formula Fields):

Force.com cannot index [formula] fields that: Reference other entities (i.e., fields accessible through lookup fields)

But there is this caveat (from Developing Selective Force.com Queries through the Query Resource Feedback Parameter Pilot):

Leading '%' wildcards are inefficient operators that also make filter conditions unselective. As a best practice, use SOQL or remove the leading '%'; although the Force.com query optimizer does consider using an index even if you do use this leading wildcard, it will choose to use that index based on the number of characters in the search string and the number of matches that result. For these reasons, it usually chooses to do a table scan. When you're querying millions of records, the performance that a selective index gives you can be significantly better.

An index allows a row to be found without the database having to search through every table row (a table scan) and can make a large difference to performance.

So it appears that by not using formula fields you open up the possibility that any pre-existing index is used and you can also request Salesforce support to add custom indexes and they will be used. But the wildcards you are using may cause those indexes to be ignored.

If you have an org with a representative amount of data in it, probably the best thing to do if this optimisation is important to you, is to create some unit tests and compare the results. As well as measuring the execution time you can get some basic information about the resulting query via the Developer Console Query Plan Tool (How To & FAQ).

Related Topic