My question originates from the following scenario:
I have a daily retrieval of data from Salesforce to a third party system which uses regular SOQL to retrieve the desired data. This nested query is pretty big and includes ~50 fields spanning over ~10 objects related to my Custom Object (lets call it CustomObject) which is my main Object (this has a Master-detail-relationship to the Case object, where one Case can have many MyCustomObject objects) for the SOQL query. The total count of CustomObject objects in my Salesforce org is atm ~500 000 records. Each day a small subset of these ~1000-2000 records gets tagged with a True value in a custom field "MyCustomCheckbox", which indicates that the third party system should pick these up in the daily data retrieval, and once processed the MyCustomCheckbox will be set to False.
What I am experiencing now is very slow response times and often that the query times out and will not be ran directly, but needs mutiple attempts before it goes through. As you can imagine this is causing issues and I want to change it to have a filter as selective as possible.
My current filter is as follows:
MyCustomCheckbox__c = TRUE AND Type__c IN ('Option1,Option2, Option3') AND MyCustomObject__r.Case__r.Country__c = 'US' AND MyCustomObject__c.Case__r.Valid_Account = TRUE
This gives me, using the Query Plan Tool, a Cost of ~0.67 with a Table Scan on the MyCustomObject Object.
Since this is a daily transfer and due to other business logic I know that I can now add a filter that says that we should only transfer the MyCustomObject objects where the Case has been changed in the last four days. This opens up for the possibility to extend the filter with the following:
AND MyCustomObject__r.Case__r.LastModifiedDate = LAST_N_DAYS:4
Now the query has a cost of ~0.25 while using the index on the SystemModstamp field and preforming an "Other" Leading operation type on the Case object. This appears to solve my problem at the moment but keeping scalability in mind I am wondering about my options using custom indexes.
I have been reading a lot about custom indexes and which fields are better to index than other etc. but I am a bit unclear about Checkboxes, I have not found any good discussions or reasoning around indexing Checkbox fields when you have a very big variety of the values in that field. In my scenario I would have ~2000 records daily out of then ~500 000 records which would have the desired TRUE value in that Checkbox that is used in the filter.
Is that a reason for making that field a custom index, would that in theory work better than if the distribution was more even (50/50 between TRUE/FALSE)?
Does anyone have any other experienes etc. around similar scenarios that would lead me towards some improved solution for this? Creating a formula text field instead and having that set to a text value, index that field, and use that in the filter isntead(?).
Creating custom indexes have sense when data have 1/3 uniqueness in the table.
If the distribution like in your case is more or less equal you will not help you. Probably when you contact with someone from Salesforce they told you the same.