[SalesForce] Why is the report taking forever to run

When working with a very large dataset of Accounts, e.g., 10's of millions of records, I wanted to create a report to return a subsets of records, i.e., records filtered by country.

I created a report A: on Accounts as follows (Country is an indexed field):
Filter Criteria: Country = XX
Created date = All Time
Show: All Accounts No Division filter.
Report columns: Account ID, OwnerID
Result: took 10 secs to run and returned 200K records.

Report B: Filter Criteria: Country = YY
Created date = All Time
Show: All Accounts No Division filter.
Report columns: Account ID, OwnerID
Result: Report timed out (although it is known that there are 5 million records for this country in the table)

Report C: Filter Criteria: Country = ZZ
Created date = All Time
Show: All Accounts No Division filter.
Report columns: Account ID, OwnerID
Result: Report timed out (although it is known that there are 2 million records for this country in the table)

Proposition: Reporting run times are logarythmically proportional to the number of records being returned BUT NOT proportional to the number of records being searched.

Is this a Best Practice?: Create reports that create the smallest set of records to be returned as possible (sounds obvious but requires a little bit of skill, knowing the indexed field to use that filters in (not out) the least records).

Additional Question: Is there a practical limit to the number of record being collated for returning in the report?

Best Answer

You can use things like Custom Indexes, Skinny Tables, etc to reduce the time it takes to run a report. There are some limits to the amount of data that can be returned so please look in the below guides for the exact numbers.

When dealing with Large Data Volumes (LDV) on the Salesforce Platform you should always ensure that your reports use "Selective Critiera". Otherwise they can time out.

Please see this cheatsheet for how to ensure you criteria is selective which has detailed numbers on what is selective:

https://help.salesforce.com/help/pdfs/en/salesforce_query_search_optimization_developer_cheatsheet.pdf

You should also look at the Force.com best practices for LDV data sets which is here: http://www.salesforce.com/docs/en/cce/ldv_deployments/salesforce_large_data_volumes_bp.pdf

Hope that helps.