[Salesforce] Concurrent requests limit exceeded – how to find the problematic code

I am a developer for a Salesforce org that serves as a community portal for thousands of users. We have been consulting with a large team of developers for years and everything we have is custom coded. Our pages are built using AngularJS to remotely call Apex code. We are on an Unlimited Edition org and we are nearing 90% of our total allowed code. We have a lot of callouts to external services and many times these callouts can run long.

We are now running into issues where users are seeing the "Unable to Process Request – Concurrent requests limit exceeded" error. The problem is that there is so much code, written by so many different developers, that it is hard to identify where the problematic code is. Sometimes users see this error when they are trying to perform a callout. Sometimes it's when they are inserting a record and triggers are firing. The long-running code can be anywhere in our org. On top of that, error handling has been done very poorly, so if any process does run long, we are most likely not looking for it or logging it.

Other than enhancing the logging on our callouts, I do not know where to even begin to find other problematic code. Every developer before me has written code with no regard for limits, so inefficient code can be anywhere. Is there a way, or a best practice, to analyze such a complex org and find where the problems may be? All I can think of is enhanced logging EVERYWHERE, but that may not be the best approach and would be very time-consuming, especially considering how many developers we have constantly writing and pushing new code.

I have tried to replicate the issue in a sandbox, but I cannot get it to reproduce. I wrote JavaScript to remotely call an Apex method that waits for 6 seconds. I called this remotely 11 times at once, so that 11 synchronous transactions ran concurrently, each longer than 5 seconds. However, instead of getting the concurrent limit error, each Apex transaction just errors out with a timeout.

Any help would be greatly appreciated. I am not familiar with the majority of the code/functionality of our org, so I do not know how to really solve this issue considering I don't even know what could be problematic aside from long running callouts. Thanks!

Best Answer

Your best bet is to enable governor limit warnings (the "Send Apex Warning Emails" checkbox on your user record). This will send you an email every time a transaction exceeds 50% of a governor limit, which should get you pretty close to nailing down where your problems are.

You can also engage support, which can check the logs to determine which of your requests are running the longest. This information should generally be enough to narrow down the possibilities.

A third option is to get your unit tests up to date, and run all your tests to see which ones take the longest. Even better, simulate callout latency with a spin loop that burns roughly the average API response time, and assert that governor limit usage stays below a threshold you choose.

Also, if you're trying to get the concurrent limits to show up, try setting up more than just 11 calls at once. There's a bit of wiggle room. Make it more like 100 calls simultaneously logged in as multiple users. I know it'd be time consuming to set up, but I feel you could get some decent logs this way.
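As a rough sketch of that kind of load test, the JavaScript below fires a batch of calls at once and tallies how each one ended, so the limit error can be told apart from plain timeouts. Everything named here is hypothetical: `fireConcurrent` and `makeFakeInvoke` are illustration only, `invoke` stands in for whatever function issues one remote Apex call on your page and returns a Promise, and the fake's cap is just a parameter that lets the sketch run outside Salesforce.

```javascript
// Fire `count` calls at once and summarise how each one ended.
// `invoke` is a hypothetical stand-in: any function that starts one
// remote call and returns a Promise for its result.
async function fireConcurrent(invoke, count) {
  const started = Date.now();
  const calls = Array.from({ length: count }, (_, index) => {
    const t0 = Date.now();
    // Convert both outcomes into plain records so one failure
    // does not hide the results of the other calls.
    return invoke(index).then(
      (value) => ({ index, ok: true, ms: Date.now() - t0, value }),
      (error) => ({
        index,
        ok: false,
        ms: Date.now() - t0,
        message: String((error && error.message) || error),
      })
    );
  });
  const results = await Promise.all(calls);

  // Group failures by message to separate limit errors from timeouts.
  const failuresByMessage = {};
  for (const r of results) {
    if (!r.ok) {
      failuresByMessage[r.message] = (failuresByMessage[r.message] || 0) + 1;
    }
  }
  return {
    total: count,
    succeeded: results.filter((r) => r.ok).length,
    failuresByMessage,
    slowestMs: Math.max(0, ...results.map((r) => r.ms)),
    elapsedMs: Date.now() - started,
    results,
  };
}

// Simulated backend for trying the sketch locally (an assumption, not
// Salesforce behaviour): each call takes `delayMs`, and any call made
// while `cap` calls are already in flight is rejected.
function makeFakeInvoke(cap, delayMs) {
  let inFlight = 0;
  return function fakeInvoke() {
    if (inFlight >= cap) {
      return Promise.reject(new Error('Concurrent requests limit exceeded'));
    }
    inFlight += 1;
    return new Promise((resolve) => {
      setTimeout(() => {
        inFlight -= 1;
        resolve('done');
      }, delayMs);
    });
  };
}

// Example run against the fake backend.
fireConcurrent(makeFakeInvoke(10, 50), 25).then((summary) => {
  console.log(summary.succeeded, summary.failuresByMessage);
});
```

On a real page you would pass your own call wrapper in place of `makeFakeInvoke(...)`. If the page uses Visualforce remoting, it may be worth checking whether calls made close together are being buffered into a single request; if so, setting `buffer: false` in the remoting call's configuration should send each one separately. Running the same script from several browser sessions logged in as different users would get closer to the multi-user scenario described above.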

Lastly, you could always start optimizing your code. It sounds like you have a lot of technical debt, and now is the time to pay it down. Traditionally, I've found that most code can usually be optimized by at least 50% by an experienced developer. If you don't have any resources like this, you could outsource it to a consultant. This is probably the most expensive option, but might be your only choice if the other options don't pan out.
