If you run the request that's hitting the limit with the Developer Console open, a log of the code that executes is generated. At the end of that (often huge) log is a list of the Salesforce governor limits and how close you came to each. Just above that list, toward the end of the log, you'll also find the exception that was thrown when you hit the limit.
Once you have the Developer Console open, click Debug -> Switch Perspective -> Log Only (Predefined). That will strip out some of the clutter you don't need right now (feel free to explore the other perspectives; they can be quite useful).
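If you want those figures inline while the code runs, rather than only in the summary at the end of the log, you can also log them yourself with the Limits class. A minimal sketch; where you drop the statements is up to you:

```apex
// Place these near the code you suspect; each line logs current consumption
// versus the per-transaction allowance.
System.debug('SOQL queries: ' + Limits.getQueries()  + ' / ' + Limits.getLimitQueries());
System.debug('DML rows:     ' + Limits.getDmlRows()  + ' / ' + Limits.getLimitDmlRows());
System.debug('CPU time ms:  ' + Limits.getCpuTime()  + ' / ' + Limits.getLimitCpuTime());
System.debug('Callouts:     ' + Limits.getCallouts() + ' / ' + Limits.getLimitCallouts());
System.debug('Heap bytes:   ' + Limits.getHeapSize() + ' / ' + Limits.getLimitHeapSize());
```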
Your best bet is to enable Governor Limit Warnings. This will send you an email every time a transaction exceeds 50% of a governor limit, which should get you pretty close to nailing down where your problems are.
You can also engage support, who can check the logs to determine which of your requests run the longest. That information should generally be enough to narrow down the possibilities.
A third option is to get your unit tests up to date and run them all to see which take the longest. Even better, simulate callout latency with spin loops matching your average API response times, and assert that governor limit consumption stays below a set threshold; see the sketch below.
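A minimal sketch of that idea, assuming a hypothetical MyService.doWork() entry point you want to profile; the 300 ms spin and the 80% thresholds are placeholders for your own numbers:

```apex
@IsTest
private class GovernorHeadroomTest {

    // Busy-wait for roughly the given number of milliseconds to stand in for
    // the wall-clock time an average callout takes.
    private static void spin(Integer ms) {
        Long stopAt = System.currentTimeMillis() + ms;
        while (System.currentTimeMillis() < stopAt) {
            // spin
        }
    }

    @IsTest
    static void workStaysWellUnderLimits() {
        MyService.doWork(); // hypothetical entry point under test
        spin(300);          // simulate ~300 ms of average API latency

        // Fail if the transaction consumed more than 80% of key limits.
        System.assert(Limits.getCpuTime() < Limits.getLimitCpuTime() * 0.8,
            'CPU time too close to the limit: ' + Limits.getCpuTime());
        System.assert(Limits.getQueries() < Limits.getLimitQueries() * 0.8,
            'SOQL queries too close to the limit: ' + Limits.getQueries());
        System.assert(Limits.getDmlStatements() < Limits.getLimitDmlStatements() * 0.8,
            'DML statements too close to the limit: ' + Limits.getDmlStatements());
    }
}
```

Note that a spin loop burns CPU time, unlike a real callout, so treat the CPU assertion as a rough ceiling rather than an exact measurement.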
Also, if you're trying to get the concurrent request limit to show up, try firing more than just 11 calls at once. There's a bit of wiggle room, so make it more like 100 simultaneous calls, logged in as multiple users. I know it would be time-consuming to set up, but I think you could get some decent logs this way; one way to approach it is sketched below.
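On the org side, one way to set that up is an Apex REST endpoint that deliberately runs longer than five seconds, which you then hit from roughly 100 parallel clients under different user sessions. A rough sketch; the URL mapping and the six-second busy-wait are just examples:

```apex
@RestResource(urlMapping='/slow-endpoint/*')
global with sharing class SlowEndpoint {

    // Busy-waits long enough to count as a long-running synchronous request
    // (over five seconds), so enough parallel calls should surface the
    // concurrent request limit in your logs.
    @HttpGet
    global static String doGet() {
        Long stopAt = System.currentTimeMillis() + 6000;
        while (System.currentTimeMillis() < stopAt) {
            // spin
        }
        return 'done';
    }
}
```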
Lastly, you could always start optimizing your code. It sounds like you have a lot of technical debt, and now is the time to pay it down. In my experience, most code can be optimized by at least 50% by an experienced developer. If you don't have anyone like that in house, you could bring in a consultant. This is probably the most expensive option, but it might be your only choice if the others don't pan out.
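As one concrete illustration of where those gains often come from, moving SOQL and DML out of loops is typically the first thing to look for. A hedged sketch with made-up object and field names (orders is assumed to be a List<Order__c>):

```apex
// Before: one query and one update per record; this burns through the SOQL
// and DML limits quickly.
for (Order__c ord : orders) {
    Account acc = [SELECT Tier__c FROM Account WHERE Id = :ord.Account__c];
    ord.Tier__c = acc.Tier__c;
    update ord;
}

// After: collect the IDs once, then run one query and one update for the batch.
Set<Id> accountIds = new Set<Id>();
for (Order__c ord : orders) {
    accountIds.add(ord.Account__c);
}
Map<Id, Account> accountsById = new Map<Id, Account>(
    [SELECT Tier__c FROM Account WHERE Id IN :accountIds]
);
for (Order__c ord : orders) {
    ord.Tier__c = accountsById.get(ord.Account__c).Tier__c;
}
update orders;
```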
Best Answer
I've done some more research into this and created a repository containing code to reproduce the issue:
https://github.com/koenfaro90/SFDC-ConcurrentPerOrgLongTxn-Reproduction
It seems that these days a request starts counting against the concurrent request limit at the actual start of the HTTP request (i.e. when Salesforce receives the headers); the reception of the body, and thus the kick-off of Apex execution, can potentially happen many seconds later. This makes client connectivity a factor in how many concurrent requests your subscriber org is running.