C# software limits / server processing limits

websss
Hi

We are facing some performance issues: we cannot seem to process data fast enough.

First, I need to know whether what we are attempting is technically possible.

I'm trying to process hundreds of thousands of records.
Each record takes approx. 500ms to process, mostly because it has to hit external APIs.

I'm only able to process 85-100 records per second

The server is in a datacenter. It's an 8-CPU Skylake machine; however, when the app runs it uses approx. 140 threads, and CPU sits at around 6%.

The app is multithreaded and I've tried threads, ThreadPool, async/await, Parallel.ForEach/Parallel.Invoke etc. Most of the 500ms is spent waiting for the APIs to respond.

My question is about processing throughput with the given parameters,
i.e. if it takes 500ms to process 1 record (and it's running consistently on 140 threads), does 100 records per second look like it is near the limit?

I know this question is tough to answer, but I need to know whether I should be focusing on code performance tuning or looking at other solutions.
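
As a rough sanity check on those numbers, Little's law says throughput equals the number of requests in flight divided by the time each one takes. The sketch below only does that arithmetic. The class and method names are made up for illustration, and it assumes every one of the 140 threads really holds one request in flight for the full 500ms.

```csharp
public static class ThroughputEstimate
{
    // Little's law rearranged: throughput = requests in flight / seconds per request.
    public static double MaxRecordsPerSecond(int requestsInFlight, double secondsPerRecord)
        => requestsInFlight / secondsPerRecord;

    // The inverse: how many requests must be in flight to explain an observed rate.
    public static double EffectiveInFlight(double recordsPerSecond, double secondsPerRecord)
        => recordsPerSecond * secondsPerRecord;
}
```

Under that assumption, `MaxRecordsPerSecond(140, 0.5)` gives a ceiling of 280 records per second, while `EffectiveInFlight(100, 0.5)` gives 50. So an observed 100 per second would mean either that only about 50 requests are actually in flight at once, or that each record takes closer to 1.4 seconds under load.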

Commented:
That sounds roughly right.

You could also be potentially hitting the max concurrent network calls, as described here:
https://docs.microsoft.com/en-us/dotnet/framework/wcf/feature-details/using-servicethrottlingbehavior-to-control-wcf-service-performance

You could try raising maxConcurrentCalls, maxConcurrentSessions, and maxConcurrentInstances to something higher, like 256, and see whether that opens anything up for you, but the benefit is going to depend on the number of cores / CPUs.
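
One related thing worth checking, offered as an assumption since the thread doesn't say which HTTP client or runtime is in use: the settings in that article throttle a WCF service host. For outgoing HTTP requests from a .NET Framework console or service app, `ServicePointManager.DefaultConnectionLimit` defaults to 2 connections per host, and `HttpClientHandler.MaxConnectionsPerServer` is the per-handler equivalent on newer .NET Framework versions and .NET Core. A minimal sketch of raising both follows; `HttpSetup`, `CreateClient`, and the value passed in are illustrative only.

```csharp
using System.Net;
using System.Net.Http;

public static class HttpSetup
{
    // Builds one HttpClient intended to be shared by the whole process.
    public static HttpClient CreateClient(int maxConnectionsPerHost)
    {
        // .NET Framework: applies to connections to hosts first contacted after this line.
        ServicePointManager.DefaultConnectionLimit = maxConnectionsPerHost;

        var handler = new HttpClientHandler
        {
            // Per-handler cap on simultaneous connections to a single server.
            MaxConnectionsPerServer = maxConnectionsPerHost
        };
        return new HttpClient(handler);
    }
}
```

Calling `HttpSetup.CreateClient(256)` once at startup and reusing the returned client would be the intended usage. If the connection limit was the bottleneck, throughput should rise noticeably; if nothing changes, the limit is somewhere else.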

There's also potentially a bottleneck on the API side - if the server can only process so many concurrent requests at a time, you might not be able to go past its own limit.

I'd also suggest seeing if there's any way to combine/batch calls. Most enterprise-level APIs have some form of batching / bulk capability so you can request multiple records in one API call. If the overhead of the call is 300ms-400ms and the actual call only takes 100ms-200ms, then maybe you could ask for 100 records at a time and get 100 records back in 10-20 seconds that you could process with one thread.
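
To put rough numbers on the batching idea, here is a small sketch of the arithmetic. It assumes a batch endpoint exists (the thread doesn't confirm one), that the fixed overhead is paid once per call, and that the per-record cost stays the same inside a batch. The names are hypothetical.

```csharp
public static class BatchEstimate
{
    // One call per record: the fixed overhead is paid for every record.
    public static int UnbatchedMs(int records, int overheadMs, int perRecordMs)
        => records * (overheadMs + perRecordMs);

    // One call for the whole batch: the fixed overhead is paid only once.
    public static int BatchedMs(int records, int overheadMs, int perRecordMs)
        => overheadMs + records * perRecordMs;
}
```

With 400ms of overhead and 100ms of actual work per record, `UnbatchedMs(100, 400, 100)` is 50000ms on a single thread, while `BatchedMs(100, 400, 100)` is 10400ms, which is where the 10-20 second figure comes from.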
Top Expert 2016

Commented:
And what is the API waiting for? Disk I/O?
websss, CEO

Author

Commented:
The API is an external service that returns a string of data; I call it over HTTP.

Commented:
I think the bigger point that David and I were both trying to make is this: if the API is a bottleneck, is it one that you have control over, or one whose usage you can improve?
websss, CEO

Author

Commented:
Thanks, I thought the same, which is why I mentioned it; I wasn't sure whether there was some magic workaround.
It appears that the threads are all tied up while these requests wait for a response and are not available to the rest of the system, so it's hitting a limit (approx. 140 threads at any one point).
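
For reference, when the waiting is done with non-blocking async calls, no thread is held while a request is outstanding, so the number of requests in flight no longer has to match the thread count. A minimal sketch of that pattern, assuming the per-record work can be written as an async method; `BoundedAsync`, `RunAsync`, and `maxInFlight` are illustrative names, not anything from the app in question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedAsync
{
    // Runs `work` for every item, with at most `maxInFlight` calls outstanding at once.
    // Results come back in the same order as the input items.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items, int maxInFlight, Func<TItem, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxInFlight))
        {
            var tasks = items.Select(async item =>
            {
                // Waits asynchronously for a free slot; no thread is blocked here.
                await gate.WaitAsync().ConfigureAwait(false);
                try { return await work(item).ConfigureAwait(false); }
                finally { gate.Release(); }
            }).ToList();

            return await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

Here `work` would be something like a call to `HttpClient.GetStringAsync` against the API. This creates one task per item up front, so for hundreds of thousands of records it may be worth feeding the items through in chunks. Any synchronous `.Result` or `.Wait()` inside `work` would put the blocked threads right back.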

The HTTP API is open source. I have it on my own server, but it's being accessed via a public IP in the same data centre (different hosts).
I intend to bring it onto the LAN and access it that way, to see whether it speeds things up significantly.

There seems to be a significant TTFB (time to first byte) when checking the same request in Chrome DevTools.
We have seen it at 300ms, but not much faster than this.
We send a lat/lon to the HTTP API, and it returns a road name.

Commented:
So it's a reverse geocoding API? There might not be too much you can do unless you can load the database onto a ramdisk or something. Usually with geocoding (regardless of direction), there's a lot of data and it just takes a while to look up, so I/O can be the biggest bottleneck. It also sounds like it's a 3rd party / open source API, which might constrain the ability for you to make significant modifications to it unless you want to really dive into it.
websss, CEO

Author

Commented:
Yes, it's reverse geocoding.
I'll take a look at the source and see if I can connect directly to the DB.
websss, CEO

Author

Commented:
Thanks, this has led me to investigate much more. I'll close this question now, as I've opened a separate question with a lot more detail, including a test environment and source code.
https://www.experts-exchange.com/questions/29145595/Increase-concurrent-requests-to-API-from-one-machine.html

It may result in the same answer as here, but it's a very important area, so I need to be 100% sure how to proceed.
I would appreciate it if you could participate in the other question.
