asked on

REST-like queries run very slow if fetching only 3 records at a time

On internal intranet the C/C++ client sends out REST-like queries via TCP/IP and receives an XML response from the Java server. If the REST query loops fetching 3 records at a time, the total time to accumulate 90000 records is about 10x longer than if we fetch 100 records at a time. We will be performing a number of timing tests to isolate the cause. In anticipation that the problem may be the slow TCP/IP start due to initial small windowing, what settings are there to tell TCP/IP to start off with the largest (or larger) window size possible?

We are on 64-bit RHEL servers, and I assume that since the client/server are run on an intranet self-contained within the company, that we do not have to be concerned about congestion.

Thanks,
Paul

noci

This might be very well cause by doing a query (yielding said 90000) records then sending records 1-3.
Then again running the query yielding 90000 records returning record 6-9 etc.
thus running 30K queries against your database.

While returning 100 / batch will 'only'' do 900 Queries. that will make up for a 33 times difference. (it might be slightly better due to caching inside the DB).

The better approach might be running the query yielding 90K records, then keep those in a cache (memcache f.e.) and then pull from memcache until either the data is missing there or until x seconds have past..... and if memcache has no data THEN query the DB again.

David Favor

As noci suggested, the larger your payload (higher number of records), the less setup time is required to setup + teardown data to be returned.

Also, depending on your network latency, your network connections will always take some time to setup + teardown.

Summary: Fetching larger numbers of records in fewer API calls will always be faster, then fetching smaller numbers of records using larger numbers of API calls.

Tip: When writing APIs, running a good cache (noci suggested this too) on the API side will greatly reduce resource load on your API side.

Also, running an embedded cache for all callers/clients will massively increase performance, in the case where you're touching duplicate data before the local caller/client side cache expires data.

Note: The biggest performance killer of any code is always i/o, so anytime you increase network or disk (SQL) i/o your code will slow down.

Your target is to optimize your code for least touching of any network or disk i/o.

phoffric

ASKER

To clarify...
The first thing the client sends is a query . That query returns instantly with a query ID. The second thing the client sends is a fetch with the query id. This fetch results in the server Java side caching all 90,000 records . The first fetch therefore can take five to 10 minutes to return the number of records requested. This is expected . All subsequent fetches with the query id are returned much faster .

Our isolation timing test will determine where the bottleneck is .

From the OP is only one question..

"In anticipation that the problem may be the slow TCP/IP start due to initial small windowing, what settings are there to tell TCP/IP to start off with the largest (or larger) window size possible?"

noci

That question was not that clear though.
Here is a relevant portion on initital tcp window size & scaling:

https://access.redhat.com/solutions/29455

Window size is the number of inflight segments of packets.
So it will not change the packet size (mostly around 1500) due to ubiquitous IEEE 802.3 (aka ethernet.)
And also has no relation ship with TCP packet size (up to 64K)

David Favor

You said, "In anticipation that the problem may be the slow TCP/IP start due to initial small windowing, what settings are there to tell TCP/IP to start off with the largest (or larger) window size possible?"

Better to test + know, than anticipate + guess.

You can test + know for sure exactly where the problem exists, by using curl to simulate API calls.

How you do this, relates to your specific API.

You can run curl reporting status variables (curl has many, like connection setup/teardown times, DNS lookup, etc) while tracking logs on your API side.

David Favor

Aside: Several thorny issues related to Linux are best considered.

1) Kernel version makes a huge difference in throughput, including connection setup/teardown/caching.

2) Start with a Kernel of 4.15 or above. Older Kernels... you'll be fighting problems which have been fixed for years.

3) Another horrible problem is the steaming pile of... mud... that is systemd...

The systemd subsystem uses the horribly broken systemd-resolved DNS resolver. For around 3x years I wrestled with odd WordPress site problems along with odd microservices (APIs) built on Linux.

Finally tracked the problem down to systemd-resolved.

If any machines in your mix are using systemd-resolved, run dnsmasq instead + destroy all traces of systemd-resolved.

If you're unsure how to nuke forever the nonsense that is systemd-resolved, open another question for a walk through of how to setup dnsmasq + destroy systemd-resolved.

4) Best also take a look at your retransmit data on both server + client side of your system, as things like odd MTU settings + other tunings (incorrectly suggested by many articles about TCP performance enhancement) can all cause problems.

If you've "tuned TCP", best untune TCP... meaning remove any oddball settings suggested by TCP tuning guides.

Best TCP tuning is to install Kernel 4.15+ with default tunings.

David Favor

You mentioned, "This fetch results in the server Java side caching all 90,000 records."

If what you're saying is that every fetch triggers the server side to read 90K records you have 2x choices.

1) Rewrite your code to work with SQL SELECT statements with better constraints (like starting + ending id to fetch).

2) Always have clients make a call for all 90K records.

phoffric

ASKER

>> The better approach might be running the query yielding 90K records
Agreed. At the meeting, I raised the point that if they think there is a memory problem, then just by memory since it is cheap - a lot cheaper than all the labor is figuring out where the problem is.
Answer I got is that we have to do tests and prove that we need to buy memory.

>> And also has no relationship with TCP packet size (up to 64K)
Here is my recollection from decades ago. If the Window size starts off at, say 1K, then the TCP packet size will be not be 64K. Over time, if acks come back in timely manner, then window size will keep increasing up to the maximum of 64K. At this point, the transmission rate is a maximum. So, I thought the Window size had a relationship with TCP packet size.

>> Better to test + know, than anticipate + guess.
Yes. That is better. Since another group does the testing, I just wanted to be prepared with the knowledge gained from this question in case it became relevant.

>> Aside: Several thorny issues related to Linux...
I have no control of Linux versions or what commands/utilities I have access to. Hopefully, our infrastructure team knows the issues you mention.

>> Rewrite your code to work with SQL SELECT
There are no SQL commands involved with these XML queries. I probably should have been clearer about that in my question.

I did go through that redhat link, but I didn't see the code to set the window size. Maybe it is not even an option when setting up the connection.

noci

The Redhat article explains what parameters contribute to the windows size & scaling settings.
On modern systems the window never is 1K... it is primarily determined by the receivebuffers...
as no receivebuffers does prevent receiving data in a meaningfull way.
cat /proc/sys/net/ipv4/tcp_rmem Will show the minimal, default, maximum sizes for this.

Also the Article refers to another for more info: https://access.redhat.com/solutions/30866
Or the RFC: https://www.ietf.org/rfc/rfc3390.txt

and the windows size does not have a relation with packet size... (not directly). The window is the amount of data between application socket A and application Socket B which has been sent by either socket but not received (yet) by the other.
For initial size the default tcp_rmem value is used. It is the receiving end that sets the effective starting size.

Here is another article on tuning... beaware in a network you are never alone, anything you do to make things "better" can work out negatively on the other side. You can "send" data more efficiently causing the other end to become swamped (or DOS-attacked).
https://wwwx.cs.unc.edu/~sparkst/howto/network_tuning.php

phoffric

ASKER

>> The Redhat article explains what parameters contribute to the windows size & scaling settings.
True, but I didn't see how to change the windows size.

$ cat /proc/sys/net/ipv4/tcp_rmem  
4096	87380	6291456

Open in new window

>> it is primarily determined by the receivebuffers... as no receivebuffers does prevent receiving data in a meaningfull way.
Right. If there is a tiny receive buffer, the receiver is well-advised to not make its window size a lot bigger.

>> the windows size does not have a relation with packet size...
Could you point out the section in the provided links for me to look at to better understand this?

The last link on tuning is concerned with send/receive buffer sizes. I didn't find "window" in that article.

Is it possible to set the initial window size in C/C++ and Java? If not, is there any other way to do that?

ASKER CERTIFIED SOLUTION

noci

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

phoffric

ASKER

Thanks for the discussion. I appreciate your input.