Microsoft RDS connectivity issue

We have a brand new Microsoft RDS environment spun up on Server 2016. We have two gateway servers, two connection brokers and three app servers. When a user hits the URL (https://myrds.mydomain.com/rdweb), about half of the time they get just a screen with Work Resources and NO login. The other half of the time, they get a login.

Now, the gateway servers are behind a VIP and the connection servers are behind a VIP.  I'm thinking it's something with the connection server VIP being messed up.  

I need some help troubleshooting this and knowing where to start.

Any ideas?

Thanks

Cliff
crp0499 (CEO) asked:

Cliff Galiher commented:
One thing you haven't listed is the Web Access role. That is the role that generates the "Work Resources" page, and it sounds like that is what is failing.
crp0499 (CEO, author) commented:
That role is on the gateway servers, I think. Again, half of the time we get a login and half of the time we don't.
crp0499 (CEO, author) commented:
Confirmed, that role shows on the first gateway server. (Attachment: Screen-Shot-2017-12-08-at-8.28.41-AM.png)

Cliff Galiher commented:
If a web page is showing up every time, but a login page is only showing up half the time, that is a fairly good sign that one of the two servers is not properly running the role. The load balancer is sending requests there and it spits back a page, just not the full page. So someone may have mangled the settings or misconfigured the role.
crp0499 (CEO, author) commented:
OK, that makes sense. BUT, it used to work until we moved the gateway servers to a new DMZ and had to open firewall ports. Could this be a port issue to just one server, or are you still thinking the role is misconfigured? How would I determine which server might have a problem?
I guess I could take one server out of the VIP and test that way.

Also, is the web page on the gateway server or connection broker?  It looks to me like it's on the gateway server.
Cliff Galiher commented:
"is the web page on the gateway server or connection broker"

The way you phrased this question is a little odd, so I'm going to answer it this way. Neither the gateway nor the broker roles *ever* serve up web pages. That is always done by the Web Access role. So whichever servers have that role are the same servers creating the login webpage. Full stop. Yes, that role can coexist with other roles, much like many organizations put DHCP on their domain controllers. But it shouldn't be assumed or required, and the two are unrelated. Find your RDWA role servers, and those are your web page servers.

If the issue was with the firewall then I would usually guess you'd see a complete failure either all of the time or at least half the time.  And by complete failure, I mean "this page cannot be found" browser error when it doesn't get any response.

What you describe is not that. You always get a response.  Half the time it includes a login box. Half the time it just includes the "work resources" template with nothing filled in. That tells me that the load balancer is properly redirecting traffic to each server half the time, and each server *is* responding. That one server responds differently places the blame firmly on the server.  There are always edge cases, such as a proxy caching responses, but even then a proxy will usually flush a cache for a server if the server's keep-alive time expires.  Which it would in a non-response scenario.  So is it *possible* that the firewall is the culprit?  Anything is possible. Is it likely?  It is very unlikely.

How would you troubleshoot?  You hit it. Take each server off the load balancer one at a time. Find the bad server.  Then fix it.
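
If pulling servers out of the load balancer is awkward, a rough alternative is to request the page from each Web Access server directly and compare what comes back. The sketch below is only an illustration: the host names are hypothetical placeholders, it assumes each server answers on https://<host>/rdweb the same way the VIP does, and it treats "the page contains a password input" as a stand-in for "the login box rendered". Certificate checks are turned off because the certificate is presumably issued for the VIP name rather than the individual hosts, so use it for testing only.

```python
import ssl
import urllib.error
import urllib.request
from html.parser import HTMLParser


class _PasswordFieldFinder(HTMLParser):
    """Sets .found when the page contains an <input type="password">."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        input_type = (dict(attrs).get("type") or "").lower()
        if tag == "input" and input_type == "password":
            self.found = True


def has_login_form(html):
    """Heuristic (an assumption): a login page has a password field."""
    finder = _PasswordFieldFinder()
    finder.feed(html)
    return finder.found


def classify(html):
    """Map a fetched body (or None) to one of three outcomes."""
    if html is None:
        # Nothing came back at all: this is what a blocked port looks like.
        return "no response"
    return "login page" if has_login_form(html) else "page without login"


def fetch(url, timeout=10):
    """Return the response body as text, or None if the server is unreachable."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # test use only, see the note above
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The server did answer, just with an error status; keep the body.
        return err.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers connection refused, DNS failure, and timeouts.
        return None


if __name__ == "__main__":
    # Hypothetical host names: replace with your real RD Web Access servers.
    servers = ["rdweb01.mydomain.com", "rdweb02.mydomain.com"]
    for host in servers:
        print(host, "->", classify(fetch(f"https://{host}/rdweb")))
```

A server that reports "page without login" while the other reports "login page" is the likely culprit. "no response" from one host would point back toward the firewall or DMZ change instead.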

crp0499 (CEO, author) commented:
You are really good at this expert thing.  I won't trouble you any longer.  Once we have removed each server from the LB and determined which server it is that's not working, I'll report.  I appreciate all of your time and especially the way you responded.  Thank you.