HP 765 MSM Wireless Controller DHCP Problems

Current Setup:

We currently have one core router/switch (HP 5412zl) with an MSM module inside it.  It has 10 Gb/1 Gb links to the different subnets (each of which also has a 5412zl).  The VLAN, IP, and routing scheme for the core router is listed below, and I will list one other site to give an idea of our setup.

Core IP/VLAN:

VLAN | Name         | IP Config  IP Address      Subnet Mask      Proxy ARP (Std Local)
1    | MGMT         | Manual     10.100.0.1      255.255.248.0    No No
5    | Servers      | Manual     172.16.0.1      255.255.0.0      No No
100  | User VLAN    | Manual     10.100.100.1    255.255.252.0    No No
120  | Services A   | Manual     10.100.120.1    255.255.255.0    No No
140  | Services B   | Manual     10.100.140.1    255.255.255.0    No No
160  | Services C   | Manual     10.100.160.1    255.255.255.0    No No
200  | WLAN Managed | Manual     10.100.200.1    255.255.255.0    No No
1010 | WAN to CE    | Manual     192.168.101.1   255.255.255.252  No No
1020 | WAN to CH    | Manual     192.168.102.1   255.255.255.252  No No
1030 | WAN to THS   | Manual     192.168.103.1   255.255.255.252  No No
1040 | WAN to TMS   | Manual     192.168.104.1   255.255.255.252  No No
1050 | WAN to SQ    | Manual     192.168.105.1   255.255.255.252  No No
1060 | WAN to GW    | Manual     192.168.106.1   255.255.255.252  No No

Core IP Route:

Destination        Gateway         VLAN  Type       Metric  Dist
0.0.0.0/0          172.16.1.7      5     static     1       1
10.100.0.0/21      MGMT            1     connected  1       0
10.100.10.40/29    VCenter         6     connected  1       0
10.100.100.0/22    User VLAN       100   connected  1       0
10.100.200.0/24    WLAN Managed    200   connected  1       0
10.101.0.0/16      192.168.101.2   1010  static     1       1
10.102.0.0/16      192.168.102.2   1020  static     1       1
10.103.0.0/16      192.168.103.2   1030  static     1       1
10.104.0.0/16      192.168.104.2   1040  static     1       1
10.105.0.0/16      192.168.105.2   1050  static     1       1
10.106.0.0/16      192.168.106.2   1060  static     1       1
172.16.0.0/16      Servers         5     connected  1       0
192.168.101.0/30   WAN to CE       1010  connected  1       0
192.168.102.0/30   WAN to CH       1020  connected  1       0
192.168.103.0/30   WAN to THS      1030  connected  1       0
192.168.104.0/30   WAN to TMS      1040  connected  1       0
192.168.105.0/30   WAN to SQ       1050  connected  1       0
192.168.106.0/30   WAN to GW       1060  connected  1       0
192.168.200.0/24   172.16.0.10     5     static     1       1
192.168.201.0/24   172.16.0.10     5     static     1       1
192.168.202.0/24   172.16.0.10     5     static     1       1

172.16.1.7 is the Firewall and 172.16.0.10 (internet port) is the MSM controller inside the 5412zl. 192.168.200.0, 192.168.201.0, & 192.168.202.0 are the IPs for the MSM in the tunneled interfaces.
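
To make the routing easier to reason about, here is a rough Python sketch of how the core switch would pick a next hop by longest-prefix match. It uses only a subset of the core routes listed above; the function name `next_hop` and the sample destination addresses are made up for illustration and are not taken from the switch.

```python
import ipaddress

# Subset of the core "IP Route" listing above.
# The value is either a gateway IP or the name of the connected VLAN.
CORE_ROUTES = {
    "0.0.0.0/0": "172.16.1.7",
    "10.100.0.0/21": "MGMT",
    "10.100.100.0/22": "User VLAN",
    "10.100.200.0/24": "WLAN Managed",
    "10.106.0.0/16": "192.168.106.2",
    "172.16.0.0/16": "Servers",
    "192.168.200.0/24": "172.16.0.10",
    "192.168.201.0/24": "172.16.0.10",
    "192.168.202.0/24": "172.16.0.10",
}


def next_hop(destination: str) -> str:
    """Return the next hop for a destination using longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in CORE_ROUTES.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    # The default route always matches, so the list is never empty.
    # The most specific route (largest prefix length) wins.
    _network, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


if __name__ == "__main__":
    # Hypothetical destinations, chosen only to exercise the table.
    for dest in ["192.168.201.25", "10.106.200.50", "8.8.8.8"]:
        print(dest, "->", next_hop(dest))
```

With this table, a client address in one of the tunneled ranges resolves to the MSM controller (172.16.0.10), a Greenwood address resolves to 192.168.106.2, and anything else falls through to the firewall.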

Here is the IP route/VLAN for one of the sites (Greenwood):

Greenwood IP/VLAN:

VLAN | Name         | IP Config  IP Address      Subnet Mask      Proxy ARP (Std Local)
1    | MGMT         | Manual     10.106.0.1      255.255.248.0    No No
100  | User VLAN    | Manual     10.106.100.1    255.255.252.0    No No
120  | Services A   | Manual     10.106.120.1    255.255.255.0    No No
140  | Services B   | Manual     10.106.140.1    255.255.255.0    No No
160  | Services C   | Manual     10.106.160.1    255.255.255.0    No No
200  | WLAN Managed | Manual     10.106.200.1    255.255.255.0    No No
1060 | WAN to Core  | Manual     192.168.106.2   255.255.255.252  No No

Greenwood IP Route:

Destination        Gateway         VLAN  Type       Metric  Dist
0.0.0.0/0          192.168.106.1   1060  static     1       1
10.106.0.0/21      MGMT            1     connected  1       0
10.106.100.0/22    User VLAN       100   connected  1       0
10.106.120.0/24    Services A      120   connected  1       0
10.106.140.0/24    Services B      140   connected  1       0
10.106.160.0/24    Services C      160   connected  1       0
10.106.200.0/24    WLAN Managed    200   connected  1       0
192.168.106.0/30   WAN to Core     1060  connected  1       0

All the routes work and ping correctly on the wired side.  On the wireless side there are 4 VSCs: one on the network (non-tunneled) and 3 tunneled.  All APs are untagged in the MGMT VLAN and tagged in the WLAN Managed VLAN.

Problem:

90% of the clients connect and authenticate fine, but the other 10% authenticate fine according to the log and then receive no IP (they end up with a 169.x.x.x address).  This happens on the tunneled VSCs as well as the non-tunneled one.  All the DHCP servers are connected to the server VLAN on the core switch, and there are no DHCP problems on wired ports.  The problem seems completely random: of two machines right next to each other, one will work and the other will not.  A machine might work on one subnet, then move to the next site and not get an IP, even though many machines right next to it do work.
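
For anyone sorting through a list of client addresses, a small Python sketch like the one below can separate the self-assigned 169.254.x.x (link-local) addresses from real leases. The function name and the sample addresses are hypothetical and only there for illustration.

```python
import ipaddress


def is_self_assigned(client_ip: str) -> bool:
    """True when the address is in the IPv4 link-local range 169.254.0.0/16,
    which a client typically assigns itself after DHCP gets no answer."""
    return ipaddress.ip_address(client_ip).is_link_local


if __name__ == "__main__":
    # Hypothetical client addresses, not taken from the real network.
    for ip in ["10.106.100.37", "169.254.12.8"]:
        status = "self-assigned, no DHCP answer" if is_self_assigned(ip) else "looks like a real lease"
        print(ip, "->", status)
```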

The problem does seem to happen more often on the subnets outside of the core router.  All APs find the controller by DNS.  They are configured/synchronized and show as working.

I have used Wireshark on the controller port and the client port to try to determine the problem.  The client authenticates OK, sends DHCP requests, gets no answer, and times out.
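
As a rough illustration of what to look for in such a capture, the Python sketch below groups DHCP packets by transaction ID and reports the transactions where the client sent something but no server reply was ever seen. The packet list is invented sample data, and the (transaction ID, message type) summary format is an assumption about how the capture might be exported, not a Wireshark feature.

```python
from collections import defaultdict

# Hypothetical capture summary: (DHCP transaction ID, message type).
PACKETS = [
    (0x1A2B, "DISCOVER"), (0x1A2B, "OFFER"), (0x1A2B, "REQUEST"), (0x1A2B, "ACK"),
    (0x3C4D, "DISCOVER"), (0x3C4D, "DISCOVER"), (0x3C4D, "DISCOVER"),
]

CLIENT_TYPES = {"DISCOVER", "REQUEST"}
SERVER_TYPES = {"OFFER", "ACK", "NAK"}


def unanswered_transactions(packets):
    """Return the transaction IDs that have client packets but no server reply."""
    seen = defaultdict(set)
    for xid, msg_type in packets:
        seen[xid].add(msg_type)
    return sorted(
        xid
        for xid, types in seen.items()
        if types & CLIENT_TYPES and not types & SERVER_TYPES
    )


if __name__ == "__main__":
    print([hex(xid) for xid in unanswered_transactions(PACKETS)])
```

Running the same check on captures from both the client port and the controller port would show whether the requests are lost before reaching the controller or the replies are lost on the way back.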

Any ideas at all?  I am going crazy here!

Let me know if you need any more information.

Thanks!
tahlequahitguys asked:
 
RikeR commented:
I was thinking the DHCP would be on the internet port of the 765, but I think I'm wrong. I just reread your question.

Is it true that clients on different VLANs have these problems? If so, this issue could be RF related. DHCP is a broadcast, which is usually sent at a lower rate, and it is a known issue that clients at the edge of a Wi-Fi network are able to connect but unable to receive an IP address. This should show up as a low SNR value (< 20).
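
A minimal Python sketch of that check, assuming you can read a signal level and a noise floor (both in dBm) for each client: SNR is the signal level minus the noise floor, and anything under 20 gets flagged. The client names and readings are hypothetical.

```python
SNR_THRESHOLD = 20  # the "< 20" figure mentioned above


def snr(signal_dbm: float, noise_dbm: float) -> float:
    """SNR in dB is the signal level minus the noise floor (both in dBm)."""
    return signal_dbm - noise_dbm


def low_snr_clients(readings: dict) -> list:
    """Return the names of clients whose SNR is below the threshold."""
    return [
        name
        for name, (signal_dbm, noise_dbm) in readings.items()
        if snr(signal_dbm, noise_dbm) < SNR_THRESHOLD
    ]


if __name__ == "__main__":
    # Hypothetical readings: client name -> (signal dBm, noise dBm).
    readings = {"laptop-a": (-62, -92), "laptop-b": (-78, -92)}
    print(low_snr_clients(readings))
```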
 
tahlequahitguys (author) commented:
Also, I forgot to mention that a restart of the MSM will sometimes get an IP for the clients, but it does not last.
 
RikeR commented:
Which firmware are you on right now? The current release should be 5.4.1.

I know there is an issue with the 765 and DHCP relaying when using teaming. ProCurve has not come up with a solution for it.

You can also open a ticket at:
http://www.wlanparts.com/product/PA24-19/19dBi-24GHz-Panel-Antenna.html

You can find some reference guides at:
http://h10144.www1.hp.com/solutions/enterprise/mobility/mobility-resources.htm

Just to check your design ;)
 
tahlequahitguys (author) commented:
I just upgraded to 5.4.1, but we still have the same problem, and we are not using teaming.
 
RikeR commented:
Is the DHCP on a tagged VLAN? If so, see if you can try it on an untagged one.
 
tahlequahitguys (author) commented:
I have a DHCP server scope for both the tagged VLAN (200) and the untagged VLAN (1).  The APs are on both, so that should cover it, right?  Is that what you mean?  Or are you thinking to just make 200 the untagged VLAN and forget 1?
 
tahlequahitguys (author) commented:
That would make the most sense because of it being a problem across VSCs with different DHCP servers.  I am going to do some experimenting today and watch the SNR and get back to you.
 
RikeR commented:
To get an idea of the signal strength you can use the HeatMapper tool from Ekahau (it is free).
It does not show you the noise that clients experience, so the SNR is still best read from the controller.
 
tahlequahitguys (author) commented:
Well, it wound up being the setting "distance between access points."  The default is large, and our APs are fairly close together, so the clients were holding onto the APs that were farthest away and getting very low SNR.  I changed it to medium and the problem seems to have gone away.

Thanks
Question has a verified solution.