Pkafkas

Why is DHCP not being relayed to this new VLAN, on an HP ProCurve Switch?

Hello:

I have a network routing or a DHCP problem.  

We have 2 offices at my work, and these 2 offices are linked together with a 10 Mbps fiber connection (12 miles apart).  We have DHCP set up to provide each office with its own IP address scheme:

Office#1: 172.20.1.0 (Scope: .100 - .175)
Office#2: 172.20.2.0 (Scope: .100 - .140)
We only have 1 DHCP Server which is 172.20.1.3 and that is located at Office#1.

The DHCP service has been working fine for years; but yesterday (Friday) DHCP was not being passed to the Office#2 location.  I do not see any warnings or errors on the DHCP server.  There is only 1 Ethernet switch at Office#2, and that switch has the only VLAN that is configured with a 172.20.2.X IP address.  All of the Office#1 Ethernet switches have 172.20.1.X IP addresses assigned; hence, DHCP requests (for 172.20.2.X) can only come from Office#2.

Office#2's HP ProCurve Switch IP address is 172.20.2.252
IP Routing is enabled.  
172.20.2.254 is our Firewall which handles the routing rules at office#2.

The Configuration for Office#2 HP ProCurve's ethernet switch is shown below:

Startup configuration:

; J9148A Configuration Editor; Created on release #W.15.14.0012
; Ver #06:04.18.63.ff.35.05:b6
hostname "XX-XXX-48G+POE-2-254"
module 1 type j9148a
trunk 38-39 trk2 lacp
timesync sntp
sntp unicast
sntp server priority 1 172.20.1.3
no telnet-server
time daylight-time-rule continental-us-and-canada
time timezone 6
ip default-gateway 172.20.2.253
ip dns domain-name "XXX.com"
ip dns server-address priority 1 172.20.1.3
ip dns server-address priority 2 172.20.1.15
ip route 0.0.0.0 0.0.0.0 172.20.2.254
ip routing
interface 1
   speed-duplex 100-full
   exit
snmp-server community "MOGL" unrestricted
snmp-server contact "it@mogl.net" location "Mequon Data Closet"
vlan 1
   name "DEFAULT_VLAN"
   no untagged 1,3,6,46
   untagged 2,4-5,7-37,40-45,47-48,Trk2
   ip address 172.20.2.252 255.255.255.0
   ip helper-address 172.20.1.3
   exit
vlan 50
   name "Guest_Wireless"
   untagged 6
   tagged 2,4
   no ip address
   exit
vlan 100
   name "WAN"
   untagged 1,46
   ip address 172.20.100.2 255.255.255.248
   exit
vlan 102
   name "Voice"
   untagged 3
   ip address 172.21.2.254 255.255.255.0
   qos priority 5
   exit
spanning-tree Trk2 priority 4
no autorun
password manager
password operator

XXX-XX-48G+POE-2-254(config)#


The DHCP at Office#1 is working just fine; but Office#2 leased IP addresses began expiring.  Some PCs at Office#2 were still functional because their leased IP addresses had not expired yet.  I tested this by going to a Windows 7 Pro PC that was still working (at Office#2) and that was using an IP in the DHCP scope.  I ran ipconfig /release and then ipconfig /renew, and all of a sudden that PC could not reconnect to the network.  It could not fetch a new DHCP IP address.

It is important to mention that I did update the firmware on the Office#2 switch the day before these DHCP leases started dropping.  But if I boot the switch back to the previous flash version (stored in the secondary flash slot), the same problem is present.  We should try to make this work with the new version anyway.

I then added a static IP address to these disconnected PCs at Office#2, and those PCs began working just fine.  Hence, I do not believe it is a network routing problem: the telephony (which is housed at Office#1) is working just fine, and data is flowing normally between devices that have static IP addresses (data replication appliances located at both offices).  Again, the problem is exclusively with devices that were trying to reach DHCP from Office#2 (the ipconfig /release and /renew test proved that).  Incidentally, one could ping 172.20.1.3 from any device (PC or switch/appliance) in Office#2 that had a static IP address, and devices in Office#1 could ping devices in Office#2 just fine.  I do not think routing is a problem.

Perhaps the "ip helper-address" is not working at the switch in Office#2 or the DHCP scope is not working?  If I am at Office#1 and I try the same ipconfig /release and /renew test on a PC at Office #1, the Office#1 PC's(172.20.1.X) are able to retrieve their DHCP IP address just fine.  
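For anyone wondering why static hosts can ping the server while DHCP clients get nothing: a client without an address broadcasts its request, and that broadcast only leaves the VLAN if a relay agent (here, the ip helper-address on VLAN 1) rewrites it as unicast and stamps its own address into the giaddr field. Below is a rough Python sketch of that rewrite on the fixed BOOTP header. The function names `build_request` and `relay` are made up for illustration, and this is not how the ProCurve firmware is implemented, only the general relay behaviour.

```python
import socket
import struct

# Fixed BOOTP/DHCP header layout (236 bytes, before the options field).
BOOTP_FORMAT = "!BBBBIHH4s4s4s4s16s64s128s"
ZERO_IP = socket.inet_aton("0.0.0.0")


def build_request(xid, mac):
    """Hypothetical helper: header of a client broadcast request (giaddr = 0.0.0.0)."""
    return struct.pack(
        BOOTP_FORMAT,
        1,        # op: BOOTREQUEST
        1,        # htype: Ethernet
        6,        # hlen: MAC address length
        0,        # hops
        xid,      # transaction id chosen by the client
        0,        # secs
        0x8000,   # flags: broadcast bit set
        ZERO_IP,  # ciaddr: client has no address yet
        ZERO_IP,  # yiaddr
        ZERO_IP,  # siaddr
        ZERO_IP,  # giaddr: empty until a relay agent fills it in
        mac,      # chaddr (struct pads it to 16 bytes)
        b"",      # sname
        b"",      # file
    )


def relay(packet, interface_ip):
    """Hypothetical helper: what a relay agent changes before unicasting to the server."""
    fields = list(struct.unpack(BOOTP_FORMAT, packet))
    fields[3] += 1                      # hops
    if fields[10] == ZERO_IP:           # giaddr is only set by the first relay
        fields[10] = socket.inet_aton(interface_ip)
    return struct.pack(BOOTP_FORMAT, *fields)


request = build_request(0x1234, bytes.fromhex("001122334455"))
relayed = relay(request, "172.20.2.252")   # VLAN 1 address of the Office#2 switch
print(socket.inet_ntoa(struct.unpack(BOOTP_FORMAT, relayed)[10]))
```

If the relay step never happens, the server at 172.20.1.3 never sees the request at all, which would match a DHCP server log with no warnings or errors.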

We have users bringing laptops and moving back and forth between Office#1 and Office #2; hence, we will need to get this DHCP problem solved.

My question is what can I do to get DHCP to begin working again at the Office#2 location.
Pkafkas

ASKER

I wonder what would happen if I created a new VLAN on a switch at Office#1 with a 172.20.2.X IP address and then connected a laptop to an open port on that VLAN.

Would that laptop on the 172.20.2.X VLAN (at Office#1) receive a new DHCP IP address?  That should be a good test to determine if the 172.20.2.0 DHCP scope needs to be removed and then re-created.

Any thoughts?

ASKER

It might be important to mention that I was able to successfully install the firmware update on another HP ProCurve switch (at Office#1), and there were no problems with that switch.

https://www.experts-exchange.com/questions/28707383/How-to-update-the-firmware-on-an-HP-ProCurve-Ethernet-Switch-with-a-USB-drive.html

ASKER

From what I have read, the Default Gateway setting on the switch is disabled once IP routing is enabled; but we do not have anything active at 172.20.2.253, so it should be removed anyway.  Perhaps that is something different with the new firmware version?

Should I take out that rule, for the Default Gateway, since we have the ip route 0.0.0.0 0.0.0.0 172.20.2.254 anyway?  I do not think that the default gateway rule serves any purpose so should it be taken out anyway?
ASKER CERTIFIED SOLUTION
Don Johnston
This solution is only available to members.

ASKER

Ahh, so you are suggesting that I assign a VLAN-whatever to a port on a switch in Office#1, not Office#2.  Then test a laptop to see if the switch in Office#1 can access DHCP scope 172.20.2.0.  That is a good idea.

If the DHCP scope does not work on the Office#1 switch, what will that tell us?  Perhaps that I should remove that scope and then re-create it?

If the DHCP scope works on the Office#1 switch, what will that tell us?


Another question: what about the fact that the default gateway setting on the Office#2 switch is set to 172.20.2.253 (which is not active)?  Should I remove that entry?  Is removing that entry better design, since it is not being used because IP routing is enabled?  172.20.2.253 is not being used by anything right now anyway.
SOLUTION
This solution is only available to members.

ASKER

So I am not sure how to set up the VLAN-Test, because our default gateway is set up to route 172.20.2.0 traffic to the other location:

ip route 172.20.2.0 255.255.255.0 172.20.100.5

Perhaps we temporarily disable that rule for testing.  By Monday all of the IP addresses will have expired, and that office will not be active with people anyway, so it is a good opportunity to temporarily disable that routing rule.

I am planning on working on this problem with a consultant on Monday.  I do think that if the DHCP scope does not work at Office#1 then it should be re-created.  The thing is that nothing was changed on that DHCP server; but there have been recent changes to the routing rules.

If the DHCP scope (172.20.2.0) appears to work correctly at Office#1 (for testing) then the problem has to be a routing rule.  Do you concur?
SOLUTION
This solution is only available to members.
If I recall correctly, vlan 1 is the "default/management" vlan

I think that you do have a problem with the ip helper config.
Check this wikipedia article for more details about dhcp relaying:
https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol#DHCP_relaying

When a host on a vlan sends a DHCP request on that vlan, the request is forwarded as unicast to the IP address specified by ip helper-address, with the GIADDR field set to the gateway interface address. In your case, the GIADDR would then be 172.20.2.252.
When the DHCP server receives the request, it tries to match GIADDR with one of its scopes and assigns an IP address from that scope.
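To make the scope matching concrete, here is a small Python sketch. It assumes /24 masks for both scopes (the switch config shows 255.255.255.0 on VLAN 1), and `select_scope` is a hypothetical name; a real DHCP server's selection logic has more to it than this.

```python
import ipaddress

# Scopes from the question, assuming /24 networks.
SCOPES = {
    "Office#1": ipaddress.ip_network("172.20.1.0/24"),
    "Office#2": ipaddress.ip_network("172.20.2.0/24"),
}


def select_scope(giaddr):
    """Return the name of the scope whose network contains giaddr, or None."""
    address = ipaddress.ip_address(giaddr)
    for name, network in SCOPES.items():
        if address in network:
            return name
    return None


print(select_scope("172.20.2.252"))  # relayed from VLAN 1 on the Office#2 switch
print(select_scope("172.20.100.2"))  # WAN VLAN address: no scope matches
```

The second call shows why the relay has to stamp the address of the VLAN the client sits on: a request arriving with a giaddr outside every scope gets no offer.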

IP routing must then be OK.
In your case, it seems that it would be routed to 172.20.2.254 but I seem to recall that there should be a gateway definition in your vlan 1 setting.

Also the double configuration ip route 0.0.0.0 0.0.0.0/default gateway seems odd to me.
default gateway IS the route to 0.0.0.0/0.0.0.0!

I would clean up the procurve settings for the routing stuff, remove the 0.0.0.0 route and set the default gateway to .2.254. I would also check whether a gateway definition for vlan 1 works better.

ASKER

I plan to test the DHCP Scope (172.20.2.0) at Office#1 and in the process temporarily stop routing data to Office#2 (for this test).

I will be working with a consultant on this problem Monday.  If the DHCP scope does not work in the test at Office#1 then the problem has to be with the scope.

If the DHCP scope (172.20.2.0) does work at Office#1, then the problem must be with the DHCP relay.  I did update the firmware on the HP ProCurve switch on Thursday.  Keep in mind that the firewall and the HP ProCurve switch at Office#2, and every device that has a static IP address at Office#2, can ping the DHCP server and receive replies.  If it is not a routing problem, then the ip helper-address command may not be working correctly for some reason.  A call to HP support may be required.
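The elimination logic above can be written down as a tiny Python sketch. The function name `diagnose` and its two inputs are just labels for the two tests described in this thread (the test VLAN at Office#1, and pinging 172.20.1.3 from static hosts at Office#2); it is a way of organizing the reasoning, not a diagnostic tool.

```python
def diagnose(scope_works_at_office1, static_hosts_reach_server):
    """Return the component the two tests point at: 'scope', 'routing' or 'relay'."""
    if not scope_works_at_office1:
        # A client on a local 172.20.2.X test VLAN gets no lease either.
        return "scope"
    if not static_hosts_reach_server:
        # The scope is fine, but unicast traffic cannot reach the server.
        return "routing"
    # Scope and unicast path are both fine, so the relay step is what remains.
    return "relay"


# The situation described so far: static hosts can ping the server,
# and the test VLAN result decides between scope and relay.
print(diagnose(scope_works_at_office1=True, static_hosts_reach_server=True))
```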

We will see.  It has always been to my benefit to simplify the problem and to break it up into different pieces.  If you identify the specific problem (ip helper command, DHCP settings, firmware version) then finding the solution is a lot easier.

Keep in mind that I have already tried booting the switch to the previous flash version (secondary slot) and that did not make any difference; hence, the firmware should not be the culprit, right?  One step at a time.

https://h10145.www1.hpe.com/Downloads/DownloadSoftware.aspx?SoftwareReleaseUId=12042&ProductNumber=J9147A&lang=en&cc=us&prodSeriesId=3901671&OrderNumber=&PurchaseDate=
Seeing the topology and configs from the devices would help figure out what going on.

ASKER

What happened was that updating the firmware broke the switch's ability to use the ip helper-address correctly.

1.  I created a new VLAN at Office#1 to test and see if a device may receive a new IP address from the 172.20.2.0 Scope.
       a.  It was able to.
       b.  hence the DHCP server was not the problem.

2.  A consultant that I was working with suggested that we use the Firewall at Office#2 (which is the default gateway at Office#2) to do the DHCP relaying instead of the HP ProCurve Switch.
       a.  We configured the firewall to do this and then everything worked like a champ.

3.  The ip helper address command was working before the firmware update.
       a.  The ip helper-address command was entered in correctly.
       b.  Switching to the previous flash version did not help the situation.

4.  The consultant told me that he has seen something like this happen 1 time before.
       a.  A firmware update messed something up and booting the switch to the previous version did not help.
       b.  HP Support stated to re-format the switch and reload the previous firmware and reload the previous config.
       c.  That worked, and updating the firmware after that had no side effects.

5.  Here we took the HP ProCurve's DHCP relay function out of the equation and then everything worked fine.
       a.  So are HP firmware updates that unpredictable?
       b.  This was a pretty specific problem; it was not very obvious that something was wrong until the next day.
       c.  I understand anything could happen; but is this common with HP ProCurve switches?
       d.  What if something else breaks next time? Will we need to reformat the switch and then reload the configurations?
       e.  I know that backups are necessary; but if I need to rely on these updates to work, it can be quite annoying.
In 8 years working with HP switches, this is not any different than weird things I've  seen on Cisco switches (which I've been working on for 20 years).

What do you mean by "reformat the switch"?

ASKER

To reset the switch to factory default settings, then reload the config and firmware.  I suppose this type of thing can happen with any update.  One just needs to know how to reformat the switch and start over if necessary.

http://h10032.www1.hp.com/ctg/Manual/c02564359
I had never heard of resetting to factory defaults and loading a configuration referred to as "reformatting".

ASKER

Interesting, well there is a first for everything I guess.  But regardless, resetting to factory default settings should not be a knee-jerk reaction to solving a problem.  One should exhaust all other options.  Now I know what to consider for next time.

I never had to do that (reset to default settings) after an update in my 10 years of working with network switches:

- Juniper
- Nortel
- HP ProCurve

I have not had the pleasure of working with Cisco switches, despite their popularity.