TFTP Open Timeout when PXE Booting to WDS Server

Current Config:
1x HP Procurve 5300 series Switch
Vlan 1 = 172.16.x.x Servers
Vlan 2 = 192.168.x.x Clients
IP Helper (DHCP Server address)
Forward Protocol tftp (port 69) to WDS Server
Forward Protocol 4011 to WDS Server

1x DHCP Server = Win 2003 Server
Scopes are set with Option 66 and 67

1x WDS Server = Win Server 2008
Both Servers are on Vlan 1
Client machines are on Vlan 2

Clients on Vlan 1 PXE boot no problem.
Clients on Vlan 2 receive a DHCP address but then receive a TFTP Open Timeout message.

Any suggestions are welcome on how to get the clients in Vlan 2 to PXE boot without this error.
adamlcohen Asked:

vivigatt Commented:
You do not need to forward the TFTP port.
TFTP works across routers (but not across NATs).

You do not need to forward UDP 4011 either.
Check my article here:
http://www.experts-exchange.com/Networking/Misc/A_2978-PXEClient-what-is-it-for-Can-I-use-PXE-without-it.html
If you have to use UDP 4011, the PXE ROM client already knows which server IP address to use for the "PXE requests" and won't use broadcasts to contact a PXE server. Instead, it sends its PXE requests directly to the DHCP server on UDP 4011, so no ip-helper is needed for this port either. Since your WDS server and your DHCP server do not run on the same host, you must NOT use UDP 4011, which means you must not set DHCP option 60 to "PXEClient".

Thus all you need to forward with ip-helpers are the DHCP broadcasts

Regarding options 66 and 67, they should NOT be set if you are using a standard WDS configuration. But since it works on VLAN 1, it should work on VLAN 2 as well, so don't change that.

Now, make sure that WDS TFTP service is bound to the WDS server NIC/IP address which is connected to Vlan 2.

You can run non-PXE tests (easier):
Use a TFTP client. Windows has one (tftp.exe, a command-line tool).
Store a file "foobar.txt" in the TFTP root folder of your WDS server.
From a computer in VLAN1 (to make sure it works OK), run the command
TFTP <WDS-Server IP address> GET foobar.txt
If it works, delete the local file "foobar.txt" (note that this file is set to be read-only).

Now from a computer connected to VLAN2, run the SAME command.
TFTP <WDS-Server IP address> GET foobar.txt

Make sure it works OK and fix things that may prevent it from working.
For instance:
1/ bindings of services
Check on the WDS server with the command
netstat -ba -p UDP -n
to see which processes are bound to the TFTP port (UDP 69), and on which addresses.

You should have something like this:
UDP    0.0.0.0:69             *:*

followed by the name of the WDS process (sorry, I can't tell which process it uses; maybe WDS.EXE!).

2/
Also check your firewall(s) configuration: UDP traffic on port 69 must be allowed from VLAN2 to WDS server

3/ routing:
Make sure that a client in VLAN2 can actually "see" hosts in VLAN1. Use ping, telnet, ssh, whatever, to check the connectivity between the VLANs. ip-helper can fool you into thinking that routing works when it actually does not.

4/
Last but not least: upgrade your ProCurve firmware to the latest version. ProCurves are nice devices, but their firmware often needs to be updated to solve various issues.

pmasotta Commented:
Could this have something to do with the fact that you are forwarding the TFTP control port (69), but probably not the random port the TFTP server uses for sending DATA packets?
That way the file request reaches the TFTP server, but the TFTP answer never gets to the client.

This is explained in RFC 1350.
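
A minimal sketch of the RFC 1350 behaviour described above, assuming Python 3 and the loopback interface only. The read request goes to the server's listening port, but the first DATA packet comes back from a different, server-chosen port (the transfer ID, or TID). The toy server, the `demo` helper and the payload are illustrative assumptions, not WDS code, and an unprivileged port stands in for UDP 69.

```python
import socket
import struct
import threading

OP_RRQ, OP_DATA = 1, 3  # opcodes defined in RFC 1350


def build_rrq(filename, mode="octet"):
    """Read request: opcode 1, filename, NUL, transfer mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def serve_one_request(listener, payload):
    """Toy server: answer a single RRQ from a NEW socket, not the listener."""
    _request, client_addr = listener.recvfrom(516)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as reply_sock:
        # Port 0 lets the OS pick an ephemeral port: this is the server TID.
        reply_sock.bind(("127.0.0.1", 0))
        reply_sock.sendto(struct.pack("!HH", OP_DATA, 1) + payload, client_addr)


def demo():
    """Return (port the request was sent to, port DATA came from, data)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as listener, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        listener.bind(("127.0.0.1", 0))  # stands in for UDP 69
        listener.settimeout(5)
        client.settimeout(5)
        request_port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_one_request,
                                  args=(listener, b"hello"))
        worker.start()
        client.sendto(build_rrq("foobar.txt"), ("127.0.0.1", request_port))
        packet, (_host, data_port) = client.recvfrom(516)
        worker.join()
    opcode, block = struct.unpack("!HH", packet[:4])
    if (opcode, block) != (OP_DATA, 1):
        raise ValueError("unexpected TFTP packet")
    return request_port, data_port, packet[4:]


if __name__ == "__main__":
    sent_to, came_from, data = demo()
    print(f"request sent to port {sent_to}, DATA came from port {came_from}")
```

Any device between client and server that only lets traffic through for the listening port would pass the request and drop the reply, which is the failure mode suggested above.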

adamlcohen (Author) Commented:
Thanks for the reply.

I assume we are talking about the TIDs here? If so, am I correct in assuming Microsoft WDS is using ports 64000-65000? In which case we need to port forward this range from VLAN 2 to VLAN 1? These ports would be open by default, but I guess the clients on VLAN 2 just don't know where to send the traffic.

Shame the dynamic port process doesn't work like FTP, then we wouldn't have to forward all these ports. I guess that is why it is 'trivial'!!!

Cheers.
 
pmasotta Commented:
If the connection to the TFTP server "times out", the clients know where the TFTP server is located and they reach it on UDP port 69.
Next, the TFTP answer tries to come back on a random port; if that port is not open, the answer never reaches the client.
If the clients did not receive the TFTP address or could not connect to it, the message would be something other than a TFTP timeout.

If you are using WDS and your client gets as far as the TFTP instance, the WDS ports for its RPC communication seem to be working fine. Your problem here is not WDS RPC and not DHCP/Proxy DHCP; it seems to be only the TFTP DATA packets...

If you feel confident sniffing the protocols, give Wireshark a try; you'll quickly see where the TFTP traffic gets stuck.


adamlcohen (Author) Commented:
Thanks for all the updates.

I've just checked TFTP from the command prompt.
On both VLAN1 and VLAN2, the response is as follows:
"Error on Server: Transfer mode not supported"
Perhaps this is a typical response from WDS, or have I got the wrong root folder?
Is the WDS root folder <drive>\RemoteInstall?

Anyhow, in response to your other tests...

1/ NETSTAT output.
(Is the Error 5 significant?)
x: Windows Sockets initialization failed: 5
  UDP    172.16.0.10:4011       *:*
  WDSServer
 [svchost.exe]
  UDP    172.16.0.35:67         *:*
  WDSServer
 [svchost.exe]
  UDP    172.16.0.35:68         *:*
  WDSServer
 [svchost.exe]
  UDP    172.16.0.35:69         *:*
  WDSServer
 [svchost.exe]
  UDP    172.16.0.35:4011       *:*
  WDSServer

2/Firewall,
I've disabled the Windows firewall and tested again.
It made no difference.

3/VLAN routing.
VLAN2 clients can ping the WDS, DHCP and other servers on VLAN1.

4/Procurve firmware.
Recently updated to E.11.29.

Thanks for all your help thus far.


pmasotta Commented:
The term "forward" is used here in the sense of the first post, where the server config is described, and not in the "router" sense.
Of course UDP is "routed" by routers. The term means that the particular port has to be able to "travel" from one network to another without being blocked.

The rest of your explanation seems to forget that "everything works" except the TFTP DATA answer...

pmasotta Commented:
Reading @adamlcohen's last post, it seems the problem is not just a "TFTP Open Timeout" message as described in the first post.
And yes, "Windows Sockets initialization failed" could be important if a required socket failed to initialize.

vivigatt Commented:
Look. Your TFTP server is bound only to this IP address:
UDP    172.16.0.35:69

Is it expected ?

Also regarding the error
"Error on Server: Transfer mode not supported"

try this command instead:
TFTP -i <WDS-Server IP address> GET foobar.txt

This will instruct the TFTP client to use binary transfer mode.
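
The binding check discussed above can also be scripted. The following is a rough sketch, assuming Python 3 and assuming the `netstat` output has the same `UDP  address:port  *:*` layout as the lines posted earlier in this thread; the helper names are hypothetical.

```python
import re

# UDP lines copied from the netstat output posted earlier in this thread.
NETSTAT_SAMPLE = """\
  UDP    172.16.0.10:4011       *:*
  UDP    172.16.0.35:67         *:*
  UDP    172.16.0.35:68         *:*
  UDP    172.16.0.35:69         *:*
  UDP    172.16.0.35:4011       *:*
"""

# Captures the local address and the local port of each UDP line.
UDP_LINE = re.compile(r"^\s*UDP\s+(\S+):(\d+)\s", re.MULTILINE)


def udp_bindings(netstat_text, port):
    """Return the local addresses that have the given UDP port bound."""
    return [addr for addr, p in UDP_LINE.findall(netstat_text)
            if int(p) == port]


def listens_on_all_interfaces(netstat_text, port):
    """True if the port is bound to the wildcard address 0.0.0.0."""
    return "0.0.0.0" in udp_bindings(netstat_text, port)


if __name__ == "__main__":
    print("TFTP (69) bound to:", udp_bindings(NETSTAT_SAMPLE, 69))
    print("Wildcard binding:", listens_on_all_interfaces(NETSTAT_SAMPLE, 69))
```

With the posted output, port 69 is bound to a single address rather than to 0.0.0.0, which is the point raised in this comment.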

adamlcohen (Author) Commented:
Good spot!

We actually added the 172.16.0.10 address to this server after it was built; this was the address of our WDS/RIS server, which also had the same problem. Perhaps that is why the bindings on this address are not correct.

So using the IP 172.16.0.35, the response from the TFTP -i command is as follows on both VLANs:
Error on Server : Access Denied.

So I placed foobar.txt in the \Boot\x64\ folder, tried again with
TFTP -i 172.16.0.35 get \Boot\x64\foobar.txt and now get the following error on both VLANs:
'Timeout occurred'

I've installed Microsoft Network Monitor on the WDS server as well, and following the TFTP command I can see the following:
245      14:38:44 06/05/2011      17.0791241      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \boot\x64\foobar.txt, Transfer Mode: octet       {UDP:113, IPv4:112}
257      14:38:45 06/05/2011      18.0684730      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \boot\x64\foobar.txt, Transfer Mode: octet       {UDP:113, IPv4:112}
293      14:38:47 06/05/2011      20.0685479      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \boot\x64\foobar.txt, Transfer Mode: octet       {UDP:113, IPv4:112}
358      14:38:51 06/05/2011      24.0685915      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \boot\x64\foobar.txt, Transfer Mode: octet       {UDP:113, IPv4:112}
468      14:38:59 06/05/2011      32.0687085      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \boot\x64\foobar.txt, Transfer Mode: octet       {UDP:113, IPv4:112}
623      14:39:07 06/05/2011      40.0687536      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \boot\x64\foobar.txt, Transfer Mode: octet       {UDP:113, IPv4:112}
810      14:39:15 06/05/2011      48.0689737      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \boot\x64\foobar.txt, Transfer Mode: octet       {UDP:113, IPv4:112}
938      14:39:23 06/05/2011      56.0690752      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \boot\x64\foobar.txt, Transfer Mode: octet       {UDP:113, IPv4:112}
1078      14:39:31 06/05/2011      64.0701653      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Error - ErrorCode: 0, ErrorMessage: timeout on receive       {UDP:113, IPv4:112}
1081      14:39:31 06/05/2011      64.0724239      svchost.exe      172.16.0.35      172.16.0.218      TFTP      TFTP: Error - ErrorCode: 4, ErrorMessage: Illegal operation error.       {UDP:113, IPv4:112}

NOTE: this is on VLAN1, and we know that PXE booting and imaging work correctly on this network. I get the same log when connecting with TFTP -i on VLAN 2.


So I tried a PXE boot from the same device on VLAN 1 (and it still works OK):
287      14:49:53 06/05/2011      17.6962693      svchost.exe      172.16.0.218      172.16.0.10      DHCP      DHCP:Request, MsgType = REQUEST, TransactionID = 0x35870F22      {DHCP:55, UDP:64, IPv4:63}
290      14:49:53 06/05/2011      17.6967747      svchost.exe      172.16.0.10      172.16.0.218      DHCP      DHCP:Reply, MsgType = ACK, TransactionID = 0x35870F22      {DHCP:55, UDP:64, IPv4:63}
291      14:49:53 06/05/2011      17.6988918      svchost.exe      172.16.0.218      172.16.0.10      TFTP      TFTP: Read Request - File: Boot\x86\pxeboot.com, Transfer Mode: octet tsize: 0       {UDP:65, IPv4:63}
292      14:49:53 06/05/2011      17.7005532            172.16.0.10      172.16.0.218      TFTP      TFTP: Option Acknowledgement - tsize: 25772       {UDP:51, IPv4:63}
293      14:49:53 06/05/2011      17.7007692            172.16.0.218      172.16.0.10      TFTP      TFTP: Error - ErrorCode: 0, ErrorMessage: TFTP Aborted       {UDP:51, IPv4:63}
294      14:49:53 06/05/2011      17.7013935      svchost.exe      172.16.0.218      172.16.0.10      TFTP      TFTP: Read Request - File: Boot\x86\pxeboot.com, Transfer Mode: octet blksize: 1456       {UDP:67, IPv4:63}
295      14:49:53 06/05/2011      17.7030575            172.16.0.10      172.16.0.218      TFTP      TFTP: Option Acknowledgement - blksize: 1456       {UDP:52, IPv4:63}
296      14:49:53 06/05/2011      17.7032664            172.16.0.218      172.16.0.10      TFTP      TFTP: Acknowledgement - Block Number: 0      {UDP:52, IPv4:63}
297      14:49:53 06/05/2011      17.7034037            172.16.0.10      172.16.0.218      TFTP      TFTP: Data - Block Number: 1      {UDP:52, IPv4:63}
298      14:49:53 06/05/2011      17.7038913            172.16.0.218      172.16.0.10      TFTP      TFTP: Acknowledgement - Block Number: 1      {UDP:52, IPv4:63}

Then after F12 on the client.....
1243      14:54:07 06/05/2011      19.0573770            172.16.0.218      172.16.0.35      TFTP      TFTP: Acknowledgement - Block Number: 34      {UDP:56, IPv4:33}
1244      14:54:07 06/05/2011      19.0825033      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \hiberfil.sys, Transfer Mode: octet tsize: 0       {UDP:57, IPv4:33}
1245      14:54:07 06/05/2011      19.0840979      svchost.exe      172.16.0.35      172.16.0.218      TFTP      TFTP: Error - ErrorCode: 4, ErrorMessage: Access violation.       {UDP:57, IPv4:33}
1274      14:54:10 06/05/2011      22.0073586      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \Boot\x86\Images\boot.wim, Transfer Mode: octet tsize: 0       {UDP:68, IPv4:33}
1275      14:54:10 06/05/2011      22.0092173            172.16.0.35      172.16.0.218      TFTP      TFTP: Option Acknowledgement - tsize: 145399718       {UDP:69, IPv4:33}
1276      14:54:10 06/05/2011      22.0093577            172.16.0.218      172.16.0.35      TFTP      TFTP: Error - ErrorCode: 0, ErrorMessage: TFTP Aborted       {UDP:69, IPv4:33}
1277      14:54:10 06/05/2011      22.0094823      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \Boot\Boot.SDI, Transfer Mode: octet tsize: 0       {UDP:70, IPv4:33}
1278      14:54:10 06/05/2011      22.0111048            172.16.0.35      172.16.0.218      TFTP      TFTP: Option Acknowledgement - tsize: 3170304       {UDP:71, IPv4:33}
1279      14:54:10 06/05/2011      22.0113579            172.16.0.218      172.16.0.35      TFTP      TFTP: Error - ErrorCode: 0, ErrorMessage: TFTP Aborted       {UDP:71, IPv4:33}
1280      14:54:10 06/05/2011      22.0113579      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Read Request - File: \Boot\Boot.SDI, Transfer Mode: octet tsize: 0 blksize: 1422 windowsize: 4       {UDP:72, IPv4:33}
1281      14:54:10 06/05/2011      22.0132645      svchost.exe      172.16.0.35      172.16.0.218      TFTP      TFTP: Option Acknowledgement - blksize: 1422 windowsize: 4 tsize: 3170304       {UDP:73, IPv4:33}
1282      14:54:10 06/05/2011      22.0524825      svchost.exe      172.16.0.218      172.16.0.35      TFTP      TFTP: Acknowledgement - Block Number: 0      {UDP:73, IPv4:33}
1283      14:54:10 06/05/2011      22.0529849      svchost.exe      172.16.0.35      172.16.0.218      TFTP      TFTP: Data - Block Number: 1      {UDP:73, IPv4:33}
1284      14:54:10 06/05/2011      22.0530058      svchost.exe      172.16.0.35      172.16.0.218      TFTP      TFTP: Data - Block Number: 2      {UDP:73, IPv4:33}

Even with the access violations, the WDS imaging process still works as expected from VLAN 1.

I am on the WDS server on VLAN1, and I don't see any activity when I monitor the traffic from a client on VLAN2.

??


vivigatt Commented:
The TFTP utility in Windows creates a read-only file on a GET operation.
A subsequent GET of the same file will fail with an "Access is denied" error.

adamlcohen (Author) Commented:
Hmm, I could never get the file transferred, so that does not explain the error?
I think the timeout from TFTP -i GET \Boot\x64\foobar.txt
is closest to the problem we are getting (PXE-E32 TFTP Open Timeout).

What I cannot figure out is why the connection is timing out.

adamlcohen (Author) Commented:
OK, more information: we have just noticed that if we include scope options 66 and 67 on VLAN 1, we see the following PXE messages:
WDSNBP Started Using DHCP Referral
Contacting Server: 172.16.0.35 (Gateway 0.0.0.0) <---- WHAT!
Contacting Server: 172.16.0.35
TFTP Download: Boot\x64\pxeboot.com

Press F12 for network service boot.

------
OK, so why is the gateway 0.0.0.0? If we get this gateway on VLAN 2 as well, then there is no way the traffic can route between VLAN1 and VLAN2.

Does anyone know where WDS is getting this gateway address from, as this is not configured on the server NIC ?

Cheers ?

vivigatt Commented:
It is NOT WDS that sets this gateway, but the ship server. The DHCP scope for VLAN2 should set the correct gateway for the hosts in this LAN.
Can you connect a PC to VLAN2 and boot it from its HDD, getting its IP config by DHCP? Then check that it can ping its router (the interface on the ProCurve that is in VLAN2), and then that it can ping the WDS server.
After all, it may be a routing problem!
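
To illustrate why a gateway of 0.0.0.0 is harmless on VLAN 1 but would matter on VLAN 2, here is a small sketch using Python's standard `ipaddress` module. The subnet masks are the ones shown in the switch's IP configuration posted further down this thread; the VLAN 2 client address 192.168.0.50 is a hypothetical example, and `can_reach` is a deliberately simplified model (it only checks whether a next hop exists, not whether the router actually forwards).

```python
import ipaddress


def server_is_on_link(client_ip, netmask, server_ip):
    """True if the server sits in the client's own subnet (no router needed)."""
    subnet = ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(server_ip) in subnet


def can_reach(client_ip, netmask, gateway, server_ip):
    """Simplified model: off-link servers need a non-zero gateway."""
    if server_is_on_link(client_ip, netmask, server_ip):
        return True
    return ipaddress.ip_address(gateway) != ipaddress.ip_address("0.0.0.0")


if __name__ == "__main__":
    wds = "172.16.0.35"
    # VLAN 1 client from the traces above: same subnet as the WDS server.
    print(can_reach("172.16.0.218", "255.255.252.0", "0.0.0.0", wds))
    # Hypothetical VLAN 2 client without a usable gateway.
    print(can_reach("192.168.0.50", "255.255.248.0", "0.0.0.0", wds))
    # Same client with the VLAN 2 router interface as its gateway.
    print(can_reach("192.168.0.50", "255.255.248.0", "192.168.0.1", wds))
```

A PXE ROM that ignores or never receives the router option would behave like the second case: DHCP succeeds through the relay, but unicast TFTP to another subnet has nowhere to go.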

vivigatt Commented:
Of course I meant DHCP server, not ship server. Stupid autocorrect on Android!

adamlcohen (Author) Commented:
Outside of the PXE boot issue, all our desktops on VLAN2 can connect to, ping, and map drives on any of the servers on VLAN1. I can also release and renew the IP address of a client on VLAN 2.

Cheers.

adamlcohen (Author) Commented:
Just to clarify your other point;
Can ping the gateway of the VLAN1 and VLAN2 from either VLAN.
Also can ping the SHIP :) server from a client on VLAN2.

So don't think this is a routing issue?

vivigatt Commented:
I seem to recall that there used to be some trouble with PXE and the DHCP-provided gateway.
Is there a BIOS update for the computers that are supposed to PXE boot? You could try that.
Also, you may want to try a "PXE on floppy" just to see if the issue is related to your PXE implementation.

vivigatt Commented:
Can you post a trace (a real .cap file...) of what happens on UDP port 67, 68, 69 and 4011 on the WDS server when a client from Vlan2 is booting?

adamlcohen (Author) Commented:
Sorry, I have not updated this recently.
However, as you suggested, I attempted it from the PXE boot disk and the whole process worked perfectly from the 192.168.0.0 network.

So now I am confused as to why the integrated PXE fails to work on all our clients, but the same clients work with the bootable disk...?

vivigatt Commented:
This could be a BUG in the PXE implementation.
It wouldn't get or use the "gateway".
What is the NIC in the clients? And the PXE code version/level?
Check for BIOS updates for your clients...

adamlcohen (Author) Commented:
On this particular machine, I've just updated the BIOS, but the Intel UNDI was always PXE 2.1 (Build 082), which I think is the latest. This one has a Realtek controller, although I think the issue affects all clients on the 192.168.0.0 network, and we have around 500+ of them.

vivigatt Commented:
Intel base code 082 is not bugged, but the issue may be in the UNDI driver.
Just to make sure that I understand:
When you use the PXE boot disk on a client that cannot boot through its own embedded PXE, then it works?

adamlcohen (Author) Commented:
Yes, that's correct.

vivigatt Commented:
Correct me if I am wrong, but the boot disk may be using PXE 0.99 (if it is a LanWorks or Argon Technology boot disk). Can you confirm? This could help us understand what actually happens.

adamlcohen (Author) Commented:
The boot disk is from http://rom-o-matic.net/.

The image reports itself as gPXE 1.0.1+

Cheers.

vivigatt Commented:
OK. gPXE is not PXE, and it adds some more stuff to the PXE system.
So it may be difficult to draw useful conclusions from the fact that it works with gPXE but not with embedded PXE.

I think that a dual recording of the packets on the client side and then on the server side may help.
You might be able to set one port of your switch to be used as a "sniffer" port on the VLAN in which the server is connected and then on the 192.168.0.0 VLAN.
Filter on UDP 67, 68 and 69 (and 4011, who knows).
You might want to record
- 1/ a non working session on the client side
- 2/ a non working session on the server side (with the same client)
- 3/ 2 working sessions (with the gPXE boot disk) on the client and server side

And then, maybe (if you can), post the captures so that we can study them.

Ah, one thing: What is the setting of STP (Spanning Tree Protocol) on your switch(es)? It should be enabled and, if possible, set to "portfast".
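
As a sketch of the capture filtering suggested above, the helpers below build filter strings in the BPF capture-filter syntax used by tcpdump and Wireshark. This assumes Python 3; the function names are hypothetical, and 172.16.0.35 is the WDS address seen in the traces earlier in the thread. Since TFTP DATA packets leave the server from a port other than 69 (see RFC 1350), a port-only filter is likely to show the requests but miss the transfer itself, so a broader per-host filter is included as well.

```python
def pxe_port_filter(server_ip, ports=(67, 68, 69, 4011)):
    """Filter for DHCP, TFTP requests and PXE (UDP 4011) to/from one host."""
    port_clause = " or ".join(f"udp port {p}" for p in ports)
    return f"host {server_ip} and ({port_clause})"


def all_udp_filter(server_ip):
    """Broader filter: every UDP packet to/from the host, including the
    TFTP DATA/ACK packets that use server-chosen ports."""
    return f"host {server_ip} and udp"


if __name__ == "__main__":
    print(pxe_port_filter("172.16.0.35"))
    print(all_udp_filter("172.16.0.35"))
```

Capturing with the broader filter on both sides makes it easier to tell whether the server's DATA packets are never sent or are lost on the way back to VLAN 2.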

adamlcohen (Author) Commented:
Thanks for the update. I believe that HP does have a configurable monitor port, so I'll see what I can capture for you.

With regard to STP, I know that this is not enabled on the switches. We were advised by HP to disable it as it causes too many problems. Just so I understand, can you fill me in on what spanning tree will do to help DHCP/PXE traffic?

Many thanks,
Adam.

vivigatt Commented:
Spanning tree will prevent loops in your network (by detecting them and disabling the corresponding ports).
If configured correctly, this is not a problem, usually.

I know the ProCurve hardware a little. Can you tell me what firmware is installed on your 5300?

adamlcohen (Author) Commented:
Current firmware is E.11.29.

This device was updated about a month ago.

Cheers.

vivigatt Commented:
What is the status of "IP Routing" on your switch?

can you go on the "IP configuration" page and let us know if you have something similar to:

IP Routing : Enabled


  Default TTL     : 64  
  Arp Age         : 20  
  Domain Suffix   :                              
  DNS server      :                                        

  VLAN                 | IP Config  IP Address      Subnet Mask     Proxy ARP
  -------------------- + ---------- --------------- --------------- ---------
  DEFAULT_VLAN         | Manual     192.168.13.252  255.255.255.0   No
  VLAN_14              | Manual     192.168.14.252  255.255.255.0   No
  VLAN_16              | Manual     192.168.16.252  255.255.255.0   No
  VLAN_18              | Manual     192.168.18.252  255.255.255.0   No
  VLAN_23              | Manual     192.168.23.252  255.255.255.0   No

adamlcohen (Author) Commented:
Here you go:
 Internet (IP) Service

  IP Routing : Enabled


  Default TTL     : 64
  Arp Age         : 20

  VLAN         | IP Config  IP Address      Subnet Mask     Proxy ARP
  ------------ + ---------- --------------- --------------- ---------
  default      | Manual     172.16.1.100    255.255.252.0   No
  VLAN2        | Manual     192.168.0.1     255.255.248.0   No
  VLAN3        | Manual     192.168.8.1     255.255.248.0   No
  VLAN2100     | Disabled

vivigatt Commented:
Seems OK to me.
I assume that there is no routing issue, since this works with gPXE on "disk-booted systems".
Just to make sure: clients in VLAN2 must receive 192.168.0.1 as their default gateway.

To use the monitoring feature on a ProCurve, go to "configuration"/Monitor port.
Then choose to monitor either a VLAN or another port. In your case, VLANs may be better.

BTW: Are the clients in VLAN3 able to PXE boot or do they act exactly as the ones in VLAN2?

adamlcohen (Author) Commented:
Thanks, yes, clients on VLAN2 get 192.168.0.1 as the gateway.
I verified this from gPXE and also from the OS.

I'll set up the port monitoring tomorrow and see what we get.

VLAN3 is our wireless network, so those clients don't generally PXE boot.

Thanks for all your help thus far.


adamlcohen (Author) Commented:
The only solution was to add a second NIC to the WDS server and put it on VLAN2.
This meant that all clients on VLAN2 could now see the WDS server.
Not an ideal solution, but the only one that worked.
The HP port mirroring did not show any conclusive results for PXE boot requests.


(P.S. Sorry for the late update; I have been on a sabbatical since the last update.)

adamlcohen (Author) Commented:
Could not get any usable info from HP port monitoring.
Even HP could not solve this problem.
Had to resort to a dual-nic/VLAN configuration
Question has a verified solution.
