bjulian
asked on
XP cannot access Samba without restart
My office has a mixed network of 98SE and XP Pro PCs and two linux servers running Samba 3.x. One server is set up as a secondary Samba server, pulling a windows browsing list from the other, the primary Samba box. The secondary box is set to not attempt to become the master browser, so as not to compete with the primary box. I have disabled the computer browser service on all XP PCs.
Every morning for the last two weeks I've had to restart the primary samba server before two of the three XP boxes will be able to access it via SMB. No problems connecting the XP boxes via any other services (telnet, ssh, ping, pop3, DNS), just SMB. After a restart, all PCs connect without problems to the primary samba server *until the next morning*.
At first I thought it was some kind of negotiation problem caused by XP, since no other clients were having problems. I tried to test that by temporarily fixing the XP connection problem with a samba restart, and rebooting one of the problem XP boxes in the hopes that the problem would recur, but it doesn't - not until the next morning. I've tried to isolate the problem - win98 PCs never have connection problems to the primary samba server. One of the XP computers, running an upgrade copy of Pro, does not have the connection issue at all. I verified this by booting it before the others. No PCs have problems connecting to the secondary samba server.
Some more about the environment:
Been having network browsing problems, but that was fixed by disabling computer browser service on all XP machines and installing NetBeui protocol. All PCs (98 and XP) have TCP/IP, IPX/SPX, NetBeui and NetBios support installed. Probably overkill but I'm not concerned with the security of it. I don't think that issue is related to my current problem. I should state though, that when the XP pc's can't connect, they can't view the shares of the primary samba server in their network browse lists, even though it is listed.
Two weeks ago I had to rebuild the primary server after a crash, but I recovered the smb.conf file from backup, so I don't know what has changed. I've done some tinkering lately, added a line restricting the port that SMB connections can operate on.
Everyone is on the same subnet. I have no TCP/IP problems. Everyone uses the same workgroup. I don't do windows domain logons, but smbclient lists the domain as CAF for both samba servers, and as the individual computer name of the XP units. Don't know if that matters or not, not sure how to change it on XP side.
Below are my samba setups, minus the shares info (individual share setup shouldn't matter -- XP computers can't see *any* shares, none of it has changed in months anyway)
primary samba server (192.168.99.4) smb.conf:
# Global parameters
[global]
workgroup = CAF
server string = cityh
map to guest = Bad User
username map = /etc/samba/smbusers
password level = 8
username level = 8
log file = /var/log/samba/%m.log
max log size = 50
smb ports = 139
socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
os level = 65
preferred master = Yes
domain master = Yes
wins support = Yes
remote announce = 192.168.99.255
hosts allow = 192.168.99.
preserve case = No
short preserve case = No
secondary samba server (192.168.99.9) smb.conf:
# Global parameters
[global]
workgroup = CAF
server string = cityJ
obey pam restrictions = Yes
pam password change = Yes
log file = /var/log/samba/%m.log
max log size = 50
smb ports = 139
socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
local master = No
domain master = No
remote browse sync = 192.168.99.4
hosts allow = 192.168.99.
This is probably a pretty difficult question so I gave it 500 pts.
Any help is appreciated.
ASKER
Well, I assume so... nmbd.log for the last few days appears to only show the usual startup routine, once a day, when I restart samba in the morning. nmbd does take longer to shut down than smbd does when I run /etc/init.d/smb restart, but I don't get a shutdown failure warning for nmbd.
This is the current output of ps searching for nmbd and smbd:
root # ps -ef | grep nmbd
root 5542 1 0 Jun20 ? 00:00:00 nmbd -D
root 15958 1 0 Jun21 ? 00:00:00 nmbd -D
root 29749 1 0 Jun22 ? 00:00:00 nmbd -D
root 9402 1 0 06:57 ? 00:00:00 nmbd -D
root 9403 9402 0 06:57 ? 00:00:00 nmbd -D
root 11929 11699 0 11:02 pts/1 00:00:00 grep nmbd
root # ps -ef | grep smbd
root 9398 1 0 06:57 ? 00:00:00 smbd -D
root 9720 9398 0 07:27 ? 00:00:00 smbd -D
root 9764 9398 0 07:35 ? 00:00:00 smbd -D
root 9811 9398 0 07:38 ? 00:00:00 smbd -D
root 9977 9398 0 08:11 ? 00:00:00 smbd -D
root 10243 9398 0 08:37 ? 00:00:00 smbd -D
root 10416 9398 0 08:52 ? 00:00:00 smbd -D
root 11931 11699 0 11:02 pts/1 00:00:00 grep smbd
smbd entries all show they were started today, but nmbd has some that are days old. Is that indicative of a problem? As I stated before, I have restarted smb every morning for the past few days, so I'd expect all nmbd entries to be no more than 1 day old.
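One way to check for leftover daemons like the ones in the listing above is to count the nmbd processes whose parent is PID 1. This is only a sketch: the function name `count_toplevel_nmbd` is hypothetical, and it assumes the `ps -ef` column layout shown above (PPID in column 3, command in column 8).

```shell
# Count top-level nmbd daemons (parent PID 1) in `ps -ef` output read from stdin.
# Assumes the column order UID PID PPID C STIME TTY TIME CMD.
count_toplevel_nmbd() {
    awk '$3 == 1 && $8 ~ /(^|\/)nmbd$/ { n++ } END { print n + 0 }'
}

# Typical use on the server:
#   ps -ef | count_toplevel_nmbd
```

Fed the nmbd listing above, this prints 4. After a clean restart one would normally expect a single top-level nmbd, so a larger count suggests that earlier instances never exited.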
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I assume that command is supposed to register a request from smbclient for an invalid user outside of the subnet. I tried #smbclient --user pje -L 192.168.1.100 from the secondary linux server and the terminal hung for about a minute until I hit CTRL-C. I tried an unassigned address within my subnet:
# smbclient --user pje -L 192.168.99.100
Error connecting to 192.168.99.100 (No route to host)
Connection to 192.168.99.100 failed
I checked smbd.log on the primary server, and smbd.log on the secondary server, nothing showed up. I must be doing this incorrectly.
Sorry - replace pje with an appropriate user for your setup, and substitute the IP address of your server. It will show whether the shares are all functioning correctly by returning a complete list as the result of a query, unlike your Windows 98 clients, which connect to a specific share.
Likewise from your Windows systems replace the ip address with that of your server!
( (()
(`-' _\
'' ''
ASKER
pjedmond,
I ran the command from both primary and secondary servers and got the same output before restarting the server.
[root@cityj root]# smbclient --user ben -L192.168.99.4
Password:
Anonymous login successful
Domain=[CAF] OS=[Unix] Server=[Samba 3.0.2-6.3E]
Sharename      Type      Comment
---------      ----      -------
public         Disk      CityH storage drive
bigdisk        Disk      CityF bigdisk
u              Disk      CityF bigdisk u
geoghostimg    Disk      GCS ghost images
geoghostexe    Disk      GCS shared ghost executable
IPC$           IPC       IPC Service (the real cityh)
ADMIN$         IPC       IPC Service (the real cityh)
lp             Printer
doslp          Printer
faxlp          Printer
shiplp         Printer
xerox8550      Printer
xerox          Printer
franklp        Printer
benlp          Printer
Anonymous login successful
Domain=[CAF] OS=[Unix] Server=[Samba 3.0.2-6.3E]
Server         Comment
---------      -------
CITYF          SCO VisionFS 3.00.913
CITYH          the real cityh
CITYJ          cityJ
REPAIR2
SPECTRO2
Workgroup      Master
---------      -------
CAF            CITYH
From the client XP box before smbd restart:
C:\I386>net view \\192.168.99.4
Shared resources at \\192.168.99.4
the real cityh
Share name     Type    Used as   Comment
-------------------------------------------------------------
benlp          Print
bigdisk        Disk              CityF bigdisk
chuck          Disk              Home Directories
doslp          Print
faxlp          Print
franklp        Print
geoghostexe    Disk              GCS shared ghost executable
geoghostimg    Disk              GCS ghost images
lp             Print
public         Disk              CityH storage drive
shiplp         Print
u              Disk              CityF bigdisk u
xerox          Print
xerox8550      Print
The command completed successfully.
No shares appear missing from either end. I did notice the browse list was incomplete, but it always seems to take a few hours in the day before it is complete. I assume that is an unavoidable byproduct of the way netbios works.
On the other hand, perhaps the browse list is at the core of the problem after all. I found that if I specify mapping to the share on the primary samba server using the *IP* (not the netbios/dns name), it works fine from the XP box -- smbd restart not required. I assume that means the problem is not with samba or sharing but with nmbd and name resolution from that box. Perhaps the problem is the many protocols I have installed, or a misconfiguration in samba WINS functionality? Samba with WINS support and DNS (named) both run off of 192.168.99.4, and the XP computers are set to use DNS and WINS on 99.4. I should reiterate that even with an incomplete browse list on the primary samba server, other boxes can connect to the same share without problems.
I would go to a completely TCP/IP and DNS based system if I knew how to do it properly from XP and 98. If I knew I didn't have to use WINS, netbios and netbeui, I wouldn't. I'm afraid that if I remove those older protocols, though, it will break connectivity for all of the 98 boxes and bring back all my old client-side windows browsing list problems.
Is WinXP configured to use WINS server?
You're absolutely right - trying to get XP and 98 to co-exist is a complete nightmare.
> No shares appear missing from either end. I did notice the browse list was incomplete, but it always seems to take a few hours in the day before it is complete. I assume that is an unavoidable byproduct of the way netbios works.
I normally reckon about 30 minutes to an hour to guarantee that it's updated.
Did you try changing the os level to 255? The fact that one XP system can access and the others can't vaguely suggests to me that one of your XP Pro systems has decided to be the Master 'Workgroup' PC, and the others are then asking it for information which it is failing to provide. 65 is sufficient for XP Home, but I don't have access to what XP Pro does - I'd expect a higher value.
Did you check the Samba server logs when you try to connect to a share or when you try:
Start->run->
\\ipa.dd.rr.ess
This will tell you whether the server is responsible for the rejection, or whether the XP Pro system cannot even connect to it.
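Before and after changing the os level, it can help to confirm what the config file actually says. Samba's own `testparm` utility is the authoritative way to dump the effective configuration; the snippet below is only a rough sketch for pulling a single [global] parameter out of smb.conf-style text. The function name `smbconf_global` is made up for illustration, and unlike Samba itself it does not ignore spaces inside parameter names.

```shell
# Print the value of one [global] parameter from smb.conf-style text on stdin.
# $1 = parameter name, e.g. "os level". Prints nothing if the parameter is absent.
smbconf_global() {
    awk -v want="$1" '
        /^[ \t]*[#;]/ { next }    # skip comment lines
        /^[ \t]*\[/   { in_global = (tolower($0) ~ /^[ \t]*\[global\]/); next }
        in_global {
            eq = index($0, "=")
            if (eq == 0) next
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)    # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (tolower(key) == tolower(want)) print val
        }
    '
}

# Typical use on the server:
#   smbconf_global "os level" < /etc/samba/smb.conf
```

Against the primary server's smb.conf as posted in the question, this would print 65 for "os level" and Yes for "preferred master".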
( (()
(`-' _\
'' ''
ASKER
I changed the server OS level to 255 today. I also killed all the old nmbd processes. The one user that consistently has a problem is logging into his share just fine if he connects specifying the IP. From the user's log file:
[2006/06/27 05:20:57, 1] smbd/service.c:make_connection_snum(705)
chuck2 (192.168.99.13) connect to service u initially as user chuck (uid=200, gid=100) (pid 19028)
[2006/06/27 06:54:34, 1] smbd/service.c:close_cnum(887)
chuck2 (192.168.99.13) closed connection to service u
That connection was before raising the OS level, so I'll have to try again tomorrow and see if the OS level change makes a difference. It can't hurt.
I was looking at my nmbd log and noticed that, for some reason, nmbd was attempting to make itself master browser on what looked like odd subnets: 192.168.99.4 and 192.168.98.4. I may have this wrong, but a subnet of 99.4 would, as I understand it, consist only of 99.4 and down. Also, subnet 192.168.98.0 isn't in use. I don't know why samba doesn't announce itself as a master browser on 99.0 as I intended. The subnet mask in use on our network is the typical 255.255.255.0. Nmbd also listed the wins server as "Attempting to become domain master browser on workgroup CAF, subnet UNICAST_SUBNET". Not sure what "UNICAST_SUBNET" means - perhaps just that it is attempting to broadcast to all accessible subnets? Bottom line, I'm not quite sure what nmbd is trying to do, but it doesn't look like what I'd expect.
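The arithmetic behind the subnet question can be sketched in shell. With a 255.255.255.0 mask, 192.168.99.4 is a host address inside the network 192.168.99.0 rather than a subnet of its own; nmbd appears to label each subnet in its log by the address of the local interface it is bound to (that reading is an assumption, not something confirmed in this thread). The helper name `subnet_range` is hypothetical.

```shell
# Print the network and broadcast addresses for an IPv4 address and netmask.
# $1 = address, $2 = netmask, both in dotted-quad form.
# The body runs in a subshell so the IFS change does not leak to the caller.
subnet_range() (
    IFS=.
    set -- $1 $2    # $1..$4 = address octets, $5..$8 = mask octets
    net="$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
    # Broadcast: keep the network bits, set every host bit to 1.
    bcast="$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
    echo "network=$net broadcast=$bcast"
)

subnet_range 192.168.99.4 255.255.255.0
# prints: network=192.168.99.0 broadcast=192.168.99.255
```

Running the same function on 192.168.98.4 gives network 192.168.98.0, which would be consistent with the server having a second interface or alias on that address, though the thread does not say whether it does.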
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
For future reference - log files are almost always the first place to look when something does not appear to be doing what it should... in fact you often find other 'oddities' when you start browsing the logs... as you may be discovering ;)
( (()
(`-' _\
'' ''
ASKER
I checked the log files before I posted here, but I didn't really understand what they were telling me, and I still don't in the case of nmbd.log, though bmquintas's input may shed some light: "remote announce = 192.168.99.255 (no need for this, its used to "tell" other lmb in OTHER SUBNETS what he's got)".
I'm going to take bmquintas's suggestion and disable both remote announce and remote browse sync.
WINS doesn't and never did help at all, regardless of which clients were configured to use it. It's probably not working but I'm just going to ignore it.
Everyone, thanks for all your help. Pjedmond, I think what helped the most was raising the OS level. That, and killing those nmbd processes, because nmbd seems to be restarting without problems now. It turned out to be a browsing issue, and not something wrong with samba sharing.