• Status: Solved
  • Priority: Medium
  • Security: Public

Proxy.pac Not Returning Expected Results

Hello,

We are attempting to use a proxy.pac file on our Windows 2003 web server to limit which web sites can be accessed from some public library computers.  Here is an example of the proxy.pac file:

function FindProxyForURL(url, host)
{
    if (isPlainHostName(host) ||
        dnsDomainIs(host, ".learnatest.com") ||
        dnsDomainIs(host, ".learningexpresslibrary.com") ||
        dnsDomainIs(host, ".galegroup.com"))
        return "DIRECT";
    else
        return "PROXY alert.owls.lib.wi.us";
}
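
One way to check what this script returns, outside the browser, is to run it with stand-ins for the PAC helper functions. The sketch below assumes simplified versions of isPlainHostName and dnsDomainIs (the real ones are supplied by the browser), and proxyFor is a hypothetical wrapper added only for this test. It suggests the script returns the same proxy string whether or not the URL has a path, so the script itself never rewrites the URL.

```javascript
// Simplified stand-ins for helpers that the browser normally provides
// to PAC scripts. These are assumptions for testing, not the browser's code.
function isPlainHostName(host) {
  return host.indexOf(".") === -1;
}

function dnsDomainIs(host, domain) {
  return host.length >= domain.length &&
    host.substring(host.length - domain.length) === domain;
}

// The PAC function from the post.
function FindProxyForURL(url, host) {
  if (isPlainHostName(host) ||
      dnsDomainIs(host, ".learnatest.com") ||
      dnsDomainIs(host, ".learningexpresslibrary.com") ||
      dnsDomainIs(host, ".galegroup.com"))
    return "DIRECT";
  else
    return "PROXY alert.owls.lib.wi.us";
}

// Hypothetical wrapper: pass the URL and its host name, as a browser would.
function proxyFor(url) {
  return FindProxyForURL(url, new URL(url).hostname);
}

console.log(proxyFor("http://www.galegroup.com/"));        // DIRECT
console.log(proxyFor("http://www.symantec.com/"));         // PROXY alert.owls.lib.wi.us
console.log(proxyFor("http://www.symantec.com/avcenter")); // PROXY alert.owls.lib.wi.us
```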

We have configured IIS on our Server 2003 web server with a MIME type mapping: "Associated Extension: .pac" with "Content Type (MIME): application/x-ns-proxy-autoconfig".

We have also configured IE 6.0 to "Use automatic configuration script" which points to the URL location of our proxy.pac file.

At first our tests looked very good, but we have found an odd problem.  First of all, we can access the 3 sites listed in the proxy.pac file without any issues, which is good.  And if we try to access, for example, http://www.symantec.com, the browser comes back with our custom "sorry you can't get there from here" page.  So far so good, as this is what we would expect.

But the problem is that if we try to go to a Web site's subdirectory, for example, http://www.symantec.com/avcenter, the browser comes back with a "Page Cannot Be Found" 404 error.  This pattern is very consistent and happens with subdirectories of all Web sites not listed in the proxy.pac file (not just Symantec's).  We would like these subdirectory visits to also return our "sorry" page, which is much kinder than a general 404 error.

Can anyone see a problem in our proxy.pac file?  Or am I missing a step with this type of configuration?

Thank you in advance for your help,

Dave Bacon
Computer Network Manager
Outagamie Waupaca Library System
 
owlsnetAuthor Commented:
After many hours of testing, I now have more information available on what I see is happening with this problem.

Using the proxy.pac file as described earlier, I mentioned that only domain-level URLs (like http://www.symantec.com) returned our desired PROXY web site of "alert.owls.lib.wi.us", and that accessing a domain-level URL with a subdirectory (like http://www.symantec.com/avcenter) caused a 404 error.

I have found that the issue is not limited to domain-level URLs with a subdirectory.  The 404 error also happens when we access another HTML file at the top level of the domain (like http://www.symantec.com/testpage.htm, which is not a real page; it's just an example).

More importantly (and here is the real issue), I have found that the 404 errors are not happening because the "URL sub-pages" cannot be found at the requested domain, such as symantec.com.  Instead, the address "alert.owls.lib.wi.us" from the proxy.pac line ( return "PROXY alert.owls.lib.wi.us" ) is somehow replacing the domain of the requested URL while the rest of the URL request string is kept.  Here is an example:

Requested URL:  "http://www.symantec.com/avcenter"
Proxy modified URL:  "http://alert.owls.lib.wi.us/avcenter"

And of course we don't have an "avcenter" folder on our Web server which in turn causes the 404 error.
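
The behavior described above can be sketched in a few lines. This is an illustration only, under the assumption that the web server at alert.owls.lib.wi.us is not acting as a proxy, so it ignores the host part of the proxy-style request and looks up only the path on its own site. The function names proxyRequestLine and pathSeenByWebServer are hypothetical and exist only for this example.

```javascript
// When a browser is told to use a proxy, it sends the full URL in the
// request line instead of just the path.
function proxyRequestLine(requestedUrl) {
  return "GET " + new URL(requestedUrl).href + " HTTP/1.1";
}

// Assumption: a plain web server (not a real proxy) that receives such a
// request keeps only the path and query, and looks for them on its own site.
function pathSeenByWebServer(requestedUrl) {
  const u = new URL(requestedUrl);
  return u.pathname + u.search;
}

console.log(proxyRequestLine("http://www.symantec.com/avcenter"));
// GET http://www.symantec.com/avcenter HTTP/1.1

console.log(pathSeenByWebServer("http://www.symantec.com"));          // /
console.log(pathSeenByWebServer("http://www.symantec.com/avcenter")); // /avcenter
```

Under that assumption, a request for the bare domain maps to "/" (the default "sorry" page), while any other path maps to a file or folder that does not exist on the web server, which matches the 404 errors observed.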

So the new question is: is there a way to modify this configuration so that the returned proxy information ONLY goes to "alert.owls.lib.wi.us" and ignores any other information that is getting appended to the end of that proxy address?

Thank you again for any help you can offer.

Dave Bacon - OWLS
 
Netman66Commented:
Try placing a trailing period at the end of your return line:


return "PROXY alert.owls.lib.wi.us."

I will continue my testing in the meantime.


 
Netman66Commented:
or

return "PROXY alert.owls.lib.wi.us/default.html"  <== or whatever your default page is.


 
owlsnetAuthor Commented:
Thank you for your comments on this problem Netman66.  I've tried both without success.  Adding a trailing period did not change how it worked.  And specifying the default page (return "PROXY alert.owls.lib.wi.us/default.htm") made everything become a 404 error.  I did verify that "default.htm" is the correct file.

Do you (or anyone else) have any other suggestions?

Thanks again,

Dave Bacon -OWLS
 
Netman66Commented:
Interesting..

I'll keep digging.

 
Netman66Commented:
Every reference I have looked at has your last line written like this:

return "PROXY 192.168.0.1:80"

...rather than a site name.

I can't see why this would matter, but I have yet to see a sample file written using a site name instead of an IP address.

 
owlsnetAuthor Commented:
Thanks again for your time on this difficult issue, Netman66.  I modified the return PROXY line to use our IP address and port 80, and as expected there was no change in the result.

Do you know if the return PROXY line is designed to take the additional URL data from the original request and pass the complete modified URL on to the PROXY?

For example:
Requested URL:  "http://www.symantec.com/avcenter"
Proxy modified URL:  "http://alert.owls.lib.wi.us/avcenter"

Dave Bacon - OWLS
 
Netman66Commented:
It shouldn't do that...but it appears it does and that's what is puzzling.

Actually, now that I'm thinking about it....

What is happening is that your PAC conditions fall through to the else branch whenever you do not hit one of the specified sites, so the script returns a proxy to use to reach the requested site.  The browser then sends the request to that proxy (because you told it to), and your server looks on its own site for the requested path, which does not exist, so it returns a 404.  So it appears the PAC file does work properly.

How about replacing your 404 page with the default.html to see if the desired message is displayed?

Either that, or use the proxy for good requests only and the rest do not even get outside the network.

function FindProxyForURL(url, host)
{
    if (isPlainHostName(host) ||
        dnsDomainIs(host, ".learnatest.com") ||
        dnsDomainIs(host, ".learningexpresslibrary.com") ||
        dnsDomainIs(host, ".galegroup.com"))
        return "PROXY alert.owls.lib.wi.us";
}
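
A note of caution on the variant above, shown as a small test. Using the same kind of simplified stand-ins for the browser's PAC helpers (an assumption for testing only), the function returns no value at all for hosts that are not in the list. How a given browser treats that missing result is not covered in this thread, so it would be worth verifying in IE 6 before relying on it to block anything.

```javascript
// Simplified stand-ins for the browser-supplied PAC helpers (assumptions).
function isPlainHostName(host) {
  return host.indexOf(".") === -1;
}

function dnsDomainIs(host, domain) {
  return host.length >= domain.length &&
    host.substring(host.length - domain.length) === domain;
}

// The variant from the comment above: there is no else branch, so the
// function falls off the end for every host that is not listed.
function FindProxyForURL(url, host) {
  if (isPlainHostName(host) ||
      dnsDomainIs(host, ".learnatest.com") ||
      dnsDomainIs(host, ".learningexpresslibrary.com") ||
      dnsDomainIs(host, ".galegroup.com"))
    return "PROXY alert.owls.lib.wi.us";
}

console.log(FindProxyForURL("http://www.galegroup.com/", "www.galegroup.com"));
// PROXY alert.owls.lib.wi.us

console.log(FindProxyForURL("http://www.symantec.com/avcenter", "www.symantec.com"));
// undefined
```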
 
owlsnetAuthor Commented:
Thank you again Netman66 for your help (and time) with this question.  I had already modified the 404 page before I posted this question here.  Yes - It works but is not the best solution.  If an allowed page goes down, then the modified 404 may give the Library Patron the wrong idea.  We'll have to word the 404 page carefully.

Because of your time and willingness to help me with this, I am going to accept your last comment.  Please let me know if you have any other ideas.

Dave Bacon - OWLS