owlsnet


Proxy.pac Not Returning Expected Results

Hello,

We are attempting to use a proxy.pac file on our Windows 2003 web server to limit which web sites can be accessed from some public library computers.  Here is an example of the proxy.pac file:

function FindProxyForURL(url, host)
{
    if (isPlainHostName(host) ||
        dnsDomainIs(host, ".learnatest.com") ||
        dnsDomainIs(host, ".learningexpresslibrary.com") ||
        dnsDomainIs(host, ".galegroup.com"))
        return "DIRECT";
    else
        return "PROXY alert.owls.lib.wi.us";
}
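
To check outside the browser what the script above returns for a given URL, it can be run under Node.js. This is only a minimal sketch: `isPlainHostName` and `dnsDomainIs` are normally supplied by the browser, so the versions below are simplified stand-ins written for this test, and the sample URLs are just the ones mentioned in this thread.

```javascript
// Simplified stand-ins for the PAC helpers the browser normally provides.
// These are assumptions for local testing, not the browser's own code.
function isPlainHostName(host) {
  // A "plain" host name contains no dots (e.g. "intranet").
  return host.indexOf(".") === -1;
}

function dnsDomainIs(host, domain) {
  // True when the host name ends with the given domain suffix.
  return host.length >= domain.length &&
    host.substring(host.length - domain.length) === domain;
}

// Same logic as the proxy.pac file above.
function FindProxyForURL(url, host) {
  if (isPlainHostName(host) ||
      dnsDomainIs(host, ".learnatest.com") ||
      dnsDomainIs(host, ".learningexpresslibrary.com") ||
      dnsDomainIs(host, ".galegroup.com"))
    return "DIRECT";
  else
    return "PROXY alert.owls.lib.wi.us";
}

// Print the decision for a few sample URLs.
const samples = [
  "http://www.learnatest.com/",
  "http://www.symantec.com/",
  "http://www.symantec.com/avcenter",
];
for (const url of samples) {
  const host = new URL(url).hostname;
  console.log(url, "->", FindProxyForURL(url, host));
}
```

Because the script only inspects `host`, both symantec.com URLs get the same answer. The PAC file decides where the request is sent, not which page comes back.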

We have configured IIS on our Server 2003 web server with a MIME type entry of "Associated extension: .pac" and "Content type (MIME): application/x-ns-proxy-autoconfig".

We have also configured IE 6.0 to "Use automatic configuration script" which points to the URL location of our proxy.pac file.

At first our tests looked very good, but we have found an odd problem.  We can access the 3 sites listed in the proxy.pac file without any issues.  And if we try to access, for example, http://www.symantec.com, the browser comes back with our custom "sorry you can't get there from here" page.  So far so good, as this is what we would expect.

The problem is that if we try to go to a Web site's subdirectory, for example http://www.symantec.com/avcenter, the browser comes back with a "Page Cannot Be Found" 404 error.  This pattern is very consistent and happens with the subdirectories of every Web site not listed in the proxy.pac file (not just Symantec's).  We would like these subdirectory visits to also return our "sorry" page, which is much kinder than a general 404 error.

Can anyone see a problem in our proxy.pac file?  Or am I missing a step with this type of configuration?

Thank you in advance for your help,

Dave Bacon
Computer Network Manager
Outagamie Waupaca Library System
owlsnet

ASKER

After many hours of testing, I now have more information available on what I see is happening with this problem.

Using the proxy.pac file as described earlier, I mentioned that only domain-level URLs (like http://www.symantec.com) returned our desired PROXY web site of "alert.owls.lib.wi.us", while a domain-level URL with a subdirectory (like http://www.symantec.com/avcenter) caused a 404 error.

I have found that the issue is not just with a domain level URL address with a subdirectory.  The 404 error also happens when we access another HTML file on the main domain level URL (like http://www.symantec.com/testpage.htm - which is not a real web site.  It's just an example).

More importantly (and here is the real issue), I have found that the 404 errors are not happening because the sub-pages cannot be found at the requested domain, such as symantec.com.  Instead, the address "alert.owls.lib.wi.us" from the proxy.pac line ( return "PROXY alert.owls.lib.wi.us" ) is somehow replacing the domain of the requested URL while keeping the rest of the URL request string.  Here is an example:

Requested URL:  "http://www.symantec.com/avcenter"
Proxy modified URL:  "http://alert.owls.lib.wi.us/avcenter"

And of course we don't have an "avcenter" folder on our Web server which in turn causes the 404 error.

So the new question is: is there a way to modify this configuration so the returned proxy information ONLY goes to "alert.owls.lib.wi.us" and ignores any other information that is getting appended to the end of that proxy address?

Thank you again for any help you can offer.

Dave Bacon - OWLS
Netman66

Try placing a trailing period at the end of your return line:


return "PROXY alert.owls.lib.wi.us."

I will continue my testing in the meantime.


or

return "PROXY alert.owls.lib.wi.us/default.html"  <== or whatever your default page is.


owlsnet

ASKER

Thank you for your comments on this problem Netman66.  I've tried both without success.  Adding a trailing period did not change how it worked.  And specifying the default page (return "PROXY alert.owls.lib.wi.us/default.htm") made everything become a 404 error.  I did verify that "default.htm" is the correct file.

Do you (or anyone else) have any other suggestions?

Thanks again,

Dave Bacon -OWLS
Netman66

Interesting...

I'll keep digging.

Every reference I have looked at has the last line written like this:

return "PROXY 192.168.0.1:80"

...rather than site name.

I can't see why this would matter, but I have yet to see a sample file written using a site name instead of an IP address.

owlsnet

ASKER

Thanks again for your time on this difficult issue Netman66.  I modified the return PROXY line to use our IP Address and Port #80 and as expected there was no change in the result.

Do you know if the return PROXY line is designed to accept the additional URL data from the original URL request and pass the complete modified URL on to the PROXY?

For example:
Requested URL:  "http://www.symantec.com/avcenter"
Proxy modified URL:  "http://alert.owls.lib.wi.us/avcenter"

Dave Bacon - OWLS
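
One way to picture what was observed above: with a "PROXY" result, a browser normally connects to the proxy host and sends the complete original URL in the request line, on the assumption that the proxy will fetch it. The sketch below only illustrates that standard HTTP behaviour. `buildRequest` is a hypothetical helper, not a browser or IIS API, and the default port of 80 when none is given is an assumption.

```javascript
// Hypothetical helper: describes the request a browser would send for a URL,
// given the string returned by FindProxyForURL.
function buildRequest(url, pacResult) {
  const target = new URL(url);
  if (pacResult === "DIRECT") {
    // Direct: connect to the origin server and send only the path.
    return {
      connectTo: target.hostname + ":" + (target.port || "80"),
      requestLine: "GET " + target.pathname + target.search + " HTTP/1.1",
      hostHeader: target.host,
    };
  }
  // "PROXY host[:port]": connect to the proxy and send the full URL.
  const proxy = pacResult.replace(/^PROXY\s+/, "");
  return {
    // Assumption: port 80 is used when the PAC result names no port.
    connectTo: proxy.indexOf(":") === -1 ? proxy + ":80" : proxy,
    requestLine: "GET " + target.href + " HTTP/1.1",
    hostHeader: target.host,
  };
}

console.log(buildRequest("http://www.symantec.com/avcenter",
                         "PROXY alert.owls.lib.wi.us"));
```

If the machine named in the PROXY line is an ordinary web server rather than a real proxy, it would plausibly answer such a request from its own content using the path portion (/avcenter here). That would be consistent with the root URL showing the "sorry" page and everything else returning a 404.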
ASKER CERTIFIED SOLUTION
Netman66
owlsnet

ASKER

Thank you again, Netman66, for your help (and time) with this question.  I had already modified the 404 page before I posted this question here.  Yes, it works, but it is not the best solution.  If an allowed page goes down, then the modified 404 page may give the Library Patron the wrong idea.  We'll have to word the 404 page carefully.

Because of your time and willingness to help me with this, I am going to accept your last comment.  Please let me know if you have any other ideas.

Dave Bacon - OWLS