projects

htaccess prevent logging 404 errors using SetEnvIf

I want to use something like this in an .htaccess file instead of messing with the httpd.conf

I have a file called some.log which, when generated, will be put in /somedir/somedir2/some.log.
When the file is not there and clients check for it, the server logs useless errors; the only clients that can even check are authenticated.

Therefore, I don't want the logs filling up with this error, because I am already aware of it. Yet I also want the clients to be able to retrieve the file when it is there.

PLEASE don't bother telling me how doing this is a bad idea or that I should fix my code. I'm aware of what I am doing :).

I want to prevent the error in my access.log and my ssl error log using .htaccess only. I have been trying all kinds of combinations and haven't found one that works so am seeking help.

# Don't log missing files
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(content\log)$ - [R=404,L]
Dr. Klahn

I do this type of thing using SetEnvIf in the vhost's httpd.conf but it can equally well be done in .htaccess.  See the Apache documentation for mod_setenvif.

Set an environment variable when the condition is matched.  In the example below the condition is matched when the User-Agent contains Yandex.  In this case you might need to match on REQUEST_URI as REQUEST_FILENAME may not be available for use by setenvif.

SetEnvIf User-Agent Yandex nonlog-request

In the vhost configuration file, conditionalize logging so that it occurs when the "nonlog-request" environment variable is not set.

CustomLog log/access_hostname locallog env=!nonlog-request
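
Applied to the path from the question, the same idea might look like the sketch below. The regex is an assumption based on the obfuscated path /somedir/somedir2/some.log. Note that SetEnvIf matches on the URL only and cannot test whether the file exists, so this would drop access-log entries for that URL whether or not the file is present.

```
# .htaccess at the site root (needs AllowOverride FileInfo)
SetEnvIf Request_URI "^/somedir/somedir2/some\.log$" nonlog-request

# vhost configuration (CustomLog is not permitted in .htaccess)
CustomLog log/access_hostname locallog env=!nonlog-request
```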

ASKER

I've read many docs and tried many things but so far nothing has worked.
I would like to do this using htaccess only and what I need is a solution specifically for;

/somedir/somedir2/some.log

When the .log file is not there and clients are checking for it, I want to prevent an error in both access.log and ssl_error.log.
For other experts, there is a previous question.

@projects:
I'm guessing SetEnvIfPlus did not work out for you.

Despite your request to the contrary, I feel obligated to again remind you that the proper approach here is not to ignore/white-wash the error, but to handle it properly in your code.  If an error condition is expected, then your app should anticipate it and handle it when it arises.

That said, I've found another approach you can use.  However, two caveats:
This was tested on Apache 2.4.
If you are generating enough traffic that this one 404 causes issues with log review, you need to be aware that this approach comes with a performance penalty.  Not *much* of one, but it could be significant on a high-traffic site.

# my virtual host config
<VirtualHost *:80>
  ServerName testing
  DocumentRoot c:/bb/www/testing
  <Directory "c:/bb/www/testing">
    AllowOverride All
  </Directory>

# Note that LogFormat, ErrorLog, and CustomLog all need to be 
# declared in your conf files, not .htaccess
  LogFormat "%h %l %u %t \"%r\" %>s %b" common
  ErrorLog c:/bb/www/testing/error.log
  
  # this logs everything but 404s
  CustomLog c:/bb/www/testing/custom.log common env=!LOG404

  # this logs only 404s
  CustomLog c:/bb/www/testing/404.log common env=LOG404
</VirtualHost>

# my .htaccess file
RewriteEngine On

# The alternate RewriteCond choice will generate a subrequest to 
# determine if the file exists.  You will need it only if you have other 
# internal rewrites working on each request.  It will also increase 
# the performance hit significantly.

# Use this line if you can...
RewriteCond %{REQUEST_FILENAME} !-f

# ...or this one if you have to 
# RewriteCond %{REQUEST_FILENAME} !-F

# the NS modifier is only necessary if you used the second RewriteCond, above
# (no space inside the flag list; the variable is given an explicit value)
RewriteRule .* - [E=LOG404:1,NS]

One more thing.  The CustomLog directives as shown above will apply to the entire site.  This shouldn't be an issue, since only the per-directory .htaccess file is setting the LOG404 environment variable.  If you do run into conflicts, narrow the RewriteRule pattern in the .htaccess file so that the variable is set only for the requests you want to exclude.  (CustomLog is valid only in server config and virtual host context, so it cannot be moved into a <Directory> container.)
Why tell me exactly what I said I didn't want to hear about? I said I already understand the ramifications and am not worried about them in this case.

I am pretty sure this can be done in a much simpler way than what you are suggesting. I can't imagine all this just to not log one directory. I have found countless examples on the net using only a couple of lines.
While this has not worked for me, many others say something similar has worked for them, so I'll keep looking for a simpler fix.
>>> Why tell me exactly what I said I didn't want to hear about?

Mostly because that's the right answer.  The solution I'm presenting here is a less-than-satisfactory alternate.  

>>> I can't imagine all this just to not log one directory.

I posted the entire config I used in order to be thorough.  The actual mechanism to ignore the 404 entries is controlled from the CustomLog directive (in your conf file), and the rewrites (in .htaccess).  It is roughly equivalent to the mod_setenvif solution initially posted on this question.  The difference being that mod_setenvif has no conditional for the existence of a file, and mod_rewrite does.
No, it is the wrong answer in this case. There are plenty of other questions on this site asking how to do this correctly. I am not asking that. I have a need to do something which is different and said that I clearly understand the ramifications. I also said that the connections are all authenticated.

So, the question remains... how can this be done?

It is a high-traffic site, and I cannot really modify the whole server or take a performance loss, though I understand what you are saying above.

This all sounds much too complicated. From what I've read, it is one or two directives in the virtualhost itself, then an .htaccess file. The problem is that all of the posts I've found are from people having problems and getting input from people who have this working. Following those examples, however, has not worked for me.

That is why I am paying for this site, so that perhaps someone can give me a nice simple expert solution.
On another site, someone says;

//
I found a solution that works for me. I added these lines to .htaccess:

# Don't log missing files
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(artifacts\.jar|content\.xml|content\.jar)$ - [R=404,L]

It uses mod_rewrite. R=404 says to send 404 Not Found status code when client accesses any of mentioned files.

RewriteCond is there to make sure that the file indeed doesn't exist. If I ever put one of those files into the directory, it will be served as usual.

This works great: mentioned files are no longer logged into error.log (that was my goal), but are still logged in access log (with 404 status).
\\

I tried this but am not able to get the right syntax.
In my case, the file and directory are;

/somedir/somedir2/some.log

which I can edit back to what they actually are later.
That solution does work, though the 404 errors are now mixed in with the successful requests.  Here's the .htaccess I used to test.  As before, this was tested on Apache 2.4.
RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* - [R=404,L]

That will act on any file not found in the directory.  To limit it to specific files, simply change the ".*" regex to something more suitable.  For example, to only act on missing *.png files, use "\.png$".

Also, as my previous notes indicated, the -f test for the RewriteCond directive may not catch all instances of missing files if you depend on other rewrites.  If you find the capture is incomplete, change it to the -F test (capital F) instead.  Remember that this change comes with a performance hit, though.
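
As a sketch of the narrowing described above, the .png variant could look like this. The second RewriteCond is the subrequest-based alternative and is left commented out; only one of the two should be active.

```
RewriteEngine On

# Answer with 404 only for requests ending in .png whose target file is missing
RewriteCond %{REQUEST_FILENAME} !-f
# Slower alternative that resolves the file through a subrequest:
# RewriteCond %{REQUEST_FILENAME} !-F
RewriteRule \.png$ - [R=404,L]
```
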
I should have mentioned that currently, the server I need this on is running 2.2.
I don't think there is any reason I could not upgrade, but I need to look into it.
There are a number of syntax differences between 2.2 and 2.4, and some modules used in 2.2 are not compatible with 2.4.  On the bright side, mod_rewrite did not change that much, and this particular usage should work just as well in 2.2.  There should be no requirement for you to upgrade to make this work.
Ok, so, for testing, I have added an .htaccess file in the directory where the clients are looking for a xxx.log file. Maybe I need to put this directive in the .htaccess at the start of the site?

The file name is always different but the extension is always the same.
The connections are using https so, I am seeing the errors in the ssl_error_log

//
RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* - [R=404,L]
\\

So to confirm, we've said we don't need anything in the httpd.conf file at this point, only an .htaccess in the correct directory where the log files are.

The 404 is still showing up in the ssl log.
Have you verified the rewrite is firing properly?  Try putting something in the .htaccess file that would have a visible effect on the request, like sending it to an existing text file (just for testing purposes).
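
A minimal sketch of such a test rule follows. The names rewrite-check and rewrite-ok.txt are made up for illustration, and the sketch assumes the directory is served directly from under the DocumentRoot (a RewriteBase may be needed otherwise).

```
RewriteEngine On

# Temporary test rule: requesting "rewrite-check" in this directory
# should return the contents of rewrite-ok.txt, which must exist here.
RewriteRule ^rewrite-check$ rewrite-ok.txt [L]
```
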
I know it's working right because I have .htaccess in use all over the server. It hosts multiple sites. I use it with wordpress and to lock out various areas, logging, etc.
I understand, but have you verified this particular .htaccess file is working?  Specifically, its ability to direct a rewrite.  Try turning on the rewrite log and checking how the rule is applied.
OK, logging is on. I first set it up for the whole server and could see rewrite working fine.
I then set it for the virtualhost I am working with and it is now logging fine.

I have the .htaccess file in the proper directory. Its contents are:

RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* - [R=404,L]

Logging is working; (snip all private stuff)

/subreq] (1) pass through /index.php
/subreq] (2) init rewrite engine with requested uri /index.html

The error is still showing up in the logs. I am now testing filename.bin
The filename is always different but the ext is .bin

What next?
The snippet you posted shows that a request for /index.php is passed through.  If you have an actual file called index.php in that directory, then that is a successful test.  However, there should have been much more to those log entries.  Make sure you are setting RewriteLogLevel 9 to get the most information during the debugging.

The next steps are:
Reproduce a "pass through" attempt with debug-level logging (e.g., another attempt at index.php)
Reproduce a 404 attempt with debug-level logging (e.g., filename.i-dont-exist.bin)

Be sure to post as much information from the logs as possible.
Actually, there is a file in that directory but it's index.html

I have changed it again, this time to filename.txt. I've changed it so that it's easier to notice in the logs.

I've changed the logging to level 9 but am still not seeing anything for attempts to filename.txt
Should I not change the .htaccess to specifically reflect filename.txt?

There are MANY connections hitting the server every second and there are only so many changes I can make to the server while it is running.

I think maybe the problem is because I have an .htaccess file above this directory, one level deep, in /root/directory.

Connections are first authenticated to be allowed into /root/directory, then allowed to check /root/directory/files/filename.txt for the text file.

The upper .htaccess contains;

AuthType Basic
AuthName "Restricted Area"
AuthUserFile /var/www/virthts/domain/.htpasswd
require valid-user

#RewriteCond %{REQUEST_FILENAME} !-f
#RewriteRule ^(content\bin)$ - [R=404,L]

Does this mean I should be putting the rewrite items in here, like above?
The .htaccess files will cascade, with the upper-level being applied before lower-level.  Is the RewriteRule you showed in your upper-level file active?  It looks like it is commented in your post, which means it should not be applied at all.

The RewriteLog and RewriteLogLevel directives need to be placed in your virtual host definition.  The server will need to be restarted in order to engage them properly.  The .htaccess files, on the other hand, can be changed live.  However, if you introduce a syntax error into the file, the server will generate a 50x response when it attempts to parse it.  So, we can change the .htaccess files for live testing, with some care.
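
For reference, a sketch of where those directives sit on Apache 2.2. The log path is a placeholder, not taken from the asker's setup. Apache 2.4 removed both directives and writes rewrite tracing to the ErrorLog via LogLevel instead.

```
<VirtualHost *:443>
    # ... existing directives for this virtual host ...

    # Apache 2.2: dedicated rewrite log, scoped to this virtual host only
    RewriteLog "/path/to/rewrite.log"
    RewriteLogLevel 9

    # Apache 2.4 equivalent (entries go to the ErrorLog):
    # LogLevel warn rewrite:trace8
</VirtualHost>
```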

I'm more concerned about the lack of logging.  At level 9, you should be seeing tons of entries for just about everything.  I would start by clearing the file of any existing entries (`echo . > /path/to/logfile` works well for this), then making sure it is generating new entries.  If not, check the permissions on the file and the directory.
>The RewriteLog and RewriteLogLevel directives need to be placed in your virtual host definition.

They are, but as part of the virtual host directives for this vhost; otherwise, the logging was for the entire server, which instantly created huge log files that didn't help me in testing this.

>Is the RewriteRule you showed in your upper-level file active?  It looks like it is commented in your
>post, which means it should not be applied at all.

While testing, I enabled that in httpd.conf this time instead of in .htaccess so yes, it is in use.

The way it works is like this.

There is one virtualhost.
It has three sub directories.
Each subdirectory has its own .htaccess requirements.

/subdir1/.htaccess allows only authenticated connections using .htpasswd

/subdir2/.htaccess allows only authenticated connections using .htpasswd but has additional rules in order for the app in that subdir to work.

/subdir3/.htaccess is another subdirectory structure which is being tested.

All three have different uses and needs so have their own .htaccess and directives.

One problem with this method is, as I understand it, there is a performance cost because I am using .htaccess files in three (or more) subdirectories whereas everything should really be in just one file at the root of the vhost in this case. This is what I would prefer but that would add complexity to the file.

This is likely why we are not seeing all of the traffic even at level 9?
There IS a lot more logging than I posted but I also don't want to post all this stuff publicly.

x.x.xx.x - xxxxxx [14/Dec/2014:08:32:38 --0700] [xxx.com/sid#7fece1fb6e80][rid#7fece23ac278/initial] (1) [perdir /var/www/vhosts/xxxx/html/subdir1/] pass through /var/www/vhosts/xxxx/html/subdir1/app.php

x.x.x.x - xxxxxx [14/Dec/2014:08:32:39 --0700] [xxxx.com/sid#7fece1fb6e80][rid#7fece25d6dc8/initial] (3) [perdir /var/www/vhosts/xxxx/html/subdir1/] strip per-dir prefix: /var/www/vhosts/xxxx/html/subdir1/files/somefile.log -> files/somefile.log

x.x.x.x - xxxxxx [14/Dec/2014:08:32:39 --0700] [xxxx.com/sid#7fece1fb6e80][rid#7fece25d6dc8/initial] (3) [perdir /var/www/vhosts/xxxx/html/subdir1/] applying pattern '^(content\\bin)$' to uri 'files/somefile.log'

x.x.x.x - xxxxxx [14/Dec/2014:08:32:39 --0700] [xxxx.com/sid#7fece1fb6e80][rid#7fece25d6dc8/initial] (1) [perdir /var/www/vhosts/xxxx/html/subdir1/] pass through /var/www/vhosts/xxxx/html/receiver/files/somefile.log

It is that somefile.log for which I am trying to prevent 404 errors when it is missing. While the code could be changed, I am more interested in knowing how such an error could be kept out of the logs.
When the file doesn't exist, I don't want to have an error logged, but when it does exist, I'd like the client to be able to get it.

One thing is that the client connection knows what file name it is looking for.
For example, the client might know that the file name will be 1.2.3.4.log.
The next client might know that the file it is looking for is 2.3.4.5.log.
I see one potential problem.  With .htaccess files, the directory passed to mod_rewrite is modified to strip the base directory.  For example, if I'm requesting /dir1/subdir1/subsubdir2/file.txt, here's how mod_rewrite sees things:
# In the root directory
# %{REQUEST_URI} should always be the full path, so this will match
RewriteCond %{REQUEST_URI} /dir1/subdir1/subsubdir2/file\.txt [NC]
# in a VirtualHost, this would match
# in an .htaccess file, the initial '/' is stripped, which would cause this to *not* match
RewriteRule ^/dir1/subdir1/subsubdir2/file\.txt$ somewhere.txt [NC,L]
# in an .htaccess file, this should match
RewriteRule ^dir1/subdir1/subsubdir2/file\.txt$ somewhere.txt [NC,L]

# in /dir1/.htaccess
# the base is stripped (/dir1/), so...
# this will not match
RewriteRule ^/dir1/subdir1/subsubdir2/file\.txt$ somewhere.txt [NC,L]
# this should match
RewriteRule ^subdir1/subsubdir2/file\.txt$ somewhere.txt [NC,L]

These rules are demonstrated in the second line of the log you posted.  You can see how mod_rewrite stripped the base from the URL to be tested.  The stripped URL is then matched to the pattern (3rd line), which is not found to match, and the request is passed through without being rewritten.

You'll need to correct your RewriteRule and RewriteCond directives to match your environment.  
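
Going by the posted log, where the per-dir prefix for subdir1 is stripped and the pattern is tested against 'files/somefile.log', a corrected rule for /subdir1/.htaccess might look like the sketch below. The pattern is an assumption built from the obfuscated paths in that log.

```
# /subdir1/.htaccess
RewriteEngine On

# The pattern sees "files/somefile.log" (no leading slash, no subdir1/ prefix)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^files/[^/]+\.log$ - [R=404,L]
```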

Regarding how .htaccess files are applied, you can see the docs here: http://httpd.apache.org/docs/current/howto/htaccess.html#how.  Basically, only files in the currently requested tree are processed.  So, if a client requests /dir1/subdir1, .htaccess files in the root, /dir1, and /dir1/subdir1 will be processed, but files in /dir2 or /dir1/subdir2 will not.
So, what it sounds like you are saying is that I need one single .htaccess file at the root of the site, with rules for the different requirements for the different directories.
Not at all.  You can have .htaccess files in every subdirectory, if you want.  In fact, that's how the system was designed.   You just have to be aware of the context of each file, and how the relevant datapoints will appear.  While I posted a single code example, it was describing multiple files in different directories.
OK, then there should be no problem, since there is one in each subdirectory:

/subdir1/.htaccess
/subdir2/.htaccess
/subdir3/.htaccess
Correct.  They will not conflict with each other.  However, each of them could conflict with /.htaccess, if it exists.
Yes, it's why I pointed out that there is no .htaccess at the root.

What next for testing?
The next step is to modify your rules in the context of the last code example I posted.  Once you have them, test a single attempt to trigger the rewrites, and post the rewrite log created for that attempt.
Do you mean the above; ID: 40501951

I have no idea how to apply this. Using txt for example, the file name changes but the extension would always be .log. The examples you posted all seem to be for a fixed file name.

In this case, it is a virtualhost.
The directories as mentioned would be;

(root)/subdirx/.htaccess ...such as follows;

/subdir1/.htaccess
/subdir2/.htaccess
/subdir3/.htaccess

So in my case, for example, let's say the log files are in /subdir2.
This is where I need that .htaccess to control the logging for the requests for the log files in

/subdir2/logdirectory

The .htaccess file in /subdir2 would be referencing the *.log files in the above subdirectory.
Follow the examples to build what you need:
# This example would be for /subdir2/.htaccess
# It matches any file with a .log extension
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule \.log$ - [R=404,L]

Confusing, but it doesn't work.

Now, to be sure...

# This example would be for /subdir2/.htaccess

The log files are in /subdir2/logdirectory

Do I need to point to that directory or is this supposed to work for ANY *.log file from /subdir2 on?

RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule \.log$ - [R=404,L]
In theory, it should work if the filename ends in ".log", which should apply to the directory containing the .htaccess file and all its subdirectories.  If it is not being applied, I will need to see the log entries generated by mod_rewrite to troubleshoot.
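
If the rule should apply only to the log directory rather than to every *.log file below /subdir2, the pattern can name that directory. This is a sketch that assumes the layout described above (/subdir2/logdirectory), with the path written relative to the directory holding the .htaccess file.

```
# /subdir2/.htaccess
RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
# Relative to /subdir2/, so no leading slash
RewriteRule ^logdirectory/[^/]+\.log$ - [R=404,L]
```
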
Alright, here is some logging, edited for brevity and privacy;

(3) [perdir /var/www/vhosts/domain/html/myapp/] strip per-dir prefix: /var/www/vhosts/domain/html/myapp/myapp2.php -> myapp2.php
(3) [perdir /var/www/vhosts/domain/html/myapp/] applying pattern '\\.log$' to uri 'myapp2.php'
(1) [perdir /var/www/vhosts/domain/html/myapp/] pass through /var/www/vhosts/domain/html/myapp/myapp2.php
(3) [perdir /var/www/vhosts/domain/html/myapp/] strip per-dir prefix: /var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log -> logfiles/xxxxxx.log
(3) [perdir /var/www/vhosts/domain/html/myapp/] applying pattern '\\.log$' to uri 'logfiles/xxxxxx.log'
(4) [perdir /var/www/vhosts/domain/html/myapp/] RewriteCond: input='/var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log' pattern='!-f' => matched
(2) [perdir /var/www/vhosts/domain/html/myapp/] forcing responsecode 404 for /var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log
(3) [perdir /var/www/vhosts/domain/html/myapp/] strip per-dir prefix: /var/www/vhosts/domain/html/myapp/myapp2.php -> myapp2.php
(3) [perdir /var/www/vhosts/domain/html/myapp/] applying pattern '\\.log$' to uri 'myapp2.php'
(1) [perdir /var/www/vhosts/domain/html/myapp/] pass through /var/www/vhosts/domain/html/myapp/myapp2.php
(3) [perdir /var/www/vhosts/domain/html/myapp/] strip per-dir prefix: /var/www/vhosts/domain/html/myapp/myapp2.php -> myapp2.php
(3) [perdir /var/www/vhosts/domain/html/myapp/] strip per-dir prefix: /var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log -> logfiles/xxxxxx.log
(3) [perdir /var/www/vhosts/domain/html/myapp/] applying pattern '\\.log$' to uri 'logfiles/xxxxxx.log'
(4) [perdir /var/www/vhosts/domain/html/myapp/] RewriteCond: input='/var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log' pattern='!-f' => matched
(2) [perdir /var/www/vhosts/domain/html/myapp/] forcing responsecode 404 for /var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log

I notice;

applying pattern '\\.log$' to uri
Pay attention to this log excerpt:
(3) [perdir /var/www/vhosts/domain/html/myapp/] strip per-dir prefix: /var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log -> logfiles/xxxxxx.log
(3) [perdir /var/www/vhosts/domain/html/myapp/] applying pattern '\\.log$' to uri 'logfiles/xxxxxx.log'
(4) [perdir /var/www/vhosts/domain/html/myapp/] RewriteCond: input='/var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log' pattern='!-f' => matched
(2) [perdir /var/www/vhosts/domain/html/myapp/] forcing responsecode 404 for /var/www/vhosts/domain/html/myapp/logfiles/xxxxxx.log

The first line shows the filename's base directory being stripped, as expected.  The URL used for rewrite comparisons will be "logfiles/xxxxxx.log".  Next, it is matched to the pattern of the RewriteRule.  If this had not matched, you would have seen a pass through on the next line, as you see with "myapp2.php".  Next is the RewriteCond comparison, which checks that the file does not exist on disk (!-f), and it is found to match.  Finally, the rewrite executes, which forces a 404 response code.

So, the rewrite is working as intended.  Now on to the logs.  Please post your ErrorLog, CustomLog, and LogFormat directives applicable to this virtual host.
I only allow https connections to this domain.
Possible typos related to obfuscating.

<VirtualHost *:443>
            DocumentRoot "/var/www/virtualhosts/domain/html"
            ServerName domain.com:443
            SSLEngine on
            SSLProtocol ALL -SSLv2 -SSLv3
            SSLCertificateFile /etc/pki/tls/certs/domain.crt
            SSLCertificateKeyFile /etc/pki/tls/private/domain.key
#            LogLevel warn
            ErrorLog /var/logs/domain/ssl_error_log
            TransferLog /var/logs/domain/ssl_access_log
#    RewriteLog "/var/logs/domain/rewrite.log"
#    RewriteLogLevel 9
</VirtualHost>
I believe we've been chasing a red herring.  My tests on 2.4 were without an SSL host, so of course all my requests went to the non-ssl log file.  I tested again this evening, and all 404's, regardless of the rewrites, will be recorded there.  In an SSL host, that's the SSL access log - exactly what you're seeing.

I'm afraid we're back to #40491672.  You'll need to use an environment variable.  Set it in the .htaccess if the file does not exist.  Edit your CustomLog directive to log only when the variable is not set.  Optionally, add another to log to a different file when the variable is set.
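
Put together with the virtual host posted above, those steps might look like the following sketch. It assumes the TransferLog line is replaced by CustomLog (TransferLog has no env= condition), reuses the LogFormat from the earlier example, and uses a hypothetical path for the optional second log. ErrorLog has no env= condition either, so this affects only the access log.

```
# .htaccess in the directory that serves the .log files
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
# Flag the request when the requested .log file is missing
RewriteRule \.log$ - [E=LOG404:1]

# Inside <VirtualHost *:443>, in place of the TransferLog line
LogFormat "%h %l %u %t \"%r\" %>s %b" common
# Log everything except flagged requests
CustomLog /var/logs/domain/ssl_access_log common env=!LOG404
# Optional: send flagged requests to their own file (hypothetical path)
CustomLog /var/logs/domain/ssl_404_log common env=LOG404
```
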
I can change the error logging for this domain, that is not an issue but I am not quite sure what you are suggesting I do next ;)
ASKER CERTIFIED SOLUTION
Steve Bink