Cosmin Curticapean

Analyze Web Traffic generated by workstations

Hello all

I have to analyze the websites that employees are accessing and sort them so that I can block the ones that are not work related. My current network setup is a RedHat 8 Linux gateway with iptables and no proxy server. I have looked at snort, iptraf and argus; I have them installed and capturing traffic, but I'm having a hard time making use of the data they capture. Any help toward a quick way of getting this report done would be great.

Note that this monitoring would run for only a week or two, just long enough to capture the data needed for my report.

Regards,
Cosmin
pritamdutt

Hi,

You need to implement a web proxy such as Squid to act as a middleman in order to track the websites being visited by employees in your organization. Since you are already using a Linux gateway, implementing Squid would not be a difficult task. You will find a lot of information on the internet about setting up a transparent proxy, but here is the catch: "you can't use transparent proxy for SSL traffic", so any websites viewed over HTTPS will be missed.

So, assuming you have an Active Directory in place, with all machines joined to the domain, you can perform the following steps:

1. Configure a website on an IIS server as wpad.<your domain name> and host your organization-specific Proxy.pac and wpad.dat there.
2. Configure group policy to enable Automatic Configuration and to point to the automatic configuration script.
3. Disable transparent NAT / direct access to the internet.

This would force all traffic through your proxy and you would be able to prepare a report on Internet Access by IP Address.

The next question is whether you are using a DHCP server for IP addresses. If so, your report would not map correctly to the respective users. There is a different method to achieve that, but it only makes sense to discuss it if it applies to you.

Hope this helps!

Regards,
xterm

If you can't/won't push your users through a proxy, there is another quick and dirty way to at least get the IP addresses of the remote sites your users are visiting, but you'll then have to do some kind of post-processing to resolve those to names. Still, at least it's a no-brainer:

Just add the following somewhere in your iptables config (probably /etc/sysconfig/iptables):

-A OUTPUT -s 0.0.0.0/0 -p tcp -m tcp --dport 80 -j LOG --log-level 4 --log-prefix "http "
-A OUTPUT -s 0.0.0.0/0 -p tcp -m tcp --dport 443 -j LOG --log-level 4 --log-prefix "https "

If you use the log-level (4) I provided above, then you'll need an entry in /etc/syslog.conf to catch these:

# Log outbound http/https requests
kern.warning                                                                 /var/log/outbound_web.log

Restart both daemons (/etc/init.d/iptables restart; /etc/init.d/syslog restart)
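
One caveat when applying the rules above on a gateway: the OUTPUT chain only sees packets generated by the gateway itself, while connections from workstations behind it pass through the FORWARD chain. A minimal sketch of equivalent FORWARD rules, keeping the same log level and prefixes; the `-m state --state NEW` match is an added assumption, used here so that one line is logged per connection rather than one per packet:

```
-A FORWARD -p tcp -m tcp --dport 80 -m state --state NEW -j LOG --log-level 4 --log-prefix "http "
-A FORWARD -p tcp -m tcp --dport 443 -m state --state NEW -j LOG --log-level 4 --log-prefix "https "
```

These lines would need to sit before any rule that accepts the forwarded traffic, because LOG is non-terminating but an earlier ACCEPT would stop the packet from ever reaching them.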

Not ideal, but it's a whole lot easier than trying to inspect gigs of packet capture logs. If you want actual URLs, you're probably going to have to run tcpdump non-interactively and then use ethereal to parse the capture files for you.

Best of luck!


Cosmin Curticapean

ASKER

In reply to pritamdutt:

Squid is an interesting option, and if it works with SSL that's even better. I'm not sure about that yet and need to read up on it. In my case I need to implement a regular (non-transparent) proxy server; unfortunately the installed DC does not cover all PCs, and my Linux box provides the firewall and gateway services.
I also have a DHCP server, but that does not matter since I want a list of visited websites, not of the users who access them. In any case, DHCP is configured to always assign the same IP address to a given MAC address, so that mapping would be even easier.

In reply to xterm:

This looks like the easier and quicker solution at first glance, but the lines you gave for iptables do not work: nothing gets written to the outbound_web log file. After a lot of experimenting with iptables, I finally got this rule to write lines to the log file:

 iptables -t nat -A PREROUTING -p tcp --dport 80 -m state --state NEW -j LOG --log-level 4

Unfortunately the messages log file receives the same output, so the data gets duplicated. That is not much of an inconvenience, since this analysis will run for maybe a week.
My log file has lines like this:

Oct 21 11:46:07 HOSTNAME kernel: IN=eth1 OUT= MAC=x0:1x:x7:1x:x6:xa:x0:x4:x1:x6:8x:6x:0x:x0 SRC=192.16x.x.xxx DST=2xx.1xx.x6x.2 LEN=60 TOS=0x00 PREC=0x00

and I'm not sure that I can use the DST IP to trace back the web page address.
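
As a rough sketch of the post-processing step, the DST field of log lines like the one above can be counted and then reverse-resolved. The function names are hypothetical, and the second function assumes the `host` utility is installed on the gateway. Keep in mind that a reverse lookup frequently returns the hosting provider's name rather than the site the user typed, so this gives only an approximate picture:

```shell
#!/bin/sh
# Count connections per destination IP in an iptables LOG file.
# Usage: top_destinations /var/log/outbound_web.log
top_destinations() {
    # Extract the DST=a.b.c.d field, drop the "DST=" prefix,
    # then count duplicates and sort by count, highest first.
    grep -o 'DST=[0-9.]*' "$1" | cut -d= -f2 | sort | uniq -c | sort -rn
}

# Same report with a reverse DNS name appended (assumes `host` is available).
resolve_destinations() {
    top_destinations "$1" | while read -r count ip; do
        # Take the first PTR record, if any.
        name=$(host "$ip" 2>/dev/null | awk '/domain name pointer/ {print $NF; exit}')
        printf '%s\t%s\t%s\n' "$count" "$ip" "${name:-unresolved}"
    done
}
```

Running `resolve_destinations /var/log/outbound_web.log` would then print one line per destination with its hit count, IP address and resolved name (or `unresolved`).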

Regards,
Cosmin
ASKER CERTIFIED SOLUTION
pritamdutt

You can read more about Proxy AutoConfig @ http://en.wikipedia.org/wiki/Proxy_auto-config

Regards,
SOLUTION
Sorry for the delay, the seasonal flu got me for a few days. My progress so far:
- installed a new version of squid (an update, actually)
- installed sarg, an application that analyzes squid's log files and displays a report per user (per IP in my case), downloads and top sites. It helps me a lot in my quest!
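
For a quick command-line check alongside sarg, the same squid log can be summarized by host name. This is only a sketch: it assumes squid's default native access.log format (request URL in the seventh field), and the function name and log path are illustrative. With an explicitly configured proxy, HTTPS requests show up as `CONNECT host:443` lines, so those host names are counted too:

```shell
#!/bin/sh
# List requested host names by hit count from a squid access.log.
# Usage: top_sites /var/log/squid/access.log
top_sites() {
    awk '{
        url = $7
        sub("^[a-z]+://", "", url)   # strip the scheme, e.g. "http://"
        sub("[/:].*$", "", url)      # strip the path and/or port
        count[url]++
    }
    END { for (h in count) print count[h], h }' "$1" | sort -rn
}
```

Piping the result through `head -20` would give a short list of the most visited hosts to review.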

Next comes the transparent proxy (pritamdutt's tutorial), since squid is currently configured to listen on its default port, 3128. Updates this afternoon!

Regards and thanks,
Cosmin C.
In the end my working solution is as follows, bearing in mind that my DHCP server is on Linux, not on the (Microsoft) domain server.
1. Created a wpad.dat file with the following content on a web server that can be reached from anywhere (public IP):
 
function FindProxyForURL(url, host) { return "PROXY 192.168.0.1:3128; DIRECT"; }

2. Created a policy on the domain controller for automatic proxy detection; the settings appear in IE as follows:
[Screenshot: IE proxy settings]
I ran some tests on a workstation on the local LAN and can see that Internet traffic goes through my proxy server. If the workstation is disconnected from the LAN and accesses the Internet, the proxy server is bypassed.

Finally, I will have to set up the policies for Firefox separately, since it does not support GPO natively.

Regards,
Cosmin C.
Thank you and keep up the good work in supporting others!