I am writing this from an Ubuntu (and likely Debian) perspective, but most of it translates to RPM-based distributions. This deployment is tested and running on Ubuntu 12.04 with 2 cores and 3 GB of RAM. Disk space requirements vary with how many servers ship logs to it; Graylog2 keeps the last 60 days of data and purges every 30 minutes by default, but you can adjust the storage settings if you have other requirements.
The first thing I would like to share about this setup is that I start from the binaries or sources for everything except MongoDB; there are PPAs with the other parts in them, but I had nothing but problems with those.
The second thing is that this is not a complete copy/paste article; I will lead you, but I assume you know your system well enough to complete some of the steps I provide on your own.
--The good bits--
So you have all these verbose, helpful logs on all of your servers, but you are getting tired of logging into each server to trace traffic through your stack.
An easy solution is to ship your logs over to another server, so you only have to log into one server to look at them. But now you are using up valuable system resources syncing them on a schedule, and you will find yourself waiting for the newest ones to show up.
You are close to a working solution... but now you need a notification when a condition occurs in those logs. There are a lot of options out there for centralizing and parsing your log files then sending out notifications; the one everyone knows is Splunk. I have used Splunk; it took a while to learn the query language but once I did it worked fine for a while... then we went over the limit for the free version and were looking at putting out thousands of dollars to keep using it with the log traffic we generate.
Enter Graylog2. It is a Java app and a Rails webapp that uses a NoSQL backend (MongoDB) and Elasticsearch for indexing. It seems like a complex setup, and it is a lot more complex than running a single installer and being done with it, but the queries are easier, the interface is easy to use, notifications are easy to set up and it can easily be automated with Puppet or Chef... and best of all the logo is a gorilla with a party hat and balloon!
--The actual setup--
Step one: Prepare your locations and install Java.
For the purposes of this article I will store sources in /usr/share/src and will install to /var/lib/<component>.
You can go ahead and create your src directory now.
You should also create /var/lib/graylog2, which the later steps copy into.
Install the Java of your choosing. This setup works with openjdk-jre just as well as with sun-jre.
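Here is a minimal sketch of the directory preparation described above. The helper name `make_layout` and its optional prefix argument are my own illustration (the prefix only exists so you can try it in a scratch directory first); the paths are the ones this article uses, with /var/lib/graylog2 being the directory the later copy steps target.

```shell
# make_layout is a hypothetical helper, not part of any package.
# With no argument it creates the real paths (run it as root);
# with a prefix it builds the same tree under that directory.
make_layout() {
    prefix="${1:-}"
    mkdir -p "$prefix/usr/share/src" \
             "$prefix/var/lib/graylog2" \
             "$prefix/var/lib/elasticsearch"
}

# Dry run in a scratch directory to see what would be created.
scratch=$(mktemp -d)
make_layout "$scratch"
ls "$scratch/var/lib"
```

Once the dry run looks right, calling `make_layout` with no argument as root creates the real directories.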
Step two: MongoDB for Storage.
Graylog2 writes everything to a MongoDB instance. Mongo is really easy to set up and configure, and it can be done in minutes.
First you need to add the 10gen key:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10
Add the following line to your /etc/apt/sources.list:
deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen
Then refresh the package index and install:
apt-get update && apt-get install mongodb-10gen
This install adds an Upstart script to /etc/init/ and a mongodb.conf to /etc, and the data will be stored in /var/lib/mongodb.
Go ahead and start MongoDB now and let's make sure it works. To open a mongo console just type "mongo" and you will be in. Now let's create the Graylog2 database:
at the prompt:
This adds a user graylog2 with a password "password"; you can check it by running "show users".
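If you would rather script the user creation than type it at the prompt, something along these lines should do it on a MongoDB 2.x install. This is a sketch under assumptions: `db.addUser` is the 2.x-era shell helper (newer releases use `db.createUser` instead), and the script path in `USER_JS` is just a name I picked.

```shell
# Write the two statements to a script file. USER_JS is a hypothetical path.
# db.addUser is the MongoDB 2.x shell helper; newer releases replaced it
# with db.createUser.
USER_JS="${USER_JS:-/tmp/graylog2-user.js}"
cat > "$USER_JS" <<'EOF'
db = db.getSiblingDB("graylog2");
db.addUser("graylog2", "password");
EOF

# Run the script only if the mongo client is installed on this machine.
if command -v mongo >/dev/null 2>&1; then
    mongo "$USER_JS"
else
    echo "mongo client not found; script left in $USER_JS"
fi
```

Pick a real password before you use this on anything that matters.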
Step three: Elasticsearch for indexing.
Graylog2 uses an Elasticsearch instance to store its indexes. This allows for faster grokking of the (potentially) massive amounts of data you will be storing. Elasticsearch is quite easy to get set up.
We will grab two packages for this install: the Elasticsearch main package and the Service package, which is a nice Service wrapper for Java. Download the main package tar.gz from: http://www.elasticsearch.org/download/2012/05/21/0.19.4.html
Download the service wrapper from github here: https://github.com/elasticsearch/elasticsearch-servicewrapper/downloads
Extract the main package to /var/lib/elasticsearch and then extract the service package to /var/lib/elasticsearch/bin/service.
In my deployment I wanted to use a custom config for the cluster in the future, so I added the following to elasticsearch.conf in the service directory:
Then I customized /var/lib/elasticsearch/config/elasticsearch.yml per the documentation in the config file (these are the bare minimum, and depending on your required scale this may be just the beginning).
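To give an idea of what a bare-minimum elasticsearch.yml can look like, here is a sketch that writes one to a scratch file for review. The setting names are standard Elasticsearch ones; every value (cluster name, node name, paths) and the `ES_YML` location are assumptions of mine, not the values from my deployment.

```shell
# Write a candidate config to a scratch file so it can be reviewed before
# it replaces the real one. ES_YML is a hypothetical path.
ES_YML="${ES_YML:-/tmp/elasticsearch.yml.sketch}"
cat > "$ES_YML" <<'EOF'
# Cluster name: assumed value, pick your own.
cluster.name: graylog2
# Node name: hypothetical.
node.name: "graylog2-es-1"
# Where index data and logs live (assumed paths under the install dir).
path.data: /var/lib/elasticsearch/data
path.logs: /var/lib/elasticsearch/logs
EOF

cat "$ES_YML"
```

When you are happy with it, copy the file over /var/lib/elasticsearch/config/elasticsearch.yml.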
Now you need to install the init scripts. The service package we downloaded has a mechanism for this. Run:
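A sketch of that step, assuming the wrapper was extracted where this article puts it. As far as I know the wrapper script takes `install` as one of its subcommands; run it without arguments to see its usage text if yours behaves differently. `ES_HOME` and `WRAPPER` are just variable names I chose.

```shell
# Paths follow this article's layout; adjust ES_HOME if yours differs.
ES_HOME="${ES_HOME:-/var/lib/elasticsearch}"
WRAPPER="$ES_HOME/bin/service/elasticsearch"

if [ -x "$WRAPPER" ]; then
    # Registers the init script; needs root.
    "$WRAPPER" install
else
    echo "wrapper script not found at $WRAPPER"
fi
```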
Now start up Elasticsearch and we will see that it works. Run the following to check that it is working (if you don't have links or prefer to use a browser, you can open http://<server_IP>:9200 in your favourite browser):
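If you have curl around, a check along these lines works too. It is a small sketch: `ES_HOST` is a variable I made up so the same snippet can be pointed at another machine, and port 9200 is the one mentioned above.

```shell
# ES_HOST defaults to the local machine; set it to <server_IP> to check remotely.
ES_HOST="${ES_HOST:-localhost}"
ES_URL="http://${ES_HOST}:9200/"

# -s hides the progress meter; --max-time gives up after 5 seconds.
curl -s --max-time 5 "$ES_URL" || echo "no answer from $ES_URL"
```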
You should get a JSON response with the cluster name, build version and a status. As long as you see this response, Elasticsearch is working. You may need to increase the number of file descriptors if your server still has the default limit; increasing it to 64k is safe.
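To see where you stand on file descriptors, a quick check like this one helps. It only reports the limit of the shell you run it in; the limits.conf lines in the comment are one common way to raise it, and `esuser` is a placeholder for whichever account runs Elasticsearch on your box.

```shell
# 64k, the value suggested above.
wanted=65536
current=$(ulimit -n)
echo "open-file limit for this shell: $current"

if [ "$current" != "unlimited" ] && [ "$current" -lt "$wanted" ]; then
    echo "below $wanted; consider raising it"
fi

# One common place to raise it persistently is /etc/security/limits.conf, e.g.
#   esuser  soft  nofile  65536
#   esuser  hard  nofile  65536
# "esuser" is a placeholder, not an account this article creates.
```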
Step four: Graylog2 server.
Graylog2 server is a Java app that listens on a syslog port and a GELF port for incoming messages, then writes them to MongoDB and updates the indexes in Elasticsearch.
Grab the Graylog2 server tar.gz (note the -O - so that wget writes the download to stdout for tar to read):
cd /usr/share/src && wget -O - http://cloud.github.com/downloads/Graylog2/graylog2-server/graylog2-server-0.9.6.tar.gz | tar zxv
cp -R graylog2-server-0.9.6/ /var/lib/graylog2/server/
Now you need to link the conf file (or copy it if you want) into /etc:
cd /etc/ && ln -s /var/lib/graylog2/server/graylog2.conf.example graylog2.conf
Now you need to modify the graylog2.conf to reflect the MongoDB user account and password you created earlier and make sure that the Elasticsearch section references the correct port etc.
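A quick way to review just the lines that matter is to filter the config by prefix. This is a sketch: `show_backend_settings` is a hypothetical helper, and the `mongodb_` and `elasticsearch_` prefixes (as well as the sample keys in the demo) are my assumption about how the 0.9.x example config names its settings, so check them against your own file.

```shell
# Hypothetical helper: print the MongoDB and Elasticsearch settings from a
# Graylog2 server config. The key prefixes are an assumption.
show_backend_settings() {
    conf="${1:-/etc/graylog2.conf}"
    grep -E '^(mongodb_|elasticsearch_)' "$conf"
}

# Demo against a throwaway sample file with made-up contents.
sample=$(mktemp)
printf '%s\n' \
    '# sample only' \
    'mongodb_user = graylog2' \
    'mongodb_password = password' \
    'syslog_listen_port = 514' > "$sample"
show_backend_settings "$sample"
```

Run `show_backend_settings` with no argument to look at the real /etc/graylog2.conf.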
Next you can create an Upstart script or an init script. I used marzocchi's init script, found here: https://gist.github.com/1659948, but you can create your own if you want.
Start up Graylog2 server and make sure that it stays running (this can be one of those silently failing times; wait a couple of seconds and do a ps to make sure Graylog2 server is still running). If Graylog2 fails to stay running, check the log file in /var/log/graylog2.log. A number of failures occur because Graylog2 can't talk to MongoDB or to Elasticsearch properly; check your configs.
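The wait-then-check routine can be wrapped up like this. It is only a sketch: `stays_up` is a helper name I made up, it uses pgrep rather than reading ps output by eye, and the pattern "graylog2-server" is an assumption about how the process shows up on your system.

```shell
# Hypothetical helper: wait a few seconds, then succeed only if a process
# whose command line matches the pattern is still around.
stays_up() {
    pattern="$1"
    wait_secs="${2:-5}"
    sleep "$wait_secs"
    pgrep -f "$pattern" >/dev/null
}

if stays_up "graylog2-server" 5; then
    echo "graylog2-server is still running"
else
    echo "graylog2-server is gone; last log lines:"
    tail -n 20 /var/log/graylog2.log 2>/dev/null
fi
```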
Step five: Graylog2 web interface.
The web interface for Graylog2 is a Rails app that comes with a WEBrick setup. WEBrick is a Ruby library meant for development and testing purposes, so if this is meant to be a production server you really want to host it on Apache with Passenger or on something else of your choosing.
First grab the tar.gz from github:
cd /usr/share/src && wget -O - http://cloud.github.com/downloads/Graylog2/graylog2-web-interface/graylog2-web-interface-0.9.6.tar.gz | tar zxv
cp -R graylog2-web-interface-0.9.6/ /var/lib/graylog2/web/
Install Apache and Passenger:
apt-get install apache2 libapache2-mod-passenger
Now create a site for the web interface (/etc/apache2/sites-available/graylog2) that looks like this:
Allow from all
SetEnv MONGOID_HOST localhost
SetEnv MONGOID_PORT 27017
SetEnv MONGOID_USERNAME graylog2
SetEnv MONGOID_DATABASE graylog2
CustomLog /var/log/apache2/graylog2_access.log combined
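Those directives live inside a VirtualHost block. Here is a fuller sketch that writes a candidate site file to a scratch location for review. It rests on a few assumptions of mine: Passenger serving the app from its public/ directory under /var/lib/graylog2/web, a `MONGOID_PASSWORD` variable following the same naming pattern as the others, and `SITE_FILE` as a made-up scratch path.

```shell
# Write a candidate site definition to a scratch file. SITE_FILE is a
# hypothetical path; copy the result to sites-available once reviewed.
SITE_FILE="${SITE_FILE:-/tmp/graylog2.site.sketch}"
cat > "$SITE_FILE" <<'EOF'
<VirtualHost *:80>
    # Passenger serves a Rails app from its public/ directory (assumed path).
    DocumentRoot /var/lib/graylog2/web/public

    <Directory /var/lib/graylog2/web/public>
        Allow from all
        Options -MultiViews
    </Directory>

    SetEnv MONGOID_HOST localhost
    SetEnv MONGOID_PORT 27017
    SetEnv MONGOID_USERNAME graylog2
    # Assumed: the password variable follows the same naming pattern.
    SetEnv MONGOID_PASSWORD password
    SetEnv MONGOID_DATABASE graylog2

    CustomLog /var/log/apache2/graylog2_access.log combined
</VirtualHost>
EOF

cat "$SITE_FILE"
```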
To install the web interface you need to have Rubygems and bundler installed:
apt-get install rubygems && gem install bundler --no-ri --no-rdoc
Once these are installed navigate into /var/lib/graylog2/web
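From there, installing the gems the app depends on would look roughly like this. It is a sketch: `WEB_DIR` is a variable I introduced, and the Gemfile check is only there so the snippet fails gently if the copy step put the files somewhere else.

```shell
# WEB_DIR follows this article's layout; adjust if you copied elsewhere.
WEB_DIR="${WEB_DIR:-/var/lib/graylog2/web}"

if [ -f "$WEB_DIR/Gemfile" ]; then
    # Installs the gems listed in the app's Gemfile.
    (cd "$WEB_DIR" && bundle install)
else
    echo "no Gemfile in $WEB_DIR; check the copy step"
fi
```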
Now you will want to make sure that the config/mongoid.yml and config/indexer.yml files have the proper values for your configuration.
Enable your graylog2 site and disable the default:
a2ensite graylog2 && a2dissite 000-default
Restart Apache and check it out. Graylog2 web runs on port 80 now, so just hit your server in a web browser. You will be prompted to create your first user and will then have a working and accessible Graylog2 install.
Step six: Get some logs to Graylog
You can ship your logs to Graylog2 in a variety of ways. You can ship using rsyslog or syslog-ng, or via AMQP or even using logstash as a middleware for filtering the logs that make it to your server. Just to see the data and bask in the glory of a working install let's ship syslog from one of our other servers using rsyslog.
Make sure that you have rsyslog installed, and then add the following line to /etc/rsyslog.d/50-default.conf:
*.* @<graylog server IP address>
Restart rsyslog and wait patiently for your logs to show up. It won't take long until you have data to mine.
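In rsyslog's forwarding syntax a single @ means UDP and a double @@ means TCP, and you can append :port to the host. The sketch below just builds that rule so you can paste it; `forward_rule` is a hypothetical helper, 192.0.2.10 is a documentation address, and the default of 514 assumes your Graylog2 syslog listener is on the standard syslog port.

```shell
# Hypothetical helper: print an rsyslog rule that forwards everything
# to the given host over UDP (single @). The port defaults to 514.
forward_rule() {
    host="$1"
    port="${2:-514}"
    printf '*.* @%s:%s\n' "$host" "$port"
}

forward_rule 192.0.2.10

# After restarting rsyslog, push a test line through the local syslog:
# logger -t graylog2-test "hello from $(hostname)"
```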