--Preamble--
I am writing this from an Ubuntu (and likely Debian) perspective, but most of it translates to RPM-based distributions. This deployment is tested and running on Ubuntu 12.04 with 2 cores and 3 GB of RAM. Disk space requirements vary based on how many servers you have shipping logs to it; Graylog2 keeps the last 60 days of data and purges every 30 minutes by default, but you can adjust the storage settings if you have other requirements.
The first thing I would like to share about this setup: I start from the binaries or sources for everything except MongoDB. There are PPAs with the other components in them, but I had nothing but problems with those.
The second thing is that this is not a complete copy/paste article; I will lead you, but I assume you know your system well enough to actually complete some of the steps I provide.
--The good bits--
So you have all these verbose, helpful logs on all of your servers, but you are getting tired of logging into each server to trace traffic through your stack.
An easy solution is to ship your logs over to another server; then you only have to log into one server to look at them. But now you are using up valuable system resources syncing them on a schedule, and you will find yourself waiting for the newest entries to show up.
You are close to a working solution... but now you need a notification when a condition occurs in those logs. There are a lot of options out there for centralizing and parsing your log files then sending out notifications; the one everyone knows is Splunk. I have used Splunk; it took a while to learn the query language but once I did it worked fine for a while... then we went over the limit for the free version and were looking at putting out thousands of dollars to keep using it with the log traffic we generate.
Enter Graylog2. It is a Java app plus a Rails webapp, using MongoDB as a NoSQL backend and Elasticsearch for indexing. It seems like a complex setup, and it is a lot more complex than running a single installer and being done with it, but the queries are easier, the interface is easy to use, notifications are easy to set up, and it can easily be automated with Puppet or Chef... and best of all, the logo is a gorilla with a party hat and balloon!
--The Actual set up--
Step one: Prepare your locations and install Java.
For the purposes of this article I will store sources in /usr/share/src and install each component to /var/lib/<component>. You can go ahead and create your src directory now.
You should create the /var/lib/graylog directory too.
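Both of those amount to a couple of mkdir calls:

```shell
# Source tarballs live in /usr/share/src; components install under /var/lib/<component>
sudo mkdir -p /usr/share/src /var/lib/graylog
```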
Install the Java of your choosing; this setup works with openjdk-jre just as well as with the Sun JRE.
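On Ubuntu 12.04 that can be as simple as the following (the package name here is the OpenJDK 6 runtime; substitute openjdk-7-jre or the Sun packages if you prefer):

```shell
sudo apt-get update && sudo apt-get install openjdk-6-jre
```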
Step two: MongoDB for Storage.
Graylog2 writes everything to a MongoDB instance. Mongo is really easy to set up and configure, and can be done in minutes.
First you need to add the 10gen keys.
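Something along these lines; the key ID and repository URL are from 10gen's documentation of the era, so verify them against the current MongoDB install docs before use:

```shell
# Import the 10gen GPG key and add their apt repository, then install
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | \
    sudo tee /etc/apt/sources.list.d/10gen.list
sudo apt-get update && sudo apt-get install mongodb-10gen
```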
This install adds an upstart script to /etc/init/ and a mongodb.conf to /etc, and the data will be stored in /var/lib/mongodb.
Go ahead and start MongoDB now and let's make sure it works. To open a mongo console, just type "mongo" and you will be in. Now let's create the Graylog2 database:
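You can do this interactively in the console, or one-shot from the shell as sketched below. Note that db.addUser is the Mongo 2.x-era call; later MongoDB releases replaced it with db.createUser:

```shell
# Connecting to the graylog2 database implicitly creates it on first write
mongo graylog2 --eval 'db.addUser("graylog2", "password")'
```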
This adds a user "graylog2" with the password "password"; you can check it by running "show users".
Step three: Elasticsearch for indexing.
Graylog2 uses an Elasticsearch instance to store indexes in. This allows for faster grokking of the (potentially) massive amounts of data you will be storing. Elasticsearch is quite easy to get set up.
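The install itself is just a download and unpack into the /var/lib/<component> layout used throughout this article. The version number and download URL below are illustrative; check the Graylog2 release notes for the Elasticsearch version your Graylog2 release expects:

```shell
cd /usr/share/src
# Version is an example -- match it to your Graylog2 release
wget -O - http://cloud.github.com/downloads/elasticsearch/elasticsearch/elasticsearch-0.18.7.tar.gz | tar zxv
sudo mv elasticsearch-0.18.7 /var/lib/elasticsearch
```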
Then I customized /var/lib/elasticsearch/config/elasticsearch.yml per the documentation in the config file (these are the bare minimum settings; depending on your required scale this may be just the beginning).
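As a sketch, the minimal changes I mean look like this; the cluster name and bind address are assumptions on my part, so match them to whatever your graylog2.conf will reference:

```yaml
# /var/lib/elasticsearch/config/elasticsearch.yml
cluster.name: graylog2
network.bind_host: 127.0.0.1
```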
Now start up Elasticsearch and check that it works. Run the following (if you don't have links or prefer to use a browser, you can open http://<server_IP>:9200 in your favourite browser):
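For example, with curl:

```shell
# Elasticsearch answers HTTP on port 9200 by default
curl http://localhost:9200
```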
You should get a JSON response with the cluster name, build version and a status. As long as you see this response, Elasticsearch is working. You may need to increase the number of file descriptors if your server still has the default limit; increasing it to 64k is safe.
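One way to raise the limit persistently is via /etc/security/limits.conf; the username below is a placeholder for whichever user actually runs Elasticsearch on your box:

```shell
# 65536 = the "64k" file-descriptor limit mentioned above
echo 'elasticsearch - nofile 65536' | sudo tee -a /etc/security/limits.conf
```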
Step four: Graylog2 server.
Graylog2 server is a Java app that listens on a syslog port and a GELF port for incoming messages, writes them to MongoDB, and updates the indexes in Elasticsearch.
Grab the Graylog2 server tar.gz and unpack it:
cd /usr/share/src && wget -O - http://cloud.github.com/downloads/Graylog2/graylog2-server/graylog2-server-0.9.6.tar.gz | tar zxv
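To follow the /var/lib/<component> convention from earlier, move the unpacked tree into place and copy the example config to where the server looks for it. The example-config filename is what I recall from the 0.9.6 tarball; verify it against what you actually extracted:

```shell
sudo mv /usr/share/src/graylog2-server-0.9.6 /var/lib/graylog2-server
sudo cp /var/lib/graylog2-server/graylog2.conf.example /etc/graylog2.conf
```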
Now you need to modify the graylog2.conf to reflect the MongoDB user account and password you created earlier and make sure that the Elasticsearch section references the correct port etc.
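The relevant settings look roughly like this; the key names are from the 0.9.6-era example config, so double-check them against your own copy:

```
# /etc/graylog2.conf -- values are illustrative
mongodb_useauth = true
mongodb_user = graylog2
mongodb_password = password
mongodb_host = 127.0.0.1
mongodb_database = graylog2
mongodb_port = 27017
elasticsearch_url = http://localhost:9200/
elasticsearch_index_name = graylog2
```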
Next you can create an Upstart script or an init script. I used marzocchi's init script, found here: https://gist.github.com/1659948, but you can create your own if you want.
Start up Graylog2 server and make sure that it stays running (this can be one of those silently-failing moments; wait a couple of seconds and run ps to make sure the Graylog2 server is still alive). If Graylog2 fails to stay running, you can check the log file at /var/log/graylog2.log. A number of failures occur because Graylog2 can't talk to MongoDB or Elasticsearch properly; check your configs.
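The check I mean is nothing fancier than:

```shell
# Give it a moment, then confirm the process survived startup
sleep 5
ps aux | grep [g]raylog2   # the [g] keeps grep from matching itself
tail -n 20 /var/log/graylog2.log
```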
Step five: Graylog2 web interface.
The web interface for Graylog2 is a Rails app that comes with a WEBrick setup. If this is meant to be a production server, you really want to host it on Apache with Passenger or on something else of your choosing; WEBrick is a Ruby library meant for development and testing purposes (see the WEBrick description).
First grab the tar.gz from github:
cd /usr/share/src && wget -O - http://cloud.github.com/downloads/Graylog2/graylog2-web-interface/graylog2-web-interface-0.9.6.tar.gz | tar zxv
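One way to wire it up under Apache with Passenger; the paths follow this article's layout, and the gem/module install steps and vhost below are a sketch to adapt, not a drop-in config:

```shell
sudo mv /usr/share/src/graylog2-web-interface-0.9.6 /var/lib/graylog2-web-interface
cd /var/lib/graylog2-web-interface
sudo gem install bundler && sudo bundle install
sudo gem install passenger && sudo passenger-install-apache2-module
```

Then point an Apache vhost at the app's public directory (ServerName is a placeholder):

```apache
# /etc/apache2/sites-available/graylog2
<VirtualHost *:80>
    ServerName graylog2.example.com
    DocumentRoot /var/lib/graylog2-web-interface/public
    <Directory /var/lib/graylog2-web-interface/public>
        AllowOverride all
        Options -MultiViews
    </Directory>
</VirtualHost>
```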
Restart Apache and check it out. Graylog2 web now runs on port 80, so just hit your server in a web browser. You will be prompted to create your first user, and then you will have a working and accessible Graylog2 install.
Step six: Get some logs into Graylog2.
You can ship your logs to Graylog2 in a variety of ways. You can ship using rsyslog or syslog-ng, or via AMQP or even using logstash as a middleware for filtering the logs that make it to your server. Just to see the data and bask in the glory of a working install let's ship syslog from one of our other servers using rsyslog.
Make sure that you have rsyslog installed, and then add the following line to /etc/rsyslog.d/50-default.conf:
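Assuming your Graylog2 server listens for syslog on UDP port 514 (the graylog2.conf default) and with the hostname below as a placeholder:

```
# /etc/rsyslog.d/50-default.conf -- a single @ forwards over UDP, @@ over TCP
*.* @graylog2.example.com:514
```

Restart rsyslog (sudo service rsyslog restart) and messages should start appearing in the web interface.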