Highly Available ELK (Elasticsearch, Logstash and Kibana) Setup

In this post I will be going over how to set up a complete ELK (Elasticsearch, Logstash and Kibana) stack with clustered Elasticsearch and all ELK components load balanced using HAProxy. I will be setting up a total of six servers (2 HAProxy, 2 ELK frontends and 2 Elasticsearch master/data nodes). You can scale the ELK stack by adding additional nodes identical to logstash-1/logstash-2 for logstash processing and the Kibana web interface, and adding the new node info to the HAProxy configuration files for load balancing. You can also scale the Elasticsearch master/data nodes by building out additional nodes; they will join the cluster.

Acronyms used throughout this article:
ELK – Elasticsearch Logstash Kibana
ES – Elasticsearch


Requirements:
For the logstash/elasticsearch clustering to work correctly, all HAProxy nodes and ELK nodes should be on the same subnet. (If not, you will need to configure unicast discovery for Elasticsearch, as multicast is enabled by these scripts.)
Two Ubuntu (12.04LTS/14.04LTS) HAProxy nodes with two NICs each. (1vCPU and 512MB memory will work.)
Two or more Ubuntu (12.04LTS/14.04LTS) nodes to install the ELK stack frontends. (2vCPU and 2GB memory will work.)
Two or more Ubuntu (12.04LTS/14.04LTS) nodes to install the ES master/data nodes. (2vCPU and 4GB of memory will work.)

IP Addresses required to set all of this up. (Change to fit your environment.)
DNS A Record: logstash (pointing to the LB VIP address). If you use a different name, update it in each location where logstash is configured; I will be providing a script to do this in the near future.
LB VIP 10.0.101.60
haproxy-1 10.0.101.61
haproxy-2 10.0.101.62
logstash-1 10.0.101.185
logstash-1 172.16.0.1 (Cluster Heartbeat)
logstash-2 10.0.101.180
logstash-2 172.16.0.2 (Cluster Heartbeat)
es-1 10.0.101.131
es-2 10.0.101.179

If you decide to use node names different from the list above, make sure to update the configurations to reflect those changes.


HAProxy Nodes (haproxy-1, haproxy-2):
Set up both HAProxy nodes identically, all the way down to the ELK stack setup section. The heartbeat instructions below (originally shown crossed out) are no longer used, but they remain here in the off chance that you would like to use heartbeat instead of keepalived for your cluster setup.

First thing we need to do is install the packages needed. For the keepalived setup used in this guide, run:

sudo apt-get install haproxy keepalived

(For the legacy heartbeat setup instead: sudo apt-get install haproxy heartbeat watchdog)

Now we will need to configure networking on each node as follows. (Again, modify to fit your environment.)

HAProxy-1 (Primary)
sudo nano /etc/network/interfaces
Overwrite the contents with the code from below.
https://gist.github.com/mrlesmithjr/1a52e824f22ced8e6758

HAProxy-2 (Failover)
sudo nano /etc/network/interfaces
Overwrite the contents with the code from below.
https://gist.github.com/mrlesmithjr/c8d756fb927af7f0927d
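
If those gists are unavailable, the essential shape of /etc/network/interfaces on haproxy-1 is roughly the sketch below (use 10.0.101.62 on haproxy-2; the gateway and DNS values are placeholders for your environment, and the gists remain the authoritative versions).

auto lo
iface lo inet loopback

# Primary interface; keepalived/heartbeat will add the 10.0.101.60 VIP on top of this
auto eth0
iface eth0 inet static
    address 10.0.101.61
    netmask 255.255.255.0
    gateway 10.0.101.1          # placeholder - use your real gateway
    dns-nameservers 10.0.101.1  # placeholder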

We need to allow the nodes to bind to an IP address that is not configured in /etc/network/interfaces, so run the following on each node. This will allow all of our VIPs to come up.

echo "net.ipv4.ip_nonlocal_bind=1" | sudo tee -a /etc/sysctl.conf

Apply and verify the setting by running the following on each node. You should see net.ipv4.ip_nonlocal_bind = 1 in the output.

sudo sysctl -p

Now restart networking on each node (or reboot) for the IP settings above to take effect.
sudo service networking restart
(Legacy heartbeat setup; skip ahead to the keepalived configuration below if you are using keepalived.) Now we are ready to configure the heartbeat service on each node. We will do that by setting up the following configuration files on each node.
sudo nano /etc/ha.d/ha.cf
Copy the following into ha.cf file.

https://gist.github.com/mrlesmithjr/1e9a5072b668fb5ea839
sudo nano /etc/ha.d/authkeys
Copy the following into authkeys (change password to something else).
auth 3
1 crc
2 sha1 password
3 md5 password

Now change the permissions of the authkeys as follows.
sudo chmod 600 /etc/ha.d/authkeys
Now we will create the haresources file to complete the heartbeat service setup.
sudo nano /etc/ha.d/haresources
Copy the following into haresources.
haproxy-1 IPaddr::10.0.101.60/24/eth0 logstash

Now we need to configure the keepalived cluster service. All that we need to do is create /etc/keepalived/keepalived.conf

sudo nano /etc/keepalived/keepalived.conf

And copy the contents from below and save the file. Make sure to modify the IP addresses to match your environment.
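
The keepalived configuration itself is not reproduced inline here, so as a rough guide only, a minimal keepalived.conf for haproxy-1 looks something like the sketch below. The virtual_router_id, priority values and health check are assumptions; whatever you actually deploy should match on both nodes, with haproxy-2 simply using state BACKUP and a lower priority.

vrrp_script chk_haproxy {
    script "killall -0 haproxy"    # succeeds only while HAProxy is running
    interval 2
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER                   # BACKUP on haproxy-2
    virtual_router_id 51           # assumption; must match on both nodes
    priority 101                   # use a lower value (e.g. 100) on haproxy-2
    virtual_ipaddress {
        10.0.101.60                # the LB VIP
    }
    track_script {
        chk_haproxy
    }
}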

Now you need to start the keepalived service

sudo service keepalived start

You can check and make sure that all of your VIPs came up by running the following; a normal ifconfig will not show them.

sudo ip a | grep -e "inet.*eth0"

You should see something similar to below.

(Screenshot: output of the ip command showing the 10.0.101.60 VIP bound to eth0 on the active node.)

Now we are ready to set up HAProxy for our ELK stack, the final piece of our frontend load balancer cluster.

sudo nano /etc/haproxy/haproxy.cfg

Replace all contents in haproxy.cfg with the following code.
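
The full configuration lives in the referenced gist. To give an idea of its shape, here is a trimmed-down sketch showing just the stats page, one syslog frontend and the elasticsearch frontend; the backend names are taken from the warnings shown further down, while the timeouts and balancing options here are assumptions.

global
    log 127.0.0.1 local0
    maxconn 4096

defaults
    log     global
    mode    tcp
    option  tcplog
    timeout connect 5s
    timeout client  1m
    timeout server  1m

listen stats
    bind *:9090
    mode http
    stats enable
    stats uri /haproxy?stats
    stats auth admin:admin

listen logstash-syslog-514
    bind 10.0.101.60:514
    balance leastconn
    server logstash-1 10.0.101.185:514 check
    server logstash-2 10.0.101.180:514 check

listen elasticsearch
    bind 10.0.101.60:9200
    balance leastconn
    server logstash-1 10.0.101.185:9200 check
    server logstash-2 10.0.101.180:9200 check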

Now we need to set HAProxy to enabled so it will start.

sudo nano /etc/default/haproxy

Change

ENABLED=0

to

ENABLED=1
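
If you prefer a one-liner, the same change can be made with:

sudo sed -i 's/^ENABLED=0/ENABLED=1/' /etc/default/haproxy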

Now we should be able to start HAProxy up.

sudo service haproxy start

If you see warnings similar to those below, they can be ignored.

[WARNING] 153/132650 (4054) : config : 'option httplog' not usable with proxy 'logstash-syslog-514' (needs 'mode http'). Falling back to 'option tcplog'.
[WARNING] 153/132650 (4054) : config : 'option httplog' not usable with proxy 'logstash-syslog-1514' (needs 'mode http'). Falling back to 'option tcplog'.
[WARNING] 153/132650 (4054) : config : 'option httplog' not usable with proxy 'logstash-eventlog' (needs 'mode http'). Falling back to 'option tcplog'.
[WARNING] 153/132650 (4054) : config : 'option httplog' not usable with proxy 'logstash-iis' (needs 'mode http'). Falling back to 'option tcplog'.
[WARNING] 153/132650 (4054) : config : 'option httplog' not usable with proxy 'logstash-redis' (needs 'mode http'). Falling back to 'option tcplog'.
[WARNING] 153/132650 (4054) : config : 'option httplog' not usable with proxy 'elasticsearch' (needs 'mode http'). Falling back to 'option tcplog'.
[ OK ]

Now one last thing: HAProxy cannot load balance UDP, and not all network devices have the option to send their syslog data to a TCP port. To handle this we will install an instance of Logstash and set up rsyslog forwarding on each HAProxy node; rsyslog will listen on the standard UDP/514 port and forward those messages to the logstash cluster over TCP/514. I have made this extremely easy by providing a script. However, do not run it until after you have set up your ELK stack nodes below; if you do set this up prior to building out your ELK nodes, you will need to restart the logstash service on each of your HAProxy nodes.
If for some reason you need to restart the logstash service, you can do so by running:

sudo service logstash restart

So let’s set up our logstash instance and configure rsyslog forwarding. To do so, run the following commands in a terminal session on each of your HAProxy nodes.

sudo apt-get install git
cd ~
git clone https://github.com/mrlesmithjr/Logstash_Kibana3
chmod +x ./Logstash_Kibana3/Cluster_Setup/Logstash-HAProxy-Node.sh
sudo ./Logstash_Kibana3/Cluster_Setup/Logstash-HAProxy-Node.sh
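
Once the script has run (and your ELK nodes exist), a quick sanity check on each HAProxy node is to confirm that something is now listening for syslog on UDP/514:

sudo netstat -ulnp | grep ':514'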

If you copied the haresources file exactly from the (legacy heartbeat) section above, Logstash will only be running on the active cluster node and will start on the failover node when a failover occurs.

Now that HAProxy node 1 is complete, repeat all of the above on HAProxy node 2, making sure to change the priority as noted in the keepalived.conf file. Once you have completed HAProxy node 2, continue on to the next section and set up your ELK stack. You could also clone the first node to create the second, but if you do, make sure to make the corresponding changes in keepalived.conf and haproxy.cfg as described above.


ES (Elasticsearch) Master/Data Nodes (es-1, es-2):
Now we will be setting up the two nodes that form our Elasticsearch cluster, and again I have a script to do this. These nodes will only be master/data nodes; they will not do any logstash processing. They exist purely to maintain the cluster and provide redundancy. These nodes will not be exposed to the HAProxy load balancers; only the ELK nodes below will be. They will process all of the data that our frontend ELK nodes send back to be ingested and indexed. For now we will only be setting up two ES master/data nodes; however, you can build out as many as you like and run this same script on each additional node. (If you add more than two, adjust the following parameter in /etc/elasticsearch/elasticsearch.yml to avoid a split-brain ES cluster: set the value to floor(n/2)+1, where n is the number of nodes. With just two nodes it stays at 1, or unset; with three nodes the value would be 2 (3/2, rounded down, plus 1).) Just make sure that every node name is unique and has a DNS record associated with it.

discovery.zen.minimum_master_nodes: 2
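
For reference, the relevant settings in /etc/elasticsearch/elasticsearch.yml on a master/data node end up looking roughly like the sketch below; the setup script writes the real file, and node.name will differ per node.

cluster.name: logstash-cluster
node.name: "es-1"
node.master: true
node.data: true
# Uncomment and raise to floor(n/2)+1 once you have three or more nodes:
# discovery.zen.minimum_master_nodes: 2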

So let’s get these nodes up and running.
On your new ES nodes run the following script on each to get them running.

sudo apt-get install git
cd ~
git clone https://github.com/mrlesmithjr/Logstash_Kibana3
chmod +x ./Logstash_Kibana3/Cluster_Setup/Logstash-ES-Cluster-Master-data-node.sh
sudo ./Logstash_Kibana3/Cluster_Setup/Logstash-ES-Cluster-Master-data-node.sh

Once these are up and running, your new ES cluster (logstash-cluster) should be ready to go. However, you will want to set the Java heap size to 50% of the installed memory. If you sized the nodes per the requirements (4GB), adjust ES_HEAP_SIZE to 2g; it defaults to 1g and the setting is commented out by default.

sudo nano /etc/init.d/elasticsearch

change

#ES_HEAP_SIZE=1g

to

ES_HEAP_SIZE=2g
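
Or make the change with a one-liner on each ES node (this assumes the default commented-out line is present exactly as shown above) and restart Elasticsearch so the new heap size takes effect:

sudo sed -i 's/^#ES_HEAP_SIZE=1g/ES_HEAP_SIZE=2g/' /etc/init.d/elasticsearch
sudo service elasticsearch restart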

Now proceed onto setting up the frontend ELK nodes.


ELK (Elasticsearch, Logstash and Kibana) Nodes (logstash-1, logstash-2):

Now we are ready to set up our ELK frontend nodes and again I have a script to make this process repeatable and simple. For now we will only be setting up two ELK nodes; however you can build out as many as you like and use this same script each time for each additional node. Just make sure that every node-name is unique and has a DNS record associated with it.

To get started, all you need to do is run the following on a fresh Ubuntu 12.04LTS/14.04LTS server and let the script set up your ELK node. This script will install Elasticsearch and join the “logstash-cluster” as a client node; install Logstash with many different filtering patterns and inputs, which also joins the “logstash-cluster” as a client node from its output (so yes, two instances per ELK node will show up as clients in the ES cluster); and install the Kibana3 web UI configured to read from the “logstash-cluster”. These ELK nodes will do all of the heavy lifting for logstash processing as well as servicing Kibana requests, keeping that load off of the ES master/data nodes above and allowing them to do nothing more than churn data.

Once the script below has finished, all that is left is to point your network devices at the HAProxy VIP (10.0.101.60, or the logstash hostname) for syslog and watch the data start flowing in.

sudo apt-get install git
cd ~
git clone https://github.com/mrlesmithjr/Logstash_Kibana3
chmod +x ./Logstash_Kibana3/Cluster_Setup/Logstash-ELK-ES-Cluster-client-node.sh
sudo ./Logstash_Kibana3/Cluster_Setup/Logstash-ELK-ES-Cluster-client-node.sh
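
Once the script completes you can sanity-check that the new node joined the cluster (this assumes Elasticsearch is listening locally on the default port 9200):

curl -s 'http://localhost:9200/_cluster/health?pretty'

The output should report "cluster_name" : "logstash-cluster" and a node count that includes your ES master/data nodes.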

Once this has been completed, make sure to go back up to the end of the HAProxy setup and install the logstash instance on each HAProxy node. Once that is done you can begin testing with only one ELK node, or you can build out a few more ELK nodes if you like. I would start with at least two to get the full benefit of this setup.

****NOTE
If you used a VIP hostname other than logstash, you will need to modify the following file on each of your ELK client nodes for the Kibana web interface to connect to ES correctly.
You can do that by editing the file and replacing logstash with the VIP hostname used for your setup (for example, myloghostname):
Edit /usr/share/nginx/html/kibana/config.js and change http://logstash:9200 to http://yourviphostname:9200

sudo nano /usr/share/nginx/html/kibana/config.js

Or you can run the following, replacing yourviphostname with the actual VIP hostname used for your setup:

sudo sed -i -e 's|^elasticsearch: "http://logstash:9200",|elasticsearch: "http://yourviphostname:9200",|' /usr/share/nginx/html/kibana/config.js

Now all that is left is to configure your network devices to start sending their syslog to the HAProxy VIP, and if your device supports sending via TCP, use it. Why? Because you will benefit from load balancing of the TCP connections and there will not be any lost events (UDP is best effort and fast; TCP is guaranteed but slower), and this type of setup brings great results.

Reference the port list below when configuring some of the devices that are pre-configured during the setup.


Port List
TCP/514 Syslog (Devices supporting TCP)
UDP/514 Syslog (Devices that do not support TCP – these are captured on the HAProxy nodes and relayed back to the logstash cluster)
TCP/1514 VMware ESXi
TCP/1515 VMware vCenter (Windows install or appliance) (For Windows install use NXLog from below in device setup) (For appliance reference device setup below)
TCP/3515 Windows Eventlog (Use NXLog setup from below in device setup)
TCP/3525 Windows IIS Logs (Use NXLog setup from below in device setup)
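
Before pointing real devices at the stack, you can push a single hand-crafted syslog message through the VIP to confirm the TCP/514 path end to end. This is just a quick test idea and assumes the netcat (nc) utility is installed on the machine you are testing from:

echo "<13>$(date '+%b %e %H:%M:%S') $(hostname) elk-test: hello via the HAProxy VIP" | nc -q 1 logstash 514

The message should show up in Kibana shortly afterwards.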


Device Setup
For Windows (IIS, Event Log and VMware vCenter logging), install NXLog and use the nxlog.conf file below to replace everything in C:\Program Files (x86)\nxlog\conf\nxlog.conf

For VMware vCenter appliance do the following from the appliance console.

vi /etc/syslog-ng/syslog-ng.conf

Now add the following to the end of the syslog-ng.conf file

source vpxd {
       file("/var/log/vmware/vpx/vpxd.log" follow_freq(1) flags(no-parse));
       file("/var/log/vmware/vpx/vpxd-alert.log" follow_freq(1) flags(no-parse));
       file("/var/log/vmware/vpx/vws.log" follow_freq(1) flags(no-parse));
       file("/var/log/vmware/vpx/vmware-vpxd.log" follow_freq(1) flags(no-parse));
       file("/var/log/vmware/vpx/inventoryservice/ds.log" follow_freq(1) flags(no-parse));
};

# Remote Syslog Host
destination remote_syslog {
       tcp("logstash" port (1515));
};
#
# Log vCenter Server vpxd log remotely
log {
        source(vpxd);
        destination(remote_syslog);
};

Now restart syslog-ng

/etc/init.d/syslog restart

For Linux (Ubuntu, etc.) I prefer rsyslog, as it is installed by default on most distributions.

sudo nano /etc/rsyslog.d/50-default.conf

Now add the following to the end of this file

*.* @@logstash

Note the “@@”: this means use TCP, whereas “@” means use UDP.
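
After saving the file, restart rsyslog and optionally send a test entry; the local message will then be forwarded over TCP to the logstash VIP:

sudo service rsyslog restart
logger "rsyslog forwarding test from $(hostname)"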


Now that your setup is complete, you can browse to the Kibana web UI with the browser of your choice at the VIP hostname (for example http://logstash).


You should see some logs showing up now, but the default view is not very usable, so you will need to build out your dashboard the way you want it to look. Or you can use some of the dashboards I have created: click the load folder at the top right, go to Advanced, and enter the gist number or URL from the list below (copy and paste the https:// URL). Once you load a dashboard, make sure to save it or it will be gone once you browse away.
Apache https://gist.github.com/mrlesmithjr/32affb2316d38500f7e5
Windows IIS https://gist.github.com/mrlesmithjr/4c20dd5ffc79c47474a2
Nginx https://gist.github.com/mrlesmithjr/cf8cb356b05765bd764d
PFsense Firewall https://gist.github.com/mrlesmithjr/f4c9945e04de3211d076
Syslog https://gist.github.com/mrlesmithjr/b0c8f9d8495c8dbefba7
VMware https://gist.github.com/mrlesmithjr/3f7c937cbefe83dafc60
Windows https://gist.github.com/mrlesmithjr/a9847a369c7d92bbac1d

To view the ElasticHQ plugin, go to http://logstash:9200/_plugin/HQ (substitute your VIP hostname).


To view the Elasticsearch Paramedic plugin, go to http://logstash:9200/_plugin/paramedic.


To view the Elasticsearch Head plugin, go to http://logstash:9200/_plugin/head.


To view the Elasticsearch Marvel plugin, go to http://logstash:9200/_plugin/marvel.


To view your HAProxy stats, go to http://logstash:9090/haproxy?stats (log in with admin/admin).


So there you have it: a highly available ELK setup that is repeatable and scales out extremely easily.

Going through this setup and testing the different components brought to light many other options for HAProxy and ideas beyond this post, so stay tuned for more soon, including a Visio drawing of the layout. I am also working on scripts to set up an nginx proxy in front of Kibana for SSL and password-protected logins and to redirect ES queries through the proxy, as well as scripts for IPTables firewall configurations to tighten down access to the ES nodes, forcing access through the nginx proxy and the HAProxy load balancers and preventing direct access to an ES node. This will all be in a follow-up post very soon.


 

Follow-up posts

Set up all ELK components to work in unicast mode instead of multicast discovery mode.

 

Here is a quick screenshot of performance from the Marvel plugin, just for reference. Only processing about 6GB/day right now.

(Screenshot: Marvel cluster performance dashboard.)

 

And the Visio drawing to represent the components.

(Diagram: ELK-Stack-HA component layout.)

Enjoy!

Need help with your ELK Stack deployments? Head over here and see how we can help.

NEW !!!!

If you are looking for a way of deploying this using Ansible head over here.

72 thoughts on “Highly Available ELK (Elasticsearch, Logstash and Kibana) Setup”

    • @John – Thanks! All of this has been drafted up and sized for my lab and many things have changed since I wrote this up so I will be modifying it here soon. I have learned quite a bit in the past few weeks for sure. Now on the sizing I am unsure on recommendations at this time but will have a very good idea here soon as I will be putting this solution in place at work which has 100k+ message/second so I will have a much better idea soon. Right now for my lab I am configured as the following…
      (2) Haproxy LB nodes – 1vCPU and 1GB memory
      (3) ELK nodes – 2vCPU and 2GB memory

  1. Awesome! Thanks! What about the my-sql server? It looks like that was separate from the rest of the build or did I miss a link to some other tutorial?

    • @John – LOL! The MySQL is from some testing that I am doing with HAProxy Load-Balancing for a MySQL Master-Master replication cluster. More to come on that soon! It is working awesome as of now.

  2. Great writeup and nice job on those scripting GITs!

    Are you updating those scripts for new versions of elasticsearch (1.2.1 instead of 1.1.1) and kibana (3.1.0 instead of 3.0.1)?

    Also is there a special reason, you are downloading logstash and cat'ting the start script and stuff instead of using their .deb package directly?

    Thanks!

    • @Jens – Thanks. Yes, I will be keeping them up to date. I had some issues with ES 1.2.1 so I am staying at 1.1.1 for now. I updating the scripts to include Kibana 3.1.0 today. When I started with the initial script there wasn't a .deb package and now I just like having more control over the install. I may at some point switch to the .deb but not anytime soon. 🙂

      • Hey Larry,

        what Problems did you have with 1.2.1? I modified your Cluster-ES script to download 1.2.1 and Kibana 3.1.0 and ran it on two nodes. Seems to run fine (OK it was a new install).

        Only problem I have is the Marvel plugin not displaying anything at all.

        (No results: There were no results because no indices were found …)

        Could that have to do with 1.1.1 vs 1.2.1? The other Plugins seem to run fine.

        Also I'm looking around for the directory where ES stores its data?

        Greets

        • Looks like my issues cleaned up after I ran some curator jobs successfully. 🙂 ES stores it's data in /var/lib/elasticsearch

          • Thought so. Any chance you have a guess why the marvel plugin won't see any data? All other plugins and kibana show the data/documents just fine.

            Will have a look into the pfsense syslog problem and report back 🙂

          • @Jens – Have you tried restarting ES? I had to restart mine to get marvel to report correctly after adding in dedicated ES Master/Data nodes.

  3. Looking forward to the revised scripts. I've been jamming on this for a few days now and very few issues at all. I added Palo Alto firewall support to your logstash filters but unfortunately Palo Alto does not play nicely with haproxy. Question on creating the UDP listener for the haproxy servers – why also add the logstash to those servers? rsyslog does just fine "flipping" the traffic back to logstash. Here's my rsyslog.conf:

    $ModLoad imtcp

    $RuleSet PaloAltoLog

    *.* @@logstash:1520

    $InputTCPServerBindRuleset PaloAltoLog

    $InputTCPServerRun 1518

    • @John – LOL! I have since gone to just using rsyslog as well. I had already updated the scripts and updated the blog post. I have also run into some strange issues upgrading to ES 1.2.1 as well so I am looking at laying the design out a bit differently right now. Will know more in the next few days! I would be interested in checking out the Palo Alto filters BTW!

  4. Of course! I'll send you the kibana dashboard as well. E-mail address or post here? BTW, I noticed your script didn't turn on HTTPS, although your haproxy config is technically balancing it. Any reason there? Included in version 2.0 🙂 ?

    • @John – LOL! Yeah I am planning on a separate post on setting up HTTPS using NGINX and securing the whole ELK stack. I changed all of the scripts today and updated this post with many new things. So the version of the setup you may be using may not be the latest! 🙂

  5. Hi Larry,

    you logic doesn't get to me on this one:

    "You will set the value to n/2+1 where n=number of nodes. So for example with just two nodes it would be 1; whereas with 3 the value would be 2 (3/2+1=2)). Just make sure that every node-name is unique and has a DNS record associated with it."

    With n/2+1 as value, a 2-node-setup would have a value of 3, not 2 😉 (2/2 = 1 +1 = 2). And not sure about the three node value, as most round operations would make 3/2 = 2 (1,5 rounded up) so result would be 3.

    So… how to set the value right?! 🙂

    • @Jens – a 2 node setup would be 1 which is why it is not set for just a 2 node setup. Maybe I should reword the part about having just two nodes 🙂 2/2+1 = 2, 3/2+1=3, 5/2+1=3 The correct logic in doing this is to take half of the total number of your nodes and +1 them 🙂 Make sense? No need to round up but rather round down to ensure that you always set it to 1 more than half of your nodes.

      • I did understand it correctly but thought the given logic (n/2+1) is a bit faulty for that case. If you word it like you said: "always take half your nodes (rounded down) and add one so you always have a majority" that makes way more sense 😉

        But thanks for the clarification. So a pair would be bad as no majority, 3 nodes would be 2, 4+5 nodes would be 3, 6+7 nodes would be 4… and so on.

      • This is incorrect. minimum_master_nodes should be 2 for a cluster with 2 nodes.

        To have a majority when you just have two nodes, you *must* have both nodes available.

        minimum_master_nodes=1 on both nodes will let either of them form a cluster on their own when a partition happens, and you *do* get a split brain.

        To be able to lose a node, you will need at least three nodes. Two nodes does not get you HA, unless you only do reads and can handle a split brain.

        • @Alex-It does not state that minimum_master_nodes should be 2 with 2 nodes…It clearly says 1…However I would agree with you that a minimum of 3 nodes should be used which is what I use in prod and then set minimum_master_nodes to 2..Setting up 2 nodes in this was the goal of the setup..leaving it up to the admin to decide on how many nodes to setup. I myself have also taken this design and changed it much more than what is here. I will be sharing that setup in the future. Many tweaks..ex. moving ES master nodes onto their own two servers and etc. Thanks for the feedback as well…

    • @Sabreesh – The elasticsearch cluster is configured to come up as ‘logstash-cluster’. This is set during the install. You can change that if the need is there. Hope this helps!

  6. Looking forward to the SSL setup script. Haven't built the new stack with the additional hosts yet but I'll let you know how it goes. Do you have any suggestions on how to produce reports from ELK? I believe ELK possess all the capability of a full fledged SIEM tool, save two critical factors, reporting and alerting.

  7. Definitely would be interested in the HTTPS as well… been working on something similar configured with puppet. Also curious as you didn't mention the use of your broker. You have content going to redis and being read from redis (logstash input) but this post didn't mention the setup of the broker logstash instance to get data into redis. Are you not using that anymore?


  9. @Larry,

    Thanks for such a nice post.

    I faced a problem is that load balancing on TCP connection doesn't guarantee actual traffic are balanced.

    For example: if instances of service_1, service_2, service_3, service_4 pushing log logstash-syslog-TCP, it might load balancing as logstash_1: service_1, service_2, logstash_2: service_3, service_4, but service_1 + service_2 has double traffic then service_3, service_4, then the loading of logstash_1, logstash_2 is very unbalanced. Just wonder if there's any solution to mitigate this problem?

    I had tried using "message queue", which adds lots of overhead on push/pull message from message broker.

    Also tried to manually specify which service connect to which logstash server, which has the similar problem.

    Thanks a lot 😀

    • @Jim – Thanks…Hope this post has been helping out for everyone….Many things I have changed since this post that I have not released yet of course 🙂 However the following will balance the traffic much more evenly and I have not had any issues whatsover…..this will need to be modified on your haproxy nodes /etc/haproxy/haproxy.cfg

      mode tcp
      option tcpka
      option tcplog
      #balance leastconn – The server with the lowest number of connections receives the connection
      #balance roundrobin – Each server is used in turns, according to their weights.
      #balance source – Source IP hashed and divided by total weight of servers designates which server will receive the request
      balance leastconn
      default-server inter 2s downinter 5s rise 3 fall 2 slowstart 60s maxconn 256 maxqueue 128 weight 100

      • I think "balance leastconn" means for every new tcp connection request, the load balancer will forward to upstream server which has least tcp connections.

        However, the situation is that traffic of every tcp connection built from logstash-forwarder to logstash varies a lot, some are 0.3k event/s, some are 5k events/s.

        Which means, even the tcp connection number are distribute evenly,

        logstash_1: 10 connection

        logstash_2: 11 connection

        but logstash_1's 10 connection's traffic might as double as logstash_2's

        currently seems "message queue" architecture can solve this case?

        • @Jim – This sounds very interesting if I am understanding what you are saying. What have you figured out so far? I would definitely be interested in hearing what you come up with. Let me know if you haven't as well. I would also be interested in figuring this out as well.

    • @Larry,
      I'l might go for ELB currently, since "message queue" should be best for balancing CPU load across logstash nodes, but the logstash's input/output rabbitmq threads consume 30%-40% CPU that makes extra overhead, which result in adding more nodes to gain the target throughput. 🙁
      However, someone in google-group suggest to use "kafka" message queue for much higher performance, but I don't have time to try it out yet. FYR. https://groups.google.com/forum/#!topic/logstash-

      • @Jim – did you use my setup scripts to setup your environment? I would guess no based on you using rabbitmq? I can say for sure in my testing of rabbitmq the results were awful which is why I went back to redis. I can also say that using my setup in production I can process up to 10k messages per second. I do have a different design which I have not published yet that I could share with you if interested. I do not do any pre processing on the source nodes but have an additional layer which does all processing for multiline events and sends to redis. Then the logstash processors/indexers pull all messages from redis and process the non multiline events and pass through multiline to a 3 node elasticsearch cluster. In total including load balancers in that solution consists of 16 nodes. All messages are near real time ~10 seconds.

        • @Larry is there more info on the redis config? You see it in parts of configs but still not clear of full workflow. Also can you describe your production setup here. I'm interested in scaling and how/where redis will scale with it.
          Thanks


    • @cdenneen – yeah I will share out my current nxlog.conf which includes a working iis setup. I will also share a working logstash.conf which works along with the iis setup. I will also be putting together an updated design which I haven’t shared at this point yet here soon. Stay tuned.

  11. Amazing post! Very well detailed. You stated that you would have a Visio diagram available. Do you have that by any chance as I would love to see this laid out architecturally speaking.

    Again, fantastic post.

  12. Hi,

    I have ran your scripts in sequence and everything was working. I could see some logs and i was playing with it but after restarting the 6 servers, I am getting an error while accessing the kibana page. At http://logstash.mydomain.com the page now states "Connection Failed
    Possibility #1: Your elasticsearch server is down or unreachable Possibility #2: You are running Elasticsearch 1.4 or higher". I have googled and some sites were stating the same issue to with elasticsearch version 1.4 and i have made the changes though the elasticsearch the script is running is 1.3. Did you face this issue or anyone has come up with this issue?

    Thanks for the detailed intallation steps.

    Regards
    Suraj

    • @Suraj – I have seen this where the Kibana instance cannot connect to Elasticsearch either because it is not running or potentially the HAProxy Load Balancers are not running. Validate that Elasticsearch is running on all nodes that it should be and validate that HAProxy is running all required services.

      • haproxy-1 & haproxy-2 – service logstash, haproxy running
        es-1 & es-2 – service elasticsearch running
        logstash-1 & logstash-2 – service logstash, elasticsearch running

        when i checked http://logstash.mydomain.com:9090/haproxy?stats all of them are "active up" except for the kibana-https

        Any other services that i need to check? Actually everything was working before i restarted the 6 servers.

        Regards
        Suraj

    • I faced this issue in my environment and found that I had to go unicast. I have not yet followed this article but the error you have is identical. I'm running on Openstack and the issue was multicast. Once I had unicast set up Kibana worked. Take note to hard refresh when troubleshooting this issue as I thought the issue persisted for some time and it turned out to be cache.

  13. Yeah, nothing changed.

    Strangely, http://logstash.mydomain.com:9200/_plugin/marvel http://logstash.mydomain.com:9200/_plugin/paramed
    works fine but http://logstash.mydomain.com:9200/_plugin/HQ shows the web page but unable to connect

    any idea about enabling the below part in the elasticsearch.yml
    http.cors.allow-origin: "/.*/"
    http.cors.enabled: true

    some were suggesting that there may be not enough resources and hence i have increased the memory and the cpu too.
    http://serverfault.com/questions/646484/logstash-… just a link for reference regarding people facing similar issues.

    Regards
    Suraj

  14. I'm also interested in the redis config. I am workingo n a similair config for a company wide deployment in support of our cloud services. I'd like to use Elasticache (Redis) in my deployment.

  15. It's really very strange.

    I have reinstalled the whole 6 servers again with the only change of the LB VIP DNS A record from logstash to elkstack. The reason why i did that was while googling someone suggested that the error could be due to some cross domain issue. I am dealing with multiple domain here say mydomain.com and mydomain.edu and i already had a logstash.mydomain.edu and hence that could be the possible reason why it failed. so i made it elkstack.mydomain.com and things were working fine.

    The only issue i am facing now is from some of the desktops, i am still facing the connection failed error but few other desktops are connecting fine and showing the graph.

  16. Suraj, i had the same issue with failed connections.
    The issue is dns, the clients need to be able to resolve “logstash” or whatever dns you have for the cluster.


  20. Hi,
    My Dissertation is on Log analysis and management using Open Source Tools. SI have access to logs from different servers that have been extracted to an external storage. I would like to setup a Virtual ELK stack and use it to carryout offline log visualisation. Is that possible? How do i go about it?

  21. Syslogs are fine. How do I transfer application logs generated by, say, some java apps to the logstash? The java apps use log4j as the logger.

      • Hi @mrlesmithjr, thanks for the logstash config, i’m using nxlog on my IIS servers with below config, but not seeing anything in logstash/kibana:

        ## Please set the ROOT to the folder your nxlog was installed into,
        ## otherwise it will not start.

        #define ROOT C:\Program Files\nxlog
        define ROOT C:\Program Files (x86)\nxlog

        Moduledir %ROOT%\modules
        CacheDir %ROOT%\data
        Pidfile %ROOT%\data\nxlog.pid
        SpoolDir %ROOT%\data
        LogFile %ROOT%\data\nxlog.log

        # Enable json extension

        Module xm_json

        Module xm_syslog

        Module pm_buffer
        MaxSize 1024
        Type Mem
        WarnLimit 512

        Module pm_buffer
        MaxSize 1024
        Type Mem
        WarnLimit 512

        Module xm_csv
        Fields $date, $time, $s-ip, $cs-method, $cs-uri-stem, $cs-uri-query, $s-port, $cs-username, $c-ip, $csUser-Agent, $cs-Referer, $sc-status, $sc-substatus, $sc-win32-status, $time-taken
        FieldTypes string, string, string, string, string, string, integer, string, string, string, string, integer, integer, integer, integer
        Delimiter ‘ ‘
        QuoteChar ‘”‘
        EscapeControl FALSE
        UndefValue –

        Module im_file
        File “C:\inetpub\logs\LogFiles\W3SVC1\u_ex*.log”
        ReadFromLast True
        SavePos True
        Exec if $raw_event =~ /^#/ drop(); \
        else \
        { \
        w3c->parse_csv(); \
        $SourceName = “IIS”; \
        $Message = to_json(); \
        }

        Module im_file
        File “C:\inetpub\logs\LogFiles\W3SVC2\u_ex*.log”
        ReadFromLast True
        SavePos True
        Exec if $raw_event =~ /^#/ drop(); \
        else \
        { \
        w3c->parse_csv(); \
        $SourceName = “IIS”; \
        $Message = to_json(); \
        }

        Module im_file
        File “C:\inetpub\logs\LogFiles\W3SVC3\u_ex*.log”
        ReadFromLast True
        SavePos True
        Exec if $raw_event =~ /^#/ drop(); \
        else \
        { \
        w3c->parse_csv(); \
        $SourceName = “IIS”; \
        $Message = to_json(); \
        }

        Module om_tcp
        Host x.x.x.x
        Port 5544

        Path iis_1, iis_2, iis_3 => out_iis

        Sorry, early days with logstash, appreciate the help.


  23. Please change the filter of syslog to accept the new format RFC5424

    match => { "message" => "%{SYSLOG5424BASE} +%{GREEDYDATA:syslog5424_msg}" }
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
    match => [ "syslog5424_ts", "MMM d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ]
    }
    if !("_grokparsefailure" in [tags]) {
    mutate {
    replace => [ "host", "%{syslog5424_host}" ]
    replace => [ "@source_host", "%{syslog5424_host}" ]
    replace => [ "@message", "%{syslog5424_msg}" ]

    Thanks for the nice install script! 😉

    • @mane – Thanks for the feedback. I am definitely already testing this. I have moved this script all to @ansible which I have not provided to the community yet. This will be coming soon though. I can assure you that if you like the script, YOU WILL LOVE the Ansible way.

  24. I set up everything from scratch and i can access all pages, but the .logstash-* index is missing. Marvel works fine.

    Any ideas how i can check where the error is? I tried to follow the path with tcpdump and it looks fine. Everything is in the same network.

