Tech Chorus

Ansible Role To Install And Configure Logstash OSS

written by Sudheer Satyanarayana on 2020-06-13

I have released an Ansible role to install Logstash OSS. The source is available on GitHub.

Use the role to install and configure Logstash OSS.

You might be wondering: why another Ansible role? Here are the reasons I wrote this role:

Example playbook

- hosts: servers
  vars:
    logstash_oss_install_java_11: true
    logstash_beats_input: true
    logstash_elasticsearch_output: true
  roles:
    - role: bngsudheer.logstash-oss

The role can optionally install Java 11, which is a dependency for Logstash. You can skip this and satisfy the dependency by other means. Notice that in the example, the optional Beats input and Elasticsearch output are enabled.

Generating Self-Signed Certificate For Logstash And Other Services

written by Sudheer Satyanarayana on 2020-06-11

When configuring Logstash with an SSL certificate, you need the certificate and its private key. You can generate them yourself using openssl.

On Fedora/CentOS:

cp /etc/pki/tls/openssl.cnf my_openssl.cnf

On Ubuntu:

cp /etc/ssl/openssl.cnf my_openssl.cnf

Edit the file my_openssl.cnf and in the v3_ca section add the subjectAltName:

subjectAltName = IP: 192.168.200.19

If you have multiple IP addresses, use a comma-separated string. For example:

subjectAltName = IP: 192.168.200.19,IP: 192.168.200.20

If you are using this for Logstash, use the IP address of the Logstash server.

Generate the certificate and key

openssl req -x509 -batch -nodes -days 3650 -newkey rsa:2048 -keyout my.key -out my.crt -config my_openssl.cnf

Convert the private key to PKCS8 format

openssl pkcs8 -in my.key -topk8 -nocrypt -out my.p8

Use my.crt and the key in your Logstash configuration. Note that some Logstash plugins, such as the Beats input, expect the private key in PKCS8 format; that is what my.p8 is for.
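
If you prefer to generate the key and certificate from code rather than openssl, here is a minimal Python sketch using the third-party cryptography package (pip install cryptography; a reasonably recent version is assumed). The file names and IP address mirror the openssl example above; it writes the key directly in PKCS8 form, so the separate conversion step is not needed. Adjust the values for your environment.

import datetime
import ipaddress

from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

# 2048-bit RSA key, matching the rsa:2048 argument to openssl
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "logstash")])
cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)  # self-signed: issuer equals subject
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.datetime.utcnow())
    .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=3650))
    .add_extension(
        # the subjectAltName we added to my_openssl.cnf
        x509.SubjectAlternativeName(
            [x509.IPAddress(ipaddress.ip_address("192.168.200.19"))]
        ),
        critical=False,
    )
    .sign(key, hashes.SHA256())
)

# write the key directly in PKCS8 format, so no separate conversion step is needed
with open("my.p8", "wb") as f:
    f.write(
        key.private_bytes(
            serialization.Encoding.PEM,
            serialization.PrivateFormat.PKCS8,
            serialization.NoEncryption(),
        )
    )

with open("my.crt", "wb") as f:
    f.write(cert.public_bytes(serialization.Encoding.PEM))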

Logging From Flask Application To ElasticSearch Via Logstash

written by Sudheer Satyanarayana on 2020-06-01

In this blog post, I will explain how to send logs from a Flask application to Elasticsearch via Logstash. As a bonus, I will show how to view the logs in Kibana. No prior working knowledge of the Elastic stack is expected.

Prerequisites

Our log pipeline:

Flask application -> python-logstash-async -> Logstash -> Elasticsearch

You can view the logs in Kibana, or you can query Elasticsearch directly, either manually or from another application.

Flask application environment setup

For this example, I used the CentOS 8 Stream Linux distribution. You are free to use the distribution of your choice; if you are using another distribution, adjust the commands accordingly.

sudo dnf install python36 vim-enhanced screen
python3 -m venv myenv
source myenv/bin/activate
pip install flask

Minimal Flask application

Let's start with the minimal Flask application from the official quick start guide.

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'

Start the application:

export FLASK_APP=hello.py
export FLASK_ENV=development
flask run

Send a request to the Flask application

curl localhost:5000

Add a logging line and view the log entry in stdout

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    app.logger.info("Hello there")
    return 'Hello, World!'

Here's my terminal output:

(myenv) [vagrant@dev-c8-01 flask-app]$ flask run
 * Serving Flask app "hello.py" (lazy loading)
 * Environment: development
 * Debug mode: on
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 223-107-058
[2020-05-31 13:19:49,981] INFO in hello: Hello there
127.0.0.1 - - [31/May/2020 13:19:49] "GET / HTTP/1.1" 200 -

So far, so good.

Yes, I use Vagrant.

Let's install the Python package python-logstash-async:

pip install python-logstash-async

python-logstash-async helps you ship logs from your Flask application to Logstash. As the name suggests, the connection to Logstash is handled asynchronously.

Import the log handler and formatter.

from logstash_async.handler import AsynchronousLogstashHandler
from logstash_async.formatter import FlaskLogstashFormatter

Configure the connection to Logstash. In this example, we use Python variables in the hello.py file:

LOGSTASH_HOST = "192.168.200.19"
LOGSTASH_DB_PATH = "/home/vagrant/app-data/flask_logstash.db"
LOGSTASH_TRANSPORT = "logstash_async.transport.BeatsTransport"
LOGSTASH_PORT = 5044

We are stating that Logstash runs at the IP address 192.168.200.19 on TCP port 5044. Remember, the port has to be an integer. The python-logstash-async package offers a few options for the transport protocol. We chose the Beats transport because it is one of the most popular input sources for Logstash; you can expect Logstash to accept data on the Beats port in many common Logstash configurations. On the server where the Flask application runs, you have to select a path at which python-logstash-async stores temporary data. In my example, it is /home/vagrant/app-data/flask_logstash.db.

Ensure the directory of LOGSTASH_DB_PATH exists. In my case, I did mkdir /home/vagrant/app-data/.

In a sophisticated application, you might want to load the configuration from a file or environment variables.
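
For instance, here is a hedged sketch that reads the same settings from environment variables, falling back to the values above. The environment variable names are my own choice, not something python-logstash-async prescribes.

import os

LOGSTASH_HOST = os.environ.get("LOGSTASH_HOST", "192.168.200.19")
LOGSTASH_PORT = int(os.environ.get("LOGSTASH_PORT", "5044"))  # keep the port an integer
LOGSTASH_DB_PATH = os.environ.get(
    "LOGSTASH_DB_PATH", "/home/vagrant/app-data/flask_logstash.db"
)
LOGSTASH_TRANSPORT = os.environ.get(
    "LOGSTASH_TRANSPORT", "logstash_async.transport.BeatsTransport"
)

# make sure the directory holding the local buffer database exists
os.makedirs(os.path.dirname(LOGSTASH_DB_PATH), exist_ok=True)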

Configure the Logstash handler:

logstash_handler = AsynchronousLogstashHandler(
    LOGSTASH_HOST,
    LOGSTASH_PORT,
    database_path=LOGSTASH_DB_PATH,
    transport=LOGSTASH_TRANSPORT,
)

Configure the formatter:

logstash_handler.formatter = FlaskLogstashFormatter(metadata={"beat": "myapp"})

The python-logstash-async package provides a formatter specifically for Flask. Some common Logstash configurations listen for Beats input; hence, set up the metadata for the Beats protocol like this.

Attach the handler:

app.logger.addHandler(logstash_handler)

Our application looks like this:

from flask import Flask
app = Flask(__name__)

from logstash_async.handler import AsynchronousLogstashHandler
from logstash_async.formatter import FlaskLogstashFormatter

LOGSTASH_HOST = "192.168.200.19"
LOGSTASH_DB_PATH = "/home/vagrant/app-data/flask_logstash.db"
LOGSTASH_TRANSPORT = "logstash_async.transport.BeatsTransport"
LOGSTASH_PORT = 5044

logstash_handler = AsynchronousLogstashHandler(
    LOGSTASH_HOST,
    LOGSTASH_PORT,
    database_path=LOGSTASH_DB_PATH,
    transport=LOGSTASH_TRANSPORT,
)
logstash_handler.formatter = FlaskLogstashFormatter(metadata={"beat": "myapp"})
app.logger.addHandler(logstash_handler)

@app.route('/')
def hello_world():  
    app.logger.info("Hello there")
    return 'Hello, World!'

Configure The Logstash Server

I run another CentOS 8 Vagrant instance with the IP address 192.168.200.19.

CentOS 8 quick setup:

sudo dnf install epel-release
sudo dnf install vim-enhanced screen

Logstash requires Java 8 or 11.

Install Java 11.

sudo dnf install java-11-openjdk

Import Elasticsearch GPG key.

sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Create a DNF repository for Elasticsearch OSS packages

Create the file /etc/yum.repos.d/elasticsearch-oss-7.x.repo and put the following contents in it:

[elasticsearch-oss-7.x]
name=Elasticsearch repository for oss-7.x packages
baseurl=https://artifacts.elastic.co/packages/oss-7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Install the Logstash OSS package:

sudo dnf install logstash-oss

Start and enable the logstash systemd unit:

sudo systemctl enable logstash
sudo systemctl start logstash

Configure the logstash pipeline to accept input using the Beats protocol.

Create the file /etc/logstash/conf.d/beats-input.conf and put the following contents in it:

input {
  beats {
    port => 5044
  }
}

To test the Logstash pipeline so far, configure the stdout output. Create the file /etc/logstash/conf.d/stdout.conf and put the following contents in it:

output {
    stdout { codec => rubydebug }
}

Restart logstash:

sudo systemctl restart logstash

Depending on your environment, it may take a few seconds for logstash to start the services. Keep this in mind when testing your pipelines.
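
While testing, it can also help to confirm from the application side that the Beats port is actually accepting connections before you send requests. Here is a small helper I find useful; it is my own sketch, not part of Logstash or python-logstash-async.

import socket
import time

def wait_for_port(host, port, timeout=60):
    # Return once host:port accepts TCP connections; raise after the timeout.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return
        except OSError:
            time.sleep(1)
    raise TimeoutError("{}:{} did not open in time".format(host, port))

wait_for_port("192.168.200.19", 5044)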

In a terminal window, watch the logstash logs via journalctl

sudo journalctl -fu logstash

Send a request to the Flask application again.

In the logstash journal, you should see something like:


Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]: {
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                "logger_name" => "hello",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:         "req_remote_address" => "127.0.0.1",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                    "req_uri" => "http://localhost:5000/",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                    "program" => "/home/vagrant/flask-app/myenv/bin/flask",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                       "line" => 27,
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                   "req_host" => "localhost",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:              "flask_version" => "1.1.2",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                "req_referer" => nil,
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                  "logsource" => "dev-c8-01.lab.gavika.com",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                       "type" => "python-logstash",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                   "@version" => "1",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                       "tags" => [
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:         [0] "beats_input_codec_plain_applied"
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:     ],
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                      "level" => "INFO",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:               "process_name" => "MainProcess",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                "thread_name" => "Thread-6",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:              "req_useragent" => "",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                    "message" => "Hello there",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                       "path" => "/home/vagrant/flask-app/hello.py",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                "interpreter" => "/home/vagrant/flask-app/myenv/bin/python3",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                 "req_method" => "GET",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                  "func_name" => "hello_world",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                       "host" => "dev-c8-01.lab.gavika.com",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                        "pid" => 3482,
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:                 "@timestamp" => 2020-06-01T11:36:51.705Z,
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:     "logstash_async_version" => "1.6.4",
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]:        "interpreter_version" => "3.6.8"
Jun 01 11:36:53 dev-c8-02.labp.gavika.dev logstash[3124]: }

If you see output like this, the connection between the Flask application and Logstash is good.

The next step is to configure the Elasticsearch server and modify the Logstash pipeline.

To keep it simple, I will install and configure Elasticsearch on the same machine as Logstash. In a production environment, you might want to run Elasticsearch on separate servers.

Install the Elasticsearch OSS package.

We already have the prerequisites: a) Java 11, b) the Elasticsearch GPG key, and c) the Elasticsearch OSS repository configuration.

sudo dnf install elasticsearch-oss
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch

By default, Elasticsearch listens on 127.0.0.1. Change it to listen on the IP address 192.168.200.19. Edit /etc/elasticsearch/elasticsearch.yml:

network.host: 192.168.200.19
discovery.seed_hosts: ["192.168.200.19"]
cluster.initial_master_nodes:
  - node1
cluster.name: log-analytics-cluster
node.data: true
node.master: true
node.name: node1

Having Elasticsearch listen on the IP address 192.168.200.19 allows us to query it from another machine on the same network. We have also named the node node1 and listed it under cluster.initial_master_nodes so that this single node can bootstrap the cluster.

Restart the elasticsearch service: sudo systemctl restart elasticsearch

Now, remove the stdout output from the Logstash pipeline: sudo rm /etc/logstash/conf.d/stdout.conf

Add a new output destination in Logstash.

Create the file /etc/logstash/conf.d/elasticsearch-output.conf and put the following contents in it:

output {
  elasticsearch {
    hosts => ["http://192.168.200.19:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}

We are stating that Logstash should send output to the Elasticsearch server hosted at 192.168.200.19 on TCP port 9200. The index name should be %{[@metadata][beat]}-%{+YYYY.MM.dd}. Recall that we added @metadata[beat] in our Flask application with the value myapp. If the log is sent on 2020-06-01, Logstash will send the output to the Elasticsearch index named myapp-2020.06.01. The beat name should be in lowercase. This is a generic Logstash configuration that accepts input on 5044 via the Beats protocol and sends the output to an Elasticsearch index conveniently named using the beat name and the date.
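
To make the index naming concrete, here is the substitution expressed in Python. This is purely an illustration; Logstash performs the substitution itself.

from datetime import date

beat = "myapp"  # the metadata we set in FlaskLogstashFormatter, lowercased
log_date = date(2020, 6, 1)
print("{}-{}".format(beat, log_date.strftime("%Y.%m.%d")))  # myapp-2020.06.01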

Restart the logstash service.

Send the request to the Flask application again.

List the indices in Elasticsearch.

curl '192.168.200.19:9200/_cat/indices'

On my server, I see something like:

yellow open myapp-2020.06.01 PU8K4humTnWrvAg6OUiEOw 1 1 1 0 18kb 18kb

This means there is an index called myapp-2020.06.01 on my Elasticsearch server.

Search for the log messages

curl '192.168.200.19:9200/_search?pretty&q=message:hello'

I see a response like this:

{
  "took" : 17,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.18232156,
    "hits" : [
      {
        "_index" : "myapp-2020.06.01",
        "_type" : "_doc",
        "_id" : "WLHHb3IBj0Bi87BR6NBQ",
        "_score" : 0.18232156,
        "_source" : {
          "logger_name" : "hello",
          "req_uri" : "http://localhost:5000/",
          "interpreter_version" : "3.6.8",
          "path" : "/home/vagrant/flask-app/hello.py",
          "process_name" : "MainProcess",
          "thread_name" : "Thread-7",
          "req_host" : "localhost",
          "host" : "dev-c8-01.lab.gavika.com",
          "flask_version" : "1.1.2",
          "interpreter" : "/home/vagrant/flask-app/myenv/bin/python3",
          "message" : "Hello there",
          "tags" : [
            "beats_input_codec_plain_applied"
          ],
          "@timestamp" : "2020-06-01T12:07:50.063Z",
          "pid" : 3482,
          "func_name" : "hello_world",
          "line" : 27,
          "level" : "INFO",
          "logsource" : "dev-c8-01.lab.gavika.com",
          "req_remote_address" : "127.0.0.1",
          "req_referer" : null,
          "req_useragent" : "",
          "logstash_async_version" : "1.6.4",
          "program" : "/home/vagrant/flask-app/myenv/bin/flask",
          "req_method" : "GET",
          "@version" : "1",
          "type" : "python-logstash"
        }
      },
      {
        "_index" : "myapp-2020.06.01",
        "_type" : "_doc",
        "_id" : "e6jTb3IB_6K0Q_DqeqUM",
        "_score" : 0.18232156,
        "_source" : {
          "logger_name" : "hello",
          "req_uri" : "http://localhost:5000/",
          "interpreter_version" : "3.6.8",
          "path" : "/home/vagrant/flask-app/hello.py",
          "process_name" : "MainProcess",
          "thread_name" : "Thread-8",
          "req_host" : "localhost",
          "host" : "dev-c8-01.lab.gavika.com",
          "flask_version" : "1.1.2",
          "interpreter" : "/home/vagrant/flask-app/myenv/bin/python3",
          "message" : "Hello there",
          "tags" : [
            "beats_input_codec_plain_applied"
          ],
          "@timestamp" : "2020-06-01T12:20:28.830Z",
          "pid" : 3482,
          "func_name" : "hello_world",
          "line" : 27,
          "level" : "INFO",
          "logsource" : "dev-c8-01.lab.gavika.com",
          "req_remote_address" : "127.0.0.1",
          "req_referer" : null,
          "req_useragent" : "",
          "logstash_async_version" : "1.6.4",
          "program" : "/home/vagrant/flask-app/myenv/bin/flask",
          "req_method" : "GET",
          "@version" : "1",
          "type" : "python-logstash"
        }
      }
    ]
  }
}
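
To run the same search from another application, here is a hedged sketch using the third-party Python requests package, assuming the same host and query as the curl command above:

import requests

resp = requests.get(
    "http://192.168.200.19:9200/_search",
    params={"q": "message:hello"},
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"]["@timestamp"], hit["_source"]["message"])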

Bonus: view the logs in Kibana.

Install Kibana.

sudo dnf install kibana-oss
sudo systemctl start kibana
sudo systemctl enable kibana

Edit /etc/kibana/kibana.yml and add the following contents:

server.host: "192.168.200.19"
elasticsearch.hosts: ["http://192.168.200.19:9200"]

Restart Kibana.

sudo systemctl restart kibana

In your browser, visit http://192.168.200.19:5601/. Click Visualize. Kibana will prompt you to create an index pattern. In the search box, type:

myapp-*

Then click Next Step. In the Time Filter field name, select @timestamp. Click Create Index Pattern.

Click Discover. You should see your logs on the screen.

Using TLS with Logstash

If you want to use TLS with Logstash, modify your Flask application like this:

LOGSTASH_HOST = "192.168.200.19"
LOGSTASH_DB_PATH = "/home/vagrant/app-data/flask_logstash.db"
LOGSTASH_TRANSPORT = "logstash_async.transport.BeatsTransport"
LOGSTASH_PORT = 5044

LOGSTASH_SSL_ENABLE = True
LOGSTASH_SSL_VERIFY = True
LOGSTASH_CA_CERT_FILE_PATH = "/home/vagrant/logstash.crt"

logstash_handler = AsynchronousLogstashHandler(
    LOGSTASH_HOST,
    LOGSTASH_PORT,
    database_path=LOGSTASH_DB_PATH,
    transport=LOGSTASH_TRANSPORT,
    ssl_enable=LOGSTASH_SSL_ENABLE,
    ssl_verify=LOGSTASH_SSL_VERIFY,
    ca_certs=LOGSTASH_CA_CERT_FILE_PATH,
)

Note that the certificate file logstash.crt has to be made available on the server where the Flask application runs.

Production Environment

This blog post explains how to set up a log pipeline from a Flask application to Elasticsearch via Logstash. In a production environment, the setup should be improved in many ways. Here are a few things you might consider for production:

Introduction To Vagrant

written by Sudheer Satyanarayana on 2019-08-26

Introduction

Vagrant is a glue tool to build and manage virtual environments. You might have been using libvirt, virt-manager, or similar tools to build and manage virtual guest operating systems. Vagrant provides a better experience for doing the same. I recommend adding Vagrant to the developer's and DevOps consultant's toolbox.

To quote HashiCorp: "HashiCorp Vagrant provides the same, easy workflow regardless of your role as a developer, operator, or designer. It leverages a declarative configuration file which describes all your software requirements, packages, operating system configuration, users, and more."

On your host machine, install Vagrant and libvirt. Vagrant works with many providers, such as libvirt and VirtualBox. My favorite is libvirt. The integration between Vagrant and libvirt is provided by vagrant-libvirt.

Without tools like Vagrant, if you are manually installing virtual guest machines, you have to bear several burdens:

  1. No easy way to replicate guests on multiple devices. If you use a laptop and a desktop, porting the environment is error-prone and adds some manual steps.
  2. No easy way to destroy and re-create the guest machines with a desired state.
  3. Manually configure the guest networking and storage.
  4. No version control. It's hard or impossible to go back to a previous version. Branching is probably hard to even imagine.
  5. Sharing the development environment with others is probably reduced to writing a set of instructions in a document.

Installing Vagrant

Fedora 30:

sudo dnf install vagrant vagrant-libvirt

Ubuntu 18.04:

sudo apt install vagrant vagrant-libvirt

First Vagrant Example

Here's a sample Vagrantfile to bring up a CentOS 7 virtual guest. Create the file Vagrantfile and add the following contents:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"
  config.vm.hostname = "mycoolguest.example.com"
  config.vm.post_up_message = "Happy development"
end

If you do not know Ruby, don't freak out. You don't need to know Ruby to write beginner-to-intermediate Vagrantfiles. For most use cases, you can simply adjust the templates from this blog post or other resources.

In the Vagrantfile, we are describing our guest virtual machine. We want to use the image centos/7. These images are hosted on Vagrant Cloud. We assign the hostname mycoolguest.example.com to our guest OS. Once the guest is booted up, Vagrant will show the post_up_message.

To bring up the guest virtual machine:

sudo vagrant up

When you run this command for the first time, Vagrant creates a .vagrant directory. If you are using Git, make sure to add that path to your .gitignore; we don't want the .vagrant directory in version control.

To ssh onto the guest, cd to the directory containing the Vagrantfile:

cd DIRECTORY_CONTAINING_Vagrantfile
sudo vagrant ssh

That's all it takes to bring up a virtual guest. Pull this Vagrantfile from your version control on another device and run vagrant up, and you will have your new guest VM in a few seconds or minutes, depending on factors such as the availability of a cached image and your hardware performance.

Removing The Need To Use Sudo By Adding A Polkit Policy

By default, the host OS will require you to use sudo for commands like vagrant up. You can modify the behavior by adding a policy in Polkit. Create the file /etc/polkit-1/localauthority/50-local.d/vagrant.pkla and add the following contents:

[Allow youruser libvirt management permissions]
Identity=unix-user:YOUR_USERNAME
Action=org.libvirt.unix.manage
ResultAny=yes
ResultInactive=yes
ResultActive=yes

Replace YOUR_USERNAME with your Linux username. Henceforth, you don't have to use sudo for commands like vagrant up.

Another Vagrant Example: Ubuntu 18.04

Play with the Vagrantfile to understand it better. For example, try changing the image to Ubuntu:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu1804"
  config.vm.hostname = "mycoolguestubuntu.example.com"
  config.vm.post_up_message = "Happy development"
end

Sharing Files Between Host And Guest

Vagrant provides a nifty technique to share files between the host and the guest. You can make a directory on the host available to the guest. Vagrant offers a few options for sharing files; in this post, I will provide an example using the vagrant-sshfs plugin.

If your OS provides the vagrant-sshfs package, you should use it.

On Fedora:

sudo dnf install vagrant-sshfs

Alternatively, you can install the plugin using vagrant plugin install command:

vagrant plugin install vagrant-sshfs

Add this block of code to your Vagrantfile:

  config.vm.provider "libvirt" do |lvt, override|
    override.vm.synced_folder ".", "/vagrant", type: "sshfs"
  end

We're overriding the default behavior: if the provider is libvirt, we want the current directory on the host, represented by ., to be mounted at /vagrant on the guest using the sshfs network filesystem. Here's the complete Vagrantfile:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu1804"
  config.vm.hostname = "mycoolguestubuntu.example.com"
  config.vm.post_up_message = "Happy development"

  config.vm.provider "libvirt" do |lvt, override|
    override.vm.synced_folder ".", "/vagrant", type: "sshfs"
  end

end

Reload the guest:

vagrant reload

SSH onto the guest and list the contents of /vagrant

vagrant ssh
ls /vagrant/

You should also see the mounted filesystem if you execute the command mount on the guest.

Provisioning: Shell

Vagrant has a concept of provisioners. Simply put, a provisioner executes your programs after booting up the guest for the first time. The shell provisioner executes your shell script, and the Ansible provisioner executes your playbook. There are other provisioners available. Let's work through a small exercise: installing Apache.

  config.vm.provision "shell", path: "script.sh"

Create the file script.sh in the same directory as Vagrantfile and add the following contents:

#!/bin/bash
apt install apache2 -y

When you run vagrant up for the first time, the provisioners are executed. If you modify the Vagrantfile later, you have to run the provisioners manually.

Here's the complete Vagrantfile with the shell provisioner that installs apache2:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu1804"
  config.vm.hostname = "mycoolguestubuntu.example.com"
  config.vm.post_up_message = "Happy development"

  config.vm.network "private_network", ip: "192.168.12.12"
  config.vm.provision "shell", path: "script.sh"

  config.vm.provider "libvirt" do |lvt, override|
    override.vm.synced_folder ".", "/vagrant", type: "sshfs"
    lvt.qemu_use_session = false
  end

end

Execute the shell provisioner manually:

vagrant up --provision

When you log on to the guest, you will see that Apache is installed. It is a good idea to use idempotent scripts in provisioners. Idempotent means the script should not change the state of the machine if it is executed again. In our example, when the provisioner is executed for the first time, the apache2 package is installed. If the provisioner is executed again, apt sees that the package is already installed and takes no action. Be aware of this when you are doing things such as writing to a file: when the provisioner is executed a second time, your script might write to the file a second time. In such cases, you probably want to write to the file conditionally.

Provisioning: Ansible

Vagrant also has a built-in Ansible provisioner. There are two ways to run the Ansible playbook:

  1. Using the Ansible executable on the host
  2. Installing and using the Ansible executable on the guest.

Let's build a small example that installs the package htop. The Vagrantfile below points the Ansible provisioner at playbook.yml, which would contain the task that installs htop:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu1804"
  config.vm.hostname = "mycoolguestubuntu.example.com"
  config.vm.post_up_message = "Happy development"

  config.vm.network "private_network", ip: "192.168.12.12"
  config.vm.provision "shell", path: "script.sh"

  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
  end

  config.vm.provider "libvirt" do |lvt, override|
    override.vm.synced_folder ".", "/vagrant", type: "sshfs"
    lvt.qemu_use_session = false
  end

end

Run the provisioner again and see if the package htop is installed.

Networking

Vagrant allows you to assign static IP addresses to the guest. This is accomplished by adding one line of definition in Vagrantfile.

  config.vm.network "private_network", ip: "192.168.12.12"

There are a couple of caveats here. If you are using libvirt, set qemu_use_session to false. The second caveat: don't assign the IP address x.x.x.1 to the guest, since x.x.x.1 is typically reserved for gateways. Our Ubuntu example looks like this after adding networking:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu1804"
  config.vm.hostname = "mycoolguestubuntu.example.com"
  config.vm.post_up_message = "Happy development"

  config.vm.network "private_network", ip: "192.168.12.12"

  config.vm.provider "libvirt" do |lvt, override|
    override.vm.synced_folder ".", "/vagrant", type: "sshfs"
    lvt.qemu_use_session = false
  end

end

Inserting Your SSH Public Key To The Guest

You can ssh onto the guest machine by executing vagrant ssh. But what if you want to ssh to the guest directly, using the ssh client? Using the file provisioner, you can insert your SSH public key. Here's an example:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu1804"
  config.vm.hostname = "mycoolguestubuntu.example.com"
  config.vm.post_up_message = "Happy development"

  config.vm.network "private_network", ip: "192.168.12.12"
  config.vm.provision "shell", path: "script.sh"

  config.ssh.insert_key=false
  config.ssh.private_key_path = ['~/.vagrant.d/insecure_private_key', '~/.ssh/id_rsa']
  config.vm.provision "file", source: "~/.ssh/id_rsa.pub", destination: "~/.ssh/authorized_keys" 

  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
  end

  config.vm.provider "libvirt" do |lvt, override|
    override.vm.synced_folder ".", "/vagrant", type: "sshfs"
    lvt.qemu_use_session = false
  end

end

We are using three configuration lines to accomplish our goal. First, we instruct Vagrant not to insert a generated keypair; this results in Vagrant using the default insecure key. Second, we pass a list of SSH private key file paths to be used to ssh onto the guest machine; Vagrant uses the default insecure key to bootstrap the machine. Third, we instruct Vagrant to copy the file ~/.ssh/id_rsa.pub on the host to the guest path ~/.ssh/authorized_keys. At this point, the guest allows only our own private key for SSH authentication.

Now you can ssh in with your own private key:

ssh vagrant@192.168.12.12

The default way of sshing onto the guest also works:

vagrant ssh

Hopefully, you will use this knowledge to build and maintain development and testing environments and share them with others.

AWS Certified Solutions Architect - Associate

written by Sudheer Satyanarayana on 2019-08-01

AWS Certified Solutions Architect - Associate is one of the most sought-after certifications in the IT industry.

Here's a few tips for those seeking this certification.

Background Knowledge And Experience

AWS recommends "at least one year of hands-on experience designing available, cost-efficient, fault-tolerant, and scalable and distributed systems on AWS." In other words, if you recently started your career in IT, there are a few things you have to do before you start preparing for this certification.

Prerequisites

Suggested Plan

Should you use a course?

Video courses help a lot. If you have a membership at Linux Academy or A Cloud Guru, then in addition to the Certified Solutions Architect - Associate courses, go through the videos of the Certified Developer - Associate and Certified SysOps Administrator - Associate courses. They have a huge overlap, and this helps you gain a better understanding of AWS services. Also, after completing the CSA exam, you can quickly prepare for the Certified Developer - Associate and Certified SysOps Administrator - Associate exams.

Should you use dumps?

Dumps are huge lists of questions and answers compiled by third parties. Do not waste your time with dumps. Instead, focus on gaining first-hand experience by doing things yourself. Rather than memorizing which AWS service to pick from a dump's question-and-answer perspective, do the research yourself. Read the AWS documentation and find out which service you would use in a particular situation. To understand the various contexts of these questions, read the AWS FAQs. You can also get a sense of the question contexts from the Well-Architected Framework. Take one or two official practice exams; they give you a good overview of what to expect in the exam.

Why this difficult path?

It won't be unfair to dub this plan "getting AWS certified the hard way". You could probably pass the certification exam by taking a video course and reading some dumps. That can give you the certification, but not the knowledge and experience required to provide solutions to real-world problems. The idea behind this suggested learning path is to improve the depth and breadth of your knowledge, as opposed to a shallow understanding of the relevant subjects.

How long is the preparation time?

The answer is subjective. It depends on your prior experience and your ability to learn new things; some people are fast and some are slow. If you have a full-time day job and decide to utilize your spare time on weekdays and weekends, it will probably take somewhere between six months and a year, or more. Don't be disheartened by the length of the preparation time. In the end, it will definitely pay off.

Run Your Own OpenVPN Server

written by Sudheer Satyanarayana on 2019-06-30

Introduction

This article explains how to run your own OpenVPN server. We will set up a Certificate Authority server and an OpenVPN server, generate certificates for the clients, and learn how to manage revocation of client certificates using the Ansible roles.

Use the Ansible roles gavika.openvpn and gavika.easy_rsa to install and configure your OpenVPN server.

You can install the OpenVPN server on any public cloud or hosting provider or on-premise servers. The Ansible roles are designed to install the OpenVPN server and a Certificate Authority server.

At the moment these Ansible roles support Ubuntu 18.04 and CentOS 7.

System Architecture And Requirements

In order to run your OpenVPN server via these Ansible roles, you will need three machines:

  1. Controller machine. This is the machine from which you execute the Ansible playbooks. This could be your laptop or a machine in the cloud. You will designate a directory on this machine as a temporary pool of files.
  2. Certificate Authority server. You will create your own CA machine that signs the certificate requests. You will need SSH access to this machine from the controller machine. You only need to turn this server on when required; it is recommended to shut down the CA server when not in use, which improves security and also saves cost.
  3. OpenVPN server. You will create your OpenVPN server on this machine. You will need SSH access to this machine from the controller machine. You will also need to ensure that UDP port 1194 is open on this machine. The Ansible playbook takes care of enabling the port on the machine itself; you are responsible for opening the ports on the network firewall (such as AWS security groups, or an on-premise hardware or software firewall). You will also have to adjust your network firewalls if you change the defaults in the Ansible playbook or inventory.

In addition to SSH access, the servers require a user with administrative privileges via sudo. Typically, cloud images provide such a user account. On AWS, the user is typically called ubuntu on Ubuntu images and centos on CentOS images. If you do not have such a username, create one. There's an Ansible role to create administrative user accounts too.

Once you have provisioned the servers, proceed to create the Ansible playbooks.

Installing The Ansible Roles

Our roles require Ansible 2.8 or higher. Ensure that the required version of Ansible is installed. If not, follow the instructions to install Ansible.

Create a directory to store the playbooks and inventory.

mkdir my-openvpn-server-orchestration
cd my-openvpn-server-orchestration

I created a directory called my-openvpn-server-orchestration; you can name it whatever you want.

The next step is to install the Ansible roles from Ansible Galaxy.

ansible-galaxy install gavika.easy_rsa
ansible-galaxy install gavika.openvpn

If your target OS is CentOS, install the centos_base role too:

ansible-galaxy install bngsudheer.centos_base

Preparing Ansible Inventory

Create the file inventory.yml and add the following contents:

all:
  hosts:
    placeholder
  children:
    ca_server:
      hosts:
        dev-ca-01.example.com:
          ansible_become: true
          ansible_user: ubuntu
          ansible_host: 192.168.1.10
          easy_rsa_ca_server_mode: true
          ansible_python_interpreter: /usr/bin/python3
    openvpn_server:
      hosts:
        dev-vpn-01.example.com:
          ansible_python_interpreter: /usr/bin/python3
          ansible_become: true
          ansible_user: ubuntu
          ansible_host: 192.168.1.11
          openvpn_server_ip_address: 192.168.1.11

I prefer the YAML-formatted Ansible inventory file; your mileage may vary. If you are using the INI format for your inventory file, make sure to port the examples as required.

dev-ca-01.example.com is our CA server and dev-vpn-01.example.com is our OpenVPN server. We specify the IP addresses of these hosts in case DNS is not set up yet. If DNS resolves to the correct IP addresses, you can remove the ansible_host key. Specifying ansible_host is especially useful in test environments without a proper DNS system.

In this example, we are using Ubuntu 18.04 for both the CA and OpenVPN servers. Ansible connects to these servers with the username ubuntu. We also tell Ansible to use the Python interpreter at /usr/bin/python3; if Python 2 is installed on the servers, you don't have to mention the interpreter path. We also state in our inventory, via ansible_become, that Ansible should use sudo.

If your OS has a different administrative user, adjust the value of ansible_user. If the target host has Python 2 installed, remove the key ansible_python_interpreter from your inventory.

Notice that the IP address of the OpenVPN server is mentioned in both ansible_host and openvpn_server_ip_address. ansible_host is used by Ansible to connect to the server via SSH; openvpn_server_ip_address is used when generating the client configuration.

Preparing The OpenVPN Server

Create the file openvpn-server.yml with the following contents if your target host is Ubuntu 18.04:

---
- hosts: openvpn_server
  vars:
    openvpn_client_users:
      - janedoe
      - johndoe
    openvpn_generated_configurations_local_pool: true
    easy_rsa_req_country: "IN"
    easy_rsa_req_province: "KA"
    easy_rsa_req_city: "Bangalore"
    easy_rsa_req_org: "My Organization"
    easy_rsa_req_email: "admin@example.com"
    easy_rsa_req_ou: "My Organization Unit"
    easy_rsa_local_pool_directory: /tmp/ca_openvpn_pool_example
  roles:
    - role: gavika.easy_rsa
    - role: gavika.openvpn

If your target host is CentOS, ensure EPEL is enabled. Edit your openvpn-server.yml as below:

---
- hosts: openvpn_server
  vars:
    centos_base_enable_epel: true
    openvpn_client_users:
      - janedoe
      - johndoe
    openvpn_generated_configurations_local_pool: true
    easy_rsa_req_country: "IN"
    easy_rsa_req_province: "KA"
    easy_rsa_req_city: "Bangalore"
    easy_rsa_req_org: "My Organization"
    easy_rsa_req_email: "admin@example.com"
    easy_rsa_req_ou: "My Organization Unit"
    easy_rsa_local_pool_directory: /tmp/ca_openvpn_pool_example
  roles:
    - role: bngsudheer.centos_base
    - role: gavika.easy_rsa
    - role: gavika.openvpn

We are specifying that we want to create two client users, janedoe and johndoe. We also specify the variables for the EasyRSA Public Key Infrastructure. On the OpenVPN server, we also set up a PKI, but not in CA mode; we use the PKI on this server to generate certificate requests and to store the client configurations. Certificate signing is done on the CA server.

Setting openvpn_generated_configurations_local_pool to true causes the generated client configurations to be copied to the local pool. We also ensure that easy_rsa_local_pool_directory is set to the same value as in our ca-server.yml playbook.

In this playbook, we are executing two roles. gavika.easy_rsa to setup PKI and gavika.openvpn to setup OpenVPN server.

Run the playbook:

ansible-playbook -i inventory.yml openvpn-server.yml --private-key /path/to/my/private/key

At this point, you should see the file server.req in the path /tmp/ca_openvpn_pool_example/server/ in the local pool. You should also see janedoe.req and johndoe.req in /tmp/ca_openvpn_pool_example/client/ in the local pool.

Preparing The CA Server

Create the file: ca-server.yml

---
- hosts: ca_server
  vars:
    easy_rsa_req_country: "IN"
    easy_rsa_req_province: "KA"
    easy_rsa_req_city: "Bangalore"
    easy_rsa_req_org: "Example"
    easy_rsa_req_email: "admin@example.com"
    easy_rsa_req_ou: "Example"
    easy_rsa_local_pool_directory: /tmp/ca_openvpn_pool_example
    easy_rsa_ca_server_mode: true
  roles:
    - role: gavika.easy_rsa

We want to run the playbook on the host group ca_server, which is exactly what we have in our inventory. The vars section has a series of variables used in the certificates; adjust them to your liking. Some files have to be transferred between the CA server and the OpenVPN server. For this purpose, we use a directory on the controller machine (the machine on which you execute the Ansible playbooks, probably your laptop, a bastion host, or a management host). In our example, we use /tmp/ca_openvpn_pool_example as the pool. You are free to choose a different directory.

Setting easy_rsa_ca_server_mode to true makes this server a Certificate Authority.

Just like we did for the OpenVPN playbook, adjust the CA playbook for CentOS 7:

---
- hosts: ca_server
  vars:
    centos_base_enable_epel: true
    easy_rsa_req_country: "IN"
    easy_rsa_req_province: "KA"
    easy_rsa_req_city: "Bangalore"
    easy_rsa_req_org: "Example"
    easy_rsa_req_email: "admin@example.com"
    easy_rsa_req_ou: "Example"
    easy_rsa_local_pool_directory: /tmp/ca_openvpn_pool_example
    easy_rsa_ca_server_mode: true
  roles:
    - role: bngsudheer.centos_base
    - role: gavika.easy_rsa

Execute the playbook:

ansible-playbook -i inventory.yml ca-server.yml --private-key /path/to/my/private/key

/path/to/my/private/key is your SSH private key used to connect to the CA server.

If the playbook ran successfully, your CA server is set up. At this point, you should see the file ca.crt in /tmp/ca_openvpn_pool_example/.

The certificate signing request for the server, server.req, is uploaded to the CA server. The CA server imports the request and signs it. The signed certificate is copied back to the local pool; you should see server.crt in /tmp/ca_openvpn_pool_example/issued/server/.

Execute the openvpn-server.yml playbook again:

ansible-playbook -i inventory.yml openvpn-server.yml --private-key /path/to/my/private/key

This time, the openvpn service will be started. The playbook execution will also copy the generated client configuration files to /tmp/ca_openvpn_pool_example/generated/.

Connect To The OpenVPN Server

The gavika.openvpn role generates three files for each user.

Install the openvpn package on the client machine:

Fedora:

sudo dnf install openvpn

Ubuntu:

sudo apt install openvpn

Example command to connect to the OpenVPN server from a Fedora client:

 sudo openvpn --config /tmp/ca_openvpn_pool_example/generated/janedoe/janedoe-el.ovpn

Example command to connect to the OpenVPN server from an Ubuntu client:

 sudo openvpn --config /tmp/ca_openvpn_pool_example/generated/janedoe/janedoe.ovpn

If you see a message like:

Tue Jul  2 00:34:37 2019 Initialization Sequence Completed

then you have connected successfully. Try browsing the Internet from your browser, or just check your Internet-routed IP address from the command line:

curl http://api.ipify.org

The output should show your OpenVPN server's IP address.

Revoking Certificates

If you want to revoke access to a client, edit your ca-server.yml playbook and include the list of clients to be revoked:

---
- hosts: ca_server
  vars:
    ...
    easy_rsa_revoke_clients:
      - janedoe
  roles:
    - role: gavika.easy_rsa

In this example, we are revoking the certificate for the client janedoe. Next step is to run the CA playbook:

ansible-playbook -i inventory.yml ca-server.yml --private-key /path/to/my/private/key

When the playbook finishes executing, you should see the file crl.pem in /tmp/ca_openvpn_pool_example/crl/ directory of the local pool.

Next, we run the OpenVPN playbook to update the Certificate Revocation List:

ansible-playbook -i inventory.yml openvpn-server.yml --private-key /path/to/my/private/key

After the playbook executes successfully, the client janedoe won't be able to connect to the OpenVPN server.

Routing

You can configure your OpenVPN server to:

  1. route all traffic via the OpenVPN server
  2. route traffic via the OpenVPN server to specific IP addresses or networks.

If you want to route traffic to specific networks, change the Ansible variables like below:

openvpn_route_all_traffic: false
openvpn_additional_configs:
   - push: "topology subnet"
   - push: "route 192.168.4.5 255.255.255.255"
   - push: "route 192.168.4.6 255.255.255.255"

Setting openvpn_route_all_traffic to false removes the redirect-gateway directive and the def1 and bypass-dhcp flags from the OpenVPN server configuration. openvpn_additional_configs allows you to write additional OpenVPN server configuration. In our example, we set two such additional configuration lines. Each push line ensures that the client uses the OpenVPN connection to reach the corresponding IP address. In this example, when the client tries to reach the IP address 192.168.4.5 or 192.168.4.6, it uses the OpenVPN connection.

Creating Administrative Linux User Accounts: gavika.administrators

written by Sudheer Satyanarayana on 2019-06-10

We are pleased to announce gavika.administrators.

The Ansible role provides a declarative method to create Linux user accounts with administrative privileges. In other words, these users have passwordless sudo access and are empowered to run all commands on the system.

You might be wondering why you would need a role when you could write a couple of tasks yourself in an Ansible playbook. The reason is: Don't Repeat Yourself (DRY). Instead of writing such playbook tasks over and over, use the abstraction provided by the role. You just have to write some YAML declarations and be done with it. Moreover, the maintenance is outsourced to an Apache-licensed open source project. The role has Molecule tests to boost your confidence.

Here's an example:

  - hosts: servers  
    vars:
      - administrators_names: ['admin01', 'admin02']
      - administrators_keys:
          - username: admin01
            key: /path/to/id_rsa_pub_admin01
    roles:
       - role: gavika.administrators

This playbook will create the users admin01 and admin02. After creating the users, sudoers configuration is added to empower these users to run any command with sudo and without a password. In addition, the public key from the file /path/to/id_rsa_pub_admin01 is added to the authorized_keys file of admin01.

How To Determine Your Public IP Address Programmatically

written by Sudheer Satyanarayana on 2019-03-30

Short answer: use ipify

ipify provides a simple public address API.

Using the tool, you can determine your public IP address programmatically. If you are using shell:

curl 'https://api.ipify.org'

Using it in a shell script:

my_ip=$(curl 'https://api.ipify.org' -s)
echo $my_ip
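
The same lookup from Python, using the third-party requests package:

import requests

my_ip = requests.get("https://api.ipify.org", timeout=20).text
print(my_ip)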

Using the Ansible ipify module:

- hosts: localhost
  tasks:
    - name: Get my public IP
      ipify_facts:
        timeout: 20
      delegate_to: localhost
      register: public_ip
    - name: output
      debug: msg="{{ ipify_public_ip }}"

Sample output of Ansible playbook execution:

ansible-playbook ipify.yml 
 [WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'


PLAY [localhost] **************************************************************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
ok: [localhost]

TASK [Get my public IP] *******************************************************************************************************************************************************************************************
ok: [localhost -> localhost]

TASK [output] *****************************************************************************************************************************************************************************************************
ok: [localhost] => {
    "msg": "49.206.13.205"
}

PLAY RECAP ********************************************************************************************************************************************************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0

Gavika Ansible Roles

written by Sudheer Satyanarayana on 2019-03-27

Yesterday, we announced the launch of the Ansible role to install and configure the AWS CloudWatch Agent.

You might have seen my other open source Ansible roles on Ansible Galaxy and GitHub.

In the same spirit, the company, Gavika Information Technologies Pvt. Ltd., Bangalore, has started publishing open source projects on GitHub. The Ansible role to install and configure the AWS CloudWatch Agent is the first project. Expect more projects in the future.

These are some of the guidelines for the Ansible role projects that Gavika follows.

Installing AWS CloudWatchAgent On EC2 Instance Via Ansible

written by Sudheer Satyanarayana on 2019-03-26

Install the Ansible role gavika.aws_cloudwatchagent via Ansible Galaxy:

ansible-galaxy install gavika.aws_cloudwatchagent

Create The Playbook File - cw-play.yml:

---
- hosts: all
  become: true
  roles:
    - role: gavika.aws_cloudwatchagent

Prepare the AWS CloudWatch Agent configuration - aws-cw-config.json:

{
    "metrics": {
        "namespace": "gavika",
        "metrics_collected": {
            "cpu": {
                "measurement": [
                    "cpu_usage_idle",
                    "cpu_usage_iowait",
                    "cpu_usage_user",
                    "cpu_usage_system"
                ],
                "metrics_collection_interval": 360,
                "resources": [
                    "*"
                ],
                "totalcpu": false
            },
            "disk": {
                "measurement": [
                    "used_percent",
                    "inodes_free"
                ],
                "metrics_collection_interval": 360,
                "resources": [
                    "*"
                ]
            },
            "diskio": {
                "measurement": [
                    "io_time"
                ],
                "metrics_collection_interval": 360,
                "resources": [
                    "*"
                ]
            },
            "mem": {
                "measurement": [
                    "mem_used_percent"
                ],
                "metrics_collection_interval": 360
            },
            "swap": {
                "measurement": [
                    "swap_used_percent"
                ],
                "metrics_collection_interval": 360
            }
        }
    }
}

In this example, I am using the namespace gavika. Feel free to change it. We collect the cpu, disk, diskio, mem, and swap metrics. The agent will send these metrics once every 360 seconds.

Run The Playbook (CentOS):

ansible-playbook -i centos@myserver.example.com, cw-play.yml

The target machine is a CentOS server, hence the centos username. I am passing the server name inline; in a production system, you might have a well-defined inventory file. Change the command to suit your needs.

Run The Playbook (Ubuntu)

ansible-playbook -i ubuntu@myserver.example.com, cw-play.yml -e ansible_python_interpreter=/usr/bin/python3 -vvv

The Ubuntu server has Python 3 by default, while Ansible expects Python 2 by default. Therefore, I pass the ansible_python_interpreter extra variable on the command line.

After the playbook executes successfully, you should see the metrics in AWS CloudWatch under the namespace specified.