InfluxDB + Grafana for the IoT

Hello everyone, it has been a while since our last post. The Kickstarter campaign has proven to be quite demanding, but things are getting back to normal and new posts will keep appearing.

Now that we have an official product/platform, the plan is to start creating practical posts, complemented by video blogs showing real use cases step by step. Stay tuned!

Moving on, this post presents a very promising solution to store, present and analyse time series data, perfect for saving all the data the IoT produces.


Time Series Data

Before anything else, it’s important to understand what time series data is and how/where it can be used. Well, as the name says, all data has time as its primary key, and measurements collected from sensors and actuators are the perfect use case.

Let’s have a look at a temperature measurement, the “Hello World” of all sensors! The data itself is very simple; it can be, for example, 21.7 (representing degrees Celsius). Although this single value would be enough to “drive” a fan or heater in a standalone project, the temperature value by itself is useless in a bigger solution where multiple sensors and actuators interact with each other.

Also, without additional context, the data can’t be used to produce statistics or provide insights. In this case it’s necessary to append extra information to each measurement, like a timestamp and tags. The timestamp is pretty standard: just the time when the temperature reading was collected. The tags are the extra bits which help to filter and enhance the measurement, for example, the name and location of the node the information is coming from, the measurement unit used, the sensor type, etc.

In a standard SQL database, this information would be stored in multiple tables, with a few constraints for data integrity. But SQL databases, although very powerful, were designed for CRUD operations (Create, Read, Update and Delete), which results in poor performance for this kind of workload.

Here the main goal is “write once, read many”. After a sensor is read and the information is collected, there’s no need to update it, making time series and NoSQL databases perfect candidates, providing incredible speed, horizontal scaling and great schema flexibility!

Note that a fully implemented solution might not be exclusively based on NoSQL storage. It’s very likely both models will be implemented together, as some parts of the system require full CRUD operations, like a table containing users and passwords, or the node inventory.

SQL Tables

Table: Nodes

id  Name        Location
10  FridgeNode  2ndShelf

Table: Temperature

id  id_node*  Timestamp            Value  Unit
1   10        2016-05-21 10:00:01  5.03   Celcius
2   10        2016-05-21 10:05:01  5.20   Celcius
3   10        2016-05-21 10:10:01  5.25   Celcius
4   10        2016-05-21 10:15:01  5.40   Celcius

*id_node is a Foreign Key to Nodes.id

Time Series

Measurement for Temperature
temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.03 1463824801000000000
temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.20 1463825101000000000
temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.25 1463825401000000000
temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.40 1463825701000000000

Most NoSQL DBs are schema-free. All information is stored in a flat model, repeating details, in this case the tags, for every entry. This might sound wasteful because more storage is required, but on the other hand performance and flexibility are greater. Some NoSQL databases call each record a “Document”, made of multiple key=value pairs. In InfluxDB, they’re simply called measurements, and the timestamp is always present, in nanoseconds.
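To make the flat model concrete, the entries above can be produced programmatically. This is a minimal Python sketch of the “measurement,tags fields timestamp” layout (the helper name `to_line_protocol` is just an illustration, not part of any InfluxDB client):

```python
# Sketch: building an InfluxDB line-protocol entry in Python.
# Tag and field names mirror the example entries above.

def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one reading as a line-protocol string."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"

line = to_line_protocol(
    "temperature",
    {"node": "MyFridge", "location": "2ndShelf", "unit": "Celcius"},
    {"value": 5.03},
    1463824801000000000,
)
print(line)
# temperature,location=2ndShelf,node=MyFridge,unit=Celcius value=5.03 1463824801000000000
```

Note the tags are sorted alphabetically here; InfluxDB accepts any order, but a stable order makes the payloads easier to eyeball.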

InfluxDB

This post is not meant to be a tutorial on how to install and configure InfluxDB. All information can be found here: https://docs.influxdata.com/influxdb. Some parts of the documentation could be a bit more complete and better organized, but this is quite a new project and improvements are very frequent.

The easiest way to install InfluxDB is using a pre-packaged version. For example, for CentOS, just download the latest RPM and run:

wget https://dl.influxdata.com/influxdb/releases/influxdb-0.13.0.x86_64.rpm
sudo yum localinstall influxdb-0.13.0.x86_64.rpm
sudo service influxdb start

Adding Data

After the DB is installed and running, it’s time to create a database and insert some data:

curl -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE sensor"
curl -i -XPOST 'http://localhost:8086/write?db=sensor' --data-binary 'temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.03 1463824801000000000'
curl -i -XPOST 'http://localhost:8086/write?db=sensor' --data-binary 'temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.20 1463825101000000000'
curl -i -XPOST 'http://localhost:8086/write?db=sensor' --data-binary 'temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.25 1463825401000000000'
curl -i -XPOST 'http://localhost:8086/write?db=sensor' --data-binary 'temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.40 1463825701000000000'

In a nutshell, to insert data you just need to POST one entry per line using standard HTTP. The first part is the measurement name, followed by tags in key=value pairs, then the reading value and the timestamp in nanoseconds.

The measurement can carry a single key=value field or multiple ones, but it’s important to have the main one as “value=xxx”. For example, to also record the time taken, the POST line would look like:

curl -i -XPOST 'http://localhost:8086/write?db=sensor' --data-binary 'temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.03,time_taken=199 1463824801000000000'

There are a few other options for inserting data into InfluxDB, all described in its documentation.
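The same write can also be issued from a script instead of curl. This is a minimal Python sketch assuming the default endpoint used in the commands above; the request is built first so it can be inspected before anything is sent (the actual send needs a running InfluxDB):

```python
# Sketch: the curl write above, rebuilt as a Python urllib request.
import urllib.parse
import urllib.request

def build_write_request(host, db, line):
    """Build a POST to InfluxDB's /write endpoint carrying one line-protocol entry."""
    url = f"http://{host}:8086/write?{urllib.parse.urlencode({'db': db})}"
    return urllib.request.Request(url, data=line.encode(), method="POST")

req = build_write_request(
    "localhost", "sensor",
    "temperature,node=MyFridge,location=2ndShelf,unit=Celcius value=5.03 1463824801000000000",
)
print(req.full_url)
# urllib.request.urlopen(req)  # uncomment with InfluxDB running
```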

Quick query

Now, with some data in, let’s have a quick look into the DB:

influx --database sensor

This will bring the CLI console:

Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.13.0
InfluxDB shell version: 0.13.0
>

In the console, just try a few commands:

> show measurements
name: measurements
------------------
name
temperature
> show tag keys
name: temperature
-----------------
tagKey
location
node
unit
> select * from temperature
name: temperature
-----------------
time location node unit value
1463824801000000000 2ndShelf MyFridge Celcius 5.03
1463825101000000000 2ndShelf MyFridge Celcius 5.2
1463825401000000000 2ndShelf MyFridge Celcius 5.25
1463825701000000000 2ndShelf MyFridge Celcius 5.4

The data should be there. Just spend a bit more time playing with the data and getting familiar with the SQL-like commands.

Grafana

Grafana is a plotting tool which produces very professional-looking graphs and charts based on multiple data sources, including InfluxDB. Again, this post will not get into all the features of this product; for full documentation see http://docs.grafana.org/.

Similar to InfluxDB, just use the packaged version to install it. Here is the example for CentOS:

sudo yum install https://grafanarel.s3.amazonaws.com/builds/grafana-3.0.2-1463383025.x86_64.rpm
sudo service grafana-server start

The web GUI will be available at http://localhost:3000. Use the default user and password: admin/admin.

Grafana_Login
Grafana Login Screen. Just use admin/admin

Data Sources

The first step is to add data sources, in this case for the InfluxDB Database created earlier. Just follow the screenshots:

Grafana_DataSources01
Select Data Sources from the Menu
Grafana_DataSources02
Click +Add data source
Grafana_DataSources03
Change to InfluxDB and fill all required fields
Grafana_DataSources04b
Click Add to confirm

It’s important to note that, by default, the data source is not accessed directly by Grafana; instead, it’s accessed by the browser via JavaScript. For that reason, make sure the browser can access the configured URL.

Grafana_DataSources04
Error when Datasource is not Accessible

A way to test the endpoint URL is trying to access, for example: http://localhost:8086/query?db=sensor&epoch=ms&p=&q=SELECT+*+FROM+temperature. If that works, a JSON result containing the temperature readings should appear.
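That JSON result can also be consumed from a script. The sample below mirrors the results → series → columns/values shape that InfluxDB 0.13 returns for such a query (hand-written here for illustration; check your own response if the version differs):

```python
# Sketch: extracting readings from an InfluxDB /query JSON response.
import json

sample = json.loads("""
{"results": [{"series": [{"name": "temperature",
  "columns": ["time", "location", "node", "unit", "value"],
  "values": [[1463824801000, "2ndShelf", "MyFridge", "Celcius", 5.03],
             [1463825101000, "2ndShelf", "MyFridge", "Celcius", 5.20]]}]}]}
""")

series = sample["results"][0]["series"][0]
# Pair each row up with the column names to get readable dicts.
rows = [dict(zip(series["columns"], row)) for row in series["values"]]
for r in rows:
    print(r["time"], r["value"])
```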

Update 1: In the HTTP settings, it’s possible to define “Access = proxy”. This forces Grafana itself to hit InfluxDB’s URL instead of the client browser hitting it directly. This is a nice workaround in case a firewall or reverse proxy prevents direct access.

Dashboard

To query any data via Grafana, a dashboard is required. Just create one following the screenshot:

Grafana_Dashboards01
Select Dashboards and +New from the Menu

Query

Grafana offers two query methods, interactive and raw. For 95% of the cases the interactive mode will be easier and offer all the options required. For the other 5%, the raw method allows writing the query by hand. To debug queries, keep an eye on the InfluxDB log:

tail -F /var/log/influxdb/influxd.log

The log shows every request Grafana sends to InfluxDB and helps to fix any issues.

Now, just follow the screenshots below to display the data:

Grafana_Dashboards02
Click on the Green icon and add a new Graph Panel
Grafana_Dashboards03
Before starting, it’s important to select the correct time period for the graph. Click on the time picker and select an interval compatible with the data inserted into InfluxDB.
Grafana_Dashboards04
Click on “A: Toggle Query” to show the interactive mode and select the Data Source “Sensor”.
Grafana_Dashboards05
Grafana will show the available measurements, just select “temperature”.
Grafana_Dashboards06
The chart should be plotted. Use the mouse to “zoom in”.
Grafana_Dashboards07
Moving the cursor over the graph will show details for each data point.
Grafana_Dashboards08
Keep exploring the features and graph options.

InfluxDB + Grafana vs. Splunk

Splunk is a very powerful tool and it’s free to index up to 500 MB per day. In terms of functionality, Splunk has way more features than InfluxDB and Grafana, especially if stats are coming from unformatted messages and log files.

From a practical point of view, Splunk can be overkill if the plan is only to ingest simple measurements. Also, the learning curve to start building some neat dashboards in Splunk can be a bit steeper.

Below is the same battery voltage measurement, plotted in Splunk for the first 4 months, and the same data now plotted in Grafana up to recent days.

Chart_Battery
Data from August/2015 to November/2015
Grafana_Dashboards09
Same data, now from August/2015 to May/2016

Collecting Data

Now that the basics of InfluxDB and Grafana have been explained, the next step is to set up a process to regularly collect data from the nodes.

The simplest way is choosing a familiar scripting language and translating the data from a remote node into an HTTP POST to InfluxDB.

More advanced techniques might be used, for example, having a queue (like RabbitMQ) in the middle, as discussed in a previous post.
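The simple approach can be sketched in a few lines of Python. Here `read_temperature()` is a stand-in for whatever actually talks to the node (serial, radio, GPIO, etc.), and the endpoint is the one used throughout this post:

```python
# Sketch of a minimal collector: read a sensor, stamp the reading,
# and prepare a line-protocol POST to InfluxDB.
import time
import urllib.request

INFLUX_URL = "http://localhost:8086/write?db=sensor"  # endpoint from the examples above

def read_temperature():
    """Placeholder for the real sensor read."""
    return 5.03

def build_write_request(value, node="MyFridge", location="2ndShelf"):
    """Format a reading as line protocol and wrap it in a ready-to-send POST."""
    ts_ns = int(time.time() * 1e9)  # line protocol expects nanoseconds
    line = (f"temperature,node={node},location={location},unit=Celcius "
            f"value={value} {ts_ns}")
    return line, urllib.request.Request(INFLUX_URL, data=line.encode(), method="POST")

line, req = build_write_request(read_temperature())
print(line)
# urllib.request.urlopen(req)  # uncomment with InfluxDB running
```

Run it from cron (or a simple loop with a sleep) and the database fills up on its own.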

Soon, a practical example will show how to implement it end-to-end using two Talk² Whisper Nodes, a Raspberry Pi and a Google Cloud VM… stay tuned!


