<h1 id="infrastructure-monitoring-and-the-journey-to-the-cloud">Infrastructure monitoring and the journey to the cloud</h1>
<p>At work we are in the process of moving from a datacenter-centric infrastructure to AWS. This is a great journey with many interesting challenges, and today we will discuss one of them: <strong>the monitoring of our infrastructure</strong>.</p>
<p>Given how busy we are, dedicating resources and money to prototyping and later migrating to a new monitoring platform is not a high priority unless we detect important gaps. <strong>“If it works, don’t touch it”, they say.</strong></p>
<p>Well, at the moment we have a fairly stable monitoring stack that meets our requirements, but it clearly doesn’t adapt well to the cloud, which is very dynamic by nature: the current stack relies on manual changes to configuration files and commits to the software repo.</p>
<h1 id="current-setup">Current setup</h1>
<p>We are using Munin for graphing; it also forwards alerts to our Icinga satellites, which run in different locations. The setup works well right now, but we have identified the following gaps:</p>
<ul>
<li>Very static configuration. We could still use the AWS SDK to rebuild the configuration file automatically, but that could potentially cause other issues, because Munin is quite sensitive to syntax errors.</li>
<li>Munin doesn’t scale well, partly because it’s written in Perl. The fact that it needs a full run every 5 minutes is a good indicator.</li>
<li>It also relies on lots of dependencies and scripts on the client. For instance, it cannot be used to monitor Docker containers or cloud-managed services like RDS.</li>
<li>A graph resolution of 5 minutes is not a big drama, but it’s not usable if we want to do application instrumentation.</li>
</ul>
<h1 id="searching-for-the-right-solution">Searching for the right solution</h1>
<h2 id="going-fully-managed">Going fully managed</h2>
<p>A simple solution is moving to a fully managed monitoring platform, in our case CloudWatch. This way we could get rid of the burden of setup and management.</p>
<p>In this scenario, we thought of preparing some tooling to help developers set up their own alerts and dashboards, mainly with the help of <strong>Terraform</strong> and the <strong>Python AWS SDK</strong>. A great solution, one may think, but we found some issues, <strong>the big monthly cost</strong> being one of them.</p>
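<p>As a taste of what such tooling automates, here is a minimal sketch that creates a single CPU alarm through the AWS CLI (the alarm name, instance id and SNS topic are made up; the real tooling would wrap this kind of call in Terraform or the Python SDK):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># alert when the average CPU of one instance stays above 80% for 10 minutes
$ aws cloudwatch put-metric-alarm \
    --alarm-name "high-cpu-i-0123456789abcdef0" \
    --namespace "AWS/EC2" \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistic Average --period 300 --evaluation-periods 2 \
    --threshold 80 --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:sns:eu-central-1:123456789012:ops-alerts
</code></pre></div></div>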
<p>Monitoring a fleet of 100 EC2 instances, for example, would cost around 30 Euros/month, alerts included. But the costs explode as soon as we want metrics that don’t exist off the shelf, that is: disk utilization, memory usage, number of processes, Nginx metrics, application instrumentation, etc.</p>
<p>AWS charges you 0.30 Euros per custom metric, multiplied by the number of resources (EC2 instances). With 5 custom metrics, this would be:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>100 Ec2 instances x 5 metrics x 0.30 Eur/month = 150 Eur/month
</code></pre></div></div>
<p>It’s clear that paying thousands of Euros every year for the monitoring alone cannot be justified. We have to search for alternatives.</p>
<h2 id="improving-our-current-stack">Improving our current stack</h2>
<p>Searching a bit on Google, we found that Icinga has a plugin to import resources from <a href="https://github.com/Icinga/icingaweb2-module-aws">AWS</a>, that is: EC2 instances, load balancers, RDS databases and auto scaling groups.</p>
<p>Keeping Icinga would be a good solution because we don’t need to learn and set up a new platform and we know it’s reliable. Alternatively, we could also use something like Grafana to generate dashboards with the metrics.</p>
<p>After playing around with a prototype, I found several issues:</p>
<ul>
<li>I couldn’t find an easy way to filter which resources I want to monitor, something as simple as filtering by tag to only monitor the production resources. The capability may exist, because the importer configuration has a field for a filter expression, but I found no documentation. As a result, the full inventory is imported.</li>
<li>The plugin seems to be in an early stage of development and it’s maintained by the community.</li>
<li>Again, we cannot monitor things like containers. This will probably be implemented over time.</li>
<li>We cannot pull standard metrics from CloudWatch. If we want to monitor an RDS database, we have to use a Nagios plugin.</li>
<li>The import tasks need to be run periodically to keep the inventory current.</li>
<li>The import works by autogenerating the same configuration files that we now maintain in our software repos and then reloading the Icinga configuration. Sometimes the imports fail because of syntax errors (self-inflicted pain), causing the synchronization to fail. This would leave us blind if it’s not spotted.</li>
</ul>
<p>In general, it feels like this setup is not appropriate for us. It would be a good solution for an organization that uses cloud services but keeps a fairly static fleet and doesn’t use hosted services at all.</p>
<h2 id="what-is-the-industry-using">What is the industry using?</h2>
<p>We need to:</p>
<ul>
<li>Monitor some static resources, like long-running EC2 instances.</li>
<li>Collect metrics from short-lived EC2 instances in autoscaling groups.</li>
<li>Cover batch jobs that run for some hours before their resources are destroyed.</li>
<li>Monitor hosted services like RDS, and probably Kubernetes in the future.</li>
</ul>
<p>What is the industry using to monitor such a diverse environment? Everything seems to point to <a href="https://prometheus.io">Prometheus</a>.</p>
<p>Prometheus, like Munin, uses a pull model, but it scales better because it’s written in Go, and it can auto-discover all the AWS services we are planning to use. Finally, for the graphing, Grafana supports it out of the box, and the alerting capabilities are also fine for us (email, SMS, webhooks, etc.).</p>
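<p>To give an idea of what that auto-discovery looks like, here is a minimal sketch of an EC2 scrape job in prometheus.yml (the region, port and tag name are assumptions; it presumes a node_exporter listening on port 9100 on each instance):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scrape_configs:
  - job_name: 'ec2-nodes'
    ec2_sd_configs:
      - region: eu-central-1
        port: 9100
    relabel_configs:
      # keep only the instances tagged Environment=production
      - source_labels: [__meta_ec2_tag_Environment]
        regex: production
        action: keep
</code></pre></div></div>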
<p>Regarding the costs, it also seems to be a good solution: it’s open source, it doesn’t have big hardware requirements, and we don’t have to set up big database clusters. It can also easily run on a small EC2 instance or in Docker containers.</p>
<p>The only downside at the moment is the steep learning curve, because the platform is composed of many microservices, but it shouldn’t be a big issue. We will try it in the lab and in some staging environments but, after all the alternatives, it seems to be the right move.</p>

<h1 id="dns-over-tls-forwarding-with-unbound-and-quad9">DNS over TLS forwarding with Unbound and Quad9</h1>
<p>In my previous post I explained <a href="/2016/08/building-dns-sinkhole-in-freebsd-with.html">how to build a DNS sinkhole with Unbound</a> by downloading block lists from different sources. I also tried to use dnscrypt in the setup, but I had to disable it because the service provided was unreliable.</p>
<p>Yesterday Cloudflare <a href="https://blog.cloudflare.com/announcing-1111/">announced</a> that they were providing a “privacy-first consumer DNS service”, whatever that means.</p>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Announcing 1.1.1.1: the fastest, privacy-first consumer DNS service - <a href="https://t.co/xiM3yllWHj">https://t.co/xiM3yllWHj</a> <a href="https://t.co/5keff8uuD2">pic.twitter.com/5keff8uuD2</a></p>— Cloudflare (@Cloudflare) <a href="https://twitter.com/Cloudflare/status/980430875258212352?ref_src=twsrc%5Etfw">April 1, 2018</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>Since it’s Easter and I have more free time than usual, I thought it would be cool to have a look and update my DNS sinkhole at home.</p>
<p>While I was searching for information related to <strong>DNS over TLS</strong>, which is one of the main features provided by Cloudflare, I came across <strong>Quad9</strong>, which offers the same service. They have been <a href="https://arstechnica.com/information-technology/2017/11/new-quad9-dns-service-blocks-malicious-domains-for-everyone/">in the news a lot</a>, but I didn’t pay attention because the media outlets only reported it as an alternative to Google DNS and back then I was too busy.</p>
<p>In a nutshell, Quad9 is a <a href="https://quad9.net/about/">sinkhole that blocks DNS requests to malicious domains</a>, which is pretty much what I am doing at home with Unbound and a shell script, but <strong>with more resources</strong>. My blackhole has more than <strong>30K blacklisted domains</strong>, which is not bad at all :)</p>
<p>In the end, I decided to use the DNS over TLS resolvers from Quad9, but you can find the Cloudflare resolvers commented out in the configuration file. I will keep my own list of blocked domains for the time being, but I may kill it in the future, because my configuration fails every now and then when the domain names contain non-ASCII characters.</p>
<p>The minimum <a href="https://www.unbound.net/documentation/unbound.conf.html">configuration options</a> are:</p>
<ul>
<li><strong>ssl-upstream</strong> tells Unbound to use TLS to communicate with the upstream server.</li>
<li><strong>ip_address@port</strong> to define the upstream server.</li>
</ul>
<p>Additionally, I am using some configuration parameters that come in handy:</p>
<ul>
<li>
<p>minimal-responses: yes</p>
<p>Reduces the size of the response when possible to improve the performance a bit.</p>
</li>
<li>
<p>prefetch: yes</p>
<p>Fetches cache elements that are about to expire.</p>
</li>
<li>
<p>qname-minimisation: yes</p>
<p>Makes a best effort to send the minimum amount of information to the upstream servers, though it’s not super helpful.</p>
</li>
</ul>
<p>Notice that Unbound is <strong>not running daemonized</strong> because it’s being monitored by the <a href="https://cr.yp.to/daemontools.html">Daemontools supervisor</a>. That is also why the configuration and control files are not placed in the usual locations.
<br />
<br /></p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server:
interface: 10.10.10.10
access-control: 127.0.0.0/8 allow
access-control: 10.10.10.0/24 allow
do-daemonize: no
logfile: ""
username: unbound
directory: /usr/local/var/service/unbound
chroot: /usr/local/var/service/unbound
pidfile: /usr/local/var/service/unbound/unbound.pid
verbosity: 1
minimal-responses: yes
prefetch: yes
qname-minimisation: yes
# we are doing DNS over TLS
ssl-upstream: yes
root-hints: /usr/local/var/service/unbound/config/root.hints
# my DNS zone at home
include: /usr/local/var/service/unbound/config/local.zone
# autogenerated every night to block malicious domains
include: /usr/local/var/service/unbound/config/blackhole.zone
forward-zone:
name: "."
forward-addr: 9.9.9.9@853 # quad9.net primary
forward-addr: 149.112.112.112@853 # quad9.net secondary
#forward-addr: 1.1.1.1@853 # cloudflare primary
#forward-addr: 1.0.0.1@853 # cloudflare secondary
remote-control:
control-enable: yes
control-interface: /usr/local/var/service/unbound/control.clt
control-use-cert: no
</code></pre></div></div>
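<p>Once the service is up, a quick sanity check never hurts. A sketch, assuming the main configuration file lives under the service directory shown above (drill ships with ldns on FreeBSD):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># validate the configuration before (re)starting the service
$ unbound-checkconf /usr/local/var/service/unbound/config/unbound.conf

# resolve a name through the forwarder; the answer comes back from
# 10.10.10.10, which in turn asks the Quad9 upstreams over TLS
$ drill www.example.com @10.10.10.10 A
</code></pre></div></div>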
<h1 id="fixing-opensc-after-updating-to-macos-sierra">Fixing OpenSC after updating to MacOS Sierra</h1>
<p>Sierra introduced restrictions in the ssh-agent (a new version of OpenSSH) by limiting the PKCS#11 libraries that can be loaded to a list of whitelisted directories. By now this is common knowledge, because I must be one of the last persons updating from <strong>El Capitan</strong> to <strong>Sierra</strong>. Yes, I am not an early adopter!</p>
<p>Until now I had an alias in my .bashrc that loaded the SSH key stored in my Yubikey into the ssh-agent. The alias was just fine, but the library now lives outside the trusted path, which is “/usr/lib:/usr/local/lib”.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">alias </span><span class="nv">load_key</span><span class="o">=</span><span class="s2">"ssh-add -s /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so"</span>
<span class="nb">alias </span><span class="nv">unload_key</span><span class="o">=</span><span class="s2">"ssh-add -e /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so"</span>
</code></pre></div></div>
<p>As always, the error message is anything but useful:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ssh-add <span class="nt">-s</span> /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so
Enter passphrase <span class="k">for </span>PKCS#11:
Could not add card <span class="s2">"/Library/OpenSC/lib/pkcs11/opensc-pkcs11.so"</span>: agent refused operation
</code></pre></div></div>
<p>I had to run the ssh-agent in debug mode to understand what was happening (Google is your friend), and the output said: <strong>provider not whitelisted</strong>.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ssh-agent <span class="nt">-d</span> <span class="nt">-a</span> /tmp/agent.socket
<span class="nv">SSH_AUTH_SOCK</span><span class="o">=</span>/tmp/agent.socket<span class="p">;</span> <span class="nb">export </span>SSH_AUTH_SOCK<span class="p">;</span>
<span class="nb">echo </span>Agent pid 2918<span class="p">;</span>
debug2: fd 3 setting O_NONBLOCK
debug3: fd 4 is O_NONBLOCK
debug1: <span class="nb">type </span>20
refusing PKCS#11 add of <span class="s2">"/Library/OpenSC/lib/opensc-pkcs11.so"</span>: provider not whitelisted
debug1: XXX shrink: 3 < 4
</code></pre></div></div>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ SSH_AUTH_SOCK</span><span class="o">=</span><span class="s2">"/tmp/agent.socket"</span> ssh-add <span class="nt">-s</span> /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so
Enter passphrase <span class="k">for </span>PKCS#11:
Could not add card <span class="s2">"/Library/OpenSC/lib/pkcs11/opensc-pkcs11.so"</span>: agent refused operation
</code></pre></div></div>
<p>Googling again, the first hit is a bug report that was opened last year:</p>
<p><a href="https://github.com/OpenSC/OpenSC/issues/1008">MacOS: cannot use /usr/local/lib/opensc-pkcs11.so (provider not whitelisted)</a></p>
<p>In the end, I had to point my aliases to the library located in the trusted path.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">alias </span><span class="nv">load_key</span><span class="o">=</span><span class="s2">"ssh-add -s /usr/local/lib/opensc-pkcs11.so"</span>
<span class="nb">alias </span><span class="nv">unload_key</span><span class="o">=</span><span class="s2">"ssh-add -e /usr/local/lib/opensc-pkcs11.so"</span>
</code></pre></div></div>
<p>See also the original post where I set up <a href="/2016/10/ssh-public-key-authentication-with.html">SSH public key authentication with security tokens</a>.</p>

<h1 id="backing-up-my-github-repos">Backing up my GitHub repos</h1>
<p>I am using <a href="http://code.dogmap.org/runwhen/">runwhen</a> together with <a href="https://cr.yp.to/daemontools.html">daemontools</a> to launch and monitor the backup. The run script used by the svc service executes runwhen commands to sleep until the next run (every hour) and then launches the backup script. The service runs in a dedicated <a href="https://www.freebsd.org/doc/handbook/jails.html">jail</a>.</p>
<p>The run script listed below uses some runwhen commands (rw-add, rw-match and rw-sleep) to wake up every hour, and <a href="https://cr.yp.to/daemontools/setuidgid.html">setuidgid</a> to run the service as an unprivileged user.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nb">exec </span>2>&1
<span class="nb">exec </span>setuidgid gitbackup <span class="se">\</span>
rw-add n d1S now1s <span class="se">\</span>
rw-match <span class="se">\$</span>now1s ,M<span class="o">=</span>00 wake <span class="se">\</span>
rw-sleep <span class="se">\$</span>wake <span class="se">\</span>
/home/gitbackup/update.sh
</code></pre></div></div>
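<p>For completeness, the companion daemontools log service can be as small as the sketch below (an assumption based on the /var/service/backups/log/main path that shows up later; multilog’s <strong>t</strong> flag prepends the tai64n timestamps that tai64nlocal decodes):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# log/run: timestamp (t) the service output and rotate it under ./main
exec setuidgid gitbackup multilog t ./main
</code></pre></div></div>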
<p>The actual backup script iterates over all the git repos and fetches the changes.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nb">exec </span>2>&1
<span class="nb">cd</span> /usr/home/gitbackup/backup
<span class="nb">echo</span> <span class="s2">"===="</span>
date
<span class="nb">echo</span> <span class="s2">"===="</span>
<span class="k">for </span>repo <span class="k">in</span> <span class="sb">`</span><span class="nb">ls</span> <span class="nt">-d1</span> <span class="k">*</span>.git<span class="sb">`</span><span class="p">;</span> <span class="k">do
</span><span class="nb">cd</span> <span class="nv">$repo</span> <span class="o">&&</span> /usr/local/bin/git fetch <span class="nt">--all</span>
<span class="nb">cd</span> -
<span class="k">done
</span><span class="nb">echo</span> <span class="s2">"===="</span>
</code></pre></div></div>
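<p>The loop above only picks up directories ending in .git, so each repo was presumably seeded once as a bare mirror, roughly like this (an assumption; the repo is one of those appearing in the log below):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd /usr/home/gitbackup/backup
$ git clone --mirror https://github.com/xgarcias/ansible-daemontools
</code></pre></div></div>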
<p>Checking the output log:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cat</span> /var/service/backups/log/main/current | tai64nlocal
2018-02-05 18:00:00.098641500 <span class="o">====</span>
2018-02-05 18:00:00.150083500 Mon Feb 5 18:00:00 CET 2018
2018-02-05 18:00:00.180056500 <span class="o">====</span>
2018-02-05 18:00:00.211689500 Fetching origin
2018-02-05 18:00:01.073738500 From https://github.com/xgarcias/ansible-cmdb-freebsd-template
2018-02-05 18:00:01.073743500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:01.091577500 Fetching origin
2018-02-05 18:00:02.185366500 From https://github.com/xgarcias/ansible-daemontools
2018-02-05 18:00:02.185371500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:02.203049500 Fetching origin
2018-02-05 18:00:04.180310500 From https://github.com/xgarcias/ansible-macbook
2018-02-05 18:00:04.180315500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:04.198104500 Fetching origin
2018-02-05 18:00:06.448429500 From https://github.com/xgarcias/daemontools-dyndns
2018-02-05 18:00:06.448434500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:06.466266500 Fetching origin
2018-02-05 18:00:08.299785500 From https://github.com/xgarcias/daemontools-poudriere
2018-02-05 18:00:08.299790500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:08.321755500 Fetching origin
2018-02-05 18:00:09.749956500 From https://github.com/xgarcias/daemontools-unbound-sinkhole
2018-02-05 18:00:09.749961500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:09.771744500 Fetching origin
2018-02-05 18:00:11.113934500 From https://github.com/xgarcias/elasticsearch-plugin-readonlyrest
2018-02-05 18:00:11.113939500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:11.135774500 Fetching origin
2018-02-05 18:00:12.703191500 From https://github.com/xgarcias/freebsd_local_ports
2018-02-05 18:00:12.703197500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:12.724967500 Fetching origin
2018-02-05 18:00:13.583204500 From https://github.com/xgarcias/xgarcias.github.io
2018-02-05 18:00:13.583209500 <span class="k">*</span> branch HEAD -> FETCH_HEAD
2018-02-05 18:00:13.601461500 <span class="o">====</span>
</code></pre></div></div>

<h1 id="querying-asn-ip-records-via-non-rate-limited-rest-api">Querying ASN/IP records via a non-rate-limited REST API</h1>
<p>You can query ASN/IP records via ARIN’s non-rate-limited, unauthenticated REST API.</p>
<p>More <a href="https://www.arin.net/resources/whoisrws/whois_api.html">Info</a></p>
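<p>A couple of quick examples with curl (the endpoint paths follow ARIN’s Whois-RWS documentation linked above; the IP and ASN are just examples):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># look up the network that contains an IP address, as JSON
$ curl -s -H 'Accept: application/json' https://whois.arin.net/rest/ip/8.8.8.8

# look up an AS number
$ curl -s -H 'Accept: application/json' https://whois.arin.net/rest/asn/15169
</code></pre></div></div>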
<p>Also, you can use <a href="https://twitter.com/DuckDuckGo">@DuckDuckGo</a> to get the same results with the <strong>!Arin</strong> and <strong>!Ripe</strong> bang searches.</p>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Periodic reminder of ARIN's public, non rate-limited, unauthenticated REST API for ASN/IP/Network lookups<br /><br />docs: <a href="https://t.co/iMWUOFZcgr">https://t.co/iMWUOFZcgr</a> <a href="https://t.co/Yc7HQ69xrI">pic.twitter.com/Yc7HQ69xrI</a></p>— Andrew Morris (@Andrew___Morris) <a href="https://twitter.com/Andrew___Morris/status/957347178216935424?ref_src=twsrc%5Etfw">January 27, 2018</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">You can also use !Arin and !Ripe bang searches on <a href="https://twitter.com/DuckDuckGo?ref_src=twsrc%5Etfw">@DuckDuckGo</a> to quickly lookup IP information</p>— Greg Bray (@GBrayUT) <a href="https://twitter.com/GBrayUT/status/957370934167453696?ref_src=twsrc%5Etfw">January 27, 2018</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

<h1 id="blocklist-for-browser-based-cryptominers">Blocklist for browser based cryptominers</h1>
<p>A <a href="https://github.com/ZeroDot1/CoinBlockerLists/">list of DNS records and IP addresses</a> to prevent cryptomining in the browser or other applications.</p>
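<p>Feeding such a list into my Unbound sinkhole is nearly a one-liner, along the lines of the sketch below (assuming the list has been downloaded as a plain file with one domain per line; the zone file name matches the include from my Unbound configuration above):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># turn a plain list of domains into unbound local-zone entries
$ awk '{printf "local-zone: \"%s\" refuse\n", $1}' domains.txt > blackhole.zone
</code></pre></div></div>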
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Blocklist for browser based cryptominers. Time to add it to my DNS sinkhole ;)<a href="https://t.co/OxUHsiV1J4">https://t.co/OxUHsiV1J4</a></p>— Xavier Garcia (@shellguardians) <a href="https://twitter.com/shellguardians/status/959338286291570688?ref_src=twsrc%5Etfw">February 2, 2018</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

<h1 id="centrally-managed-bhyve-infrastructure">Centrally managed Bhyve infrastructure with Ansible, libvirt and pkg-ssh</h1>
<p>At work we’ve been using <a href="https://wiki.freebsd.org/bhyve">Bhyve</a> for a while to run non-critical systems. It is a really nice and stable hypervisor, even though we are using the earlier version available on FreeBSD 10.3. This means we lack Windows and VNC support, among other things, but it is not a big deal.</p>
<p>After some iterations in our internal tools, we realised that the installation process was too slow and we always repeated the same steps. Of course, any good sysadmin will scream “<strong>AUTOMATION!”</strong> and so did we. Therefore, we started looking for different ways to improve our deployments.</p>
<p>We had a look at existing frameworks that manage Bhyve, but none of them had a feature that we find really important: having a centralized repository of VM images. For instance, <a href="https://www.joyent.com/smartos">SmartOS</a> applies this method successfully by having a backend server that stores a catalog of VMs and Zones, meaning that new instances can be deployed in a minute at most. This is a game changer if you are really busy in your day-to-day operations.</p>
<p>Since we are not great programmers, we decided to leverage existing tools to achieve the same result, that is, <strong>having a centralised repository of Bhyve images in our data centers.</strong> The following building blocks are used:</p>
<ul>
<li>The ZFS snapshot of an existing VM. This will be our VM template.</li>
<li>A modified version of <a href="https://github.com/danrue/oneoff-pkg-create/">oneoff-pkg-create</a> to package the ZFS snapshots.</li>
<li><a href="https://www.freebsd.org/cgi/man.cgi?query=pkg-ssh&apropos=0&sektion=8&manpath=FreeBSD+10.3-RELEASE+and+Ports&arch=default&format=html">pkg-ssh</a> and <a href="https://www.freebsd.org/cgi/man.cgi?query=pkg-repo&apropos=0&sektion=8&manpath=FreeBSD+10.3-RELEASE+and+Ports&arch=default&format=html">pkg-repo</a> to host a local FreeBSD repo in a FreeBSD jail.</li>
<li><a href="http://libvirt.org/drvbhyve.html">libvirt</a> to manage our Bhyve VMs.</li>
<li>The ansible modules <a href="http://docs.ansible.com/ansible/virt_module.html">virt</a>, <a href="http://docs.ansible.com/ansible/virt_net_module.html">virt_net</a> and <a href="http://docs.ansible.com/ansible/virt_pool_module.html">virt_pool</a>.</li>
</ul>
<p>Workflow:</p>
<ul>
<li>We write a yml dictionary to define the parameters needed to create a new VM:
<ul>
<li>VM template (name of the pkg that will be installed in /bhyve/images)</li>
<li>VM name, cpu, memory, domain template, serial console, etc.</li>
</ul>
</li>
<li>This dictionary will be kept in the corresponding host_vars definition that configures our Bhyve host server.</li>
<li>The Ansible playbook:
<ul>
<li>installs the package named after the VM template (ZFS snapshot), e.g. pkg install <strong>FreeBSD-10.3-RELEASE-ZFS-20G-20170515</strong>.</li>
<li>uses <strong>cat</strong> and <strong>zfs receive</strong> to load the ZFS snapshot into a new volume (see the sketch after this list).</li>
<li>calls the libvirt modules to automatically configure and boot the VM.</li>
</ul>
</li>
<li>The Sysadmin logs in to the new VM and adjusts the hostname and network settings.</li>
<li>A separate Ansible playbook is run to configure the new VM as usual.</li>
</ul>
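<p>The manual equivalent of the install-and-restore steps looks roughly like this (a sketch: the dataset name and the path of the snapshot stream inside the package are assumptions, only the template package name comes from the example above):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># install the VM template package from the local repo
$ pkg install FreeBSD-10.3-RELEASE-ZFS-20G-20170515

# load the snapshot stream into a fresh ZFS volume for the new guest
$ cat /bhyve/images/FreeBSD-10.3-RELEASE-ZFS-20G-20170515.zfs | zfs receive zroot/vms/newguest
</code></pre></div></div>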
<p>Once automated, the installation process needs 2 minutes at most, compared with the 30 minutes needed to manually install a VM, and it allows us to deploy many guests in parallel.</p>
<p>Resources:</p>
<ul>
<li>Sample config for FreeBSD <a href="https://people.freebsd.org/~rodrigc/libvirt-bhyve/libvirt-bhyve.html">https://people.freebsd.org/~rodrigc/libvirt-bhyve/libvirt-bhyve.html</a></li>
<li>bhyve driver for libvirt <a href="http://libvirt.org/drvbhyve.html">http://libvirt.org/drvbhyve.html</a></li>
<li>virsh examples <a href="https://wiki.libvirt.org/page/VM_lifecycle#Creating_a_domain">https://wiki.libvirt.org/page/VM_lifecycle#Creating_a_domain</a></li>
<li>migrating VMs w/o shared storage <a href="https://hgj.hu/live-migrating-a-virtual-machine-with-libvirt-without-a-shared-storage/">https://hgj.hu/live-migrating-a-virtual-machine-with-libvirt-without-a-shared-storage/</a></li>
<li>xml reference <a href="http://libvirt.org/formatdomain.html">http://libvirt.org/formatdomain.html</a></li>
<li>Virtual networking <a href="https://wiki.libvirt.org/page/VirtualNetworking">https://wiki.libvirt.org/page/VirtualNetworking</a></li>
</ul>

<h1 id="openntpd-leap-seconds-and-other-horror-stories">OpenNTPD, leap seconds and other horror stories</h1>
<p>In case you missed it, there was a <a href="https://en.wikipedia.org/wiki/Leap_second">leap second</a> on December 31, 2016. I don’t know about you, but I’ve read many horror stories about things going terribly wrong after leap seconds and sysadmins in despair being paged at night. Well, today I am going to share one of those stories with you, and I hope it will be terrifying.</p>
<h1 id="horror-story">Horror story</h1>
<p>Like diligent sysadmins, we monitor the ntpd services on our servers (<a href="http://www.openntpd.org/">OpenNTPD</a> in our case) and we will be alerted if a noticeable clock offset happens. Of course, in the event of a leap second, all the servers should trigger an alert and the corresponding recovery. The leap second was inserted as 23:59:60 on December 31 and the servers slowly <strong>chewed</strong> the difference in around 3 hours.</p>
<p>But… Here comes the horror story. Some of the servers didn’t recover at all. The graphs showed that the offset was still around -900 ms (an extra second was introduced, therefore we were one second behind). In the end we had to restart OpenNTPD as a quick remediation.</p>
<p>Below you can find the status of one of the servers, for reference.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ntpctl -s all
4/4 peers valid, clock synced, stratum 3
peer
wt tl st next poll offset delay jitter
176.9.31.215 from pool de.pool.ntp.org
1 10 2 1474s 1502s -984.909ms 6.266ms 0.175ms
62.116.162.126 from pool de.pool.ntp.org
* 1 10 2 733s 1640s -984.824ms 1.105ms 0.126ms
78.46.79.68 from pool de.pool.ntp.org
1 10 3 888s 1509s -984.824ms 6.380ms 0.138ms
46.4.54.78 from pool de.pool.ntp.org
1 10 2 3087s 3098s 105.306ms 6.295ms 0.130ms
</code></pre></div></div>
<p>You may notice that one of the peers has a positive offset, and that doesn’t make any sense, because an extra second was introduced, as explained above. I hope you can smell the stink at this moment, because it is quite strong.</p>
<p>Well, digging in the logs I also found the following line:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ntpd[1438]: reply from 46.4.54.78: not synced (alarm), next query 3228s
</code></pre></div></div>
<p>Yes, OpenNTPD was unhappy with that peer and decided to stop the time synchronisation until the issue was solved. <strong>Notice that this is a really bad situation because we don’t control that peer at all</strong>. The only option was to restart OpenNTPD so it would pick new peers from the round-robin DNS record we had configured.</p>
<p>I decided to do a bit of research and went to OpenNTPD’s GitHub repo to read the source code, particularly <strong>src/usr.sbin/ntpd/client.c</strong>. Here, the NTP packet’s status is evaluated against a bit mask to analyse the LI bits (Leap Indicator):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if ((msg.status & LI_ALARM) == LI_ALARM || msg.stratum == 0 || msg.stratum > NTP_MAXSTRATUM)
</code></pre></div></div>
<p>The name LI_ALARM is self-explanatory. The bitmask comparison evaluates to true when both bits of the Leap Indicator are set to 1. From the <a href="https://tools.ietf.org/html/rfc5905">RFC</a>:</p>
<blockquote>
<p>LI Leap Indicator (leap): 2-bit integer warning of an impending leap second to be inserted or deleted in the last minute of the current month with values defined in Figure 9.
0 no warning
1 last minute of the day has 61 seconds
2 last minute of the day has 59 seconds
3 unknown (clock unsynchronized)</p>
</blockquote>
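<p>To make the bit layout concrete: the LI field occupies the top two bits of the first byte of an NTP packet, so a value of 3 means both bits are set, i.e. clock unsynchronized. A quick sanity check in shell (the byte value here is illustrative):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># example first byte: LI=3, VN=4, Mode=4 -> binary 11 100 100 -> 0xe4
$ status=0xe4
$ echo $(( (status >> 6) & 0x3 ))
3
</code></pre></div></div>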
<p>At this point, I can claim that the peer was totally broken, because it ran <strong>for hours</strong> (and it may still be broken at this point) <strong>with the clock unsynchronized</strong>, and it hit us in a chain reaction. Well, one may expect a minimum of quality, but these are the risks that you must accept if you use services run by others (unless you sign a Service Level Agreement on paper).</p>
<p>To understand how risky it can be, we can look at the page that describes how to <a href="http://www.pool.ntp.org/en/join.html">join an ntp pool</a>. Only a static IP address and a minimum amount of bandwidth are required, plus a couple of recommendations. Hence the many hobbyists running their own time servers.</p>
<p>ntp.org runs a monitoring system that can be queried <a href="http://www.pool.ntp.org/scores">online</a>. Servers with a score lower than 10 are automatically removed from the pool (mine had -100), and this is a good measure, but good luck if they are already active in your ntpd service. They will cause you trouble until you manually restart the service.</p>
<h1 id="lessons-learned">Lessons learned</h1>
<ul>
<li>Actively monitor the ntp service.</li>
<li>Monitor the general status: un/synchronized, stratum, number of valid peers, etc.</li>
<li>Monitor the offset. I do an average of all peers and then apply abs().</li>
<li>Plan carefully and search for a reliable ntp source.</li>
<li>Does your datacenter offer this service? Can you have an SLA?</li>
<li>Avoid the country/region pools at pool.ntp.org because they may be run by hobbyists and will cost you pain, even if ntp.org recommends you do so. Perhaps running the ntp servers provided by your OS vendor is safer.</li>
<li>Perhaps buy a <a href="https://en.wikipedia.org/wiki/DCF77">DCF77 receiver</a> to build your own Stratum 1 server, but you may need an external antenna if the datacenter walls are too thick.</li>
</ul>

<h1 id="ssh-public-key-authentication-with-security-tokens">SSH public key authentication with security tokens</h1>
<p>I’ve been using a Yubikey for two-factor authentication with HOTP for a long time, but this crypto hardware has many more functionalities, like storing certificates (RSA and ECC keys).</p>
<p>The use I will describe below allows us to do SSH public key authentication while keeping the private key stored in the device at all times. This gives an extra layer of security, because the key cannot be extracted and the device will be locked if the PIN is bruteforced.</p>
<p>Formally speaking, many of these crypto keys (commonly in the form of a USB device emulating a card reader) support the Personal Identity Verification (PIV) card interface, which allows ECC/RSA sign/decryption operations with the private key stored in the device (read the <a href="http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-78-4.pdf">NIST SP 800-78</a> document for more information). This hardware interface, together with the <a href="https://en.wikipedia.org/wiki/PKCS_11">PKCS#11</a> API, allows programs like ssh to perform cryptographic operations with the certificates stored in the device.</p>
<p>One weak point in this scenario is vendor trust, particularly when it comes to the random number generator implemented in the hardware, which can potentially create weak, easy-to-bruteforce keys. This can be minimized if we use the normal ssh tools to generate the ssh keys and then import them into the device. In my case, I followed this path.</p>
<p>Another downside is that NIST SP 800-78 only defines RSA keys up to 2048 bits. You must take this into consideration because the chip may support bigger keys (e.g. for OpenPGP cards), but the PIV interface is limited to 2048 bits unless NIST updates the standard.</p>
<p>Finally, either PKCS11 or <a href="https://github.com/OpenSC/OpenSC/wiki">OpenSC</a> (I don’t quite remember which) does not support ECC keys, so you are out of luck in that case.</p>
<h1 id="preparation">Preparation</h1>
<ul>
<li>A crypto device that is NIST SP 800-78 compliant, a Yubikey 4 in my case.</li>
<li>An RSA key pair created with ssh-keygen(1).</li>
<li>Install <a href="https://github.com/OpenSC/OpenSC/wiki">OpenSC</a> on your computer to get the PKCS11 library and management tools (see the quick check after this list). There are installers available for almost any platform: Windows, OSX, Linux, BSD, etc.</li>
</ul>
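<p>Before going further, it’s worth verifying that OpenSC can actually see the token. A sketch using the same module path that shows up later in this post (pkcs11-tool ships with OpenSC):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ pkcs11-tool --module /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so --list-slots
</code></pre></div></div>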
<h1 id="steps">Steps</h1>
<ul>
<li>
<p>Convert the RSA private key into pem format.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>openssl rsa <span class="nt">-in</span> ./id_rsa <span class="nt">-out</span> id_rsa.pem
</code></pre></div> </div>
</li>
<li>
<p>Load the private key into <a href="https://developers.yubico.com/PIV/Introduction/Certificate_slots.html">slot 9a</a> in the device. It will ask for the PIN, which you may have changed (look for ‘change-pin’ and ‘change-puk’ in <a href="https://www.yubico.com/wp-content/uploads/2016/05/Yubico_PIV_Tool_Command_Line_Guide_en.pdf">this document</a>). Notice that I’ve set the ‘pin-policy’ to once and the ‘touch-policy’ to never, effectively asking for the PIN only once, when I load the key into the ssh-agent, but you can change the behaviour to whatever fits you best (e.g. force a touch every time you want to log in via ssh).</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>yubico-piv-tool <span class="nt">-a</span> import-key <span class="nt">-s</span> 9a <span class="nt">--pin-policy</span><span class="o">=</span>once <span class="nt">--touch-policy</span><span class="o">=</span>never -i id_rsa.pem
</code></pre></div> </div>
</li>
<li>
<p>Transform the public key into a format that is understood by the device.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ssh-keygen <span class="nt">-e</span> <span class="nt">-f</span> ./id_rsa.pub <span class="nt">-m</span> PKCS8 <span class="o">></span> id_rsa.pub.pkcs8
</code></pre></div> </div>
</li>
<li>
<p>Use the public and private keys (the latter in the device) to generate a self-signed certificate, to be imported into the device later, with a 10-year expiration date (just in case). It will ask for your PIN again.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>yubico-piv-tool <span class="nt">-a</span> verify <span class="nt">-a</span> selfsign-certificate <span class="nt">--valid-days</span> 3650 -s 9a <span class="nt">-S</span> <span class="s2">"/CN=myname/O=ssh/"</span> <span class="nt">-i</span> id_rsa.pub.pkcs8 <span class="nt">-o</span> 9a-cert.pem
</code></pre></div> </div>
</li>
<li>
<p>Import the generated certificate.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>yubico-piv-tool <span class="nt">-a</span> verify <span class="nt">-a</span> import-certificate <span class="nt">-s</span> 9a <span class="nt">-i</span> 9a-cert.pem
</code></pre></div> </div>
</li>
</ul>
<h1 id="using-the-device-together-with-openssh">Using the device together with OpenSSH</h1>
<p>In case you don’t have the public key (a step I didn’t need because I generated the key on my PC), you can extract it with ssh-keygen. You have to point it to the PKCS#11 shared library, which is /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so in the case of OSX.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ssh-keygen <span class="nt">-D</span> /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so
ssh-rsa AAAAB....e1
</code></pre></div></div>
<p>Then you can tell ssh to interact with the device by pointing to this library instead of using a private key stored on your disk, but it is not very convenient because it will always ask for your PIN.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ssh <span class="nt">-I</span> /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so myserver
Enter PIN <span class="k">for</span> <span class="s1">'PIV_II (PIV Card Holder pin)'</span>:
</code></pre></div></div>
<p>Loading the key into your ssh-agent is more convenient because it will only ask for the PIN once (following the pin-policy=once), and you can be sure nobody can abuse it because the device must be present at all times. <strong><em>Remember that the private key never leaves the device</em></strong>.</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ssh-add <span class="nt">-s</span> /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so
Enter passphrase <span class="k">for </span>PKCS#11:
Card added: /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so
</code></pre></div></div>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> bash-3.2<span class="nv">$ </span>ssh-add <span class="nt">-l</span>
2048 SHA256:random_hash_value /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so <span class="o">(</span>RSA<span class="o">)</span>
</code></pre></div></div>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ssh-add <span class="nt">-e</span> /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so
Card removed: /Library/OpenSC/lib/pkcs11/opensc-pkcs11.so
</code></pre></div></div>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ssh-add <span class="nt">-l</span>
The agent has no identities.
</code></pre></div></div>Xavier Garcia
Building a DNS sinkhole in FreeBSD with Unbound and Dnscrypt2016-08-25T18:30:00+00:002016-08-25T18:30:00+00:00/2016/08/building-dns-sinkhole-in-freebsd-with<p>There is already lots of literature regarding <a href="https://www.sans.org/reading-room/whitepapers/dns/dns-sinkhole-33523">DNS sinkholes</a> and it is a <a href="https://en.wikipedia.org/wiki/DNS_sinkhole">common term</a> in
Information Security. In my case, I wanted to give it a try on FreeBSD 10 but I didn’t want to make use of <a href="https://www.isc.org/">Bind</a> since it was removed from the base distribution in favor of <a href="https://www.unbound.net/">Unbound</a>.</p>
<p>The setup will have the following steps:</p>
<ul>
<li>Create a jail where the service will be configured (not covered
here because there are plenty of examples on the Internet)</li>
<li>Install Unbound</li>
<li>Basic Unbound configuration</li>
<li>Configure Unbound to block DNS queries</li>
<li>Choosing block lists available on the Internet</li>
<li>Updating the block lists</li>
<li>Bonus: use dnscrypt to avoid DNS spoofing</li>
<li>Final Unbound configuration file</li>
</ul>
<h2 id="configuring-our-dns-sinkhole">Configuring our DNS sinkhole</h2>
<h3 id="installing-unbound">Installing Unbound</h3>
<p>I ran my test on FreeBSD 10.1. Sadly, it ships Unbound 1.4.x, which
is quite old and lacks some nice features. In the end, I had to install
dns/unbound from the ports tree, which currently installs 1.5.9.</p>
<p>If you are using a more recent FreeBSD release (e.g. FreeBSD 10.3),
you will not need to install the port.</p>
<p>The only difference is that you will need to use <strong>local_unbound_enable="YES"</strong>
in /etc/rc.conf instead of <strong>unbound_enable="YES"</strong>, and the configuration file
will be located in <strong>/etc/unbound/unbound.conf</strong> instead of
<strong>/usr/local/etc/unbound/unbound.conf</strong>.</p>
<h3 id="basic-unbound-configuration">Basic Unbound configuration</h3>
<p>First, we have to download the root hints, which allow our DNS cache to
find the root name servers.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># fetch ftp://ftp.internic.net/domain/named.cache -o /usr/local/etc/unbound/root.hints
</code></pre></div></div>
<p>Then, we edit the unbound.conf.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server:
interface: 10.10.10.10
#who can use our DNS cache
access-control: 10.10.10.0/24 allow
logfile: "/usr/local/etc/unbound/logs/unbound.log"
username: unbound
directory: /usr/local/etc/unbound
chroot: /usr/local/etc/unbound
pidfile: /usr/local/etc/unbound/unbound.pid
verbosity: 1
root-hints: /usr/local/etc/unbound/root.hints
#remote-control allows us to use the unbound-control
#utility to manage the service from the command line
remote-control:
control-enable: yes
control-interface: /usr/local/etc/unbound/local_unbound.ctl
control-use-cert: no
</code></pre></div></div>
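<p>Before going any further, it is a good habit to validate the file; <strong>unbound-checkconf</strong> will catch syntax errors before they bite you at service start:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># unbound-checkconf /usr/local/etc/unbound/unbound.conf
</code></pre></div></div>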
<p>Please notice that all files are located in <strong>/usr/local/etc/unbound/</strong>.
If you are not using the version provided by the ports tree, the base directory will be <strong>/var/unbound/</strong> instead.</p>
<p>The last step is to enable and start the service:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sysrc unbound_enable="YES"
# service unbound start
</code></pre></div></div>
<p>With this setup, we have a basic DNS cache configured in our network.
Now you should be able to query the DNS server listening on 10.10.10.10:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># host www.google.com 10.10.10.10
Using domain server:
Name: 10.10.10.10
Address: 10.10.10.10#53
Aliases:
www.google.com has address 74.125.68.147
www.google.com has address 74.125.68.105
www.google.com has address 74.125.68.103
www.google.com has address 74.125.68.99
www.google.com has address 74.125.68.104
www.google.com has address 74.125.68.106
www.google.com has IPv6 address 2404:6800:4003:c02::6
</code></pre></div></div>
<h3 id="configure-unbound-to-block-dns-queries">Configure Unbound to block DNS queries</h3>
<p>The classic trick in DNS sinkholes is to define authoritative zones in
the DNS cache that return a fixed static IP address
(e.g. 127.0.0.2), so you can identify in the logs (or in network devices) when
somebody is trying to connect to a blocked domain.</p>
<p>In Unbound it is a bit more difficult, because it is only a basic DNS
cache service and lacks some features, but there are ways around it.</p>
<p><strong>unbound.conf(5)</strong> has the local-zone directive, which is used to
define local DNS zones, but we will “abuse” it by dropping all the
queries for these domains. For instance, if we want to drop all the DNS
queries asking for google.com (and its subdomains), we need to add the
directive:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>local-zone: "google.com" inform_deny.
</code></pre></div></div>
<p>This will silently drop the DNS query and write an entry in the
log file (<strong>/usr/local/etc/unbound/logs/unbound.log</strong> in our case). The
client will just see the query time out.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[1472139065] unbound[28162:0] info: google.com. inform 10.10.10.3@31679 google.com. A IN
[1472139071] unbound[28162:0] info: google.com. inform 10.10.10.3@56551 google.com. A IN
</code></pre></div></div>
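<p>You can verify the behaviour from a client machine; the query will simply hang until it times out:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># host google.com 10.10.10.10
</code></pre></div></div>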
<p>To keep a tidy configuration, we will not add this big list of
<strong>local-zone</strong> directives to the main configuration file; instead, we will
include a separate file thanks to the <strong>include</strong> directive, which goes
in the <strong>server</strong> section.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server:
....
include: /usr/local/etc/unbound/blackhole.zone
....
</code></pre></div></div>
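<p>The included file is then just a long list of <strong>local-zone</strong> directives, one per blocked domain. For example (the domains here are purely illustrative):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>local-zone: "malware-example1.com" inform_deny
local-zone: "malware-example2.net" inform_deny
</code></pre></div></div>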
<h3 id="choosing-block-lists-available-in-internet">Choosing block lists available in Internet</h3>
<p>I am using the lists at the following URLs, which should be considered safe,
with around 23 thousand domains listed in total.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>http://mirror1.malwaredomains.com/files/justdomains
https://zeustracker.abuse.ch/blocklist.php?download=domainblocklist
https://ransomwaretracker.abuse.ch/downloads/RW_DOMBL.txt
http://isc.sans.edu/feeds/suspiciousdomains_Low.txt
http://isc.sans.edu/feeds/suspiciousdomains_Medium.txt
http://isc.sans.edu/feeds/suspiciousdomains_High.txt
</code></pre></div></div>
<h3 id="updating-the-block-lists">Updating the block lists</h3>
<p>I’ve written a small shell script that downloads all the lists every
night and reloads the Unbound configuration (a sketch follows at the end of this section).</p>
<p>Please notice that reloading Unbound will also flush the DNS cache. A
good way to preserve it is:</p>
<p><strong>Dump the cache</strong></p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># unbound-control dump_cache > $cache_file
</code></pre></div></div>
<p>Then, download the lists with fetch(1) and regenerate
/usr/local/etc/unbound/blackhole.zone.</p>
<p><strong>Reload the configuration</strong></p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># unbound-control reload
</code></pre></div></div>
<p><strong>Load the cache dump back in Unbound</strong></p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># unbound-control load_cache < $cache_file
</code></pre></div></div>
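<p>Putting it all together, a minimal sketch of such an update script could look like this. The paths follow the setup above, the URL list is abbreviated, and the grep/awk transformation into <strong>local-zone</strong> lines is my own choice:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Rebuild the Unbound block list and reload the service
# without losing the DNS cache.

zone_file=/usr/local/etc/unbound/blackhole.zone
cache_file=/tmp/unbound_cache.$$
tmp_file=$(mktemp)

# Download every list (abbreviated here) into one temporary file
for url in \
    http://mirror1.malwaredomains.com/files/justdomains \
    http://isc.sans.edu/feeds/suspiciousdomains_High.txt
do
    fetch -q -o - "$url" >> "$tmp_file"
done

# One local-zone directive per non-comment, non-empty line
grep -v '^#' "$tmp_file" | awk 'NF { printf "local-zone: \"%s\" inform_deny\n", $1 }' | sort -u > "$zone_file"

# Dump the cache, reload the new configuration, restore the cache
unbound-control dump_cache > "$cache_file"
unbound-control reload
unbound-control load_cache < "$cache_file"

rm -f "$tmp_file" "$cache_file"
</code></pre></div></div>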
<h3 id="bonus-use-dnscrypt-to-avoid-dns-spoofing">Bonus: use dnscrypt to avoid DNS spoofing</h3>
<p><a href="https://dnscrypt.org/">Dnscrypt</a> can be used to avoid some common DNS
attacks by encrypting and signing the DNS queries. All traffic will go
encrypted using the port 443, both TCP and UDP.</p>
<p>Of course, other issues remain, like DNS spoofing at the server end and
possible logging by the resolver operator.</p>
<p>The client is available in the ports tree under dns/dnscrypt-proxy and it
is really easy to configure. We only need two parameters: the IP and port
where we want to listen and the server we want to connect to (aka the
resolver):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sysrc dnscrypt_proxy_enable="YES"
# sysrc dnscrypt_proxy_flags="-a 10.10.10.10:5353"
# sysrc dnscrypt_proxy_resolver="dnscrypt.eu-nl"
# service dnscrypt_proxy start
</code></pre></div></div>
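<p>At this point you can check that dnscrypt-proxy resolves correctly by querying port 5353 directly; drill(1), which ships in the FreeBSD base system, works well for this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code># drill -p 5353 www.google.com @10.10.10.10
</code></pre></div></div>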
<p>The final step is configuring Unbound to forward all DNS
queries to dnscrypt-proxy. This is done in the <strong>forward-zone</strong> section.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>forward-zone:
name: "."
forward-addr: 10.10.10.10@5353
</code></pre></div></div>
<h3 id="final-unbound-configuration-file">Final Unbound configuration file</h3>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server:
interface: 10.10.10.10
access-control: 10.10.10.0/24 allow
logfile: "/usr/local/etc/unbound/logs/unbound.log"
username: unbound
directory: /usr/local/etc/unbound
chroot: /usr/local/etc/unbound
pidfile: /usr/local/etc/unbound/unbound.pid
verbosity: 1
root-hints: /usr/local/etc/unbound/root.hints
include: /usr/local/etc/unbound/blackhole.zone
remote-control:
control-enable: yes
control-interface: /usr/local/etc/unbound/local_unbound.ctl
control-use-cert: no
forward-zone:
name: "."
forward-addr: 10.10.10.10@5353
</code></pre></div></div>Xavier Garcia