Getting started with self hosting

Self hosting for all your apps, from nothing to done.

I'm writing this post as a very long reply to a reddit post, and I hope it helps some people get started with self hosting.

This guide lays out the options I know of for self hosting, so you can make informed decisions based on your needs, knowing the pros and cons of each.

1 General Approach to self hosting

Depending on your level, you might consider two opposite paths:

  1. go with a fully custom solution that you build from scratch
  2. install an easy-to-use solution that aims to make self hosting easier, such as:

If you're not willing to learn about linux server administration and don't have a decent knowledge of it yet, just go with the second option. To make it happen, go to the getting started page of any of the websites listed above and you should be up and running quickly.

Option 1 is what this post is all about. It's definitely harder, but it lets you install pretty much anything you need without being limited by an app store. Our motto will be to help you make the right decisions for your installation, laying out the options you can choose from while setting up your environment. Most of the time it comes down to a tradeoff. Keep in mind there's always a cost to something that looks like a great idea, and I'll try to expose those costs to the best of my knowledge.

2 The Infrastructure

2.1 Server

It might sound obvious, but to self host your own set of applications, you need a server. That's probably the first choice you'll have to make throughout your journey:

where do you host the server(s)?

  1. Host it at home. The only advantage I see is that you have direct access to your machine and your internet provider won't charge you for bandwidth if you're using it mostly from your home network. A few things about this:
    • real servers make real noise; without a proper investment, you probably don't want one at home unless you live alone on an island. Use whatever piece of hardware you already have, it will do the job just fine. You'll know when you need more anyway.
    • pay a lot of attention to your internet connection: when you're away from home, your services could become very, very slow
    • don't think it's free; you'll still be paying for electricity, and that won't be cheaper than a cheap machine in the cloud
    • you'll have to set up NAT redirection and request a fixed IP from your internet provider. If that's not possible, you can look at dyndns, but keep in mind those solutions are by definition flaky
  2. Rent it from a hosting company. They'll usually give you the choice between a VPS and a dedicated server, with various support agreements. Whatever the number of apps you want to install, start small and upgrade when you need more.

Tips: Weigh the pros and cons and make a decision.

Tips: If you go with option 2, I would recommend Digital Ocean, as everything has been cheap, nice and easy so far. On the plus side, you can also get 2 months of free credit on their cheapest machine if you create an account with this link. If you consider using OVH, for your own sake, run away. I had been using OVH for 5 years without a hitch, hosting pretty much everything for me and for companies I worked for, without any problems until recently, when:

  • they refused to provide me with an invoice in plain English for a new contract
  • I needed to use their support, which is unresponsive, takes forever to get something done, and automatically closes tickets even if the issue isn't solved
  • they decided to sue me on the basis that I had to pay 100% of the renewal of a yearly contract for a service I didn't want. If you decide to go ahead anyway, at least pay attention to all their newsletter bullshit, and pay a lot of attention to any contract you sign with them, as they change it every day or so.

2.2 OS

Choosing an OS for your self hosted infrastructure is more a matter of taste than anything else. Choosing an OS is like talking politics: the truth is there's no black or white, everything is a shade of gray. Anything could do the job, but I personally favor linux based distributions as they make my life much easier, with much less hassle.

The following statement is heavily biased: if you don't want bloat, lag, random reboots and extra bugs, avoid any Microsoft solution. First, they don't deserve your money, and second, they'll still make your life miserable if you try to use anything not made by them. Let's face it, the only thing they did a good job on is the locked down .NET environment, which only runs properly on Windows. Outside of .NET, their developer tooling sucks hard. Same for OSX, but at least they don't pretend to maintain a full blown server version of their OS.

You get it, I'm heavily biased when it comes to picking an OS, but to be fair, I think your life will be much easier with a Linux based OS, as you get deep control over all the moving pieces. In this series, I'll use ubuntu server as my linux distribution. The actual reason is simply that I already know it well.

Tips: Pick any linux distribution you're already familiar with, preferably a server distribution, as desktop distributions consume much more resources.

Tips: There's a wide range of offerings for container based orchestration. You might see a lot of buzzwords with CoreOS, Kubernetes, Swarm, … I will ignore those offerings as:

  • I don't need scalability for self hosting. Those tools are designed to solve problems for infrastructure at a much larger scale than mine.
  • Those techs are bleeding edge.
  • I don't have a deep knowledge of them outside of toy projects.
  • If you don't really master those things, you'll end up with more maintenance and less time doing what you really like. Pieces of tech are usually designed to be quite appealing and easy to get started with, but no one will tell you about their drawbacks. Be aware that choosing something will almost always result in getting burned by something unexpected in the long run.

2.3 Before getting deeper

From now on, we assume you have a linux machine running somewhere. The instructions to set up your server will be given for ubuntu server, but it won't be hard to translate them to your favorite distribution.

If:

  • you went with the home hosting option: plug a screen and a keyboard into your server.
  • you went with a cloud provider: wait until you receive the credentials for your newly created machine, then you should be able to send commands to it by opening your command line and typing:
# replace username with your actual username, it might be 'root' by default
# replace host_ip with the IP address of your machine
ssh username@host_ip

If your server is working, you should be prompted for a password. Once you type it in, you should be welcomed with a message and a prompt from which you can type commands.

2.4 Get yourself a domain name

A domain name makes it convenient for your users to remember the actual location of your services. You can get one for less than $10 per year.

If you don't want to invest in a domain name, you have 2 options to access your services:

  • you'll have to use your machine's IP address, with the services running on different ports. It won't be user friendly at all, and if you're exposed on the internet, your services will run without any filter.
  • you can trick your OS DNS resolution by editing your hosts file (/etc/hosts on linux and OSX). It makes the URL much easier to memorise without investing in a domain name. The inconvenience is that anybody who needs access to your services will have to do the same weird thing before getting in. Example: if you add "192.168.0.17 selfhosted.com" to your hosts file, typing selfhosted.com in your browser will bring you to 192.168.0.17 (a one-liner for this is shown below).
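
For the second option, a quick sketch of the edit (the IP and name are just the example values above; adjust them to your machine):

echo "192.168.0.17 selfhosted.com" | sudo tee -a /etc/hosts
# check that the name now resolves to your machine
ping -c 1 selfhosted.com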

3 Setting up our server correctly

3.1 First things first

First things first, on your newly created machine you need to:

  1. Upgrade your server
  2. If you're using the root account, create a new user that you'll use to do things on the server

To do that, fire up a terminal and type:

# upgrade all your packages
sudo apt-get update
sudo apt-get upgrade
# create a user: replace username with the username you want to use
# (adduser will prompt you for the new user's password and details)
sudo adduser username
# optional: allow the new user to run admin commands through sudo
sudo usermod -aG sudo username
# switch to the new user
sudo su username

3.2 Connect to the server

Being able to connect to your server is an absolute requirement. For this, SSH is your friend. If it's not already done:

sudo apt-get install openssh-server fail2ban

This will install an ssh server and a tool that blocks users trying to brute force your server. To increase security a bit, I would advise changing the default port SSH uses (22) to something else, like 8080. To do that, edit the configuration of your ssh server in /etc/ssh/sshd_config and set the Port entry to something other than 22. You can do this manually with your favorite editor or copy/paste these commands:

# note: on some releases the line ships commented out as '#Port 22'
sudo sed -i 's/^#\?Port 22/Port 8080/' /etc/ssh/sshd_config
sudo service ssh restart

From here, you have different choices:

  1. you stay with this basic setup, and you can jump to the firewall configuration section
  2. you want to increase security even more.

If you go with option 2, you have different options available, and you can stack them on top of each other:

  1. Set up a VPN with openVPN. It makes things more secure, but it adds an extra step when you want to reach your server: first connect to the VPN, then connect to your SSH server. If you want to go down this road anyway, I'll show you a bit later in this tutorial how to set up openVPN.
  2. Log in to your SSH server with a private key. It's much better than the default, but if you lose your private key, you're locked out and will have to physically access your machine to get your access back. It's a risk you can mitigate with multiple backups, but it will still be there. If you're willing to accept this risk, you can follow this tutorial to set it up; a quick sketch is also given below.
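
As a minimal sketch of the key-based login, assuming OpenSSH on both ends (run the first two commands on your local machine, not on the server):

# generate a key pair; accept the defaults or set a passphrase
ssh-keygen -t ed25519
# copy the public key to the server (using the custom port from earlier)
ssh-copy-id -p 8080 username@host_ip
# once key login works, you can optionally disable password logins by
# setting 'PasswordAuthentication no' in /etc/ssh/sshd_config on the server
# and restarting ssh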

3.3 Configuring your firewall

Ubuntu comes with iptables, the classic firewall most people in the linux community are using. It does its job of protecting against attacks from the internet. Just a few words on firewalls: years ago, I thought nobody would even try to attack my server, as there was nothing on it of interest to anybody else. If you too think this, you have to understand the internet is like walking through a minefield with planes constantly dropping bombs on your server. If you don't have a bunker (aka a firewall), it's like running naked in the middle of that minefield. You'll get hit, then owned, and ultimately your server will participate in some larger attack that makes big headlines in the news ("The largest attack ever made against xxxx"), or it will be used to spam other people, that sort of thing.

If your infrastructure consists of one machine, iptables will likely be your only protection. That's why it's important to get it right. If you have many machines on the same network, then you probably want a physical firewall, a DMZ and a bunch of enterprise stuff. However, if you're reading this, you probably aren't in that case, as people who are don't need a tutorial covering the basics.

Tips: there's some tooling built around iptables; ufw seems to be a nice wrapper on top of it. In practice, you probably don't want to use it, as I've seen software (eg: docker) that edits its own iptables rules, and those rules won't be seen by ufw.

Warning: Don't get locked out of your server while configuring your firewall. It has already happened to me multiple times; it's very easy to do if you don't double check and understand everything we'll do here. To avoid it, I would recommend following this guide and making sure you understand the script before launching it.

We'll first create our script and store it under /tmp/firewall.sh. To do this, log into your server and copy/paste this large command:

cat > /tmp/firewall.sh <<EOF
#!/bin/bash

# note: the rules below assume your network interface is named eth0;
# check yours with 'ip addr' and adjust if needed

# Flush current rules
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# block everything by default
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT

# allow SSH: replace your SSH port if needed
# in
iptables -A INPUT -i eth0 -p tcp --dport 8080 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -p tcp --sport 8080 -m state --state ESTABLISHED -j ACCEPT
# out
iptables -A OUTPUT -o eth0 -p tcp --dport 8080 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --sport 8080 -m state --state ESTABLISHED -j ACCEPT


# allow HTTP
iptables -A INPUT -i eth0 -p tcp --dport 80 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -p tcp --sport 80 -m state --state ESTABLISHED -j ACCEPT

# allow HTTPS
iptables -A INPUT -i eth0 -p tcp --dport 443 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -p tcp --sport 443 -m state --state ESTABLISHED -j ACCEPT


######
# PROTECTION TO AVOID GETTING KICKED OUT OF THE SERVER
# the idea: we run this script, and if the firewall is badly configured,
# it gets disabled after 120 seconds, so you don't have to call support
# to tell them "I lost access to my machine..."
# if everything is fine, quit the script before the protection kicks in
sleep 120
echo "- PROTECTION"
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
EOF

Read this script and understand it first, making the relevant changes that apply to your specific use case. If you haven't made any special customisations to the configuration given earlier, you should be good to go:

sudo bash /tmp/firewall.sh
# from ANOTHER terminal, verify you can still open a new ssh connection
# if it works, hit Ctrl-C to quit the script before the protection kicks in
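
Tips: the rules loaded this way won't survive a reboot on their own. A hedged sketch of one way to persist them, assuming Ubuntu's iptables-persistent package:

# install the persistence helper (it offers to save the current rules)
sudo apt-get install iptables-persistent
# once you've verified the firewall and quit the script, save the live rules
sudo netfilter-persistent save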

3.4 A note about security

You might be tempted to install many things on your server right away. Keep in mind that the more tools you have, the larger your attack surface is. Be very conservative about the software you install on your server.

4 Setup your apps

4.1 Achieve segregation between apps

The first question here is: why would you want to separate apps from each other? Mainly to solve these problems:

  1. The version problem. An example would be 2 apps that need 2 different versions of PHP. Yes, you can fix this specific issue for PHP, but if you have the same problem for go, node, python, … you'll end up spending more time hitting your head against the wall trying to fix those, and you probably have better things to do.
  2. The security problem. If an attacker breaks into your server because one app has a security issue, they gain access to everything.

To make all your apps independent, there's a wide range of tech available to you, but you can group it into 2 different families of solutions:

  • VM based solutions: Vmware, Proxmox, ….
  • Container base solutions: docker, ….

Choosing between one or the other is a tradeoff. VM based solutions will almost always consume more resources, so you'll need more hardware, whereas containers are cheap to launch and consume fewer resources. On the other side of the spectrum, VMs are more secure and achieve real process isolation, as each app runs on its own kernel, whereas container based solutions share the host kernel.

Personally, I went with a container based approach, as it's good enough for my use case, better for my wallet, and I don't see any value in going with full VMs.

4.2 Install apps

Because it's not possible to write one post giving all the details for all the different approaches, we'll only cover the container based approach, as it's the one I'm using for my own setup.

If you're new to docker, I would advise starting with their getting started guide. Other good write ups are this tutorial and this free course.

4.2.1 How it works

Nothing better than a good old school schema to gain a better understanding of what we're building here:

[Schema: internet → host machine (nginx listening on ports 80/443) → containers bound to loopback ports]

  1. The machine: a software called nginx will be installed on the host, listening on ports 80 and 443. That way, all the traffic initiated from a browser is taken care of by nginx, which is in charge of redirecting it to the container running the service, depending on the URL you're trying to access. That's what we call a reverse proxy. To draw a parallel, nginx is the traffic cop of our setup, showing where packets should go at an intersection.
  2. Containers: containers are only exposed on a given port bound to the loopback IP, so that our traffic cop nginx can forward requests to them and responses can be sent back to the original browser. A sketch of such an nginx configuration is given below.
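
To make this concrete, here is a minimal sketch of what such a reverse proxy entry could look like, in the same copy/paste style as the firewall script. The subdomain and the port 8065 are assumptions matching the mattermost example of the next section; adjust them to your own app (run as root):

cat > /etc/nginx/sites-available/mattermost <<'EOF'
server {
    listen 80;
    server_name chat.example.com;   # hypothetical subdomain

    location / {
        # forward everything to the container bound on the loopback ip
        proxy_pass http://127.0.0.1:8065;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
EOF
ln -s /etc/nginx/sites-available/mattermost /etc/nginx/sites-enabled/
nginx -t && service nginx reload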

4.2.2 Example for mattermost, a slack alternative
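
The full walkthrough isn't reproduced here, so here is a hedged sketch of the docker-compose side, following the two rules of this setup (loopback binding, state on the host). The image name and volume paths are assumptions; check mattermost's own docker documentation for the authoritative setup:

mkdir -p /app/mattermost
cat > /app/mattermost/docker-compose.yml <<'EOF'
version: "2"
services:
  app:
    # image name is an assumption, check mattermost's docker docs
    image: mattermost/mattermost-team-edition
    restart: unless-stopped
    ports:
      # bind to the loopback ip only: nginx stays the single entry point
      - "127.0.0.1:8065:8065"
    volumes:
      # keep all application state on the host filesystem for backups
      - ./volumes/data:/mattermost/data
      - ./volumes/config:/mattermost/config
EOF
cd /app/mattermost && docker-compose up -d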

4.2.3 Enable https on your sites

Plain http usage is decreasing over time as more and more websites use https. Self signed certificates are good for development, but if you do it wrong, chances are you can become the victim of a man in the middle attack. By doing it wrong, I mean making either of these mistakes:

  1. not letting your OS know about your self signed certificate
  2. not verifying your actual SSL certificate before clicking on "I understand the risk and I want to continue anyway".

If you don't want to spend money on SSL certificates, you can still use Let's encrypt, a free provider of SSL certificates. As Let's encrypt is free, I can't advise using it for something you make money from, as it doesn't come with proper support. For example, the other day their servers were down and I couldn't complete the setup of an app; I had to wait until they were back online.

To create a certificate using Let's encrypt, we'll assume you're using nginx as we did earlier; a sketch is given below.
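
A minimal sketch using the certbot client. The package name varies between releases (python-certbot-nginx on older Ubuntus, python3-certbot-nginx on newer ones), and the domain is the hypothetical one from the nginx example:

sudo apt-get install certbot python3-certbot-nginx
# request a certificate; certbot edits the nginx server block for you
# and sets up the http -> https redirect
sudo certbot --nginx -d chat.example.com
# certificates are valid for 90 days; renewal is handled by a timer, but
# you can check that it works with:
sudo certbot renew --dry-run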

4.2.4 Common pitfalls using docker

There are a lot of docker-compose files available on the internet. Here is a checklist to go through before deploying another service:

  1. Make sure the ports section of your container binds to the loopback IP (see the snippet below). By default, docker opens a hole in your firewall, and that's a bad thing, as it would bypass your nginx proxy.
  2. Make sure the application state is stored on the host filesystem (database, custom configuration, files, …). In practice, if you're using mysql, you need to create a volume for /var/lib/mysql (or /var/lib/postgresql/data for postgres). I see a lot of people trying different hacks to back up a db, but really, the only thing you need is to back up the filesystem.
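
For the first point, the difference is a single prefix in the compose file's ports section (8065 is just the hypothetical port used earlier):

ports:
  - "8065:8065"             # bad: exposed to the whole internet, bypasses nginx
  - "127.0.0.1:8065:8065"   # good: only reachable through the reverse proxy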

5 Maintenance

5.1 Backup

5.1.1 Approach

A golden rule for backing things up: try to recover from a backup before an actual problem occurs. You don't want to end up with backups that can't be restored for some reason.

Looking at tooling, there's a wide range available: rsync, rclone, … The differences between them aren't big, and they each fill a specific niche other tools don't. At the end of the day, the result will be the same: your data will be somewhere safe if something goes south. Those tools usually provide different strategies for backing up data:

  • differential backup
  • incremental backup

If you don't know about those strategies, go take a look here.

5.1.2 Dead simple approach without tooling

I personally have an ftp server available for backups, as it came for free with my dedicated server. That's the cheap backup option, but it works for my needs.

Trying to get it right, I ended up spending more time on the tooling than anything else, and in the end I didn't manage to restore all my applications correctly, as the free ftp backup server provided by my host wouldn't let me set up the permissions I needed. I finally went with an old school backup rotation, creating compressed archives (tar.gz) of all my data and piping them to the ftp server.

Concretely, my backup runs from the root user with a cronjob that looks like this:

crontab -l
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command
0 0 * * * cd /app && PERIOD=daily make backup
0 0 * * 0 cd /app && PERIOD=weekly make backup
0 0 15 * * cd /app && PERIOD=monthly make backup
0 0 1 */3 * cd /app && PERIOD=quarterly make backup

The Makefile has many lines but is constructed in this way:

# bring the app up
mattermost_start:
        cd mattermost && docker-compose up -d
# stop the app (used before archiving so the data is consistent)
mattermost_stop:
        cd mattermost && docker-compose down || true
# archive the app directory and stream it straight to the ftp server
# (PERIOD comes from the cronjob environment: daily, weekly, ...)
mattermost_backup:
        tar -zcf - mattermost | ncftpput -u$(FTP_USERNAME) -p$(FTP_PASSWORD) -c $(FTP_HOSTNAME) mattermost_${PERIOD}.tar.gz
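
The cronjob above calls a plain 'make backup' target that isn't shown here; a hedged sketch of how it could tie the per-app targets together (the actual Makefile may differ):

# stop the app while archiving so the data is consistent, then restart it
backup: mattermost_stop mattermost_backup mattermost_start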

5.1.3 ZFS

ZFS snapshots are far superior to the dead simple approach we've seen above. A good introduction can be found here; however, I won't go into details, as ZFS doesn't have a place in my setup.

5.2 Restoration

I can't say it enough: test your backups before it's too late.

I can't find it back, but I read a story about a guy who lost his entire owncloud instance after running into a bug in an upgrade that corrupted its entire data set. Because everything was encrypted, he had a really hard time recovering from it. Bad stories don't only happen to others.

5.3 Security upgrade

Once in a while, some large security issue pops up, and if you don't take security seriously, it won't be long until your server is owned by somebody else. To avoid this, you need to regularly log in to your server and do the following:

  1. upgrade packages on the host machine:
sudo apt-get update
sudo apt-get upgrade
  2. upgrade packages in your containers. Basically:
cd /app/mattermost
docker-compose build
docker-compose up -d

and do this for each app (a small loop sketch is given below).
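
As a sketch, assuming every app lives in its own directory under /app with a docker-compose.yml, a small loop covers them all:

#!/bin/bash
# rebuild and restart every app under /app
# (--pull makes sure the base images are refreshed too)
for app in /app/*/; do
    (cd "$app" && docker-compose build --pull && docker-compose up -d)
done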

Because I like to automate everything, I tend to have a separate machine running the proper commands for me on a cronjob. That way, all my apps always use secured versions of their dependencies.

On top of this, I'm using Jenkins to verify all my services are available and running properly, so I know when something is wrong. I'll show you how this is done in the "Monitor your services" section below.

6 Possible Improvements

6.1 Monitor your services

There are, again, many different ways to ensure your services are up and running. Personally, I use Jenkins, a tool normally used for continuous integration. In Jenkins, all my services are monitored by a job running every 2 minutes to ensure everything is working alright. When it's not, I know where to look.

[TODO: picture of the Jenkins dashboard]

If you're interested in doing this, you can use the jenkins image in the github repo and launch it. From there, click on "New Item" and fill in the name (I personally use "up - servicename"). Then, from the job configuration page:

  1. under build triggers, tick "Build periodically". I personally trigger the job every 2 minutes with "H/2 * * * *"
  2. the build command looks like this:
curl -sL https://gitlab.com/users/sign_in | grep -q "About GitLab"
  3. then I use the "Hudson post build task" plugin to see what the error was, by configuring the log text field with "build as failure" and the script field with "curl -sL https://gitlab.com/users/sign_in"

It takes no more than 5 minutes to add another service. Do it several times and you end up with a dashboard that lets you know about the health of your installation.

6.2 Another point about backup

Another thing you might want to consider is not being vulnerable to ransomware. If your host pushes backups to a remote server, you're exposed to someone trashing all your backups. A better way to do it would be to have a separate machine connecting to the host, generating the backups and copying them over, without any way for the host to log in to the backup server.

6.3 Use a CDN

A CDN can significantly improve the perceived load times by caching the generated pages of your services on servers all around the world. I don't use one for my self hosting environment, as it means more maintenance and I'm fine with the loading speed of my apps. However, if you're interested in going down this road, you have several options to choose from:

  • Akamai
  • Cloudfront
  • Cloudflare
  • ….

6.4 https all the way down to your service

If you've followed this guide until now, you may have realised that connections are encrypted until they reach the reverse proxy, but the connection from your host machine to the container remains unencrypted. The risk here is that someone who owns the server could manipulate requests to all your services.

You can improve on this by proxying your requests over https. I haven't spent enough time on this to find a solution that relies on letsencrypt and doesn't end up requiring more maintenance, so I won't get into more details.

Author: Mickael KERJEAN

Created: 2017-08-02 Wed 19:34