Serving a blog with nginx

To host your website you will need a web server. My choice for hosting my Pelican blog is nginx. Why nginx? Because it’s light, fast with static files, popular, and needs few dependencies if you configure it minimally. Apache is another solid choice: it’s popular and proven, and there is loads of information about it on the web. There are others you can use too, and whichever is your favorite will probably do fine, because we are just serving static files.

I will do this in the jail I set up before, which means FreeBSD, and I like to keep it small by leaving out things I do not need. So I compile from Ports, but if you want to use packages, go right ahead; it makes no difference for the configuration part.

Install nginx

satyr /root >cd /usr/ports/www/nginx
satyr /usr/ports/www/nginx >make config install clean

I disable pretty much everything because I’m not going to use many special features. All I keep enabled at this moment:

  • [*] IPV6 for IPv6 support.
  • [*] HTTP because we want to serve over HTTP, remember.
  • [*] HTTP_GZIP_STATIC so we can pre-compress our files and have nginx serve them out.
  • [*] HTTP_REWRITE because it enables you to do cool things (rewrite rules, if-statements, etc.).

Configuring nginx

There are many sources on configuring nginx, and the project also has a decent wiki. I’ve used nginx before and I must say it’s relatively easy to configure well enough to get going. Fine-tuning and getting it perfect takes more time, but you can do that after your initial set-up.

To get you off to a good start I tinkered with this a bit, and found Calomel’s blog post about nginx to be a really good read.

Edit /usr/local/etc/nginx/nginx.conf with an editor of your choice. I have this in there, with some options commented out, either because they might require some system tuning on your end to be useful, or because they are only listed for completeness:

user  nobody;
worker_processes  1;                    # 1 worker can utilize 1 CPU core. 
pid        /var/run/nginx.pid;

events {
    use kqueue;                         # Important on FreeBSD for performance, similar to Linux epoll.
    worker_connections  256;            # 256 concurrent connections. I won't need more.
    accept_mutex off;                   # Turn this 'on' if worker_processes > 1.
                                        #  A single worker needs no mutex locking, so it accepts connections faster.
}

http {
    ### General options
    charset             utf-8;
    default_type        application/octet-stream;
    ignore_invalid_headers on;
    include             mime.types;
    keepalive_requests  20;             # Default: 100.
    max_ranges          0;              # Disabled range headers. We only serve very small static files.
    keepalive_timeout   300 300;        # Default: 75 (65 in the sample nginx.conf); made this bigger because
                                        #  creating new TCP connections is "expensive", and this helps to speed
                                        #  things along because open TCP connections can be re-used more often.
    output_buffers      1 256k;         # Use one output buffer of 256kB; only used when sendfile is off.
    #postpone_output     1440;           # Set this to your Maximum Segment size.
                                        #  Sysctl: net.inet.tcp.mssdflt; I set mine at 1440.
    recursive_error_pages on;           # Allow several redirects through the "error_page" directive.
    reset_timedout_connection on;       # Close stale connections
    sendfile            off;            # Default: on; I disable to prevent redundant caching, because I use ZFS.
    #sendfile_max_chunk 512K;            # Only used when sendfile=on; Set accordingly, see documentation.
    server_tokens       off;            # Disables showing the nginx version number in auto generated error pages.
    tcp_nodelay         on;             # Disable Nagle buffering algorithm. Helps sending small objects quickly.
    tcp_nopush          off;            # Default: off; Cannot be used without sendfile() anyway.

    ### Limit requests
    # Works together with "limit_req" in the server{..} block to limit requests per IP.
    limit_req_zone $binary_remote_addr zone=blog:1m rate=60r/m;

    ### Compression
    gzip  on;                           # On-the-fly compression. Static HTML and CSS compress well.
                                        #  Create static .gz files later to save CPU cycles.
                                        #  Compressing on-the-fly also adds delay! 
    #gzip_static       on;              # Enable this when you have .gz versions of your files in place.
    #gzip_buffers      16 8k;           # No need to change this anymore. 
                                        # Default has changed to 32 4K or 16 8K, since version 0.7.28.
    #gzip_comp_level   1;               # Lvl 1 is a good default for on-the-fly compression.
    #gzip_http_version 1.0;             # Set this to enable compression for HTTP 1.0 requests as well.
                                        #  This can break keepalives if set. Check your traffic and decide.
    gzip_min_length   860;              # Compress files larger than 860 bytes. Disable when using gzip_static. 
                                        #  Akamai uses 860 bytes as minimum to start compressing.
                                        #  Google recommends starting at 150 bytes, and although gzip starts 
                                        #  decreasing the size at that range, it's not yet worth the effort imo.
    gzip_types        text/plain text/html text/css image/x-icon image/bmp;     # Only compress these types.
    gzip_vary         on;               # Send "Vary: Accept-Encoding" so proxies cache both variants.

    ### Log formatting
    log_format  main  '$remote_addr $host $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_time $gzip_ratio';

    ### Server
    server {
        add_header  Cache-Control public;                       # Tell clients/proxies it's OK to cache.
        expires     max;                                        # Cache as long as possible.

        access_log  /var/log/nginx/access.log main buffer=32k;  # Remove buffer when debugging!
        error_log   /var/log/nginx/error.log error;

        limit_req   zone=blog burst=200 nodelay;                # Limit requests per IP.
        listen      80 rcvbuf=1k sndbuf=128k backlog=128;       # Add an IP in front of the port number
                                                                #  to bind to a specific IP.
        root        /path/to/your/pelican/output;           # root from where nginx will serve files.
        server_name yourdomain.com www.yourdomain.com;
    }
}

I left out a couple of if-statements because they are not required. The ones I use are adaptations of examples provided by Calomel’s blog about nginx, so have a look around there and grab things you can use.
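The commented-out gzip_static option in the configuration above only helps once a .gz copy sits next to each file. Here is a minimal sketch of how those copies could be made after Pelican has generated the site. The helper name precompress_dir and the list of extensions are my own assumptions, so adjust them to what your site actually contains.

```shell
#!/bin/sh
# Sketch: create .gz copies next to compressible files so gzip_static can serve them.
# precompress_dir is a hypothetical helper name, not part of nginx or Pelican.
precompress_dir() {
    dir="$1"
    # Only text formats are picked up here; PNG and JPEG images are already compressed.
    find "$dir" -type f \( -name '*.html' -o -name '*.css' -o -name '*.txt' \) |
    while IFS= read -r file; do
        # -9: best compression, affordable because it runs once, offline.
        # -c: write to stdout so the original file is kept.
        gzip -9 -c "$file" > "$file.gz"
        # Give the .gz copy the same modification time as the original.
        touch -r "$file" "$file.gz"
    done
}

# Usage (the path is the placeholder from the configuration above):
# precompress_dir /path/to/your/pelican/output
```

Run it again after every site rebuild, otherwise nginx may keep serving a stale .gz copy of a page you have changed.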

Don’t forget to create the directory where you want nginx to write its logs, if it doesn’t exist yet. Mine didn’t.

satyr /root >mkdir /var/log/nginx

Start nginx

Now you are ready to enable nginx in rc.conf so we can start it for the first time. Edit /etc/rc.conf with your favorite editor and add:

# Start nginx web server
nginx_enable="YES"
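If you would rather not open an editor, the same line can be added from the shell. On FreeBSD, sysrc nginx_enable=YES does this for you. The sketch below shows the same idea in plain sh, adding the line only when it is not there yet. The helper name enable_in_rc_conf is made up for this example.

```shell
#!/bin/sh
# Sketch: add nginx_enable="YES" to an rc.conf file unless it is already set.
# enable_in_rc_conf is a hypothetical helper; the file to edit is passed as $1.
enable_in_rc_conf() {
    rc_conf="$1"
    if grep -q '^nginx_enable=' "$rc_conf" 2>/dev/null; then
        echo "nginx_enable already set in $rc_conf"
    else
        printf '# Start nginx web server\nnginx_enable="YES"\n' >> "$rc_conf"
        echo "added nginx_enable to $rc_conf"
    fi
}

# Usage:
# enable_in_rc_conf /etc/rc.conf
```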

Now the nginx web server will be started whenever this machine (or jail, in my case) starts. Start nginx by hand now to test it:

satyr /root > /usr/local/etc/rc.d/nginx start
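You can also have nginx check the configuration file without starting anything, using its -t switch. A small sketch follows. The NGINX_BIN variable and the helper name check_nginx_config are assumptions of mine, there only so the binary can be swapped out. The default path is the one used in this post.

```shell
#!/bin/sh
# Sketch: test an nginx configuration file before (re)starting the server.
# NGINX_BIN is an assumption of this sketch; it defaults to "nginx" from PATH.
check_nginx_config() {
    conf="${1:-/usr/local/etc/nginx/nginx.conf}"
    # -t: test the configuration and exit; -c: use this configuration file.
    if "${NGINX_BIN:-nginx}" -t -c "$conf"; then
        echo "config ok: $conf"
    else
        echo "config has errors: $conf" >&2
        return 1
    fi
}

# Usage: only start nginx when the check passes.
# check_nginx_config && /usr/local/etc/rc.d/nginx start
```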

If something is wrong with your configuration file, nginx will tell you and refuse to run. It also reports the line number that causes the problem, so you know where to look. With the configuration file above I keep getting a warning, which does not stop nginx from running but looks odd at first:

nginx: [warn] duplicate MIME type "text/html" in /usr/local/etc/nginx/nginx.conf:54

The likely culprit is the gzip_types directive: nginx always compresses text/html responses when gzip is on, so listing text/html there again counts as a duplicate. Removing text/html from gzip_types should make the warning go away.

Test

Now that nginx is running, you should test whether your site works correctly. Point your browser at your domain, or the IP address of your server, and cross your fingers. Result?
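Besides looking at the page in a browser, you can check the response headers to see whether the settings took effect. The sketch below reads headers on standard input and reports whether the response was gzipped and whether the nginx version number is hidden. The helper name check_headers is my own invention. Keep in mind that with gzip_min_length set to 860, a very small page will legitimately come back uncompressed.

```shell
#!/bin/sh
# Sketch: inspect HTTP response headers read from stdin.
# check_headers is a hypothetical helper, not an nginx or curl feature.
check_headers() {
    headers=$(tr -d '\r')   # strip the carriage returns HTTP headers end with
    status=0
    if printf '%s\n' "$headers" | grep -qi '^content-encoding: *gzip'; then
        echo "gzip: yes"
    else
        echo "gzip: no"
        status=1
    fi
    # With server_tokens off the header reads "Server: nginx", without "/version".
    if printf '%s\n' "$headers" | grep -qi '^server: *nginx/'; then
        echo "version hidden: no"
        status=1
    else
        echo "version hidden: yes"
    fi
    return $status
}

# Usage: do a normal GET, throw away the body, print the headers to stdout.
# curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://yourdomain.com/ | check_headers
```

The usage line sends a GET request rather than a HEAD request, so that the body actually passes through the compression step.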

Further tweaking

You can optimize your web server further if you want. This can involve checking your logs, running tests and benchmarks to compare different settings, and more. Search around, test and try things. Calomel’s post about nginx also contains more on this, so have a look!

Log rotation

When you have access_log enabled, the log grows every time somebody visits your website, so over time this file can get pretty big. To keep this from getting out of control, especially if your website is publicly available, you want to set up log rotation. This keeps an eye on your log file, and when it reaches a certain size or a specific time, the file is renamed and compressed, and a new log file is started. It also keeps a set number of rotated files and deletes the older ones.

In FreeBSD a good way to do this is to edit /etc/newsyslog.conf and add the following entries for the nginx log files:

/var/log/nginx/access.log               644  7     1024 *     JB     /var/run/nginx.pid  30
/var/log/nginx/error.log                644  7     1024 *     JB     /var/run/nginx.pid  30

I will briefly explain each field:

  • /var/log/nginx/access.log is the log file to watch.
  • 644 is the mode to set on the file, which gives -rw-r--r-- permissions.
  • 7 is the number of rotated logs to keep.
  • 1024 is the size, in kilobytes, at which the log file is rotated (so about 1 MB).
  • * is the “when” column; setting it to * makes rotation depend solely on the size field.
  • JB are the flags: compress rotated files with bzip2 (J), and don’t write a “logfile turned over” line (B).
  • /var/run/nginx.pid is the PID file of nginx, so newsyslog can signal nginx when rotation happens.
  • 30 is the signal to send to the process; 30 is the number of SIGUSR1 on FreeBSD.
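To make the column order easier to remember, here is a small sketch that splits one entry into named fields. The helper name describe_entry is made up, and it assumes the optional owner:group column is left out, as in the entries above.

```shell
#!/bin/sh
# Sketch: label the columns of one newsyslog.conf entry.
# describe_entry is a hypothetical helper, not part of newsyslog.
# It assumes the optional owner:group column is absent.
describe_entry() {
    printf '%s\n' "$1" | {
        # read splits on whitespace and does not glob-expand the "*" in the when column.
        read -r logfile mode count size when flags pidfile signal
        printf 'logfile=%s\n' "$logfile"
        printf 'mode=%s\n' "$mode"
        printf 'count=%s\n' "$count"
        printf 'size_kb=%s\n' "$size"      # the size column is in kilobytes
        printf 'when=%s\n' "$when"
        printf 'flags=%s\n' "$flags"
        printf 'pidfile=%s\n' "$pidfile"
        printf 'signal=%s\n' "$signal"
    }
}

# Usage:
# describe_entry '/var/log/nginx/access.log 644 7 1024 * JB /var/run/nginx.pid 30'
```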

See the FreeBSD newsyslog.conf manual for more information about this file.

Note that using compression on the access logs might interfere with tools that analyze such files. If you run such tools, it is best to disable compression.

The signal to use (SIGUSR1) comes from the nginx documentation about Log Rotation and I looked up the numerical translation of it in the FreeBSD signal manual.
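The number can also be checked from the shell instead of the manual page, because kill -l with a number prints the matching signal name. A tiny sketch, with signal_name as a made-up wrapper name:

```shell
#!/bin/sh
# Sketch: translate a signal number into its name on the current system.
# signal_name is a hypothetical wrapper around the standard "kill -l <number>".
signal_name() {
    kill -l "$1"
}

# Usage: on FreeBSD this should report the name for SIGUSR1.
# signal_name 30
```

Signal numbers differ between operating systems, so run this on the machine where newsyslog runs. On Linux, for example, SIGUSR1 is 10 and not 30.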

After editing, you do not have to restart newsyslog because it is run by cron by default. Take a look at /etc/crontab for confirmation.

satyr /root >grep newsyslog /etc/crontab
0       *       *       *       *       root    newsyslog

As you can see, newsyslog is run every hour to see if any log file needs to be rotated.
