New Django Server Setup: Part 2

System Setup

To make the OS usable for our System Administrators, we need to set up the system and its users. We start by creating a "worker" UNIX group and a UNIX user for each Imaginary Landscape staff member. Each staff user is added to the worker group and is given sudo access. In addition, a umask of 002 is appended to /etc/profile. I'll explain the significance of these settings later.

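    # Create the shared group, then a staff user with a home directory (-m) and a bash shell
    sudo groupadd worker
    sudo useradd -m -d /iscape/home/user1 -s /bin/bash user1
    # Add the user to the worker and sudo groups
    sudo usermod -aG worker user1
    sudo usermod -aG sudo user1
    # Make new files group-writable by default for login shells
    sudo sh -c 'echo "umask 002" >> /etc/profile'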

We copy each user's SSH public key into the ~/.ssh/authorized_keys file under their home directory. This allows users to log in securely with SSH keys instead of using password authentication.

Finally, we need to create an application user. This is a UNIX user account that will own all of the Django application files and will be used to run the project's Django/Gunicorn process.

       sudo useradd application_user
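
The key-installation step described above can be sketched in Python. This is a minimal illustration, not the procedure actually used on these servers: the function name install_public_key and the sample key string are hypothetical, and ownership (chown to the target user) is left out. The explicit chmod calls matter because a umask of 002 would otherwise leave these files group-writable, which sshd typically rejects when StrictModes is enabled.

```python
import os
import tempfile
from pathlib import Path


def install_public_key(home: Path, public_key: str) -> Path:
    """Add public_key to <home>/.ssh/authorized_keys unless it is already present."""
    ssh_dir = home / ".ssh"
    ssh_dir.mkdir(exist_ok=True)
    os.chmod(ssh_dir, 0o700)  # set explicitly so the result does not depend on the umask

    auth_file = ssh_dir / "authorized_keys"
    lines = auth_file.read_text().splitlines() if auth_file.exists() else []
    key = public_key.strip()
    if key not in lines:
        lines.append(key)
    auth_file.write_text("\n".join(lines) + "\n")
    os.chmod(auth_file, 0o600)
    return auth_file


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than a real home directory.
    demo_home = Path(tempfile.mkdtemp())
    path = install_public_key(demo_home, "ssh-ed25519 PLACEHOLDERKEY user1@example")
    print(path, oct(path.stat().st_mode & 0o777))
```

Calling the function a second time with the same key leaves the file unchanged, so it is safe to re-run during provisioning.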

Site Setup

We need a location on the server to hold all of our project files and, at Imaginary Landscape, we've defined that location to be /iscape/sites/. As a general rule, all of our code, configurations, logs, home directories, and Python modules are located under this directory, though we occasionally break this rule depending on the needs of the situation. This gives us a single place to find all of the relevant application files and a single target for filesystem backups. Inside this directory is another directory, named after the site, which we call the ENVIRONMENT_ROOT. For example, if we had a project called "my_site", the ENVIRONMENT_ROOT directory would be as follows:

/iscape/sites/my_site/

This allows us to define multiple sites per server, though we typically run only one site per server. The ENVIRONMENT_ROOT is also the root of the virtualenv. If you are unfamiliar with virtualenv, it is a special directory structure that provides a separate Python install and isolates all of the project-specific Python dependencies from the operating-system-level Python modules. As a result, installing or upgrading Python modules at the system level won't affect the Python modules needed by the project. You can read more about virtualenv at www.virtualenv.org.

Inside a given ENVIRONMENT_ROOT directory, we create the following sub-directories. For the duration of this example, I'll use the double-bracket syntax (e.g. {{site_name}}) to mark a placeholder that you would replace with a value of your choosing.

    /iscape/sites/{{site_name}}/
        proj/{{proj_name}}/    : Django Project Source Code Files
                                    Also the root .git directory for the site
                                    projname is the name of the project as 
        bin/                   : Scripts and Binaries for Virtualenv
        data/                  : Datadumps, sqlite databases, etc 
        etc/nginx/             : Nginx configuration directory
        htdocs/                : Directory that contains django static files copied in 
                                    from the Django modules and django project
        lib/                   : Contains python modules installed via pip
        var/log/               : Stores all site logs
        var/run/               : Stores the pid file and the socket file 
        src/                   : Stores 3rd party source code pulled from git/svn
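
The layout above could be created with a short script along the following lines. This is only a sketch: the function name create_environment_root is hypothetical, and bin/ and lib/ are deliberately omitted on the assumption that virtualenv creates them when the environment is built.

```python
import tempfile
from pathlib import Path

# Sub-directories from the layout above, relative to ENVIRONMENT_ROOT.
# bin/ and lib/ are assumed to be created by virtualenv, so they are not listed.
SUBDIRS = ["data", "etc/nginx", "htdocs", "var/log", "var/run", "src"]


def create_environment_root(sites_dir: Path, site_name: str, proj_name: str) -> Path:
    """Create the site skeleton and return the ENVIRONMENT_ROOT path."""
    env_root = sites_dir / site_name
    for rel in SUBDIRS + [f"proj/{proj_name}"]:
        (env_root / rel).mkdir(parents=True, exist_ok=True)
    return env_root


if __name__ == "__main__":
    # A temporary directory stands in for /iscape/sites/ in this demonstration.
    root = create_environment_root(Path(tempfile.mkdtemp()), "my_site", "my_site")
    for path in sorted(p for p in root.rglob("*") if p.is_dir()):
        print(path.relative_to(root))
```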

Project Directory

As noted above, the actual Django project source code files are located in the following directory:

    /iscape/sites/{{site_name}}/proj/{{proj_name}}/

This is also the root of the Git repository and is referred to as the PROJECT_ROOT. In almost all cases, we name the {{proj_name}} the same name as the {{site_name}}. We'll cover the Django project code layout in more detail later.

Nginx Configuration Directory

The Nginx configuration files for a project are located in /iscape/sites/{{site_name}}/etc/nginx/. This directory typically contains two files: server.conf and locations.conf. The server.conf sets the configuration for the site-specific virtual servers. Traditionally, we define a virtual server to respond to http traffic on port 80 and a virtual server to respond to https traffic on port 443. Both of these virtual server blocks contain an Nginx include which includes the locations.conf file. Both also define reverse-proxy logic to direct dynamic traffic to the Django/Gunicorn process. Below is an example of a virtual server definition for HTTP traffic.

    upstream app_server {
       server unix:/iscape/sites/{{site_name}}/var/run/wsgi.socket fail_timeout=0;
    }
    
    server {
        listen 80;
    
        # Set the default document root for this server to the htdocs directory
        root /iscape/sites/{{site_name}}/htdocs;
        # Include locations.conf file containing definitions for static media
        include /iscape/sites/{{site_name}}/etc/nginx/locations.conf;
   
        location / {
             # checks for static file, if not found proxy to app
             try_files $uri @proxy_to_app;
        }
    
        location @proxy_to_app {
             proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
             proxy_set_header X-Forwarded-Protocol $scheme;
             proxy_set_header X-Real-IP $remote_addr;
             proxy_set_header Host $http_host;
             proxy_set_header X-Scheme $scheme;
             proxy_redirect off;
             proxy_pass   http://app_server;
        }
    }

The locations.conf file contains specific location directives that serve up static content under the /static/ and /media/ urls. When all is said and done, static content (js/css/images/etc) will be served by the /static location directive, and all user-uploaded media will be served by the /media location directive. Below is an example of the locations.conf configuration file.

    location /static {
      root /iscape/sites/{{site_name}}/htdocs;
      access_log off;
    }
    location /media {
      root /iscape/sites/{{site_name}}/htdocs;
      access_log off;
    }
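
To see which file a request ends up at, it helps to remember that Nginx's root directive appends the full request URI to the root path (unlike alias, which replaces the matched prefix). The sketch below mimics that mapping; the function name resolve_with_root is hypothetical and the code ignores details such as URI decoding and path traversal checks.

```python
def resolve_with_root(root: str, uri: str) -> str:
    """Mimic Nginx's `root` directive: the request URI is appended to root."""
    return root.rstrip("/") + "/" + uri.lstrip("/")


if __name__ == "__main__":
    htdocs = "/iscape/sites/my_site/htdocs"
    print(resolve_with_root(htdocs, "/static/css/site.css"))
    print(resolve_with_root(htdocs, "/media/uploads/photo.jpg"))
```

Under this configuration, a request for /static/css/site.css is therefore looked up in htdocs/static/css/site.css, which suggests that static and media files need to live in static/ and media/ sub-directories of htdocs.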

As mentioned before, the default system-wide Nginx configuration is installed inside of the /etc/nginx/ directory. The top-level Nginx configuration file is located at /etc/nginx/nginx.conf. This file is responsible for setting up the global Nginx configuration, and it is what includes all of the per-site server configurations found in the /etc/nginx/conf.d/ directory.

There are a few things to note about this top-level configuration file. First, by default, we set the value of the worker_processes directive to the number of CPU cores on the server. This allows Nginx to take better advantage of multi-core systems. Second, we set "daemon off" at the top of this configuration file to ensure that Nginx doesn't start as a background process. We do this because Supervisor will manage the daemonization of Nginx.
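
As a small illustration of the first point, the two directives could be generated from the machine's core count as sketched below. The helper name is hypothetical. Recent Nginx versions also accept "worker_processes auto;", which has Nginx detect the core count itself.

```python
import os


def worker_processes_directive() -> str:
    """Return a worker_processes line matching the number of CPU cores."""
    cores = os.cpu_count() or 1  # cpu_count() can return None on unusual platforms
    return f"worker_processes {cores};"


if __name__ == "__main__":
    print(worker_processes_directive())
    print("daemon off;")  # keep Nginx in the foreground so Supervisor can manage it
```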

Inside of the /etc/nginx directory, there exists a subdirectory called conf.d/. This subdirectory exists so that users may insert into it individual Nginx site configuration files like the one that we created called server.conf. Adding files to this directory automatically includes them as part of the Nginx configuration. However, instead of copying the server.conf into this directory, we symlink the server.conf into this conf.d/ directory (and we name the symlink {{site_name}}.conf).

    sudo ln -s /iscape/sites/{{site_name}}/etc/nginx/server.conf \
        /etc/nginx/conf.d/{{site_name}}.conf
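
If this step is scripted, it is convenient to make it safe to re-run. The following is a minimal sketch of an idempotent version of the symlink step; the function name link_site_config is hypothetical, and the demonstration uses temporary directories in place of the real /iscape and /etc/nginx paths.

```python
import os
import tempfile
from pathlib import Path


def link_site_config(server_conf: Path, conf_d: Path, site_name: str) -> Path:
    """Symlink server_conf into conf_d as <site_name>.conf, replacing a stale link."""
    link = conf_d / f"{site_name}.conf"
    if link.is_symlink():
        if Path(os.readlink(link)) == server_conf:
            return link  # already points at the right file
        link.unlink()  # points somewhere else, so replace it
    # A regular file with the same name is left alone; symlink_to raises FileExistsError.
    link.symlink_to(server_conf)
    return link


if __name__ == "__main__":
    tmp = Path(tempfile.mkdtemp())
    conf = tmp / "server.conf"
    conf.write_text("# placeholder server block\n")
    conf_d = tmp / "conf.d"
    conf_d.mkdir()
    print(link_site_config(conf, conf_d, "my_site"))
```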

Var Directory

The /iscape/sites/{{site_name}}/var/log/ directory stores all site-specific log files. This includes any logs produced by the Gunicorn process, the Nginx access log, the Nginx error log, and all other logs relevant to the project specifically. This gives us one location where we can find logs for a given site. However, logs for system-level services, such as Postgres or SSH, will still reside in their default location on the system.

The /iscape/sites/{{site_name}}/var/run/ directory contains the pid file for the given Gunicorn process. This pid file is used by the running Gunicorn process to help manage itself. This directory also contains the Unix domain socket file that Nginx and Django/Gunicorn use to communicate with each other. As mentioned before, Nginx communicates with Django/Gunicorn via a reverse proxy mechanism, and this socket file is the actual channel that carries that communication.

Note that there are two typical approaches to establishing communication between Nginx and Gunicorn: Unix domain sockets, as mentioned above, and TCP sockets, which use a TCP connection. We have chosen the Unix domain socket approach because it is reportedly faster and it avoids the need to use a TCP port on the system.
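
For readers who have not worked with Unix domain sockets directly, the sketch below shows the basic mechanics using Python's standard socket module: the server binds to a filesystem path rather than a host and port, and the client connects to that same path. It is a toy round trip in a single process, not how Nginx or Gunicorn are implemented, and the function name is hypothetical.

```python
import os
import socket
import tempfile


def unix_socket_round_trip(socket_path: str) -> bytes:
    """Send b"ping" over a Unix domain socket and return the upper-cased reply."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(socket_path)  # creates the socket file on disk
    server.listen(1)

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(socket_path)  # addressed by path; no TCP port is involved
    conn, _ = server.accept()

    client.sendall(b"ping")
    request = conn.recv(4)
    conn.sendall(request.upper())
    reply = client.recv(4)

    for sock in (conn, client, server):
        sock.close()
    os.unlink(socket_path)  # the socket file is not removed automatically
    return reply


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "wsgi.socket")
    print(unix_socket_round_trip(path))
```

Because the socket is a file, access to it is governed by ordinary filesystem permissions, which is one reason it sits under var/run/ inside the ENVIRONMENT_ROOT.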

Site Permissions

It's very important that the ENVIRONMENT_ROOT has the correct permissions. Every file and directory under this hierarchy is owned by the application user and has "worker" as its group. Every directory in this hierarchy has the setgid bit set, which ensures that new files and directories created inside it inherit the "worker" group. Also, recall that we set the umask in /etc/profile to 002. This ensures that all users create files and directories with group write permission. Therefore, staff members in the "worker" group have the ability to edit any file under ENVIRONMENT_ROOT. The following commands will set the permissions correctly.

    sudo chown -R {{application_user}}:worker /iscape/sites/{{site_name}}
    sudo chmod -R g+w /iscape/sites/{{site_name}}
    sudo find /iscape/sites/{{site_name}} -type d -exec chmod g+s {} \;
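
The effect of the umask can be checked with a little arithmetic: the mode a program requests is masked by the complement of the umask. The sketch below shows the calculation and then confirms it by actually creating a file; the function names are hypothetical.

```python
import os
import stat
import tempfile


def mode_after_umask(requested: int, umask: int) -> int:
    """Permission bits that survive after the umask is applied."""
    return requested & ~umask


def create_with_umask(directory: str, umask: int) -> int:
    """Create a file requesting mode 666 under the given umask; return its real mode."""
    previous = os.umask(umask)
    try:
        path = os.path.join(directory, "example.txt")
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
        os.close(fd)
        return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(previous)  # restore the process umask


if __name__ == "__main__":
    print(oct(mode_after_umask(0o666, 0o002)))  # files with umask 002: 0o664
    print(oct(mode_after_umask(0o777, 0o002)))  # directories with umask 002: 0o775
    print(oct(mode_after_umask(0o666, 0o022)))  # files with the common 022 default: 0o644
    print(oct(create_with_umask(tempfile.mkdtemp(), 0o002)))
```

With the common default umask of 022, new files would come out as 644 and other members of the "worker" group could not edit them, which is why the umask is changed to 002.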

Generally speaking, we try to give each file and directory the minimum permissions necessary. We try to abide by the following permission guidelines.

Python (*.py) files should be set to 664 UNLESS a user is to execute the file directly from the command line as a script (e.g. manage.py). Executable Python scripts should be set to 775.
Script (*.sh) files should be set to 775.
Static files (*.html, *.css, *.jpg, *.png, etc) should be set to 664.
Directories should be set to 2775 (setgid bit set).
All other files should be set to 664 unless there's a good reason not to do so.
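
These guidelines are easy to turn into an audit script. The sketch below is one possible reading of them, not a tool we actually ship: the function names and the set of executable script names are assumptions, and files with a good reason to differ (for example, binaries in the virtualenv's bin/ directory) will be reported as mismatches and need to be reviewed by hand.

```python
import os
import stat
import tempfile
from pathlib import Path

# Python files that are meant to be run directly; an assumption for this sketch.
EXECUTABLE_SCRIPTS = frozenset({"manage.py"})


def expected_mode(path: Path) -> int:
    """Mode suggested by the guidelines above for this path."""
    if path.is_dir():
        return 0o2775  # rwxrwsr-x: group-writable with the setgid bit
    if path.suffix == ".sh" or path.name in EXECUTABLE_SCRIPTS:
        return 0o775
    return 0o664


def find_mismatches(root: Path):
    """Yield (path, actual_mode, expected_mode) for entries that differ."""
    for path in sorted(root.rglob("*")):
        if path.is_symlink():
            continue  # a symlink's own mode is not meaningful here
        actual = stat.S_IMODE(path.stat().st_mode)
        wanted = expected_mode(path)
        if actual != wanted:
            yield path, actual, wanted


if __name__ == "__main__":
    demo = Path(tempfile.mkdtemp())
    (demo / "settings.py").write_text("")
    os.chmod(demo / "settings.py", 0o600)
    for path, actual, wanted in find_mismatches(demo):
        print(f"{path}: {oct(actual)} should be {oct(wanted)}")
```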

Supervisor Configuration

We use the Supervisor daemon to manage the running site processes. On our systems, Supervisor manages Nginx, the Django Gunicorn processes, and any other related, long-running processes, such as queue worker processes. These are all processes that a Systems Administrator might expect to restart from time to time and Supervisor helps with this management. Supervisor not only consolidates the related application processes into a nicely defined group and provides a common method to configure each process, but it also can auto-restart processes if any should crash.

Here is an example of the Nginx Supervisor configuration, located in the /etc/supervisor/conf.d/nginx.conf file.

    [program:nginx]
    command=nginx -c /etc/nginx/nginx.conf
    directory=/etc/nginx
    user=root
    autostart=true
    autorestart=true
    redirect_stderr=True

Here is an example of the Django Supervisor configuration, located in the /etc/supervisor/conf.d/django.conf file.

    [program:{{site_name}}-django]
    command=/iscape/sites/{{site_name}}/bin/gunicorn {{proj_name}}.wsgi:application -c /iscape/sites/{{site_name}}/etc/gunicorn/gunicorn.py
    environment=PATH="/iscape/sites/{{site_name}}/bin",
                PYTHONPATH="/iscape/sites/{{site_name}}/proj/{{proj_name}}/"
    autostart=true
    startretries=5
    autorestart=unexpected
    user={{application_user}}
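
Since these configuration files are written as templates with double-bracket placeholders, a small helper can fill them in for a particular site. The sketch below is one hypothetical way to do that with a regular expression; the function name render and the sample values are assumptions, and it deliberately raises an error when a placeholder has no value rather than leaving it in the output.

```python
import re

# Matches {{name}} with optional spaces inside the brackets.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Replace every {{name}} placeholder in template with values[name]."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for {{{{{name}}}}}")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    template = (
        "[program:{{site_name}}-django]\n"
        "command=/iscape/sites/{{site_name}}/bin/gunicorn {{proj_name}}.wsgi:application\n"
    )
    print(render(template, {"site_name": "my_site", "proj_name": "my_site"}))
```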

The example above references a gunicorn.py file, which serves as the configuration file for the Gunicorn process. This file identifies the Unix domain socket file that Gunicorn will listen on and the pid file that will be created for the process.

This example is located at: /iscape/sites/{{site_name}}/etc/gunicorn/gunicorn.py

    bind = "unix:/iscape/sites/{{site_name}}/var/run/wsgi.socket"
    workers = 3
    daemon = False
    pidfile = "/iscape/sites/{{site_name}}/var/run/gunicorn.pid"
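
As a variation on the file above, the paths can be derived from a single site root and the worker count from the number of CPU cores. This is only a sketch: the SITE_ROOT environment variable and its default value are assumptions, and the (2 x cores) + 1 figure is the starting point suggested in the Gunicorn documentation rather than the fixed value of 3 used above.

```python
import multiprocessing
import os

# SITE_ROOT is a hypothetical environment variable; the default is only an example.
SITE_ROOT = os.environ.get("SITE_ROOT", "/iscape/sites/my_site")
RUN_DIR = os.path.join(SITE_ROOT, "var", "run")

# Listen on the Unix domain socket that the Nginx upstream block points at.
bind = "unix:" + os.path.join(RUN_DIR, "wsgi.socket")
pidfile = os.path.join(RUN_DIR, "gunicorn.pid")

# Starting point from the Gunicorn docs; tune for the actual workload.
workers = multiprocessing.cpu_count() * 2 + 1

# Stay in the foreground so that Supervisor can manage the process.
daemon = False
```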

We do not use Supervisor to manage other system-level processes, such as PostgreSQL or Redis. We rarely need to manage or restart these services, and the default init scripts or Upstart configurations are sufficient for managing them.

Part 3 of this series will focus on Django project-related configuration and defaults.
