Nginx - Additional Modules, About Your Visitors

The following set of modules provides extra functionality that helps you learn more about your visitors, for example by parsing client request headers for the browser name and version, assigning an identifier to requests that share common traits, and so on.

Browser

The Browser module parses the User-Agent HTTP header of the client request in order to establish values for variables that can be employed later in the configuration. The three variables produced are:

  • $modern_browser: If the client browser is identified as being a modern web browser, the variable takes the value defined by the modern_browser_value directive.
  • $ancient_browser: If the client browser is identified as being an old web browser, the variable takes the value defined by ancient_browser_value.
  • $msie: This variable is set to 1 if the client is using a Microsoft IE browser.

To help Nginx recognize web browsers, telling the old from the modern, you need to insert multiple occurrences of the ancient_browser and modern_browser directives:

modern_browser opera 10.0;

With this example, if the User-Agent HTTP header identifies Opera version 10.0 or later, the client browser is considered modern.
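
As a minimal sketch of how these directives fit together, the following configuration classifies a few browsers and serves a fallback page to ancient ones. The version thresholds, the /legacy.html page, and the choice to treat unlisted browsers as modern are assumptions made for illustration only.

```nginx
# Inside the http block.
# Browsers at or above these versions are treated as modern (thresholds are illustrative).
modern_browser msie   9.0;
modern_browser gecko  1.9;
modern_browser opera  10.0;
modern_browser safari 413;

# Browsers matched by no rule are treated as modern instead of ancient.
modern_browser unlisted;

# Any User-Agent containing one of these substrings is treated as ancient.
ancient_browser Links Lynx netscape4;

server {
    listen 80;

    # Exact-match location for the fallback page (hypothetical file),
    # so the internal rewrite below cannot loop.
    location = /legacy.html {
    }

    location / {
        if ($ancient_browser) {
            rewrite ^ /legacy.html last;
        }
    }
}
```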

Map

Just like the Browser module, the Map module allows you to create maps of values depending on a variable:

map $uri $variable {

  /page.html 0;

  /contact.html 1;

  /index.html 2;

  default 0;

}

rewrite ^ /index.php?page=$variable;

Note that the map directive can only be inserted within the http block. Following this example, $variable may take one of three values: 0 if $uri is /page.html, 1 if $uri is /contact.html, and 2 if $uri is /index.html. In all other cases (default), $variable is set to 0. The last instruction rewrites the URL accordingly.

Apart from default, the map directive accepts another special keyword: hostnames. It allows you to match hostnames using wildcards such as *.domain.com.
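
The hostnames keyword might be used as in the sketch below. The $site variable and the example.com and example.org names are placeholders chosen for illustration.

```nginx
# Inside the http block.
map $host $site {
    hostnames;                    # enable wildcard matching on the entries below

    default          unknown;
    example.com      main;        # exact name
    *.example.com    subdomain;   # any subdomain of example.com
    .example.org     org;         # example.org and any of its subdomains
}
```

The leading-dot form is shorthand for listing both the bare name and its wildcard variant.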

Two additional directives allow you to tweak the way Nginx manages the mechanism in memory:

  • map_hash_max_size: Sets the maximum size of the hash table holding a map
  • map_hash_bucket_size: Sets the bucket size of that hash table, which limits the size of a single entry in the map
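
These two directives are typically only touched when a map grows large or contains long keys and Nginx reports at startup that it cannot build the map hash. A minimal sketch follows; the values are arbitrary examples, not recommendations.

```nginx
# Inside the http block, alongside the map blocks they affect.
map_hash_max_size    4096;   # upper bound on the size of the hash table
map_hash_bucket_size 128;    # bucket size; raise it when map keys are long
```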

Regular expressions may also be used in patterns if you prefix them with ~ (case sensitive) or ~* (case insensitive):

map $http_referer $ref {

  ~google "Google";

  ~*yahoo "Yahoo";

  \~bing "Bing"; # not a regular expression due to the \ before the tilde

  default $http_referer; # variables may be used

}
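
One possible use of such a map is to record a short label in the access log instead of the full referrer. The sketch below assumes a hypothetical log file path and a variable named $ref_source; both are illustrative.

```nginx
# Inside the http block.
map $http_referer $ref_source {
    default   "other";
    ""        "direct";    # empty or missing Referer header
    ~*google  "Google";    # case-insensitive regular expressions
    ~*bing    "Bing";
}

# Custom log format that appends the label computed by the map.
log_format with_source '$remote_addr [$time_local] "$request" $status $ref_source';

server {
    listen 80;
    access_log /var/log/nginx/referers.log with_source;   # hypothetical path
}
```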

Geo

The purpose of this module is to provide functionality that is quite similar to the map directive: assigning a value to a variable based on client data (in this case, the IP address). The syntax differs slightly in that you are allowed to specify address ranges (in CIDR format):

geo $variable {

  default unknown;

  127.0.0.1 local;

  123.12.3.0/24 uk;

  92.43.0.0/16 fr;

}

Note that the above block is being presented to you just for the sake of the example and does not actually detect U.K. and French visitors; you'll want to use the GeoIP module if you wish to achieve proper geographical location detection. In this block, you may insert a number of directives that are specific to this module:

  • delete: Allows you to remove the specified subnetwork from the mapping.
  • default: The default value given to $variable in case the user's IP address does not match any of the specified IP ranges.
  • include: Allows you to include an external file.
  • proxy: Defines a subnet of trusted addresses. If the user IP address is among the trusted addresses, the value of the X-Forwarded-For header is used as the IP address instead of the socket IP address.
  • proxy_recursive: Enables recursive address search. When the X-Forwarded-For header lists several addresses, the last address that is not trusted is used, instead of simply the last address in the header.
  • ranges: If you insert this directive as the first line of your geo block, it allows you to specify IP ranges instead of CIDR masks. The following syntax is thus permitted: 127.0.0.1-127.0.0.255 LOCAL;
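
The directives above can be combined as in the following sketch. All addresses, the variable names, and the geo-networks.conf file are hypothetical; the included file is assumed to define an entry for 10.99.0.0/16 among others.

```nginx
# Inside the http block.

# Range syntax: "ranges" must be the first line of the block.
geo $network {
    ranges;
    default                      unknown;
    127.0.0.0-127.0.0.255        local;
    10.0.0.0-10.0.255.255        office;
}

# CIDR syntax behind a trusted proxy.
geo $zone {
    proxy            192.168.100.0/24;   # trusted proxy subnet: use X-Forwarded-For
    proxy_recursive;                     # pick the last non-trusted address in the header
    default          unknown;
    include          geo-networks.conf;  # hypothetical file with "network value;" lines
    delete           10.99.0.0/16;       # drop one entry that the included file defined
    10.0.0.0/8       internal;
}
```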

GeoIP

Although the name suggests some similarities with the previous one, this optional module provides accurate geographical information about your visitors by making use of the MaxMind (www.maxmind.com) GeoIP binary databases. You need to download the database files from the MaxMind website and place them in your Nginx directory.

This module is not included in the default Nginx build.

All you have to do then is specify the database path with one or more of the following directives:

geoip_country country.dat; # country information db

geoip_city city.dat; # city information db

geoip_org geoiporg.dat; # ISP/organization db

The first directive enables several variables: $geoip_country_code (two-letter country code), $geoip_country_code3 (three-letter country code), and $geoip_country_name (full country name). The second directive provides the equivalent country variables under a different prefix ($geoip_city_country_code, $geoip_city_country_code3, and $geoip_city_country_name) along with additional information: $geoip_region, $geoip_city, $geoip_postal_code, $geoip_city_continent_code, $geoip_latitude, $geoip_longitude, $geoip_dma_code, $geoip_area_code, $geoip_region_name. The third directive offers information about the organization or ISP that owns the specified IP address, by filling the $geoip_org variable.

If you need the variables to be encoded in UTF-8, simply add the utf8 keyword at the end of the geoip_ directives.
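
As a rough sketch, the country variable can be combined with the Map module to act on a visitor's location. The database path is an assumption about where the file was placed, and XX and YY are placeholder country codes.

```nginx
# Inside the http block.
geoip_country /usr/share/GeoIP/GeoIP.dat utf8;   # hypothetical path

# Turn the two-letter country code into a simple flag.
map $geoip_country_code $blocked_country {
    default 0;
    XX      1;   # placeholder country code
    YY      1;   # placeholder country code
}

server {
    listen 80;

    location / {
        # An "if" condition is false when the variable is empty or "0".
        if ($blocked_country) {
            return 403;
        }
    }
}
```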

UserID Filter

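This module assigns an identifier to clients by issuing cookies. The identifier can be accessed further in the configuration through two variables: $uid_got (the cookie name and identifier received from the client) and $uid_set (the cookie name and identifier newly sent to the client).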


userid

Context: http, server, location

Enables or disables issuing and logging of cookies.

The directive accepts four possible values:

  • on: Enables v2 cookies and logs them
  • v1: Enables v1 cookies and logs them
  • log: Does not send cookie data but logs incoming cookies
  • off: Does not send cookie data

Default value: userid off;


userid_service

Context: http, server, location

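Identifies the server (service) issuing the cookie. If identifiers are issued by several servers, each one should be given its own number so that client identifiers remain unique.

Syntax: userid_service number;

Default value: 0 for v1 cookies; for v2 cookies, a number derived from the last four octets of the server's IP address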


userid_name

Context: http, server, location

Defines the name assigned to the cookie.

Syntax: userid_name name;

Default value: uid


userid_domain

Context: http, server, location

Defines the domain assigned to the cookie.

Syntax: userid_domain domain;

Default value: None (the domain part is not sent)


userid_path

Context: http, server, location

Defines the path part of the cookie.

Syntax: userid_path path;

Default value: /


userid_expires

Context: http, server, location

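Defines how long the browser should keep the cookie. The max value makes the cookie expire on December 31, 2037, and off makes it expire at the end of the browser session.

Syntax: userid_expires time | max | off;

Default value: off (no expiration date)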


userid_p3p

Context: http, server, location

Assigns a value to the P3P header sent with the cookie.

Syntax: userid_p3p data;

Default value: None


Referer

This module introduces a single directive: valid_referers. Its purpose is to check the Referer HTTP header from the client request and possibly to deny access based on its value. If the referrer is considered invalid, $invalid_referer is set to 1. In the list of valid referrers, you may employ three kinds of values:

  • none: The absence of a referrer is considered to be a valid referrer
  • blocked: A masked referrer (such as XXXXX), typically one whose value was removed by a firewall or proxy and no longer starts with http:// or https://, is also considered valid
  • A server name: The specified server name is considered to be a valid referrer

Following the definition of the $invalid_referer variable, you may, for example, return an error code if the referrer was found invalid:

valid_referers none blocked *.website.com *.google.com;

if ($invalid_referer) {

  return 403;

}

Be aware that spoofing the Referer HTTP header is a very simple process, so checking the referrer of client requests should not be used as a security measure.
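
A common, non-security use is discouraging other sites from embedding your images. The sketch below assumes a site served as www.example.com; the name and the list of file extensions are placeholders.

```nginx
server {
    listen 80;
    server_name www.example.com;

    # Apply the check to image files only.
    location ~* \.(gif|jpe?g|png)$ {
        valid_referers none blocked www.example.com *.example.com;

        if ($invalid_referer) {
            return 403;
        }
    }
}
```

Keeping none and blocked in the list avoids rejecting visitors whose browser or proxy strips the Referer header.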

Real IP

This module provides one simple feature: it replaces the client IP address with the one specified in an HTTP header such as X-Real-IP. This is useful for clients that visit your website through a proxy, or when Nginx runs as a backend server and must retrieve the original client address from the proper header. To enable this feature, insert the real_ip_header directive, which defines the HTTP header to be used: either X-Real-IP or X-Forwarded-For. The second step is to define trusted IP addresses, that is, the clients that are allowed to make use of those headers. This is done with the set_real_ip_from directive, which accepts both IP addresses and CIDR address ranges:

real_ip_header X-Forwarded-For;

set_real_ip_from 192.168.0.0/16;

set_real_ip_from 127.0.0.1;

set_real_ip_from unix:; # trusts all UNIX-domain sockets

This module is not included in the default Nginx build.
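
To illustrate the backend scenario, here is a minimal sketch of the two sides involved. The addresses 10.0.0.5 (front-end proxy) and 10.0.0.10 (backend) are hypothetical, and the two server blocks are assumed to live on different machines.

```nginx
# On the front-end proxy (10.0.0.5): pass the visitor's address to the backend.
server {
    listen 80;

    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass       http://10.0.0.10:8080;
    }
}

# On the backend (10.0.0.10), built with the Real IP module.
server {
    listen 8080;

    set_real_ip_from 10.0.0.5;    # only this proxy is trusted to supply the header
    real_ip_header   X-Real-IP;

    # From here on, $remote_addr holds the visitor's address rather than 10.0.0.5.
}
```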