As the complexity of today’s sites increases, so do the challenges of keeping them loading fast and bandwidth usage low. Minified scripts, concatenated CSS, image sprites and even hand-crafted static HTML are all used for speedy delivery. This article discusses some lesser-known features of nginx that can lead to a significant speed increase.
In my quest for performance, I switched one of our high-traffic sites from Apache to nginx. It was a perfect candidate, as most of it is static with only client-side functionality and some AJAX calls; less than 10% has server-side functionality.
Like Apache, nginx has an on-the-fly compression feature, enabled via the gzip on option.
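A minimal sketch of such a configuration (the directive values here are illustrative, not a recommendation):

```nginx
# On-the-fly gzip compression; values shown are illustrative.
gzip on;
gzip_types text/css application/javascript application/xml;  # text/html is always compressed
gzip_min_length 1024;  # skip tiny files where gzip overhead outweighs the savings
```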
When benchmarking the site, I noticed an increase in Time to First Byte (TTFB). This was to be expected: after all, compressing a file does incur some overhead. Of course, the time lost on compression is made up many times over by the shorter download time, but it still got me thinking – wouldn’t it be possible to have both a small file and a great TTFB?
GZip Static files
Nginx has an option called gzip_static. When turned on, if a request is made for a file, say style.css, nginx first looks for style.css.gz and, if it exists, sends it back directly without any compression overhead. If the .gz file does not exist, the file is compressed on the fly and sent back.
So the code might look like this:
location ~* \.(html|css|js|xml)$ {
    gzip_static on;
}
(Be careful when placing the rules so they don’t overwrite other file rules! nginx is a bit peculiar in this matter.)
Now the TTFB drops from 0.3s to 0.09s!
There’s just one problem – nginx does not generate or update the .gz files itself. This is a nuisance.
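Until the generation is automated, the .gz companions can be produced by hand. A minimal sketch (the file name and contents are illustrative):

```shell
# Create a .gz companion next to the original so gzip_static can pick it up.
echo "body { margin: 0; }" > style.css
gzip -1 -c style.css > style.css.gz   # -c leaves the original file in place
ls style.css style.css.gz
```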
Automate the Gzip generation with cron
The simplest choice is to batch generate the gzip files:
#!/bin/bash
FILETYPES=( "*.html" "*.css" "*.js" "*.xml" )
DIRECTORIES="/var/www/"
export MIN_SIZE=1024   # exported so the bash -c subshell below can see it

for currentdir in $DIRECTORIES
do
    for i in "${FILETYPES[@]}"
    do
        find "$currentdir" -iname "$i" -exec bash -c '
            PLAINFILE="$1"
            GZIPPEDFILE="$1.gz"
            if [ -e "$GZIPPEDFILE" ]; then
                # Recompress only when the original is newer than the .gz
                if [ "$(stat --printf=%Y "$PLAINFILE")" -gt "$(stat --printf=%Y "$GZIPPEDFILE")" ]; then
                    gzip -1 -f -c "$PLAINFILE" > "$GZIPPEDFILE"
                fi
            elif [ "$(stat --printf=%s "$PLAINFILE")" -gt "$MIN_SIZE" ]; then
                gzip -1 -c "$PLAINFILE" > "$GZIPPEDFILE"
            fi' _ {} \;
    done
done
You would save this script and run it every hour or so via a cron job. The script searches for all files with the specified extensions inside the target directory and if the file size is larger than specified, compresses it with gzip. If the .gz file already exists, it looks at the modification time and updates only if necessary.
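The corresponding crontab entry might look like this (the script name and path are assumptions for illustration):

```shell
# m h dom mon dow  command: run the batch compressor at the top of every hour
0 * * * * /usr/local/bin/gzip-static.sh >/dev/null 2>&1
```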
This works, but I still wasn’t happy. Often, only one file changes, but when it does, you want the .gz companion to be updated now, not within the next hour. Also, what to do if one of the uncompressed files is deleted?
The naive option would be to continuously poll the directory for changes; I shiver just thinking of this. If the idea crossed your mind, just say No.
Monitoring changes and generating gzip files as needed
Wouldn’t it be great if modern OSes would notify us when a file is added, modified or deleted? But wait – of course they do. On Linux it’s the inotify kernel subsystem. Unfortunately I couldn’t find mature high-level tools to take advantage of inotify. The most popular is incron, but it lacks the ability to monitor subdirectories, so it’s pretty useless for this task.
The only thing I could use is inotify-tools, which works, albeit at a rather low level.
You install it with apt-get install inotify-tools (or your distro’s package manager).
Afterwards you work with the inotifywait command, which can monitor a directory for changes.
So, consider these two Bash scripts:
notify-edit.sh:
#!/bin/bash
inotifywait -m -q -e create -e modify -e moved_to -r "/var/www/" \
    --format "%w%f" \
    --excludei '\.(jpg|png|gif|ico|log|sql|zip|gz|pdf|php|swf|ttf|eot|woff)$' \
    | while read -r file; do
        # Compress only the file types we want
        case "$file" in
            *.html|*.css|*.js|*.xml) gzip -1 -f -c "$file" > "$file.gz" ;;
        esac
    done
notify-delete.sh:
#!/bin/bash
inotifywait -m -q -e delete -e moved_from -r "/var/www/" \
    --format "%w%f" \
    --excludei '\.(jpg|png|gif|ico|log|sql|zip|gz|pdf|php|swf|ttf|eot|woff)$' \
    | while read -r file; do
        # Remove the stale .gz companion of the deleted or moved file
        rm -f "$file.gz"
    done
The first script listens for create, modify and moved-to events in the monitored directory. For performance reasons it filters out unwanted file types. It would have been better if there were an option to exclude everything except a specified pattern; there is a patch that adds an --includei parameter, but it is not included in the main branch. The created or modified file names are piped to a small bash handler that checks the file extension and compresses only the file types we want. The second script is similar, monitoring the directory for deleted and moved-out files.
To run the scripts, you enter:
nohup ./notify-edit.sh &
nohup ./notify-delete.sh &
As soon as a file is created or modified, a corresponding gzip version is (re)created. The .gz is deleted when the original file is deleted. If a file is moved from one folder to another, the corresponding gzip is deleted and then recreated at the new location.
A note on gzip compression
You may have noticed that I set the compression level to 1 (minimum). The natural tendency is to set it to maximum, especially since the compression is done ahead of time. However, the default nginx compression level is 1, and I ran some tests on various file types: HTML, JavaScript and CSS.
Typically, level 1 already brings an 80% saving in file size. Going to level 6 brings only another 3% saving; going all the way to level 9 improves the ratio by only another 1%. At the same time, compression time shoots up: level 6 is over 70% more expensive and level 9 almost 120% more expensive.
To summarize: a 4% compression improvement costs more than double the compression time. For archival, where the space occupied is the most important factor, it makes sense to use higher compression levels (though even there it makes little sense to go over 6). On a server, the balance between file size and CPU usage hugely favors lower compression levels.
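A quick way to reproduce this trade-off on your own machine is to compare output sizes at different levels (the sample data here is synthetic):

```shell
# Compare gzip output size at levels 1, 6 and 9 on a compressible sample.
sample=$(mktemp)
seq 1 5000 > "$sample"   # repetitive, highly compressible text
for level in 1 6 9; do
    echo "level $level: $(gzip -"$level" -c "$sample" | wc -c) bytes"
done
rm -f "$sample"
```

On real-world HTML/CSS/JS the exact percentages will differ, but the shape of the curve (rapidly diminishing returns above level 1) holds.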
Another note: I’ve seen nginx tutorials where even image files (JPEG, PNG, etc.) were included in gzip compression. There’s no other way to put it but to call it like it is: dangerously stupid. Image files, videos, PDFs and most other binary file types are already compressed. Gzipping them not only brings no discernible benefit, it also slows down the browser, which now has to decompress them as well.
Extending
The concepts explained here with inotify-tools can be used to perform other server-side operations, for example recompressing JPEGs, optimizing PNGs, minifying CSS and JS files, and more.
4 Responses
I’m getting a lot of errors:
bash: line 5: [: 6820: unary operator expected
bash: line 5: [: 47238: unary operator expected
bash: line 5: [: 15838: unary operator expected
Using bash and shell in CentOS 6.4, x86_64 architecture.
Sorry Axel. Unfortunately I’m no shell programming master. I only tested on Ubuntu 12.04
miniz.c v1.10 includes an optimized real-time compressor written specifically for compression level 1 (MZ_BEST_SPEED). miniz.c’s level 1 compression ratio is around 5-9% higher than other real-time compressors, such as minilzo, fastlz, or liblzf. miniz.c’s level 1 data consumption rate on a Core i7 3.2 GHz typically ranges between 70-120.5 MB/sec. Between levels 2-9, miniz.c is designed to compare favorably against zlib, where it typically has roughly equal or better performance.
A version of the code slightly optimized for CentOS (should work in other distros as well):
#!/bin/bash
FILETYPES=( "*.html" "*.css" "*.js" "*.xml" "*.txt" )
DIRECTORIES="/path/to/site/dir/"
MIN_SIZE=512

for currentDir in $DIRECTORIES; do
    for f in "${FILETYPES[@]}"; do
        files="$(find $currentDir -iname "$f")"
        echo "$files" | while read file; do
            PLAINFILE=$file
            GZIPPEDFILE=$file.gz
            if [[ -e "$GZIPPEDFILE" ]]; then
                if [[ `stat --printf=%Y "$PLAINFILE"` -gt `stat --printf=%Y "$GZIPPEDFILE"` ]]; then
                    echo ".gz is older, updating $GZIPPEDFILE..."
                    gzip -2 -f -c "$PLAINFILE" > "$GZIPPEDFILE"
                fi
                if [[ `stat --printf=%s "$PLAINFILE"` -le $MIN_SIZE ]]; then
                    echo "Uncompressed size is less than minimum ($(stat --printf=%s "$PLAINFILE")), removing $GZIPPEDFILE"
                    rm -f "$GZIPPEDFILE"
                fi
            elif [[ `stat --printf=%s "$PLAINFILE"` -gt $MIN_SIZE ]]; then
                echo "Creating .gz for $PLAINFILE..."
                gzip -2 -c "$PLAINFILE" > "$GZIPPEDFILE"
            fi
        done
    done
done
exit