Ps0ke

Φίλιππος Στέφανος

Nocturnal Inclination

Nerd stuff, internet humor and all the rest.

mdbl0g Benchmark

When you build a file-based blog system, as I did with mdbl0g, you are often told: "That'll be way too slow, especially in PHP!" As I don't blog very often and won't have many posts, that didn't bother me. But what if..? That question challenged me to benchmark my code.

I wrote a small bash script that quickly generates posts with random names and fills them with some "lorem ipsum". The size varies slightly because each post contains its iteration number, but every file is around 2,629 bytes (~2.6 kB). I then tested with 1,000, 10,000 and 100,000 files, and the results are quite pleasing. Everything went well for 1,000 and 10,000 posts, with at most a small delay. With 100,000 posts, however, things got ugly. Although the main page still rendered surprisingly fast, trying to search killed the server. I even raised the limit with set_time_limit() in PHP, but the server still threw a 500 Internal Server Error.

I don't think anyone will ever accumulate 100,000 blog posts, so this benchmark shows me that a file-based approach is not much of a performance issue as long as the amount of data is reasonable. Network latency is the bottleneck!

Results

The time is measured inside the PHP script to determine pure script rendering time without network latency.
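To set the in-script time next to what a client actually experiences, it can be compared with curl's own total request time. This is only a minimal sketch: it assumes the page keeps emitting the `<!-- Execution Time: ...s -->` comment shown in the listings below, and `extract_exec_time` and `compare_times` are hypothetical helper names, not part of mdbl0g.

```shell
# Hypothetical helpers, not part of mdbl0g.

# Read HTML on stdin and print the number from the first
# "<!-- Execution Time: 0.02759s -->" style comment.
extract_exec_time() {
    sed -n 's/.*Execution Time: \([0-9.]*\)s.*/\1/p' | head -n 1
}

# Fetch a URL once and print the script time next to curl's total time.
compare_times() {
    local url=$1 body total
    body=$(mktemp)
    # -w '%{time_total}' makes curl print the duration of the whole
    # transfer in seconds; the page body itself is written to the -o file.
    total=$(curl -L -s -o "$body" -w '%{time_total}' "$url")
    echo "script: $(extract_exec_time < "$body")s  total: ${total}s"
    rm -f "$body"
}
```

Called as `compare_times http://ps0ke.de/mdbl0g/`, the gap between the two numbers is roughly what the network and the web server add on top of the PHP rendering time.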

The script

You can find the full script as a GitHub Gist.

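#!/usr/bin/env bash
# Generate a number of dummy posts with random, date-like file names.
if [[ $# -ne 2 ]]; then
    echo "Usage: $0 path/to/target/dir <Number of files to create>" >&2
    exit 1
fi

for i in $(seq 1 "$2"); do
    # Every range starts at 10 so that each field has two digits without
    # zero padding. Impossible dates such as 11-31 are fine for a benchmark.
    year=$(shuf -i 1900-2012 -n 1)
    month=$(shuf -i 10-12 -n 1)
    day=$(shuf -i 10-31 -n 1)
    hour=$(shuf -i 10-23 -n 1)
    min=$(shuf -i 10-59 -n 1)
    # Braces keep the "_" from being parsed as part of a variable name.
    name="${year}-${month}-${day}_${hour}-${min}.md"
    echo "$i $name"

    # "$1" is expected to end in a slash (e.g. posts/). A name that was
    # generated before is silently overwritten.
    cat <<EOF > "$1$name"
Benchmark post #$i
Here’s to the crazy ones. ...

Lorem ipsum dolor sit amet ...
Vivamus placerat, nunc a accumsan ...

Sed ultrices purus eu erat ...
EOF
done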
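One side effect of the random names: two posts can draw the same name, and the later one then overwrites the earlier one. That is most likely why the `ls | wc -l` counts in the listings below come out slightly under the requested numbers. The following sketch estimates how many distinct files to expect, assuming the names are drawn uniformly from the ranges used in the script; `expected_unique` is a hypothetical helper, not part of the benchmark.

```shell
# Number of distinct names the script can produce:
# 113 years * 3 months * 22 days * 14 hours * 50 minutes.
name_space=$(( 113 * 3 * 22 * 14 * 50 ))

# Expected number of distinct names after n uniform random draws
# from a pool of `space` names: space * (1 - (1 - 1/space)^n).
expected_unique() {
    awk -v n="$1" -v space="$2" 'BEGIN {
        printf "%.0f\n", space * (1 - exp(n * log(1 - 1 / space)))
    }'
}

for n in 1000 10000 100000; do
    echo "$n requested, about $(expected_unique "$n" "$name_space") distinct files expected"
done
```

The printed estimates can be held against the file counts reported for each run. They are averages, so a single run will deviate somewhat.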

1,000 Files (~2.5MB)

% rm posts/*
zsh: sure you want to delete all the files in /home/ps0ke/html/mdbl0g/posts [yn]? y

% ls posts/ -1 | wc -l
0

% ./benchmark.sh posts/ 1000
...
997 1922-11-29\_10-19.md
998 1915-10-10\_13-58.md
999 1913-11-15\_14-33.md
1000 1962-11-30\_13-13.md
./benchmark.sh posts/ 1000  2.07s user 7.08s system 65% cpu 13.909 total

% ls posts/ -1 | wc -l
999

# First page, 5 posts
% curl -L -s http://ps0ke.de/mdbl0g/ | grep "Execution Time"
<!-- Execution Time: 0.02759s -->

# Search for 'crazy', matches all posts, renders 5 posts
% curl -L -s http://ps0ke.de/mdbl0g/search/crazy | grep "Execution Time"
<!-- Execution Time: 0.23799s -->

10,000 Files (~25MB)

% rm posts/*
zsh: sure you want to delete all the files in /home/ps0ke/html/mdbl0g/posts [yn]? y

% ls posts/ -1 | wc -l
0

% ./benchmark.sh posts/ 10000
...
9997 1908-10-14\_20-58.md
9998 1990-12-25\_11-26.md
9999 1925-12-10\_17-39.md
10000 1967-11-19\_17-48.md
./benchmark.sh posts/ 10000  20.88s user 72.58s system 69% cpu 2:14.51 total

% ls posts/ -1 | wc -l
9990

# First page, 5 posts
% curl -L -s http://ps0ke.de/mdbl0g/ | grep "Execution Time"
<!-- Execution Time: 0.11814s -->

# Search for 'crazy', matches all posts, renders 5 posts
% curl -L -s http://ps0ke.de/mdbl0g/search/crazy | grep "Execution Time"
<!-- Execution Time: 0.70745s -->

100,000 Files (~250MB)

% rm posts/*
zsh: sure you want to delete all the files in /home/ps0ke/html/mdbl0g/posts [yn]? y

% ls posts/ -1 | wc -l
0

% ./benchmark.sh posts/ 100000
99997 1962-10-12\_16-57.md
99998 1982-11-31\_12-33.md
99999 1938-12-13\_12-38.md
100000 1971-12-25\_21-15.md
./benchmark.sh posts/ 100000  219.87s user 960.45s system 67% cpu 29:02.07 total

% ls posts/ -1 | wc -l
98992

# First page, 5 posts
% curl -L -s http://ps0ke.de/mdbl0g/ | grep "Execution Time"
<!-- Execution Time: 0.75187s -->

# Search for 'crazy', matches all posts, renders 5 posts
% curl -L -s http://ps0ke.de/mdbl0g/search/crazy | grep "Execution Time"
[1]    21563 done       noglob curl -L -s http://ps0ke.de/mdbl0g/search/crazy | 
       21565 exit 1     grep "Execution Time"

# set_time_limit(5*60); >> index.php line 2

% curl -L -s http://ps0ke.de/mdbl0g/search/crazy
An internal server error occurred. Please try again later.
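To judge how much of the search cost is raw file reading rather than PHP, the same kind of scan can be timed directly on the server. This is a rough sketch only: it assumes the posts live in a `posts/` directory as above and a grep that supports `-r`. `count_matching_posts` is a hypothetical helper and says nothing about how mdbl0g itself implements its search.

```shell
# Count the posts that contain a search term by reading every file once.
count_matching_posts() {
    local dir=$1 term=$2
    # -r: let grep walk the directory itself instead of having the shell
    #     expand 100,000 file names on the command line
    # -l: print each matching file name once
    # -F: treat the term as a literal string, not a regular expression
    grep -rlF -- "$term" "$dir" | wc -l | tr -d ' '
}
```

Running `time count_matching_posts posts/ crazy` gives a baseline for a full scan. If that alone takes a long time with 100,000 files, the file system is a large part of the problem, not just PHP.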