|
|
W3Perl speed
|
|
One weakness of W3Perl is the speed at which it delivers stats. Of course, the
package cannot compute as fast as packages written in C, but the following data show
that W3Perl does its job in a reasonable amount of time ... and as CPUs get faster and
faster, processing time becomes less critical.
Package | Language | Speed (lines/s) | Time to process a 1 Gb logfile | Time to process a 1 Tb logfile |
Analog | C | 23 000 | 3 min 40 | 1 h |
Webalizer | C | 10 000 | 8 min 20 | 2 h 20 |
AWstats | Perl | 4 100 | 20 min 20 | 5 h 40 |
W3Perl | Perl | 3 500 | 23 min 40 | 6 h 40 |
W3Perl is slower than AWstats because W3Perl produces more detailed stats.
Don't forget that these are the
times needed to compute stats from scratch. Daily runs are incremental and take a few minutes to complete,
whatever your logfile size.
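As a rough illustration of how the speed column relates to the 1 Gb timings, here is a minimal sketch. The speeds are copied from the table; the figure of about 5 million lines per Gb (roughly 200 bytes per line) is an assumption inferred from those timings, and the function names are hypothetical. Real logfiles will differ.

```python
# Throughput figures (lines per second) copied from the table above.
SPEEDS = {"Analog": 23_000, "Webalizer": 10_000, "AWstats": 4_100, "W3Perl": 3_500}

# Assumption: about 5 million lines per Gb of logfile (~200 bytes per line),
# which is roughly what the table's 1 Gb timings imply. Real logs vary.
LINES_PER_GB = 5_000_000


def estimate_seconds(package: str, size_gb: float) -> float:
    """Estimated time in seconds to process size_gb of logs from scratch."""
    return size_gb * LINES_PER_GB / SPEEDS[package]


def format_duration(seconds: float) -> str:
    """Render a duration in seconds as 'H h MM min SS s' or 'M min SS s'."""
    total = round(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours} h {minutes:02d} min {secs:02d} s"
    return f"{minutes} min {secs:02d} s"


if __name__ == "__main__":
    for name in SPEEDS:
        print(name, format_duration(estimate_seconds(name, 1.0)))
```

The estimates land close to the table's 1 Gb column. Small differences are expected because the line length is only an approximation.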
|
Daily stats
|
This graph shows how long it takes to compute the daily stats of a logfile.
Each point is one day: a day typically took
3 seconds to compute for a website averaging 6000 hits/day and 15-20 seconds for a website averaging 40000 hits/day.
|
|
40 months of data (average 6000 hits/day)
|
12 months of data (average 40000 hits/day)
|
Stats were computed on an XP2500+ with 512 Mb of memory under Linux.
|
|
|
Reverse DNS
|
Reverse DNS translates IP addresses to hostnames (and thus allows
country stats), but it is a very time-consuming job. The Netgeo and Geoip_free
modules do the lookup without querying a DNS server and save a lot of
time (they have their own file database) ... but they are not as
accurate or up to date as querying a DNS server.
(Logfile 1.7 Gb) | CPU | Speed (lines / sec) | Relative speed (%) |
reverse dns off | 41 min 50 | 3550 | 100 % |
geoip_free module | 1 h 02 min 55 | 2280 | 65 % |
netgeo module | 1 h 04 min 15 | 2280 | 65 % |
reverse dns on | 9 h 12 min 20 | 270 | 7 % |
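The relative speed column can be approximated from the CPU times alone, since every run processes the same 1.7 Gb logfile. The sketch below shows that arithmetic. The helper names are hypothetical, and the results only roughly match the table because its speed figures are rounded.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert an elapsed time to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# CPU times copied from the table above (same 1.7 Gb logfile for every run).
cpu_times = {
    "reverse dns off": to_seconds(minutes=41, seconds=50),
    "geoip_free module": to_seconds(1, 2, 55),
    "netgeo module": to_seconds(1, 4, 15),
    "reverse dns on": to_seconds(9, 12, 20),
}


def relative_speed(mode, baseline="reverse dns off"):
    """Speed of `mode` as a percentage of the baseline run.

    For a fixed amount of work, speed is inversely proportional to time.
    """
    return 100.0 * cpu_times[baseline] / cpu_times[mode]


if __name__ == "__main__":
    for mode in cpu_times:
        print(f"{mode}: {relative_speed(mode):.0f} %")
```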
|
CPU timing
|
To give an idea, compressed logfiles are about 90% smaller than uncompressed ones.
Precision level: 3. Reverse DNS, virus stats and robot detection are disabled.
Website | Interval | Logfile size (compressed) | Logfile size (uncompressed) | Hits / day | CPU time (reverse dns off) | CPU | Speed (lines / sec) |
W3Perl | 1216 days (40 months) | 124 Mb | 1.7 Gb | 6 000 | 41 min 50 | XP2500+ (Linux) | 3550 |
INRP | 129 days (3 months) | 154 Mb | 1.2 Gb | 61 000 | 39 min 26 | XP2500+ (Linux) | 3970 |
|
Hints
|
If you want to increase speed:
- Don't use the reverse DNS option: querying a DNS server can take
several seconds for each IP address. Results are cached in a file for
later runs, but it still makes things very slow. You could gain
a three- to fivefold speed increase.
If you really want country stats, it is better to use the netgeo or geoip_free Perl module.
You can also convert your logfile with Fastresolve, which can translate IP addresses to hostnames.
- Disable unnecessary stats in your configuration file (scripts,
virus ...): this will save you a lot of CPU time.
- Filtering robots or spammers is also time consuming.
- Don't select too many pages in the @selection variable.
- Use only one output language.
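To illustrate why caching reverse DNS results in a file helps on later runs, here is a minimal sketch of a file-backed lookup cache. It is not W3Perl's actual implementation: the class name, the JSON cache format and the `lookup` parameter are all assumptions made for the example. Only `socket.gethostbyaddr` is a real standard library call.

```python
import json
import socket
from pathlib import Path


class CachedResolver:
    """Reverse DNS lookups backed by a JSON file cache (hypothetical format)."""

    def __init__(self, cache_path, lookup=None):
        self.cache_path = Path(cache_path)
        # `lookup` can be swapped out, e.g. for an offline country database.
        self.lookup = lookup or self._dns_lookup
        self.cache = {}
        if self.cache_path.exists():
            self.cache = json.loads(self.cache_path.read_text())

    @staticmethod
    def _dns_lookup(ip):
        """Query the DNS server; this is the slow part (possibly seconds per IP)."""
        try:
            return socket.gethostbyaddr(ip)[0]
        except OSError:
            return ip  # unresolved: keep the bare address

    def resolve(self, ip):
        """Return the hostname for ip, querying only on a cache miss."""
        if ip not in self.cache:
            self.cache[ip] = self.lookup(ip)
        return self.cache[ip]

    def save(self):
        """Persist the cache so the next run can reuse the results."""
        self.cache_path.write_text(json.dumps(self.cache))
```

Each distinct address is queried only once and later runs reload the saved file, but the first pass over a large logfile still pays the full lookup cost for every new address.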
|