Last month AVG released version 8.0 of their anti-virus. Version 8.0 is the one that ships with LinkScanner, and LinkScanner is broken. It doesn’t handle base href properly, and that’s why you’re seeing crazy URLs with js/js/js/js/js/js in your access log.
Here are a couple of (anonymized) examples from our own logs:
255.255.255.255 - - [20/Jun/2008:15:03:52 -0400] "GET /article/92572/js/js/gui.js HTTP/1.1" 500 624 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"
255.255.255.255 - - [20/Jun/2008:15:03:33 -0400] "GET /article/77673/js/js/js/js/js//""+sWOUrl+"/" HTTP/1.1" 500 624 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"
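Those paths are consistent with relative script URLs being resolved against the page URL instead of the declared base href. Here is a minimal Python sketch of the difference. The example.com URLs and the script path are hypothetical stand-ins, since the real markup isn't shown here, and the sketch doesn't try to explain the repeated js/ segments, only the basic mis-resolution.

```python
from urllib.parse import urljoin

# Hypothetical values: stand-ins for the real site's URLs and markup.
page_url = "http://example.com/article/92572/"
base_href = "http://example.com/"   # what <base href="..."> would declare
script_src = "js/gui.js"            # relative path used in a <script> tag

# Expected behaviour: relative URLs resolve against the declared base href.
correct = urljoin(base_href, script_src)

# Faulty behaviour: ignoring base href resolves against the page URL instead.
broken = urljoin(page_url, script_src)

print(correct)  # http://example.com/js/gui.js
print(broken)   # http://example.com/article/92572/js/gui.js
```

The second URL doesn't exist on the server, which is the kind of request that ends up as an error in the access log.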
I would like everyone to do what we did and redirect requests from its User Agent so that AVG gets the message.
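This is what we now have as our first htaccess rule (with a condition along these lines, so that only requests from the 1813 User Agent are redirected and regular visitors are left alone):
RewriteCond %{HTTP_USER_AGENT} ;1813\)$
RewriteRule ^.*$ http://www.grisoft.com?linkscanner=spamming_us&please_fix_it_kthx [R,L]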
AVG LinkScanner puts 1813 at the end of its User Agent string to identify itself (see the log lines above), so as long as they keep a unique identifier we can cut them off.
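If you want to see how much of your own traffic carries that identifier before you add the rule, a short script can count it. This is a minimal Python sketch, assuming Apache’s combined log format with the User Agent as the last quoted field. The function name is made up for illustration, and the second sample line is invented to stand in for a normal visitor.

```python
import re

# Assumption: combined log format, where the User Agent is the last quoted
# field on the line. LinkScanner requests end that field with ";1813)".
LINKSCANNER_UA = re.compile(r';1813\)"\s*$')


def count_linkscanner_hits(lines):
    """Return (linkscanner_hits, total_hits) for an iterable of log lines."""
    hits = 0
    total = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        total += 1
        if LINKSCANNER_UA.search(line):
            hits += 1
    return hits, total


# First line is from our logs; the second is an invented normal request.
sample = [
    '255.255.255.255 - - [20/Jun/2008:15:03:52 -0400] "GET /article/92572/js/js/gui.js HTTP/1.1" 500 624 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"',
    '255.255.255.255 - - [20/Jun/2008:15:04:10 -0400] "GET /article/92572/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0"',
]
print(count_linkscanner_hits(sample))  # (1, 2)
```

To run it against a real log, pass an open file object instead of the sample list; the function reads it line by line.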
The problem is how aggressive the program is. From AVG’s own forum:
You’d think a business whose entire reason to exist is to stop maliciousness wouldn’t spam pretty much every site on the web and drive up their bandwidth costs like mad (and slow down Google a lot).
I suggest you add that htaccess rule if you don’t want to end up like this poor guy:
What made us realize something was off, was that the page views according to Google Analytics were flat, yet traffic and bandwidth were EXPLODING. Most of this started in Early April, and started heading north in a really scary J curve. But when we ran Webalyzer stats, it indicated that the traffic on the site WAS going up. But since Google analytics only logs page views that the browser renders (via Javascript), none of this showed up in the Google stats. So clearly something OTHER than normal browser traffic was sucking up our bandwidth and CPU time.
That comment was posted on this blog entry, and it’s that blog article (actually, one of the comments) that first tipped me off. Everything else I found elsewhere just confirmed it. One of the best articles I’ve found is from The Register. It’s called AVG scanner blasts internet with fake traffic.
I figure if AVG wants to chew up other people’s bandwidth, they can chew up their own. I can’t seem to register on their forum, and their technical support form requires a product license, so while we’re at it we might as well send them a message in case they actually monitor their access logs.
UPDATE: I’ve been contacted by AVG. They’re putting together a group to address the concerns of webmasters and asked me to be part of it. If you have any comments or suggestions on what they can do to improve LinkScanner, let me know and I’ll pass them along once I get the group invite.