Friday Jun 20 2008 2:33 pm by Smokinn

Last month AVG released version 8.0 of their anti-virus. Version 8.0 ships with LinkScanner, and LinkScanner is broken: it doesn't handle base href properly, which is why you're seeing crazy URLs with js/js/js/js/js/js in your access log.

Here are a couple of (anonymized) examples from our own logs:

255.255.255.255 - - [20/Jun/2008:15:03:52 -0400] "GET /article/92572/js/js/gui.js HTTP/1.1" 500 624 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"

255.255.255.255 - - [20/Jun/2008:15:03:33 -0400] "GET /article/77673/js/js/js/js/js//""+sWOUrl+"/" HTTP/1.1" 500 624 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"
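To make the base href problem concrete, here is a minimal Python sketch of how a relative script link resolves with and without the page's base href. The host, page URL and base href value are made up for illustration, and the "ignore the base and resolve against whatever URL was just fetched" behaviour is only my guess at what a client would have to do to produce paths like the ones in our logs, not something confirmed about LinkScanner's internals.

```python
from urllib.parse import urljoin

# Hypothetical values for illustration only; none of these come from a real site.
BASE_HREF = "http://example.com/"                          # what <base href> would say
PAGE_URL = "http://example.com/article/92572/some-title"   # the page being scanned
SCRIPT_SRC = "js/gui.js"                                   # relative link in the page


def resolve_with_base(src):
    """Resolve a relative link the way a browser does when <base href> is set."""
    return urljoin(BASE_HREF, src)


def resolve_ignoring_base(current_url, src):
    """Resolve a relative link against the fetched URL, ignoring <base href>."""
    return urljoin(current_url, src)


correct = resolve_with_base(SCRIPT_SRC)

# Assumed failure mode: the base is ignored, the bad URL returns a page that
# contains the same relative link, and that link is resolved again.
wrong_once = resolve_ignoring_base(PAGE_URL, SCRIPT_SRC)
wrong_twice = resolve_ignoring_base(wrong_once, SCRIPT_SRC)

print(correct)      # http://example.com/js/gui.js
print(wrong_once)   # http://example.com/article/92572/js/gui.js
print(wrong_twice)  # http://example.com/article/92572/js/js/gui.js
```

Each extra round of resolving against the previous bad URL adds another js/ segment, which at least matches the shape of the requests above.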

I'd like everyone to do what we did and redirect this User Agent so that AVG gets the message.

This is what we now have as our first htaccess rule:

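# Match the User Agent that LinkScanner sends (note the trailing ;1813)
RewriteCond %{HTTP_USER_AGENT} "Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1;1813\)"

# Redirect every matching request to AVG's own site
RewriteRule ^.*$ http://www.grisoft.com/?linkscanner=spamming_us&please_fix_it_kthx [R,L]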

AVG LinkScanner uses the 1813 at the end of its User Agent to identify itself, so as long as they keep a unique identifier we can cut them off.
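If you want to see how much of your own traffic carries that User Agent before adding the rule, something along these lines should do. It is a rough sketch that assumes your access log is in Apache's combined format like the lines quoted earlier. The function name summarize and the third sample line (an ordinary visitor) are made up for illustration.

```python
import re

# The exact User Agent string from the log excerpts in this post.
LINKSCANNER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"

# Match only the tail of a combined-format line: status, size, referer, agent.
# Anchoring at the end keeps stray quotes inside the request from confusing it.
TAIL = re.compile(
    r'" (?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"\s*$'
)


def summarize(lines):
    """Return (request count, total response bytes) for LinkScanner requests."""
    hits = 0
    total_bytes = 0
    for line in lines:
        match = TAIL.search(line)
        if match is None or LINKSCANNER_UA not in match.group("agent"):
            continue
        hits += 1
        if match.group("size") != "-":  # "-" means no body was sent
            total_bytes += int(match.group("size"))
    return hits, total_bytes


sample = [
    '255.255.255.255 - - [20/Jun/2008:15:03:52 -0400] "GET /article/92572/js/js/gui.js HTTP/1.1" 500 624 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"',
    '255.255.255.255 - - [20/Jun/2008:15:03:33 -0400] "GET /article/77673/js/js/js/js/js//""+sWOUrl+"/" HTTP/1.1" 500 624 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;1813)"',
    # Hypothetical ordinary visitor, added so there is something to filter out.
    '192.0.2.1 - - [20/Jun/2008:15:04:00 -0400] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (hypothetical ordinary browser)"',
]

hits, total_bytes = summarize(sample)
print(hits, total_bytes)  # 2 1248
```

To run it on a real log, pass an open file object to summarize instead of the sample list. A log format other than combined would need a different pattern.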

The problem is how aggressive the program is. From AVG's own forum:

I had Linkscanner turned on and have Google set to display 100 hits per page. In the process of scanning links, Linkscanner downloaded over 900 MegaBytes of data in one day! Use this feature with care if you have a download quota on your internet account!

You'd think a business whose entire reason to exist is to stop maliciousness wouldn't pretty much spam every site on the web and drive up their bandwidth costs like mad (and slow down Google a lot).

I suggest you add that htaccess rule if you don't want to end up like this poor guy:

Wow. I cannot believe this. We have been fighting performance issues on our web site for the last month, and just commissioned a new server. Then we got our bandwidth overage bill for May, and our bandwidth was more than double (and we got billed huge overages). The bandwidth on our site was going up EXPONENTIALLY! For June, we were looking at being 4-5 times more than our allocated bandwidth, and were looking at more than $5K this month in overages!

What made us realize something was off, was that the page views according to Google Analytics were flat, yet traffic and bandwidth were EXPLODING. Most of this started in Early April, and started heading north in a really scary J curve. But when we ran Webalyzer stats, it indicated that the traffic on the site WAS going up. But since Google analytics only logs page views that the browser renders (via Javascript), none of this showed up in the Google stats. So clearly something OTHER than normal browser traffic was sucking up our bandwidth and CPU time.

That comment was posted on this blog entry and it's that blog article (actually, one of the comments) that first tipped me off. Everything else I found elsewhere just confirmed it. One of the best articles I've found is from TheRegister. It's called AVG scanner blasts internet with fake traffic.

I figure if AVG wants to chew up other people's bandwidth, they can chew up their own. I can't seem to register on their forum, and their technical support form requires a product license, so while we're at it we might as well send them a message in case they actually monitor their access logs.

UPDATE: I've been contacted by AVG. They're putting together a group to address the concerns of webmasters and asked me to be part of it. If you have any comments or suggestions on what they can do to improve LinkScanner, let me know and I'll pass them along once I get the group invite.

Comments
Friday 20 2008 3:56 pm by Matthew Gallant

Gonna pass this article around, if we raise a stink about this perhaps it'll put some pressure on AVG. Perhaps tip off Consumerist?

Saturday 21 2008 4:26 pm by Martin

We were hit by this big time.

I have a comment for AVG, throw LinkScanner in the f-ing trash! We are contemplating legal action, or at least billing them for the time spent debugging and server downtime.

The fact that it is broken makes them look RATHER STUPID!!!

Saturday 21 2008 6:14 pm by Guillaume Theoret

That's exactly why I wrote this post, to hopefully help people debug the problem faster.

I honestly don't see how they're not going to eventually get sued, either by someone with enough clout to do it on their own or by someone who opens a class action against them, because they're costing a lot of people a lot of money.

I have a friend who does tech support at a large ISP and he thinks people have been calling in about overage charges more than usual lately. He's going to look into whether LinkScanner is at fault. That ISP alone is so big that being sued by them could mean big trouble for Grisoft, not to mention the large search engines that people will think are slow through no fault of their own, because LinkScanner downloads everything and affects the page render time.

Sunday 22 2008 2:14 am by McGregor

Yeah, and besides, I do not see why it has to prefetch the material. Why not just scan it before it is handed to the client?

This must be one of the biggest internet brainfarts ever.

Sunday 22 2008 8:09 pm by Aaron

What is the best way to block this using IIS? I don't use Apache, so I can't use the rewrite rules.

Thursday 31 2008 4:45 pm by Smokinn

comment

