What do YOU think about pay-per-click and click fraud?
Reader gregbo and I have been having a long discussion about click fraud and the viability of pay-per-click advertising in the comments to my posting "Don't panic: what to do if Google suspends your AdSense account", but no one's bothered to jump into the discussion to offer their opinions. So I'm extracting the comments here and inviting others to post their followup comments. It's an interesting debate, one that I'm sure is happening internally at Google, Yahoo! and other pay-per-click advertising companies.
gregbo: Don't you think that it's ironic that what was supposed to be a simple means of online advertising and publishing is turning into quite a labor-intensive means of determining what is “illegitimate” web usage?
Eric Giguere: It is ironic, yes, but not unexpected if you think about it. My hope is that Google and other companies will continue to develop algorithms to automatically detect these things. Not that click fraud will go away completely, but perhaps it can be minimized and the detection can be mostly automated. Co-operating with Google (or other advertising services) and providing them with all the information you can helps them develop those algorithms.
Any kind of computer-based ad system is subject to attack in one form or another. Impression-based fraud is just as possible as click-based fraud. It may seem that click fraud is suddenly a problem, but that's only because companies like Google have managed to automate advertising delivery so much that the ads have reached the critical mass to attract fraudsters of all kinds. In other words, I'm not convinced that pay-per-click advertising is doomed, because I'm sure fraud would hit whatever alternative would take its place.
gregbo: Yeah … I have always thought click fraud was inevitable. (Part of my interest in this topic is wondering whether Google thought it was inevitable.) Actually, click fraud has been a known problem since the birth of online performance-based advertising. However, I think the degree of fraud would be much less if fixed fees were used instead of performance (clicks, impressions, etc.). It would also scale better and be cheaper to administer at all ends (publisher, advertiser, SE). The only drawback I can think of is that people feel they're not getting any value if they pay up front for exposure but don't get clicks, etc.
Eric: I'm sure Google thought it was inevitable and planned to handle it somehow, though maybe the extent of the fraud has surprised them.
A fixed-fee approach doesn't solve the problem. Magazines and TV programs use that approach, but the rates are based on knowing how many readers/viewers the advertiser can reach. Counting that audience accurately is what led to the creation of things like the Audit Bureau of Circulations. You'd need some way to accurately measure traffic and reach in order to charge more, and fraudsters would just devote their energies to getting bogus traffic or otherwise gaming the system. The fraudsters follow the money, after all.
gregbo: I don't mean to suggest that fixed fees will solve the fraud problem completely. It is much more difficult to defraud the system using fixed fees; you have to compromise the servers that serve the ads, or the advertisers' accounts.
There is no reason that audiences for web sites can't be sampled the way they are for newspaper circulation, TV viewership, etc. The ad prices can be determined using circulation, or people can bid on price as they do now. The point is that the spend does not vary with traffic.
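The point that fixed-fee spend does not vary with traffic can be made concrete with a little arithmetic. The following is only a minimal sketch with made-up numbers; the function names `ppc_spend` and `fixed_fee_spend`, the click counts, and the prices are all hypothetical and are not meant to describe how AdSense or any other program actually bills.

```python
def ppc_spend(genuine_clicks, fraudulent_clicks, cost_per_click):
    """Pay-per-click: the advertiser is billed for every recorded click,
    whether or not a real prospect was behind it."""
    return (genuine_clicks + fraudulent_clicks) * cost_per_click


def fixed_fee_spend(fee, genuine_clicks=0, fraudulent_clicks=0):
    """Fixed fee: the advertiser pays the agreed price for the period,
    no matter how many clicks (genuine or not) are recorded."""
    return fee


if __name__ == "__main__":
    genuine = 1000   # hypothetical genuine clicks in the period
    cpc = 0.25       # hypothetical cost per click
    fee = 250.0      # hypothetical fixed fee for the same period

    for fraud in (0, 200, 1000):
        print(
            f"fraudulent clicks: {fraud:5d}  "
            f"PPC spend: {ppc_spend(genuine, fraud, cpc):7.2f}  "
            f"fixed-fee spend: {fixed_fee_spend(fee, genuine, fraud):7.2f}"
        )
```

Under these assumptions the PPC bill grows with every bogus click while the fixed fee stays put. The sketch says nothing about how the fee itself gets set, which is where the two sides of this discussion disagree.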
I would be surprised if Google did not think they would have as much click fraud as they do.
Eric: Maybe I'm just pessimistic, but I think fixed fees would just make the fraudsters shift gears to inflating the traffic rankings of the sites in question. Maybe that would be harder initially, but it could probably happen.
The difficulty in all of these things is that it's hard to accurately measure web traffic in general, especially when you're trying to identify unique visitors. Sampling will work well enough for large-traffic sites, but not for the small guys. I think Google's big innovation with pay-per-click was being able to open the advertising pool to small sites. These sites would probably not benefit from switching to a fixed-fee system; they'd probably be abandoned entirely. AdSense makes a lot of money from these sites, so I doubt Google is interested in abandoning them. I suspect they'll just keep working hard on the click fraud problem. Ultimately, it just may be that advertisers will have to factor in a certain amount of “wastage” in their ad plans, as I don't think all click fraud can be completely avoided. However, this is no different than what they have to swallow today with more traditional forms of advertising, including the fixed-fee system you're proposing…
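The claim that sampling works for large-traffic sites but not for small ones can be illustrated with the textbook margin of error for an estimated proportion. This is a rough sketch only: it assumes simple random sampling and the normal approximation (which is itself unreliable when only a handful of respondents name a site), and the function name, audience shares, and sample size are hypothetical.

```python
import math


def relative_margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for an estimated audience share,
    expressed relative to the share itself.

    Assumes simple random sampling and the normal approximation to the
    binomial; `share` is the true fraction of the population that
    visits the site.
    """
    if not 0 < share < 1:
        raise ValueError("share must be strictly between 0 and 1")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    standard_error = math.sqrt(share * (1 - share) / sample_size)
    return z * standard_error / share


if __name__ == "__main__":
    n = 10_000  # hypothetical survey panel size
    for label, share in (("large site", 0.10), ("small site", 0.0001)):
        moe = relative_margin_of_error(share, n)
        print(f"{label}: share={share:.4%}, relative margin of error ~ {moe:.0%}")
```

With these made-up figures the large site's audience is pinned down to within a few percent, while the small site's estimate is off by more than its own size, because only about one respondent in the whole panel would be expected to mention it.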
gregbo: The fraudsters can't inflate the traffic rankings if the bid (or fixed fee) includes the ranking. It's not considered fraud if one pays more money to get a better ranking, similarly to how someone would pay more money to purchase something at an auction.
There are all sorts of tiny radio stations, magazines, etc. with small market penetration that seem to do ok with sampling. Plus, if these web sites are so little, how is it that they make so much money? How much money are the advertisers actually making from products advertised on the sites? Quite possibly, these sites have actual user populations that are larger than the actual user populations of the magazines, etc., which means an appropriately chosen sample could provide a reasonable indication of audience size. (But that's only if audience size is a factor in price; if advertisers bid for placement, they basically set the market, regardless of audience size.)
There's no reason for advertisers who pay fixed fees to abandon small sites. Why would they (assuming they're actually making money off of those sites)? Advertisers paid to reach tiny audiences that read computer hobbyist magazines way back in the 1970s, long before even the PC boom.
Don't you think that the degree of fraud that can be done via pay-per-performance advertising is far worse than through traditional advertising? Not that there are no problems with traditional advertising, but in my study of advertising models, I have never heard of anything that came anywhere close to the degree of fraud that can be perpetrated using pay-per-performance advertising. For example, it's far easier for someone to launch an attack on their rival causing them to be dropped from AdSense than it is for a traditional print publication to cause all of the advertisers to stop advertising in a rival's publication.
Eric: While the analogy with small radio stations and magazines works to some degree, it fails in some important respects. There are only a limited number of radio stations per market, since it's a regulated market. And there are a limited number of magazines. In contrast, there are hundreds of thousands of sites running PPC advertising.
The problem with PPC advertising when compared to traditional advertising is not the fact that it's PPC, in my opinion, but the fact that it's online advertising. Music piracy only became a huge issue when music went digital. Same deal here. We're still struggling to figure out the answer to the music question (I think part of it is charging less per song) and we're still struggling with the answer for online advertising piracy.
gregbo: Where Google went wrong, IMO, is that they didn't factor the economics of the Internet into their design of PPC advertising. They didn't take into account how easy it is to get Internet access, how easy it is to set up anonymous proxies, VPNs, etc., not to mention how easy it is to distribute malware that can be used to commit click fraud over a large number of computers.
But this isn't the first time Google has erred in this regard. Early in their existence, they made claims that PageRank was impervious to index spam, because no one could create a large enough link farm that could foil their anti-spam algorithms. They didn't take into account how easy it was to create web sites, and how for some queries, a link farm can be created that's approximately the same size and connectivity of an “organic” set of links. Basically, their algorithms could not discern the intent of linkers.
Regarding other media (radio, print, etc.), what is being sampled is the audience (the potential buyers of goods), not the media themselves. So it's not unreasonable to ask someone in a survey what web sites they usually visit, or the type of information they look for. If the vast majority of sites are never given as responses, this is not a flaw in the methodology. Also keep in mind that advertisers don't really need to know (from samples) how much to spend, because there is already some feedback present in the referrals they get from search queries or other sites that link to them. Thus, advertisers can bid for placement on search engines or publisher sites. If there is an issue with what the minimum bid should be, the web sites are hosted by ISPs who post the prices of bandwidth and disk space, so an advertiser can get a baseline figure of what costs the publisher bears. The marketplace will sort out the maximum bid prices.
Eric: Ah yes, the Doctrine of Google Infallibility is put into disrepute. Well, despite all the brain power at its disposal, there are times when Google will be wrong, and perhaps this indeed was one of them.
The problem is that their mission to boldly index where no one's indexed before can only be done by automated means, and of course that algorithmic approach pervades everything Google does, including its advertising. It seems to be necessary if you want to build truly complete indices. Look at the Yahoo! directory, the first successful attempt at categorizing the Web. It soon fell behind because the human editors simply couldn't keep up and they've had to resort to charging fees to get sites listed in their directory. While you can argue that it's the way they monetize their directory, it also serves nicely as a limiting mechanism to stem the flow of sites into the directory. The self-limiting nature of the directory can be adjusted at any time by raising or lowering the price of a listing, and without it the Yahoo! directory would sink under its own weight.
I doubt that surveying people as to what sites they visit will give you a comprehensive sampling of the sites they truly visit. People forget the sites and they may also lie to avoid reporting certain surfing habits. This is why the TV guys moved to using technology to collect this information instead of relying on people writing things down. But surveying by automated means is just as susceptible to being gamed as pay-per-click. What you're advocating is returning to the pre-Google days of advertising.
I don't have a real solution here. (If I did, I'd be out raising some venture capital money!) In the end I suspect the marketplace will settle on a combination of different advertising models. Maybe AdSense and AdBrite will be the two dominant programs, for example, and the advertisers can choose the one that gets them the best return.
gregbo: It is true that people can lie about what sites they visit, but there is no incentive to do so, particularly when the surveys are used to determine who will advertise where. If people really want to see their “free” content remain “free,” they will fill out the surveys as accurately as possible. As to people forgetting, true, this happens, but it is not a problem with the methodology. I could make an argument for software that randomly downloads people's browser caches to a collection site as input to survey data (and such software actually does exist), but due to the spread of adware, spyware, etc., people are much less likely to allow it on their systems (and businesses even less so).
To be honest, I'm not anti-technology; I just want to see technology appropriately used. For example, I don't think reasonable people, once things like an IP address, or an IP packet are explained to them, have trouble with the type of statistics that are kept on them by routers or hosts. There are benchmarks that are used to judge which network devices are better at processing packets, and people can have some confidence that the benchmarks are verifiable.
However, the situation is much less clear when talking about things like page impressions, or stickiness, or most of the other statistics that are used as input to determine how valuable a site is, what prices should be for ads placed on them, etc. You cannot measure that which is not communicated to you, and which is not available to be communicated. But an infrastructure has been built which measures other things (such as the request of a page from a server) and calls that an impression. Then, people start arguing about whether it was requested by a caching server, etc., without perhaps stopping to consider if a human being actually looked at the page (which is really the most critical issue).
To get an idea of where I'm coming from in general, read the section of the analog documentation on how the web works, and follow the links at the bottom of the page for further discussion. I'm sure most of this is pretty obvious to you; the problem is that it's not obvious to people who are empowered to make critical business decisions that, once made, are difficult to change. With a carefully conceived architecture for describing events that take place on the web, it might be possible to fully automate the collection of those events in a reliable, secure manner that advertisers, publishers, and users of web sites could all agree on. However, we are nowhere near that today.
Eric: Presumably the guys at Google understand all of this!
I'd argue, in fact, that what you (and those pages) say about Web stats being meaningless actually strengthens the argument for pay-per-click (as opposed to pay-per-impression) advertising when the PPC is implemented the way AdSense does it: a dynamically generated link inserted via JavaScript. A click on such an ad is much more meaningful, because there is almost a one-to-one correspondence between it and a human-initiated action. No proxies/caches fooling the stats. If there were no click fraud, it would almost be ideal. And it was like that at the beginning.
I just don't think people will fill out survey data accurately and in a meaningful enough quantity to be statistically significant. Not without being rewarded, and I don't think “free content” is the right reward.
gregbo: But there is click fraud, and it has to be factored into the architecture and design of collecting statistics and charging appropriately.
I probably shouldn't have brought up sampling. AFAIAC, people can bid on ad prices if they feel more comfortable doing so. The point is that after the final bid, they pay the fixed price for the time period the ad runs, not on a per-some-web-action basis.
Actually, very soon after online advertising appeared, click fraud occurred. I remember reading articles during the mid-1990s about sites that experienced click fraud and how this was causing people to lose faith in online advertising. Then the bubble began, and people started going gung-ho with online advertising despite the warnings. When the bubble burst, a lot of the online advertising activity subsided, but since Google (and Overture) have had success, it's heated up again.
Readers, what do you think?