tag:blogger.com,1999:blog-8016696494330504473.post1951816253170822526..comments2008-09-16T17:25:47.335-04:00Comments on Redirecting to http://thenoisychannel.com...: Q&A with Amit SinghalDaniel Tunkelanghttp://www.blogger.com/profile/10240432137428080022noreply@blogger.comBlogger6125tag:blogger.com,1999:blog-8016696494330504473.post-79153407839449553212008-04-15T00:50:00.000-04:002008-04-15T00:50:00.000-04:00This was also a topic of discussion while I was at...This was also a topic of discussion while I was at MS -- we did in fact publish some papers on spam and adversarial IR based on collaborations with MSR, but we were well aware that blackhats/spammers out there were reading these papers. When spam proves to be an existential threat to the relevance of search engines, things like transparency are sometimes hard to justify. <BR/><BR/>However, Google (and Yahoo and Live) have done a good job of providing at least some transparency to siteowners with their webmaster tools (e.g., whether your site is being hit by spam filters). These didn't exist until relatively recently, but I'd say everyone's now come around to seeing the benefits of engaging with legitimate siteowners and offering them information in return for their registering themselves, their sites and sitemaps in a formal way. This is a sea change from a previously adversarial relationship with all siteowners to one that tries to engage with normal/"good" sites.<BR/><BR/>Some other thoughts: <BR/>1) Search engines are always looking for proxies for relevance (like links), and aspects of an attention-economics approach are already in play. Unless I'm misunderstanding, however, this too is currently prone to being gamed -- for example, botnets can be frighteningly effective.<BR/><BR/>2) Can there ever be a truly (or even quasi-) objective definition of relevance across the web? Obviously, engines have ways of measuring their effectiveness, but those definitions are ultimately subjective ones. 
Would an "open" standard of relevance achieve this? <BR/><BR/>I'm inclined to think web relevance will remain subjective by virtue of the nature of the dataset, and because the stakes are high, monetarily speaking, for all involved. I'll also say that, given the amount of money involved, spammers will try very, very hard to game whatever system is put out there. There are a few billion too many involved, and thus, unsurprisingly, a large number of smart folks working on spamming. Perhaps I have a dimmer view of human nature after dealing with this for a while, but I'm skeptical we can ever end the arms race with spammers unless the monetary incentive decreases in some way :)<BR/><BR/>Another thought is that an ancillary beneficiary of spam succeeding is often the search engine's ad wing itself, in terms of fees. This isn't to suggest any conspiracy, just an example of the weird dynamics often at play vis-a-vis spam.Aaswath Ramanhttps://www.blogger.com/profile/13063303907514409659noreply@blogger.comtag:blogger.com,1999:blog-8016696494330504473.post-37583307614559733842008-04-14T08:35:00.000-04:002008-04-14T08:35:00.000-04:00I might be wrong, but if we built a ranking system...I might be wrong, but if we built a ranking system based on user visitation data, wouldn't that be gamed as well? Instead of "link farms", wouldn't we get "traffic farms" that artificially inflate the user visitation values of some sites by directing extra (fake) traffic to them?Markhttps://www.blogger.com/profile/07597409454396813818noreply@blogger.comtag:blogger.com,1999:blog-8016696494330504473.post-74978004683515105662008-04-12T15:11:00.000-04:002008-04-12T15:11:00.000-04:00Back in the day PageRank succeeded because it was ...Back in the day PageRank succeeded because it was a baseline approximation for user data. With richer user visitation data (e.g., through toolbars), PageRank becomes moot. 
See <A HREF="http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/" REL="nofollow">here</A>.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-8016696494330504473.post-91554403174852538322008-04-11T17:30:00.000-04:002008-04-11T17:30:00.000-04:00Well, back in the day PageRank succeeded because i...Well, back in the day PageRank succeeded because it worked better than existing approaches, as well as being harder to game.<BR/><BR/>But yes, the "attention economics" approach would be interesting, if there were some way to measure attention that did not incentivize people to mount DoS attacks to simulate it 8).Markhttps://www.blogger.com/profile/07597409454396813818noreply@blogger.comtag:blogger.com,1999:blog-8016696494330504473.post-35901605662863685882008-04-11T15:07:00.000-04:002008-04-11T15:07:00.000-04:00Well, the initial success of PageRank, at least as...Well, the initial success of PageRank, at least as far as I can tell, came from it being harder to game than the IR measures that other search engines were using at the time. Since then, of course, it's been an arms race.<BR/><BR/>I'd really love to see the relevance arms race replaced with a principled approach based on <A HREF="http://en.wikipedia.org/wiki/Attention_economy" REL="nofollow">attention economics</A>.Daniel Tunkelanghttps://www.blogger.com/profile/10240432137428080022noreply@blogger.comtag:blogger.com,1999:blog-8016696494330504473.post-3274461445082102482008-04-10T14:35:00.000-04:002008-04-10T14:35:00.000-04:00From the "loyal opposition" (the author of How Pag...From the "loyal opposition" (the author of <A HREF="http://www.skrenta.com/2007/12/pagerank_wrecked_the_web_3.html" REL="nofollow">How Pagerank wrecked the web</A> founded a new search company), essentially arguing that Google's ranking algorithms (effective but ultimately arbitrary in some sense) are in fact the origin of the arms race. 
There must be some "less gamable" ranking system out there....Mark Watkinshttps://www.blogger.com/profile/12712666299523294530noreply@blogger.com