Arrington Still Convinced that Human Editor is a “Dying Breed”

October 2, 2007 by

Although there definitely seems to be some angst out there over the quality of the links on the Digg and Reddit front pages, Michael Arrington is still convinced that the human editor is a “dying breed”. So is crowd sourcing not working because people are too lazy to rate links (as Arrington suggests) or are Digg and Reddit vulnerable to spam (the plethora of Ron Paul links would suggest that they are)?


Kick-Ass Facebook News Application

September 25, 2007 by

Antlook has quietly released News for Facebook. Antlook’s personalized news feed is almost eerily good and the move to Facebook takes Antlook in an interesting direction.

Coupling Antlook’s powerful recommendation engine with Facebook’s social graph adds a fun social dimension to news reading. Friends see each other’s reading in their Facebook feeds. And, as one’s Facebook friends join Antlook’s internal social network, they influence their friends’ personal news feeds just by reading.

Check it out.

Geeky Personalized RSS Service FeedHub Grinds to Crawl

September 25, 2007 by

At DEMOfall yesterday, mSpoke announced FeedHub, a personalized RSS service that purportedly takes an OPML file and creates a personalized RSS feed. After an annoying registration process that requires YAP (yet another password) and uploading an OPML file, a magic personalized RSS URL is created. Unfortunately, the only content on the feed is this:

If you just created your feed, don’t worry — we’re choosing some great content based on what we’ve learned from your OPML.

Important note! Because of the enthusiastic response to our launch at DEMOFall this week, we are a bit overwhelmed with the number of new feeds being created and are actively beefing up our infrastructure to meet the demand. While you can normally expect to see new content in your feed every 3-4 hours, it will currently take 24 hours to start getting content in your new feed.

(emphasis mine)

Sigh. I’d like to evaluate the quality of their personalization, but that’ll have to wait for another day.

Scoble reports that it doesn’t work for him.

Facebook Ads to “Generate Demand”, Sink AdWords: Not Bloody Likely

August 23, 2007 by

The Wall Street Journal reports today on Facebook’s ad targeting plans. Currently, advertisers can target ads based on age, sex, and location. The new service would allow ads to be targeted based on all the information that users post in their profiles (e.g. favorite movies, music, schools attended, religion, political views, etc.).

Both the Journal and TechCrunch make it sound as if Facebook’s ad approach is going to pose a serious challenge to AdWords. I’m skeptical.

A few problems:

  1. While it is truly amazing how much people will disclose about themselves in their profiles, this information is explicit and is therefore biased (as discussed earlier).
  2. Some Facebook applications are injecting meaningful implicit information into the system (e.g. the Twitter and Facebook apps), but when Facebook fatigue sets in, will Facebook have enough of this information to target accurately?
  3. AdWords’ “demand fulfillment” approach–targeting based on what the user is searching at the moment–is far more powerful than Facebook’s proposed “demand generation” approach, which will try to predict interests from much older, more static profile information.
  4. Users are just having waaay too much fun on Facebook to notice ads. As the WSJ states:

    “The addictive quality of social networking means users are so busy reading about their friends that they hardly notice display ads and, even if they do, are loath to navigate away to an advertiser’s site. Advertisers say the percentage of people that click on display ads is lower on Facebook…”

Facebook at War with Abusive Applications

August 17, 2007 by

Facebook is an incredibly rich communications medium.  It’s also ripe for exploitation by abusive applications.

TechCrunch reports that Facebook is cracking down on abusive Facebook applications. The first form of abuse involves applications that install profile boxes that appear differently to the installing user than to his friends. The installing user sees something benign, but friends see a big yellow spam box that screams: “INSTALL THIS APPLICATION NOW!”.

The second form of abuse is notification fraud. Just last night I ran into this with the My Questions application. After installation, it primed itself with a question I never asked: “What’s your favorite drink?”. I asked no such thing and was annoyed that my friends would be bothered with such nonsense. Encouragingly, Facebook is cracking down on this form of abuse as well. Here, however, they lack a technical fix.

CACM Cover Story: Privacy Enhanced Personalization

August 15, 2007 by

The cover story of this month’s CACM is on privacy-enhanced personalization. Alfred Kobsa (UC Irvine professor and father of the field of privacy-enhanced personalization) devotes most of the piece to surveying privacy-related usability research. Most of the findings seem like common sense:

  • Users tend to overvalue small, immediate benefits to disclosing personal information and undervalue potential future negative consequences.
  • Users fall into one of three camps: privacy fundamentalists (disclose nothing), the unconcerned (disclose everything), and privacy pragmatists (make sensible trade-offs)–with the two former classes on the decline and pragmatists on the rise.
  • Users value transparency (knowing how personal data will be used) and control.
  • Users will disclose more to established web sites and sites that have a professional appearance and a privacy policy.

No surprises there.

The tail end of the piece–where Kobsa surveys privacy-enhancing technology–is less fluff. The notable points there:

  • Client-side personalization is very limiting. Duh.
  • Allow pseudonymous access if you can. OK.
  • In collaborative filtering systems, perturb input data to hide users’ true values or deliberately introduce noise in the data, so that users can plausibly deny responsibility for any potentially embarrassing data. Interesting ideas.
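
To make the perturbation idea concrete, here is a minimal Python sketch. It is not taken from Kobsa's article: the function names, the uniform noise, and the noise scale are all my own assumptions. Each user adds zero-mean noise to their ratings before sharing them, so any single reported value is deniable, while averages over many users stay close to the truth.

```python
import random


def perturb_ratings(ratings, rng, noise_scale=1.0):
    """Return a copy of one user's ratings with zero-mean uniform noise added.

    noise_scale is an illustrative assumption, not a recommended setting.
    """
    return {
        item: value + rng.uniform(-noise_scale, noise_scale)
        for item, value in ratings.items()
    }


def item_mean(users, item):
    """Average the reported ratings for one item across all users who rated it."""
    values = [ratings[item] for ratings in users if item in ratings]
    return sum(values) / len(values)


# Hypothetical data: 2000 users who all truly rate movie_a at 4.0.
rng = random.Random(42)
true_users = [{"movie_a": 4.0, "movie_b": 2.0} for _ in range(2000)]
noisy_users = [perturb_ratings(user, rng) for user in true_users]

# Individual reports are off by up to noise_scale, but the aggregate
# stays near the true mean because the noise averages out.
print(noisy_users[0]["movie_a"], item_mean(noisy_users, "movie_a"))
```

The trade-off is the obvious one: more noise means more deniability for each user, but the system needs more users before the aggregates become useful again.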

Intriguingly, he mentions a peer-to-peer approach to collaborative filtering that “allows users to privately maintain their own individual ratings, and a community of users to compute an aggregate of their private data…using homomorphic encryption…[and then for] personalized recommendations to be generated at the client side”. However, no citation for this work is provided.

Sep Kamvar: Personalized Search is Subtle (But is it useful?)

August 13, 2007 by

Read/Write Web has a nice interview with Google’s personalization guru Sep Kamvar.

In response to privacy questions, Sep emphasizes the amount of choice, transparency, and control in Google’s Web History (the richest source of personalized data)–though not in location acquisition (which is another key personalization input). Ultimately, however, Sep concedes that the effect of personalization on search rank is subtle and that users value having a diverse set of results rather than a narrow personalized set.

Sep points to an article in the Financial Times (by Google’s global privacy counsel, Peter Fleischer) that gives some compelling examples of how personalization can provide important context to disambiguate poorly or informally formulated search queries. I remain somewhat skeptical of the value of highly personalized search. The value is much harder to see with search (where someone is looking for something specific) than it is with news (where someone merely wants quality info-tainment).

Netflix Prize Contestants Stuck at 7.8%

August 6, 2007 by

The Netflix Prize turned 300 days old last week and it seems that progress is slowing to a crawl. To win the prize, participants have to beat the predictive accuracy of Netflix’s own Cinematch algorithm by 10%. Tom Slee reports:

The Cinematch score was matched within a week. Within a month the leaders were half way to the winning prize with a 5% improvement. But getting further improvement progress has proved more and more difficult. It took another month to get to a 6% improvement, about 5 more months to get to 7%, and the current (July 29 2007) leader is at 7.8% improvement and has been unchanged for a month
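
For anyone who wants to see what those percentages mean, here is a rough Python sketch of the arithmetic. The contest scores predictions by root mean squared error (RMSE), so a "10% improvement" means an RMSE 10% lower than Cinematch's. The baseline value below is an illustrative placeholder I picked, not Netflix's published figure, and the function names are my own.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def improvement_pct(baseline_rmse, candidate_rmse):
    """Percentage by which a candidate's RMSE undercuts the baseline's."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Illustrative baseline only (an assumption, not the real Cinematch score).
BASELINE = 0.95

target = BASELINE * (1 - 0.10)   # RMSE needed to win the prize
leader = BASELINE * (1 - 0.078)  # RMSE implied by a 7.8% improvement

print(improvement_pct(BASELINE, leader), improvement_pct(BASELINE, target))
```

Note how small the remaining gap looks in absolute RMSE terms; that is exactly why the last couple of percentage points are proving so hard.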

Tom goes on to expose some curious outliers in the data and expresses
skepticism that recommendation systems can unmask the wisdom of

Udi: “Implicit Kicks Explicit’s Ass”

August 1, 2007 by

There’s a great article over at Udi’s Spot on the superiority of collecting metadata implicitly (i.e. through natural user actions):

The massively important, and often overlooked, thing about implicit metadata is that it’s generally trustworthy. It’s like the results of a double-blind scientific study. Explicit metadata on the other hand, while often useful, is always in doubt. It’s like the results of an exit poll during an election. People lie. People are stupid. People are remarkably un-self-aware. Going the explicit route exposes you to all of these problems.

This has huge implications for how the metadata that feeds recommendation engines should be collected.

Some personal news sites seem to understand this (e.g. Google, Antlook, Findory). Others–like Reddit, Netflix, and Digg–don’t get it. These sites require that users rate content up or down (or, in the case of Netflix, on a scale of zero to five stars).

Explicit metadata is not only unreliable (as Udi points out), it’s also sparse. The click tax is very high and many users will simply not rate at all. Sites that are designed this way are throwing out a tremendous amount of implicit information about what users did (and did not do) on their site.
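
Here is a small Python sketch of what harvesting that implicit information might look like. It is purely hypothetical: the event names ("shown", "clicked") and the +1/-1 weights are my assumptions, not a description of how any of the sites above work. The point is that every impression yields a signal, whether or not the user ever bothers to rate anything.

```python
from collections import defaultdict


def implicit_scores(events):
    """Turn a log of (user, item, action) tuples into per-user item scores.

    A clicked item scores +1; an item that was shown but never clicked
    scores -1. Both weights are illustrative assumptions.
    """
    shown = defaultdict(set)
    clicked = defaultdict(set)
    for user, item, action in events:
        if action == "shown":
            shown[user].add(item)
        elif action == "clicked":
            clicked[user].add(item)
            shown[user].add(item)  # a click implies the item was seen

    return {
        user: {item: (1 if item in clicked[user] else -1) for item in items}
        for user, items in shown.items()
    }


# Hypothetical log: ann is shown two stories and reads one of them.
log = [
    ("ann", "story_a", "shown"),
    ("ann", "story_b", "shown"),
    ("ann", "story_a", "clicked"),
    ("bob", "story_b", "clicked"),
]
print(implicit_scores(log))
```

Notice that ann never rated anything, yet the log still says something about both stories she saw. A ratings-only site would have recorded nothing.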

And We’re Off…

July 31, 2007 by

What I hope to accomplish in this blog is to cover the emerging space of personalized web services. No, I’m not talking about the lame portals of yesteryear that allowed you to build your very own portal page with your very own selection of news, sports, finance, and weather feeds! I’m talking about services that automatically build profiles of each user’s tastes and interests, using them to create personalized content. What began as research has been widely deployed by ecommerce sites (e.g. Amazon, Netflix) to drive product recommendations, and is increasingly being used to create personal news services (e.g. Google Personal News, Findory, Antlook) and influence search results.