blog_post_tests/20100415195359.blog

Pandora&#8217;s Biases
<p>Back in February I experimented with Pandora Radio and <a href="http://esr.ibiblio.org/?p=1670">loved it</a>&#8230;enough that I bought a subscription within a few days.  It&#8217;s my background music now; I might never own an analog radio again.</p>
<p>For a while I ran around telling all my friends about how Pandora was the greatest thing! since sliced bread!  you should try it!  But I&#8217;ve stopped doing that, because I&#8217;ve learned that it doesn&#8217;t work as well for other people&#8230;starting with my wife.  I think I know why, and it reveals an interesting failure mode of all such systems.</p>
<p><span id="more-1909"></span></p>
<p>Back in February I commented on my original post saying this:</p>
<blockquote><p>
From the reactions here I think it&#8217;s the case that some seeds and gene clusters are more productive than others under their similarity metric, and I seem to have picked one that&#8217;s at the good end of the distribution. I wonder why that is? I have a tentative guess that it&#8217;s because the stuff I like is complex and has lots of structure, so there are lots of traits sticking out of it.</p>
<p>There may also be a selection bias. The classification system was almost certainly designed by musicians and is certainly applied by musicians, so the traits it&#8217;s going to represent most effectively will be those that are foreground for people with analytical musical ears. And that describes me; a lot of the stuff I like could be truthfully tagged &#8220;only musicians listen to this&#8221;.
</p></blockquote>
<p>60 days later the feedback I&#8217;m getting seems to confirm this pretty strongly.  How well Pandora will work for someone seems to correlate closely with the distance of the center of their tastes from &#8220;stuff musicians like&#8221;. And I think this highlights a likely failure mode of all recommender systems based on a taxonomy.</p>
<p>That is, if you try to do an equivalent of the Music Genome Project for creative content type X, your natural pool of evaluators is <em>people who make content type X</em>.  That pool is much smaller than, and may have different tastes than, most of the X Genome Project&#8217;s potential audience.</p>
<p>But there&#8217;s a subtler and perhaps more important effect &#8211; not different tastes, but different feature filters.  It&#8217;s not just that musicians like somewhat different music than non-musicians do, it&#8217;s that they hear and retain things non-musicians miss.  As a personal example, my memory of electric-guitar solos I&#8217;ve heard more than once or twice is so precise that it includes pick-scrape noises and unintentional quarter-tone off-notes. I can still recall my bemusement when I finally figured out how unusual that is &#8211; that most people have trouble hearing such things even when they&#8217;re cued to the timing and told what to listen for.</p>
<p>Thus: I think an in-built limitation of Pandora is that it will work well <em>if you have feature filters like a musician&#8217;s</em>. Actually, it&#8217;s worse than that &#8211; because my wife is a musician, but doesn&#8217;t hear music in the hyper-analytical way I do, and Pandora doesn&#8217;t work well for her.  So, maybe, the key group is &#8220;musicians listening with their left ears&#8221;. Yes, this actually matters &#8211; it&#8217;s been shown in the lab that left-ear listening activates the analytical left brain.  It makes sense; if you&#8217;re hiring people to analyze music, you&#8217;re likely to find unusually analytical musicians.</p>
<p>The larger point here is that all recommender systems dependent on hiring evaluators are likely to have the same problem.  Even if you work at getting a broad selection of <em>taste</em> in the evaluators (say, by making an extra effort to hire people who understand country &#038; western, or psychotronic films, or 19th-century penny dreadfuls) you&#8217;re likely to end up with a pool that has feature filters different from the general population &#8211; probably more analytical, finer-grained, pickier.  This will leave your classification system with subtle biases, possibly ill-matched to the general population.</p>