Hunting in Groups

Mar 27, 2005
 

With the sheer quantity of information available on the internet, mining it for relevance is where a lot of cool technology is being developed. PageRank revolutionized search and made it more democratic. Searching is no longer hit or miss: today, Google can search over 8 billion web pages and return the most relevant results in less than a second.

What’s next in searching? What search engines lack is the ability to customize results depending on the user. Specific search terms (“Fourier Transform”, “Aishwarya Rai on Letterman”, etc.) return good results. Vague, general searches (“Interesting science fiction”, “Funny blog”) might not return what you expect; even if they do, everyone gets the same results, and your “funny blog” might suck ass to me.

Collaborative Filtering might be the solution. The idea is simple enough: if someone else buys the same books that you do, then there is a good chance that you’ll be interested in the next book she buys. Amazon and Netflix use it to recommend books and movies to users; it is straightforward for online merchants to do this, since they already record what each customer buys.
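To make the "same books" idea concrete, here is a minimal sketch in Python. It is not how Amazon or Netflix actually do it; the shopper names, the book titles, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical purchase histories: shopper name -> set of book titles.
purchases = {
    "you": {"Dune", "Neuromancer", "Foundation"},
    "alice": {"Dune", "Neuromancer", "Snow Crash"},
    "bob": {"Cooking Basics", "Gardening 101"},
}


def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, purchases, top_n=3):
    """Rank items the user does not own by the similarity of the shoppers who bought them."""
    mine = purchases[user]
    scores = Counter()
    for other, theirs in purchases.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0.0:
            continue  # shoppers with nothing in common contribute nothing
        for item in theirs - mine:
            scores[item] += similarity
    return [item for item, _ in scores.most_common(top_n)]


print(recommend("you", purchases))  # prints ['Snow Crash']
```

Alice shares two of your three books, so her extra purchase gets recommended; Bob shares none, so his books never show up.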

But how could a search engine use Collaborative Filtering to tailor results? The easy way would be to ask users what they like, but that’s not gonna work well, is it? At Amazon, apparently, only about 1% of the items have user ratings. More subtle, implicit methods would look at the user’s browsing patterns and form “opinions” from them.

The Economist Technology Quarterly has a fascinating article on this. Will this be the next PageRank?

Collaborative filtering starts off by collecting data on individuals’ preferences. This can be an explicit process, by which a user ranks a book (or CD, or restaurant) on a numerical scale, typically on a scale of one to five. It can also be an implicit process: a purchase, for instance, is a clear indication that an individual is interested in the item in question. But implicit measures can also be more subtle; for instance, the amount of time spent viewing a web page, or even just the “clickstream”, the sequence of links clicked on by a person browsing on the web. These different methods can then either be aggregated into a single score, or stored separately to allow more detailed analysis. And sometimes, consumers will be asked to score the same item in different ways: for instance, what one thought of the food at a restaurant, and what one thought of the service.

Where the user of a search engine is on a solitary quest, the user of a collaborative-filtering system is part of a crowd. Search, and you search alone; ramble from one recommendation to another, and you may feel a curious kinship with the like-minded individuals whose opinions influence your own, and who are, in turn, influenced by your opinions.
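The mix of explicit and implicit signals described above could be folded into a single score along these lines. This is only a sketch: the `Signals` fields, the 0.7/0.3 weights, and the five-minute viewing cap are made-up assumptions, not anything taken from the article, and the clickstream is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signals:
    """What we know about one user and one item (all fields hypothetical)."""
    rating: Optional[int] = None   # explicit score from 1 to 5, if the user gave one
    purchased: bool = False        # implicit: the user bought the item
    seconds_viewed: float = 0.0    # implicit: time spent on the item's page


def preference_score(s, max_seconds=300.0):
    """Collapse the signals into one number between 0.0 and 1.0."""
    if s.rating is not None:
        # An explicit rating is trusted over any implicit guess.
        return (s.rating - 1) / 4
    # Otherwise guess from behaviour; the weights are arbitrary assumptions.
    view_fraction = min(s.seconds_viewed / max_seconds, 1.0)
    return 0.7 * s.purchased + 0.3 * view_fraction


print(preference_score(Signals(rating=4)))          # prints 0.75
print(preference_score(Signals(purchased=True)))    # prints 0.7
```

Storing the raw `Signals` record and computing the score only when needed keeps both options from the article open: a single aggregate number, or the separate measurements for more detailed analysis.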
