SIGIR 2009: Day 2, Morning Sessions (Anchor Text, Vertical Search)

Daniel Tunkelang


Sorry for the delay in postings. Not only was I super-busy the past week, but I had some connectivity challenges (both at SIGIR and at the apartment where I was staying) and mostly restricted my online activity to occasional tweets during talks. I meant to catch up on my blogging yesterday, but instead spent the day wine tasting in Long Island. But enough apologizing, I’m refreshed and ready to blog up a storm!

The second day of SIGIR (Tuesday) started straight off with research talks. I went to the web retrieval session, which consisted of two talks about anchor text and one about privacy-preserving link analysis.

The first talk was “Building Enriched Document Representations using Aggregated Anchor Text,” by Don Metzler and colleagues at Yahoo Labs. They address the challenge of anchor text sparsity (the distribution of in-links for web pages follows a power law, so most pages have few in-links and therefore little anchor text) by enriching document representation through aggregation of anchor text along the web graph. Their technique is intuitive, and the authors demonstrate statistically significant improvements in retrieval effectiveness. Unfortunately, their results are not repeatable, since they used a proprietary test collection to obtain them.
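To make "aggregation of anchor text along the web graph" concrete, here is a rough sketch of one way it could work. It is not the authors' method: the one-hop borrowing rule, the `discount` parameter, and the toy link triples are all assumptions of mine for illustration. A page keeps the anchor terms from its own in-links and also borrows, at a discount, the anchor terms that point at the pages linking to it.

```python
from collections import Counter, defaultdict


def aggregate_anchor_text(links, discount=0.5):
    """Enrich each page's anchor-text bag with terms borrowed from its in-linking pages.

    links is an iterable of (source, target, anchor_text) triples.
    discount is a made-up weight for borrowed terms, not a value from the paper.
    """
    direct = defaultdict(Counter)  # target -> term counts from its own in-links
    sources = defaultdict(set)     # target -> pages that link to it
    for source, target, anchor in links:
        direct[target].update(anchor.lower().split())
        if source != target:
            sources[target].add(source)

    enriched = {}
    for target in direct:
        terms = Counter(direct[target])
        for source in sources[target]:
            # Borrow the anchor text that describes the linking page itself.
            for term, count in direct.get(source, {}).items():
                terms[term] += discount * count
        enriched[target] = terms
    return enriched


# Hypothetical link data: "specs" has a single, uninformative in-link.
toy_links = [
    ("news", "home", "acme widgets"),
    ("blog", "home", "acme store"),
    ("home", "specs", "details"),
]
enriched = aggregate_anchor_text(toy_links)
```

In this toy graph, the sparse "specs" page ends up with "acme", "widgets", and "store" in its representation, even though its only in-link says "details".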

The second talk of the session, “Using Anchor Texts with Their Hyperlink Structure for Web Search,” was by a group of authors from Microsoft Research Asia. They address the opposite problem of the previous paper: how to handle too much, rather than too little, anchor text. Specifically, they model dependence among multiple anchor texts associated with the same target document. Like the Yahoo folks, they demonstrate statistically significant results on a proprietary test collection.

The third talk, “Link Analysis for Private Weighted Graphs” (ACM DL subscribers only) by Jun Sakuma (University of Tsukuba) and Shigenobu Kobayashi (Tokyo Institute of Technology), was a bit of an outlier, if one can call a paper in a three-paper session an outlier. The authors offer privacy-preserving expansions of PageRank and HITS, the best-known link analysis methods associated with relevance and authority in web search. I’ve noticed an increasing number of papers like these that mix cryptography with information retrieval or database concerns. One of my frustrations in reading such papers is that I always suspect that people are re-inventing wheels because so few people are able to keep up with research in multiple disciplines.
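For readers who have not seen PageRank spelled out, here is a minimal sketch of the plain, non-private version as textbook power iteration. It says nothing about the cryptographic protocol in the paper; the damping factor of 0.85 is the conventional default, and the four-page graph is made up.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # Each page splits its damped rank equally among its out-links.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web graph.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_graph)
```

Note that the computation needs every edge of the graph in one place, which is exactly what a privacy-preserving variant has to avoid when each party only wants to reveal its own links.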

Then I had the coffee break to solve my own research problem: how to fill the 11:30 slot in the Wednesday Industry Track, since a speaker called in sick that morning. When I walked by the Bing table, I saw Jan Pedersen (Chief Scientist for Core Search at Microsoft), and I begged him to help me out. I must have been a persuasive supplicant, because he procured me Nick Craswell, an applied researcher who works on Bing. Out of gratitude for this 11th-hour favor, I wore a Bing t-shirt all day yesterday as I went wine-tasting. Bing drinking, not binge drinking!

Anyway, that urgent problem resolved, I went back to enjoying the conference. For the second morning session, I went to the vertical search session.

As it turns out, that session kicked off with the SIGIR Best Paper winner: “Sources of Evidence for Vertical Selection” by Jaime Arguello (CMU), Fernando Diaz (Yahoo), Jamie Callan (CMU), and Jean-François Crespo (Yahoo). The authors do a lot of things I like: they apply query clarity as a performance predictor, and they bootstrap on an external collection (specifically Wikipedia). The test collection they use for evaluation is proprietary, but that seems to be the price (at least today) of doing this kind of work.
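Query clarity, for those unfamiliar with it, is the KL divergence between a language model estimated from the documents a query retrieves and the language model of the whole collection: a focused result set diverges sharply from the collection, while an ambiguous one looks like the collection at large. The sketch below is a simplified version under my own assumptions (unsmoothed maximum-likelihood models, uniform weight on every retrieved document, retrieved documents drawn from the collection, a toy corpus) and is not the feature pipeline from the paper.

```python
import math
from collections import Counter


def term_distribution(docs):
    """Maximum-likelihood unigram distribution over a list of tokenized documents."""
    counts = Counter(token for doc in docs for token in doc)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def clarity(retrieved_docs, collection_docs):
    """KL divergence, in bits, between the retrieved-set model and the collection model.

    Assumes every retrieved document is also in the collection, so each
    retrieved term has nonzero collection probability.
    """
    query_model = term_distribution(retrieved_docs)
    collection_model = term_distribution(collection_docs)
    return sum(
        p * math.log2(p / collection_model[term])
        for term, p in query_model.items()
    )


# Hypothetical toy collection: two documents about cars, two about cats.
collection = [
    ["jaguar", "car", "engine"],
    ["jaguar", "cat", "jungle"],
    ["car", "engine", "repair"],
    ["cat", "jungle", "prey"],
]
focused_score = clarity([collection[0], collection[2]], collection)  # car documents only
ambiguous_score = clarity(collection, collection)                    # everything matches
```

A query whose results are all about cars gets a positive score, while a query that retrieves the entire collection scores zero. For vertical selection, one could compute such a score against each vertical's collection and treat a high value as evidence that the vertical suits the query.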

The second talk of the session was by a subset of the previous paper’s authors: “Adaptation of Offline Vertical Selection Predictions in the Presence of User Feedback” by Fernando Diaz and Jaime Arguello. The authors creatively used simulation to evaluate their approach. They did a nice job, but I have to admit I’m skeptical of results about feedback that aren’t based on user studies.

Unfortunately, I missed the third talk of the session because I had to play organizer. But I must have earned some good karma, because I got to enjoy a delightful lunch with Marti Hearst and David Grossman.

Stay tuned for more posts about the interactive search session, the keynote by Albert-László Barabási, the banquet at the JFK Presidential Library and Museum, and of course the Industry Track.
