SIGIR 2009: Day 2, Morning Sessions (Anchor Text, Vertical Search)

Daniel Tunkelang

Sorry for the delay in postings. Not only was I super-busy the past week, but I had some connectivity challenges (both at SIGIR and at the apartment where I was staying) and mostly restricted my online activity to occasional tweets during talks. I meant to catch up on my blogging yesterday, but instead spent the day wine tasting on Long Island. But enough apologizing: I’m refreshed and ready to blog up a storm!

The second day of SIGIR (Tuesday) started straight off with research talks. I went to the web retrieval session, which consisted of two talks about anchor text and one about privacy-preserving link analysis.

The first talk was “Building Enriched Document Representations using Aggregated Anchor Text,” by Don Metzler and colleagues at Yahoo Labs. They address the challenge of anchor text sparsity (the distribution of in-links for web pages follows a power law, so most pages have very few in-links) by enriching document representation through aggregation of anchor text along the web graph. Their technique is intuitive, and the authors demonstrate statistically significant improvements in retrieval effectiveness. Unfortunately, their results are not repeatable, since they used a proprietary test collection to obtain them.
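To make the general idea concrete, here is a toy sketch of aggregating anchor text one hop along the link graph. It is not the Yahoo authors' method, which I am not reproducing here: the function name `aggregate_anchor_text`, the `discount` parameter, the one-hop rule, and the example URLs are all my own assumptions for illustration.

```python
from collections import Counter, defaultdict


def aggregate_anchor_text(links, discount=0.5):
    """Enrich each page's anchor-text terms with terms borrowed from its in-linking pages.

    links: iterable of (source_url, target_url, anchor_text) triples.
    discount: hypothetical weight applied to borrowed terms (an assumption).
    Returns {target_url: Counter mapping term -> weight}.
    """
    direct = defaultdict(Counter)   # anchor terms pointing directly at each page
    inlinks = defaultdict(set)      # pages that link to each page
    for source, target, anchor in links:
        direct[target].update(anchor.lower().split())
        inlinks[target].add(source)

    enriched = {}
    for page, own_terms in direct.items():
        terms = Counter(own_terms)
        # Borrow, at a discount, the anchor text that describes each in-linking page.
        for source in inlinks[page]:
            for term, count in direct.get(source, {}).items():
                terms[term] += discount * count
        enriched[page] = terms
    return enriched


links = [
    ("a.example/home", "a.example/products", "product catalog"),
    ("b.example/blog", "a.example/home", "acme widgets"),
    ("c.example", "a.example/home", "acme store"),
]
enriched = aggregate_anchor_text(links)
```

In this made-up example, the products page has a single in-link, so its own anchor text is sparse. After aggregation it also carries discounted terms such as "acme" that describe the page linking to it.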

The second talk of the session, “Using Anchor Texts with Their Hyperlink Structure for Web Search,” was by a group of authors from Microsoft Research Asia. They address the opposite problem of the previous paper: how to handle too much, rather than too little, anchor text. Specifically, they model dependence among multiple anchor texts associated with the same target document. Like the Yahoo folks, they demonstrate statistically significant results on a proprietary test collection.

The third talk, “Link Analysis for Private Weighted Graphs” (ACM DL subscribers only) by Jun Sakuma (University of Tsukuba) and Shigenobu Kobayashi (Tokyo Institute of Technology), was a bit of an outlier, if one can call a paper in a three-paper session an outlier. The authors offer privacy-preserving extensions of PageRank and HITS, the best-known link analysis methods associated with relevance and authority in web search. I’ve noticed an increasing number of papers like these that mix cryptography with information retrieval or database concerns. One of my frustrations in reading such papers is that I always suspect that people are re-inventing wheels because so few people are able to keep up with research in multiple disciplines.
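For readers who haven't seen the baseline, here is a minimal sketch of plain PageRank by power iteration. It is the standard, non-private algorithm on an unweighted graph, with none of the cryptographic machinery from the paper. The function name, the default damping factor of 0.85, the fixed iteration count, and the three-node graph are my own choices for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    graph: {node: [nodes it links to]}.
    Returns {node: score}, with scores summing to 1.
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        # Every other node splits its rank evenly among its out-links.
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this toy graph, node "c" ends up with the highest score, since both "a" and "b" link to it.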

Then I had the coffee break to solve my own research problem: how to fill the 11:30 slot in the Wednesday Industry Track, since a speaker called in sick that morning. When I walked by the Bing table, I saw Jan Pedersen (Chief Scientist for Core Search at Microsoft), and I begged him to help me out. I must have been a persuasive supplicant, because he procured me Nick Craswell, an applied researcher who works on Bing. Out of gratitude for this 11th-hour favor, I wore a Bing t-shirt all day yesterday as I went wine-tasting. Bing drinking, not binge drinking!

Anyway, that urgent problem resolved, I went back to enjoying the conference. For the second morning session, I went to the vertical search session.

As it turns out, that session kicked off with the SIGIR Best Paper winner: “Sources of Evidence for Vertical Selection” by Jaime Arguello (CMU), Fernando Diaz (Yahoo), Jamie Callan (CMU), and Jean-François Crespo (Yahoo). The authors do a lot of things I like: they apply query clarity as a performance predictor, and they bootstrap on an external collection (specifically Wikipedia). The test collection they use for evaluation is proprietary, but that seems to be the price (at least today) of doing this kind of work.
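As a reminder of what query clarity measures, here is a simplified sketch: the KL divergence (in bits) between a language model estimated from the top-retrieved documents and the language model of the whole collection. This is not the paper's implementation. The helper names, the unweighted pooling of top documents, the whitespace tokenization, and the assumption that the top documents are drawn from the collection (so every term has nonzero collection probability) are all mine.

```python
import math
from collections import Counter


def unigram_model(docs):
    """Maximum-likelihood unigram distribution over whitespace-separated terms."""
    counts = Counter(term for doc in docs for term in doc.lower().split())
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def clarity(top_docs, collection):
    """KL divergence in bits between the top-document model and the collection model.

    Assumes every document in top_docs also appears in collection.
    """
    query_model = unigram_model(top_docs)
    collection_model = unigram_model(collection)
    return sum(
        p * math.log2(p / collection_model[term])
        for term, p in query_model.items()
    )


collection = [
    "jaguar cars dealer prices",
    "jaguar cars engine review",
    "weather forecast for boston",
    "boston restaurants and hotels",
]
focused_score = clarity(collection[:2], collection)  # results all on one topic
diffuse_score = clarity(collection, collection)      # results mirror the collection
```

A result set concentrated on one topic diverges from the collection and scores higher. A result set that looks like the collection as a whole scores zero, which is the signal that a query is ambiguous or a poor fit for that collection.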

The second talk of the session was by a subset of the previous paper’s authors: “Adaptation of Offline Vertical Selection Predictions in the Presence of User Feedback” by Fernando Diaz and Jaime Arguello. The authors creatively used simulation to evaluate their approach. They did a nice job, but I have to admit I’m skeptical of results about feedback that aren’t based on user studies.

Unfortunately, I missed the third talk of the session because I had to play organizer. But I must have earned some good karma, because I got to enjoy a delightful lunch with Marti Hearst and David Grossman.

Stay tuned for more posts about the interactive search session, the keynote by Albert-László Barabási, the banquet at the JFK Presidential Library and Museum, and of course the Industry Track.
