SmartData Collective
© 2008-23 SmartData Collective. All Rights Reserved.
SIGIR 2009: Day 2, Morning Sessions (Anchor Text, Vertical Search)

Daniel Tunkelang
Last updated: 2009/07/25 at 5:34 PM

Sorry for the delay in postings. Not only was I super-busy the past week, but I had some connectivity challenges (both at SIGIR and at the apartment where I was staying) and mostly restricted my online activity to occasional tweets during talks. I meant to catch up on my blogging yesterday, but instead spent the day wine tasting in Long Island. But enough apologizing, I’m refreshed and ready to blog up a storm!

The second day of SIGIR (Tuesday) started straight off with research talks. I went to the web retrieval session, which consisted of two talks about anchor text and one about privacy-preserving link analysis.

The first talk was “Building Enriched Document Representations using Aggregated Anchor Text”, by Don Metzler and colleagues at Yahoo Labs. They address the challenge of anchor text sparsity (the distribution of in-links for web pages follows a power law) by enriching document representation through aggregation of anchor text along the web graph. Their technique is intuitive, and the authors demonstrate statistically significant improvements in retrieval effectiveness. Unfortunately, their results are not repeatable, since they used a proprietary test collection to obtain them.

The second talk of the session, “Using Anchor Texts with Their Hyperlink Structure for Web Search,” was by a group of authors from Microsoft Research Asia. They address the opposite problem of the previous paper: how to handle too much, rather than too little, anchor text. Specifically, they model dependence among multiple anchor texts associated with the same target document. Like the Yahoo folks, they demonstrate statistically significant results on a proprietary test collection.

The third talk, “Link Analysis for Private Weighted Graphs” (ACM DL subscribers only) by Jun Sakuma (University of Tsukuba) and Shigenobu Kobayashi (Tokyo Institute of Technology), was a bit of an outlier, if one can call a paper in a three-paper session an outlier. The authors offer privacy-preserving expansions of PageRank and HITS, the best-known link analysis methods associated with relevance and authority in web search. I’ve noticed an increasing number of papers like these that mix cryptography with information retrieval or database concerns. One of my frustrations in reading such papers is that I always suspect that people are re-inventing wheels because so few people are able to keep up with research in multiple disciplines.
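For readers who have not seen PageRank spelled out, here is a minimal sketch of the plain, non-private version computed by power iteration. It is not the authors' privacy-preserving protocol, and it ignores edge weights; the function name, the toy graph, and the damping factor of 0.85 are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain PageRank by power iteration (illustrative, unweighted).

    links maps each page to the list of pages it links to.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages with no out-links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if not targets:
                continue
            # Each page splits its damped rank evenly among its out-links.
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page graph: "c" collects the most in-links.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_graph)
```

In this toy graph, page "c" ends up with the highest score and page "d", which nobody links to, with the lowest. A privacy-preserving variant would need to compute the same quantities without any party revealing its own links.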

Then I used the coffee break to solve my own research problem: how to fill the 11:30 slot in the Wednesday Industry Track, since a speaker called in sick that morning. When I walked by the Bing table, I saw Jan Pedersen (Chief Scientist for Core Search at Microsoft), and I begged him to help me out. I must have been a persuasive supplicant, because he procured me Nick Craswell, an applied researcher who works on Bing. Out of gratitude for this 11th-hour favor, I wore a Bing t-shirt all day yesterday as I went wine-tasting. Bing drinking, not binge drinking!

Anyway, that urgent problem resolved, I went back to enjoying the conference. For the second morning session, I went to the vertical search session.

As it turns out, that session kicked off with the SIGIR Best Paper winner: “Sources of Evidence for Vertical Selection” by Jaime Arguello (CMU), Fernando Diaz (Yahoo), Jamie Callan (CMU), and Jean-François Crespo (Yahoo). The authors do a lot of things I like: they apply query clarity as a performance predictor, and they bootstrap on an external collection (specifically Wikipedia). The test collection they use for evaluation is proprietary, but that seems to be the price (at least today) of doing this kind of work.
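Query clarity is usually defined as the KL divergence between a language model of the documents retrieved for a query and a language model of the whole collection. Here is a rough sketch of that idea, not the paper's implementation. It makes two simplifying assumptions: unigram models with no smoothing, and uniform weighting of the top documents rather than weighting by query likelihood. The function names and the toy collection are hypothetical.

```python
import math
from collections import Counter


def unigram_model(texts):
    """Maximum-likelihood unigram model over whitespace-separated tokens."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def clarity(top_docs, collection):
    """KL divergence (in bits) of the top-document model from the collection model.

    Assumes top_docs is drawn from collection, so every word in the
    top-document model also has nonzero probability in the collection model.
    """
    query_model = unigram_model(top_docs)
    collection_model = unigram_model(collection)
    return sum(
        p * math.log2(p / collection_model[word])
        for word, p in query_model.items()
    )


# Hypothetical toy collection; a real system would use retrieved results.
collection = [
    "jaguar car engine",
    "jaguar cat jungle",
    "stock market news",
    "weather report today",
]
focused_score = clarity(collection[:1], collection)
unfocused_score = clarity(collection, collection)
```

A query whose top results look like a random sample of the collection scores near zero, while a query whose results concentrate on distinctive vocabulary scores higher. That is why clarity can serve as a signal for whether a particular vertical is a good match for the query.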

The second talk of the session was by a subset of the previous paper’s authors: “Adaptation of Offline Vertical Selection Predictions in the Presence of User Feedback” by Fernando Diaz and Jaime Arguello. The authors creatively used simulation to evaluate their approach. They did a nice job, but I have to admit I’m skeptical of results about feedback that aren’t based on user studies.

Unfortunately, I missed the third talk of the session because I had to play organizer. But I must have earned some good karma, because I got to enjoy a delightful lunch with Marti Hearst and David Grossman.

Stay tuned for more posts about the interactive search session, the keynote by Albert-László Barabási, the banquet at the JFK Presidential Library and Museum, and of course the Industry Track.

TAGGED: sigir