Join our email list for announcements for both this series and the Yahoo!/UT data mining series.

See video links on the Schedule page for live stream or archived recording of each talk.

Thursday March 22, 2012, 3:30pm, UTA 1.208

Siddharth Suri

Cooperation in Static and Dynamic Networks


This talk describes the results of a series of web-based, behavioral experiments designed to understand people's ability to cooperate in static and dynamic networks. In the context of static networks, it was previously thought that cooperation should fare better in highly clustered networks such as cliques than in networks with low clustering such as random networks. To test this hypothesis, we conducted a series of experiments, in which 24 individuals played a local public goods game arranged on one of five network topologies that varied between disconnected cliques and a random regular graph. In contrast with previous theoretical work, we found that network topology had no significant effect on average contributions.

Since humans have a natural tendency to choose with whom to form new relationships and with whom to end established relationships, we also study cooperation in dynamic networks. Helping cooperators to mix assortatively is believed to reinforce the rewards accruing to mutual cooperation while simultaneously excluding defectors. Here we report on another series of human subjects experiments in which groups of 24 participants played a multi-player prisoner's dilemma game where, critically, they were also allowed to propose and delete links to players of their own choosing at some variable rate. Over a wide variety of parameter settings and initial conditions, we found that endogenous partner selection significantly increased the level of cooperation, the average payoffs to players, and the assortativity between cooperators.

Joint work with Jing Wang (NYU) and Duncan Watts (Yahoo! Research).


I joined the Human & Social Dynamics group at Yahoo! Research led by Duncan Watts in August 2008. Prior to that I was a postdoctoral associate working with Jon Kleinberg in the computer science department at Cornell University. I earned my Ph.D. in computer and information science from the University of Pennsylvania in January 2007 under the supervision of Michael Kearns.

There are two main threads to my research. Both focus on the study of networks, albeit using different techniques. I design algorithms for analyzing the structure of large graphs using the MapReduce programming paradigm. I also conduct web-based behavioral experiments to understand how network topology impacts human behavior.

Friday, April 27, 2012, UTA 1.208

Victor Tsaran






Previous Speakers

Tuesday February 21, 2012, 3:15pm, UTA 1.208

Mounia Lalmas

User Engagement: A Scientific Challenge


In the online world, user engagement refers to the quality of the user experience that emphasizes the positive aspects of the interaction with technology and, in particular, the phenomena associated with wanting to use that technology longer and more frequently. This definition is motivated by the observation that successful technologies are not just used, but engaged with. Engagement is measured in many ways: through self-report methods (e.g., questionnaires), observer methods (e.g., facial expression analysis, speech analysis, desktop actions), and neuro-physiological signal processing methods (e.g., respiratory and cardiovascular accelerations and decelerations, muscle spasms). However, little is known about validating and relating these metrics, and hence about providing a firm basis for assessing the quality of the user experience. My research aims to address this problem by combining techniques from web analytics, information retrieval evaluation, and existing work on user engagement from the domains of information science, multimodal human computer interaction and cognitive psychology.

This talk comprises three parts: (1) I will define user engagement, list its many characteristics as identified in the research and analytic literature, and discuss through real examples the challenges associated with measuring user engagement. (2) I will describe recent data-driven approaches looking at user engagement through the development of new measures that allow for a better representation of how users engage with and across different web services. (3) I will describe how emerging research directions looking at affect and cognition are providing additional insights into measuring user engagement.

This work was done in collaboration with Ioannis Arapakis, Ricardo Baeza-Yates, Georges Dupret, Janette Lehmann, Lori McCay-Peet, Vidhya Navalpakkam and Elad Yom-Tov.


"I joined Yahoo! Research in January 2011, where I work on network-wide, affect- and attention-based models and measures of user engagement. Prior to come here, I was a Microsoft Research/RAEng Research Professor at the University of Glasgow, working on applying quantum theory to model information retrieval. Between 2002 to 2007, I co-led the Evaluation Initiative for XML Retrieval (INEX), a large-scale international project, responsible for defining the nature of XML retrieval, and how it should be evaluated."

Friday January 20, 2012, 1pm, UTA 1.208

Elizabeth Churchill

The Science and Design of Internet Experiences


As its name suggests, Human Computer Interaction is centrally concerned with understanding how people experience computational technologies, and with designing technologies with people's capabilities, characteristics, preferences, passions and proclivities in mind.

In this talk I will discuss the increasingly broad remit of human computer interaction (HCI) as a discipline. This expansion is driven in large part by the proliferation of everyday consumer devices, the applications that are being built for them and the Internet as a far-reaching platform for creation, distribution, recruitment, evaluation and experimentation.

I will talk about some of the projects being conducted by the Internet Experiences Group of Yahoo! Research, and consider the ways in which research, practice and development can and do speak to each other. I will lay out some of the challenges and opportunities we face as HCI researchers, practitioners and students. In the process I will reflect on what are, in my opinion, some familiar terms associated with HCI methods that are in need of a dusting off, among them: user-centered, end-user, interactive, iterative, qualitative, quantitative, scale, sample and population.


"I am a Principal Research Scientist at Yahoo! Research in Santa Clara, CA, where I manage the Internet Experiences Group. I am also the current Vice President of SigCHI, the ACM's Special Interest Group for Human Computer Interaction.

"My research focus is social media. At the highest level, I am interested in emerging digital media ethnoscapes - the fluid, shifting landscape of people and groups - that make up internet life. I take a human centered approach to design and innovation, and believe that lasting innovations derive from a deep understanding of how technologies are woven into everyday lives. Therefore my work centralizes the influence of social and cultural factors on people's adoption and adaptation of technologies. Over the last decade, I have conducted studies to address how social technologies and social media are created, consumed, adopted and adapted in different regions: in Japan, the US and the UK - and also in virtual worlds. I have designed applications for personal (mobile, desktop) and public space settings.

"My work at Yahoo! Research continues threads that have been previously developed: mediated collaboration, mobile connectivity, transmedia technologies, digital archive and memory, and the development of emplaced media. My work and ideas elaborated in more detail in my written publications, and embodied in technology prototypes and products."

Thursday December 1, 2011, 3pm, UTA 1.208

Chris Dyer

Using MapReduce to Learn Language and Translation at Scale


RAM size, disk capacity, network bandwidth, and processing power of commodity computers have been growing steadily, but at very different relative rates. In particular, disk capacity (and with it, data volumes) has grown much faster than either processing power or the bandwidth of the channels that connect a processor to its storage. As a result, efficiently processing large data volumes with non-specialized hardware requires optimized access patterns. Programming models like MapReduce (Dean & Ghemawat, 2004) provide a framework that helps engineers develop algorithms with access characteristics that enable them to run efficiently on large clusters of commodity hardware.

In the first part of this talk, I will discuss how to formulate several basic problems in natural language processing (NLP) and statistical machine translation (SMT), including computing co-occurrence statistics, EM algorithms, and gradient-based optimization techniques, in the MapReduce programming model.

In the second part of the talk I discuss distributed learning algorithms that were developed with the constraints and opportunities of the MapReduce programming paradigm in mind. As a case study, I present some new discriminative approaches to learning machine translation that are both well-suited to a coarsely synchronized distributed architecture and have excellent theoretical and empirical guarantees.

This is joint work with Patrick Simianer, Stefan Riezler, and Phil Blunsom.


Chris Dyer is a postdoctoral researcher in Noah Smith's lab in the Language Technologies Institute at Carnegie Mellon University. He completed his PhD on statistical machine translation with Philip Resnik at the University of Maryland in 2010. Together with Jimmy Lin, he is author of Data-Intensive Text Processing with MapReduce, published by Morgan & Claypool in 2010. Current research interests include machine translation, probabilistic machine learning, learning from noisy and incomplete data, and "big data" problems in NLP.

Friday, October 28, 2011, 1pm, UTA 1.208

Bob Moore

A Name is Worth a Thousand Pictures: Referential Practice in Search-Engine Interactions


Today's Internet search engines are highly effective in returning relevant web pages to users in mere seconds. Yet when interacting with search engines, users nonetheless experience troubles and frustrations, which are still poorly understood. One particular kind of trouble stems from users' level of prior knowledge about entities of interest, particularly regarding their names. This lab study examines how referential practice is organized in the context of search-engine interactions. It finds that, as in human conversation, users employ naming in their queries to refer to entities if they can. However, when they do not know the name, or a name fails, they attempt a two-stage search: first they search for the name, using generic descriptions combined with an image-matching strategy, and second, if the name is found, they formulate the final query using that name. A novel method -- computer interaction analysis -- is used to reveal formal features of users' referential practices from recordings of screen video with eye tracking and a novel transcription scheme.


Bob Moore is currently a Senior Research Scientist in the Internet Experiences Group at Yahoo! Labs, where he is examining human-computer interaction and designing online games. In the past he has worked as a researcher at the Xerox Palo Alto Research Center (PARC) and also as a game designer at The Multiverse Network. Bob's past research includes studies of face-to-face social interaction in print shops, work practices in automobile assembly plants, telephone-mediated interaction in survey call centers, and avatar-mediated interaction in commercial 3D virtual worlds. Bob holds Ph.D., M.S. and B.A. degrees in sociology with concentrations in ethnomethodology, conversation analysis, ethnography and science and technology studies.

Friday September 30, 2011, 1pm, UTA 1.208

Judd Antin

Motivation in the Age of Social Media


Why do people create, interact, and collaborate online? What are the deep motivations that drive so many to invest significant time and energy on Facebook, Flickr, StackExchange, Wikipedia, Twitter, YouTube, and countless other sites? As social media has become the internet's driving force, questions about motivation and incentives have come to the forefront. Motivation, however, is hard to talk about and harder to measure. In this talk I discuss some of the problems with current models of motivation that are enshrined in trends like "gamification." I also discuss some key problems with measuring motivation through surveys, and present a recent study on social desirability effects in reports of motivation on Amazon's Mechanical Turk. Finally, I present thoughts on the future of measuring motivation and developing effective mechanisms for motivating online participation.


Judd Antin is a social psychologist and research scientist in the Internet Experiences group at Yahoo! Research. Judd's areas of expertise include incentives and motivation for online collaboration, "gamification" and game mechanics, online communities, collective action and social dilemmas, as well as trust, reliability, and credibility. His research interests center on user-generated content, social media, the wisdom of crowds, distributed work, and all other forms of online collaboration. Working with laboratory and field experiments, surveys, and qualitative methods, Judd strives to develop a holistic understanding of participation and collaboration and to translate that understanding into innovation for Yahoo! products.

Friday March 11, 2011, 1pm, UTA 1.208

Bo Pang

Anatomy of the long tail: the whys and hows of satisfying niche interests


In many ecosystems on the Web, the vast majority of the items are of interest to a relatively small number of people. Nonetheless, these tail items in aggregate account for a sizable portion of the overall consumption. In this talk, we first present a user-centric perspective on the heavy tail phenomena. In particular, looking at extensive data on user preferences for movies, music, Web search, and Web browsing, we find overwhelming evidence that the vast majority of users are a little bit eccentric, consuming niche products at least some of the time. Our results suggest that the benefit of satisfying tail interests extends beyond direct revenue to second-order gains associated with increased user satisfaction. On the other hand, satisfying tail needs can be difficult due to the lack of historical user behavioral data. We examine ways to overcome this hurdle via a case study of one such task: link suggestion for tail sites.


Bo Pang is a research scientist at Yahoo! Research. She obtained her PhD in Computer Science from Cornell University in 2006. Her primary research interests are natural language processing, information retrieval, and web mining. Her past work includes sentiment analysis, paraphrasing, query log analysis, bridging structured and unstructured data, and computational advertising.


Thursday February 17, 2011

Victor Tsaran

Accessibility 2.0: the power of content aggregation in web applications for creating accessible interfaces


One of the highlights of Web 2.0 is the ability to aggregate and mash the content from one or more data sources and present it elsewhere on the web. While many kinds of data transformations are possible in this way, it is the presentation aspect of content aggregation that attracts many designers and developers to the idea. For example, an interface that was not designed with assistive technology users in mind can be transformed into a fully accessible one by extracting the original content from the data source and building the new interface on top of it.

In this talk I will discuss the accessibility strategy at Yahoo! and demonstrate several production site examples of how we used content from other Internet sites, such as Facebook, Twitter, BBC, Discovery Channel and many others, to build fully accessible applications for the Yahoo! home page.


Victor Tsaran, currently the Senior Accessibility Program Manager at Yahoo!, has fifteen years of experience in the field of accessibility. He helps to drive various accessibility-related initiatives at Yahoo!, such as the Yahoo! Accessibility Labs, educates designers and developers about access technology and how to code/test for it, and evangelizes accessibility and best web development practices inside and outside of the company.

In 1995 Victor co-founded one of the first computer centers for the blind in Ukraine. He later taught computers to visually impaired students in South Asia, Eastern Europe and the Middle East and worked with several international organizations to advance computer literacy and understanding of accessibility among educators in developing countries. Victor participates in various accessibility-related open source projects to enable access to the web and music production software by blind computer users around the world.

See Tsaran's October 2009 interview.
Video: Yahoo makes Web surfing easier for the disabled


Monday January 31, 2011

Vanessa Murdock

Geographic Context on the Web


We lead a double life in parallel social systems. In our everyday experience, we have friends, family, events, and social connections that happen in the real world, and enrich and give meaning to our lives. We also have friends, family, events and social connections that exist primarily online, which also enrich and give meaning to our lives. For most of us, our online life and our offline life have points of intersection such as events that we arrange online, but that take place offline, or places that we visit and then photograph, discuss, and share with our online social community. In this talk we focus on location as an important link between our online and offline lives.

As GPS-enabled devices become ubiquitous, online social platforms increasingly leverage location as an important aspect of our online social context. Web 2.0 platforms such as Foursquare, Flickr, Facebook and Twitter connect people to each other and to their surroundings. Because of the availability of location-based services, and advances in personal computing devices such as smart phones and tablet computers, people increasingly expect services such as search and advertising to be location-savvy as well. With our portable devices increasingly intelligent, we expect services to understand our geographic context without our having to explicitly indicate our location, or even to enable GPS on our handheld devices.

In this talk we present a set of open problems in understanding personal geography, and our current research in discerning a user's geographic context.


Vanessa Murdock is currently a Research Scientist with Yahoo! Research in Barcelona, where she leads the Geographic Context and Experience group. The Geo group is participating in the GLOCAL European project (2009 - 2012) funded under FP7, focusing on understanding geographic aspects of search, such as query intent, and the geographic scope of multimedia data. The Geo group works closely with the Yahoo! Geo Technologies group, based in London, to provide state-of-the-art geo services for a variety of Yahoo! products, including Yahoo! Local. While at Yahoo! Research, she has also worked on multimedia retrieval and sponsored search. Dr. Murdock is the author of the book "Exploring Sentence Retrieval" (VDM Verlag, 2008) and received her Ph.D. in Computer Science from the University of Massachusetts, Amherst, in 2006.


Friday November 12, 2010

D. Ayman Shamma

Staying together: Understanding People and Media in Synchronous Connected Systems


The things we do together spawn conversations; gathering with our friends and families to watch programs, concerts, and events, we share the experience through backchannel conversations, social asides and mutual displays of agreement and disagreement. How does this sharing of experiences in turn shape how we understand the actual event? This talk presents real-world applications designed to facilitate synchronous conversations while sharing media. First, I will examine how people use status updates, such as on Twitter, while they watch live events on TV. By accounting for temporal and conversational features, one can use tweets to segment a long political debate into logical questions. I will also describe new methods for retrieving conversationally salient, not document salient, terms. Second, I will present Zync, a system for synchronized video sharing over instant messaging; in effect this is conversational video on demand. From observing how a YouTube video is shared within a conversation, we develop methods for media segmentation and summarization. Finally, I will show how using implicit conversational data can outperform explicit annotations in automated classification tasks for online videos. Throughout the talk, I will discuss how these examples extend online infrastructures to build highly connected experiences.


D. Ayman Shamma is a research scientist in the Internet Experiences group at Yahoo! Research. He researches synchronous environments and connected experiences both online and in-the-world. Focusing on creative expression and sharing frameworks, he designs and prototypes systems for multimedia-mediated communication, as well as develops targeted methods and metrics for understanding how people communicate online in small environments and at web scale. Ayman is the creator and lead investigator on the Yahoo! Zync project. Using models of creativity and sharing from his research, Ayman creates media art installations that have been reviewed by The New York Times, International Herald Tribune, and Chicago Magazine and exhibited internationally, including Second City Chicago, the Berkeley Art Museum, SIGGRAPH ETECH, Chicago Improv Festival, and Wired NextFest/NextMusic.

Ayman holds a B.S./M.S. from the Institute for Human and Machine Cognition at The University of West Florida and a Ph.D. in Computer Science from the Intelligent Information Laboratory at Northwestern University. Before Yahoo!, he was an instructor at the Medill School of Journalism; he has also taught courses in computer science and studio art departments. Prior to earning his Ph.D., he was a visiting research scientist for the Center for Mars Exploration at NASA Ames Research Center.


Friday October 15, 2010

Ricardo Baeza-Yates

Next Generation Search


We provide our personal vision of what could be the next generation of Web search engines, based on a single premise: people do not really want to search, they want to get tasks done. Hence, the key to a better experience will come from the combination of the deeper analysis of content with the detailed inference of user intent. To achieve this the main ideas are: (1) in place of the indexing that search engines traditionally perform, we have a content analysis phase that spots entities such as people, places, and dates in documents; (2) at query time we assign an intent to the user based on the query and its context; and then (3) we retrieve entities matching the intent and assemble a results page not of documents, but of matching entities and their attributes. In this talk we outline the main research challenges that derive from it.


Ricardo Baeza-Yates leads the Yahoo! Research labs at Barcelona, Spain and Santiago, Chile, and also supervises the lab in Haifa, Israel. Until 2005 he was the director of the Center for Web Research at the Department of Computer Science of the Engineering School of the University of Chile, and ICREA Professor and founder of the Web Research Group at the Dept. of Information and Communication Technologies of Univ. Pompeu Fabra in Barcelona, Spain. He maintains ties with both universities.

He is co-author of Modern Information Retrieval (Addison-Wesley, 2010, 2nd ed.) among other books and publications. He is a member of the ACM (Fellow), AMS, IEEE (Senior), SIAM and SCCC, as well as the Chilean Academy of Sciences, and has received awards from the Organization of American States, the Institute of Engineers of Chile, the University of Waterloo and COMPAQ.

His research interests include algorithms and data structures, information retrieval, web mining, text and multimedia databases, software and database visualization, and user interfaces.