Andreas Weigend, Social Data Revolution | MS&E 237, Stanford University, Spring 2011 | Course Wiki

Class_15: The End of Serendipity with Tom Glocer

Date: May 17, 2011
Audio: weigend_stanford2011.15_2011.05.17.mp3
Initial authors: Phyo Aung Si, Addy Satija, Misrab Faizullah-Khan


  • Thursday 5/19: HW4A Due. In class, we will talk more about ideas for gamification and incentive design for your QR t-shirt, and learn more about Social Engineering. This time the focus will be on how to create a "cause" and give people an incentive to scan your t-shirt (Virality).
  • Tuesday 5/24: DF7 Due. Jay Nath, Director of Innovation for the City of San Francisco, will be coming to class. One question to think about: "What would you do if you were given the data of all people living in San Francisco?" One answer from Jay: "Somebody will get killed for sure," and the media will report it to get attention. (Think along the lines of how data transparency, e.g. city data, can create social and economic value.)

Key Points

  • Discussion on the changing dynamics of social data and norms
  • Using social data to improve products and user experiences (to help users make better decisions)
  • Finding a balance between sharing data and the value users get in return

Guest Speakers

Michael Sha
  • CEO of Wikinvest, a firm that is making financial data social.
  • Wikinvest provides a platform where users enter their brokerage usernames and passwords, much as other services do for bank account usernames and passwords.
  • Wikinvest pulls its users' transaction history, fees, interest and dividends from their brokerage websites. This information is used to make recommendations to users about investment decisions. Some of these relate to investments the user has already made: for example, recommending an index fund with a lower fee than the index fund the user currently holds.
  • This is a tough business: investment decisions are very subjective. Often, people would rather pay a small fee premium for a more reputable fund manager like Vanguard than choose a cheaper, less trusted one.
    • In this case, user click data can be used to rank the recommendations as well. That is one way social data can be used to make better financial decisions.
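
The two ideas above (recommending a cheaper fund, then ranking the recommendations by user clicks) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Wikinvest's actual method: the fund names, expense ratios, click counts and function names are all hypothetical, and a smoothed click-through rate is just one simple signal that could be used for ranking.

```python
from dataclasses import dataclass


@dataclass
class Fund:
    name: str
    expense_ratio: float  # annual fee as a fraction, e.g. 0.002 = 0.2%
    impressions: int      # times the recommendation was shown (hypothetical data)
    clicks: int           # times users clicked it (hypothetical data)


def smoothed_ctr(fund, prior_clicks=1, prior_impressions=20):
    """Click-through rate with a small prior, so rarely shown funds are not over-ranked."""
    return (fund.clicks + prior_clicks) / (fund.impressions + prior_impressions)


def recommend(current, candidates):
    """Keep only funds cheaper than the current one, then rank them by smoothed CTR."""
    cheaper = [f for f in candidates if f.expense_ratio < current.expense_ratio]
    return sorted(cheaper, key=smoothed_ctr, reverse=True)


current = Fund("Current Index Fund", 0.0050, 0, 0)
candidates = [
    Fund("Well-Known Index Fund", 0.0020, 1000, 120),
    Fund("Cheapest Index Fund", 0.0010, 1000, 40),
    Fund("Pricey Index Fund", 0.0075, 1000, 300),
]

ranked = recommend(current, candidates)
for fund in ranked:
    print(fund.name, round(smoothed_ctr(fund), 3))
```

In this made-up data the well-known fund ranks above the cheapest one because users click it more often, which mirrors the point that people may accept a small fee premium for a manager they trust.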

One challenge is that whenever people share personal or non-personal data on the web (social networking sites, blogs, opinions, Q&A, ...), they usually think about how it might come back to affect them, positively or negatively. Put simply, before sharing any data, users weigh what they will get in return (e.g., positive or negative feedback, potential likes or retweets, app usability, ...), just as financial investors think about return on investment before investing. So one question to think about is: "How do we get people to share their data?"
For example, upon seeing an application's permission request page, every Facebook user would ask themselves: "If I give them permission, will it spam my wall? Spam my e-mail? Will it abuse my personal data? Is it worth it?"

As for Wikinvest, understandably, an incentive structure has to be created for people to divulge this data. They are in a niche market and are dealing with highly sensitive data.
To incentivise users to provide data, Wikinvest uses two strategies:
  1. Wikinvest uses its partners to expose users to content that shows them what Wikinvest does. Deep integration with services like SigFig on CNN Money is a win-win incentive structure: Wikinvest gets a lot of eyeballs and CNN Money gets very high engagement from its users.
  2. Wikinvest tells people that a lot of people are using its services. Most people choose to use the services after examining the tradeoff between divulging data and the convenience of the services that they will have access to if they do. Only a few people drop out after using Wikinvest's services. Their average portfolio size is $400,000.

However, one should not extrapolate too much from the users of Wikinvest. Those who are keen on investing are constantly looking for an edge, constantly looking for new sources of information that the competition might not have. Hence, they may be willing to go a step further than the general population to gain an advantage. Extrapolating their behavior to the general population may lead to false conclusions.

Tom Glocer


Remember the gist of his perspective? Really? Well, here's a reminder:

Why am I here?

What Tom sought to gain was insight through discussion. His goal: A rare opportunity at primary data collection!

The Heart of Thomson Reuters

Tom mentioned two items at the core of the company:
  1. It's about signal-to-noise ratio;
  2. It's also about news.

Signal-to-noise ratio implies dealing with ever larger, ever dirtier data sets to extract what is subjectively deemed useful.

The news part may seem self-evident, but what it really means is changing rapidly! Consider the standard definition:


1. Newly received or noteworthy information, esp. about recent or important events.
2. A broadcast or published report of news.

Does this encapsulate what comes to mind when you think of "news"? I'll let you decide:

[Images: YouTube videos and the Twitter logo, OR a third image]

Vibrant discussion!

Examples of Thomson Reuters activities include:
  • traditional news brokerage
  • dealing with investment practices
  • minimizing health care fraud
  • legal, compliance and intellectual property solutions

In some of these areas, the company deals almost exclusively with professionals.

Two big changes mentioned:
  1. The concept of metadata acting behind the scenes;
  2. The line between operator and operand is steadily being blurred.

Is it really all that new?!

Maybe not! Reuters was founded in 1851 by a German refugee, based on the high-tech concept of the...


carrier pigeon. They were faster than horses for getting through forests.
The history of Reuters, and of Thomson Reuters, is worth reading about on its Wikipedia page.
It's come a long way since then, but it's still about disseminating the most relevant news as efficiently as possible. The company now has two divisions. One is the markets division, responsible for sales & trading and investment & advisory. The other is the professional division, responsible for legal, healthcare and tax issues.

Old meets New

On one hand, Reuters still employs over 3,000 real people (yes, they even have feelings!) around the world for journalistic purposes.

But in addition to this, they also lean heavily on algorithmic analysis and the social graph. To summarize, the three elements being employed are:
  1. Human personalization;
  2. Algorithmic personalization;
  3. Social data.

Issues with the New Age

- There is a conflict between obfuscating data and improving search
- Ironically, Thomson Reuters has a stronger high-tech emphasis in its legal division!
- According to Tom, it is a sad thing that journalists are losing eminence
- Tom is a "humanist who loves technology"
- Balance between information exploration and exploitation
- The loss of serendipitous news discovery with well-channeled sources of news (Facebook, Twitter, RSS feeds)
    • What counts as serendipitous news discovery? Do we have to flip through a paper newspaper? Or would getting a random piece of news from an application like Feedly be okay? Does moving from a paper-based news world to a digitized one really leave us feeling that there is no serendipity in news discovery? This TED video presents how the personalization of news sources creates an information bubble around us.
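
One common way to think about the exploration/exploitation balance mentioned above is an epsilon-greedy rule: most of the time show the best personalized match, but occasionally show something from outside the reader's usual sources. The sketch below is a toy illustration only; it is not something described in class, and the function name, story labels and epsilon value are all assumptions.

```python
import random


def pick_story(personalized, everything_else, epsilon=0.1, rng=random):
    """With probability epsilon, explore a story outside the reader's usual sources;
    otherwise exploit the top personalized story."""
    if everything_else and rng.random() < epsilon:
        return rng.choice(everything_else)  # exploration: a serendipitous pick
    return personalized[0]                  # exploitation: the best-known match


# Hypothetical story pools
personalized = ["tech story", "finance story"]
everything_else = ["local arts story", "science story", "sports story"]

rng = random.Random(0)  # seeded only so the demo is repeatable
picks = [pick_story(personalized, everything_else, epsilon=0.2, rng=rng)
         for _ in range(1000)]
explored = sum(p in everything_else for p in picks)
print(f"{explored} of {len(picks)} picks were exploratory")
```

Setting epsilon to 0 gives a fully channeled feed with no serendipity; raising it trades relevance for more chance discoveries.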

Journalism in the Age of Data:

Journalists are coping with the rising information flood by borrowing data visualization techniques from computer scientists, researchers and artists. Some newsrooms are already beginning to retool their staffs and systems to prepare for a future in which data becomes a medium. But how do we communicate with data? How can traditional narratives be fused with sophisticated, interactive information displays?

A video series on the topic: Here

What next?

Global Voices is an interconnected, crowdsourced blogging network that is attempting to leverage the easier channels of communication available today. It was co-founded by Ethan Zuckerman and Rebecca MacKinnon at Harvard's Berkman Center.