Sunday, March 18, 2012
Quick links
What has caught my attention lately:
- Videos show that Windows 8 is horribly painful for most people; it looks likely to be another Windows Vista-like flop. Really worth watching the videos or trying it yourself (videos [1] [2], try it [1] [2])
- "38% of the ads are never in view to a user" and another 12% "of the ads are in view for less than 0.5 seconds" ([1])
- "Many more ads" are coming on Facebook, "a lot more advertising ... [on] Facebook's traditionally clean interface." Could this mean Facebook is having revenue trouble already? ([1] [2] [3])
- Not only is the iPhone over half of Apple's revenue, it is more than 70% of their profits. Apple really is a mobile phone manufacturer with a few other businesses attached. ([1])
- Coming soon, a "voice-activated assistant that remembers everything you say ... systems that are more conversational, that have the ability to ask more sophisticated followup questions and adapt to the individual ... [with] short-term and long-term memory." ([1])
- "Microsoft tries to find pockets of unrealized revenue and then figures out what to make. Apple is just the opposite: It thinks of great products, then sells them." ([1])
- "The best way to get the most out of engineers is to surround them with other great engineers." ([1])
- "It’s positively de-motivating to work for a company where your job is just to shut up and take orders. In tech startup land, we all understand instinctively that we have to hire super smart people, but we forget that we then have to organize the workforce so that those people can use their brains." ([1])
- Programmers want to learn new skills and technology while working in a team of people they respect, and over 90% of programmers said they are willing to take a lower paying job to get that. ([1])
- Netflix's streaming catalog continues to deteriorate and is now down to only 853 good movies, of which only 155 were released within the last five years ([1] [2] [3])
- "This class is about setting you on the path to developing good taste as a programmer" (free, from Udacity, taught by Googler and AI guru Peter Norvig, starts Apr 16) ([1])
- Could this be the business model for Udacity? Offer free classes online, then send companies candidates pre-screened for machine learning programming ability? ([1] [2])
- If my blog is any indication, the only RSS reader still being used is Google Reader. Are all others dead now? ([1])
- Huge and wide open opportunity in personalized advertising for online news. Amazes me Yahoo and Amazon haven't gone after this, and that Google hasn't done a better job going after it. ([1])
- Paper with fascinating statistics on Groupon and other daily deal sites. Most dramatically, it costs restaurants half a star in their Yelp rating if they offer Groupon deals. ([1])
Friday, March 09, 2012
Ad targeting at Yahoo
A remarkably detailed paper to be presented at WWW 2012, "Web-Scale User Modeling for Targeting" (PDF), gives many insights into how Yahoo does personalized advertising.
In summary, the researchers describe a system used in production at Yahoo that does daily builds of large user profiles. Each profile contains tens of thousands of features that summarize the interests of each user from the web pages they have viewed, the searches they have made, and the ads they have viewed, clicked on, and converted on (bought something). They explain how important it is to use conversions, not just ad clicks, to train the system. They measure the importance of using recent history (what you did in the last couple days), of using fine-grained data (detailed categories and even some specific pages and queries), of using large profiles, and of including data about ad views (which is a huge and low quality data source since there are multiple ad views per page view), and find that all of those significantly help performance.
Some excerpts from the paper:
We present the experiences from building a web-scale user modeling platform for optimizing display advertising targeting at Yahoo .... Our work ... [looks] into understanding the effect of different user activities on prediction, [gives] insights about the temporal aspect of user behavior (recency vs. long-term trends), and [explores] different variants (user representation and target label) through large offline and online experiments .... We deployed our platform to production and achieved a [large] boost in online metrics, such as eCPA, compared to the old system.
Our objective is to refine the targeting constraints using the past behavior of the users ... [so] we can improve the number of conversions per ad impression without greatly increasing the number of impressions.
User profiles are aggregated logs from different systems/products (e.g. user logs of Yahoo News, Yahoo Finance, etc.) .... We consider several different events ... [including] pages visited ... the category of the page ... searches issued, clicks on search links, clicks on search advertising links ... [and] the category of the search query ... [and] views and clicks on ads ... [and] the ad category ... from an existing hierarchical ad categorizer.
Our results show [a] large performance loss incurred in favoring long-term history over short-term history. This is obvious as the recent history clearly communicates with a high probability the current interest of the user ... Although recent history is more important than older history, we still need to include older history to get the most complete idea about the user.
Results show ... many of our raw features are completely non-discriminative. However, a small percentage of these features are actually important ... [For example, just] ... dropping all raw ad views ... [or if] we drop all raw features and only keep categorical features ... [causes] dropping [of] the weighted AUC measure by 3.69% and 4.26%, respectively ... In production ... we apply a coarse feature selection through mutual information, then we apply a rigorous feature selection through l1 regularization.
Very interesting. A couple things I am left wondering:
First, they found recent history is very effective, yet they only update the profiles daily. Wouldn't their results on the value of recent behavior (which others found too) suggest that there would be benefit from hourly or, even better, real-time updates of the profiles (perhaps with a second memory-based, unreliable, and partial coverage system supplementing the data in the more complete and more accurate older profiles)? That would allow the system to adapt immediately when someone, for example, starts looking at information for a vacation to Hawaii, and to show relevant offers right then instead of only the next day, when it is usually too late. Unfortunately, I suspect we're not going to see really big gains in relevance and usefulness of ads without real-time updates to profiles of fine-grained interests; results that show that data only 24 hours old is better than data a week old may only be a tease of the gains to be seen with data only seconds old.
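To make that idea concrete, here is a minimal sketch of what such a supplement might look like. This is only my illustration, not anything described in the paper: the class name, the feature names, the exponential decay with a one-hour half-life, and the `recent_boost` weight are all assumptions. It keeps the daily-built profile as is and adds an in-memory score for recent events that fades over time.

```python
from collections import defaultdict


class RealtimeProfile:
    """Hypothetical blend of a daily batch profile with decayed recent events."""

    def __init__(self, daily_profile, half_life_seconds=3600.0):
        # daily_profile: feature name -> weight from the last daily build
        # (an assumed format, not the paper's representation).
        self.daily_profile = dict(daily_profile)
        self.half_life = half_life_seconds
        self.recent = defaultdict(float)  # feature -> weight as of last update
        self.last_update = {}             # feature -> time of last update

    def _decayed(self, feature, now):
        # Weight of recent activity, halved for every half-life elapsed.
        last = self.last_update.get(feature)
        if last is None:
            return 0.0
        age = max(0.0, now - last)
        return self.recent[feature] * 0.5 ** (age / self.half_life)

    def observe(self, feature, now, weight=1.0):
        # Record a new event (page view, search, ad click) in memory.
        self.recent[feature] = self._decayed(feature, now) + weight
        self.last_update[feature] = now

    def score(self, feature, now, recent_boost=2.0):
        # Long-term interest plus a boosted, decaying short-term signal.
        long_term = self.daily_profile.get(feature, 0.0)
        return long_term + recent_boost * self._decayed(feature, now)


if __name__ == "__main__":
    profile = RealtimeProfile({"travel/hawaii": 0.1, "autos": 0.8})
    profile.observe("travel/hawaii", now=0.0)   # user starts vacation research
    profile.observe("travel/hawaii", now=60.0)
    for feature in ("travel/hawaii", "autos"):
        print(feature, round(profile.score(feature, now=60.0), 3))
```

In this toy version the Hawaii interest outranks the long-standing autos interest within a minute, and if the in-memory part is lost, scores simply fall back to the daily profile.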
Second, they find that features based on individual search queries and pages viewed ("raw features") usually have no value, but occasionally have enough value that it is important to include some. Wouldn't that suggest that the categorization scheme for pages viewed, searches made, and ads needs to be more fine-grained (e.g. not just the category "pants", but the category "men's boot cut jeans")? Or, better, perhaps more fine-grained while also correctly cross correlated (interest in "men's boot cut jeans" not only shows in the data a weak interest in all pants, but also maybe has been shown to indicate a fairly strong interest in "men's flannel shirts")?
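For readers who want a feel for how a few raw features can stand out from many useless ones, here is a small sketch of a coarse mutual information filter, the first of the two selection stages the paper mentions. The data, the feature names, and the function names are made up for illustration, and the second stage (L1 regularization) is not shown.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, l), count in joint.items():
        # p(f, l) * log2( p(f, l) / (p(f) * p(l)) )
        ratio = count * n / (feature_counts[f] * label_counts[l])
        mi += (count / n) * math.log2(ratio)
    return mi


def select_features(rows, labels, keep):
    """Return the `keep` feature names with the highest mutual information.

    rows: one dict per user mapping feature name -> 0/1 (missing means 0).
    labels: one 0/1 conversion label per user.
    """
    names = sorted({name for row in rows for name in row})
    scored = []
    for name in names:
        column = [row.get(name, 0) for row in rows]
        scored.append((mutual_information(column, labels), name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:keep]]


if __name__ == "__main__":
    # Toy data: the query feature tracks conversions, the page feature does not.
    rows = [
        {"query:boot cut jeans": 1, "page:homepage": 1},
        {"query:boot cut jeans": 1, "page:homepage": 0},
        {"query:boot cut jeans": 0, "page:homepage": 1},
        {"query:boot cut jeans": 0, "page:homepage": 0},
    ]
    labels = [1, 1, 0, 0]
    print(select_features(rows, labels, keep=1))
```

A filter like this scores each feature on its own, so it cannot see the cross correlations between categories that I am wondering about above; that would need a model over several features at once.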
If you are interested in this paper, you might also want to look at another recent paper out of Yahoo Research, "Learning to Target: What Works for Behavioral Advertising" (ACM), which is referenced multiple times by this paper and describes the features used in the user profile in a bit more detail, as well as the results of some other experiments.
Please see also my 2007 post, "What to advertise when there is no commercial intent?"