Abstract:
Product images are used in conjunction with textual descriptions to improve the classification of product offerings. By combining cues from both the text and the images associated with products, implementations enhance both the precision and recall of product description classification within the context of web-based commerce search. Several implementations are directed to improving those areas where text-only approaches are most unreliable. For example, several implementations use image signals to complement text classifiers and improve overall product classification in situations where brief textual product descriptions use vocabulary that overlaps with multiple diverse categories. Other implementations are directed to using text-and-image “training sets” to improve automated classifiers, including text-only classifiers. Certain implementations are also directed to learning a number of three-way image classifiers focused only on “confusing categories” of the text signals, to improve upon those specific areas where text-only classification is weakest.
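The idea of consulting image signals only where the text classifier is unsure can be illustrated with a minimal sketch. It assumes both classifiers already emit per-category probabilities; the function name `fuse_predictions`, the `confusion_margin` threshold, the linear blending, and the choice of the top three text categories as the "confusing" set are illustrative assumptions, not details taken from the abstract.

```python
from typing import Dict


def fuse_predictions(
    text_probs: Dict[str, float],
    image_probs: Dict[str, float],
    confusion_margin: float = 0.15,  # hypothetical threshold for "text is ambiguous"
    image_weight: float = 0.5,       # hypothetical blending weight
) -> str:
    """Return a category, using image scores only when the text scores are close."""
    ranked = sorted(text_probs, key=text_probs.get, reverse=True)
    top = ranked[0]
    if len(ranked) < 2:
        return top

    # Text classifier is confident: keep the text-only decision.
    if text_probs[top] - text_probs[ranked[1]] >= confusion_margin:
        return top

    # Text classifier is ambiguous: restrict attention to the (up to) three
    # most confusable categories and blend in the image scores for them.
    confusing = ranked[:3]
    fused = {
        c: (1 - image_weight) * text_probs[c] + image_weight * image_probs.get(c, 0.0)
        for c in confusing
    }
    return max(fused, key=fused.get)
```

For a short description such as "laptop sleeve 15 inch", a text model might score "laptop" and "laptop bag" almost equally; in this sketch the image scores then break the tie, while confidently classified items never reach the image step.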
Abstract:
Collecting and distributing information related to recent content publication activity of an instant messaging (IM) user provides other users in a network with timely, relevant information about people known to the user or within the same social network. A user participating in a social network can quickly and efficiently perceive new information related to other users (referred to as co-users) by reviewing the co-users' recent content publication activity. The user can do so without the co-user having to send a communication directly to the user regarding the new facts or new content, and without the user having to actively browse or request information about the co-user.
Abstract:
Systems, methods, and computer storage media are provided for generating rich navigational study aids for electronic books. For a particular section of interest in a document, one or more related sections for providing additional context to the particular section are determined. The related sections are ranked based on a score indicating significance to the particular section. Based on a user's information processing preference, a set of ranked navigational links to each related section is presented to the user for additional context related to the particular section.
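The ranking step can be sketched as follows, under stated assumptions: each related section already carries a significance score, and the user's information processing preference is modeled as a set of preferred section kinds that receive a fixed boost. The `RelatedSection` type, the `kind` labels, and the `preference_boost` value are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class RelatedSection:
    title: str
    score: float  # significance to the section of interest (hypothetical scale)
    kind: str     # e.g. "definition", "example", "prerequisite" (hypothetical labels)


def navigational_links(
    related: List[RelatedSection],
    preferred_kinds: Set[str],
    max_links: int = 3,
    preference_boost: float = 0.2,  # hypothetical boost for preferred kinds
) -> List[str]:
    """Rank related sections by score, nudged by the user's preference."""

    def adjusted(section: RelatedSection) -> float:
        boost = preference_boost if section.kind in preferred_kinds else 0.0
        return section.score + boost

    ranked = sorted(related, key=adjusted, reverse=True)
    return [section.title for section in ranked[:max_links]]
```

A reader who prefers learning from examples would pass `preferred_kinds={"example"}`, so a worked example with a slightly lower raw score can still be listed ahead of a definition.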
Abstract:
Methods and systems are disclosed for automatically synthesizing product information from multiple data sources into an on-line catalog, and in particular for synthesizing the product information based on attribute-value pairs. Information for a product may be obtained, via entity extraction, feed ingestion, and other mechanisms, from a plurality of structured and unstructured data sources having different taxonomies and schemas. Product information may additionally or alternatively be obtained or derived based on popularity data. The product information may be cleansed, segmented and normalized. The product information may be clustered so that the closest products, attribute names and attribute values are associated. A representative value for an attribute name may be determined, and the on-line catalog may be updated so that entries are comprehensive, meaningful and useful to a catalog user. Updates from at least 500 million different data sources may be scheduled to occur as frequently as several times daily.
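The normalization and representative-value steps can be illustrated with a small sketch. It deliberately replaces the clustering described above with two simpler stand-ins, a hand-written synonym table for attribute names and a majority vote over normalized values, so it shows the shape of the pipeline rather than the disclosed method. The `NAME_SYNONYMS` table and both function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical synonym table mapping source-specific attribute names
# to one canonical name; a real system would learn or cluster these.
NAME_SYNONYMS = {"colour": "color", "color": "color", "wt": "weight", "weight": "weight"}


def normalize(name: str, value: str) -> Tuple[str, str]:
    """Lower-case, trim, and collapse whitespace; map the name to its canonical form."""
    key = name.strip().lower()
    canonical = NAME_SYNONYMS.get(key, key)
    cleaned_value = " ".join(value.strip().lower().split())
    return canonical, cleaned_value


def synthesize(records: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge attribute-value pairs from several sources into one catalog entry."""
    votes: Dict[str, Counter] = defaultdict(Counter)
    for record in records:
        for name, value in record.items():
            canonical, cleaned_value = normalize(name, value)
            votes[canonical][cleaned_value] += 1
    # Representative value: the most frequently reported value per attribute.
    return {name: counts.most_common(1)[0][0] for name, counts in votes.items()}
```

Given three source records for the same product that spell the attribute as "Colour", "color" and "Color", the sketch groups them under one name and keeps the value that most sources agree on.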
Abstract:
Text messages are collected over a period of time. Topic identifiers, such as hashtags, are extracted from the text messages. The text messages associated with each topic identifier are processed to determine which topic identifiers are associated with group chats, based on information associated with the text messages such as the times when the text messages were generated and whether the text messages identify user accounts. The topic identifiers that are determined to be associated with group chats are incorporated into applications that allow users to search for group chats and to view text messages from past group chats.
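One way to read the two signals named above (message times and references to user accounts) is sketched below. The heuristic is an assumption made for illustration: a hashtag is treated as a group chat when most of its messages fall inside a single time window and a sizable share of them mention a user account. The thresholds, the `Message` type, and the regular expressions are hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Message:
    text: str
    timestamp: float  # seconds since some epoch


HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@\w+")


def group_chat_topics(
    messages: List[Message],
    window_seconds: float = 3600.0,     # hypothetical chat-session length
    min_messages: int = 3,              # ignore rarely used hashtags
    min_burst_fraction: float = 0.8,    # share of messages inside the busiest window
    min_mention_fraction: float = 0.5,  # share of messages that mention an account
) -> Set[str]:
    """Return hashtags whose messages look like a scheduled group chat."""
    by_topic: Dict[str, List[Message]] = defaultdict(list)
    for message in messages:
        for tag in set(HASHTAG.findall(message.text.lower())):
            by_topic[tag].append(message)

    chats: Set[str] = set()
    for tag, tagged in by_topic.items():
        if len(tagged) < min_messages:
            continue
        times = sorted(m.timestamp for m in tagged)
        # Sliding window: largest number of messages within window_seconds.
        busiest, start = 0, 0
        for end, t in enumerate(times):
            while t - times[start] > window_seconds:
                start += 1
            busiest = max(busiest, end - start + 1)
        burst_fraction = busiest / len(tagged)
        mention_fraction = sum(1 for m in tagged if MENTION.search(m.text)) / len(tagged)
        if burst_fraction >= min_burst_fraction and mention_fraction >= min_mention_fraction:
            chats.add(tag)
    return chats
```

A hashtag used in a one-hour burst with participants replying to one another passes both checks, whereas a general-purpose hashtag spread evenly over several days does not.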