Abstract:
A set of representations of item-page pairs is obtained, each pair consisting of an item and a web page that includes that item. Each representation includes feature function values indicating weights associated with features of the associated web page, the features including page classification features. A set of labeled training data is obtained, annotated with salience annotation values of items for the respective web pages that include them. The salience annotation values are determined based on a soft function: a first count is determined as the total number of user queries associated with visits to the web page, a second count is determined as the cardinality of the subset of those visits that are associated with user queries that include the item, and the salience value is the ratio of the second count to the first count. Models are trained using the annotated set.
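The soft salience function described above reduces to a ratio of two counts. The following is a minimal sketch, assuming one query string per visit and a case-insensitive substring test for "query includes the item"; the function name `salience`, the input format, and the zero-visit fallback are illustrative assumptions, not details from the abstract.

```python
from typing import Iterable


def salience(item: str, visit_queries: Iterable[str]) -> float:
    """Soft salience of `item` for one page.

    `visit_queries` holds one query string per visit to the page (assumed
    input format). Matching is a case-insensitive substring test (assumed).
    """
    queries = list(visit_queries)
    first_count = len(queries)  # all queries tied to visits to the page
    if first_count == 0:
        return 0.0  # assumption: no visits means zero salience
    needle = item.lower()
    # Visits whose query mentions the item.
    second_count = sum(1 for q in queries if needle in q.lower())
    return second_count / first_count


queries = [
    "kindle fire review",
    "best tablet 2012",
    "Kindle battery life",
    "e-reader comparison",
]
print(salience("kindle", queries))  # 2 of 4 queries mention the item
```

Because the value is a fraction rather than a yes/no label, an item that drives half of a page's query traffic is annotated as more salient than one that appears in only a few queries.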
Abstract:
Tasks and calendar items that are automatically detected and identified in electronic communications may be populated into one or more tasks applications and calendaring applications. Text content retrieved from one or more electronic communications may be extracted and parsed to determine whether keywords or terms contained in the parsed text may lead to a classification of the text content, or part of the text content, as a task. Identified tasks may be automatically populated into a tasks application. Similarly, text content from such sources may be parsed for keywords and terms that may be identified as indicating calendar items, for example, meeting requests. Identified calendar items may be automatically populated into a calendar application as calendar entries.
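The keyword-driven classification described above can be sketched roughly as follows. The term lists, the sentence splitter, and the function name `classify_sentences` are hypothetical; the abstract does not specify which keywords are used or how text is segmented.

```python
import re
from typing import List, Tuple

# Hypothetical keyword lists; a real system would use a richer vocabulary.
TASK_TERMS = ("please send", "action item", "follow up", "need to")
CALENDAR_TERMS = ("meeting", "let's meet", "schedule a call", "invite")


def classify_sentences(text: str) -> List[Tuple[str, str]]:
    """Label each sentence as 'task' or 'calendar' when a keyword matches."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    results = []
    for sentence in sentences:
        lowered = sentence.lower()
        if any(term in lowered for term in CALENDAR_TERMS):
            results.append((sentence, "calendar"))
        elif any(term in lowered for term in TASK_TERMS):
            results.append((sentence, "task"))
    return results


email = (
    "Thanks for the update. Please send the budget draft to Ana. "
    "Can we schedule a call on Tuesday at 10am?"
)
for sentence, label in classify_sentences(email):
    print(label, "->", sentence)
```

Each labeled sentence would then be handed to the tasks or calendar application; that hand-off is omitted here because it depends on the target application.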
Abstract:
A text span that a user intended to select, forming either a single word or a series of two or more words, is predicted. A document and a location pointer that indicates a particular location in the document are received and input to different candidate text span generation methods. A ranked list of one or more scored candidate text spans is received from each of the different candidate text span generation methods. A machine-learned ensemble model is used to re-score each of the scored candidate text spans received from the different candidate text span generation methods. The ensemble model is trained using a machine learning method and features from a dataset of true intended user text span selections. A ranked list of re-scored candidate text spans is received from the ensemble model.
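The generate-then-re-score flow described above might look like the sketch below. Both candidate generators, the two features, and the fixed weights are hypothetical stand-ins: in the abstract the ensemble is a trained model, whereas here a hand-set linear combination is used only to show how the pieces fit together.

```python
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[str, float]  # (text span, score)
Generator = Callable[[str, int], List[Candidate]]


def word_at(doc: str, pos: int) -> List[Candidate]:
    """Candidate: the single word under the pointer."""
    start = doc.rfind(" ", 0, pos) + 1
    end = doc.find(" ", pos)
    end = len(doc) if end == -1 else end
    return [(doc[start:end], 1.0)]


def word_and_next(doc: str, pos: int) -> List[Candidate]:
    """Candidate: the word under the pointer plus the following word."""
    start = doc.rfind(" ", 0, pos) + 1
    end = doc.find(" ", pos)
    if end == -1:
        return [(doc[start:], 0.6)]
    end2 = doc.find(" ", end + 1)
    end2 = len(doc) if end2 == -1 else end2
    return [(doc[start:end2], 0.6)]


def rescore(doc: str, pos: int, generators: Dict[str, Generator],
            weights: Dict[str, float], length_weight: float) -> List[Candidate]:
    """Re-score every candidate with a linear model (stand-in for a learned one)."""
    rescored: Dict[str, float] = {}
    for name, generate in generators.items():
        for span, score in generate(doc, pos):
            new_score = weights[name] * score + length_weight * len(span.split())
            rescored[span] = max(new_score, rescored.get(span, float("-inf")))
    return sorted(rescored.items(), key=lambda kv: kv[1], reverse=True)


doc = "Flights to San Francisco are delayed"
pointer = doc.index("San") + 1  # pointer lands inside the word "San"
ranked = rescore(doc, pointer,
                 {"word": word_at, "pair": word_and_next},
                 weights={"word": 0.5, "pair": 1.0}, length_weight=0.1)
print(ranked)
```

With these illustrative weights the two-word span outranks the single word, which is the kind of correction a trained ensemble would be expected to learn from true user selections.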
Abstract:
When a new data item is added to a project data store, a synchronization framework triggers an analysis module, which runs a series of analysis feature extractors on the new content. An analysis may be conducted, and features of interest may be extracted from the data item. The analysis utilizes natural language processing, as well as other technologies, to provide an automatic or semi-automatic extraction of information. The extracted features of interest are saved as metadata within the project data store and are associated with the data item from which they were extracted. The analysis module may also be utilized to discover additional information that may be gleaned from content that is already in the project data store.
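One way to picture the trigger-and-extract flow described above is the sketch below. The `ProjectStore` class, its method names, and the two regex-based extractors are hypothetical; the abstract mentions natural language processing, for which simple regular expressions are only a stand-in.

```python
import re
from typing import Callable, Dict, List

Extractor = Callable[[str], List[str]]


def extract_emails(text: str) -> List[str]:
    return re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)


def extract_dates(text: str) -> List[str]:
    return re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)


class ProjectStore:
    """Toy project data store that analyzes items as they are added."""

    def __init__(self, extractors: Dict[str, Extractor]) -> None:
        self.extractors = extractors
        self.items: Dict[str, str] = {}
        self.metadata: Dict[str, Dict[str, List[str]]] = {}

    def add_item(self, item_id: str, content: str) -> None:
        self.items[item_id] = content
        self.analyze(item_id)  # plays the role of the synchronization trigger

    def analyze(self, item_id: str) -> None:
        content = self.items[item_id]
        # Extracted features are stored as metadata keyed by the source item.
        self.metadata[item_id] = {
            name: extract(content) for name, extract in self.extractors.items()
        }

    def reanalyze_all(self) -> None:
        """Re-run extractors over content already in the store."""
        for item_id in self.items:
            self.analyze(item_id)


store = ProjectStore({"emails": extract_emails})
store.add_item("note-1", "Contact ana@example.com before 2024-03-01 about the draft")
store.extractors["dates"] = extract_dates  # a new extractor is registered later
store.reanalyze_all()
print(store.metadata["note-1"])
```

Calling `reanalyze_all` after registering a new extractor mirrors the last sentence of the abstract: existing content can yield additional metadata without being added again.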
Abstract:
A summarization system and method. The summarization method includes utilizing a first body of information to obtain a second body of information, which is identified (by a hyperlink, an attachment identifier, a reference, etc.) in the first body of information. A summary of the obtained second body of information is then computed. The computed summary can be displayed to a user and/or stored for later use.
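The two-step method described above (resolve what the first body of information points to, then summarize it) can be sketched as follows. The URL pattern, the injected `fetch` callable, and the lead-sentence summarizer are assumptions made for illustration; the abstract does not say how the second body is retrieved or how the summary is computed.

```python
import re
from typing import Callable, Dict

# Simplified URL pattern (assumed); attachments and other references
# mentioned in the abstract are not handled in this sketch.
URL_RE = re.compile(r"https?://[^\s)>\]]+")


def summarize(text: str, max_sentences: int = 2) -> str:
    """Lead-sentence summary: a stand-in for a real summarizer."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


def summarize_linked(first_body: str, fetch: Callable[[str], str],
                     max_sentences: int = 2) -> Dict[str, str]:
    """Summarize every document identified by a hyperlink in `first_body`."""
    return {
        url: summarize(fetch(url), max_sentences)
        for url in URL_RE.findall(first_body)
    }


# A fake fetcher keeps the example self-contained; a real one would
# retrieve the linked document over the network.
fake_pages = {
    "https://example.com/spec": (
        "The spec defines three modes. Mode A is the default. "
        "Mode B is experimental."
    )
}
message = "See the spec at https://example.com/spec for details."
summaries = summarize_linked(message, fake_pages.__getitem__)
print(summaries)
```

The returned mapping can be displayed next to each link or stored for later use, matching the two outcomes the abstract names.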
Abstract:
Project-related data may be aggregated from various data sources, given context, and may be stored in a data repository or organizational knowledge base that may be available to and accessed by others. Documents, emails, contact information, calendar data, social networking data, and any other content that is related to a project may be brought together within a single user interface, irrespective of its data type. A user may organize and understand content, discover relevant information, and act on it without regard to where the information resides or how it was created.
Abstract:
Automatically summarizing electronic communication conversation threads is provided. Electronic mails, text messages, tasks, questions and answers, meeting requests, calendar items, and the like are processed via a combination of natural language processing and heuristics. For a given conversation thread, for example, an electronic mail thread associated with a given task, a text summary of the thread is generated to highlight the most important text in the thread. The text summary is presented to a user in a visual user interface to allow the user to quickly understand the significance or relevance of the thread.
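As a rough illustration of highlighting the most important text in a thread, the sketch below scores sentences by average word frequency across the whole thread and keeps the top few in their original order. This frequency heuristic, the stopword list, and the function name `summarize_thread` are assumptions; the abstract only states that natural language processing and heuristics are combined.

```python
import re
from collections import Counter
from typing import List

# Small illustrative stopword list (assumed).
STOPWORDS = {
    "the", "a", "an", "to", "of", "and", "is", "in", "for", "on",
    "we", "i", "it", "that", "this", "be", "are", "will", "can", "you", "by",
}


def content_words(sentence: str) -> List[str]:
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize_thread(messages: List[str], max_sentences: int = 2) -> List[str]:
    """Pick the sentences whose words recur most often across the thread."""
    sentences: List[str] = []
    for message in messages:
        parts = re.split(r"(?<=[.!?])\s+", message)
        sentences.extend(p.strip() for p in parts if p.strip())
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence: str) -> float:
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    best = sorted(range(len(sentences)),
                  key=lambda i: score(sentences[i]), reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(best)]  # restore thread order


thread = [
    "Thanks!",
    "The budget review is due Friday.",
    "I will finish the budget review tonight.",
    "Sounds good.",
]
print(summarize_thread(thread))
```

Short acknowledgements such as "Thanks!" score low because their words do not recur, so the summary keeps the sentences that carry the thread's topic.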