Sampling from distributed streams of data

    Publication number: US08458326B2

    Publication date: 2013-06-04

    Application number: US12826785

    Filing date: 2010-06-30

    CPC classification number: H04L43/04 H04L43/022 H04L43/12

    Abstract: The present disclosure is directed to systems, methods, and computer-readable storage media for sampling from distributed data streams. Data elements are received at site servers configured to collect and report data to a coordinator device. The site servers assign a binary string to each of the data elements. Each bit of the binary strings can be independently set to a 0 or a 1 with a probability of one half. The binary string is used to sample from the received data elements, and the data elements and/or the sampled data elements can be transmitted to a coordinator device. The coordinator device can examine one or more bits of the binary string to draw samples of the received data elements in accordance with desired probabilities.
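
The bit-string idea in this abstract can be illustrated with a minimal sketch. It assumes one particular selection rule that the abstract does not spell out: an element is kept at level j when the first j bits of its string are all 0, which happens with probability 2^-j. The names `assign_bits`, `leading_zeros`, and `sample_at_level` are hypothetical.

```python
import random


def assign_bits(num_bits, rng):
    """Return a tuple of independent fair bits (each 0 or 1 with probability 1/2)."""
    return tuple(rng.getrandbits(1) for _ in range(num_bits))


def leading_zeros(bits):
    """Count how many bits at the start of the string are 0."""
    count = 0
    for bit in bits:
        if bit != 0:
            break
        count += 1
    return count


def sample_at_level(tagged, level):
    """Keep elements whose first `level` bits are all 0 (assumed rule).

    Each element survives with probability 2**-level, independently.
    """
    return [elem for elem, bits in tagged if leading_zeros(bits) >= level]


# A site tags each arriving element once; the tag never changes afterwards.
rng = random.Random(0)
tagged = [(i, assign_bits(16, rng)) for i in range(10_000)]

level_0 = sample_at_level(tagged, 0)  # every element
level_3 = sample_at_level(tagged, 3)  # roughly 1/8 of the elements
```

Because the tag is fixed per element, the samples are nested: anything kept at level 3 is also kept at every lower level, so a coordinator could lower the sampling rate later by inspecting one more bit instead of re-drawing.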

    Automatic gleaning of semantic information in social networks
    2.
    Type: Granted invention patent
    Status: In force

    Publication number: US08386534B2

    Publication date: 2013-02-26

    Application number: US12290449

    Filing date: 2008-10-30

    CPC classification number: G06F17/3053 G06F17/30867

    Abstract: Disclosed are a method and apparatus for identifying members of a social network who have a high likelihood of providing a useful response to a query. A query engine examines the personal pages of a set of members and automatically gleans semantic information relevant to the query. From the automatically gleaned semantic information, a score indicative of the likelihood that the member may provide a useful response is calculated.

    METHODS AND APPARATUS FOR REPRESENTING PROBABILISTIC DATA USING A PROBABILISTIC HISTOGRAM
    4.
    Type: Invention patent application
    Status: Lapsed

    Publication number: US20110145223A1

    Publication date: 2011-06-16

    Application number: US12636544

    Filing date: 2009-12-11

    CPC classification number: G06F17/30536

    Abstract: Methods and apparatus for representing probabilistic data using a probabilistic histogram are disclosed. An example method comprises partitioning a plurality of ordered data items into a plurality of buckets, each of the data items capable of having a data value from a plurality of possible data values with a probability characterized by a respective individual probability distribution function (PDF), each bucket associated with a respective subset of the ordered data items bounded by a respective beginning data item and a respective ending data item, and determining a first representative PDF for a first bucket associated with a first subset of the ordered data items by partitioning the plurality of possible data values into a first plurality of representative data ranges and respective representative probabilities based on an error between the first representative PDF and a first plurality of individual PDFs characterizing the first subset of the ordered data items.
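
A rough sketch of the "representative PDF" idea follows. It makes several assumptions that are not in the abstract: each individual PDF is a list of probabilities over the same small set of possible values, the representative ranges are supplied by the caller rather than optimized, and the error measure is the sum of squared differences. The function names are hypothetical.

```python
def representative_pdf(pdfs, ranges):
    """Build a piecewise-constant representative PDF for one bucket.

    pdfs   -- list of individual PDFs, each a list of probabilities
    ranges -- list of (start, end) inclusive index ranges covering all values
    Within each range, every value gets the average probability of that range.
    """
    num_values = len(pdfs[0])
    mean = [sum(p[v] for p in pdfs) / len(pdfs) for v in range(num_values)]
    rep = [0.0] * num_values
    for start, end in ranges:
        width = end - start + 1
        level = sum(mean[start:end + 1]) / width
        for v in range(start, end + 1):
            rep[v] = level
    return rep


def squared_error(pdfs, rep):
    """Sum of squared differences between each individual PDF and the representative."""
    return sum((p[v] - rep[v]) ** 2 for p in pdfs for v in range(len(rep)))


# Two data items in one bucket, three possible values, two representative ranges.
bucket = [[0.5, 0.5, 0.0], [0.1, 0.5, 0.4]]
rep = representative_pdf(bucket, ranges=[(0, 1), (2, 2)])
err = squared_error(bucket, rep)
```

Comparing `err` across candidate range partitions is one way the error term mentioned in the abstract could guide the choice of ranges; the search itself is not sketched here.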

    FORWARD DECAY TEMPORAL DATA ANALYSIS
    5.
    Type: Invention patent application
    Status: In force

    Publication number: US20110066600A1

    Publication date: 2011-03-17

    Application number: US12560214

    Filing date: 2009-09-15

    CPC classification number: G06F17/30551 G06F17/30289 G06F17/30516

    Abstract: A disclosed method for implementing time decay in the analysis of streaming data objects is based on the age, referred to herein as the forward age, of a data object measured from a landmark time in the past to a time associated with the occurrence of the data object, e.g., an object's timestamp. A forward time decay function is parameterized on the forward age. Because a data object's forward age does not depend on the current time, a value of the forward time decay function is determined just once for each data object. A scaling factor or weight associated with a data object may be weighted according to its decay function value. Forward time decay functions are beneficial in determining decayed aggregates, including decayed counts, sums, and averages, decayed minimums and maximums, and for drawing decay-influenced samples.
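
The key property described above, that each object's decay value is computed only once, can be sketched as follows. The sketch assumes that an aggregate is normalized at query time by the decay function evaluated at the current forward age; the abstract does not state this step, and the class name `ForwardDecayedSum` and the quadratic decay function are hypothetical choices.

```python
class ForwardDecayedSum:
    """Running decayed sum, count, and average under forward decay."""

    def __init__(self, landmark, g):
        self.landmark = landmark  # fixed landmark time in the past
        self.g = g                # forward decay function of the forward age
        self.weighted_sum = 0.0
        self.weight_total = 0.0

    def add(self, timestamp, value):
        # The forward age depends only on the timestamp and the landmark,
        # so this weight is computed once and never revisited.
        weight = self.g(timestamp - self.landmark)
        self.weighted_sum += weight * value
        self.weight_total += weight

    def decayed_sum(self, now):
        # Assumed normalization: divide by g at the current forward age.
        return self.weighted_sum / self.g(now - self.landmark)

    def decayed_count(self, now):
        return self.weight_total / self.g(now - self.landmark)

    def decayed_average(self):
        # The normalizer cancels, so the average needs no current time.
        return self.weighted_sum / self.weight_total


agg = ForwardDecayedSum(landmark=0, g=lambda age: age ** 2)
agg.add(timestamp=5, value=10)   # weight 25
agg.add(timestamp=10, value=20)  # weight 100
```

With these two items, `agg.decayed_sum(10)` is 22.5 and `agg.decayed_sum(20)` is 5.625: the stored state is unchanged between the two queries, and only the divisor grows with the current time.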

    System and Method for Encoding a Signal Using Compressed Sensor Measurements
    6.
    Type: Invention patent application
    Status: In force

    Publication number: US20090153379A1

    Publication date: 2009-06-18

    Application number: US12268157

    Filing date: 2008-11-10

    CPC classification number: H03M7/30

    Abstract: Described is a system and method for receiving a signal for transmission and encoding the signal into a plurality of linear projections representing the signal. The encoding includes defining a transform matrix. The transform matrix is defined by processing the signal using a macroseparation matrix, processing the signal using a microseparation matrix, and processing the signal using an estimation vector.

    LINK-BASED CLASSIFICATION OF GRAPH NODES
    7.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20090132561A1

    Publication date: 2009-05-21

    Application number: US11943681

    Filing date: 2007-11-21

    CPC classification number: G06F16/958 G06F16/9024

    Abstract: Disclosed is a method of labeling unlabeled nodes in a graph that represents objects that have an explicit structure between them. A computing device can use a labeling engine to identify nodes in the graph that are labeled and to identify an unlabeled node in the graph that is structurally associated with the labeled nodes. The labeling engine can label the unlabeled node with the label of the labeled node based on the structural association between the unlabeled node and the labeled node.
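
One simple way to read "structural association" is direct adjacency, with the label chosen by majority vote among labeled neighbors. That reading is an assumption made for illustration, not something the abstract specifies, and `label_unlabeled` is a hypothetical name.

```python
from collections import Counter


def label_unlabeled(adjacency, labels):
    """Give each unlabeled node the most common label among its labeled neighbors.

    adjacency -- dict mapping node -> iterable of neighboring nodes
    labels    -- dict mapping already-labeled nodes -> label
    Only the original labels are used as votes (a single pass, no iteration).
    Nodes without any labeled neighbor stay unlabeled.
    """
    result = dict(labels)
    for node, neighbors in adjacency.items():
        if node in labels:
            continue
        votes = Counter(labels[n] for n in neighbors if n in labels)
        if votes:
            result[node] = votes.most_common(1)[0][0]
    return result


adjacency = {
    "a": ["d"],
    "b": ["d"],
    "c": ["d"],
    "d": ["a", "b", "c", "e"],
    "e": ["d"],
}
labels = {"a": "sports", "b": "sports", "c": "tech"}
labeled = label_unlabeled(adjacency, labels)
```

Here node `d` receives `"sports"` (two votes against one), while `e` remains unlabeled because its only neighbor had no label at the start; repeating the pass with the new labels would reach it.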

    METHOD AND APPARATUS FOR PROVIDING REAL FRIENDS COUNT
    8.
    Type: Invention patent application
    Status: In force

    Publication number: US20090083418A1

    Publication date: 2009-03-26

    Application number: US12233213

    Filing date: 2008-09-18

    CPC classification number: H04L67/306 G06Q10/10 G06Q30/02 H04L51/00

    Abstract: A method and apparatus for tracking communications in a network are disclosed. For example, the method receives a subscription from a customer for a service to track at least one variable associated with a plurality of communicants of the customer. The method identifies a plurality of members of a social network of the customer, and gathers communication data associated with the plurality of members for tracking the at least one variable. The method then displays at least one result derived from the communication data to the customer.

    Efficient publication of sparse data
    9.
    Type: Granted invention patent
    Status: In force

    Publication number: US09251216B2

    Publication date: 2016-02-02

    Application number: US13111154

    Filing date: 2011-05-19

    CPC classification number: G06F17/30522 G06F21/6254

    Abstract: The present disclosure is directed to systems, methods, and computer-readable storage media for publishing data. A data summary summarizing the data can be generated and published according to several publishing schemes. In some embodiments, non-zero entries are selected and modified and zero entries are sampled according to one or more distribution functions. The sampled and modified values are added to a data summary, or a sample of the sampled and modified values is added to the data summary. The data summary is published, released, used, or otherwise output. In other embodiments, priority values are assigned to each value associated with the data, and a number of entries with the highest values are selected and added to the data summary.
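
The priority-based embodiment in the last sentence of this abstract amounts to a top-k selection. The sketch below shows only that selection step; how priorities are actually computed is not stated in the abstract, so the `priority` callable (here, the absolute value of the entry) is a placeholder assumption, as is the name `top_priority_entries`.

```python
import heapq


def top_priority_entries(entries, priority, k):
    """Return the k (key, value) pairs with the highest priority, best first.

    entries  -- dict mapping entry key -> value
    priority -- callable (key, value) -> number; a hypothetical placeholder
    """
    return heapq.nlargest(k, entries.items(), key=lambda kv: priority(kv[0], kv[1]))


entries = {"a": 5.0, "b": 0.5, "c": 12.0, "d": 3.0}
summary = top_priority_entries(entries, priority=lambda key, value: abs(value), k=2)
# summary == [("c", 12.0), ("a", 5.0)]
```

`heapq.nlargest` avoids sorting the whole input, which matters when the data has far more entries than the summary can hold.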

    Generating minimality-attack-resistant data
    10.
    Type: Granted invention patent
    Status: In force

    Publication number: US08631500B2

    Publication date: 2014-01-14

    Application number: US12825466

    Filing date: 2010-06-29

    CPC classification number: G06F21/6254

    Abstract: The present disclosure is directed to systems, methods, and computer-readable storage media for generating data and data sets that are resistant to minimality attacks. Data sets having a number of tuples are received, and the tuples are ordered according to an aspect of the tuples. The tuples can be split into groups of tuples, and each of the groups may be analyzed to determine if the group complies with a privacy requirement. Groups that satisfy the privacy requirement may be output as new data sets that are resistant to minimality attacks.
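
The order, split, and check pipeline in this abstract can be sketched as below. Two details are assumptions chosen for illustration: groups are fixed-size consecutive chunks of the ordered tuples, and the privacy requirement is a minimum number of distinct sensitive values per group. The sketch shows the control flow only and is not claimed to resist minimality attacks by itself; all function names are hypothetical.

```python
def split_into_groups(tuples, key, group_size):
    """Order the tuples by `key` and cut them into consecutive groups."""
    ordered = sorted(tuples, key=key)
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]


def satisfies_privacy(group, sensitive, min_distinct):
    """Assumed requirement: at least `min_distinct` different sensitive values."""
    return len({sensitive(t) for t in group}) >= min_distinct


def compliant_groups(tuples, key, sensitive, group_size, min_distinct):
    """Return only the groups that pass the privacy check."""
    groups = split_into_groups(tuples, key, group_size)
    return [g for g in groups if satisfies_privacy(g, sensitive, min_distinct)]


# Each record is (age, diagnosis); age is the ordering aspect.
records = [
    (25, "flu"), (31, "cold"), (47, "flu"),
    (52, "flu"), (38, "asthma"), (29, "flu"),
]
output = compliant_groups(
    records,
    key=lambda r: r[0],
    sensitive=lambda r: r[1],
    group_size=2,
    min_distinct=2,
)
# output == [[(31, "cold"), (38, "asthma")]]
```

The two groups that contain only "flu" are withheld, since publishing them would reveal every member's diagnosis.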
