-
Publication number: US20230409545A1
Publication date: 2023-12-21
Application number: US17845683
Filing date: 2022-06-21
Applicant: VMware, Inc.
Inventor: Abhishek GUPTA , Christos KARAMANOLIS , Richard P. SPILLANE , Marin NOZHCHEV
CPC classification number: G06F16/219 , G06F16/2219
Abstract: A version control interface provides for time travel with metadata management under a common transaction domain as the data. Examples generate a time-series of master branch snapshots for data objects stored in a data lake, with the snapshot comprising a tree data structure such as a hash tree and associated with a time indication. Readers select a master branch snapshot from the time-series, based on selection criteria (e.g., time) and use references in the selected master branch snapshot to read data objects from the data lake. This provides readers with a view of the data as of a specified time.
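The time-travel read described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `Snapshot`, `SnapshotSeries`, and `read_as_of` are hypothetical names, the data lake is modelled as a plain dictionary, and a flat name-to-key mapping stands in for the hash tree mentioned in the abstract.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One master branch snapshot (hypothetical shape)."""
    timestamp: int   # time indication associated with the snapshot
    refs: dict       # object name -> key of the data object in the data lake


class SnapshotSeries:
    """Time-series of snapshots, kept in increasing timestamp order."""

    def __init__(self):
        self._times = []
        self._snapshots = []

    def append(self, snapshot):
        if self._times and snapshot.timestamp <= self._times[-1]:
            raise ValueError("snapshots must be appended in time order")
        self._times.append(snapshot.timestamp)
        self._snapshots.append(snapshot)

    def as_of(self, timestamp):
        """Return the latest snapshot taken at or before `timestamp`."""
        i = bisect_right(self._times, timestamp)
        if i == 0:
            raise LookupError("no snapshot exists at or before that time")
        return self._snapshots[i - 1]


def read_as_of(series, data_lake, name, timestamp):
    """Read a data object as it was referenced at the given time."""
    snapshot = series.as_of(timestamp)
    return data_lake[snapshot.refs[name]]
```

A reader that asks for time 15 against snapshots taken at times 10 and 20 would be served from the snapshot at time 10, which gives the view of the data as of the requested time.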
-
Publication number: US20210064581A1
Publication date: 2021-03-04
Application number: US16552976
Filing date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Junlong GAO , Marcos K. AGUILERA , Richard P. SPILLANE , Christos KARAMANOLIS , Maxime AUSTRUY
IPC: G06F16/174 , G06F16/13 , G06F16/172
Abstract: The present disclosure provides techniques for deduplicating files. The techniques include creating a cache or subset of a large data structure. The large data structure organizes information by random hash values. The random hash values result in a random organization of information within the data structure, with the information spanning a large number of storage blocks within a storage system. The cache, however, is within memory and is small relative to the data structure. The cache is created so as to contain information that is likely to be needed during deduplication of a file. Having needed information within memory rather than in storage results in faster read and write operations to that information, improving the performance of a computing system.
-
Publication number: US20210064579A1
Publication date: 2021-03-04
Application number: US16552908
Filing date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Junlong GAO , Marcos K. AGUILERA , Richard P. SPILLANE , Christos KARAMANOLIS , Maxime AUSTRUY
IPC: G06F16/174 , G06F16/14
Abstract: Disclosed techniques include deduplication. Techniques include determining whether a file is unique and, depending on the result, deduplicating either only part of the file or the entire file. The techniques include processing the first chunk of a file to determine whether the hash of the chunk is already within a chunk hash table, and if not, then a percentage of chunks of the file is similarly processed. If any of the hashes of chunks are already in the chunk hash table, then at least some of the file has been previously deduplicated, and the file is not unique within the storage system. If none of the processed chunks have a hash that is already in the chunk hash table, then the file is considered to be unique within the chunk store and only a partial percentage of the file's chunks are deduplicated. Not all of a unique file's chunks are deduplicated.
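The uniqueness check in this abstract can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names, the fixed-size chunking, the use of SHA-256, the choice of the leading chunks as the sampled percentage, and the dictionary standing in for the chunk hash table.

```python
import hashlib


def chunk_hashes(data, chunk_size):
    """Split data into fixed-size chunks and hash each one."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def deduplicate(data, table, chunk_size=4, sample_fraction=0.5):
    """Deduplicate a file fully or partially, depending on uniqueness.

    `table` maps chunk hash -> reference count (hypothetical layout).
    """
    hashes = chunk_hashes(data, chunk_size)
    if not hashes:
        return "empty"
    # Sample a percentage of the chunks, always starting with the first.
    sample_count = max(1, int(len(hashes) * sample_fraction))
    sample = hashes[:sample_count]
    # all() stops at the first hit, so a known first chunk ends the check early.
    is_unique = all(h not in table for h in sample)
    # Unique file: record only the sampled chunks. Otherwise record every chunk.
    to_record = sample if is_unique else hashes
    for h in to_record:
        table[h] = table.get(h, 0) + 1
    return "unique-partial" if is_unique else "duplicate-full"
```

With the defaults, a 16-byte file produces four chunks and a sample of two, so a file judged unique adds only two entries to the table.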
-
Publication number: US20210064522A1
Publication date: 2021-03-04
Application number: US16552954
Filing date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Wenguang WANG , Junlong GAO , Marcos K. AGUILERA , Richard P. SPILLANE , Christos KARAMANOLIS , Maxime AUSTRUY
Abstract: The present disclosure provides techniques for deallocating previously allocated storage blocks. The techniques include obtaining a list of chunk IDs to analyze, choosing a chunk ID, and determining the storage blocks spanned by the chunk corresponding to the chosen chunk ID. The technique further includes determining whether any file references any storage blocks spanned by the chunk. The determining may be performed by comparing an internal reference count to a total reference count, where the internal reference count is the number of references to the storage block by a chunk ID data structure. If no files reference any of the storage blocks spanned by the chunk, then all the storage blocks of the chunk can be deallocated.
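The reference-count comparison described above can be sketched as follows. The function name and the three dictionaries are hypothetical. The sketch assumes that a block whose total reference count equals its internal reference count is referenced only by the chunk ID data structure, and therefore by no file.

```python
def blocks_to_deallocate(chunk_ids, chunk_blocks, total_refs, internal_refs):
    """Return the storage blocks that can be freed.

    chunk_blocks:  chunk ID -> list of storage blocks spanned by the chunk
    total_refs:    block -> total reference count
    internal_refs: block -> references held by the chunk ID data structure
    """
    freed = []
    for chunk_id in chunk_ids:
        blocks = chunk_blocks[chunk_id]
        # A file references a block if total references exceed internal ones.
        referenced_by_file = any(
            total_refs[b] > internal_refs[b] for b in blocks
        )
        if not referenced_by_file:
            # No file references any block of the chunk: free all of them.
            freed.extend(blocks)
    return freed
```

Note that a chunk is kept whole if even one of its blocks is still referenced by a file, which matches the all-or-nothing condition in the abstract.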
-
Publication number: US20150058384A1
Publication date: 2015-02-26
Application number: US14010293
Filing date: 2013-08-26
Applicant: VMware, Inc.
Inventor: Christos KARAMANOLIS , Soam VASANI
IPC: G06F17/30
CPC classification number: G06F17/30194 , G06F17/30233 , G06F17/30283
Abstract: Techniques are disclosed for providing a file system interface for an object store intended to support simultaneous access to objects stored in the object store by multiple clients. In accordance with one method, an abstraction of a root directory to a hierarchical namespace for the object store is exposed to clients. The object store is backed by a plurality of physical storage devices housed in or directly attached to the plurality of host computers and internally tracks its stored objects using a flat namespace that maps unique identifiers to the stored objects. The creation of top-level objects appearing as subdirectories of the root directory is enabled, wherein each top-level object represents a separate abstraction of a storage device having a separate namespace that can be organized in accordance with any designated file system.
-
Publication number: US20230385265A1
Publication date: 2023-11-30
Application number: US17827795
Filing date: 2022-05-30
Applicant: VMware, Inc.
Inventor: Christos KARAMANOLIS , Abhishek GUPTA , Richard P. SPILLANE , Marin NOZHCHEV
CPC classification number: G06F16/2365 , G06F16/2282 , G06F11/1435
Abstract: A version control interface provides for accessing a data lake with transactional semantics. Examples generate a plurality of tables for data objects stored in the data lake. The tables each comprise a set of name fields and map a space of columns or rows to a set of the data objects. Transactions read and write data objects and may span a plurality of tables with properties of atomicity, consistency, isolation, durability (ACID). Performing the transaction comprises: accumulating transaction-incomplete messages, indicating that the transaction is incomplete, until a transaction-complete message is received, indicating that the transaction is complete. Upon this occurring, a master branch is updated to reference the data objects according to the transaction-incomplete messages and the transaction-complete message. Tables may be grouped into data groups that provide atomicity boundaries so that different groups may be served by different master branches, thereby improving the speed of master branch updates.
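The message-accumulation step in this abstract can be sketched as follows. This is an illustration under stated assumptions only: `MasterBranch`, `on_incomplete`, and `on_complete` are hypothetical names, messages are modelled as method calls, and the master branch is a dictionary that is swapped in one step so that a transaction spanning several tables becomes visible all at once.

```python
class MasterBranch:
    """Toy master branch mapping (table, key) to a data object ID."""

    def __init__(self):
        self.refs = {}        # currently visible references
        self._pending = {}    # transaction ID -> accumulated writes

    def on_incomplete(self, txn_id, table, key, object_id):
        """Accumulate a transaction-incomplete message; nothing is visible yet."""
        self._pending.setdefault(txn_id, []).append((table, key, object_id))

    def on_complete(self, txn_id):
        """Apply all accumulated writes when the transaction-complete message arrives."""
        writes = self._pending.pop(txn_id, [])
        new_refs = dict(self.refs)
        for table, key, object_id in writes:
            new_refs[(table, key)] = object_id
        # Replace the reference map in a single assignment.
        self.refs = new_refs
        return len(writes)
```

Readers that consult `refs` before `on_complete` runs see none of the transaction's writes, and readers that consult it afterwards see all of them.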
-
Publication number: US20210271524A1
Publication date: 2021-09-02
Application number: US17321299
Filing date: 2021-05-14
Applicant: VMware, Inc.
Inventor: Christos KARAMANOLIS , William EARL , Mansi SHAH , Nathan BURNETT
IPC: G06F9/50
Abstract: Embodiments presented herein provide techniques for balancing a multidimensional set of resources of different types within a distributed resources system. Each host computer providing the resources publishes a status on current resource usage by guest clients. Upon identifying a local imbalance, the host computer determines a source workload to migrate to or from the resources container to minimize the variance in resource usage. Additionally, when placing a new resource workload, the host computer selects a resources container that minimizes the variance to further balance resource usage.
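The placement rule in this abstract can be illustrated with a small sketch. The abstract does not say how variance is combined across resource types, so this example assumes the sum of per-resource population variances. The function names and the dictionary layout are also hypothetical.

```python
from statistics import pvariance


def total_variance(usage):
    """Sum, over resource types, of the variance in usage across containers."""
    resources = next(iter(usage.values())).keys()
    return sum(
        pvariance([per_container[r] for per_container in usage.values()])
        for r in resources
    )


def place_workload(usage, demand):
    """Pick the container whose selection yields the lowest total variance.

    usage:  container -> {resource type -> current usage fraction}
    demand: {resource type -> usage added by the new workload}
    """
    def variance_if_placed(target):
        trial = {
            c: ({r: v + demand.get(r, 0.0) for r, v in u.items()}
                if c == target else u)
            for c, u in usage.items()
        }
        return total_variance(trial)

    return min(usage, key=variance_if_placed)
```

For example, with one host at 80% CPU and another at 20%, a workload that needs 30% CPU is placed on the second host, because that choice leaves the two hosts closest together.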
-
Publication number: US20210064580A1
Publication date: 2021-03-04
Application number: US16552965
Filing date: 2019-08-27
Applicant: VMware, Inc.
Inventor: Junlong GAO , Wenguang WANG , Marcos K. AGUILERA , Richard P. SPILLANE , Christos KARAMANOLIS , Maxime AUSTRUY
IPC: G06F16/174 , G06F16/14 , G06F16/901
Abstract: The disclosure provides techniques for deduplicating files. The techniques include, upon creating or modifying a file, placing a logical timestamp of the current logical time within a queue associated with the directory of the file. The techniques further include placing the logical timestamp within a queue of each parent directory of the directory of the file. To determine a set of files for deduplication, the techniques disclosed herein identify files that have been modified within a logical time range. The set of files modified within the logical time range is identified by traversing directories of a storage system, the directories being organized within a tree structure. If a directory's queue does not contain a timestamp that is within the logical time range, then all child directories can be skipped over for further processing, such that no files within the child directories end up being within the set of files for deduplication.
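The pruned directory traversal described above can be sketched as follows. The `Directory` class, its `touch` method, and `modified_files` are hypothetical names, and the queues are unbounded lists here, which is a simplification the abstract does not specify.

```python
class Directory:
    """Directory node holding a queue of logical modification timestamps."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.files = {}    # file name -> logical time of last modification
        self.queue = []    # logical timestamps seen in this subtree
        if parent is not None:
            parent.children.append(self)

    def touch(self, filename, now):
        """Record a create or modify event in this directory and every ancestor."""
        self.files[filename] = now
        node = self
        while node is not None:
            node.queue.append(now)
            node = node.parent


def modified_files(directory, lo, hi):
    """Collect files modified within [lo, hi], skipping untouched subtrees."""
    if not any(lo <= t <= hi for t in directory.queue):
        return []  # no timestamp in range: skip this directory and its children
    found = [
        f"{directory.name}/{name}"
        for name, t in directory.files.items()
        if lo <= t <= hi
    ]
    for child in directory.children:
        found.extend(modified_files(child, lo, hi))
    return found
```

Because every ancestor's queue receives the timestamp, a subtree with no in-range timestamp at its top cannot contain an in-range file, which is what makes the skip safe.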
-
Publication number: US20200371721A1
Publication date: 2020-11-26
Application number: US16988242
Filing date: 2020-08-07
Applicant: VMware, Inc.
Inventor: Christos KARAMANOLIS , Mansi SHAH , Nathan BURNETT
IPC: G06F3/06
Abstract: Techniques are described for storing a virtual disk in an object store comprising a plurality of physical storage devices housed in a plurality of host computers. A profile is received for creation of the virtual disk, wherein the profile specifies storage properties desired for an intended use of the virtual disk. A virtual disk blueprint is generated based on the profile such that the virtual disk blueprint describes a storage organization for the virtual disk that addresses redundancy or performance requirements corresponding to the profile. A set of the physical storage devices that can store components of the virtual disk in a manner that satisfies the storage organization is then determined.
-
Publication number: US20190065062A1
Publication date: 2019-02-28
Application number: US15853202
Filing date: 2017-12-22
Applicant: VMware, Inc.
Inventor: Eric KNAUFT , Mansi SHAH , Jin ZHANG , Christian DICKMANN , Pascal RENAULD , Radhika VULLIKANTI , Christos KARAMANOLIS
Abstract: In a storage cluster having nodes, blocks of a logical storage space of a storage object are allocated flexibly by a parent node to component nodes that are backed by physical storage. The method includes maintaining a first allocation map for the parent node, and second and third allocation maps for the first and second component nodes, respectively, executing a first write operation on the first component node and updating the second allocation map to indicate that the first block is a written block, and upon detecting that the first component node is offline, executing a second write operation that targets a second block of the logical storage space, which is allocated to the first component node, on the second component node and updating the third allocation map to indicate that the second block is a written block.