| CPC G06F 16/215 (2019.01) [G06F 16/2255 (2019.01)] | 14 Claims |

|
1. A computer-implemented method, executed on a computing device, comprising:
receiving a user data page for storing in a storage system;
generating an unaligned hash representation of the received user data page;
identifying the generated unaligned hash representation within a hash table, thus defining an identified hash representation;
identifying a user data page associated with the identified hash representation, thus defining an identified user data page;
comparing a hash offset reference of the identified user data page with a hash offset reference of the received user data page, wherein the hash offset reference is a deterministically defined anchor point in the user data page configured to indicate where to start performing a hash operation on the user data page, wherein the hash offset reference of the identified user data page and the hash offset reference of the received user data page are defined by generating a hash for each byte of each user data page and deterministically selecting a particular byte to be the hash offset reference for each user data page;
performing a deduplication operation on the received user data page based upon, at least in part, the comparison of the hash offset reference of the identified user data page with the hash offset reference of the received user data page, wherein performing the deduplication operation includes performing an unaligned deduplication operation when the hash offset reference of the identified user data page and the hash offset reference of the received user data page are nonequivalent, wherein the unaligned deduplication operation includes comparing portions of data before and after the hash offset reference in the identified user data page to determine if they are equivalent, wherein the unaligned deduplication operation considers any length or amount of data before or after the hash offset reference within the identified user data page; and
dynamically maintaining the hash table for performing the deduplication operation on the user data page stored within the storage system by invalidating hash representations from the hash table when corresponding user data pages are removed from the storage system.
|
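The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `byte_hash`, `hash_offset_reference`, `unaligned_hash`, `matching_extent`, and `DedupStore`, the `CHUNK` size, the use of BLAKE2b, and the "largest per-byte hash wins" selection rule are all assumptions chosen for illustration. The claim requires only that the anchor byte be selected deterministically from per-byte hashes.

```python
import hashlib

CHUNK = 16  # assumed: number of bytes hashed starting at the anchor


def byte_hash(page: bytes, i: int) -> bytes:
    """Hash of the single byte at position i (assumed per-byte hash)."""
    return hashlib.blake2b(page[i:i + 1], digest_size=8).digest()


def hash_offset_reference(page: bytes) -> int:
    """Pick the anchor deterministically: largest per-byte hash, earliest on ties.

    Candidates are limited so that CHUNK bytes fit after the anchor;
    the page is assumed to be at least CHUNK bytes long.
    """
    candidates = range(len(page) - CHUNK + 1)
    return max(candidates, key=lambda i: (byte_hash(page, i), -i))


def unaligned_hash(page: bytes, anchor: int) -> bytes:
    """Hash representation computed from the anchor, not from byte 0."""
    return hashlib.blake2b(page[anchor:anchor + CHUNK], digest_size=16).digest()


def matching_extent(stored: bytes, s: int, received: bytes, r: int):
    """Count matching bytes before and after the two anchors."""
    back = 0
    while back < min(s, r) and stored[s - back - 1] == received[r - back - 1]:
        back += 1
    fwd = 0
    limit = min(len(stored) - s, len(received) - r)
    while fwd < limit and stored[s + fwd] == received[r + fwd]:
        fwd += 1
    return back, fwd


class DedupStore:
    def __init__(self):
        self.pages = {}       # page id -> page bytes
        self.hash_table = {}  # unaligned hash -> page id
        self.next_id = 0

    def receive(self, page: bytes):
        """Return (kind, page_id, bytes_before_anchor, bytes_from_anchor)."""
        r = hash_offset_reference(page)
        key = unaligned_hash(page, r)
        hit = self.hash_table.get(key)
        if hit is None:
            # No candidate in the hash table: store the page as new data.
            pid = self.next_id
            self.next_id += 1
            self.pages[pid] = page
            self.hash_table[key] = pid
            return ("stored", pid, 0, 0)
        stored = self.pages[hit]
        s = hash_offset_reference(stored)
        # Equal anchors suggest an aligned match; unequal anchors mean the
        # shared data sits at different offsets in the two pages.
        kind = "aligned" if s == r else "unaligned"
        back, fwd = matching_extent(stored, s, page, r)
        return (kind, hit, back, fwd)

    def remove(self, pid: int):
        """Delete a page and invalidate its hash table entry."""
        page = self.pages.pop(pid)
        key = unaligned_hash(page, hash_offset_reference(page))
        if self.hash_table.get(key) == pid:
            del self.hash_table[key]
```

In this sketch, `receive` reports how many bytes match before and from the anchor rather than rewriting any on-disk layout, and a page that deduplicates is not stored again. A real system would also record a reference to the shared data and verify the chunk bytes to guard against hash collisions.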