US 12,468,686 B2
Annotating datasets without redundant copying
George Aleksandrovich, Hoffman Estates, IL (US); Allie K. Watfa, Urbana, IL (US); Robin Sahner, Urbana, IL (US); and Mike Pippin, Sunnyvale, CA (US)
Assigned to YAHOO ASSETS LLC, New York, NY (US)
Filed by YAHOO ASSETS LLC, New York, NY (US)
Filed on Apr. 21, 2023, as Appl. No. 18/304,795.
Application 18/304,795 is a continuation of application No. 16/727,096, filed on Dec. 26, 2019, granted, now 11,650,977.
Prior Publication US 2023/0252021 A1, Aug. 10, 2023
Int. Cl. G06F 16/23 (2019.01); G06F 7/08 (2006.01)
CPC G06F 16/2379 (2019.01) [G06F 7/08 (2013.01)]
20 Claims
1. A method comprising:
reading, by a device, a dataset comprising data stored in an unordered manner in a first set of columns and a first set of rows in a database;
determining, by the device, an attribute for the data of the dataset;
partitioning, by the device, the data based on the determined attribute;
reordering, by the device, the partitioned data based on boundaries of each partition of the data, the reordering performed via a stripe-based alignment of the partitioned data that results in a reduction of a quantity of data files associated with the dataset;
modifying, by the device, the dataset by flattening the reordered data; and
storing, by the device, the modified dataset in the database.
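
For illustration only, the steps recited in claim 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation: the dataset is modeled as an in-memory list of dictionaries rather than a database table, and the attribute is chosen by a hypothetical heuristic (the column with the fewest distinct values). A "stripe" is modeled as a fixed-size run of rows that never crosses a partition boundary, and "storing" is modeled as returning the flattened list. All function names and the `stripe_size` parameter are hypothetical.

```python
from collections import defaultdict


def choose_attribute(rows):
    # Assumption: pick the column with the fewest distinct values.
    # The claim only says an attribute is "determined".
    columns = rows[0].keys()
    return min(columns, key=lambda c: len({r[c] for r in rows}))


def partition(rows, attribute):
    # Group rows by the value of the chosen attribute.
    parts = defaultdict(list)
    for row in rows:
        parts[row[attribute]].append(row)
    return parts


def reorder_into_stripes(parts, stripe_size):
    # Walk partitions in sorted key order and cut each one into stripes.
    # Every stripe starts at or after a partition boundary and never spans two
    # partitions, which is one plausible reading of "stripe-based alignment".
    stripes = []
    for key in sorted(parts):
        block = parts[key]
        for start in range(0, len(block), stripe_size):
            stripes.append(block[start:start + stripe_size])
    return stripes


def flatten(stripes):
    # Collapse the list of stripes back into a single list of rows.
    return [row for stripe in stripes for row in stripe]


def annotate(rows, stripe_size=2):
    # Read -> determine attribute -> partition -> reorder -> flatten -> "store".
    attribute = choose_attribute(rows)
    parts = partition(rows, attribute)
    stripes = reorder_into_stripes(parts, stripe_size)
    return attribute, stripes, flatten(stripes)


rows = [
    {"id": 3, "region": "west"},
    {"id": 1, "region": "east"},
    {"id": 4, "region": "west"},
    {"id": 2, "region": "east"},
    {"id": 5, "region": "west"},
]
attribute, stripes, modified = annotate(rows)
```

In this toy run, the rows arrive interleaved across regions and leave grouped by region, with each stripe holding rows from a single partition. The sketch does not model the reduction in the number of data files recited in the claim, which depends on the storage layer.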