Deduplicating with related columns
Example
Example 1
There is an Excel file Book1.xlsx, and part of the data is as follows:
Remove rows that duplicate both id and name. If an id has at least one row with a non-empty name, the rows with an empty name for that id are deleted as well. The results are as follows:
Write SPL script:
A1 Read the Excel file
A2 Group the rows by id and deduplicate by name within each group. After deduplication, if a group still holds more than one row, keep only the rows with a non-empty name; otherwise leave the group unchanged. Finally, concatenate the results of all groups
A3 Export results to result.xlsx
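The grouping rule described in A2 can also be sketched outside SPL. The Python below is a minimal illustration under stated assumptions, not the SPL script itself: rows are plain dictionaries already loaded in memory, the function name `dedup_by_id_and_name` and the sample rows are hypothetical, and both `None` and `""` are treated as an empty name. Reading and writing the Excel files is left out.

```python
def dedup_by_id_and_name(rows):
    """Deduplicate rows by (id, name), dropping empty-name rows
    when the same id also has a non-empty name."""
    # Group rows by id; a dict keeps the ids in first-seen order.
    groups = {}
    for row in rows:
        groups.setdefault(row["id"], []).append(row)

    result = []
    for members in groups.values():
        # Deduplicate by name inside the group, keeping the first row per name.
        seen = set()
        unique = []
        for row in members:
            key = row["name"] or None  # assumption: "" and None both mean "empty"
            if key not in seen:
                seen.add(key)
                unique.append(row)
        # More than one row left: keep only the rows with a non-empty name.
        if len(unique) > 1:
            unique = [row for row in unique if row["name"]]
        result.extend(unique)
    return result


# Hypothetical sample rows, made up for illustration only.
sample = [
    {"id": 1, "name": "a"},
    {"id": 1, "name": "a"},
    {"id": 1, "name": ""},
    {"id": 2, "name": None},
    {"id": 2, "name": None},
    {"id": 3, "name": "b"},
    {"id": 3, "name": "c"},
]
cleaned = dedup_by_id_and_name(sample)
```

In this sketch the output follows the order in which each id first appears, which may differ from the order the SPL script produces.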
Example 2
There is an Excel file Book1.xlsx, and part of the data is as follows:
Remove duplicate rows, treating rows that contain the same values in a different column order as duplicates. The results are as follows:
Write SPL script:
A1 Read the Excel file
A2 Sort the values within each row, group the rows by their sorted values to remove duplicates, and keep the first row of each group
A3 Export results to result.xlsx
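The idea in A2, sorting the values inside each row so that column order no longer matters, can be sketched in Python as follows. This is an illustration under assumptions rather than the SPL script: rows are tuples already loaded in memory, the values within a row are mutually comparable (for example, all strings), and the function name `dedup_ignoring_column_order` and the sample rows are hypothetical.

```python
def dedup_ignoring_column_order(rows):
    """Keep the first row for each distinct multiset of values."""
    seen = set()
    result = []
    for row in rows:
        # The sorted values form a key that is identical for rows
        # holding the same values in a different column order.
        key = tuple(sorted(row))
        if key not in seen:
            seen.add(key)
            result.append(row)  # keep the row as originally written
    return result


# Hypothetical sample rows, made up for illustration only.
sample_rows = [
    ("a", "b"),
    ("b", "a"),
    ("c", "d"),
    ("a", "b"),
    ("d", "c"),
    ("e", "f"),
]
unique_rows = dedup_ignoring_column_order(sample_rows)
```

Because the key is a sorted tuple rather than a set, repeated values inside a row still count: `("a", "a", "b")` and `("a", "b", "b")` are treated as different rows.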