I have a CSV file of orders from a business whose web shop I run, around 30 thousand entries in total. I would like to FIND all entries that have a duplicate in the file: same customer, same total amount due, and placed on the same day (some customers place repeat orders, but over a longer timeframe).

I found a help article about removing duplicate values, but it doesn't really apply to my situation: I want the opposite, to remove the UNIQUE entries and keep only the duplicated ones.

How would I go about that?
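One way to do this outside a spreadsheet is a short Python script that groups rows by the three fields and keeps only the groups with more than one row. This is a minimal sketch, not a definitive solution: the column names `customer`, `total_due`, `order_date` and `order_id` are made up for illustration, the sample data is invented, and it assumes the date column holds only the day (no time of day) and that amounts are formatted consistently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; replace them with the headers in the real file.
KEY_FIELDS = ("customer", "total_due", "order_date")


def find_duplicate_orders(rows, key_fields=KEY_FIELDS):
    """Return every row whose key_fields values also occur in another row."""
    rows = list(rows)
    groups = defaultdict(list)
    for row in rows:
        # Compare values as stripped strings; "5.00" and "5" would NOT match.
        key = tuple(row[field].strip() for field in key_fields)
        groups[key].append(row)
    # Keep a row only if its group has at least two members.
    return [
        row for row in rows
        if len(groups[tuple(row[field].strip() for field in key_fields)]) > 1
    ]


# Invented sample data standing in for the real CSV file.
sample_csv = "\n".join([
    "customer,total_due,order_date,order_id",
    "alice,19.99,2023-05-01,1",
    "bob,5.00,2023-05-01,2",
    "alice,19.99,2023-05-01,3",
    "alice,19.99,2023-06-01,4",
    "bob,5.00,2023-05-02,5",
    "carol,7.50,2023-05-03,6",
    "carol,7.50,2023-05-03,7",
])

duplicates = find_duplicate_orders(csv.DictReader(io.StringIO(sample_csv)))
for row in duplicates:
    print(row["order_id"], row["customer"], row["total_due"], row["order_date"])
```

For a real file, `io.StringIO(sample_csv)` would be replaced by a file opened with `open(path, newline="")`. At 30 thousand rows, a single pass like this should finish almost instantly.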

  • nudny ekscentryk@szmer.infoOP
    1 year ago

    What if I sorted the orders by name, and then for each one compared it with the rows directly above and below it to see whether they have the same name, date and amount due, using three columns of IFs? Then I could filter for the rows that meet all three criteria by multiplying the IF outputs in another column. That should work, I think. The only problem is that the final filtering step may break the existing IF functions.
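The sort-and-compare-neighbours idea can be sketched in Python as well. Two assumptions are baked in here that the comment does not state outright: the rows are sorted by all three fields (not only by name), so that matching orders end up next to each other, and a row counts as a duplicate when it matches the row above OR the row below. The field names and sample rows are invented for illustration.

```python
# Hypothetical field names; adjust to the real column headers.
KEY_FIELDS = ("customer", "order_date", "total_due")


def flag_adjacent_duplicates(rows, key_fields=KEY_FIELDS):
    """Sort rows by key_fields, then keep rows that match a direct neighbour."""
    def key_of(row):
        return tuple(row[field] for field in key_fields)

    # Sorting by all three fields puts identical keys next to each other.
    ordered = sorted(rows, key=key_of)
    flagged = []
    for i, row in enumerate(ordered):
        same_as_prev = i > 0 and key_of(ordered[i - 1]) == key_of(row)
        same_as_next = i + 1 < len(ordered) and key_of(ordered[i + 1]) == key_of(row)
        # Matching either neighbour is enough to count as a duplicate.
        if same_as_prev or same_as_next:
            flagged.append(row)
    return flagged


# Invented sample rows standing in for the real orders.
orders = [
    {"order_id": "1", "customer": "alice", "total_due": "19.99", "order_date": "2023-05-01"},
    {"order_id": "2", "customer": "bob", "total_due": "5.00", "order_date": "2023-05-01"},
    {"order_id": "3", "customer": "alice", "total_due": "19.99", "order_date": "2023-05-01"},
    {"order_id": "4", "customer": "alice", "total_due": "19.99", "order_date": "2023-06-01"},
    {"order_id": "5", "customer": "bob", "total_due": "5.00", "order_date": "2023-05-02"},
    {"order_id": "6", "customer": "carol", "total_due": "7.50", "order_date": "2023-05-03"},
    {"order_id": "7", "customer": "carol", "total_due": "7.50", "order_date": "2023-05-03"},
]

flagged_orders = flag_adjacent_duplicates(orders)
for row in flagged_orders:
    print(row["order_id"], row["customer"], row["order_date"], row["total_due"])
```

Because the flags are collected into a new list rather than computed by formulas that point at neighbouring cells, filtering the result afterwards cannot disturb the comparison.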