Not exactly what I had to do, but i'll mark this as the answer anyway because it lead me down the right track to getting it sorted. What I had to do was similar, and i'll write out the exact steps I took in case anybody wants to see it in the future:
- I exported the data from PDM to an excel sheet, then copied the data in the sheet so I had two rows of it
- At this point I sorted my data so the duplicates would be removed properly - for me that was sorting by the date category in ascending order (so that it removes the oldest date duplicates, and only retains the newest one)
- With one of the rows I removed duplicate copies using the Remove Duplicates function within Excel
- I compared the two using a simple function from this website and highlight those that weren't in the duplicate removed row
- All highlighted files had their file locations run through a simple VBA script I had written to delete them using FileSystemObject
The vba script I used was very simple, but did the job properly for what I needed:
Private Sub CommandButton1_Click() Dim FileLocation As String Dim fso As New Scripting.FileSystemObject For i = 2 To 9445 FileLocation = Range("V" & i) If Range("W" & i) = 0 Then ElseIf Len(Dir(FileLocation)) = 0 Then Else fso.DeleteFile FileLocation End If Next End Sub
In this instance I had a total of 9445 parts after removing duplicates, so with row 1 being the header, the script starts at 2 and works its way through. In the row next to the file locations, there was either a 0 or 1 which denoted whether it should be deleted. It checks for a 0, with 0 meaning the file should not be deleted, then checks if the file doesn't exist using Len(Dir(xx)). If both of these return false, it will delete the file with fso.DeleteFile. However if either of them return true, because there is no function following them it will simply jump to "End If".
In order for FileSystemObject to work, you need ensure that the Microsoft Scripting Runtime reference is enabled inside VBA, found under Tools > References.