Galahtech

Title: Duplicate File Finder
Post by: wrack on June 11, 2014, 05:25:58 am
G'day :)
 
I have been thinking of doing this for a very long time. I wanted to write a small utility that I can use to find duplicate files on a drive. I am sure you can find one online too, but for me this is a fun project.
 
You can grab the latest version from http://www.codelake.com/downloads/DuplicateFileFinder.zip
 
Created in Visual Studio 2012, .NET Framework 4.5.1 using C#
 
I will keep adding more features to it as I find time. Please let me know if you have anything specific in mind and I shall try to fit it into the development schedule.
 
Cheers :)


Title: Re: Duplicate File Finder
Post by: Jason Reed on June 11, 2014, 05:53:19 pm
Cool! Are you actually comparing content or are you just doing a hash check?


Title: Re: Duplicate File Finder
Post by: wrack on June 11, 2014, 08:24:43 pm
Quote from: Jason Reed
Cool! Are you actually comparing content or are you just doing a hash check?

MD5 :D
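
The difference between a hash check and a content comparison can be sketched as follows. This is a minimal Python illustration, not the app's C# code, which is not shown in the thread. The function names `md5_of_file` and `same_content` are made up for the example. It hashes both files with MD5 first and, only if the digests match, confirms with a byte-for-byte comparison.

```python
import filecmp
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Hash check first, then a byte-for-byte comparison to confirm."""
    if md5_of_file(path_a) != md5_of_file(path_b):
        return False  # different digests always mean different content
    # shallow=False makes filecmp compare the actual bytes, not just os.stat data.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

A matching MD5 makes identical content very likely but does not prove it, so the second step is only worth its cost when a false match would be harmful, for example before deleting files.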


Title: Re: Duplicate File Finder
Post by: simmo on June 11, 2014, 08:31:10 pm
I've been meaning to run something like this on my "backup" drive as I suspect there are a couple folders I've saved twice (and have a few bookmarked, just never tried). I ran your app and it says no matches, so maybe I don't have any :o

(also, the popup that said no matches was nowhere to be found, only on task bar and I could not do anything with either window except right click on taskbar and "close window.")


Title: Re: Duplicate File Finder
Post by: Jason Reed on June 11, 2014, 09:17:19 pm
I use Linux so I only need to open a terminal session and enter
Code:
find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
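
For readers not on Linux, the idea behind the one-liner above can be sketched in Python: group non-empty files by size, then run MD5 only on files whose size occurs more than once. This is a rough equivalent, not a line-by-line translation, and the function names are invented for the example.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under root that share size and MD5."""
    by_size = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Regular files only, like `-type f`; symlinks are skipped.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size > 0:  # like `-not -empty`
                by_size[size].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[md5_of_file(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

The size pass is cheap because it only reads file metadata, so most files are never opened at all.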


Title: Re: Duplicate File Finder
Post by: wrack on June 12, 2014, 12:04:51 am
Quote from: simmo
I've been meaning to run something like this on my "backup" drive as I suspect there are a couple folders I've saved twice (and have a few bookmarked, just never tried). I ran your app and it says no matches, so maybe I don't have any :o

(also, the popup that said no matches was nowhere to be found, only on task bar and I could not do anything with either window except right click on taskbar and "close window.")

Did you run a simple search or advanced search?

Weird behaviour with the dialog. I can't replicate it here. How many files are we talking about? I ran advanced search on a 36.9GB folder with 436639 files and 61814 subfolders and it took 1 hour to find me all duplicate files. Working on an updated version. Will post it soon.


Title: Re: Duplicate File Finder
Post by: wrack on June 12, 2014, 12:05:25 am
Quote from: Jason Reed
I use Linux so I only need to open a terminal session and enter
Code:
find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate

Show off :horse:


Title: Re: Duplicate File Finder
Post by: wrack on June 12, 2014, 01:58:28 am
Ok new version is up. Get it from http://www.codelake.com/downloads/DuplicateFileFinder.1.2014.0612.0138.zip


Title: Re: Duplicate File Finder
Post by: simmo on June 13, 2014, 02:16:20 pm
Quote from: wrack
Did you run a simple search or advanced search?

Simple.

Quote from: wrack
Weird behaviour with the dialog. I can't replicate it here.

It might have to do with it being minimized while running. I typically have a LOT of things open at once (20 at the current moment) so when I "switch tasks" I typically hit [win key] + m to minimize errything then select what I want from the task bar, so my view isn't cluttered with windows I'm not currently working with.

Quote from: wrack
How many files are we talking about? I ran advanced search on a 36.9GB folder with 436639 files and 61814 subfolders and it took 1 hour to find me all duplicate files.

About 65-7gb, your app says 743,610 files.

Quote from: wrack
Working on an updated version. Will post it soon.

Trying it out now, will let you know :yes:


[edit1]
Simple search found no dups (which may be correct), same behavior with the minimized (or off screen?) dialog box.

Running adv search now.
[/edit1]

[edit2]
Adv search found over 65k duplicates :lol: same behavior with the minimized (or off screen?) dialog box.

[/edit2]



Title: Re: Duplicate File Finder
Post by: wrack on June 13, 2014, 10:28:30 pm
Ok. I am not able to fix the dialog issue but all that dialog says is how many files were found and you just have to click ok. If you have minimised the application and if the dialog box was opened then try to right click on the dialog form from the taskbar and click close.

Working on fixing the issue. May have to come up with something better.


Title: Re: Duplicate File Finder
Post by: simmo on June 14, 2014, 01:56:53 am
Quote from: wrack
If you have minimised the application and if the dialog box was opened then try to right click on the dialog form from the taskbar and click close.

Quote from: simmo
I could not do anything with either window except right click on taskbar and "close window."

:jamie:

One thing you might consider is excluding "recycler" (or whatever that hidden recycle bin is called); a lot of my matches were in there.


Title: Re: Duplicate File Finder
Post by: wrack on June 14, 2014, 04:55:40 am
Quote from: simmo
One thing you might consider is excluding "recycler" (or whatever that hidden recycle bin is called); a lot of my matches were in there.

Yeah, I found that too. I will modify the app to do just that.


Title: Re: Duplicate File Finder
Post by: wrack on June 17, 2014, 12:28:05 am
New version (1.2014.0617.0015) is up. Get it at http://www.codelake.com/downloads/DuplicateFileFinder.zip


* Make the minimum size of the folder tree smaller.
* Removed the final message dialog; the information is now shown in the status bar.
* Exclude Recycle Bin and System Volume Information folders while scanning.
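
Skipping system folders during a scan could look roughly like the Python sketch below. The folder names in `SKIPPED_FOLDERS` are an assumption based on common Windows names (`$Recycle.Bin`, `RECYCLER`, `System Volume Information`). The thread does not say which names the app checks, and `walk_files` is a hypothetical helper.

```python
import os

# Assumed names, compared in lowercase; the app's actual list is not given in the thread.
SKIPPED_FOLDERS = {"$recycle.bin", "recycler", "system volume information"}


def walk_files(root, skipped=SKIPPED_FOLDERS):
    """Yield file paths under root without descending into skipped folders."""
    for folder, subdirs, names in os.walk(root):
        # Editing subdirs in place stops os.walk from entering those folders.
        subdirs[:] = [d for d in subdirs if d.lower() not in skipped]
        for name in names:
            yield os.path.join(folder, name)
```

Pruning at the folder level means the files inside are never listed at all, which is cheaper than listing everything and filtering the paths afterwards.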


Title: Re: Duplicate File Finder
Post by: simmo on June 17, 2014, 06:25:20 pm
Thanks! I'll try to get around to running it in the next few days and let you know how it goes  :jamie:


Title: Re: Duplicate File Finder
Post by: wrack on June 17, 2014, 11:28:48 pm
No problems :)

Let me know if you find anything nasty!


Title: Re: Duplicate File Finder
Post by: wrack on June 19, 2014, 06:18:40 am
New version (1.2014.0619.0547) is up.

Get it at http://www.codelake.com/downloads/DuplicateFileFinder.zip

* Reduced memory footprint while scanning a large number of files, especially when filters are in place.
* Better performance while using filters.
* Ability to sort the result list.


Title: Re: Duplicate File Finder
Post by: wrack on June 29, 2014, 08:40:00 am
New version (1.2014.0629.0832) is up.

Get it at http://www.codelake.com/downloads/DuplicateFileFinder.zip

* Add ability to include file extensions, so you can now search for specific file types such as JPG, PNG etc.
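
A file extension filter of this kind might be sketched as below. This is a Python illustration with a made-up helper name, `has_extension`; how the app itself matches extensions is not described in the thread.

```python
import os


def has_extension(path, extensions):
    """True if path ends in one of the given extensions, ignoring case."""
    # Accept "jpg", ".jpg" or "JPG" and normalise them all to ".jpg".
    wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    return os.path.splitext(path)[1].lower() in wanted
```

For example, `has_extension("photo.JPG", ["jpg", "png"])` is true, while `has_extension("notes.txt", ["jpg", "png"])` is false.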


Title: Re: Duplicate File Finder
Post by: wrack on July 03, 2014, 03:00:59 am
New version (1.2014.0703.0246) is up.

Get it at http://www.codelake.com/downloads/DuplicateFileFinder.zip

* Add option to open file location in the right click menu
* Add option to choose the combination of properties when performing simple search
* Add option of intelligent hash matching when performing advanced search
* Add option to specify normal file search filter
* Add option to specify regular expression file search filter
* Add option to just find files based on the filter options (Not looking for duplicates)
* Add smart help button and other cosmetic changes
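
The two filter styles in the list above, a normal filter and a regular expression filter, might behave along the lines of this Python sketch. It assumes that the "normal" filter means wildcard patterns such as `IMG_*.jpg`, which the thread does not confirm, and all function names are invented for the example.

```python
import fnmatch
import re


def matches_wildcard(name, pattern):
    """Case-insensitive wildcard match, e.g. 'IMG_*.jpg'."""
    return fnmatch.fnmatchcase(name.lower(), pattern.lower())


def matches_regex(name, pattern):
    """True if the regular expression matches anywhere in the name."""
    return re.search(pattern, name, flags=re.IGNORECASE) is not None


def filter_names(names, wildcard=None, regex=None):
    """Keep the names that pass every filter that was supplied."""
    kept = []
    for name in names:
        if wildcard is not None and not matches_wildcard(name, wildcard):
            continue
        if regex is not None and not matches_regex(name, regex):
            continue
        kept.append(name)
    return kept
```

Calling `filter_names` without looking for duplicates afterwards corresponds to the "just find files" option.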


Title: Re: Duplicate File Finder
Post by: wrack on July 03, 2014, 04:46:30 am
New version (1.2014.0703.0436) is up.

Get it at http://www.codelake.com/downloads/DuplicateFileFinder.zip

* Improved performance using multithreading.


Title: Re: Duplicate File Finder
Post by: wrack on July 04, 2014, 12:03:29 am
I have found the multithreading to be unreliable and to produce inconsistent results. For the time being, I am reverting that code change.

New version (1.2014.0703.2350) is up.

Get it at http://www.codelake.com/downloads/DuplicateFileFinder.zip

* Removed multithreading due to reliability issues.
* Change the min and max size options and made them independent of one another.
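
Making the two size limits independent of one another could be as simple as treating each as optional. The Python sketch below uses a hypothetical `size_in_range` helper and assumes both limits are inclusive and given in bytes; the app's actual rules are not stated in the thread.

```python
def size_in_range(size, min_size=None, max_size=None):
    """Check a file size against optional, independent limits in bytes."""
    if min_size is not None and size < min_size:
        return False
    if max_size is not None and size > max_size:
        return False
    return True
```

With this shape, setting only a minimum, only a maximum, both, or neither all work without one option depending on the other.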


Title: Re: Duplicate File Finder
Post by: Jason Reed on July 04, 2014, 04:27:57 am
That reminds me of an application I wrote years and years ago, in VB6 no less, where I attempted to handle search with multithreading.

My trick was that for each initial path the user entered there would be one thread. It worked for the most part. The hard part was getting all the information from each thread back into the main thread for display.


Title: Re: Duplicate File Finder
Post by: wrack on July 06, 2014, 10:16:41 pm
My problem was that it was building the wrong number of files to begin with. A normal approach found 6002 files, for example, but the multithreaded approach found 6001, sometimes 6002 and sometimes 6000. I can't publish that. I have to find a better way.
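
The thread does not say what caused the varying counts. One common cause of this symptom is several threads appending to a shared collection that is not thread-safe (in .NET, `List<T>` is one such type). The Python sketch below shows the one-thread-per-root idea with that risk designed out: each worker builds its own private list and the results are merged only in the calling thread. The function names are made up, and this is an illustration of the pattern, not the app's code.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_files(root):
    """Collect every file path under one root; runs inside a worker thread."""
    found = []  # local to this call, so no other thread ever touches it
    for folder, _subdirs, names in os.walk(root):
        found.extend(os.path.join(folder, name) for name in names)
    return found


def list_files_parallel(roots, max_workers=4):
    """Scan each root in its own worker and merge the results in the caller."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as roots, one list per root.
        per_root = list(pool.map(list_files, roots))
    merged = []
    for files in per_root:
        merged.extend(files)
    return merged
```

Because no list is shared between workers, the total should match a plain single-threaded scan on every run, provided the files on disk do not change in between.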


Title: Re: Duplicate File Finder
Post by: wrack on July 21, 2014, 02:06:50 am
New version (1.2014.0721.0103) is up.

Get it at http://www.codelake.com/downloads/DuplicateFileFinder.zip

* Fix issues with min and max size options
* Fix issues with Intelligent Hash Matching scanning
* Add ability to select from multiple drives (Currently limited to 3 different drives)
* UI and general code cleanup