• 0

[C++] Test if two files are the same?


Question

I'm not looking to test whether two files are duplicates; I know you can do that with a byte-by-byte comparison or by hashing both files.

What I'm looking for is a way to test if both files are literally the same file. For example:

c:\documents and settings\bla.txt

c:\docume~1\bla.txt

When comparing the strings, those might be seen as two different files, when really they're the same file. I could convert both strings to short file names, but I'm not sure if Windows has other ways of linking files to each other.

In brief, I need a foolproof way to test if two files are the same file or different.

https://www.neowin.net/forum/topic/349791-c-test-if-two-files-are-the-same/
21 answers to this question

  • 0

Well, I can only give you pseudocode:

Read both files and store their contents in two separate strings

--> If string 1 == string 2

Do what you want to happen

You will have to work out how to read the files yourself, or let someone else explain it, since I haven't learned about reading files yet.

  • 0

So your question is how you're supposed to tell whether two files are identical without comparing their contents!?

Btw, love your avatar - reminds me of when I was creating demos in DOS and made an effect that looked exactly like your avatar... heh..

  • 0

So let me see if I'm getting the question.

You have 1 file but 2 different strings for the paths, as in:

Path 1 is C:\Docs\abc.txt

Path 2 is D:\Prog\Desktop\abc.txt

Put the two paths in 2 string arrays. Parse the arrays backwards until you reach the '\', then go forward from that point to the end of the array and store the result in 2 new arrays.

Now compare these 2 arrays to check if they are the same.

  • 0
  df_dukkar said:
So let me see if I'm getting the question.

You have 1 file but 2 different strings for the paths, as in:

Path 1 is C:\Docs\abc.txt

Path 2 is D:\Prog\Desktop\abc.txt

Put the two paths in 2 string arrays. Parse the arrays backwards until you reach the '\', then go forward from that point to the end of the array and store the result in 2 new arrays.

Now compare these 2 arrays to check if they are the same.

He wants to see if the files are the same, i.e. they contain the same stuff, but for some reason he doesn't want to open the files or use hashes like an MD5 sum.

  • 0

Sorry, let me reexplain :pinch:

c:\documents and settings\file.txt

c:\docume~1\file.txt

Technically, both are the same file, but doing a string comparison would say differently.

I have a function which has two parameters, inputfile and outputfile. If the outputfile is different than the inputfile, it will truncate the outputfile. But if the inputfile and outputfile are the same file, then it'll overwrite the current file.

I already mentioned that I could convert both strings to a short filename, but I'm unsure if there are other things to consider. For example, %systemroot%.

So I need a "foolproof" way of testing if the two strings both literally represent the same file. If converting to a short filename would be adequate, could someone give me a function that can do this? I was unable to find anything on Google or MSDN :blush:

And sorry for not being clearer in my first post :(

Btw, love your avatar - reminds me of when I was creating demos in DOS and made an effect that looked exactly like your avatar... heh..

My avatar is an XOR effect, (X ^ Y) ;)

Edit: Just to be absolutely sure I don't confuse anyone again...

I'm NOT looking to compare the contents of files.

I'm only looking to see if two file paths are the same.

eg. c:\docume~1\file.txt IS c:\documents and settings\file.txt. Only difference is one is the short file name and the other is the long file name.

Edited by xinok
  • 0

I don't really get why you would want to do that, since they are always the same. That means instead of writing

cd "c:\documents and settings"

you can always use

cd c:\docume~1\

But if you really want to do that, then it's only a matter of stripping strings.

The reason it says docume~1 is that DOS cannot handle file names longer than 8 characters (plus a 3-character extension), so "documents and settings" became "docume~1".

My recommendation is that you have some kind of translation table that translates docume~1 to its full name, or use the full path as input, strip it down to 6 characters, and add "~1" to it.

  • 0
  xinok said:
but I'm not sure if Windows has other ways of linking files to each other.

Unfortunately, Windows does not have symlinks like Unix does - that is what I understand by links between similar (the same) files.

Maybe I did not understand your post correctly, but if this isn't part of some school assignment, what's the problem with using the 'comp' command?

I have it in the SendTo menu..

  • 0

NTFS supports both hard and soft links like ext2; these features are largely underused though.

There is an API function called GetShortPathName that will deal with 8.3 vs long format. The best way to detect all links and short/long names would be to check which directory entry the paths ultimately point to. I think this approach would work only on a per-file-system basis (e.g., you need different code for FAT32 and NTFS).

  • 0

Jayzee: I know what short file names are.

Andareed: Thanks for the GetShortPathName function

robotnic: I found this: CreateHardLink. It doesn't seem to exist in VC++ 6 though. I was hoping to create a hard link file, see what I can find out about them.

Taken from here:

A hard link to a file is indistinguishable from the original name for the file; there's no particular link that is more the "real name" for the file than any other.

I guess there isn't much I can do about hard link files.

Off Topic

Something that's just sort of bugging me: what's with all the typedefs and #defines in the C++ headers?

typedef LPCSTR LPCTSTR;
typedef CONST CHAR *LPCSTR, *PCSTR;
#define CONST const
typedef char CHAR;
etc.

I really don't see the point. It just makes C++ harder to learn when you have to memorize all these "types", and overall it makes code harder to read if you don't know what a certain typedef or define is.

  • 0

I've actually written 2 apps for creating soft and hard links. If anyone wants, I can post them.

If you use CreateHardLink, you'll probably need to install the Platform SDK and change the VC++ include/lib directories. Interestingly, there is a new CreateSymbolicLink on MSDN that only works with Longhorn.

@xinok: these names are generally acronyms. LPCTSTR means Long Pointer to a Const TCHAR STRing (null-terminated). Another one is TCHAR, which is CHAR in ANSI builds and WCHAR in UNICODE builds. CHAR is char because Win32 uses all caps for structures and primitives.
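
To make the naming scheme concrete, here is a rough re-creation of how such a typedef chain expands. It is an illustration only: the `sketch` namespace is hypothetical and the declarations are simplified stand-ins, not the actual contents of the Windows headers.

```cpp
#include <type_traits>

namespace sketch {  // hypothetical namespace; the real names come from <windows.h>

typedef char CHAR;              // all-caps alias for the primitive type
typedef wchar_t WCHAR;          // wide-character counterpart
typedef const CHAR* LPCSTR;     // Long Pointer to a Const STRing
typedef const WCHAR* LPCWSTR;   // same, for Wide strings

#ifdef UNICODE
typedef WCHAR TCHAR;            // the "T" types follow the build setting
typedef LPCWSTR LPCTSTR;
#else
typedef CHAR TCHAR;
typedef LPCSTR LPCTSTR;
#endif

}  // namespace sketch

// The aliases introduce no new types: LPCSTR is exactly const char*.
static_assert(std::is_same<sketch::LPCSTR, const char*>::value,
              "LPCSTR expands to const char*");
```

The point of the "T" aliases is that the same source line can be compiled for either character width by defining or omitting UNICODE.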

  • 0
  Andareed said:
I've actually written 2 apps for creating soft and hard links. If anyone wants, I can post them.

I'd appreciate it if I can get those apps :yes: And thanks ahead of time.
  • 0

Alright, I was able to solve this little riddle. First, creating a hard link file can be done from a command in Windows:

fsutil hardlink create c:\output.txt c:\input.txt

Now, testing whether both paths are the same file (I already tried this; the lock also applies to hard link files):

We have the inputfile and outputfile

First, open the inputfile, but deny access to all other processes (lock the file)

Now try opening the outputfile. If it succeeds, the files are different. If it fails, continue...

Unlock the inputfile, and try opening the outputfile again. If it succeeds this time, then the files are the same. If it fails again, something else is wrong so return an error. :)

  • 0

Couldn't find softlink app but I based it on the code from here: http://www.codeproject.com/w2k/junctionpoints.asp

The hardlink app just uses CreateHardLink. There is also a sysinternals tool with source called junction: http://www.sysinternals.com/Utilities/Junction.html

  • 0
  df_dukkar said:
could u tell me how to lock down a file ??

I used the OpenFile function.

#include <windows.h>

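const char* file = "file.dat";  // string literals are const

OFSTRUCT fileinfo;

// OF_SHARE_EXCLUSIVE denies other processes read and write access to the file.
// OpenFile returns an HFILE, or HFILE_ERROR on failure.
HFILE handle = OpenFile(file, &fileinfo, OF_READ | OF_SHARE_EXCLUSIVE);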

  • 0
  Quote
PIDL's are only used in the shell.

That doesn't mean that you can't still use them, if they turn out to be useful for your purpose.

You're right about hard and soft links tho. Maybe it's possible to use one of those Nt* APIs to determine the file block they 'link' to.

This topic is now closed to further replies.