• 0

[C++] Test if two files are the same?


Question

I'm not looking to test if two files are duplicates, I know you can do that by doing a byte by byte comparison, or hashing both files.

What I'm looking for is a way to test if both files are literally the same file. For example:

c:\documents and settings\bla.txt

c:\docume~1\bla.txt

When comparing the strings, those might be seen as two different files, when really they're the same file. I could convert both strings to short file names, but I'm not sure if Windows has other ways of linking files to each other.

In brief, I need a foolproof way to test if two files are the same file or different.

https://www.neowin.net/forum/topic/349791-c-test-if-two-files-are-the-same/

21 answers to this question

  • 0

Well, I can only give you pseudocode:

Read both files and store their contents in two separate strings

--> If string 1 == string 2

Do what you want to happen

You will have to check for yourself, or let someone else explain how to read the files, since I haven't learned about reading files yet.

  • 0

So your question is how you are supposed to tell whether two files are identical without comparing their contents!?

Btw, love your avatar - reminds me of when I was creating demos in DOS and I made an effect that looked exactly like your avatar... heh..

  • 0

So let me see if I'm getting the question.

You have 1 file but 2 different strings for the paths, as in:

Path 1 is C:\Docs\abc.txt

Path 2 is D:\Prog\Desktop\abc.txt

Put the two paths in 2 string arrays. Parse the arrays backwards until you reach the '\', then go forward from that point to the end of the array and store the result in 2 new arrays.

Now compare these 2 arrays to check if they are the same.

  • 0
  df_dukkar said:
So let me see if I'm getting the question.

You have 1 file but 2 different strings for the paths, as in:

Path 1 is C:\Docs\abc.txt

Path 2 is D:\Prog\Desktop\abc.txt

Put the two paths in 2 string arrays. Parse the arrays backwards until you reach the '\', then go forward from that point to the end of the array and store the result in 2 new arrays.

Now compare these 2 arrays to check if they are the same.

He wants to see if the files are the same, i.e. they contain the same stuff, but for some reason he doesn't want to open the files or use a hash such as an MD5 sum.

  • 0

Sorry, let me reexplain :pinch:

c:\documents and settings\file.txt

c:\docume~1\file.txt

Technically, both are the same file, but doing a string comparison would say differently.

I have a function which has two parameters, inputfile and outputfile. If the outputfile is different than the inputfile, it will truncate the outputfile. But if the inputfile and outputfile are the same file, then it'll overwrite the current file.

I already mentioned that I could convert both strings to a short filename, but I'm unsure if there are other things to consider. For example, %systemroot%.

So I need a "foolproof" way of testing if the two strings both literally represent the same file. If converting to a short filename would be adequate, could someone give me a function that can do this? I was unable to find anything on Google or MSDN :blush:

And sorry for not being clearer in my first post :(

Btw, love your avatar - reminds me of when I was creating demos in DOS and I made an effect that looked exactly like your avatar... heh..

My avatar is an XOR effect, (X ^ Y) ;)

Edit: Just to be absolutely sure I don't confuse anyone again...

I'm NOT looking to compare the contents of files.

I'm only looking to see if two file paths are the same.

eg. c:\docume~1\file.txt IS c:\documents and settings\file.txt. Only difference is one is the short file name and the other is the long file name.

Edited by xinok
  • 0

I don't really get why you would want to do that, since they are always the same. That means instead of writing

cd "c:\documents and settings"

you can always use

cd c:\docume~1\

But if you really want to do that, then it's only a matter of stripping strings.

The reason it reads docume~1 is that DOS cannot handle file names longer than 8 characters (plus a 3-character extension), so "documents and settings" became "docume~1".

My recommendation is that you have some kind of translation table that translates docume~1 to its full name, or use the full path as input, strip it down to 6 characters and add "~1" to it.

  • 0
  xinok said:
but I'm not sure if Windows has other ways of linking files to each other.

Unfortunately, Windows does not have symlinks like Unix does, which is what I understand by linking the same file under different names.

Maybe I did not understand your post correctly, but if this isn't part of some school assignment, what's the problem with using the 'comp' command?

I have it in the SendTo menu..

  • 0

NTFS supports both hard and soft links like ext2; these features are largely underused though.

There is an API function called GetShortPathName that will deal with 8.3 vs long format. The best way to detect all links/short/long would be to check which directory entry the paths ultimately point to. I think this approach would work only on a per-file-system basis (e.g., you need different code for FAT32 and NTFS).
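One way to act on the "which entry do they ultimately point to" idea is to compare file identity instead of path strings. The following is only a sketch and was not posted in this thread: on Windows it assumes GetFileInformationByHandle and compares the volume serial number plus the file index; on other systems it falls back to the device and inode numbers from stat(). The names same_file and open_for_query are made up for illustration.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: opens a path for metadata queries only (no read/write access).
static HANDLE open_for_query(const std::string& path) {
    return CreateFileA(path.c_str(), 0,
                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                       NULL, OPEN_EXISTING,
                       FILE_FLAG_BACKUP_SEMANTICS,  // needed to open directories
                       NULL);
}

// Hypothetical helper: true when both paths name the same file on the same volume.
bool same_file(const std::string& a, const std::string& b) {
    HANDLE ha = open_for_query(a);
    HANDLE hb = open_for_query(b);
    bool same = false;
    if (ha != INVALID_HANDLE_VALUE && hb != INVALID_HANDLE_VALUE) {
        BY_HANDLE_FILE_INFORMATION ia, ib;
        // Both handles stay open during the comparison.
        if (GetFileInformationByHandle(ha, &ia) && GetFileInformationByHandle(hb, &ib)) {
            same = ia.dwVolumeSerialNumber == ib.dwVolumeSerialNumber &&
                   ia.nFileIndexHigh == ib.nFileIndexHigh &&
                   ia.nFileIndexLow == ib.nFileIndexLow;
        }
    }
    if (ha != INVALID_HANDLE_VALUE) CloseHandle(ha);
    if (hb != INVALID_HANDLE_VALUE) CloseHandle(hb);
    return same;
}
#else
#include <sys/stat.h>

// POSIX fallback: same device and same inode means same file.
bool same_file(const std::string& a, const std::string& b) {
    struct stat sa, sb;
    if (stat(a.c_str(), &sa) != 0 || stat(b.c_str(), &sb) != 0) return false;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
#endif
```

With something like this, same_file("c:\\docume~1\\file.txt", "c:\\documents and settings\\file.txt") should report true on NTFS, and hard links to one file should compare equal as well. How reliable the file index is on FAT volumes is something to verify separately, in line with the per-file-system caveat above.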

  • 0

Jayzee: I know what short file names are.

Andareed: Thanks for the GetShortPathName function

robotnic: I found this: CreateHardLink. It doesn't seem to exist in VC++ 6 though. I was hoping to create a hard link file, see what I can find out about them.

Taken from here:

A hard link to a file is indistinguishable from the original name for the file; there's no particular link that is more the "real name" for the file than any other.

I guess there isn't much I can do about hard link files.

Off Topic

Something that's just sort of bugging me: what's with all the typedefs and #defines in the C++ headers?

typedef LPCSTR LPCTSTR; typedef CONST CHAR *LPCSTR, *PCSTR; #define CONST const, typedef char CHAR; etc.

I really don't see the point. It just makes C++ harder to learn trying to memorize all these "types", and overall makes code harder to read if you don't know what a certain typedef or define is.

  • 0

I've actually written 2 apps for creating soft and hard links. If anyone wants, I can post them.

If you use CreateHardLink, you'll probably need to install the Platform SDK and change the VC++ include/lib directories. Interestingly, there is a new CreateSymbolicLink on MSDN that only works with Longhorn.

@xinok: these names are generally acronyms. LPCTSTR means Long Pointer to a Const TCHAR STRing. TCHAR itself is CHAR in ANSI builds and WCHAR in UNICODE builds. CHAR is char because Win32 uses all caps for structures and primitives.
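To make the naming scheme concrete, here is a small sketch of how the aliases peel away to ordinary C++ types. On Windows it takes the real definitions from windows.h; elsewhere it declares stand-in typedefs that mirror them, which is an assumption made only so the example compiles on any platform. The function tchar_is_wide is a hypothetical helper.

```cpp
#include <type_traits>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins mirroring the Windows headers (assumption for non-Windows builds).
typedef char CHAR;
typedef wchar_t WCHAR;
#ifdef UNICODE
typedef WCHAR TCHAR;   // UNICODE build: TCHAR is the wide character type
#else
typedef CHAR TCHAR;    // ANSI build: TCHAR is plain char
#endif
typedef const CHAR* LPCSTR;    // Long Pointer to Const STRing
typedef const WCHAR* LPCWSTR;  // the same, for Wide strings
typedef const TCHAR* LPCTSTR;  // follows whichever type TCHAR is
#endif

// Once the aliases are resolved, these are ordinary pointer types.
static_assert(std::is_same<LPCSTR, const char*>::value, "LPCSTR is const char*");
static_assert(std::is_same<LPCWSTR, const wchar_t*>::value, "LPCWSTR is const wchar_t*");
static_assert(std::is_same<LPCTSTR, const TCHAR*>::value, "LPCTSTR is const TCHAR*");

// Hypothetical helper: reports whether this build uses the wide-character aliases.
constexpr bool tchar_is_wide() {
    return std::is_same<TCHAR, WCHAR>::value;
}
```

The point of the indirection is that code written against TCHAR and LPCTSTR can be rebuilt for ANSI or UNICODE by changing a single define.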

  • 0
  Andareed said:
I've actually written 2 apps for creating soft and hard links. If anyone wants, I can post them.

I'd appreciate it if I can get those apps :yes: And thanks ahead of time.
  • 0

Alright, I was able to solve this little riddle. First, creating a hard link file can be done from a command in windows:

fsutil hardlink create c:\output.txt c:\input.txt

Now, testing whether the two paths are the same file (I already tried this; the lock also applies to hard-linked files):

We have the inputfile and outputfile

First, open the inputfile, but deny access to all other processes (lock the file)

Now try opening the outputfile. If it succeeds, the files are different. If it fails, continue...

Unlock the inputfile, and try opening the outputfile again. If it succeeds this time, then the files are the same. If it fails again, something else is wrong so return an error. :)
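For readers on a newer toolchain, C++17's std::filesystem::equivalent answers the "same file?" question directly, without locking anything. This is a swapped-in technique and not what was used above (VC++ 6 predates it by many years). The helper name is_same_file, and the choice to treat a not-yet-existing output file as "different", are assumptions made for this sketch.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: true when both paths resolve to the same existing file.
bool is_same_file(const fs::path& input, const fs::path& output) {
    std::error_code ec;
    // An output file that does not exist yet cannot be the input file.
    if (!fs::exists(output, ec) || ec) return false;
    // equivalent() compares file identity, not the spelling of the paths.
    bool same = fs::equivalent(input, output, ec);
    return !ec && same;
}
```

In the inputfile/outputfile function, a check like this could decide whether the output may be truncated: truncate only when is_same_file returns false, and otherwise take the overwrite-in-place path.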

  • 0

Couldn't find softlink app but I based it on the code from here: http://www.codeproject.com/w2k/junctionpoints.asp

The hardlink app just uses CreateHardLink. There is also a sysinternals tool with source called junction: http://www.sysinternals.com/Utilities/Junction.html

  • 0
  df_dukkar said:
could u tell me how to lock down a file ??

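I used the OpenFile function.

#include <windows.h>

const char* file = "file.dat";

OFSTRUCT fileinfo;

// OF_SHARE_EXCLUSIVE denies other processes read and write access while the file is open.

HFILE handle = OpenFile(file, &fileinfo, OF_SHARE_EXCLUSIVE);

// handle is HFILE_ERROR if the open failed (for example, because the file is already locked).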

  • 0
  Quote
PIDL's are only used in the shell.

That doesn't mean that you can't still use them, if they turn out to be useful for your purpose.

You're right about hard and soft links, though. Maybe it's possible to use one of those Nt* APIs to determine the file block they 'link' to.

This topic is now closed to further replies.