• 0

[C++] Test if two files are the same?


Question

I'm not looking to test if two files are duplicates; I know you can do that with a byte-by-byte comparison or by hashing both files.

What I'm looking for is a way to test if both files are literally the same file. For example:

c:\documents and settings\bla.txt

c:\docume~1\bla.txt

When comparing the strings, those might be seen as two different files, when really they're the same file. I could convert both strings to short file names, but I'm not sure if Windows has other ways of linking files to each other.

In brief, I need a foolproof way to test if two files are the same file or different.

https://www.neowin.net/forum/topic/349791-c-test-if-two-files-are-the-same/

21 answers to this question


  • 0

Well, I can just give you pseudocode:

Read both files and store the text inside into two separate strings

--> If string 1 == string 2

Do what you want to happen

You will have to check yourself, or let someone else work out how to read the files, since I haven't learned about reading files yet.

  • 0

So your question is how you are supposed to see if two files are identical without comparing their contents!?

Btw, love your avatar - reminds me of when I was creating demos in DOS and I made an effect that looked exactly like your avatar... heh..

  • 0

So let me see if I'm getting the question.

You have 1 file but 2 different strings for the paths:

as in

Path 1 is C:\Docs\abc.txt

Path 2 is D:\Prog\Desktop\abc.txt

Put the two paths in 2 string arrays. Parse the arrays backwards till you reach the '\', then from that point go forward again till the end of the array and store this in 2 new arrays.

Now compare these 2 arrays to check if they are the same.

  • 0
  df_dukkar said:
So let me see if I'm getting the question.

You have 1 file but 2 different strings for the paths:

as in

Path 1 is C:\Docs\abc.txt

Path 2 is D:\Prog\Desktop\abc.txt

Put the two paths in 2 string arrays. Parse the arrays backwards till you reach the '\', then from that point go forward again till the end of the array and store this in 2 new arrays.

Now compare these 2 arrays to check if they are the same.

586278450[/snapback]

He wants to see if the files are the same, i.e. they contain the same stuff, but for some reason he doesn't want to open the files or use hashes like an MD5 sum.

  • 0

Sorry, let me reexplain :pinch:

c:\documents and settings\file.txt

c:\docume~1\file.txt

Technically, both are the same file, but doing a string comparison would say differently.

I have a function which has two parameters, inputfile and outputfile. If the outputfile is different than the inputfile, it will truncate the outputfile. But if the inputfile and outputfile are the same file, then it'll overwrite the current file.

I already mentioned that I could convert both strings to a short filename, but I'm unsure if there are other things to consider. For example, %systemroot%.

So I need a "foolproof" way of testing if the two strings both literally represent the same file. If converting to a short filename would be adequate, could someone give me a function that can do this? I was unable to find anything on Google or MSDN :blush:

And sorry for not being clearer in my first post :(

  Quote
Btw, love your avatar - reminds me of when I was creating demos in DOS and I made an effect that looked exactly like your avatar... heh..

My avatar is an XOR effect, (X ^ Y) ;)

Edit: Just to be absolutely sure I don't confuse anyone again...

I'm NOT looking to compare the contents of files.

I'm only looking to see if two file paths are the same.

eg. c:\docume~1\file.txt IS c:\documents and settings\file.txt. Only difference is one is the short file name and the other is the long file name.

Edited by xinok
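
A minimal sketch of the input/output check described above, assuming a C++17 compiler (much newer than the VC++ 6 mentioned later in the thread). `std::filesystem::equivalent` asks the operating system whether two paths resolve to the same file, so short names, long names and links do not have to be compared as strings. The helper names `is_same_file` and `open_output` are hypothetical.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: true only when both paths exist and resolve to the
// same file. Any error (for example a missing file) counts as "not the same".
bool is_same_file(const fs::path& input, const fs::path& output) {
    std::error_code ec;
    const bool same = fs::equivalent(input, output, ec);
    return !ec && same;
}

// Hypothetical helper: opens the output without truncating it when it is the
// same file as the input, and truncates it otherwise.
std::fstream open_output(const fs::path& input, const fs::path& output) {
    if (is_same_file(input, output)) {
        return std::fstream(output, std::ios::in | std::ios::out | std::ios::binary);
    }
    return std::fstream(output, std::ios::out | std::ios::trunc | std::ios::binary);
}
```

With this, `is_same_file("c:\\docume~1\\file.txt", "c:\\documents and settings\\file.txt")` would be expected to return true when both names refer to one file, though that depends on the standard library's Windows implementation.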
  • 0

I don't really get why you would like to do that since they are always the same. That means instead of writing

cd "c:\documents and settings"

you can always use

cd c:\docume~1\

But if you really want to do that, then it's only a matter of stripping strings.

The reason it reads docume~1 is that DOS cannot handle file names longer than 8 characters, so "documents and settings" became "docume~1".

My recommendation is that you have some kind of translation table that translates docume~1 to its full name, or use the full path as input, strip it down to 6 characters and add "~1" to it.

  • 0
  xinok said:
but I'm not sure if Windows has other ways of linking files to eachother.

586278291[/snapback]

Unfortunately Windows does not have symlinks like in Unix - this is how I understand linking of similar (the same) files.

Maybe I did not understand your post correctly, but if you don't need this as part of some school assignment, what's the problem with using the 'comp' command?

I have it in the Send To menu..

  • 0

NTFS supports both hard and soft links like ext2; these features are largely underused though.

There is an API function called GetShortPathName that will deal with 8.3 vs. long format. The best way to detect all links and short/long names would be to check which directory entry the paths ultimately point to. I think this approach would work only on a per-file-system basis (e.g., you need different code for FAT32 and NTFS).
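
One way to check what two paths ultimately point to is to compare the file identity reported by the operating system. The sketch below is an assumption, not something posted in this thread: on Windows it compares the volume serial number and file index from `GetFileInformationByHandle`, and on other systems it falls back to the device and inode numbers from `stat`. The function names `same_file_identity` and `open_for_query` are hypothetical.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>

// Opens a path without requesting read or write access, only to query it.
// FILE_FLAG_BACKUP_SEMANTICS also allows directories to be opened.
static HANDLE open_for_query(const std::string& path) {
    return CreateFileA(path.c_str(), 0,
                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                       NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
}

bool same_file_identity(const std::string& a, const std::string& b) {
    HANDLE ha = open_for_query(a);
    HANDLE hb = open_for_query(b);
    bool same = false;
    if (ha != INVALID_HANDLE_VALUE && hb != INVALID_HANDLE_VALUE) {
        BY_HANDLE_FILE_INFORMATION ia, ib;
        // Both handles stay open while the identifiers are compared.
        if (GetFileInformationByHandle(ha, &ia) &&
            GetFileInformationByHandle(hb, &ib)) {
            same = ia.dwVolumeSerialNumber == ib.dwVolumeSerialNumber &&
                   ia.nFileIndexHigh == ib.nFileIndexHigh &&
                   ia.nFileIndexLow == ib.nFileIndexLow;
        }
    }
    if (ha != INVALID_HANDLE_VALUE) CloseHandle(ha);
    if (hb != INVALID_HANDLE_VALUE) CloseHandle(hb);
    return same;
}
#else
#include <sys/stat.h>

bool same_file_identity(const std::string& a, const std::string& b) {
    struct stat sa, sb;
    if (stat(a.c_str(), &sa) != 0 || stat(b.c_str(), &sb) != 0) return false;
    // Same device and same inode means the same file.
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
#endif
```

Because the comparison uses identifiers rather than names, it should treat 8.3 names, long names and hard links alike. How well the identifiers behave on a given file system (FAT32, network shares) is something to verify before relying on it.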

  • 0

Jayzee: I know what short file names are.

Andareed: Thanks for the GetShortPathName function

robotnic: I found this: CreateHardLink. It doesn't seem to exist in VC++ 6 though. I was hoping to create a hard link file, see what I can find out about them.

Taken from here:

A hard link to a file is indistinguishable from the original name for the file; there's no particular link that is more the "real name" for the file than any other.

I guess there isn't much I can do about hard link files.
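
For what it's worth, hard links can still be detected by identity even though neither name is the "real" one. A small sketch, assuming a C++17 compiler and a file system that supports hard links: it writes a file, creates a second name for it with `std::filesystem::create_hard_link`, and asks `std::filesystem::equivalent` whether the two names are the same file. The function name `hard_link_is_same_file` is hypothetical.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Writes a small file at `target`, hard-links `link` to it, and reports
// whether the two names are recognized as one file. `link` must not exist
// yet. Returns false if any step fails.
bool hard_link_is_same_file(const fs::path& target, const fs::path& link) {
    {
        std::ofstream out(target);
        out << "data";
    }
    std::error_code ec;
    fs::create_hard_link(target, link, ec);
    if (ec) return false;
    const bool same = fs::equivalent(target, link, ec);
    return !ec && same;
}
```

After a successful call, `fs::hard_link_count(target)` reports how many names the file has, which is another hint that a file is reachable under more than one path.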

Off Topic

Something that's just sort of bugging me: what's with all the typedefs and #defines in the C++ headers?

typedef LPCSTR LPCTSTR; typedef CONST CHAR *LPCSTR, *PCSTR; #define CONST const; typedef char CHAR; etc.

I really don't see the point. It just makes C++ harder to learn, trying to memorize all these "types", and overall makes code harder to read if you don't know what a certain typedef or define is.

  • 0

I've actually written 2 apps for creating soft and hard links. If anyone wants, I can post them.

If you use CreateHardLink, you'll probably need to install the Platform SDK and change the VC++ include/lib directories. Interestingly, there is a new CreateSymbolicLink on MSDN that only works with Longhorn.

@xinok: these namings are generally acronyms. LPCTSTR means Long Pointer to a Const TCHAR STRing (null-terminated). Another one is TCHAR, which is CHAR in ANSI builds and WCHAR in Unicode builds. CHAR is char because Win32 uses all caps for structures and primitives.
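
To illustrate why the headers bother with these names, here is a sketch of the idea using stand-in types. The `Demo` names and the `DEMO_TEXT` macro are hypothetical and are not the real Windows definitions. They only mimic how one source file can compile as either a narrow or a wide build depending on whether `UNICODE` is defined.

```cpp
#include <type_traits>

// Stand-ins for the Windows header names, prefixed with Demo so they cannot
// clash with <windows.h>. Illustrative only.
typedef char DemoCHAR;
typedef wchar_t DemoWCHAR;

#ifdef UNICODE
typedef DemoWCHAR DemoTCHAR;        // wide build
#define DEMO_TEXT(s) L##s           // wide string literal
#else
typedef DemoCHAR DemoTCHAR;         // narrow ("ANSI") build
#define DEMO_TEXT(s) s              // narrow string literal
#endif

typedef const DemoCHAR* DemoLPCSTR;    // pointer to a const narrow string
typedef const DemoWCHAR* DemoLPCWSTR;  // pointer to a const wide string
typedef const DemoTCHAR* DemoLPCTSTR;  // follows whichever build is active

// The same signature and body compile unchanged in either build.
unsigned demo_length(DemoLPCTSTR text) {
    unsigned n = 0;
    while (text[n] != 0) ++n;
    return n;
}
```

Calling `demo_length(DEMO_TEXT("abc"))` works in both configurations, which is the portability the real `TCHAR` and `LPCTSTR` names are meant to provide.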

  • 0
  Andareed said:
I've actually written 2 apps for creating soft and hard links. If anyone wants, I can post them.

586280364[/snapback]

I'd appreciate it if I can get those apps :yes: And thanks ahead of time.
  • 0

Alright, I was able to solve this little riddle. First, creating a hard link file can be done from a command in windows:

fsutil hardlink create c:\output.txt c:\input.txt

Now testing for duplicates (I already tried this, it also locks hard link files):

We have the inputfile and outputfile

First, open the inputfile, but deny access to all other processes (lock the file)

Now try opening the outputfile. If it succeeds, the files are different. If it fails, continue...

Unlock the inputfile, and try opening the outputfile again. If it succeeds this time, then the files are the same. If it fails again, something else is wrong so return an error. :)
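
The steps above can be sketched roughly as follows. This is an assumption about how they might be written, using `CreateFileA` with a share mode of 0 as the exclusive lock. It is Windows-only, and the names `same_file_by_lock`, `open_existing` and the result enum are hypothetical.

```cpp
#ifdef _WIN32
#include <windows.h>

enum SameFileResult { FILES_DIFFER, FILES_SAME, FILES_ERROR };

// Opens an existing file for reading with the given share mode.
static HANDLE open_existing(const char* path, DWORD share) {
    return CreateFileA(path, GENERIC_READ, share, NULL, OPEN_EXISTING,
                       FILE_ATTRIBUTE_NORMAL, NULL);
}

SameFileResult same_file_by_lock(const char* input, const char* output) {
    const DWORD share_all = FILE_SHARE_READ | FILE_SHARE_WRITE;

    // Step 1: open the input with share mode 0, so nobody else can open it.
    HANDLE in = open_existing(input, 0);
    if (in == INVALID_HANDLE_VALUE) return FILES_ERROR;

    // Step 2: if the output still opens, it is not the locked file.
    HANDLE out = open_existing(output, share_all);
    if (out != INVALID_HANDLE_VALUE) {
        CloseHandle(out);
        CloseHandle(in);
        return FILES_DIFFER;
    }

    // Step 3: release the lock and retry. Success now means the lock was
    // what blocked the first attempt, so both paths name the same file.
    CloseHandle(in);
    out = open_existing(output, share_all);
    if (out == INVALID_HANDLE_VALUE) return FILES_ERROR;
    CloseHandle(out);
    return FILES_SAME;
}
#endif
```

Two caveats apply to this sketch: an output file that does not exist yet is reported as an error rather than as different, and another process opening or locking either file between the steps can change the answer.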

  • 0

I couldn't find the soft link app, but I based it on the code from here: http://www.codeproject.com/w2k/junctionpoints.asp

The hardlink app just uses CreateHardLink. There is also a sysinternals tool with source called junction: http://www.sysinternals.com/Utilities/Junction.html

  • 0
  df_dukkar said:
could u tell me how to lock down a file ??

586282181[/snapback]

I used the OpenFile function.

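#include <windows.h>

const char* file = "file.dat";

OFSTRUCT fileinfo;

// OF_SHARE_EXCLUSIVE denies other processes read and write access while the handle is open.
// OpenFile returns an HFILE, or HFILE_ERROR on failure.
HFILE handle = OpenFile(file, &fileinfo, OF_SHARE_EXCLUSIVE);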

  • 0
  Quote
PIDLs are only used in the shell.

That doesn't mean that you can't still use them, if they turn out to be useful for your purpose.

You're right about hard and soft links though. Maybe it's possible to use one of those Nt* APIs to determine the file block they 'link' to.

This topic is now closed to further replies.