• 0

Need help making an "offline" copy of a website I'm a member of


Question

Hi, everyone.

I'm a member of a private group, one that costs many thousands of dollars to join. The person who started and runs the group, whom I'll call "Bob", isn't very technical at all. He hired someone to create a website for members. Each member has their own username and password. As far as I know, the site has run on Microsoft SharePoint Server for at least the last few years. I do not know any more specifics about what the website is running on.

"Bob" has always encouraged people to download things from the website here and there since content is always changing and in case anything ever happened to the server. Well, this website has many areas (called "Sites"), thousands of forum/blog type postings, and hundreds or thousands of links, files, and folders scattered all over the place. Yes, I could manually make local folders and download things, but that would be a nightmare and quite difficult.

So, over the last year or so, I've thought about just making an "offline" copy and figured I would use WinHTTrack, which I have known about for many years. I went to do it about a week ago, with the intention of backing up the site to a new Western Digital Caviar Black 2TB drive. I think the website is only maybe 100GB. Anyway, I am getting "Access Denied" and "Unauthorized" errors (if I look at the logs), which don't make any sense when I know I am using the correct username and password that I use to log in. I am even copying and pasting from my password manager.

Here is the log file generated when telling WinHTTrack to copy http://www.WEBSITE.com :

HTTrack3.46+htsswf+htsjava launched on Wed, 17 Oct 2012 14:12:40 at http://USERNAME:PASS...www.WEBSITE.com +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar

(winhttrack -WC2%Pns2u1%s%uN0%I0p7DaK0c2R3H0%kf2A25000%c1%f#f -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2010], %s -->" -%l "en, en, *" http://USERNAME:PASS...www.WEBSITE.com -O1 I:\WEBSITE_BACKUP_FOLDER +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar )

Information, Warnings and Errors reported for this mirror:

note: the hts-log.txt file, and hts-cache folder, may contain sensitive information, such as username/password authentication for websites mirrored in this project

do not share these files/folders if you want these information to remain private

14:12:42 Error: "Access denied" (401) at link USERNAME:PASSWORD@www.WEBSITE.com/ (from primary/primary)

14:12:42 Info: No data seems to have been transfered during this session! : restoring previous one!

Here is the log file generated when telling WinHTTrack to copy http://www.WEBSITE.com/default.aspx :

HTTrack3.46+htsswf+htsjava launched on Wed, 17 Oct 2012 14:16:53 at http://USERNAME:PASS...om/default.aspx +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar

(winhttrack -WC2%Pns2u1%s%uN0%I0p7DaK0c2R3H0%kf2A25000%c1%f#f -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2010], %s -->" -%l "en, en, *" http://USERNAME:PASS...om/default.aspx -O1 I:\WEBSITE_BACKUP_FOLDER +*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar )

Information, Warnings and Errors reported for this mirror:

note: the hts-log.txt file, and hts-cache folder, may contain sensitive information, such as username/password authentication for websites mirrored in this project

do not share these files/folders if you want these information to remain private

14:16:56 Error: "Unauthorized" (401) at link USERNAME:PASSWORD@www.WEBSITE.com/default.aspx (from primary/primary)

14:16:56 Info: No data seems to have been transfered during this session! : restoring previous one!

Attached you will find a screenshot of the error I get via a pop-up window (for WEBSITE.com or WEBSITE.com/default.aspx), as well as screenshots showing the various tabs of the Preferences.

So, I was wondering if there is a problem in general with using WinHTTrack with SharePoint websites? Does anyone have any experience making offline copies of SharePoint websites? Any suggestions? Other software that someone has used for SharePoint websites?

Thanks in advance for any feedback and/or help...

-JayZJay

[12 attached screenshots: the error pop-up and the WinHTTrack Preferences tabs]

10 answers to this question

  • 0

Why not just do a server-level backup? I never had any issues getting websites with the standard options in HTTrack. Or just FTP into the server and download the entire website from there. If it uses PHP, I believe you will still need a local server to run the website.

  • 0

The server is not physically here, nor do I, as only a member, have any type of admin access to do a server-level backup. As far as I know or can tell, there is not any type of FTP access available to anyone. If there is, I do not know the URL, and "Bob", being a control freak, is not going to provide that information to any of us members.

  • 0

Well, from this:

"14:12:42 Error: "Access denied" (401) at link USERNAME:PASSWORD@www.WEBSITE.com/ (from primary/primary)"

It looks to me like you cannot use that form of URL to authenticate to the site. If it is SharePoint, it is probably using NTLM auth, and they probably have basic auth turned off, so that sort of URL will not authenticate you.

You're never going to download anything if the site requires auth and you can't get past it. I would first verify that the URL you're using works when pasted into your browser's address bar. If not, do a bit of googling for how to use NTLM auth with HTTrack. The old-school way was to run a local proxy on the same machine that did the NTLM auth for you, and point all of HTTrack's connections at it. I have not played with HTTrack in years and years, so maybe it supports NTLM auth directly now?
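One rough way to check the guess above is to look at which auth schemes the server offers in its 401 response. This is only a sketch using Python's standard library: the function names are made up for illustration, http://www.WEBSITE.com/ is the placeholder from the log, and the parser assumes the server sends one scheme per WWW-Authenticate header (it only reads the first word of each header value).

```python
import urllib.error
import urllib.request


def offered_schemes(header_values):
    """Return the lowercased scheme name from each WWW-Authenticate value.

    Assumption: one challenge per header value, so only the first
    whitespace-separated token of each value is read.
    """
    schemes = []
    for value in header_values:
        parts = value.strip().split(None, 1)
        if parts:
            schemes.append(parts[0].lower())
    return schemes


def probe(url):
    """Request url without credentials and list the schemes a 401 offers."""
    try:
        urllib.request.urlopen(url, timeout=10)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            values = err.headers.get_all("WWW-Authenticate") or []
            return offered_schemes(values)
        raise
    # No 401 came back, so the page did not ask for HTTP authentication.
    return []


if __name__ == "__main__":
    # Placeholder host taken from the log above; substitute the real site.
    print(probe("http://www.WEBSITE.com/"))
```

If "basic" is not in the list that comes back (for example, only "ntlm" or "negotiate" shows up), that would be consistent with the username:password URL form being rejected, though it does not prove it.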

  • 0

"Bob" runs a website where he encourages people to download, yes sounds very legit...

Try not assuming. It is a research group. We study alternative health, banking, law, trusts, estates, and other topics. The only thing on the site is PDFs, Word docs, discussion on various topics, private presentations recorded from group meetings, webinars, etc. Those are the things on the site and there is a ton of it going back years. "Bob" encourages us to download and keep copies of stuff because 1) the site constantly changes, 2) there are times when you are in a law library or other place without Internet access and cannot access the site, and 3) "Bob" isn't very technical and he loses things, deletes stuff, etc -- so having our own backup copies is certainly a good idea just in case a problem arises.

  • 0

I don't know how to help, but this sounds really cool. Why can't we know who Bob is, or what this website is?

  • 0

All I have gotten from this whole thread is that Bob is a complete and utter idiot who runs a website that people seem to pay a lot of money to join. If you are paying him money to join this secret website and he has no clue what he's doing, then he should hire someone to take care of the tech side of things, including backups to another hard drive or server (after all, you say you're paying a lot of money, so why can't he have more than one server?). If there are as many files as you claim, then the best thing would be to get limited FTP access to the downloads directory (able to access and download that directory and that's it).

In all honesty, you would be better off talking to this Bob and telling him to hire someone to handle backups. As for the offline content, maybe he does not want that, because the information could end up on the internet for free instead of people paying him for access. As you said, things change and people leave things lying about, which soon get onto the internet.

Anyway, I'm out of this thread. I personally think what you are doing is wrong; you are a member of a site and nothing more.

This topic is now closed to further replies.