• 0

WebRequest and WebResponse have issues


Question

I wrote a C# program that uses WebRequest and WebResponse to implement a simple web crawler, and I discovered something about web sites. Web browsers such as IE and Firefox let you view the HTML source. But it seems that the HTML sent to the browser is one thing, and what the browser interprets and displays is something else. For example, if you run the same Google search in IE and in Firefox, the source you view in IE will NOT contain the hyperlinks and content from the search results, while the source you view in Firefox does. So my question is this: how do you specialise WebRequest and WebResponse to return the content after it has been processed by the browser instead of before?

10 answers to this question

  • 0

It's not an 'issue'; it's by design. WebRequest/WebResponse is not responsible for executing client-side scripts, so no content loaded by XHR requests will be returned, just the original HTML document. I believe that if you want the HTML after a client-side script has modified it, you need to use a full web browser control rather than a WebRequest.
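To make the point concrete, here is a minimal console sketch of what WebRequest/WebResponse hand back: the response body as sent by the server, with any script blocks still present as plain text. The class name `RawFetch`, the method name `FetchRawHtml`, and the example.com URL are placeholders chosen for illustration, not anything from the original program.

```csharp
using System;
using System.IO;
using System.Net;

static class RawFetch
{
    // Returns the response body exactly as it was sent.
    // No scripts are executed and no follow-up XHR requests are made.
    public static string FetchRawHtml(Uri uri)
    {
        WebRequest request = WebRequest.Create(uri);
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Placeholder URL, used only for illustration.
        string html = FetchRawHtml(new Uri("http://www.example.com/"));
        Console.WriteLine(html.Length + " characters of unprocessed markup");
    }
}
```

Anything a page builds later with JavaScript will not appear in the returned string, which matches the behaviour described in the question.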

  • 0

I think I will have to use the WebBrowser class instead.

How do I expose the LoadCompleted method in the WebBrowser class in WPF C#?

I am trying to write a C# program in WPF that retrieves the content of a web page.

The first thing I tried was the WebRequest and WebResponse classes. This did not provide the content that is actually displayed. WebResponse reveals the HTML code that is sent to the browser. But I discovered that, while the page is being loaded by the browser, JavaScript can change what content is finally displayed.

So I decided to use the WebBrowser class.

Immediately I found that there are two WebBrowser classes. There is one that is documented for WinForms and another that is documented for WPF. I need to understand the one documented for WPF. What I think I need to do is retrieve the HTML after the "LoadCompleted" event fires. But I do not know how to do this, and I cannot find any example demonstrating how it is done.

  • 0

In whatever class you're hosting the control (Page, Window, etc.) you need to add a handler. You can either put it in the class's initialization routine as

myBrowser.LoadCompleted += WebBrowser_LoadCompleted;

or put it in the XAML in the WebBrowser declaration.

<WebBrowser Name="myBrowser" LoadCompleted="WebBrowser_LoadCompleted"/>
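For what the handler itself might look like, here is a hedged sketch of the code-behind for a window whose XAML contains the WebBrowser declaration above. It assumes a window class named `MainWindow` and reads the document through `dynamic`, which relies on late binding to the underlying HTML document object; the URL is a placeholder.

```csharp
using System;
using System.Diagnostics;
using System.Windows;
using System.Windows.Navigation;

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();  // creates myBrowser from the XAML declaration
        // Placeholder URL, used only for illustration.
        myBrowser.Navigate(new Uri("http://www.example.com/"));
    }

    // Wired up in XAML via LoadCompleted="WebBrowser_LoadCompleted".
    private void WebBrowser_LoadCompleted(object sender, NavigationEventArgs e)
    {
        // Document is typed as object in WPF; dynamic lets us call DOM members on it.
        dynamic document = myBrowser.Document;
        string html = document.documentElement.outerHTML;
        Debug.WriteLine("Loaded " + e.Uri + ": " + html.Length + " characters");
    }
}
```

The string read here is the DOM at the moment the event fires, so content that scripts add afterwards may not be in it yet.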

  • 0

I am getting close to solving this and having a working bit of code. As things stand right now, the callback for LoadCompleted is not called when the code is stepped through.

Why doesn't this callback get called?

Basically, here is the code surrounding the callback declaration:

webbrowser1 = new WebBrowser();
webbrowser1.LoadCompleted += webbrowser1_LoadCompleted;
webbrowser1.Navigate(new Uri("http://www.google.com"));

Should there be something more or are they in the wrong order?

The method, webbrowser1_LoadCompleted, is never called. I have put breakpoints in the callback method and the running program never reaches this method:

void webbrowser1_LoadCompleted(object sender, NavigationEventArgs e)
{
    // ...
}

I must be missing a reference, but I do not know which one. Can you offer a suggestion?

By the way, that block above was inserted by the editor on this forum; it is not how my code actually looks.

  • 0

I tested it with the following:


WebBrowser b = new WebBrowser();
b.Loaded += b_Loaded;
b.Navigated += b_Navigated;
b.LoadCompleted += b_LoadCompleted;
b.Navigate("http://microsoft.com");

It seems to fire those three events in that order: Loaded, Navigated, then LoadCompleted. LoadCompleted doesn't fire until the entire page's content is completely downloaded.
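A self-contained version of that test might look like the sketch below. The post does not say how the control was hosted, so placing it in a window's `Content` and running it under an `Application` message loop are assumptions here, as are the names `BrowserWindow` and `Program`. Running under a message loop matters because navigation is asynchronous and its events are delivered on the UI thread.

```csharp
using System;
using System.Diagnostics;
using System.Windows;
using System.Windows.Controls;

public class BrowserWindow : Window
{
    public BrowserWindow()
    {
        WebBrowser b = new WebBrowser();
        // Log each event so the firing order shows up in the debug output.
        b.Loaded += (s, e) => Debug.WriteLine("Loaded");
        b.Navigated += (s, e) => Debug.WriteLine("Navigated: " + e.Uri);
        b.LoadCompleted += (s, e) => Debug.WriteLine("LoadCompleted: " + e.Uri);

        Content = b;  // assumption: host the control in the window's visual tree
        b.Navigate("http://microsoft.com");
    }
}

public static class Program
{
    [STAThread]
    public static void Main()
    {
        // Application.Run starts the message loop that delivers the browser events.
        new Application().Run(new BrowserWindow());
    }
}
```

If a handler never seems to run, one thing worth checking is whether the control was actually added to a visible window and whether the UI thread is free to process messages rather than paused at a breakpoint.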

  • 0

I am close. It works, but at the same time it does not completely come through. When I use a Google search page as a test, all the methods you mention are called, but the HTMLDocument I extract when these methods fire contains the HTML from the Google home page.

Also, if I put breakpoints on these methods and watch the output to see whether the WebBrowser control loads the Google results page, it does not. The page is blank until the program is idle.

What do you think?

  • 0

If I am not mistaken, it is because Google uses AJAX to load div content that shows the results as you type, and LoadCompleted will only give you the HTML document that was originally loaded. If I make AJAX requests that change div content and then view the source in any browser, it just shows my original document.
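If the results really are injected after the initial load, one possible workaround is to keep checking the live DOM for a while after LoadCompleted instead of reading it once. The sketch below is only an illustration under that assumption: `RenderedDom`, `WaitForElement`, the polling interval, and the element id passed in are all hypothetical, and the actual id of the results container would have to be found by inspecting the page.

```csharp
using System;
using System.Windows.Controls;
using System.Windows.Threading;

public static class RenderedDom
{
    // Polls the browser's live DOM on the UI thread until an element with the
    // given id exists or the attempts run out, then passes the current markup
    // (or null if the element never appeared) to onDone.
    public static void WaitForElement(WebBrowser browser, string elementId,
                                      int maxAttempts, Action<string> onDone)
    {
        int attempts = 0;
        DispatcherTimer timer = new DispatcherTimer();
        timer.Interval = TimeSpan.FromMilliseconds(250);
        timer.Tick += (s, e) =>
        {
            attempts++;
            string html = null;
            object doc = browser.Document;
            if (doc != null)
            {
                dynamic d = doc;  // late-bound access to the DOM
                object element = d.getElementById(elementId);
                if (element != null)
                {
                    html = (string)d.documentElement.outerHTML;
                }
            }
            if (html != null || attempts >= maxAttempts)
            {
                timer.Stop();
                onDone(html);
            }
        };
        timer.Start();
    }
}
```

It could be called from a LoadCompleted handler, for example `RenderedDom.WaitForElement(myBrowser, "results", 40, html => { /* use html */ });`, where "results" stands in for whatever id the page actually uses. Polling with a DispatcherTimer keeps the UI thread free, so the browser can continue running the page's scripts between checks.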

This topic is now closed to further replies.