• 0

WebRequest and WebResponse has issues


Question

I wrote a C# program that uses WebRequest and WebResponse as a simple web crawler, and I discovered something about web sites. Browsers such as IE and Firefox let you view the HTML source, but the HTML that is sent to the browser is one thing and what the browser interprets and displays is something else. For example, if you run the same Google search in IE and in Firefox, View Source in IE will NOT show the hyperlinks and content of the search results, while View Source in Firefox does. So my question is this: how do you get WebRequest and WebResponse to return the content after the browser has processed it instead of before?

10 answers to this question

  • 0

It's not an 'issue'; it's by design. Executing client-side scripts is not the responsibility of WebRequest/WebResponse. You get back only the original HTML document, and nothing loaded later by XHR requests is included. I believe that if you want the HTML after a client-side script has modified it, you need a full web browser control rather than a WebRequest.
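To make the point above concrete, here is a minimal sketch of what a WebRequest-based fetch returns. The class and method names (RawHtmlFetcher, FetchRawHtml) are made up for illustration and are not from the original program. The string it returns is the document exactly as the server sent it, with no scripts run.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper, for illustration only.
public static class RawHtmlFetcher
{
    // Returns the response body as sent by the server. No JavaScript is
    // executed, so content added later by scripts or XHR will be missing.
    public static string FetchRawHtml(string url)
    {
        WebRequest request = WebRequest.Create(url);
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling RawHtmlFetcher.FetchRawHtml("http://www.google.com") should give roughly what "View Source" shows for the initial document, which is why a crawler built this way never sees script-generated results.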

  • 0

I think I will have to use the WebBrowser class instead.

How do I expose the LoadCompleted method in the WebBrowser class in WPF C#?

I am trying to write a C# program in WPF that retrieves the content of a web page.

The first thing I tried was the WebRequest and WebResponse classes. This did not provide the content that is actually displayed. WebResponse reveals the HTML code that is sent to the browser, but I discovered that, while the browser is loading the page, JavaScript can change what content is finally displayed.

So I decided to use the WebBrowser class.

Immediately I found that there are two WebBrowser classes: one documented for WinForms and another documented for WPF. I need to understand the WPF one. What I think I need to do is retrieve the page's HTML after the LoadCompleted event fires, but I do not know how to do this and I cannot find an example that demonstrates it.

  • 0

In whatever class you're hosting the control in (Page, Window, etc) you need to add a handler. You can either put it in the class's initialization routine as

myBrowser.LoadCompleted += WebBrowser_LoadCompleted;

or put it in the XAML in the WebBrowser declaration.

<WebBrowser Name="myBrowser" LoadCompleted="WebBrowser_LoadCompleted"/>
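For reference, here is a rough sketch of what the handler itself might look like once it is hooked up in code. It assumes the control is hosted in a window built without XAML. BrowserWindow and RenderedHtml are illustrative names, and reading the DOM through `dynamic` needs a reference to Microsoft.CSharp. Attach the handler either in XAML or in code, not both, or it will run twice.

```csharp
using System;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Navigation;

// Hypothetical host window, for illustration only.
public class BrowserWindow : Window
{
    private readonly WebBrowser myBrowser = new WebBrowser();

    // Holds the DOM snapshot taken when LoadCompleted fires.
    public string RenderedHtml { get; private set; }

    public BrowserWindow(Uri address)
    {
        Content = myBrowser; // put the control in the window's visual tree
        myBrowser.LoadCompleted += WebBrowser_LoadCompleted;
        // Navigating after the window has loaded is a cautious choice here.
        Loaded += (sender, e) => myBrowser.Navigate(address);
    }

    private void WebBrowser_LoadCompleted(object sender, NavigationEventArgs e)
    {
        // Document is the underlying COM HTML document; dynamic gives
        // late-bound access to its members.
        dynamic document = myBrowser.Document;
        if (document != null)
        {
            RenderedHtml = document.documentElement.outerHTML;
        }
    }
}
```

From the UI thread of a running WPF application this could be used as `new BrowserWindow(new Uri("http://www.google.com")).Show();`. The snapshot reflects the DOM at the moment LoadCompleted fires, so anything scripts add afterwards would not be in it.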

  • 0

I am getting close to solving this and having a working bit of code. As things stand right now, the callback for LoadCompleted is not called when I step through the code.

Why doesn't this callback get called?

Basically, here is the code surrounding the declaration of the callback method:

webbrowser1 = new WebBrowser();
webbrowser1.LoadCompleted += webbrowser1_LoadCompleted;
webbrowser1.Navigate(new Uri("http://www.google.com"));

Should there be something more or are they in the wrong order?

The method, webbrowser1_LoadCompleted, is never called. I have put breakpoints in the callback method and the running program never reaches this method:

void webbrowser1_LoadCompleted(object sender, NavigationEventArgs e)
{
    .
    .
    .
}

I must be missing a reference, but I do not know which one. Can you offer a suggestion?

By the way, that block above is placed there by the editor on this forum; it is not how my code actually looks.

  • 0

I tested it with the following:


WebBrowser b = new WebBrowser();
b.Loaded += b_Loaded;
b.Navigated += b_Navigated;
b.LoadCompleted += b_LoadCompleted;
b.Navigate("http://microsoft.com");

It seems to fire those three events in that order: Loaded, Navigated, then LoadCompleted. LoadCompleted doesn't fire until the entire page's content is completely downloaded.

  • 0

I am close. It works, but it does not completely come through. When I use a Google search page as a test, all the events you mention fire, but the HTMLDocument I extract when they fire contains the HTML of Google's home page.

Also, if I put breakpoints in these handlers and watch the output to see whether the WebBrowser control loads the Google results page, it does not. The page is blank until the program is idle.

What do you think?

  • 0

If I am not mistaken, it is because Google uses AJAX to load the div content that shows the results as you type, and LoadCompleted only gives you the HTML document that was originally loaded. If I make AJAX requests that change div content and then view source in any browser, it just shows my original document.
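If the results really are filled in by AJAX after the initial load, one possible workaround is to keep checking the live DOM for a while after LoadCompleted instead of reading it once. The sketch below is an assumption-laden illustration, not a tested fix for Google specifically. DomPoller and WaitForElement are made-up names, the element id is whatever container you expect the script to create, and the half-second interval is arbitrary.

```csharp
using System;
using System.Windows.Controls;
using System.Windows.Threading;

// Hypothetical helper, for illustration only. Call it from the UI thread.
public static class DomPoller
{
    // Polls the browser's DOM until an element with the given id exists or
    // the timeout passes, then hands the current outer HTML (or null on
    // timeout) to onReady.
    public static void WaitForElement(WebBrowser browser, string elementId,
                                      TimeSpan timeout, Action<string> onReady)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        var timer = new DispatcherTimer { Interval = TimeSpan.FromMilliseconds(500) };

        timer.Tick += (sender, e) =>
        {
            bool found = false;
            dynamic document = browser.Document; // late-bound COM document
            if (document != null)
            {
                object element = document.getElementById(elementId);
                found = element != null && !(element is DBNull);
            }

            if (found || DateTime.UtcNow >= deadline)
            {
                timer.Stop();
                onReady(found ? (string)document.documentElement.outerHTML : null);
            }
        };

        timer.Start();
    }
}
```

It could be started from the LoadCompleted handler, for example `DomPoller.WaitForElement(b, "results", TimeSpan.FromSeconds(10), html => { /* use html */ });`, where "results" is a placeholder id. Because the timer ticks on the UI dispatcher, the page only keeps loading while the program is not stopped at a breakpoint.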

This topic is now closed to further replies.