Ramses Posted October 5, 2004
Hi all. What is the best (fastest) way to check for changes in a specified URL? I want to make a small program that checks at specified intervals and displays a pop-up if the URL has changed.
rayjin Posted October 5, 2004
Not sure... but how does a URL change? Doesn't http://www.google.com stay as is, for example?
Ramses (Author) Posted October 5, 2004
I don't mean the URL, but the page that is located at the URL.
Andareed Posted October 5, 2004
Browsers check various content headers for expiration. You could use WinINet. Here is some C; it should be easy to translate it into VB code.

    #include <windows.h>
    #include <wininet.h>   /* link with wininet.lib */

    HINTERNET hInternet, hRequest;
    SYSTEMTIME systemTime;
    DWORD dwLen = sizeof(SYSTEMTIME);
    DWORD dwIndex = 0;

    hInternet = InternetOpen(NULL, INTERNET_OPEN_TYPE_DIRECT, NULL, NULL, 0);
    /* No extra request headers (NULL, 0); the no-cache flag goes in dwFlags. */
    hRequest = InternetOpenUrl(hInternet, "http://www.google.com/", NULL, 0, INTERNET_FLAG_PRAGMA_NOCACHE, 0);
    /* Read the Last-Modified header as a SYSTEMTIME; the call returns FALSE
       if the server does not send that header. */
    HttpQueryInfo(hRequest, HTTP_QUERY_LAST_MODIFIED | HTTP_QUERY_FLAG_SYSTEMTIME, &systemTime, &dwLen, &dwIndex);
    InternetCloseHandle(hRequest);
    InternetCloseHandle(hInternet);
djtaylor Posted October 5, 2004
I assume you understand web requests and responses (the client sends a web request to the server, and the server replies with a web response), and that each consists of a header and (optionally) some content.

After the first web request for the page, there should be a Last-Modified field in the header of the response. Google is a bad example because there are no Last-Modified fields for actual pages, only images. But you get the idea!

First web request:

    GET /images/logo.gif HTTP/1.1
    Accept: */*
    Referer: http://www.google.com
    Accept-Language: en-gb
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
    Host: www.google.com

Response:

    HTTP/1.1 200 OK
    Content-Type: image/gif
    Last-Modified: Thu, 23 Sep 2004 17:42:04 GMT
    Expires: Sun, 17 Jan 2038 19:14:07 GMT
    Server: GWS/2.1
    Content-Length: 8558
    Date: Tue, 05 Oct 2004 12:05:47 GMT

    (then follows the 8558 bytes of the image)

In the next web request for the same URI, include an If-Modified-Since header field, and use the date obtained from the server in the Last-Modified field.

Subsequent requests:

    GET /images/logo.gif HTTP/1.1
    Accept: */*
    Referer: http://www.google.com
    Accept-Language: en-gb
    Accept-Encoding: gzip, deflate
    If-Modified-Since: Thu, 23 Sep 2004 17:42:04 GMT
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
    Host: www.google.com

If the document has not been modified since the date you passed to the server in the If-Modified-Since header field, the server will reply with status code 304:

Response:

    HTTP/1.1 304 Not Modified
    Content-Type: text/html
    Server: GWS/2.1
    Content-Length: 0
    Date: Tue, 05 Oct 2004 12:06:15 GMT

Notice that the content length is 0 (i.e. the document is not sent). The browser then knows that the document has not been modified and can use the cached version.
If the document HAS been modified, however, the server will reply with status code 200, as it did in the first response. If the server does not support this feature, it will simply always reply with status code 200, and there is no way to tell whether the document has been modified (other than by comparing it with the cached version).
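
The exchange described above could be sketched in plain C roughly as follows. This is only an illustration, not a complete HTTP client: it builds the request text and interprets the status line, and leaves the actual network I/O to whatever library you use. The helper names build_request, parse_status and page_changed are made up for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a GET request into buf. Pass
   last_modified = NULL for the first request; otherwise its value is
   sent back to the server in an If-Modified-Since field.
   Returns the number of characters written, or -1 if buf is too small. */
int build_request(char *buf, size_t size, const char *host,
                  const char *path, const char *last_modified)
{
    int n;
    assert(buf != NULL && host != NULL && path != NULL);
    if (last_modified != NULL)
        n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\nHost: %s\r\n"
                     "If-Modified-Since: %s\r\n\r\n",
                     path, host, last_modified);
    else
        n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n",
                     path, host);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Hypothetical helper: extracts the status code from a status line such
   as "HTTP/1.1 304 Not Modified". Returns -1 if it cannot be parsed. */
int parse_status(const char *response)
{
    int major, minor, code;
    assert(response != NULL);
    if (sscanf(response, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}

/* Returns 0 for 304 (not modified), 1 for 200 (document sent), and -1
   for anything else. A 200 in reply to a conditional request only
   proves a change if the server honours If-Modified-Since. */
int page_changed(int status)
{
    if (status == 304) return 0;
    if (status == 200) return 1;
    return -1;
}
```

A polling loop would call build_request with the stored Last-Modified value, send the text over a socket, and pass the first line of the reply to parse_status. On a 200 it would also store the new Last-Modified value for the next round.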
Sn1p3t Posted October 5, 2004
Another way (although much less efficient) is to use a class comparable to the WebClient class in .NET and grab the contents of the page. Compute a hash and compare it against the previous hash.
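
A rough sketch of the hash-and-compare step in C, assuming the page body has already been downloaded into memory by some other code. FNV-1a is used here only because it is short; it is a non-cryptographic hash picked for the example, not something the thread prescribes, and content_changed is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash over a byte buffer. */
uint32_t fnv1a(const unsigned char *data, size_t len)
{
    uint32_t hash = 2166136261u;   /* FNV offset basis */
    size_t i;
    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 16777619u;         /* FNV prime */
    }
    return hash;
}

/* Hypothetical helper: hashes the freshly downloaded body, compares it
   with *previous_hash, stores the new hash there, and returns 1 if the
   two hashes differ (0 otherwise). */
int content_changed(const unsigned char *body, size_t len,
                    uint32_t *previous_hash)
{
    uint32_t current = fnv1a(body, len);
    int changed = (current != *previous_hash);
    assert(previous_hash != NULL);
    *previous_hash = current;
    return changed;
}
```

The caller keeps one uint32_t per watched URL and shows the pop-up whenever content_changed returns 1. Note that the very first check needs a sensible starting value, for example the hash of the first download.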
smurfiness Posted October 5, 2004
The one problem with all of these approaches is that if the page has ANY dynamic content at all, it will always be "updated" since the last view. The forum summary on the Neowin front page is a good example. These are useless if the site has more than 10 visitors a day, but a LOT of sites do it anyway.