  1. #1
    3 Star Lounger
    Join Date
    Jan 2001
    Location
    Sydney, Australia, New South Wales, Australia
    Posts
    251
    Thanks
    0
    Thanked 4 Times in 4 Posts

    MSHTML, VBA and HttpResponse code (IE 6)

    I'm scraping a website using VBA (from Access) and MSHTML (via IE 6), similar to what is covered in the thread starting with post 292415.
    My question is: how do I tell what response code I received? How do I tell whether I got a 200 (OK) or 404 (Not found)? The actual page returned for those codes varies depending on the web server, but I should be able to get the HttpResponse code. My basic code is:

    <pre>Dim objMSHTML As New MSHTML.HTMLDocument
    Dim objDocument As MSHTML.HTMLDocument

    'createDocumentFromUrl is only available with Internet Explorer 5 and later
    Set objDocument = objMSHTML.createDocumentFromUrl(sURL, vbNullString)

    'The transfer is usually asynchronous, so wait here until the document has
    'finished loading. Note that this string might be different if Internet
    'Explorer on the machine running the code is not the English version.
    While objDocument.readyState <> "complete"
        DoEvents
    Wend

    'OK, now we've got the page

    If objDocument.Title = "404 Not Found" Then
        'This is not a robust solution
        'Need to get "objDocument.HttpResponseCode" or similar
        '...
    End If
    </pre>


  2. #2
    Super Moderator jscher2000's Avatar
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Re: MSHTML, VBA and HttpResponse code (IE 6)

    Interesting and tough problem. It appears that you need to do quite a bit of low-level API work to get this information, using several functions before you can request the status line with HttpQueryInfo. Some resources for you:

    Win32 Internet HTTP Functions in Visual Basic, MSDN, Sept. 1996
    FIX: Internet Transfer Control 5.0 Has Bug with "HEAD" Request, MSKB #171271
    HTTP Status Codes (Platform SDK: Windows Internet), MSDN

    I look forward to seeing the solution.
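
    For readers following along, the call sequence described above can be sketched roughly as follows. This is a minimal sketch, not a tested solution: it assumes 32-bit VBA (64-bit Office would need PtrSafe declarations and LongPtr handles), and the function name GetHttpStatus and the agent string are made up for illustration. The WinINet functions and constant values are the ones defined in wininet.h.

    <pre>Private Declare Function InternetOpen Lib "wininet.dll" Alias "InternetOpenA" _
        (ByVal sAgent As String, ByVal lAccessType As Long, ByVal sProxyName As String, _
         ByVal sProxyBypass As String, ByVal lFlags As Long) As Long
    Private Declare Function InternetOpenUrl Lib "wininet.dll" Alias "InternetOpenUrlA" _
        (ByVal hInternet As Long, ByVal sUrl As String, ByVal sHeaders As String, _
         ByVal lHeadersLength As Long, ByVal lFlags As Long, ByVal lContext As Long) As Long
    Private Declare Function HttpQueryInfo Lib "wininet.dll" Alias "HttpQueryInfoA" _
        (ByVal hRequest As Long, ByVal lInfoLevel As Long, ByRef lpBuffer As Any, _
         ByRef lBufferLength As Long, ByRef lIndex As Long) As Long
    Private Declare Function InternetCloseHandle Lib "wininet.dll" _
        (ByVal hInternet As Long) As Long

    Private Const INTERNET_OPEN_TYPE_PRECONFIG As Long = 0
    Private Const INTERNET_FLAG_RELOAD As Long = &H80000000
    Private Const HTTP_QUERY_STATUS_CODE As Long = 19
    Private Const HTTP_QUERY_FLAG_NUMBER As Long = &H20000000

    'Hypothetical helper: returns the HTTP status code for sURL, or -1 on failure
    Public Function GetHttpStatus(ByVal sURL As String) As Long
        Dim hSession As Long, hRequest As Long
        Dim lStatus As Long, lLen As Long, lIndex As Long

        GetHttpStatus = -1
        hSession = InternetOpen("VBA status check", INTERNET_OPEN_TYPE_PRECONFIG, _
                                vbNullString, vbNullString, 0)
        If hSession = 0 Then Exit Function

        hRequest = InternetOpenUrl(hSession, sURL, vbNullString, 0, INTERNET_FLAG_RELOAD, 0)
        If hRequest <> 0 Then
            'With HTTP_QUERY_FLAG_NUMBER the status arrives as a 4-byte number
            lLen = 4
            If HttpQueryInfo(hRequest, HTTP_QUERY_STATUS_CODE Or HTTP_QUERY_FLAG_NUMBER, _
                             lStatus, lLen, lIndex) <> 0 Then
                GetHttpStatus = lStatus
            End If
            InternetCloseHandle hRequest
        End If
        InternetCloseHandle hSession
    End Function
    </pre>

    A call such as GetHttpStatus(sURL) should then give 200 or 404 as a number rather than relying on the page title. As far as I know, WinINet follows redirects by default, so the value would be the status of the final response.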

  3. #3
    3 Star Lounger

    Re: MSHTML, VBA and HttpResponse code (IE 6)

    Oh, that I were using Java or Perl!
    I'll dive into the problem with these leads - thank you very much.
    I'll post the solution when I get it, but it might not be today!
    Peter

  4. #4
    3 Star Lounger

    Re: MSHTML, VBA and HttpResponse code (IE 6)

    The code on the first of your links works, although the MS site is missing the sample application.
    The second link applies to the Internet Transfer Control, which the code doesn't actually need.
    Unfortunately, the InternetReadFile function retrieves the page as text, not as an MSHTML object. So I'd need to either rewrite my system to parse the document itself (too hard), create an MSHTML object from the text (too likely to bomb with bad HTML, and I don't know whether this is possible), or make multiple requests (too much traffic).
    So my problem is now refined to: how do I get the HTTP status code for an MSHTML request?
    I've redefined part of my code so that the effect of a 404 is minimised, so I don't need the answer to this problem right now.
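
    On the second option (creating an MSHTML object from text), one commonly used approach is to assign the text to the body of an empty HTMLDocument. The sketch below assumes a reference to the Microsoft HTML Object Library; DocumentFromHtml is a hypothetical name, and sHtml stands for whatever text InternetReadFile returned. It has not been tried against deliberately bad HTML.

    <pre>'Hypothetical helper: builds an MSHTML document from HTML already held as text
    Public Function DocumentFromHtml(ByVal sHtml As String) As MSHTML.HTMLDocument
        Dim objDoc As MSHTML.HTMLDocument

        Set objDoc = New MSHTML.HTMLDocument
        'The parser is given the text as body content, so no second request is made
        objDoc.body.innerHTML = sHtml
        Set DocumentFromHtml = objDoc
    End Function
    </pre>

    Because the text is parsed as body content, anything that belongs outside the body (the title, for example) may not end up where a normal page load would put it, so code that relies on objDocument.Title would need checking.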

    [philosophy]
    I'm still interested from a curiosity point of view. Why would MS hide (so effectively) this standard piece of information? On the web, headers are almost as important as content (from a program's point of view).
    [/philosophy]

    Thanks for your help.

  5. #5
    Super Moderator jscher2000's Avatar

    Re: MSHTML, VBA and HttpResponse code (IE 6)

    I realize it's terribly cumbersome, but I thought you could use the API calls for the sole purpose of obtaining the header information, but continue to use the rest of your code "as is." As for why it isn't part of the MSHTML document object model, good question!!
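
    One way to read this suggestion: make a separate, cheap request just for the status, then carry on with createDocumentFromUrl unchanged. The sketch below swaps in the MSXML2.XMLHTTP object in place of the raw API calls, which nobody in this thread proposed. It assumes MSXML is installed and that the server answers HEAD requests sensibly; HeadStatus and LoadIfOk are hypothetical names.

    <pre>'Hypothetical helper: asks only for the headers and returns the status code
    Public Function HeadStatus(ByVal sURL As String) As Long
        Dim objHttp As Object

        Set objHttp = CreateObject("MSXML2.XMLHTTP")
        objHttp.Open "HEAD", sURL, False   'False = synchronous
        objHttp.send                       'raises a run-time error if the host is unreachable
        HeadStatus = objHttp.Status
    End Function

    'Hypothetical wrapper: returns Nothing unless the server reported 200
    Public Function LoadIfOk(ByVal sURL As String) As MSHTML.HTMLDocument
        Dim objMSHTML As New MSHTML.HTMLDocument
        Dim objDocument As MSHTML.HTMLDocument

        If HeadStatus(sURL) <> 200 Then Exit Function

        Set objDocument = objMSHTML.createDocumentFromUrl(sURL, vbNullString)
        While objDocument.readyState <> "complete"
            DoEvents
        Wend
        Set LoadIfOk = objDocument
    End Function
    </pre>

    This still costs two requests per page, but the HEAD response carries no body, so the extra traffic should be small. The status could in principle change between the two requests, so this is a check, not a guarantee.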
