  1. #1
    Platinum Lounger
    Join Date
    Dec 2001
    Location
    Melbourne, Australia
    Posts
    4,594
    Thanks
    0
    Thanked 27 Times in 27 Posts

    Screen scraper cont... (IE6)

    I have included a zip file containing 2 source files (racingmain20051120 & racingmain20051120Geelong).

    racingmain20051120 is the first page to go to. From it I would like to extract the meetings and the EventID of each meeting's first race, so I can save the meetings I want into an Access table.

    racingmain20051120Geelong is the page entered next. From it I would like to extract the EventID for each race that relates to the Geelong meeting.

    Can somebody please help?

  2. #2
    Super Moderator jscher2000's Avatar
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Re: Screen scraper cont... (IE6)

    Did you mean to attach some files? The answer depends on the details of your zip file contents.

  3. #3
    Platinum Lounger

    Re: Screen scraper cont... (IE6)

    Sorry about that, here it is.
    Attached Files

  4. #4
    Super Moderator jscher2000's Avatar

    Re: Screen scraper cont... (IE6)

    Your first file contains three empty tables that seem to be the places where this data should be. The ID attributes of the tables are:

    id="template_RacingMeetingList_tabHarness"
    id="template_RacingMeetingList_tabGreyhound"
    id="template_RacingMeetingListFuture_grdAustThor"

    I assume these tables are filled in with JavaScript, and that the data updates as of the time the script is run. Therefore this approach of archiving the raw HTML to a zip file probably is not very useful.

    The second file is more promising. It contains several pieces of text that you could read and interpret. For example, there are a number of links in this format:

    GEELONG

    (None of the other links in this particular file contain the string "GEELONG", but I think you should allow for the possibility that there might be more than one.)

    There also are other places where the eventid appears, including a drop-down control (a <select> element) with individual entries (<option> elements) like this:

    <option value="/site/racing/racingwinplace.aspx?eventid=168021">GEELONG</option>

    If you plan to read these as .txt files rather than loading them in IE and using the document object model, you can use string manipulation methods. For example, let's suppose you wanted to find the eventid in that last <option> element. You could:

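    ' Requires a reference to Microsoft Scripting Runtime (Tools > References)
    Dim fso As New FileSystemObject
    Dim strBigString As String, strEventID As String
    Dim lngEventEnd As Long
    ' strFilePath is a placeholder for the full path to your saved file
    strBigString = fso.OpenTextFile(strFilePath).ReadAll
    ' Position of the last character before the closing quote of the value attribute
    lngEventEnd = InStr(1, strBigString, """>GEELONG</option>") - 1
    If lngEventEnd > 5 Then
        strEventID = Mid(strBigString, lngEventEnd - 5, 6)
        'other stuff
    Else
        MsgBox "Not found"
    End If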

    I obviously haven't tested that (!!) but strEventID should end up being the 6 character eventid as a string.
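
    If you do load the page in IE instead, the same value could be read through the document object model. Here is an equally untested sketch: the name ListEventIDs is made up for illustration, doc is assumed to be the document of a page that has already finished loading, and the value attribute is assumed to look like the <option> sample above.

    ```vba
    ' Sketch only: print the eventid of every <option> whose text matches a meeting name.
    ' doc is assumed to be the document of a page already loaded in IE (late bound).
    Sub ListEventIDs(ByVal doc As Object, ByVal strMeeting As String)
        Dim opt As Object
        Dim lngPos As Long

        For Each opt In doc.getElementsByTagName("option")
            If UCase(Trim(opt.Text)) = UCase(strMeeting) Then
                ' Take everything after "eventid="; if the URL ever carries more
                ' parameters after the eventid, they would be included here too
                lngPos = InStr(1, opt.Value, "eventid=", vbTextCompare)
                If lngPos > 0 Then Debug.Print Mid(opt.Value, lngPos + Len("eventid="))
            End If
        Next opt
    End Sub
    ```

    You would call it with something like ListEventIDs ie.Document, "GEELONG", where ie is whatever variable holds your InternetExplorer object.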

  5. #5
    Super Moderator jscher2000's Avatar

    Re: Screen scraper cont... (IE6)

    As you noted originally, there might be more than one GEELONG race, in which case that precise string might not be found. You might need some pattern-based searching, for example with the RegExp object, to find all of them. That's a big topic; you can pick up some initial tips and samples using a Lounge Search (not limited to IE).
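
    As a starting point for the regular expression route, here is an untested sketch. The function name FindEventIDs is made up for illustration, and the pattern assumes the eventid is the last thing inside the quoted attribute value, as in the <option> sample earlier. It also assumes the meeting name contains no characters that are special in a regular expression.

    ```vba
    ' Sketch only: collect every eventid whose element text starts with a meeting name.
    Function FindEventIDs(ByVal strHtml As String, ByVal strMeeting As String) As Collection
        Dim re As Object, matches As Object, m As Object
        Dim colIDs As New Collection

        Set re = CreateObject("VBScript.RegExp")   ' late bound, so no reference is needed
        re.Global = True        ' return all matches, not just the first
        re.IgnoreCase = True
        ' Capture the digits after eventid=, then skip to the end of the tag
        ' and require the meeting name as the first text that follows
        re.Pattern = "eventid=(\d+)""[^>]*>\s*" & strMeeting

        Set matches = re.Execute(strHtml)
        For Each m In matches
            colIDs.Add m.SubMatches(0)
        Next m

        Set FindEventIDs = colIDs
    End Function
    ```

    Because the pattern does not care which tag it is in, a link and an <option> pointing at the same race would both match, so expect possible duplicates in the returned collection.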

  6. #6
    Platinum Lounger

    Re: Screen scraper cont... (IE6)

    (Edited by patt on 22-Nov-05 07:28. Further explain problem.)

    The first file shows which meetings are being run on that day, e.g. Geelong. I notice the following line, which seems to show that some kind of table follows:
    <div class="userform"><table id="template_RacingMeetingList_tabHorse" width="100%" border="0" cellpadding="0" cellspacing="0"><tr>

    Following that line are the entries which show the EventID for the first race of each meeting:
    <td width="25%" style="height:21px;;">[img]/images/1/en/misc/flags/flag_AU.gif[/img]

  7. #7
    Platinum Lounger

    Re: Screen scraper cont... (IE6)

    Are you saying it cannot be done using the Document Object model?

  8. #8
    Super Moderator jscher2000's Avatar

    Re: Screen scraper cont... (IE6)

    I'm sure it can be done using the DOM in the same manner as the earlier project. However, I've been too busy to work on this (co-workers out on vacation).

  9. #9
    Platinum Lounger

    Re: Screen scraper cont... (IE6)

    I can wait; I understand you must be a very busy man, based upon your bio. Actually, I will have a go at the old code I have and see what I can do with it.

  10. #10
    Platinum Lounger

    Re: Screen scraper cont... (IE6)

    In the interim I have decided to extract the source of the web page and analyse that; it works fine.

  11. #11
    Platinum Lounger

    Re: Screen scraper cont... (IE6)

    I have decided to save the source and process the .txt file for the event numbers I need. I can read the base IE screen, which holds the start event IDs I need.

    What I want to know is how do I save the source of the base IE screen I have opened?

  12. #12
    Super Moderator jscher2000's Avatar

    Re: Screen scraper cont... (IE6)

    Use HTMLDocument.documentElement.innerHTML and save it to disk using either a disk access method, FileSystemObject, or Word. The attached document illustrates how to get the HTML code into the body of a Word document. It then uses SaveAs to save it as plain text.

    Hmmm... I just noticed that the HTML comes out formatted a bit differently in the documentElement.innerHTML version versus the original (as saved from the browser). I suppose as long as you consistently use one or the other, it should be okay.
    Attached Files

  13. #13
    Platinum Lounger

    Re: Screen scraper cont... (IE6)

    Please excuse my ignorance, but how do you use "the HTMLDocument.documentElement.innerHTML and save it to disk using either a disk access method"? Have you got a code example?

    The attached appears to me to be a word document with nothing in it.

  14. #14
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Screen scraper cont... (IE6)

    Jefferson's attachment contains a module with sample VBA code.
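
    For the FileSystemObject route mentioned earlier in the thread, a minimal sketch might look like the following. It is not the code from the attachment: the procedure name SavePageSource and the strFilePath parameter are made up for illustration, and doc is assumed to be the document of the page already open in IE. Everything is late bound, so no extra references should be needed.

    ```vba
    ' Sketch only: write the current HTML of a loaded IE document to a text file.
    Sub SavePageSource(ByVal doc As Object, ByVal strFilePath As String)
        Dim fso As Object, ts As Object

        Set fso = CreateObject("Scripting.FileSystemObject")
        ' Second argument True = overwrite an existing file,
        ' third argument True = write the file as Unicode
        Set ts = fso.CreateTextFile(strFilePath, True, True)
        ts.Write doc.documentElement.innerHTML
        ts.Close
    End Sub
    ```

    Because the file is written as Unicode, it would need to be read back the same way, for example with fso.OpenTextFile(strFilePath, 1, False, -1). Nothing in the sketch depends on Word, so it should work from any Office VBA host.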

  15. #15
    Platinum Lounger

    Re: Screen scraper cont... (IE6)

    Thanks Jefferson and Hans,
    I have never been into VBA in Word before. Until just now I had no idea how to get to VBA in Word.

    Can this code or something similar be used from Access?
