  1. #1
    Uranium Lounger
    Join Date
    Dec 2000
    Location
    Salt Lake City, Utah, USA
    Posts
    9,508
    Thanks
    0
    Thanked 6 Times in 6 Posts

    Parsing PDF source text files into tables (All?)

    I
    -John ... I float in liquid gardens
    UTC -7±DS

  2. #2
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Parsing PDF source text files into tables (All?)

    To find the position of the last alphabetic character, you can use this function (either in worksheet formulas or in VBA code):

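    Function LastAlphaPos(varValue As Variant) As Integer
        Dim i As Integer
        ' Scan from the end of the text back toward the start.
        For i = Len(varValue) To 1 Step -1
            ' Naive test: any character code above 64 ("@") counts as alphabetic.
            If Asc(Mid(varValue, i, 1)) > 64 Then
                LastAlphaPos = i
                Exit For
            End If
        Next i
        ' Returns 0 if no such character is found.
    End Function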

    This is 'naive': any character with an ASCII code above 64 counts as 'alphabetic', and everything else does not. That lets through some non-letters such as [, ], _ and ~. You can refine the test if you like.
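
    One possible refinement, as a minimal sketch: count only the ASCII letters A-Z and a-z as alphabetic. The name LastLetterPos is made up for illustration, and the sketch assumes the input is plain text or a single cell (not an error value or Null).

```vba
' Hypothetical refinement of LastAlphaPos (the name is an assumption).
' Counts only the ASCII letters A-Z and a-z as alphabetic.
Function LastLetterPos(varValue As Variant) As Long
    Dim strVal As String
    Dim i As Long

    ' Assumes the value converts cleanly to text (no error values or Null).
    strVal = CStr(varValue)

    ' Scan from the end of the text back toward the start.
    For i = Len(strVal) To 1 Step -1
        Select Case Asc(Mid$(strVal, i, 1))
            Case 65 To 90, 97 To 122   ' "A" to "Z", "a" to "z"
                LastLetterPos = i
                Exit For
        End Select
    Next i
    ' Falls through with 0 when the text contains no letters.
End Function
```

    For the text "Widget [A] 12.50", LastAlphaPos returns 10 (the "]"), whereas this version returns 9 (the "A"). In a worksheet it can be called as =LastLetterPos(A1).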

  3. #3
    Plutonium Lounger
    Join Date
    Mar 2002
    Posts
    84,353
    Thanks
    0
    Thanked 29 Times in 29 Posts

    Re: Parsing PDF source text files into tables (All?)

    Always glad to put somebody over the hump! (grin)

  4. #4
    Uranium Lounger
    Join Date
    Dec 2000
    Location
    Salt Lake City, Utah, USA
    Posts
    9,508
    Thanks
    0
    Thanked 6 Times in 6 Posts

    Re: Parsing PDF source text files into tables (All?)

    Thanks, that puts me over the hump! I can wrap a loop that converts the selected cells around this expansion of your function, and then add the Text-to-Columns step:

    Function DelimiterAfterLastAlpha(rngCell As Range) As String
        Dim i As Integer, intLen As Integer
        Dim strVal As String
        strVal = rngCell.Value
        intLen = Len(strVal)
        For i = intLen To 1 Step -1
            If Asc(Mid(strVal, i)) > 64 Then Exit For
        Next i
        If i > 1 Then
            DelimiterAfterLastAlpha = Left(strVal, i) & "
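
    The plan described above (loop over the selected cells, then Text-to-Columns) might look roughly like the sketch below. This is not the code from the post: the listing above is cut off before its delimiter appears, so the "|" delimiter and the procedure name are assumptions. It also assumes the LastAlphaPos function from post #2 is in the same module, that the selection is a single column of text cells containing at least one value, and that the columns to its right are free to be overwritten.

```vba
' Sketch only: the "|" delimiter and the procedure name are assumptions.
' Relies on LastAlphaPos from post #2 being available in the same module.
Sub SplitSelectionAfterLastAlpha()
    Const DELIM As String = "|"
    Dim rngCell As Range
    Dim strVal As String
    Dim i As Long

    For Each rngCell In Selection.Cells
        ' Assumes text or numeric cells; error values would fail in CStr.
        strVal = CStr(rngCell.Value)
        i = LastAlphaPos(strVal)
        ' Insert the delimiter only when something follows the last
        ' "alphabetic" character.
        If i > 0 And i < Len(strVal) Then
            rngCell.Value = Left$(strVal, i) & DELIM & Mid$(strVal, i + 1)
        End If
    Next rngCell

    ' Split on the delimiter. TextToColumns handles one column at a time
    ' and writes the results into the columns to the right.
    Selection.TextToColumns Destination:=Selection.Cells(1, 1), _
        DataType:=xlDelimited, ConsecutiveDelimiter:=False, _
        Tab:=False, Semicolon:=False, Comma:=False, Space:=False, _
        Other:=True, OtherChar:=DELIM
End Sub
```

    Pick a delimiter that cannot occur in the source text. Any further splitting of the numeric part (for example on spaces) could be done with a second Text-to-Columns pass.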

  5. #5
    Uranium Lounger
    Join Date
    Dec 2000
    Location
    Salt Lake City, Utah, USA
    Posts
    9,508
    Thanks
    0
    Thanked 6 Times in 6 Posts

    Re: Parsing PDF source text files into tables (All?)

    Stupid keyboard! (laugh)

    Attached is my work in progress, for anyone who should need something similar.

  6. #6
    3 Star Lounger
    Join Date
    May 2002
    Location
    Mpls, Minnesota, USA
    Posts
    271
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Parsing PDF source text files into tables (All?)

    I came across this thread while looking for help with a similar problem.
    The PDF I was working with was a financial statement and did not have text at the beginning of each entry.
    I decided to try printing the PDF and scanning the printed pages with OCR, saving the result in RTF format.
    When I opened the RTF document in Word, it was laid out as a very nice table that was easy to copy and paste into Excel.
    As OCR is not perfect, there were a few errors that were easy to spot and correct.
    I thought I would offer it as a possible method for the future.

    Chuck
    Chuck Reimer
    I'm from the Government and I'm here to help...

  7. #7
    5 Star Lounger jujuraf
    Join Date
    Jun 2001
    Location
    San Jose, California, USA
    Posts
    1,061
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Parsing PDF source text files into tables (All?)

    I've successfully used a product called Able2Extract (http://www.investintech.com/pdftoexcel.html) for just this very thing. It does a pretty good job of grabbing tabular data from a PDF file (and other files).

    Since this is something you do often, it might be worth it to get this tool.

    Deb
