  1. #1
    Bronze Lounger
    Join Date
    Jan 2001
    Posts
    1,418
    Thanks
    1
    Thanked 0 Times in 0 Posts

    .pdf to Word (2000)

    Problem: Whenever I copy a .pdf document over to Word, whichever file type I select to copy to, a hard return is inserted at the end of each line of text. What I would like is for the text to be copied in the same format as the original document, but without the hard returns at the end of each line (they should appear only at the end of each paragraph).

    Any ideas?

    Thanks in advance,

    Jeff

  2. #2
    Super Moderator
    Join Date
    May 2002
    Location
    Canberra, Australian Capital Territory, Australia
    Posts
    5,055
    Thanks
    2
    Thanked 417 Times in 346 Posts

    Re: .pdf to Word (2000)

    Hi Jeff,

    I think you can do that with the full version of Adobe Acrobat, and with some other PDF conversion packages. Alternatively, you could scan and OCR the document: any decent OCR package can tell which lines are part of the same paragraph (most of the time, anyway).

    Cheers,

    Paul Edstein
    [MS MVP - Word]

  3. #3
    Super Moderator jscher2000
    Join Date
    Feb 2001
    Location
    Silicon Valley, USA
    Posts
    23,112
    Thanks
    5
    Thanked 93 Times in 89 Posts

    Re: .pdf to Word (2000)

    I don't use AutoFormat, but I think it has an option to "unwrap" text broken with hard returns after the fact. Alternatively, you could write a macro that unwraps it from the clipboard. I have something similar I wrote a while ago, for use with text from e-mail messages:

    <pre>Sub wrapSoft()
        'Requires a reference to Microsoft Forms 2.0 Object Library
        Dim doClip As DataObject, strTemp As String
        Dim strClipParas() As String
        Dim k As Integer, newClip As String

        Set doClip = New DataObject 'instantiate DataObject
        doClip.GetFromClipboard 'retrieve clipboard
        On Error GoTo invalidFormat
        'break up individual paragraphs (looking for two vbCrLfs in a row)
        strClipParas = Split(doClip.GetText(1), vbCrLf & vbCrLf)
        On Error GoTo 0
        If UBound(strClipParas()) = 0 And strClipParas(0) = vbNullString Then
            MsgBox "Clipboard appears to be empty. Copy the text to parse and " & _
                "try again.", vbCritical
            GoTo unloadObj
        End If

        newClip = vbNullString
        For k = 0 To UBound(strClipParas())
            'remove those infernal > symbols, if possible, from beginnings of lines
            strTemp = WildReplace(strClipParas(k), "\r\n>*", vbCrLf)
            'clear all paras to spaces
            strTemp = Replace(strTemp, vbCrLf, " ")
            'collapse all white space (space, tab, vertical tab) to single spaces
            strTemp = WildReplace(strTemp, "[ \t\v]{1,}", " ")
            'try to resurrect double spaces after periods
            '(case-sensitive so that [A-Z] matches capital letters only)
            strTemp = WildReplace(strTemp, "([.?!] )([A-Z])", "$1 $2", True, True)
            'append to new clipboard string
            newClip = newClip & strTemp
            If k < UBound(strClipParas()) Then
                newClip = newClip & vbCrLf & vbCrLf
            End If
        Next k

        doClip.SetText newClip, 1
        doClip.PutInClipboard
        GoTo unloadObj

    invalidFormat:
        MsgBox "Clipboard appears to contain a non-text object. Copy the text " & _
            "to parse and try again.", vbCritical
    unloadObj:
        Set doClip = Nothing 'eliminate unneeded object
    End Sub

    Function WildReplace(strExpression As String, strFind As String, _
            strReplace As String, Optional bolReplaceAll As Boolean = True, _
            Optional bolCaseSensitive As Boolean = False) As String
        'Requires VBScript 5 = IE 5.x or greater
        'perform minimal parameter checking
        If (strExpression = vbNullString) Or (strFind = vbNullString) Then
            WildReplace = strExpression
            Exit Function
        End If
        'instantiate RegExp (regular expressions) object
        Dim objRegExp As Object
        Set objRegExp = CreateObject("vbscript.regexp")
        objRegExp.IgnoreCase = Not bolCaseSensitive
        objRegExp.Global = bolReplaceAll
        objRegExp.Pattern = strFind
        WildReplace = objRegExp.Replace(strExpression, strReplace)
    End Function</pre>

    You probably can simplify this. Looking at it now, I'm not sure that whatever I was trying to clean up is going to be an issue for PDF pastes...
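
    For the PDF case, a simpler version might work directly on the selection in Word instead of the clipboard. This is only a rough sketch, not tested against real PDF pastes. It assumes that a blank line (two hard returns in a row) marks a real paragraph end and that every other hard return is just a line wrap. If your PDF pastes without blank lines between paragraphs, this heuristic will merge everything into one paragraph. The names UnwrapText and UnwrapSelection and the placeholder string are made up for illustration.

    <pre>Option Explicit

    'Joins hard-wrapped lines into paragraphs.
    'Assumption: two consecutive breaks mark a real paragraph end.
    Function UnwrapText(ByVal strIn As String) As String
        'hypothetical placeholder, assumed not to occur in the text
        Const PARA_MARK As String = "<<PARA>>"
        Dim strOut As String
        'normalize all line endings to vbCr (Word's paragraph mark)
        strOut = Replace(strIn, vbCrLf, vbCr)
        strOut = Replace(strOut, vbLf, vbCr)
        'protect the real paragraph breaks
        strOut = Replace(strOut, vbCr & vbCr, PARA_MARK)
        'whatever breaks remain are treated as line wraps
        strOut = Replace(strOut, vbCr, " ")
        'collapse runs of spaces to single spaces
        Do While InStr(strOut, "  ") > 0
            strOut = Replace(strOut, "  ", " ")
        Loop
        'restore one paragraph mark per protected break
        UnwrapText = Replace(strOut, PARA_MARK, vbCr)
    End Function

    Sub UnwrapSelection()
        'run this with the pasted text selected in Word
        If Selection.Type <> wdSelectionNormal Then
            MsgBox "Select the pasted text first.", vbExclamation
            Exit Sub
        End If
        'note: assigning Selection.Text replaces the selection with plain text
        Selection.Text = UnwrapText(Selection.Text)
    End Sub</pre>

    Words hyphenated across line ends are not rejoined, and any character formatting in the selection is lost, so try it on a copy of the document first.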

  4. #4
    Super Moderator jscher2000

    Re: .pdf to Word (2000)

    As long as we're at it (grin), Microsoft's Office Marketplace is promoting a third-party PDF-to-Word tool that imports PDFs into Word for only $49.95.

    SolidConverter PDF from VoyagerSoft, LLC

    Not sure if you would use this often enough to justify the price.

  5. #5
    Bronze Lounger

    Re: .pdf to Word (2000)

    Where is this "unwrap" option?

  6. #6
    Super Moderator jscher2000

    Re: .pdf to Word (2000)

    I don't use AutoFormat, so I don't know where it is. Sorry. Maybe someone who uses it can help.
