Page 1 of 2 12 LastLast
Results 1 to 15 of 25
  1. #1
    New Lounger RedWagnum's Avatar
    Join Date
    Sep 2016
    Location
    San Antonio, TX
    Posts
    21
    Thanks
    3
    Thanked 0 Times in 0 Posts

    Question Parsing Text files

    Okay folks, I'm at it again. The short version: I need to read some text files, parse the data, temporarily store it to display for review and corrections, then dump it into a table.

    The Long version: I have several hundred text files that I have accumulated over the years. These files contain contact and other information. Since the files were created over a number of years the format may vary slightly from file to file. The structure is generally something like this:

    Code:
    John Doe
    
    AKA: Johnny, Doe-man
    
    ID: JODO83MXT
    
    Born [dd-mm-yy]: 03-11-1969
    Birthplace: Dallas, TX
    First Contact: 2010 [or Year Met:]
    Last Contact: 2015 [or Most Recently:]
    Where: Austin, TX [or Location:]
    Height: 5'9"
    Weight: 190
    Hair Color: Brown
    Build: Medium
    Activities: Cycling, Running
    Class: 87
    
    Categories: BA, SU
    The first three (non-blank) lines are always formatted as shown. After that the field names may differ from what is shown. I'm pretty sure the order is the same though. A few files may have blank lines between each field. Categories, if it exists, is always preceded with a blank line. There is other data after all this that I don't care about (presently)

    I have for a module to select a file with. I think I can handle the parsing bit using SELECT CASE to look for the "field names" and extract the relevant data using VBA's string functions. Not sure what to do after that. What I want is to display the data in some way so that I can correct/verify it before committing it to the main data table. Displaying it for editing in a form with with a "Cancel" and "Commit" button would be ideal I think.

    Lastly, should the code be created as a function, subroutine, macro, or module? I'm missing some concepts here, that's for sure!

    Looking for advice . . .

  2. #2
    WS Lounge VIP access-mdb's Avatar
    Join Date
    Dec 2009
    Location
    Oxfordshire, UK
    Posts
    1,720
    Thanks
    146
    Thanked 156 Times in 149 Posts
    I have a routine for parsing files with data from all my photographs (EXIF, ITTC etc). These are exported into a text file (with data from many photos). I use a function which looks for specific strings - just like you appear to do. The function puts the data into a temporary table (after clearing it of data) so I can eyeball it and correct any errors and supply missing information. Once I'm happy I use a query to move the data into the final table. The function and query are run from buttons on my home screen (aka switchboard). This may well be something like you would want to do. My only comment is to limit the number of records in the temporary table as this will help you in looking for problems (thousands of records is not a good idea in these cases!)
    Talk is cheap because supply exceeds demand

  3. #3
    New Lounger RedWagnum's Avatar
    Join Date
    Sep 2016
    Location
    San Antonio, TX
    Posts
    21
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Okay, started some work on this but I'm having another "You're not doing it right" moment. Why are the Select Cases falling through?

    Code:
        f = FreeFile
        Open strFile For Input As #f
        Line Input #f, strLine
        rst.AddNew
        rst!tName = strLine
    '    rst.Update
        
        Do While Not EOF(f)
            Line Input #f, strLine
            If Left(strLine, 4) = "AKA:" Then Debug.Print Trim(Mid(strLine, 5, 250))
            
            Select Case strLine
                Case Left(strLine, 4) = "AKA:"
                    rst!tAKA = Trim(Mid(strLine, 5, 250))
                Case Left(strLine, 3) = "ID:"
                    rst!tID = Trim(Mid(strLine, 4, 50))
                Case Left(strLine, 5) = "Born:"
                    rst!tBorn = Trim(Mid(strLine, 6, 50))
                Case Left(strLine, 11) = "Birthplace:"
                    rst!tBirthplace = Trim(Mid(strLine, 12, 50))
                Case Left(strLine, 14) = "First Contact:"
                    rst!tFirstContact = Trim(Mid(strLine, 15, 50))
                Case Left(strLine, 9) = "Year Met:"
                    rst!tFirstContact = Trim(Mid(strLine, 10, 50))
                Case Left(strLine, 13) = "Last Contact:"
                    rst!tLastContact = Trim(Mid(strLine, 14, 50))
                Case Left(strLine, 14) = "Most Recently:"
                    rst!tLastContact = Trim(Mid(strLine, 15, 50))
                Case Left(strLine, 7) = "Where:"
                    rst!tWhere = Trim(Mid(strLine, 8, 50))
                Case Left(strLine, 9) = "Location:"
                    rst!tWhere = Trim(Mid(strLine, 10, 50))
                Case Left(strLine, 7) = "Height:"
                    rst!tHeight = Trim(Mid(strLine, 8, 50))
                Case Left(strLine, 7) = "Weight:"
                    rst!tWeight = Trim(Mid(strLine, 8, 50))
                Case Left(strLine, 11) = "Hair Color:"
                    rst!tHairColor = Trim(Mid(strLine, 12, 50))
                Case Left(strLine, 6) = "Build:"
                    rst!tBuild = Trim(Mid(strLine, 7, 50))
                Case Left(strLine, 7) = "Height:"
                    rst!tHeight = Trim(Mid(strLine, 8, 50))
                Case Left(strLine, 11) = "Activities:"
                    rst!tActivities = Trim(Mid(strLine, 12, 50))
                Case Left(strLine, 6) = "Class:"
                    rst!tClass = Trim(Mid(strLine, 7, 50))
                Case Left(strLine, 11) = "Categories:"
                    rst!tCategories = Trim(Mid(strLine, 12, 50))
            End Select
        Loop
    Assume the data as posted in the first post. The first Line Input reads the name. When we hit the Loop we read the next line in the file ("") which is a no match and we loop again. Next input is "AKA: Johnny, Doe-man" and the If...Then confirms this by printing "Johnny, Doe-man" in the immediate window. However the first Case which uses the same expression as the If...Then falls through, as do all the others when their appropriate line comes up. Wut?? And please ignore the "rst" lines. Still learning about how to use recordsets.

    Thanks!

  4. #4
    New Lounger RedWagnum's Avatar
    Join Date
    Sep 2016
    Location
    San Antonio, TX
    Posts
    21
    Thanks
    3
    Thanked 0 Times in 0 Posts
    After poking around testing, reading the help files, having some dinner, and testing some more I came up with this:
    Code:
    Option Compare Database
    Option Explicit
    
    Private Sub ImportTXT_Click()
        Dim strFile As String
        Dim f As Integer
        Dim dbs As DAO.Database
        Dim rst As DAO.Recordset
        Dim strLine As String
        Dim strValue As String
        Dim strTest As String
        Dim x As Integer
        
        On Error GoTo ErrHandler
    
        With Application.FileDialog(msoFileDialogOpen)
            .Filters.Clear
            .Filters.Add "Text files", "*.txt"
            If Not .Show = True Then
                Exit Sub
            End If
            strFile = .SelectedItems(1)
        End With
    
        Set dbs = CurrentDb
        Set rst = dbs.OpenRecordset("tblTmpData", dbOpenDynaset)
        
        '  We are going to be using the Edit method
        rst.Edit
    
        f = FreeFile
        Open strFile For Input As #f
        
        Line Input #f, strLine
        
        '  The first line is always the name
        rst!tName = strLine
        
        Do While Not EOF(f)
            Line Input #f, strLine
            
            '  Look for a colon
            x = InStr(1, strLine, ":", vbTextCompare)
            '  If x>1 (found) the chop up the string else we skip the Select
            If x > 1 Then
                strTest = Left(strLine, x)                  '  get everything up and including the colon
                strValue = Trim(Mid(strLine, x + 1, 250))   '  get everything after the colon, trim spaces
                ' Debug.Print "|" & strTest & "|" & strValue & "|"      ' for testing
    
                Select Case strTest
                    Case "AKA:"                             '  look for our "keywords"
                        rst!tAKA = strValue                 '     and assign their values to the temp fields
                    Case "ID:"
                        rst!tID = strValue
                    Case "Born [dd-mm-yy]:"
                        rst!tBorn = strValue
                    Case "Born [dd-mm-yyyy]:"
                        rst!tBorn = strValue
                    Case "Birthplace:"
                        rst!tBirthplace = strValue
                    Case "First Contact:"
                        rst!tFirstContact = strValue
                    Case "Year Met:"
                        rst!tFirstContact = strValue
                    Case "Last Contact:"
                        rst!tLastContact = strValue
                    Case "Most Recently:"
                        rst!tLastContact = strValue
                    Case "Where:"
                        rst!tWhere = strValue
                    Case "Location:"
                        rst!tWhere = strValue
                    Case "Height:"
                        rst!tHeight = strValue
                    Case "Weight:"
                        rst!tWeight = strValue
                    Case "Hair Color:"
                        rst!tHairColor = strValue
                    Case "Build:"
                        rst!tBuild = strValue
                    Case "Height:"
                        rst!tHeight = strValue
                    Case "Activities:"
                        rst!tActivities = strValue
                    Case "Class:"
                        rst!tClass = strValue
                    Case "Categories:"
                        rst!tCategories = strValue
                End Select
                
            End If                                      '  we jump to here if no colon was found
            
            '  Debug.Print strLine                      '  for testing
        Loop
        
        rst.Update                                      '  update the record
    
    ExitHandler:
      On Error Resume Next
      Close #f                                          '  clean up
      rst.Close
      Set rst = Nothing
      Set dbs = Nothing
      
      DoCmd.OpenForm "frmTmpData", , , "[tempID] = 26"
      
      
      Exit Sub
    
    ErrHandler:
      MsgBox Err.Description, vbExclamation
      Resume ExitHandler
    
    End Sub
    Everything seems to be working at the moment. The data is being read, parsed, and put into the proper fields in a temp table. I used code by HansV in this thread as a basis for the recordset handling. Now I need help with the next part: moving the data into the main table after I have verified it. I'll still need to do a little manipulation in code, like converting the Born data from a string to a date format. Someone want to help me with the move part?

    Thanks to all who help!

  5. #5
    New Lounger RedWagnum's Avatar
    Join Date
    Sep 2016
    Location
    San Antonio, TX
    Posts
    21
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Mmmmm... Logic problem.

    While the code posted above seems to be working there are several issues.

    1. There should be 15 data fields, as shown in the data sample in the first post. However, not all files have the last field (Categories).
    2. There could be errors in the field names, eg: missing colon, misspelling, different variations of wording, etc. and a field could be missing (highly unlikely).
    3. There is a substantial (1kb to 5kb) amount of additional data after the last field (be that Class or Categories) that I want to collect and dump into a Memo field.
    4. There could be additional field name matches in this additional data.

    Because of issues 3 & 4, allowing the code to continue to loop through the Select function after the last field would introduce erroneous data. For issue 1 I thought about using a simple counter that is incremented as each field is found and if the total is 14 or 15 then bypass the Select function and continue data collection for the Memo. However if issue 2 exists this will not work. Also thought about using a flag that is set if "Class" or "Categories" but I can't see a way to make that work either. If Class is found, flag is set, Select is bypassed and Categories, if it exists, will never be found. In fact this method is just plain bad now that I think about it. Any field that I use for a flag could jack everything up if is out of order!

    So, what I'm looking to do is read the data until all 14 fields + the optional Categories have been found, then skip the Select loop to get the rest of the data for the Memo field, else stop if all 14 + 1 were not found,

    Does any of that make sense??? Spock I am not! Maybe I am over thinking this. I hate being anal about dummy-proofing and error checking. There comes a point where all the possibilities will consume all available resources.

    Live Long and Prosper!

  6. #6
    WS Lounge VIP access-mdb's Avatar
    Join Date
    Dec 2009
    Location
    Oxfordshire, UK
    Posts
    1,720
    Thanks
    146
    Thanked 156 Times in 149 Posts
    Quote Originally Posted by RedWagnum View Post
    After poking around testing, reading the help files, having some dinner, and testing some more I came up with this:

    Everything seems to be working at the moment. The data is being read, parsed, and put into the proper fields in a temp table. I used code by HansV in this thread as a basis for the recordset handling. Now I need help with the next part: moving the data into the main table after I have verified it. I'll still need to do a little manipulation in code, like converting the Born data from a string to a date format. Someone want to help me with the move part?

    Thanks to all who help!
    As far as the date field is concerned, better to convert it in your VBA rather than subsequently, then the dates will be in the correct format. My temp table is merely an empty copy of the final table so all I have to do is use query to copy the new data into the final table. Again, if any field needs formatting or changing in some way, do it in the VBA.

    I use a temp table so I can correct the raw data rather than trying to guess what typos or other errors might have to be handled (that way madness can ensue). If you find that a particular error keeps occurring then corrections for that error can be built into the code.

    BTW I always clear the temp table every time I run the function, here's some code to do this
    Code:
    iRecords = CurrentDb.TableDefs("tblCombinedTotals").RecordCount
    
    If iRecords > 0 Then
        DoCmd.SetWarnings False
        DoCmd.RunSQL "DELETE * FROM tblCombinedTotals"
        DoCmd.SetWarnings True
    End If
    Obviously you will need to amend the table name. If anyone has a more elegant way of doing this I will be all eras (or eyes)
    Talk is cheap because supply exceeds demand

  7. #7
    New Lounger RedWagnum's Avatar
    Join Date
    Sep 2016
    Location
    San Antonio, TX
    Posts
    21
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by access-mdb View Post
    As far as the date field is concerned, better to convert it in your VBA rather than subsequently, then the dates will be in the correct format. My temp table is merely an empty copy of the final table so all I have to do is use query to copy the new data into the final table. Again, if any field needs formatting or changing in some way, do it in the VBA.

    I use a temp table so I can correct the raw data rather than trying to guess what typos or other errors might have to be handled (that way madness can ensue). If you find that a particular error keeps occurring then corrections for that error can be built into the code.
    I've completed all the code to read and parse the data and present it in a form for editing. So the conversion and transfer process is where I am heading next. The numeric conversions should be pretty easy. The date conversions - not so much... For whatever reason at the time I decided to use the dd-mm-yyyy date format more common in your neck of the woods. Unfortunately, the CDATE function only converts to the system code page date format. For dates with a day number less than 13 this is not a big deal (umm, I don't think-haven't tested) but for the days 13 and higher it's gonna be a pain, I think. Initial testing shows that 3-2-00 converts to 3-2-2000 but 31-12-99 converts to 12-31-1999. I may have this ass backwards though. I think I read that date and time are stored as a number with the whole number part representing the date and the fractional part the time (fraction of a day). If this is the case then the low number days will be the problem and the high number days should convert okay. Much more testing to do.

    Some of the fields in the temp table will need to be moved to a field in the primary table that is a lookup table. While my SQL-foo is not very strong I think this can be accomplished with a simple query. Everything else should be straight string to string copies, accounting for lengths.

    My biggest problem right now is I really don't know how to go about the actual transfer of data. I have a clue, but not much of one! I think I just need to open two recordsets and assign the tValues to the pValues

    Code:
    Set dbs = CurrentDb
    Set rstT = dbs.OpenRecordset("tblTmpData", dbOpenDynaset)
    Set rstP = dbs.OpenRecordset("tblPriData", dbOpenDynaset)
    
    rstP.AddNew
    
    rstP!Name = rstP!Name
    
    etc...
    Something like that. Much to learn, I have!

    BTW I always clear the temp table every time I run the function, here's some code to do this

    Obviously you will need to amend the table name. If anyone has a more elegant way of doing this I will be all eras (or eyes)
    Don't know about elegant but I like simple. That was so simple even I could understand it. Great code - works perfect. Thanks a lot! Just gotta be careful with it, put the wrong table name in there and poof ... out go the lights!

    And did I mention I solved my "Logic Problem" with the data fields? Just decided to count the number of non-blank lines read in and when the total gets to a certain point, bypass the loop and read the rest into the note field. If I don't get to the right count open a msgbox and abort. All seems to work great!

    Thanks again for helping me out Malcolm. Your assistance is really appreciated!

  8. #8
    WS Lounge VIP access-mdb's Avatar
    Join Date
    Dec 2009
    Location
    Oxfordshire, UK
    Posts
    1,720
    Thanks
    146
    Thanked 156 Times in 149 Posts
    Glad you're making progress. Here's my code which reads from a text file called CommonMetadata.txt (I do it by directory but yours could be different). It does some processing of data e.g. see line Case "GPSLatitudeRef", then writes it to the table. I then eyeball the table - any errors and omissions will mean the metadata in my photos is wrong, so I correct those and recreate CommonMetadata.txt. I find I rarely have to do the corrections more than once. When I'm happy I merely run an update query to copy the data from the temp table to the final table (I actually copy data from two tables, but that's just a quirk of how I have my photo metadata stored. Hope this helps. The code can be improved, but it doesn't take very long to run. I will acknowledge here that I had some help from Retired Geek and others to improve my (non working) original code. Their help was for Excel, I just converted it for Access.

    Code:
    Public Function Import_metadata2()
    
        Dim MyDB            As Database
        Dim rsOutput        As Recordset
        
        Dim zFileName       As String
        Dim zTags           As String
        Dim zLatitude       As String
        Dim zDevice         As String
        Dim zLongitude      As String
        Dim zMyDate         As String
        Dim zLatitudeR      As String
        Dim zLongitudeR     As String
        Dim zSrcDir         As String
        Dim zPictType       As String
        Dim zRawData        As String
        Dim zCaption        As String
        Dim zReference      As String
        Dim zCopyright      As String
        Dim zArtist         As String
        Dim zExposureTime   As String
        Dim zFNumber        As String
        Dim zISO            As String
        Dim zFocalLength    As String
        Dim zFocalLength35  As String
        Dim zFlash          As String
        Dim dteDateShot     As Variant
        Dim iRecords        As Integer
        Dim zWarning        As String
      
    '*** End Setup ***
    'Clear temp table
    iRecords = CurrentDb.TableDefs("tblTempPhotographs").RecordCount
    
    If iRecords > 0 Then
        DoCmd.SetWarnings False
        DoCmd.RunSQL "DELETE * FROM tblTempPhotographs"
        DoCmd.SetWarnings True
    End If
        zSrcDir = InputBox("Enter Directory", "zSrcDir") + "\"
        If zSrcDir = "\" Then
            MsgBox "Macro cancelled"
            GoTo thatsit
        End If
        zPictType = InputBox("Enter Type: 1=Digital/2=Slide/3=Negative/4=Photograph", "zPictType")
        If zPictType = 1 Then
            zPictType = "Digital"
        ElseIf zPictType = 2 Then
            zPictType = "Slide"
        ElseIf zPictType = 3 Then
            zPictType = "Negative"
        ElseIf zPictType = 4 Then
            zPictType = "Photograph"
        End If
    Set MyDB = CurrentDb
    Set rsOutput = MyDB.OpenRecordset("tblTempPhotographs", dbOpenTable)
    
       Open zSrcDir & "CommonMetadata.txt" For Input As #1
    
       Line Input #1, zRawData  '*** Discard First Line ***
       
        Do
        Line Input #1, zRawData
        
        Select Case Trim(Left(zRawData, InStr(zRawData, " ")))
            Case "FileName"
                zFileName = Mid(zRawData, 35)
                If zPictType <> "Digital" Then zReference = Left(zFileName, 5)
        
            Case "Keywords"
                zTags = Mid(zRawData, 35)
    
            Case "GPSLatitudeRef"
                zLatitudeR = Mid(zRawData, 35, 1)
                Line Input #1, zRawData
                zLatitude = Mid(zRawData, 35) + " " + zLatitudeR
        
            Case "GPSLongitudeRef"
                zLongitudeR = Mid(zRawData, 35, 1)
                Line Input #1, zRawData
                zLongitude = Mid(zRawData, 35) + " " + zLongitudeR
    
            Case "Model"
                zDevice = Mid(zRawData, 35)
    
            Case "DateTimeOriginal"
                zMyDate = Mid(zRawData, 35, 19)
                dteDateShot = dteParseDateTime(zMyDate)
    
            Case "Caption-Abstract"
                zCaption = Mid(zRawData, 35)
                
            Case "Copyright"
                zCopyright = Mid(zRawData, 35)
                
            Case "Artist"
                zArtist = Mid(zRawData, 35)
                    
            Case "ExposureTime"
                zExposureTime = Mid(zRawData, 35)
                
            Case "FNumber"
                zFNumber = Mid(zRawData, 35)
                
            Case "ISO"
                zISO = Mid(zRawData, 35)
                
            Case "FocalLengthIn35mmFormat"
                zFocalLength35 = Mid(zRawData, 35)
                
            Case "FocalLength"
                zFocalLength = Mid(zRawData, 35)
                
            Case "Flash"
                zFlash = Mid(zRawData, 35)
                
            Case "Warning"
                zWarning = Mid(zRawData, 35)
                
            Case "========"
    'Now write to table
            rsOutput.AddNew
            rsOutput("Reference") = zReference
            rsOutput("Tags") = zTags
            rsOutput("FileName") = zFileName
            rsOutput("Latitude") = zLatitude
            rsOutput("Longitude") = zLongitude
            rsOutput("Device") = zDevice
            rsOutput("Date shot") = dteDateShot
            rsOutput("Directory") = zSrcDir
            rsOutput("Type") = zPictType
            rsOutput("Caption") = zCaption
            rsOutput("Copyright") = zCopyright
            rsOutput("Photographer") = zArtist
            rsOutput("Speed") = zExposureTime
            rsOutput("Aperture") = zFNumber
            rsOutput("ISO") = zISO
            rsOutput("Lens") = zFocalLength
            rsOutput("FocalLength35mm") = zFocalLength35
            rsOutput("Flash") = zFlash
            rsOutput("Warning") = zWarning
            rsOutput.Update
    
    '*** Clean Up ***
            zFileName = ""
            zTags = ""
            zLatitude = ""
            zLatitudeR = ""
            zLongitude = ""
            zLongitudeR = ""
            zDevice = ""
            dteDateShot = missing
            zReference = ""
            zCaption = ""
            zCopyright = ""
            zArtist = ""
            zExposureTime = ""
            zFNumber = ""
            zISO = ""
            zFocalLength = ""
            zFocalLength35 = ""
            zFlash = ""
            zWarning = ""
                   
        End Select
        
        Loop Until (EOF(1))
       
       'Write final data
            rsOutput.AddNew
            rsOutput("Reference") = zReference
            rsOutput("Tags") = zTags
            rsOutput("FileName") = zFileName
            rsOutput("Latitude") = zLatitude
            rsOutput("Longitude") = zLongitude
            rsOutput("Device") = zDevice
            rsOutput("Date shot") = dteDateShot
            rsOutput("Directory") = zSrcDir
            rsOutput("Type") = zPictType
            rsOutput("Caption") = zCaption
            rsOutput("Copyright") = zCopyright
            rsOutput("Photographer") = zArtist
            rsOutput("Speed") = zExposureTime
            rsOutput("Aperture") = zFNumber
            rsOutput("ISO") = zISO
            rsOutput("Lens") = zFocalLength
            rsOutput("FocalLength35mm") = zFocalLength35
            rsOutput("Flash") = zFlash
            rsOutput("Warning") = zWarning
            rsOutput.Update
            rsOutput.Close
    
    thatsit:
        Close #1
    
       
    End Function
    Talk is cheap because supply exceeds demand

  9. #9
    New Lounger RedWagnum's Avatar
    Join Date
    Sep 2016
    Location
    San Antonio, TX
    Posts
    21
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by access-mdb View Post
    Glad you're making progress. Here's my code which reads from a text file called CommonMetadata.txt (I do it by directory but yours could be different). It does some processing of data e.g. see line Case "GPSLatitudeRef", then writes it to the table. I then eyeball the table - any errors and omissions will mean the metadata in my photos is wrong, so I correct those and recreate CommonMetadata.txt. I find I rarely have to do the corrections more than once. When I'm happy I merely run an update query to copy the data from the temp table to the final table (I actually copy data from two tables, but that's just a quirk of how I have my photo metadata stored. Hope this helps. The code can be improved, but it doesn't take very long to run. I will acknowledge here that I had some help from Retired Geek and others to improve my (non working) original code. Their help was for Excel, I just converted it for Access.
    Argh... I've been sitting here for a half hour trying to figure out where you were modifying the data in the VBA code (other than the GPS stuff). Then I re-read your post. You are not editing the data in Access but rather in the metadata file itself. Something like:

    Code:
    1.  read metadata file
    2.  put data in temptable
    3.  look at table
    4.  if data wrong edit file, go back to 1 else
    5.  put data in primetable
    Am I right? That's kinda like what I am trying to do but I want to do it all in Access/VBA. Something like:

    Code:
    1.  main data input form (primetable) click import
    2.  read input file
    3.  parse data and put in temptable
    4.  open temp form (temptable)
    5.  review/edit data in temp form
    6.  if data severely wrong abort, edit file manually, go back to 1 else click commit
    7.  copy data from temptable to primetable
    8.  return to main data input form
    As I mentioned previously I have the import routine in a seemingly working order. The temptable is just a series of text fields to avoid any conversion issues on import. Conversions are being done in the commit routine. Conversion turned out to be surprisingly easy. The only thing is to make sure I catch the Height field and edit it from something like 5'9" to 69. The dates seem to carry over just fine, though some files will have a birthdate field something like ??-??-1989 that I'll either have to edit on the form or trap for it in the VBA routine.

    What I need to look at next is making sure the import button only works on new records. I could use it on existing records but why? The purpose of all this is to get the data files into the database. Once they are in there I can just update everythng in the database.

    After that is avoiding record duplication. Each file will have a unique (I hope) ID field. I'll need to use a query during the one of the routine to check for uniqueness. Hmmm... is there a "UNIQUE" SQL command?

    Thanks again for inspiring me and letting me bounce my ideas off you.

    Cheers!

  10. #10
    WS Lounge VIP access-mdb's Avatar
    Join Date
    Dec 2009
    Location
    Oxfordshire, UK
    Posts
    1,720
    Thanks
    146
    Thanked 156 Times in 149 Posts
    As I stated before, an example is
    Code:
           Case "GPSLatitudeRef"
                zLatitudeR = Mid(zRawData, 35, 1)
                Line Input #1, zRawData
                zLatitude = Mid(zRawData, 35) + " " + zLatitudeR
    This is actually concatenating two lines. But in your CASE you would have code to correct your data with known errors. How you correct your data is up to you, but it's possible that your code could balloon quite a lot with more and more correction lines, hence my comment about eyeballing the data so you can get a feel for common errors. These can be corrected in your code, e.g. your date with question marks. It's way past my bedtime so I'll have a further look tomorrow.
    Talk is cheap because supply exceeds demand

  11. #11
    WS Lounge VIP access-mdb's Avatar
    Join Date
    Dec 2009
    Location
    Oxfordshire, UK
    Posts
    1,720
    Thanks
    146
    Thanked 156 Times in 149 Posts
    OK, now having had a beauty sleep I can think again! We seem to be a bit at cross purposes here, so let me start again and I'll try to make it clearer this time (I can go off at half cock and lose people quite easily).

    I've looked again at what you're trying to do and realise that my example wasn't exactly what you're after. I have to correct the original data, whereas (as far as I can see) you don't. That's fine, the principle's the same.
    1. I would only import about 20-30 records at a time. It's easier to see any errors and omissions when it's just on one screen.
    2. I would look for obvious errors or omissions. the question is, is it easier to correct them by hand or by coding.
    3. I would always put the data into the temp table in the same format as the main table - you can then see any issues that need correcting then. So by way of example, dates with question marks. When you extract the date, look for question marks and if found put the data in a separate text field, any dates that are OK can be put in the date field. You can then decide what you're going to do with those dates.

    Copying the data to the final table. When (and only when) you are happy with the data you've just done (remember ~about 30 records), just use an update query to copy it.

    Unique records. The issue here is how do you define unique records? With mine my main table has an index as below

    index.PNG

    I do have non unique filenames, but adding the date makes them unique. You have to decide which fields make the record unique. You then create an index as my illustration - if you then try to copy that record to the table again, it will be rejected.

    Your table can also have a reference field defined as an autonumber, like this
    autonumber.PNG
    When you add a record, it automatically adds one to the number and is unique.

    Data cleansing is actually quite a difficult thing to do and I would suggest an iterative process, using your code to import a few records at a time, and seeing what problems come up. Then you can decide to add code to correct them (where they are numerous) or just manually correct them if there are only a couple. Only when you're happy with all the data copy the records to the main table. I have close to 17,000 records in my main table (and counting) so I have had to do it a block at a time!

    Phew, I hope this helps! Feel free to throw more questions at me!
    Talk is cheap because supply exceeds demand

  12. #12
    New Lounger RedWagnum's Avatar
    Join Date
    Sep 2016
    Location
    San Antonio, TX
    Posts
    21
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by access-mdb View Post
    OK, now having had a beauty sleep I can think again! We seem to be a bit at cross purposes here, so let me start again and I'll try to make it clearer this time (I can go off at half cock and lose people quite easily).

    I've looked again at what you're trying to do and realise that my example wasn't exactly what you're after. I have to correct the original data, whereas (as far as I can see) you don't. That's fine, the principle's the same.
    Yes, that would be correct. I only want to get the data from the files into the database. Once that is done the original files will probably be deleted.
    1. I would only import about 20-30 records at a time. It's easier to see any errors and omissions when it's just on one screen.
    2. I would look for obvious errors or omissions. the question is, is it easier to correct them by hand or by coding.
    3. I would always put the data into the temp table in the same format as the main table - you can then see any issues that need correcting then. So by way of example, dates with question marks. When you extract the date, look for question marks and if found put the data in a separate text field, any dates that are OK can be put in the date field. You can then decide what you're going to do with those dates.
    1. I only need to do one record at a time. I have less than 500 files to process but this is just a part of a larger project. There will other data to process for each record.
    2. I have looked at about 10% of the files at random. Other than the data issue the only other problem I have is with the height and weight. The height can look like (excluding the commas): 5'6, 5' 6, 5'6", 5'6/168, 5'6" / 168cm, 5'6/ 1.68m, 168 cm, 1.68m / 5'6, etc. WAY too many variations to parse for programmatically! I've already made the decision to just read that field as is into my temp table and just edit it to inches in the form. Once in the primary table the primary edit form will have two buttons to convert from one to the other. The value is actually stored in two fields (HeightI and HeightM) and the buttons will use the data from one field to convert and fill the other. Same process with weight.
    3. I have set up the temp table with all text fields to avoid type mismatch errors. The layout (field structure) is almost the same [EDIT: now the same -forgot the HeightM and WeightM fields in the temp database.] I was doing the checking/correcting code for the dates in the Commit sub but have moved it to the Import sub. Since all fields on the temp table are text there is no need to put erroneous dates in a separate text field.

    Copying the data to the final table. When (and only when) you are happy with the data you've just done (remember ~about 30 records), just use an update query to copy it.
    Yeah - I have no idea how to do that! :newb: What I am doing now is this:

    Code:
    Private Sub CommitPrimary_Click()
        Dim strTblTmp As String
        Dim strTblPri As String
        Dim dbs As DAO.Database
        Dim rstT As DAO.Recordset
        Dim rstP As DAO.Recordset
    
        On Error GoTo ErrHandler
    
        strTblTmp = "tblTmpData"                            '  Temp table name
        strTblPri = "tblData"                               '  Primary table name
    
        Set dbs = CurrentDb
        Set rstT = dbs.OpenRecordset(strTblTmp, dbOpenTable)    '  Create recordset for temp table
        Set rstP = dbs.OpenRecordset(strTblPri, dbOpenTable)    '  Create recordset for primary table
    
       '  We are going to be using the AddNew method for the primary
        rstP.AddNew
    
        Me.Refresh                                          '  Needed to update any changes in the temp table
                                                            '   made in the temp edit form
    
        rstP!Name = rstT!tName                              '  Copy values from temp table to primary table
        rstP!AKA = rstT!tAKA
        rstP!ID = rstT!tID
        rstP!Born = rstT!tBorn
        rstP!Birthplace = rstT!tBirthplace
        rstP!FirstContact = rstT!tFirstContact
        rstP!LastContact = rstT!tLastContact
        rstP!Where = rstT!tWhere
        rstP!HeightI = rstT!tHeightI
        rstP!HeightM = rstT!tHeightM
        rstP!WeightI = rstT!tWeightI
        rstP!WeightM = rstT!tWeightM
        rstP!HairColor = rstT!tHairColor
        rstP!Build = rstT!tBuild
        rstP!Activities = rstT!tActivities
        rstP!Class = rstT!tClass
        rstP!Categories = rstT!tCategories
    
    Stop                '  Break for testing
        rstP.Update                                      '  update the record
    
    ExitHandler:
      On Error Resume Next
      rstT.Close
      rstP.Close
      Set rstT = Nothing
      Set rstP = Nothing
      Set dbs = Nothing
    
      DoCmd.Close acForm                                '  Close the temp edit form
    
      Exit Sub
    
    ErrHandler:
      MsgBox Err.Description, vbExclamation
      Resume ExitHandler
    
    End Sub
    It works and the data conversions seem to be automatic. The [Born] field converts from text in temp to Date in Primary. The height, weight, first and last contact all convert Text to Numeric. I didn't expect that but for some reason it works. This is undoubtedly not the best way to do this, or even the right way. But it gets me from point A to point B. Mind you all this is "pre-production" code - I am testing and learning at this point. Once I get this stuff working well and as fool proof as I can get it, I'll move on to another part.
    Unique records. The issue here is how do you define unique records? With mine my main table has an index as below [snip]
    I do have non unique filenames, but adding the date makes them unique. You have to decide which fields make the record unique. You then create an index as my illustration - if you then try to copy that record to the table again, it will be rejected.

    Your table can also have a reference field defined as an autonumber, like this [snip]

    When you add a record, it automatically adds one to the number and is unique.
    I define unique records as ummm unique. Both tables have an autonumber index field (tempID and primaryID) that is presently the only key field. Each field also contains an ID field (ID) that is unique to each person (or at least should be.) I can create an index for that field for uniqueness. I just don't know if the code above can account for that (I suspect not). So that may need to be complete re-written. Need to do some more testing... and learning!

    Data cleansing is actually quite a difficult thing to do and I would suggest an iterative process, using your code to import a few records at a time, and seeing what problems come up. Then you can decide to add code to correct them (where they are numerous) or just manually correct them if there are only a couple. Only when you're happy with all the data copy the records to the main table. I have close to 17,000 records in my main table (and counting) so I have had to do it a block at a time!
    Tell me about it! I think I have a pretty good handle on it at this point. I don't have 17,000 records to deal with though so one at a time will work for me. My only real issue was the date formatting and I think I have that covered. Everything else can easily be handled via manual edits in the form.

    Phew, I hope this helps! Feel free to throw more questions at me!
    Double Phew!! I hope you don't mind the length of my ramblings and that you have a better understanding of that I am trying to do. If it would help I can send you a copy via PM of this portion of my work with a few sample input files. And remember you said I could throw more questions at you!!

  13. #13
    WS Lounge VIP access-mdb's Avatar
    Join Date
    Dec 2009
    Location
    Oxfordshire, UK
    Posts
    1,720
    Thanks
    146
    Thanked 156 Times in 149 Posts
    OK, you seem to have most things covered. I agree that the height data is far too variable for it to be done via code.
    Unique records - If you know you don't have two people with the same name then you don' need to worry on that score. Make the name the primary unique index and it won't let you add another record with the same name.
    You've used code to copy the data from temp to main table, I would use an append query, I think that's simpler and easier to handle, but in the end it's what you're comfortable with.
    I haven't seen your method of writing to tables before but as the data I write to a table has normally just been read from a text file I would have to use a different method anyway. For copying from one table to another I use an append query as I've already said. One thing about MS, there's normally more than one way of doing anything and often more.

    I'm happy to look at your data and coding if you have any questions, but I think you're doing very well with your methods. Note that I don't think the mods are happy with PMing in this sort of case as others might be following the thread and trying to pick up pointers for themselves (could a mod comment on this please).
    Talk is cheap because supply exceeds demand

  14. #14
    New Lounger RedWagnum's Avatar
    Join Date
    Sep 2016
    Location
    San Antonio, TX
    Posts
    21
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Well, here it is. I've cleaned it up and commented it best I can. Included 5 sample test files. I am completely open to comment and/or criticisms from access-mdb or anyone else who cares to take a look under my skirt.

    For the most part I think everything is working as well as I can get it for now. One thing I would like to address if possible is the warning message that pops up if I get a duplicate entry. I would like to test for dupes (on the [ID] or [tID] field) in code so I can provide a more graceful exit.

    As I've mentioned before I am a complete Access/VBA newbie. I've learned a lot over the past couple weeks but I have a long way to go (as my code can attest). I know there are better ways to solve some of the issues I've had. Just need someone to show me how!

    So access-mdb or anyone else want to take a look with a fresh pair of eyes, be my guest.


    Tim
    Attached Files Attached Files

  15. #15
    WS Lounge VIP access-mdb's Avatar
    Join Date
    Dec 2009
    Location
    Oxfordshire, UK
    Posts
    1,720
    Thanks
    146
    Thanked 156 Times in 149 Posts
    I've had a quick look and my first thought is - it's not how I would do it and my second, you've used coding I'm not familiar with; I will have to look at what it does first.
    Not sure when I'll get a better look, my wife's birthday today

    But rule number one - does it work?
    Rule number two - will you understand it in 6 month's time?
    Talk is cheap because supply exceeds demand

Page 1 of 2 12 LastLast

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •