Results 1 to 1 of 1
2006-12-05, 22:31 #1
- Join Date
- Jan 2001
- West Long Branch, New Jersey, USA
- Thanked 9 Times in 7 Posts
Lessons on Import Data (Excel 2003)
Just wanted to share a few lessons I learned regarding use of the Import External Data process. I described my setup in <post#=617254>post 617254</post#> - basically using templates as the basis of a new file into which the data from a "source" table in a "source" file is imported.
- my dates, since they had text associated with some of them (eg. I had date entries like a normal "12/10/06" without the quotes as well as "12/11/06 Security Team") had to be formatted as text in both the source file and the template. Otherwise, results seemed unpredictable as I mentioned in above post. Giving up the ability to do "date" math was not a big deal for me since I could have taken advantage of that in only a limited way. If really necessary, create a hidden column upon which you do the date math where the value is the date part of the "text date."
- a general rule that follows from the above seems to be to keep formatting the same in the template and the source file for a given column. I also had custom formats in the source file that were messed up on importation unless I had the same custom format for the corresponding column in the template.
- text formatting for very long character strings (>255) can create problems on importation. I got the #### indication even though text wrapping was on. This happened in both the source and template files. But see next point. I had to switch the formatting to General to even get the >255-char text strings to appear.
- Since you don't know how the rows are going to be arranged based on sorting, which can change based on source file entries, you can't pre-plan the heights of the rows in the template. Format all the rows to be used in the template as AutoFit. Otherwise you get a row that's 1 line high or whatever you set it to. It's bound to be wrong for some row.
- If you have to filter on rows to appear in your output, you may need an extra "dummy" column to specify whether that row should be imported. You need to select that "show record" column when defining the query for importing the data. But best leave that to the end of the fields/columns you want (the query definition process allows you to move a field down after you've selected which ones you want, or go through the fields once and go back to get the fields for "control" purposes like filtering). Putting this field at the end, along with the next point, allows you to hide these columns in the template at one shot rather than having them in several places.
- if you have to sort rows so that your output appears in a way different than the source, this can be defined in the query definition process also. Again you may need a dummy column for proper sorting. For example, we have a status column with entries "Red", "Yellow", "Green", which can change over time. In the output, we want all the "Red" rows to be grouped and appear first, then all "Yellow" rows, and finally all "Green" rows. I had to define a numeric "sort" field as a dummy column that was selected in the query definition process, moved to the end, and imported into a hidden column. (I also used the "status sort" column as the basis for a Conditional Format to actually color the rows according to their status.)
- There's the issue of the source file pathname being hard-coded into the query definition file. This is a pain. There's some VBA code one could get from "Professional Excel Development" by Bullen, Bovey, & Green (page 496) for starters. Put the source file on a share drive so it stays put. Alternatively, you can edit the query definition file if you save it (xxx.dqy is the file) using Notepad since the file is just text with the source file info, the SQL for the query, and other stuff.
- Instead of saving the query as a separate file, it can be saved as part of the file into which the data is being imported (which can be a separate sheet in the source file, hence an advantage over Advanced Filter w/o a VBA assist). Editing the query from the destination file seems to offer limited editing capabilities (you can pick a new source file). There's also a Query Manager add-in by JK Pieterse that allows one to change the source file easily.
- Although I thought some of the import options would have given me what I needed, I found that if I didn't supply enough empty rows for the records to be imported that crazy things started to happen. For example, my imported data starts on row 5 under some col headings provided by the template (I positioned the active cell as A5 in my template so not to forget to reposition the active cell in the new file). I have some general notes starting at around row 20. If I'm importing more than 15 rows, my col headings and notes got pushed to the right to make room for the imported data. Better to create lots of empty rows in the template and delete the unused ones.
- The above works pretty well for creating multiple "reports" (think Access) from flat DBs/tables. The import data/MS Query will not help you define a "report" based on a set of relational tables; it will tell you to do it yourself by dragging the linked field from one table to the other when you see the Query table view (think Query design view in Access). It's not that hard but the message can be overwhelming - what did it want me to do? All it's asking you to do is drag the linking field (like a key) from one table to the other so it knows which field in the 2 tables is the common key.
- When working with templates, it may be necessary to edit them like for some of the above. Double clicking the xlt in Explorer activates the default action, which is to open a new file based on the template. You don't know how many times I double-clicked, edited the file, and wondered where my edits went when I next opened a file based on that template. (Of course, it's not all that bad since I was asked if I wanted to save the file when I was finished and was surprised when I had to pick a file name for the new file based on the template created on the double click.)
- Of course, you may be asking why do this in Excel. Because...people start projects like these in Excel because they like it (I had a friend who used Excel for his word processing and web design). Access is much better. You could import an Excel range into Access, design your report, and export the report back to Excel (why, I don't know). I ran into a problem with this in that exported columns to Excel were in a different order and width than the Access report from which I exported.
- while you're at it, if you have several tables that you import to Access from Excel that are related, you can define your query in Access and copy/paste Access's SQL back into the query (.dqy) file, if you save the query separately per above, that you use for importing the data. You have to be a little careful here and may need to do some editing in the .dqy file. I'm not sure where you'd put the Access SQL if you saved the Excel query as part of the Excel file.
Anyway, that's some lessons learned the hard way.