Results 1 to 8 of 8

Thread: xml extraction

  1. #1
    2 Star Lounger
    Join Date
    Sep 2002
    Location
    East Coast(USA)
    Posts
    109
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Greetings,

    I'm new to this, and wanted to know how to go about doing the following tasks:

    I was given a file in XML format and asked to use T-SQL for the XML extraction and analysis. Submit the results in an Excel spreadsheet,
    import the data in the spreadsheets into a SQL Server relational database. Submit the CREATE scripts for all of the objects in the database; briefly explain how I used the data from the spreadsheets to populate the database. The database should be ready to use for accepting new records, as well.

    Thank you in advance.

    OCM

  2. #2
    Super Moderator
    Join Date
    Aug 2001
    Location
    Evergreen, CO, USA
    Posts
    6,623
    Thanks
    3
    Thanked 60 Times in 60 Posts
    To really help you with this, we need to know more about your capabilities with SQL Server, more about the end purpose of the database, and probably more about the file you've been given. If you are using Office 2007 or 2010, you can probably import the data into Excel directly and avoid doing anything with T-SQL. Othwise we need to know the version of SQL Server you have available.
    Wendell

  3. #3
    2 Star Lounger
    Join Date
    Sep 2002
    Location
    East Coast(USA)
    Posts
    109
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thank you for your reply wendellB,

    Actually, my SQL Server capability is limited and I do not have access to SQL Server at this moment. Iím using Office 2007, and I am comfortable with Excel and Access. I can e-mail you the XML document if thatís helpful.

    Regards,
    OCM

  4. #4
    Super Moderator
    Join Date
    Aug 2001
    Location
    Evergreen, CO, USA
    Posts
    6,623
    Thanks
    3
    Thanked 60 Times in 60 Posts
    It's not necessary to email the XML document at this point. Does opening it in Excel 2007 give you any sense of the structure of the data? If that's so, then you should be able to import it into Access - Access can also read XML data, and if it is properly structured, you can actually create tables by doing an import. But if yhou aren't comfortable with XML, you will need to do some reading on how it works. You might start here, and here. Unfortunately the Access documentation isn't much help, but as long as the data is properly structured, the process is pretty pain-free. Another resource you might want to look at is the Microsoft Press book "Access 2007 Inside Out" by Viescas and Conrad - Chapter 23 is all about XML, and gives you some excellent background. Then I would look at using the Upsizing Wizard in Access 2007 to move the data to SQL Server, but you will need SQL Server at that point. Note that the 2008 Express is a free download from the Microsoft web site. Let us know how you progress, and we can try to answer any specific questions or issues you encounter.
    Wendell

  5. #5
    2 Star Lounger
    Join Date
    Sep 2002
    Location
    East Coast(USA)
    Posts
    109
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks for the reply and the link. I was able to open the xml document in Excel. The next task is:
    1. analysis; finding average invoice total broken out by division, for the third quarter of this year , total number of currently accredited sites, average number of sites per application etc.
    2.to create deduped mailing list for current and complete addresses etc.
    3. Import the data in the spreadsheets into a SQL Server relational database. Submit the CREATE scripts for all of the objects in the database.

    How this is better accomplished, Excel, Access or as you mentioned, download the free 2008 express version and go from there.

    Regards,

    OCM

  6. #6
    Super Moderator
    Join Date
    Aug 2001
    Location
    Evergreen, CO, USA
    Posts
    6,623
    Thanks
    3
    Thanked 60 Times in 60 Posts
    The analysis part is generally easier to do in Excel, as you can experiment and restructure things without rebuilding data each time, but Access is probably capable of doing most of the analysis too, and if this is to be a process that gets used over and over, then it might be the best choice. But I would start with Excel.

    Access is a bit better at finding duplicates because of the ability to select distinct in a query and not get duplicates without editing the data. Is some sort of mail merge envisioned? If so is it to be used over and over again?

    I don't understand where SQL Server comes into the picture. Is it to be used to archive the data, and to keep quarterly data for future needs? And what is the bit about CREATE scripts? They are only used to define table structures, and aren't of any great value once you have the database created, unless you need to repeatedly create the database? And those scripts don't bring in the data - they simply create tables. It is probably possible (with a good deal of effort and study) to do the entire analysis in SQL Server, but it would require using the Business Intelliigence and Reporting Services see here. But I personally choose the path of least resistance, which in my case would probably be some combination of Excel and Access, perhaps linked to SQL Sever tables for data storage if this is to be an ongoing process.
    Wendell

  7. #7
    2 Star Lounger
    Join Date
    Sep 2002
    Location
    East Coast(USA)
    Posts
    109
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I've successfully imported the XML data into Excel and Access (please see attached striped version) and start maneuvering around. My next task is the following:

    1) average invoice total amount for application fees only, broken out by division, for the third quarter of this year.
    2) Using the relational database, create a deduped mailing list for as many current and complete addresses as possible.
    3) Create a reader-friendly report break out the following statistics per division, based on the relational database:
    a. Total number of currently accredited sites
    b. Total number of applications in process
    c. Number of labs that have achieved accreditation in multiple divisions
    d. Total number of sites granted accreditation in each app area
    e. Average number of sites per application

    Do I need to create a query in access 2007, based on the query, design a report to accomplish the above?

    Please help.

    Thanks,
    OCM

  8. #8
    Super Moderator
    Join Date
    Aug 2001
    Location
    Evergreen, CO, USA
    Posts
    6,623
    Thanks
    3
    Thanked 60 Times in 60 Posts
    It seems your attachment didn't attach - can you try again?
    Wendell

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •