Thread: Acquire complex XML using XMLpy node

  1. #1
    Contributor
    Join Date
    Feb 2012
    Location
    Amsterdam, Rotterdam, everywhere ;-)
    Posts
    38

    Acquire complex XML using XMLpy node

    Dear all,

    The Dutch Tax Authority requires Dutch companies to send extracts from their ERP-systems in a standardized format known as XAF (XML Audit File).
    I am building a standard Lavastorm project that can handle these XAF files, so I can perform a set of standard analyses on them.
    So now I'm experimenting with the XMLpy node to acquire this data.
    Being a noob on both XML and Python, I have come farther than I would have thought (read the XML tutorial :-), but now I'm a bit stuck on a more complex (multi-layered) part of the XML file.

    The essence of what I want is to be able to extract the data at the lowest (deepest) levels possible, which is 7 or 8 layers deep (!) and extract all this detailed information along with the information from ALL the parent levels it belongs to.
    The thing is, the data being returned is at a higher level than expected: I only get 11 records at the "Journal" level (the 3rd level), each with only the FIRST piece of information from the level below that. Of course, I was expecting to get tens of thousands of records at the 7th or 8th level, being the individual bookings in the ERP system.

    I just *know* this can be done, but what am I doing wrong??

    A simplified piece of code I use (at like the 4th level or so) is pasted below:

    @elementHandler('/auditfile/transactions/journal')
    def journalHandler(element):
        data = {}
        data['journalID'] = None
        if hasattr(element, "journalID"):
            data['journalID'] = element.journalID
        data['description'] = None
        if hasattr(element, "description"):
            data['description'] = element.description
        data['type'] = None
        if hasattr(element, "type"):
            data['type'] = element.type
        data['transaction'] = None
        if hasattr(element, "transaction"):
            data['transaction_transactionID'] = None
            if hasattr(element.transaction, "transactionID"):
                data['transaction_transactionID'] = element.transaction.transactionID
            data['transaction_description'] = None
            if hasattr(element.transaction, "description"):
                data['transaction_description'] = element.transaction.description
            data['transaction_period'] = None
            if hasattr(element.transaction, "period"):
                data['transaction_period'] = element.transaction.period
            data['transaction_transactionDate'] = None
            if hasattr(element.transaction, "transactionDate"):
                data['transaction_transactionDate'] = element.transaction.transactionDate
            data['transaction_line'] = None
            if hasattr(element.transaction, "line"):
                data['transaction_line_recordID'] = None
                if hasattr(element.transaction.line, "recordID"):
                    data['transaction_line_recordID'] = element.transaction.line.recordID

        outputRecord(data, 0)
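
    To make the goal concrete, here is a minimal sketch of the flattening described above: one output record per "line" element, with the journal and transaction fields repeated on each record. It uses Python's standard xml.etree.ElementTree rather than the XMLpy node's own API, so it shows the looping logic only and is not a drop-in node script. The sample XML is invented, the helper names (leaf_fields, flatten_lines) are hypothetical, and the element names are taken from the handler above. It also assumes the file has no XML namespace; a real file that declares one would need it in the tag paths.

```python
import xml.etree.ElementTree as ET

# Tiny made-up sample in the shape described in this post; not real XAF data.
SAMPLE = """\
<auditfile>
  <transactions>
    <journal>
      <journalID>J1</journalID>
      <description>Sales</description>
      <type>S</type>
      <transaction>
        <transactionID>T1</transactionID>
        <period>1</period>
        <line><recordID>1</recordID></line>
        <line><recordID>2</recordID></line>
      </transaction>
      <transaction>
        <transactionID>T2</transactionID>
        <period>2</period>
        <line><recordID>3</recordID></line>
      </transaction>
    </journal>
  </transactions>
</auditfile>
"""


def leaf_fields(element, prefix=""):
    """Return the text of child elements that have no children of their own."""
    return {
        prefix + child.tag: (child.text or "").strip()
        for child in element
        if len(child) == 0
    }


def flatten_lines(root):
    """Yield one dict per <line>, carrying the journal and transaction fields."""
    for journal in root.findall("./transactions/journal"):
        journal_data = leaf_fields(journal)
        # Loop over EVERY transaction, not just the first one.
        for transaction in journal.findall("transaction"):
            transaction_data = leaf_fields(transaction, "transaction_")
            for line in transaction.findall("line"):
                record = dict(journal_data)
                record.update(transaction_data)
                record.update(leaf_fields(line, "transaction_line_"))
                yield record


records = list(flatten_lines(ET.fromstring(SAMPLE)))
for record in records:
    print(record)
```

    The key difference from the handler above is the explicit loop over all "transaction" and "line" children, so the record count follows the deepest level instead of the journal level.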


    Thanks for any and all help!!!
    Best regards, Bart.

  2. #2
    Contributor
    Join Date
    Feb 2012
    Location
    Amsterdam, Rotterdam, everywhere ;-)
    Posts
    38

    Really, no one knows the answer to this??
    Pretty please?? ;-)

  3. #3
    Lavastorm Employee
    Join Date
    Aug 2009
    Location
    Cologne
    Posts
    513

    Hey,

    Are you able to provide an example with a configured node and with sample data which shows the problem you are experiencing?

    Tim.

  4. #4
    Contributor
    Join Date
    Feb 2012
    Location
    Amsterdam, Rotterdam, everywhere ;-)
    Posts
    38

    Dear Tim,

    Here's an experiment with an anonymized file.

    Would you be nice enough to take a look?

    What it SHOULD do is extract the info at the LOWEST level (journal), INCLUDING everything from every level above that.

    Many thanks in advance!!

    (BTW didn't know Lavastorm was also based in Köln).

    LavastormAuditfiles_Anonimous.brg
    Anonimous.zip

  5. #5
    Lavastorm Employee
    Join Date
    Aug 2009
    Location
    Cologne
    Posts
    513

    Hi,

    I'm not sure if this is possible with the existing XMLpy File node ... it might be, but I can't recall how to loop over subelements within the script in that node.

    However, we should be releasing a new node with our Lavastorm Analytic Library (LAL) in the near future which will enable exactly this sort of acquisition of complex XML data with very little configuration required.

    I will post back here when that is available.

    Regards,
    Tim.

  6. #6
    Contributor
    Join Date
    Feb 2012
    Location
    Amsterdam, Rotterdam, everywhere ;-)
    Posts
    38

    Hi Tim,

    That would be fantastic news!! I'm really looking forward to this new LAL!

    Best regards and many thanks for looking into this!

    Bart Roeleveld,
    Coney B.V.
    The Netherlands

  7. #7
    Lavastorm Employee
    Join Date
    Aug 2009
    Location
    Cologne
    Posts
    513

    Hi,

    As per the updated post in the "Updates to the LAL Library" thread (http://community.lavastorm.com/threa...he-LAL-library), LAL 2.16 has been released and is compatible with LAE 4.6.1.

    With it comes the XML Data node, which is perfectly suited for reading this sort of data.

    I've attached an example of how this can be configured to read the XAF format of XML data you posted previously.

    The node can also read multiple files, or a single named file. To make it easier to upload a working example, though, I've just put the data you previously posted into an Input Static node. The XML Data node operates on that data once the "XmlData" choice is set to "Data Field" and the text component of the "XmlData" parameter is set to "Data" (the name of the input field containing the XML data).

    On the "Optional" tab, I also set "RemoveCommonPrefixes" to true, which ensures that the output fields are named, for example, "numberEntries" and "journal.type" rather than "auditfile.transactions.numberEntries" and "auditfile.transactions.journal.type".
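
    As an illustration of that naming effect only, the sketch below strips the leading path segments shared by all field names. It is plain Python with a hypothetical helper name (remove_common_prefix) and is an assumption about the behavior described, not the XML Data node's actual implementation.

```python
def remove_common_prefix(field_names, sep="."):
    """Strip the leading path segments shared by every field name."""
    if not field_names:
        return []
    split_names = [name.split(sep) for name in field_names]
    # Never strip the final segment, so every field keeps a non-empty name.
    limit = min(len(parts) for parts in split_names) - 1
    shared = 0
    while shared < limit and len({parts[shared] for parts in split_names}) == 1:
        shared += 1
    return [sep.join(parts[shared:]) for parts in split_names]


fields = [
    "auditfile.transactions.numberEntries",
    "auditfile.transactions.journal.type",
]
short_names = remove_common_prefix(fields)
print(short_names)
```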




    Hope this helps,

    Tim.
    Attached Files

  8. #8
    Contributor
    Join Date
    Feb 2012
    Location
    Amsterdam, Rotterdam, everywhere ;-)
    Posts
    38

    Tim is 'the man' and the new XML Data node totally RULES!
