Thread: Comma-Delimited input file including ranges: Expand ranges to individual numbers.

  1. #1

    I have a comma-delimited acquisition (input) file containing numbers in formats like the examples below:
    2103667XXX
    21055074XX
    210289732X


    Here X stands for any digit, so each entry represents a range: 0 to 9 for a single X, 00 to 99 for XX, and 000 to 999 for XXX.
    Is there a way to trim the X's and transform the input so that it contains ALL the individual numbers in each range?

    Thank you.

  2. #2
    Lavastorm Employee
    Join Date
    Apr 2014
    Location
    Boston, MA
    Posts
    279

    I think the attached is what you're asking for. I made the assumption that all IDs will be 10 digits in length. If not, we can add more logic; otherwise, we're stripping out the X's and then matching the base (the number without the X's) against the actual IDs.
    Last edited by ryeh; 03-01-2016 at 05:47 PM.

  3. #3

    First of all, thank you for your quick reply!

    I have adapted it to my project but, unfortunately, it runs for a very long time (or forever; it is still running now).

    How should I change it (overriding emit or the default?) so that it works?

    Thank you.

  4. #4
    Lavastorm Employee
    Join Date
    Apr 2014
    Location
    Boston, MA
    Posts
    279

    How many records do you have in each input? Since we are doing a Cartesian join, we are essentially doing m-by-n matches (m and n being the number of records in each input). Two ways around this:
    1. Use a hash split. If you're not on the server version, this won't be available.
    2. Break it into a multi-stage match. See Stony's example posted on 3/18: http://community.lavastorm.com/threads/3534
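
    To illustrate why a keyed lookup avoids the m-by-n comparison, here is a minimal Python sketch. It is not Lavastorm code and not the hash split node itself. It assumes the wildcards only appear at the end of a pattern. The names match_ids_to_ranges and max_wildcards are hypothetical, chosen for this example only. Each ID is turned into the few patterns that could cover it, and each candidate is looked up in a set, so the work grows with m rather than with m times n.

```python
def match_ids_to_ranges(ids, range_patterns, wildcard="X", max_wildcards=4):
    """Pair each ID with the range patterns that cover it, using set lookups.

    Assumption: wildcards are always trailing (e.g. '21055074XX'),
    and a pattern has at most `max_wildcards` of them.
    """
    patterns = set(range_patterns)  # hashed once, O(1) membership tests
    matches = []
    for id_ in ids:
        # Build the candidate patterns for this ID: last 1..k digits masked.
        for k in range(1, min(max_wildcards, len(id_)) + 1):
            candidate = id_[:-k] + wildcard * k
            if candidate in patterns:
                matches.append((id_, candidate))
    return matches


if __name__ == "__main__":
    ids = ["2103667123", "2105507412", "9999999999"]
    ranges = ["2103667XXX", "21055074XX", "210289732X"]
    for pair in match_ids_to_ranges(ids, ranges):
        print(pair)
```

    With this approach each ID costs at most max_wildcards lookups, regardless of how many range records there are.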

  5. #5

    I have 45899 records of single numbers (like your ID input) and 51218 range records, each a 10-digit number ending in X, XX, or XXX, as explained before and as you correctly have in your example Group input.
    My answers to your questions:
    1. I am not on the Server version; I have the Windows version (installed and running on a Windows PC, though it uses a Linux server for input/output/tmp files etc.). So I guess I do not have the hash split option; I cannot see it under Acquisition anyway.
    2. Which BRG exactly are you referring to, test.brg? I have downloaded it and am reviewing it now, but I am not confident that it deals with the task I need to do.

  6. #6

    After a more careful review of your BRG, I understand that the first input (ID:string) provides a "mapping" for the ranges (groups) in the second input.
    This is not very practical, since I have tens of thousands of range numbers (ending in XXXX, XXX, XX, or X), so I cannot create another input like your ID:string covering all those cases.
    I need an intelligent algorithm to replace all X's with values from 0 to 9. For a single X this could be simple, but for XX, XXX and XXXX it is trickier since, as said in the first post, I will need all XX from 00 to 99, all XXX from 000 to 999 and all XXXX from 0000 to 9999!
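
    The expansion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the BRG logic from this thread. It assumes the X's are always trailing, and expand_range is a hypothetical name. The base is the number with the X's stripped, and the suffix is a zero-padded counter from 0 to 10^n - 1, where n is the number of X's.

```python
def expand_range(pattern, wildcard="X"):
    """Expand a number with trailing wildcards into every individual number.

    Assumption: wildcards only appear at the end, e.g. '21055074XX'.
    A pattern without wildcards is returned unchanged as a one-item list.
    """
    base = pattern.rstrip(wildcard)   # number without the trailing X's
    n = len(pattern) - len(base)      # how many X's were stripped
    if n == 0:
        return [pattern]
    # Zero-pad the counter to n digits so '7' becomes '07' for XX, etc.
    return [f"{base}{i:0{n}d}" for i in range(10 ** n)]


if __name__ == "__main__":
    for p in ["2103667XXX", "21055074XX", "210289732X"]:
        numbers = expand_range(p)
        print(p, len(numbers), numbers[0], numbers[-1])
```

    Note that every additional X multiplies the output by ten, so tens of thousands of XXXX records would expand into hundreds of millions of rows.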

  7. #7
    Lavastorm Employee
    Join Date
    Nov 2012
    Location
    Warrington, UK
    Posts
    246

    Hi, here is a graph that expands number ranges into all the individual numbers in each range. Does this meet your needs?

    Best regards,
    Adrian

    Number_Range_Expansion-XXXX_Support.brg
    Last edited by awilliams1024; 03-03-2016 at 06:11 PM. Reason: Updated graph to support XXXX number ranges

  8. #8

    Hi, what about replacing XXX with wildcards like "_" or an asterisk "*"? However, I noticed that this does not work with X-ref: matching e.g. 2103667___ or 2103667*** against 2103667123 fails, because it treats the wildcards as part of the string. How can we make the node understand the wildcards and match the inputs correctly?

  9. #9
    Lavastorm Employee
    Join Date
    Nov 2012
    Location
    Warrington, UK
    Posts
    246

    Hi, here is a version of the graph that contains two new nodes that support multiple wildcards. The first additional node permits *any* of the specified wildcard characters to be present in the records (in any combination, e.g. 'X_*'). The second additional node permits the use of any one of the specified wildcard characters; all records have to use the specified wildcard (if one is present).

    The match for the 'X' wildcard is also now case-insensitive and the logic validates that each record has exactly 10 characters in the 'number'.

    Regards,
    Adrian

    Number_Range_Expansion-Multiple_Wildcards.brg

  10. #10
    Lavastorm Employee
    Join Date
    Apr 2014
    Location
    Boston, MA
    Posts
    279

    So there's no 'cross referencing with wildcards' capability. The closest you could get is using strFind(I) or regexIsMatch(I) within a cross reference. But you would need to do a Cartesian join (every record on one input is checked against every record on the other input).
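
    For illustration only, the regex-inside-a-Cartesian-join idea looks roughly like this in Python (not Lavastorm expression syntax). The function names and the wildcard set "Xx_*" are assumptions made for this sketch. Each wildcard character is treated as exactly one digit, matching how 2103667*** was used earlier in the thread.

```python
import re


def wildcard_to_regex(pattern, wildcards="Xx_*"):
    """Turn '2103667***' into a compiled regex where each wildcard is one digit."""
    parts = ["[0-9]" if ch in wildcards else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def cartesian_match(ids, patterns):
    """Check every ID against every pattern (m-by-n comparisons)."""
    compiled = [(p, wildcard_to_regex(p)) for p in patterns]
    return [(i, p) for i in ids for p, rx in compiled if rx.fullmatch(i)]


if __name__ == "__main__":
    ids = ["2103667123", "2105507412"]
    patterns = ["2103667___", "2103667***", "21055074XX"]
    for pair in cartesian_match(ids, patterns):
        print(pair)
```

    This is fine for small inputs, but the nested loop is exactly the m-by-n cost mentioned above, so it scales poorly when both inputs have tens of thousands of records.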
