Thread: Do while loop

  1. #1

    Do while loop

    Hi all,

    I have an Excel file listing "stop words" (words I need to remove from my text); there are over 1,000 words on the list.

    I also have 22,000 text files whose contents I'm able to pull into an Input raw node. This gives me a table in which one column, called narrative, contains all the text from each text file.

    I now need to remove every stop word from this narrative field for each of my 22,000 rows.

    I know a Do While node can help me, but I'm not sure what its inputs and outputs should be. I assume the right logic here is to loop through my stop word list and do a Narrative.Replace('StopWord','') for each word, but I'm having trouble getting it to work.

    Can anyone talk this through for me?

    Many thanks to anyone who can help - pulling my hair out.
    Last edited by karennap; 09-21-2017 at 04:59 PM.

  2. #2
    stonysmith (Lavastorm Employee)
    Join Date: Nov 2006 | Location: Grapevine Tx | Posts: 771

    The code below is one example of how to accomplish what you want.
    I used a while loop within BrainScript rather than using the Do While node.

    Warning: this could be a bit slow, since every narrative row is scanned once for each stop word. If it's too slow for your needs, we might have to look at some other solution.

    Code:
    node:Edit_out_StopWords
    bretype:core::Lookup
    editor:Label=Edit out StopWords
    editor:sortkey=59c3f1844d353263
    input:@40fd2c746abc6dc7/=Narrative.40fe6c55598828e5
    input:@40fd2c74486e4494/=Build_StopWord_List.40fd2c744c862db0
    output:@40fd2c7445835585/=
    prop:InputKey=<<EOX
    1
    EOX
    prop:LookupKey=<<EOX
    1
    EOX
    prop:Script=<<EOX
    sw=StopWords.split(",")
    i=0
    n=Narrative
    while i<len(sw) {
    	n=replace(n,sw[i],"")
    	i=i+1
    	}
    emit n as Narrative
    EOX
    editor:XY=390,130
    end:Edit_out_StopWords
    
    node:Build_StopWord_List
    bretype:core::Agg
    editor:Label=Build StopWord List
    editor:sortkey=59c3f0f520a54b4c
    input:@40fd2c7427456e5b/=StopWords.40fe6c55598828e5
    output:@40fd2c744c862db0/=
    prop:GroupBy=<<EOX
    1
    EOX
    prop:Script=<<EOX
    s=groupString(StopWord,",")
    emit s as StopWords
    where lastInGroup
    
    EOX
    editor:XY=330,230
    end:Build_StopWord_List
    
    node:Narrative
    bretype:core::Static Data
    editor:Label=Narrative
    editor:sortkey=59c3f1f81cda6d9f
    output:@40fe6c55598828e5/=
    prop:StaticData=<<EOX
    Narrative
    Red
    Green
    Blue
    Yellow
    Cyan
    Magenta
    Orange
    Chartreuse
    Aquamarine
    Azure
    Violet
    Fuchsia
    Grue
    Bleen
    Octarine
    Garrow
    Gendale
    Hooloovoo
    Fire
    Ice
    EOX
    editor:XY=250,130
    end:Narrative
    
    node:StopWords
    bretype:core::Static Data
    editor:Label=StopWords
    editor:sortkey=59c3edfc083e6352
    output:@40fe6c55598828e5/=
    prop:StaticData=<<EOX
    StopWord
    Red
    Blue
    Green
    EOX
    editor:XY=250,230
    end:StopWords
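
    If the per-word loop above proves too slow, one alternative worth considering is to combine the whole stop word list into a single regular expression, so each narrative is scanned once instead of once per stop word. The following is only a rough sketch in Python, outside Lavastorm, not BrainScript. The function names are hypothetical, and it assumes whole-word, case-insensitive matching, which differs from the plain substring replace in the node above.

```python
import re


def build_stop_word_pattern(stop_words):
    """Compile all stop words into one alternation pattern.

    Assumptions (not from the original node): whole-word matching via \\b
    and case-insensitive comparison.
    """
    # Longest words first, so a longer stop word is preferred over a shorter
    # one that happens to be its prefix.
    ordered = sorted(set(stop_words), key=len, reverse=True)
    alternation = "|".join(re.escape(word) for word in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def remove_stop_words(narrative, pattern):
    """Remove every stop word in one pass, then tidy the leftover spaces."""
    stripped = pattern.sub("", narrative)
    return " ".join(stripped.split())


# Sample data modelled on the Static Data nodes in the example above.
stop_words = ["Red", "Blue", "Green"]
pattern = build_stop_word_pattern(stop_words)

narratives = ["Red", "Yellow", "the red car and the Blue boat", "Redwood"]
cleaned = [remove_stop_words(text, pattern) for text in narratives]

for before, after in zip(narratives, cleaned):
    print(repr(before), "->", repr(after))
```

    The pattern is built once and reused for every row. Because of the word-boundary assumption, "Redwood" is left untouched here, whereas a plain substring replace of "Red" would turn it into "wood".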

  3. #3

    This is great, a completely different approach from what I was thinking. Thanks so much! It is a little slow, but it works, so it will do.

  4. #4

    This is all I expected. Thanks to stonysmith for sharing the detailed code.
