Page 1 of 2 12 LastLast
Results 1 to 10 of 14

Thread: JDBC Node error on version 6.1.3

  1. #1

    Default JDBC Node error on version 6.1.3

    Hi everyone,

    We are currently upgrading from 4.5.3 up to 6.1.3 version of Lavastorm and i have come across a strange error that i could use some help with.

    While using a SQL query to retrieve data from a DB2 data base we have a line in the SQL that has to be cast (CAST (RPALPH AS CHAR(40) CCSID 37) AS RPALPH) and in version 4.5.3 everything works fine but when running this on the new vesrsion i get the below error.ERROR: Input length = 1

    ERROR: error on row 0. com.lavastorm.io.data.FieldWritingException: Error occurred writing the field (RPALPH), type (class java.lang.String) on record (53317). Field had the value (Elephants Can?t Jump ).

    Any ideas as to why the new version is treating the data this way would be greatly appreciated.

    Regards,

    Keith

  2. #2
    Lavastorm Employee gmullin's Avatar
    Join Date
    May 2014
    Location
    Chicago
    Posts
    162

    Default

    Can you try setting the parameter DefaultToUnicode to true? It should be on the Optional tab.

  3. #3

    Default

    Hello,

    Yes that works but i do not want the output in unicode as there are at least 40 other graphs using this output in a string format.
    How do i keep this output as a string not unicode?

    Regards,

    Keith

  4. #4
    Lavastorm Employee gmullin's Avatar
    Join Date
    May 2014
    Location
    Chicago
    Posts
    162

    Default

    The exact same set of data with same driver comes out as string in the older version? What database are you connecting to? This is just a guess, but there might be a parameter we need to add to the DbUrl to make the encoding UTF-8 and get everything as string.

    Another possible option you could get the data as unicode and convert it all back to string within Lavastorm. See attached.

    sql_unicode_to_string.brg

  5. #5

    Default

    Hello,

    The graph doesnt convert the unicodes into strings as i get the below error which looks linked

    ERROR: Input length = 1

    ERROR: [IOException] Error writing output.
    Error Code: ls.brain.node.changeMetadata.writeError

    Also we cannot default all of the data to a string as there are other fields with different datatypes which are also required as is other wise there is a massive amount of work to correct the outputs in our estate as this isnt happening on one JDBC node.
    It seems to happen when the data in specific fields contain various punctuation characters eg ?

    Regards,

    Keith
    The databse is DB2

  6. #6

    Default

    Hi Keith,

    Over the years, the LAE has become stricter about character sets and string encodings, primarily because not doing so could and would lead to subtle data corruption issues.

    For LAE 6.x the "string" data type has a definite character set within the system. This can be set for the system using the server property ls.brain.characterSet, the default value for this is windows-1252. The "unicode" type is obviously unicode, and moves between UTF-16 while in memory (Java Strings are UTF-16) and UTF-8 in interim data files. Metadata, i.e. the names of fields, at this time can only be of the "string" character set.

    The problem you are encountering is that the data your are extracting and want to store into a string has characters which are not part of the string character set (most likely windows-1252, assuming you haven't changed it).

    Now the question is: what to do about it?

    First, what is the SQL "cast (CAST (RPALPH AS CHAR(40) CCSID 37) AS RPALPH)" supposed to be doing? From looking at the DB2 docs, it seems that it is encoding it as European EBCDIC (CSSID 37) and then converting it back to the character set defined for RPALPH. (Warning: I'm not a DB2 expert, just reading the doc). Is the purpose of this to simply get the DB2 database to do the illegal character mapping, i.e. convert bad characters to ?'s? If so, then to make it work with the default LAE character set (windows-1252), you would need to use CCSID 1252.

    Hopefully this helps,

    Matt

  7. #7

    Default

    HI Matt,

    Thanks for the explanation, that’s a great help but unfortunately doesn’t fix my issue.
    The reason we are using this cast function with this CCSID value is that it is importing text fields from the JDE portion of DISE/CMP which sits on the DB2 Database.
    As these values are in a different character set we have to convert them into strings and they can contain the odd character which previously wasn’t a problem.
    I have already tried various CCSID`s with no change in the result but admittedly a different error.
    Coded Character Set Identifier (CCSID) 1252 is not valid
    However the character it seems to most have the trouble with is the Questions mark however this is included within the windows 1252 character set so I may need to find where this parameter is stored to see if it is set to this as the server is running on redhat linux not windows.
    Also i am having the same error on other JDBC nodes without any cast functions just select *
    Regards,

    Keith
    Last edited by khopkins; 02-07-2017 at 10:02 AM.

  8. #8
    Lavastorm Employee stonysmith's Avatar
    Join Date
    Nov 2006
    Location
    Grapevine Tx
    Posts
    782

    Default

    The character you are having problems with is not a question mark or chr(63). It's something else - probably a "back quote" that is being displayed as a question mark, but it's not actually a question mark.

    Run the output from the query thru a filter node and try this code:

    Code:
    emit *
    emit RRALPH=RRALPH.str().regexSubstitute("[\x80-\xff]","_")
    Unfortunately, this REMOVES the offending character.. whatever meaning it had will be lost, but that may be acceptable in your situation.

    If that isn't sufficient then there are other solutions we can try.

  9. #9
    Lavastorm Employee gmullin's Avatar
    Join Date
    May 2014
    Location
    Chicago
    Posts
    162

    Default

    I know you said you tried several CCSID values, did you try unicode? I ran the following on my db2 express database:

    select sid, CAST (sname AS CHAR(40) CCSID unicode) as sname from student

    In Lavastorm sid was an int and sname was a string. See here: student.jpg

  10. #10

    Default

    Hi Everyone,

    I`m afraid that neither of these will work as it isnt just cast statement that is effecting this and as every export we do from a text dataset in DISE/CMP is converted to a string so converting them all to unicode would be great if the convertions actually worked which it doesnt as they error for the same reason.
    Manually converting each field could be done but as there are over 4,000 string fields in our data exports it isnt really feasible.
    The situation as basically we have characters that exist in text fields but not the specified character sets which for us and i would suppose many companies would also have.
    Is it possible that i just need to change the defined list of character sets in the charsetRegistry.conf and if so which ones would i need to add on a linux system to eradicate this issue and go back to being able to export these as strings?

    If this isnt possible then i have a huge job ahead of me to force all of the graphs that utilise this data to use any of the string fields as unicode and probably have to convert them.

    Regards,

    Keith

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •