Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Characters' Encoding Problems #284

Closed
marbrex opened this issue Aug 31, 2022 · 6 comments
Closed

[BUG] Characters' Encoding Problems #284

marbrex opened this issue Aug 31, 2022 · 6 comments

Comments

@marbrex
Copy link

marbrex commented Aug 31, 2022

My system

  • odbc Package Version: 2.4.4
  • ODBC Driver: HFSQL v27 for Windows (both x32/x64 versions, supports Unicode)
  • Database Name: HFSQL Client/Server (HyperFileSQL by PCSOFT, connection string details: https://doc.windev.com/en-US/?3044179)
  • Database Version: v27
  • Database OS: I don't have this information, since it's on a distant machine (but probably Windows..)
  • Node.js Version: 16.17.0
  • NPM Version: 8.15.0
  • Node.js OS: Windows 10 Entreprise (Build 19044)

The bug
There is a problem with characters' encoding in the returned data. In fact, the text is in French and the letters with accents are not correctly displayed (é, è, ç, à, ù, ê, ô, etc..), and any of these non-ascii charcters results in this "�". I've searched a lot for the reasons and possible solutions of this problem, tried everything I could, but with no success... The ODBC connection's encoding seems to be ANSI (must be ASCII), however the driver should support Unicode and be able to use the wide charset API, as it is stated on the official page of HFSQL installation procedure: https://doc.windev.com/en-US/?3044179

image


What I tried to do

  • If I understood correctly, the encoding depends on the ODBC connections itself, and it can be either ANSI or Unicode. So, I tried to change the connection string, adding keywords such as CCSID=1208, Charset=UTF8 etc, to explicitly set the encoding to Unicode. But to my regret, these keywords are specific to the Database, and couldn't find any for HFSQL in particular...
    I tried these with different types/combinations of the connection string:

    •  `DRIVER={${driver}};Server Name=${host};Server Port=${port};Database=${database};UID=${UID};PWD=${PWD}`
    •  `DSN=${DSN};UID=${UID};PWD=${PWD}` // rest of the info is specified in the Driver Manager
    •  `DSN=${DSN}` // rest of the info is specified in the Driver Manager
  • I also tried to debug with Windows' ODBC Driver Manager's Tracing function enabled. The log file seems to be ok, except 1 ODBC function: *

    node  server    2768-5f80	ENTER SQLDriverConnectW 
    	HDBC                0x0000029B2E6CF631
    	HWND                0x0000000000000000
    	WCHAR *             0x00007FF4D9C562E0 [      -3] "******\ 0"
    	SWORD                       -3 
    	WCHAR *             0x00007FF4D9C562E0 
    	SWORD                       -3 
    	SWORD *             0x0000000000000000
    	UWORD                        0 <SQL_DRIVER_NOPROMPT>
    
    node  server    2768-5f80	EXIT  SQLDriverConnectW  with return code 1 (SQL_SUCCESS_WITH_INFO)
    	HDBC                0x0000029B2E6CF631
    	HWND                0x0000000000000000
    	WCHAR *             0x00007FF4D9C562E0 [      -3] "******\ 0"
    	SWORD                       -3 
    	WCHAR *             0x00007FF4D9C562E0 <Invalid buffer length!> [-3]
    	SWORD                       -3 
    	SWORD *             0x0000000000000000
    	UWORD                        0 <SQL_DRIVER_NOPROMPT>
    
    	DIAG [IM006] [Microsoft][Gestionnaire de pilotes ODBC] �chec SQLSetConnectAttr du pilote (0) 
    

    (*) The last message is in French, and says IM006[Microsoft][ODBC Driver Manager] Driver's SQLSetConnectAttr failed. And even here in the log, it displays instead of characters !

    I don't know though why the connection string (3rd parameter of the function) is "******\ 0" and why its length (4th parameter) is equal to -3.

    The Invalid buffer length! error should correspond to the following SQLDriverConnect SQLSTATE:

    SQLSTATE Description Explanation
    HY090 String or buffer length that is not valid The value specified for StringLength1 is less than 0, but not equal to SQL_NTS. The value specified for BufferLength is less than 0.

    But I don't understand this error..

  • I tried to define 'UNICODE' in defines of the binding.gyp file, but that did not help. P.S. I defined 'DEBUG' as well, but I do not get any node-odbc package's debug messages in the terminal when running my app with npm start (i'm using Express.js).


Expected behavior
Correctly displayed characters in the query results.


pls help T_T

@marbrex marbrex changed the title [BUG] Charcters' Encoding Problems [BUG] Characters' Encoding Problems Aug 31, 2022
@kadler
Copy link
Member

kadler commented Aug 31, 2022

I don't know though why the connection string (3rd parameter of the function) is "******\ 0" and why its length (4th parameter) is equal to -3.

-3 is SQL_NTS, ie a null-terminated string. It's likely traced with asterisks to prevent tracing out a password.

@markdirish I wonder if we need some sort of connection option which forces any character strings to be bound as SQL_C_WCHAR instead of SQL_C_CHAR, that way we can have a consistent encoding. Seems too many databases on Windows still using ANSI encoding for SQL_C_CHAR.

@kadler
Copy link
Member

kadler commented Aug 31, 2022

I see now that the invalid buffer length message is getting printed for the 5th parm, which is the output string. Indeed, SQL_NTS is an invalid value for an output buffer length. The application needs to pass in a buffer size and the driver should be returning the actual length of the data available to return (which may be bigger than the input buffer size), but neither should ever be SQL_NTS.

Of course, node-odbc always passes NULL/0 for the output buffer: https://github.com/markdirish/node-odbc/blob/55037241591beee343b3d9ce2e4f37ca8426255e/src/odbc.cpp#L416-L417

I'm guessing the discrepancy is being injected by the ODBC Driver Manager for some reason (though you'd think it would know better than passing in SQL_NTS...)

@marbrex
Copy link
Author

marbrex commented Aug 31, 2022

I'm guessing the discrepancy is being injected by the ODBC Driver Manager for some reason

Might this be related to the fact that in my ODBC.INI file there is only config for x32 driver ?

[ODBC 32 bit Data Sources]
DonneesSql=SQL Server (32 bit)
Winlassie=HFSQL (32 bit)
WinlassieSys=HFSQL (32 bit)
[DonneesSql]
Driver32=C:\WINDOWS\system32\SQLSRV32.dll
[Winlassie]
Driver32=C:\Program Files (x86)\Common Files\PC SOFT\27.0\ODBC\Win32x86\wd270hfo.dll
[WinlassieSys]
Driver32=C:\Program Files (x86)\Common Files\PC SOFT\27.0\ODBC\Win32x86\wd270hfo.dll

Even if I use the x64 version of the Windows ODBC Driver Manager, it creates only x32 entries in the .INI file for some reason....

EDIT: Winlassie is my DSN, and I tried to create both User and System DSNs in the Driver Manager, but both did not work..

@marbrex
Copy link
Author

marbrex commented Aug 31, 2022

May be it would be worth to mention:

When using DBeaver to connect to the DB via ODBC, all characters are displayed correctly. And I even can export data (as csv for example), but when choosing an encoding of the data, the UTF-8 option results in weird characters (using Excel to open the export file, example: Veuillez établir instead of Veuillez établir)... However, the WINDOWS-1252 encoding gets the job done...

@marbrex
Copy link
Author

marbrex commented Sep 1, 2022

@kadler The problem was that the char encoding in the Database was set to WINDOWS-1252 for some reason 😅 , so I just fixed that by adding some conversions directly in the SELECT of my SQL queries:

CAST(CONVERT(column, 'WINDOWS1252', 'UTF8') AS VARCHAR(1024))
  • The first function CONVERT, as its name implies, converts the char encdoing and returns an SQL_LONGVARBINARY.
  • Then the 2nd one CAST represents the binary sequence as a string.

Thanks for the help !

@marbrex marbrex closed this as completed Sep 1, 2022
@kadler
Copy link
Member

kadler commented Sep 1, 2022

@marbrex Glad you found a solution/workaround. I'm not sure all databases would have such conversion functions, however, so we'll probably still need to look in to a more universal solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants