This repository was archived by the owner on Jul 23, 2020. It is now read-only.

cEncoding null pointer #5

Open
kiliankoe opened this issue Aug 26, 2015 · 4 comments

@kiliankoe

Hey again :)

I've been running into some issues trying to correctly decode the content of a specific page. The response header says it's utf-8, while the page itself says it is latin-1. Decoding with utf-8 works when done by hand, but I kept having encoding issues when creating an HTML document with Ji.

let htmlDoc = Ji(htmlData: data, encoding: NSUTF8StringEncoding)

I tried a bunch of different NSStringEncodings, most with the same results, which seemed weird. Having a look at Ji's code, I found cEncoding to be a null pointer every time, so htmlReadMemory() probably defaults to something other than utf-8.

let cEncoding: UnsafePointer<CChar> = CFStringGetCStringPtr(cfEncodingAsString, 0)
if cEncoding == nil {
    print("cEncoding is a null pointer")
}

If I hard-code the page's encoding as a string literal a few lines further down, everything seems to work.

htmlDoc = htmlReadMemory(cBuffer, cSize, nil, "utf-8", options)
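One possible way to avoid both the null pointer and the hard-coded literal is to ask Core Foundation for the IANA charset name and bridge it to a Swift String, since CFStringGetCStringPtr is documented as being allowed to return NULL. The following is only a sketch, assuming an Apple platform and a current Swift toolchain; ianaCharsetName(for:) is a hypothetical helper and not part of Ji, and this is not confirmed to be the fix the library adopted.

```swift
import Foundation

/// Hypothetical helper (not part of Ji): returns the IANA charset name,
/// e.g. for use as the encoding argument of libxml2's htmlReadMemory().
func ianaCharsetName(for encoding: String.Encoding) -> String? {
    // Map the Foundation encoding to its Core Foundation counterpart.
    let cfEncoding = CFStringConvertNSStringEncodingToEncoding(encoding.rawValue)
    guard cfEncoding != kCFStringEncodingInvalidId,
          let name = CFStringConvertEncodingToIANACharSetName(cfEncoding) else {
        return nil
    }
    // Bridging to String copies the characters, so no raw pointer into
    // CFString storage is needed (CFStringGetCStringPtr may return NULL).
    return name as String
}

if let charset = ianaCharsetName(for: .utf8) {
    // withCString yields a valid, NUL-terminated C string for the closure's duration.
    charset.withCString { cEncoding in
        print(String(cString: cEncoding))
        // A call such as htmlReadMemory(cBuffer, cSize, nil, cEncoding, options)
        // would go here; cBuffer, cSize and options are the names from the issue.
    }
}
```

If the helper returns nil, the caller can decide whether to fall back to a default such as "utf-8" or to report an error, instead of silently passing a null encoding to the parser.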

I hope this helps with debugging the issue; I don't have much of a clue^^

@honghaoz
Owner

Hey @kiliankoe, thanks for opening this issue. This is kind of weird: if we provide encoding: NSUTF8StringEncoding, cEncoding shouldn't be nil.

If it's convenient, could you share the URL you are trying to parse? I tested some common websites and Ji works well with them.

@kiliankoe
Author

Unfortunately, the site itself is behind a login, but I've copied a part of it here that reproduces the issue.

Specifically, the room name should come out as "Säulensaal Ost", and the € symbol after the prices should also be readable.
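Garbling of this kind typically appears when UTF-8 bytes are interpreted as Latin-1. The sketch below reproduces that effect with Foundation alone; whether this is exactly what happens inside Ji or libxml2 for this page is an assumption.

```swift
import Foundation

// UTF-8 bytes for the room name mentioned above.
let utf8Bytes = Data("Säulensaal Ost".utf8)

// Decoding with the matching encoding round-trips correctly.
let correct = String(data: utf8Bytes, encoding: .utf8)

// Decoding the same bytes as Latin-1 turns each byte of a multi-byte
// sequence into its own character, so "ä" (0xC3 0xA4) becomes two characters.
let garbled = String(data: utf8Bytes, encoding: .isoLatin1)

print(correct ?? "nil")
print(garbled ?? "nil")
```

Comparing the two printed strings against the parser's output can help tell whether the wrong encoding was applied or the data was damaged earlier.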

@honghaoz
Owner

@kiliankoe Thanks! I get it, let me look into it. String encoding is pretty hard, so I'll let you know if I find something.

@honghaoz
Owner

Hey @kiliankoe, do you have any ideas on this issue? Sorry, I've been busy these days.
