
[ML] CSV import via Data visualizer is not working with the attached file #42114

Closed
felix-lessoer opened this issue Jul 29, 2019 · 6 comments · Fixed by #44768

@felix-lessoer

Kibana version:
7.2.0

Elasticsearch version:
7.2.0

Server OS version:
Elastic Cloud AWS

Browser version:
Latest Chrome

Browser OS version:
Latest Chrome

Original install method (e.g. download page, yum, from source, etc.):

Describe the bug:
Trying to upload and ingest a specific CSV file stalls without any error (in the frontend).
The JavaScript console shows the following error:

Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'unsafe-eval' 'nonce-wcklBim9l4CnY7us'". Either the 'unsafe-inline' keyword, a hash ('sha256-SHHSeLc0bp6xt4BoVVyUy+3IbVqp3ujLaR+s+kSP5UI='), or a nonce ('nonce-...') is required to enable inline execution.

Steps to reproduce:

  1. Create a cloud cluster
  2. Open Kibana and go to the Data Visualizer
  3. Upload the CSV
  4. Change the settings to use the header row for field names
  5. Try to import it

Expected behavior:
The import completes successfully.

Screenshots (if relevant):

Errors in browser console (if relevant):
Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'unsafe-eval' 'nonce-wcklBim9l4CnY7us'". Either the 'unsafe-inline' keyword, a hash ('sha256-SHHSeLc0bp6xt4BoVVyUy+3IbVqp3ujLaR+s+kSP5UI='), or a nonce ('nonce-...') is required to enable inline execution.

This is the CSV file that caused the problem:
https://drive.google.com/file/d/1UUtaK9vmzyM4ZCHuEWWAU4Q2BFbDBbzw/view?usp=sharing
[Elastic only]

Provide logs and/or server output (if relevant):

Any additional context:

@nickpeihl nickpeihl added the :ml label Jul 29, 2019
@elasticmachine
Contributor

Pinging @elastic/ml-ui

@sophiec20
Contributor

Minor investigations as follows:

22,552 hits imported.
First line was not recognised as a header but worked fine after manual override.
No errors displayed on screen. No errors in logs.

Using 7.3.0 { "build" : { "hash" : "46d4559", "date" : "2019-07-18T12:15:03.400569Z" } }.
Showcase server, not a secured elastic cloud instance.

@jgowdyelastic
Member

From my investigations I believe this is a problem with the import endpoint hanging.
I've managed to get it to import by cutting the file down to around 5,190 lines. There seems to be some inconsistency in the number of lines that will eventually be accepted.
It would be worth cutting down the CHUNK_SIZE to see whether uploading in smaller chunks solves this.
This will be difficult to test, as the problem only happens in the cloud.

The error shown in the console comes from the EuiCodeEditor component and is unrelated to this issue.

@jgowdyelastic
Member

I've managed to get this file to import by editing the CHUNK_SIZE value using Chrome's debugger. I dropped the value down to 1000, causing the data to be imported in smaller batches.

There still appears to be a delay during each chunk, so whatever is causing the hang is still there, but it appears to be less pronounced when dealing with less data.
All the data was imported with no errors.
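
For illustration, a minimal sketch of this fixed-size chunking approach with a smaller chunk size. The CHUNK_SIZE constant and the sendBulk callback are hypothetical names; the only detail confirmed in this thread is that the upload code splits the documents with Lodash's chunk function (see the comment below).

```ts
import chunk from 'lodash/chunk';

// Assumed constant; in this scenario the default was too large for the
// problematic file, and dropping it to 1000 let the import go through.
const CHUNK_SIZE = 1000;

// sendBulk is a hypothetical callback standing in for the import endpoint:
// one bulk request is made per fixed-size chunk of documents.
async function importDocs(
  docs: object[],
  sendBulk: (batch: object[]) => Promise<void>
): Promise<void> {
  for (const batch of chunk(docs, CHUNK_SIZE)) {
    await sendBulk(batch);
  }
}
```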

Maybe CHUNK_SIZE should be configurable to avoid issues like this?

@droberts195
Contributor

Maybe CHUNK_SIZE should be configurable to avoid issues like this?

I don't think the user should be responsible for configuring this. The uploader should do it dynamically.

The thing that's interesting/different about this particular CSV file is that the records are very long - each one contains multiline text fields where Airbnb owners have written large amounts about their properties. I guess this issue shows that Cloud imposes a limit on the size of messages that the browser running Kibana can send to the Kibana server. I will try to find out what this limit is.

At the moment the upload code is using Lodash's chunk function to divide up the full array into chunks that can be uploaded to ES in a single bulk request. Two options I can see that would avoid this problem without requiring the user to select a chunk size are:

  1. Keep appending documents to the bulk request until its size nears the limit. In other words, the chunk size varies from request to request depending on the exact size of the documents in the request (see the sketch after this list).
  2. Keep track of the maximum document size while creating the documents to be ingested, then set the chunk size passed to Lodash to (size limit / maximum document size). This is not as efficient, but could be an alternative if option 1 is hard to implement for some reason.
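
A minimal sketch of option 1, batching by serialized size rather than by document count. The 1 MB limit, the function name, and the use of JSON string length as a size estimate are all assumptions; the real request limit would need to be confirmed.

```ts
// Assumed limit; the actual proxy/request limit has not been confirmed.
const MAX_BATCH_BYTES = 1024 * 1024; // 1 MB

// Split docs into batches whose total serialized size stays under the limit,
// so the effective chunk size adapts to how large each document is.
function chunkBySize(docs: object[], maxBytes = MAX_BATCH_BYTES): object[][] {
  const batches: object[][] = [];
  let current: object[] = [];
  let currentBytes = 0;

  for (const doc of docs) {
    const docBytes = JSON.stringify(doc).length; // rough size estimate
    if (current.length > 0 && currentBytes + docBytes > maxBytes) {
      batches.push(current); // flush the batch before it exceeds the limit
      current = [];
      currentBytes = 0;
    }
    current.push(doc);
    currentBytes += docBytes;
  }
  if (current.length > 0) {
    batches.push(current);
  }
  return batches;
}
```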

@aptrishu

In my case it was related to the browser. It worked fine in Chrome but got stuck in Firefox.
