Intermittent Database Error on upload #111

Closed
alanmcruickshank opened this issue Mar 28, 2022 · 3 comments · Fixed by #112 or #114

Comments

@alanmcruickshank
Contributor

Occasionally I get one of these, when running the v2 upload on the new release branch (basically on master):

04:38:00  Encountered an error while running operation: Database Error
  100112 (22000): Remote file 'http://.../stages/5ef57c80-83d2-4383-94ed-7c3738a0a04b/manifest.json.gz' was not found. There are several potential causes. The file might not exist. The required credentials may be missing or invalid. If you are running a copy command, please make sure files are not deleted when they are being loaded or files are not being loaded into two different tables concurrently with auto purge option.

This is a Snowflake error, and I suspect it's to do with collisions in the stage. Either the purge of the file is too aggressive (we're removing anything which matches pattern='.*.json.gz'), or we need to wait for Snowflake to load the file properly before removing it.
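
To illustrate how broad that pattern is, here is a minimal Python sketch. The file paths and run IDs are invented for the example, and `re.fullmatch` is only an approximation of how Snowflake applies the pattern to staged file paths:

```python
import re

# Pattern quoted above: the unescaped dots match any character.
BROAD_PATTERN = r".*.json.gz"

# Hypothetical staged files from two concurrent runs (names are invented).
staged_files = [
    "run_a/manifest.json.gz",
    "run_b/manifest.json.gz",
    "run_b/run_results.json.gz",
]


def matching(pattern, paths):
    """Return the paths that the pattern matches in full."""
    return [p for p in paths if re.fullmatch(pattern, p)]


def run_pattern(run_id):
    """Build a pattern restricted to one run's prefix (hypothetical layout)."""
    return rf"{re.escape(run_id)}/.*\.json\.gz"


# The broad pattern selects every run's files, not just our own.
print(matching(BROAD_PATTERN, staged_files))
# A run-scoped pattern selects only the files under that run's prefix.
print(matching(run_pattern("run_a"), staged_files))
```

If two runs share the stage, a remove with the broad pattern from one run could delete a file that the other run is still about to load, which would be consistent with the error above.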

Given that each dbt Cloud run is (I think) running in an isolated environment, it's most likely that the issue is happening in the stage rather than locally. Suggested mitigating measures:

  • More specific remove command.
  • Get rid of the remove and use the PURGE option on the stage to try and get Snowflake to remove files itself automatically post-load.
  • Use a run-specific stage (or location within the stage), probably using the command_invocation_id.

Gut feel on my part is that the last option alone might be the most elegant solution, because that totally isolates each run.
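
A rough sketch of what the last option could look like, written as Python that only builds the SQL strings. The stage name, table name, and local file path are hypothetical, a random UUID stands in for the command_invocation_id, and the COPY statement assumes a target table with a single VARIANT column. This is not the package's actual implementation:

```python
import uuid


def run_location(stage, invocation_id):
    """Return a stage location scoped to a single run."""
    return f"@{stage}/{invocation_id}"


def statements_for_run(stage, table, local_file, invocation_id):
    """Build upload, load, and cleanup statements that share one run-scoped location."""
    location = run_location(stage, invocation_id)
    return [
        # Upload the local artifact into this run's own prefix.
        f"PUT file://{local_file} {location}",
        # Load only from this run's prefix.
        f"COPY INTO {table} FROM {location} FILE_FORMAT = (TYPE = 'JSON')",
        # Clean up only this run's prefix, leaving other runs untouched.
        f"REMOVE {location}",
    ]


# Stand-in for the invocation id; all other names are hypothetical too.
invocation_id = str(uuid.uuid4())
for statement in statements_for_run(
    "artifacts_stage", "raw_artifacts", "target/manifest.json", invocation_id
):
    print(statement)
```

Because every statement points at the same per-run prefix, neither the load nor the remove can touch files that belong to another run.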

@NiallRees
Contributor

The last option is also in line with how SaaS data loaders work.

@alanmcruickshank
Contributor Author

Cool cool - will make PR

@alanmcruickshank
Contributor Author

I'm still getting this error on the current pre-release. The issue is that #112 solves the file upload but not the file read. The stage reference in the FROM clause also needs to be qualified. Will make a PR.
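
One way to keep the upload and the read in step is to derive the stage location once and reuse it in both statements. The Python sketch below only builds SQL strings. It reads "qualified" as both database/schema-qualified and scoped to the run's location, which is an assumption, and the database, schema, stage, and file format names are all hypothetical:

```python
def stage_location(database, schema, stage, invocation_id):
    """Return a fully qualified, run-scoped stage location."""
    return f"@{database}.{schema}.{stage}/{invocation_id}"


def upload_statement(location, local_file):
    """Build the upload statement for a local artifact."""
    return f"PUT file://{local_file} {location}"


def read_query(location, file_format):
    """Build a query that reads staged files from the same location."""
    return f"SELECT $1 AS data FROM {location} (FILE_FORMAT => '{file_format}')"


# Hypothetical names throughout; "run-1" stands in for the invocation id.
location = stage_location("analytics", "dbt_artifacts", "artifacts_stage", "run-1")
print(upload_statement(location, "target/manifest.json"))
print(read_query(location, "analytics.dbt_artifacts.json_format"))
```

Building the location in a single helper makes it harder for the upload and the FROM clause to drift apart again.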
