fix: fix handling of loading empty metadata file for queue #1042

Open
wants to merge 1 commit into
base: master

Mantisus
Collaborator

@Mantisus Mantisus commented Mar 3, 2025

Description

  • Error handling if an empty metadata file was created
  • Reduced the probability of creating empty metadata files
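A minimal sketch of what tolerating an empty metadata file on load could look like. `load_metadata` is a hypothetical helper used only for illustration; it is not the function changed in this PR, and treating an empty file as "no metadata" is an assumption.

```python
import json
from typing import Optional


def load_metadata(file_path: str) -> Optional[dict]:
    """Hypothetical helper: parse a metadata file, treating an empty one as missing."""
    with open(file_path, 'rb') as f:
        raw = f.read()

    # An empty (or whitespace-only) file is what makes json.loads raise
    # "json.decoder.JSONDecodeError: Expecting value".
    if not raw.strip():
        return None

    return json.loads(raw)
```

A caller could then recreate the metadata when `None` is returned instead of crashing at startup.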

Issues

@Mantisus Mantisus requested a review from vdusek March 3, 2025 13:45
@Mantisus Mantisus self-assigned this Mar 3, 2025
@Mantisus Mantisus requested a review from Pijukatel March 3, 2025 17:15
Contributor

@Pijukatel Pijukatel left a comment

Could you please add a test that captures this scenario?

@@ -51,7 +51,8 @@ async def persist_metadata_if_enabled(*, data: dict, entity_directory: str, writ

# Write the metadata to the file
file_path = os.path.join(entity_directory, METADATA_FILENAME)
f = await asyncio.to_thread(open, file_path, mode='wb')
mode = 'r+b' if os.path.exists(file_path) else 'wb'
Contributor

@Pijukatel Pijukatel Mar 4, 2025

If the file exists, we can still open it with "wb" mode. If we open it with "r+b" mode, we do not replace the whole file with the new content; we only overwrite bytes starting from the beginning of the file, and anything beyond the written length is left in place. I doubt that is what we want.

Imagine a file with content b"abc" that you want to change:

with open(path, 'r+b') as f:
    f.write(b"x") 

-> b"xbc"

with open(path, 'wb') as f:
    f.write(b"x") 

-> b"x"

Maybe I have missed the point here, but so far it does not seem right to me.
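A self-contained sketch that reproduces the difference described above on a temporary file. The helper names and the `metadata.bin` filename are made up for illustration. The third case, calling `truncate()` after writing in "r+b" mode, is an additional variant not proposed in the PR; it is shown only to make clear what "r+b" would need in order to behave like "wb".

```python
import os
import tempfile


def rewrite(path: str, mode: str, payload: bytes, truncate: bool = False) -> bytes:
    """Reset the file to b'abc', write payload with the given mode, return the result."""
    with open(path, 'wb') as f:
        f.write(b'abc')

    with open(path, mode) as f:
        f.write(payload)
        if truncate:
            # Cut the file at the current position, dropping leftover old bytes.
            f.truncate()

    with open(path, 'rb') as f:
        return f.read()


def demo() -> dict:
    """Compare 'r+b', 'wb', and 'r+b' followed by truncate() on the same starting file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'metadata.bin')
        return {
            'r+b': rewrite(path, 'r+b', b'x'),
            'wb': rewrite(path, 'wb', b'x'),
            'r+b+truncate': rewrite(path, 'r+b', b'x', truncate=True),
        }


if __name__ == '__main__':
    for label, content in demo().items():
        print(label, content)
```

With shorter new content, plain "r+b" leaves trailing bytes from the previous version, which for a JSON metadata file would likely produce invalid JSON.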

Successfully merging this pull request may close these issues.

error starting up crawlee - json.decoder.JSONDecodeError: Expecting value
2 participants