Migrate to boto3. Fix #43 #164

Merged 9 commits on Apr 2, 2018

Conversation

@mpenkov (Collaborator) commented Dec 22, 2017

Creating a new pull request to get around Travis CI's secure data issues.

@menshikh-iv (Contributor) left a comment

Looks good

conn = boto.connect_s3()
conn.create_bucket("mybucket")
s3 = boto3.resource('s3')
s3.create_bucket(Bucket='mybucket')
Contributor

Maybe it would be better to use a "global" bucket/key (as in https://github.com/RaRe-Technologies/smart_open/blob/master/smart_open/tests/test_s3.py#L16), plus the same maybe_mock_s3?

Collaborator Author

Done
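
A minimal sketch of what the suggestion above might look like: shared bucket and key names defined once at module level, plus a decorator that applies the S3 mock unless mocking is switched off. The environment variable name `SO_DISABLE_MOCKS` and the concrete bucket/key values are assumptions made for illustration, not taken from this PR. `moto.mock_s3` is the decorator that appears in this PR's diff (`@mock_s3`); newer moto releases may expose it under a different name.

```python
import os
import uuid

# Module-level ("global") names shared by every test in the module.
# The values are illustrative assumptions.
BUCKET_NAME = "test-smartopen-{}".format(uuid.uuid4().hex)
KEY_NAME = "test-key"


def maybe_mock_s3(func):
    """Wrap func in moto's S3 mock unless mocks are disabled.

    Setting SO_DISABLE_MOCKS=1 (a hypothetical variable name) returns
    func unchanged, so the same tests can run against a real bucket.
    """
    if os.environ.get("SO_DISABLE_MOCKS") == "1":
        return func
    # Third-party dependency, imported lazily so that the module can be
    # loaded without moto when mocks are disabled.
    import moto
    return moto.mock_s3(func)
```

A test method would then be decorated with `@maybe_mock_s3` instead of `@mock_s3`, and would refer to `BUCKET_NAME` and `KEY_NAME` rather than hard-coding `'mybucket'`.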

@@ -968,6 +975,9 @@ class S3OpenTest(unittest.TestCase):
    @mock_s3
    def test_r(self):
        """Reading a UTF string should work."""
        #
Contributor

What's your plan for tests like this one (marked with FIXME)?

@mpenkov (Collaborator, Author) commented Mar 16, 2018

@piskvorky @menshikh-iv Moving all tests across to boto3 requires moving some of our core functionality from boto to boto3. This affects two features:

  1. smart_open accepts boto keys. If we switch to boto3, we could either drop this feature, or silently substitute incoming boto objects with boto3 ones.
  2. s3_iter_bucket accepts a boto Bucket, and yields boto Keys. If we switch to boto3, we will have to do something about this feature, e.g. always accept and return boto3. We can also write two versions of the function, one for boto2 and one for boto3, but this is labor-intensive.

I can see four ways going forward. In my opinion, in order of decreasing subjective goodness:

  1. Stop using boto/boto3 in the interface, and use native datatypes only, i.e. strings, to identify resources. This decouples our API from boto and boto3 - which particular library we use to do the work becomes purely an implementation detail.
  2. Use boto3 only.
  3. Use boto3, and maintain the boto functionality separately.
  4. Keep things as is: use boto in the API, use boto3 under the covers.

I think 1) is a bit extreme, because it will surprise (anger) some people. 3) is poor, because it makes us do additional work to support a deprecated library, but marginally better than 4). I think 2) is a good compromise, because it isn't difficult to switch between boto and boto3 in application code.

What do you guys think?
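
To make option 1 concrete, here is a rough sketch of identifying an S3 resource by a plain string instead of a boto or boto3 object. The helper name `parse_s3_uri` is hypothetical and is not part of smart_open's API; it only illustrates how a string such as `s3://bucket/key` could carry the same information as a key object.

```python
from urllib.parse import urlsplit


def parse_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key).

    Hypothetical helper for illustration only. Keys containing '?' or
    '#' would need extra handling, because urlsplit treats those
    characters as query and fragment separators.
    """
    parts = urlsplit(uri)
    if parts.scheme != "s3":
        raise ValueError("expected an s3:// URI, got %r" % (uri,))
    bucket = parts.netloc
    key = parts.path.lstrip("/")
    if not bucket or not key:
        raise ValueError("URI must name both a bucket and a key: %r" % (uri,))
    return bucket, key
```

With an interface like this, callers pass only strings, and whether boto or boto3 does the work underneath stays an implementation detail.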

@menshikh-iv (Contributor) commented

Thanks for the detailed description, @mpenkov. I'm +1 for (2): we have wanted to migrate to boto3 for a long time (see #43), and I see no reason to support boto2 anymore. Moreover, boto is no longer actively developed (https://github.com/boto/boto points users to boto3).

@piskvorky (Owner) commented

I agree with @menshikh-iv.

.travis.yml Outdated
@@ -30,6 +30,8 @@ matrix:

install:
- pip install .[test]
- pip uninstall --yes botocore boto3
Contributor

It would be better to edit setup.py, as we discussed; I think that will work.

_MULTIPROCESSING = False
try:
    import multiprocessing.pool
    _MULTIPROCESSING = True
Contributor

Why might multiprocessing be unavailable?

Collaborator Author

I trust the above comment. It may be unavailable in certain environments, but I've never been in one.
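
For context, a guarded import like the one in the hunk above is usually paired with a serial fallback. The following is only a sketch under assumptions: the `except` branch, the warning text, and the helper `map_maybe_parallel` are hypothetical, and a thread pool is used here for simplicity, which may differ from what the project does.

```python
import warnings

_MULTIPROCESSING = False
try:
    import multiprocessing.pool
    _MULTIPROCESSING = True
except ImportError:
    # Assumed handling: degrade to serial processing instead of failing.
    warnings.warn("multiprocessing could not be imported and won't be used")


def map_maybe_parallel(func, items, workers=4):
    """Apply func to each item, using a pool when one is available.

    Hypothetical helper: falls back to a plain loop when the guarded
    import above failed.
    """
    if _MULTIPROCESSING:
        with multiprocessing.pool.ThreadPool(workers) as pool:
            return pool.map(func, items)
    return [func(item) for item in items]
```

Either branch returns results in input order, so callers do not need to know which path was taken.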

@@ -672,168 +615,6 @@ def write_callback(request):
assert responses.calls[3].request.url == "http://127.0.0.1:8440/file"


class S3IterBucketTest(unittest.TestCase):
Contributor

These tests no longer make sense, am I right?

@mpenkov (Collaborator, Author) commented Mar 18, 2018

Yes. Most of them moved to the s3 module. The others have lost their meaning with the rewrite.

@inksink mentioned this pull request Mar 30, 2018
@menshikh-iv (Contributor) commented

@mpenkov what's still missing here (in particular, what about cases where it is impossible to replace the old boto)? Is the current PR ready to merge?

@mpenkov (Collaborator, Author) commented Apr 2, 2018

@menshikh-iv Yes, I think it's ready.

@menshikh-iv changed the title from "use boto3 in tests wherever possible" to "Migrate to boto3. Fix #43" on Apr 2, 2018
@menshikh-iv (Contributor) commented

Great work @mpenkov 🔥

@menshikh-iv merged commit fd227d5 into master Apr 2, 2018
@piskvorky deleted the boto3 branch April 2, 2018 15:34