Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid credentials embedded in URL #169

Closed
maxnikulin opened this issue Aug 28, 2020 · 16 comments
Closed

Avoid credentials embedded in URL #169

maxnikulin opened this issue Aug 28, 2020 · 16 comments
Labels
enhancement New feature or request

Comments

@maxnikulin
Copy link
Contributor

It took quite time for me to realize how to avoid requests with explicit credentials in URL:

GET https://user:[email protected]:8080/?a=config&f=json&ts=1598577972

Such requests tend to appear in various logs causing leak of passwords. Additionally I do not consider keeping password in the extension options as a best practice.

As the first step, I believe that a warning should be added before user and password fields in the options page:

It is not recommended to fill user name and password here. Set browser master password if you did not do it yet and allow browser to keep credentials if you do no like to type login and password on each browser startup.

I was confused by "repeatedly" in the help remarks:

If the backend server is configured to require HTTP authorization, the password can be provided here to prevent being asked for repeatedly.

I expected that "repeatedly" means on any action, saved page access, etc, so I find it is confusing. I think, it is better to say "on each startup" and to add "(insecure, not recommended)" warning. I was unlucky enough to clear password field during experiments but keeping user name. It is the worst case when it is possible to work with PyWebScrapBook web interface but any action in extension repeatedly resets credentials cached by the browser.

The following will require significant amount of work.

I guess, these requests are performed to push password into the browser authorization cache. I think it is more secure to keep password in the browser built-in wallet where it can be protected by master password. I hope, I would not be wrong if I will say that exchange of the password to a session cookie and adding authorization header should be better approach. Token could be even used in next browser session till explicit logout. Certainly I assume limited token lifetime (1, 2, or 3 months for example). Clear text password should not be stored in extension settings at all. Login dialog should be shown in demand instead (certainly it should not be "repeatedly").

For extension context launchWebAuthFlow could be the next goal https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/identity/launchWebAuthFlow

To summarize my expectations,

  • A notice should be added to the options page
  • Session cookie and authentication token for HTTP header should be implemented to make storing of password unnecessary.
  • Password should not be stored in extension options, instead a temporary token should be stored in local storage.
@danny0838
Copy link
Owner

danny0838 commented Aug 29, 2020

1. According to some doc, HTTP user and password are only visible by the client and the server as long as SSL is not compromised. So I don't think it's a big issue.

2. I agree with you that saving password in extension storage is not save. We'll remove the user and password options.

3. I don't get your idea about a token. Currently the implemented token is for defending CSRF, not for authorization.

4. Implementing a full cookie based authorization requires a full design of login/logout page for the extension and web interface, and it's not downward compatible.

I have pushed the devel branch of PyWsb to support a cookie based session, so that the login status can persist after browser closing, with HTTP authorization being the fallback, and it will keep working even if the user disallows cookie. The only downside is that there's currently no logout interface, and the user has to remove cookie from browser options himself to logout. You can give it a try and see if there's something to improve.

Though, as per-book authorization is requested, we may eventually need a full implementation in the future.

@danny0838 danny0838 added the enhancement New feature or request label Aug 29, 2020
@maxnikulin
Copy link
Contributor Author

Just to check that you got my point right. If I do not specify login and password in the addon settings, everything works acceptable from my point of view. I have not tested chrome behavior. Firefox shows login prompt once after startup. If I allow storing of passwords, login and password fields are pre-filled. At first it was not clear for me that this is working option at all. That is why I asked for warnings at the settings page.

  1. Passwords should not appear in clear text (unless it is low-level network traffic dumps). That is why hashed passwords are usually stored.

  2. I mean authentication token (e.g. sessionID) unrelated to CSFR protection token, that is unlike password, is a temporary secret. E.g. in OAuth2 scheme client receives even 2 tokens: one for regular request and another token to refresh the former one.

  3. Which variant of compatibility are you worrying of: old extension and new backend or new extension with old backend? Let me repeat the idea of a backend API method that returns feature map. If extension detects new backend then it opens login page.

I see substantial refactoring of the code. It is great when first proof-of-concept variant is being replaced by better organized methods.

It seems that flask sessions are not suitable to keep user ID. The only way to immediately expire sessions after password change is to generate new application secret key (It still could be acceptable for a single-user server if documented). Some kind of persistence is required on the server side with possibility to select sessions that belong to a particular user. Flask session could remember authentication session ID in the server storage (file system, SQL).

As to per-book permissions, my opinion, it is another story. Book could have owner and access control list associated with roles. Particular roles could be assigned to users. Notice that authentication options are another properties of the user, not directly related to book access. In general, user could have more than one authentication variant: local password, logins through external services (google, facebook, etc.).

Nitpick: isn't it necessary to ensure that secret key file is not world-readable? Is it assumed that .wsb directory have proper permissions?

Do you expect any problem in the case of app.config['SESSION_COOKIE_SAMESITE'] = 'Strict'?

@danny0838
Copy link
Owner

It seems that flask sessions are not suitable to keep user ID. The only way to immediately expire sessions after password change is to generate new application secret key (It still could be acceptable for a single-user server if documented). Some kind of persistence is required on the server side with possibility to select sessions that belong to a particular user. Flask session could remember authentication session ID in the server storage (file system, SQL).

I haven't thought about session invalidation. If we want only a mechanism to invalidate a session when the password changes, we can simply include the (hashed) password in the session and re-validate it for requests.

As to per-book permissions, my opinion, it is another story. Book could have owner and access control list associated with roles. Particular roles could be assigned to users. Notice that authentication options are another properties of the user, not directly related to book access. In general, user could have more than one authentication variant: local password, logins through external services (google, facebook, etc.).

Yes. This is increasingly complicated and we'll put it as lower priority.

Nitpick: isn't it necessary to ensure that secret key file is not world-readable? Is it assumed that .wsb directory have proper permissions?

I don't think it's meaningful to restrict only the secret key file from reading by another user on the server, who is probably already able to access the whole scrapbook.

Also, there's nothing secret in the secret key file. If someone wants to protect passwords, he should protect the config file, if he's not satisfied by salted hashes.

If one really doesn't want other users to access his scrapbook, he can, and probably should, set permission for the whole scrapbook directory, or configure umask, by himself.

Do you expect any problem in the case of app.config['SESSION_COOKIE_SAMESITE'] = 'Strict'?

Samesite cookie can't work with WSB because a browser extension is run under an origin which is different from the backend server.

@maxnikulin
Copy link
Contributor Author

I am impressed by the idea to keep part of password hash in the session data. I am going to add more detailed comment in a few days.

Samesite cookie can't work with WSB because a browser extension is run under an origin which is different from the backend server.

If I didn't confuse something, Strict surprisingly works for extensions. Notice also firefox warning

Cookie “wsbsession” will be soon rejected because it has the “sameSite” attribute set to “none” or an invalid value, without the “secure” attribute. To know more about the “sameSite“ attribute, read https://developer.mozilla.org/docs/Web/HTTP/Headers/Set-Cookie/SameSite

At first I did not figured out that updated credentials are ignored in the case of valid session. It is not a problem to implement logout interface on any external site.

<!DOCTYPE html>
<html>
        <head>
                <meta charset="utf-8">
                <title>Relogin</title>
        </head>
        <body>
                <h1>Login clickjacking</h1>
                <a href="https://bad:[email protected]:8080">Just a link</a>
        </body>
</html>

Maybe it could be wrapped to an iframe and setIntervale() to convert it into annoyance generator.

@danny0838
Copy link
Owner

Samesite cookie can't work with WSB because a browser extension is run under an origin which is different from the backend server.

If I didn't confuse something, Strict surprisingly works for extensions. Notice also firefox warning

Cookie “wsbsession” will be soon rejected because it has the “sameSite” attribute set to “none” or an invalid value, without the “secure” attribute. To know more about the “sameSite“ attribute, read https://developer.mozilla.org/docs/Web/HTTP/Headers/Set-Cookie/SameSite

I was wrong about SameSite cookie. According to some doc it seems that SameSite cookies are sent by the browser if the requesting extension has host permission of the requested origin. A little test shows that this works in both Firefox and Chromium; as a result, we'll set SameSite cookie to 'Lax' (not 'strict' since it's still possible for the user to link to his scrapbook via a link somewhere).

At first I did not figured out that updated credentials are ignored in the case of valid session. It is not a problem to implement logout interface on any external site.

<!DOCTYPE html>
<html>
        <head>
                <meta charset="utf-8">
                <title>Relogin</title>
        </head>
        <body>
                <h1>Login clickjacking</h1>
                <a href="https://bad:[email protected]:8080">Just a link</a>
        </body>
</html>

Maybe it could be wrapped to an iframe and setIntervale() to convert it into annoyance generator.

Oops, it's a leak. We'll prevent session clearance if the provided http auth is not an effective login.

I am impressed by the idea to keep part of password hash in the session data. I am going to add more detailed comment in a few days.

I added a test-session branch for the updated session implementation. Feel free to test it.

@maxnikulin
Copy link
Contributor Author

The pause is becoming too long, so I decided to show you current state. In a spare time, have a look into a proof of concept of more flexible authentication
https://github.com/maxnikulin/PyWebScrapBook/tree/auth-plugin-session
I have not rebased it to the latest test-session branch. It is an early draft. I have tried to cut authorization into interchangeble parts. You could find approach as overengineered.

I really want to be able to use e.g. sessions stored in a SQL database to audit how many active sessions I currently have, even if such functionality is not included into the upstream package.

I could not say that plugin interface is already established. A couple of things is bothering me currently: I have not tried to implement authentication through redirection to a dedicated page and I am unsure that it is compatible with flask extensions intended for authentication. Certainly in current implementation I could miss holes that makes flexible configuration fragile in respect to authorization bypassing. At the same time I could impose over-restrictive requirements as well.

we'll set SameSite cookie to 'Lax' (not 'strict' since it's still possible for the user to link to his scrapbook via a link somewhere).

Do you mean a link from some external site to a private scrapbook? I am still afraid that Lax opens a door for modifications initiated by external sources. I may be wrong but Lax is suitable for read-only access. I am not confident that anti-CSRF tokens provides enough degree of protection.

@danny0838
Copy link
Owner

The pause is becoming too long, so I decided to show you current state. In a spare time, have a look into a proof of concept of more flexible authentication
https://github.com/maxnikulin/PyWebScrapBook/tree/auth-plugin-session
I have not rebased it to the latest test-session branch. It is an early draft. I have tried to cut authorization into interchangeble parts. You could find approach as overengineered.

I really want to be able to use e.g. sessions stored in a SQL database to audit how many active sessions I currently have, even if such functionality is not included into the upstream package.

I could not say that plugin interface is already established. A couple of things is bothering me currently: I have not tried to implement authentication through redirection to a dedicated page and I am unsure that it is compatible with flask extensions intended for authentication. Certainly in current implementation I could miss holes that makes flexible configuration fragile in respect to authorization bypassing. At the same time I could impose over-restrictive requirements as well.

The current implementation seems too complicated and likely over-kill. Currently it's not yet justified that a server side session is required, and, if we really want a server side session, we'd prefer using an existing library such as SQLAlchemy rather then reinventing a wheel, unless the dependency introduce a difficulty on installation or a significant performance penalty. Besides, we are also consider reworking token handler using SQLite rather than plain files for the same reason.

we'll set SameSite cookie to 'Lax' (not 'strict' since it's still possible for the user to link to his scrapbook via a link somewhere).

Do you mean a link from some external site to a private scrapbook? I am still afraid that Lax opens a door for modifications initiated by external sources. I may be wrong but Lax is suitable for read-only access. I am not confident that anti-CSRF tokens provides enough degree of protection.

If strict mode were used, it would be extremely inconvenience when a re-authentication is asked for every time when the user visits his already-logged-in scrapbook through a URL he had referenced other where, such as a note-taking app or a private document. Even so, Lax mode still introduces an inconvenience such as being unable to embed an image or include a page in an iframe, although such usage seems much less common.

According to spec, Lax mode allows non-scriptible GETs only, while any modifying request requires POST method, and we are also going to restrict token request to be POST-only. It seems safe enough—and probably over-restrictive—as the intention of samesite cookie is to protect against CSRF, which is already covered by the token system of PyWSB. We are even considering implementing an auto-detection of server protocol and response SameSite None + Secure when served over HTTPS and SameSite Lax otherwise, if there is no known side effect. As of it, we are not going to be more restrictive unless a good justification is provided.

@maxnikulin
Copy link
Contributor Author

Currently it's not yet justified that a server side session is required, and, if we really want a server side session, we'd prefer using an existing library such as SQLAlchemy rather then reinventing a wheel

I proposed extension point to avoid arguing if server-side sessions are required. For the first try users could run backend with no authorization at all. Further they are free to use simple client-side session or implement login through any external service accordingly to their desire.

For a couple of simple tables it does not matter if implementation will be based on SQL Alchemy or will use plain SQL queries. Maybe something has changed, but SQLAlchemy anyway did not have facilities to update database schema. On the other hand, if there is a flask extension for authorization and authentication that could be easily integrated to scrapbook backend, it would be great (likely it will depend on SQLAlchemy, but it does not matter). Personally I am not familiar with flask ecosystem and have no preferences for or against particular extensions.

Personally, I am not comfortable with client side sessions due to it is impossible to easily invalidate them if something suspicious is detected. I suspect that it is safer to immediately invalidate pre-auth session as soon as user is authenticated and normal session is created.

@danny0838
Copy link
Owner

I proposed extension point to avoid arguing if server-side sessions are required. For the first try users could run backend with no authorization at all. Further they are free to use simple client-side session or implement login through any external service accordingly to their desire.

For a couple of simple tables it does not matter if implementation will be based on SQL Alchemy or will use plain SQL queries. Maybe something has changed, but SQLAlchemy anyway did not have facilities to update database schema. On the other hand, if there is a flask extension for authorization and authentication that could be easily integrated to scrapbook backend, it would be great (likely it will depend on SQLAlchemy, but it does not matter). Personally I am not familiar with flask ecosystem and have no preferences for or against particular extensions.

Personally, I am not comfortable with client side sessions due to it is impossible to easily invalidate them if something suspicious is detected. I suspect that it is safer to immediately invalidate pre-auth session as soon as user is authenticated and normal session is created.

Fair point for an "extension point". Unfortunately we have too many things to work on and simply don't have time and resource for such "nice-to-have" features.

If we really implement a server-side session, we'll be obliged to maintain it. For an effective session management we probably need to also store lots of information, such as access time, client agent, client domain and/or IP, geo-location, etc. It would be increasingly complicated to make them configurable, as well as to handle a potential scheme change for both the SQL and code in the future.

Even client-based session needs consideration. Cookie based session is somehow not quite compatible with http authorization, which would be sent by the browser repeatedly after an auth is first established.

The auth methanism PyWSB provided is just for a quick-and-simple way to serve a scrapbook on a PC or NAS securely over the net for personal use. For a modern browser that is not compromised, HTTP authorization via SSL should be safe enough.

For a techie who hosts an expert production remote server, it's more likely that he needs a more robust authentication mechanism for the whole www root of the domain(s) rather than just a PyWSB app, and configuration in the reverse proxy or the upstream HTTP server is probably the way to go. As a result, any fancy auth implementation for PyWSB are likely unuseful.

Additionally, Python is an easy-to-extend language. For someone who really needs an expert authorization mechanism, write a Flask middleware to wrap around the PyWSB server in the .wsb/serve.py or .wsb/app.py. For multiple servers, write a package that extends PyWSB. If the current code structure is too difficult to extend, it's free to issue a pull request to improve it.

Overally speaking, we would put this to low priority, if not a wontfix.

@danny0838
Copy link
Owner

As the first step, I believe that a warning should be added before user and password fields in the options page:

It is not recommended to fill user name and password here. Set browser master password if you did not do it yet and allow browser to keep credentials if you do no like to type login and password on each browser startup.

I was confused by "repeatedly" in the help remarks:

If the backend server is configured to require HTTP authorization, the password can be provided here to prevent being asked for repeatedly.

I expected that "repeatedly" means on any action, saved page access, etc, so I find it is confusing. I think, it is better to say "on each startup" and to add "(insecure, not recommended)" warning. I was unlucky enough to clear password field during experiments but keeping user name. It is the worst case when it is possible to work with PyWebScrapBook web interface but any action in extension repeatedly resets credentials cached by the browser.

We are going to add such notice in the next release:

NOTE: Credentials stored in a browser extension is unprotected and could be stolen by a skilled attacker. Using password-protected built-in browser password manager is generally more preferable.

I don't think "repeatedly" would mean "any action"; additionally "on each startup" is also inaccurate. There are many cases that can be treated as a different session by the browser, such as a new incognito/private window, a new container in Firefox, or a different user in Chrome, or even more cases handled by an extension. As defining how "repeatedly" actually mean is difficult, I'd leave it for the user to test and determine.

Although saving credentials in extension is less secure, it won't be too much a hazard if a strong and application-only user and password is used under full awareness. After all, if an attacker really is able to access the browser, there would be many things he can steal besides this credential.

Saving credentials in extension is sometimes more preferable. For example, a user may not want to enable browser password manager. And Firefox password manager with master password still prompts for the master password for each new "session". As such, we would add a notice rather than remove the feature completely.

@maxnikulin
Copy link
Contributor Author

I have not seriously considered external auth (auth_request in nginx https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-subrequest-authentication/ or wsgi middleware) since I believe that currently authentication and authorization is too tightly coupled to the application. To take advantage of extensibility of python applications it is better to have extension points, otherwise it would be headache to rebase patches after each release.

For example, I consider format query argument as an internal implementation detail, however it is necessary to generate Unauthorized or Forbidden response in proper format. At the same time I believe that it is OK for the authentication application to respect Accept HTTP header.

Authorization is a more difficult question. I am unsure if access rules are stable and simple enough to make "external" auth aware of them. Rules like "PUT requires write access, read is enough for GET" are not suitable. Passing user ID to webscapbook backend may be a step forward. This case webscrapbook is responsible for authorization only and does not have to deal with authentication.

Are you against using of Accept instead of format? It is yet an idea, not a solid vision. To make things work smoothly it is necessary to teach extension to obtain login/logout URLs from backend (e.g. in config request) and to open the page on Forbidden response.

As to local-only solution, I have seen at least "I've running wsb on my web host" in an issue comment.

I consider cookie as a better solution for authentication since temporary secret is used. Passwords are "permanent" secrets to some degree since they are managed by humans. Currently it is suggested to use expensive enough hashing for stored data. E.g werkzeug.security.generate_password_hash uses pbkdf2. Example of recovering of password from simple hashes: https://blog.cynosureprime.com/2017/08/320-million-hashes-exposed.html But it could be too expensive to compute strong hash from basic authorization parameters sent by browser on each request.

On the extension side I am in doubts concerning web auth flow since it is not supported by firefox on android.

If session management is not hard-coded into the application, things such as geo-ip could be left to the particular users. If API is justified for external authorization, built-in client side session (with possibility to disable them) is enough from my point of view.

@danny0838
Copy link
Owner

I have not seriously considered external auth (auth_request in nginx https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-subrequest-authentication/ or wsgi middleware) since I believe that currently authentication and authorization is too tightly coupled to the application. To take advantage of extensibility of python applications it is better to have extension points, otherwise it would be headache to rebase patches after each release.

Not really. If a user simply wants a gross access control over https://hisdomain.example.com/*, an authorization mechanism on the upper server is probably good enough, and even preferable, as it's likely to perform better.

For example, I consider format query argument as an internal implementation detail, however it is necessary to generate Unauthorized or Forbidden response in proper format. At the same time I believe that it is OK for the authentication application to respect Accept HTTP header.

Well, any standard-conforming browser will be able to handle 401 and 403 response according to the header information. Message in the response body is additional and not mandatory.

Authorization is a more difficult question. I am unsure if access rules are stable and simple enough to make "external" auth aware of them. Rules like "PUT requires write access, read is enough for GET" are not suitable. Passing user ID to webscapbook backend may be a step forward. This case webscrapbook is responsible for authorization only and does not have to deal with authentication.

Yes, if detailed access control such as what actions a user can perform, configuration on PyWSB would be needed. However, it's not necessarily what most users need.

Are you against using of Accept instead of format? It is yet an idea, not a solid vision. To make things work smoothly it is necessary to teach extension to obtain login/logout URLs from backend (e.g. in config request) and to open the page on Forbidden response.

Using Accept header in place of format query parameter is an interesting idea, but it requires large code rework, and we'd have to investigate whether it's cost-worthy.

The extension will lose control if an extension page is redirected to an external web site, and thus the extension has to build all interface on its own, which is a headache and breaks downward compatibility, and is the reason we are extremely conservative about it.

As to local-only solution, I have seen at least "I've running wsb on my web host" in an issue comment.

I consider cookie as a better solution for authentication since temporary secret is used. Passwords are "permanent" secrets to some degree since they are managed by humans. Currently it is suggested to use expensive enough hashing for stored data. E.g werkzeug.security.generate_password_hash uses pbkdf2. Example of recovering of password from simple hashes: https://blog.cynosureprime.com/2017/08/320-million-hashes-exposed.html But it could be too expensive to compute strong hash from basic authorization parameters sent by browser on each request.

Although basic auth with SSL is not perfect, the security it provides is generally acceptable. Form and session based auth may be better, but I don't see too much gain implementing them. Considering other important issues we have to deal with and the resources we have, we'd put them to lower priority.

On the extension side I am in doubts concerning web auth flow since it is not supported by firefox on android.

I hate Mozilla forcing users to switch to Fx for Android 79 even though it's incomplete and doesn't support most addons. For now you can try on Firefox for Android < 79 or Kiwi browser (which is Chromium-based). Though I don't think there would be any big issue concerning credential security for them.

If session management is not hard-coded into the application, things such as geo-ip could be left to the particular users. If API is justified for external authorization, built-in client side session (with possibility to disable them) is enough from my point of view.

Well, the most flexible way for a session management mechanism is simple enough—left it to the particular users to extend PyWSB.

@maxnikulin
Copy link
Contributor Author

The extension will lose control if an extension page is redirected to an external web site

I have impression that login web page could postMessage to the extension to unblock pending actions. Though I have never tried it, so there might be complication as content security policy requiring enumerating pages allowed to send messages in the extension manifest.

Well, any standard-conforming browser will be able to handle 401 and 403 response according to the header information.

It is API calls from the extension that expect JSON response, that I am afraid of. Attempt of usual access to a page should be redirected to the login page, API call should receive Forbidden instead of redirection.

With HTTP basic authorization I do not see a way to protect user against relogin or login reset through a specially crafted link.

@danny0838
Copy link
Owner

With HTTP basic authorization I do not see a way to protect user against relogin or login reset through a specially crafted link.

How? A link or a form cannot carry a different credential. AJAX is also not possible for an external site.

@maxnikulin
Copy link
Contributor Author

maxnikulin commented Sep 13, 2020

Please, have a look at a draft of support of Accept header https://github.com/maxnikulin/PyWebScrapBook/tree/accept-format and https://github.com/maxnikulin/webscrapbook/tree/accept-format

How? A link or a form cannot carry a different credential. AJAX is also not possible for an external site.

#169 (comment)

@danny0838
Copy link
Owner

danny0838 commented Sep 13, 2020

Please, have a look at a draft of support of Accept header https://github.com/maxnikulin/PyWebScrapBook/tree/accept-format and https://github.com/maxnikulin/webscrapbook/tree/accept-format

How? A link or a form cannot carry a different credential. AJAX is also not possible for an external site.

#169 (comment)

I think we have been talking about security in this thread.

We don't have unlimited resource and we are not going to implement any new auth mechanism unless it's proven true that SSL + basic auth has a critical leak, at least in near future.

The accept issue should be discussed in another thread. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants