Merge branch 'master' of github.com:metabase/metabase into non-root-path-v2
tlrobinson committed Apr 27, 2017
2 parents 506006c + 2d1857d commit 4fe2354
Showing 136 changed files with 1,373 additions and 1,180 deletions.
64 changes: 64 additions & 0 deletions docs/administration-guide/01-managing-databases.md
@@ -88,6 +88,70 @@ You can also delete a database from the database list: hover over the row with t

**Caution: Deleting a database is irreversible! All saved questions and dashboard cards based on the database will be deleted as well!**

### SSH Tunneling In Metabase
---
Metabase can connect to some databases by first establishing a connection to a server that sits between Metabase and the data warehouse, then connecting to the data warehouse using that connection as a bridge. This makes it possible to reach data warehouses in situations that would otherwise prevent the use of Metabase.


#### When To Use This Feature
There are two basic cases for using an SSH tunnel rather than connecting directly:

* A direct connection is impossible
* A direct connection is forbidden due to a security policy

Sometimes when a data warehouse is inside an enterprise environment, direct connections are blocked by security devices such as firewalls and intrusion prevention systems. To work around this, many enterprises offer a VPN, a bastion host, or both. VPNs are the more convenient and reliable option, though bastion hosts are used frequently, especially with cloud providers such as Amazon Web Services, where VPCs (Virtual Private Clouds) don't allow direct connections. With a bastion host, you first connect to a computer on the edge of the protected network, then from that computer establish a second connection to the data warehouse on the internal network, essentially patching the two connections together. Using the SSH tunneling feature, Metabase is able to automate this process in many cases. If a VPN is available, it should be used in preference to SSH tunneling.

#### How To Use This Feature

When connecting through a bastion host:

* Answer yes to the "Use an SSH-tunnel for database connections" parameter.
* Enter the hostname for the data warehouse as it is seen from inside the network in the `Host` parameter.
* Enter the data warehouse port as seen from inside the network into the `Port` parameter.
* Enter the external name of the bastion host as seen from outside the network (or wherever you are) into the `SSH tunnel host` parameter.
* Enter the SSH port as seen from outside the network into the `SSH tunnel port` parameter. This is usually 22, regardless of which data warehouse you are connecting to.
* Enter the username and password you use to log in to the bastion host into the `SSH tunnel username` and `SSH tunnel password` parameters.

If you are unable to connect, test your SSH credentials by connecting to the SSH server (the bastion host) using ssh directly:

ssh <SSH tunnel username>@<SSH tunnel host> -p <SSH tunnel port>
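
The check above can be wrapped in a small helper so that the same three values typed into the Metabase form are reused verbatim. This is only a minimal sketch and not part of Metabase: the function name `ssh_check_command` and the example host and username are hypothetical. It prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of Metabase): build the ssh command used to
# verify the SSH tunnel credentials entered in the Metabase form.
# Arguments: <SSH tunnel username> <SSH tunnel host> [SSH tunnel port]
ssh_check_command() {
    user="$1"
    host="$2"
    port="${3:-22}"   # fall back to 22, the usual SSH port
    # Print the command rather than executing it.
    printf 'ssh %s@%s -p %s\n' "$user" "$host" "$port"
}

# Example with made-up values:
ssh_check_command analyst bastion.example.com 22
```

If the printed command logs you in when run by hand, the username, host, and port are correct, and any remaining problem is more likely in the `Host` and `Port` values for the data warehouse itself.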


Another common case where direct connections are not possible is when connecting to a data warehouse that is only accessible locally and does not allow remote connections. In this case you will be opening an SSH connection to the data warehouse, then from there connecting back to the same computer.

* Answer yes to the "Use an SSH-tunnel for database connections" parameter.
* Enter `localhost` in the `Host` parameter, because the database connection is made from the data warehouse host back to itself.
* Enter the same value in the `Port` parameter that you would use if you were sitting directly at the data warehouse host system.
* Enter the external name of the data warehouse, as seen from outside the network (or wherever you are), into the `SSH tunnel host` parameter.
* Enter the SSH port as seen from outside the network into the `SSH tunnel port` parameter. This is usually 22, regardless of which data warehouse you are connecting to.
* Enter the username and password you use to log in to the data warehouse host over SSH into the `SSH tunnel username` and `SSH tunnel password` parameters.

If you have problems connecting, verify the SSH host, port, and password by connecting manually using ssh, or PuTTY on older Windows systems.

#### Disadvantages to Indirect Connections

While an SSH tunnel makes it possible to use a data warehouse that is otherwise inaccessible, a direct connection is almost always preferable when one is available. Connecting through a tunnel has several inherent limitations:

* If the enclosing SSH connection is closed because you put your computer to sleep or change networks, all established connections will be closed as well. This can cause delays resuming connections after suspending your laptop.
* It's almost always slower. The connection has to go through an additional computer.
* Opening new connections takes longer. SSH connections are slower to establish than direct connections.
* Multiple operations over the same SSH tunnel can block each other. This can increase latency in some cases.
* The number of connections through a bastion host is often limited by organizational policy.
* Some organizations have IT security policies forbidding using SSH tunnels to bypass security perimeters.

#### What If the Built-In SSH Tunnels Don't Fit My Needs?

This feature exists as a convenient wrapper around SSH and automates the common cases of connecting through a tunnel. It also makes connecting possible from systems that don't have or allow shell access. Metabase uses a built-in SSH client that does not depend on the installed system's ssh client. This allows connecting from systems where it's not possible to run SSH manually, but it also means that Metabase cannot take advantage of authentication services provided by the system, such as Windows Domain Authentication or Kerberos Authentication.

If you need to connect using a method not enabled by Metabase, you can often accomplish this by running ssh directly:

ssh -Nf -L input-port:internal-server-name:port-on-server <SSH tunnel username>@<SSH tunnel host>

This allows you to use the full array of features included in ssh. If you find yourself doing this often, please let us know so we can see about making your process more convenient through Metabase.
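
As a rough illustration of the manual approach, the sketch below assembles that command from named values. Everything in it is an assumption made for illustration: the function name `tunnel_command`, the hosts, the username, and the ports are made up. The flags themselves are standard OpenSSH options: `-N` runs no remote command, `-f` sends ssh to the background after authentication, and `-L` sets up local port forwarding. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print the ssh command that forwards a local port
# to a database reachable only from the SSH host.
# Arguments: <local port> <database host as seen from the SSH host>
#            <database port> <ssh username> <ssh host> [ssh port]
tunnel_command() {
    local_port="$1"
    db_host="$2"
    db_port="$3"
    user="$4"
    ssh_host="$5"
    ssh_port="${6:-22}"   # fall back to 22, the usual SSH port
    printf 'ssh -Nf -L %s:%s:%s -p %s %s@%s\n' \
        "$local_port" "$db_host" "$db_port" "$ssh_port" "$user" "$ssh_host"
}

# Bastion host case: the database lives on another machine inside the network.
tunnel_command 15432 warehouse.internal 5432 analyst bastion.example.com

# Local-only case: the database accepts connections only from its own host,
# so the forward target is localhost as seen from that host.
tunnel_command 15432 localhost 5432 analyst warehouse.example.com
```

With a tunnel like this running on the same machine as Metabase, the database would be added with `localhost` as the `Host` and the local port (15432 in this made-up example) as the `Port`, leaving the built-in SSH tunnel option switched off.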


---

## Next: enabling features that send email
31 changes: 29 additions & 2 deletions frontend/src/metabase/components/DatabaseDetailsForm.jsx
@@ -23,6 +23,8 @@ const CREDENTIALS_URL_PREFIXES = {
googleanalytics: 'https://console.developers.google.com/apis/credentials/oauthclient?project=',
};

const isTunnelField = (field) => /^tunnel-/.test(field.name);

/**
* This is a form for capturing database details for a given `engine` supplied via props.
* The intention is to encapsulate the entire <form> with standard MB form styling and allow a callback
@@ -61,7 +63,10 @@ export default class DatabaseDetailsForm extends Component {

// go over individual fields
for (let field of engines[engine]['details-fields']) {
if (field.required && isEmpty(details[field.name])) {
// tunnel fields aren't required if tunnel isn't enabled
if (!details["tunnel-enabled"] && isTunnelField(field)) {
continue;
} else if (field.required && isEmpty(details[field.name])) {
valid = false;
break;
}
@@ -146,7 +151,29 @@ export default class DatabaseDetailsForm extends Component {
let { engine } = this.props;
window.ENGINE = engine;

if (field.name === "is_full_sync") {
if (field.name === "tunnel-enabled") {
let on = (this.state.details["tunnel-enabled"] == undefined) ? false : this.state.details["tunnel-enabled"];
return (
<FormField key={field.name} fieldName={field.name}>
<div className="flex align-center Form-offset">
<div className="Grid-cell--top">
<Toggle value={on} onChange={(val) => this.onChange("tunnel-enabled", val)}/>
</div>
<div className="px2">
<h3>Use an SSH-tunnel for database connections</h3>
<div style={{maxWidth: "40rem"}} className="pt1">
Some database installations can only be accessed by connecting through an SSH bastion host.
This option also provides an extra layer of security when a VPN is not available.
Enabling this is usually slower than a direct connection.
</div>
</div>
</div>
</FormField>
)
} else if (isTunnelField(field) && !this.state.details["tunnel-enabled"]) {
// don't show tunnel fields if tunnel isn't enabled
return null;
} else if (field.name === "is_full_sync") {
let on = (this.state.details.is_full_sync == undefined) ? true : this.state.details.is_full_sync;
return (
<FormField key={field.name} fieldName={field.name}>
1 change: 1 addition & 0 deletions project.clj
@@ -46,6 +46,7 @@
"v3-rev139-1.22.0"]
[com.google.apis/google-api-services-bigquery ; Google BigQuery Java Client Library
"v2-rev342-1.22.0"]
[com.jcraft/jsch "0.1.54"] ; SSH client for tunnels
[com.h2database/h2 "1.4.194"] ; embedded SQL database
[com.mattbertolini/liquibase-slf4j "2.0.0"] ; Java Migrations lib
[com.mchange/c3p0 "0.9.5.2"] ; connection pooling library
6 changes: 3 additions & 3 deletions reset_password/metabase/reset_password/core.clj
@@ -1,8 +1,8 @@
(ns metabase.reset-password.core
(:gen-class)
(:require [toucan.db :as db]
[metabase.db :as mdb]
[metabase.models.user :as user]))
(:require [metabase.db :as mdb]
[metabase.models.user :as user]
[toucan.db :as db]))

(defn- set-reset-token!
"Set and return a new `reset_token` for the user with EMAIL-ADDRESS."
16 changes: 9 additions & 7 deletions sample_dataset/metabase/sample_dataset/generate.clj
@@ -1,15 +1,17 @@
(ns metabase.sample-dataset.generate
"Logic for generating the sample dataset.
Run this with `lein generate-sample-dataset`."
(:require (clojure.java [io :as io]
[jdbc :as jdbc])
(:require [clojure.java
[io :as io]
[jdbc :as jdbc]]
[clojure.math.numeric-tower :as math]
[clojure.string :as s]
(faker [address :as address]
[company :as company]
[lorem :as lorem]
[internet :as internet]
[name :as name])
[faker
[address :as address]
[company :as company]
[internet :as internet]
[lorem :as lorem]
[name :as name]]
[incanter.distributions :as dist]
[metabase.db.spec :as dbspec]
[metabase.util :as u])
4 changes: 3 additions & 1 deletion src/metabase/driver.clj
@@ -32,7 +32,9 @@

(def ^:const connection-error-messages
"Generic error messages that drivers should return in their implementation of `humanize-connection-error-message`."
{:cannot-connect-check-host-and-port "Hmm, we couldn't connect to the database. Make sure your host and port settings are correct."
{:cannot-connect-check-host-and-port "Hmm, we couldn't connect to the database. Make sure your host and port settings are correct"
:ssh-tunnel-auth-fail "We couldn't connect to the ssh tunnel host. Check the username and password"
:ssh-tunnel-connection-fail "We couldn't connect to the ssh tunnel host. Check the hostname and port"
:database-name-incorrect "Looks like the database name is incorrect."
:invalid-hostname "It looks like your host is invalid. Please double-check it and try again."
:password-incorrect "Looks like your password is incorrect."
71 changes: 37 additions & 34 deletions src/metabase/driver/druid.clj
@@ -10,7 +10,8 @@
[metabase.models
[field :as field]
[table :as table]]
[metabase.sync-database.analyze :as analyze]))
[metabase.sync-database.analyze :as analyze]
[metabase.util.ssh :as ssh]))

;;; ### Request helper fns

@@ -30,8 +31,8 @@
(do-request http/get \"http://my-json-api.net\")"
[request-fn url & {:as options}]
{:pre [(fn? request-fn) (string? url)]}
(let [options (cond-> (merge {:content-type "application/json"} options)
(:body options) (update :body json/generate-string))
(let [options (cond-> (merge {:content-type "application/json"} options)
(:body options) (update :body json/generate-string))
{:keys [status body]} (request-fn url options)]
(when (not= status 200)
(throw (Exception. (format "Error [%d]: %s" status body))))
@@ -53,16 +54,17 @@

(defn- do-query [details query]
{:pre [(map? query)]}
(try (vec (POST (details->url details "/druid/v2"), :body query))
(catch Throwable e
;; try to extract the error
(let [message (or (u/ignore-exceptions
(:error (json/parse-string (:body (:object (ex-data e))) keyword)))
(.getMessage e))]
(ssh/with-ssh-tunnel [details-with-tunnel details]
(try (vec (POST (details->url details-with-tunnel "/druid/v2"), :body query))
(catch Throwable e
;; try to extract the error
(let [message (or (u/ignore-exceptions
(:error (json/parse-string (:body (:object (ex-data e))) keyword)))
(.getMessage e))]

(log/error (u/format-color 'red "Error running query:\n%s" message))
;; Re-throw a new exception with `message` set to the extracted message
(throw (Exception. message e))))))
(log/error (u/format-color 'red "Error running query:\n%s" message))
;; Re-throw a new exception with `message` set to the extracted message
(throw (Exception. message e)))))))


;;; ### Sync
@@ -76,24 +78,24 @@
:type/Text)})

(defn- describe-table [database table]
(let [details (:details database)
{:keys [dimensions metrics]} (GET (details->url details "/druid/v2/datasources/" (:name table) "?interval=1900-01-01/2100-01-01"))]
{:schema nil
:name (:name table)
:fields (set (concat
;; every Druid table is an event stream w/ a timestamp field
[{:name "timestamp"
:base-type :type/DateTime
:pk? true}]
(map (partial describe-table-field :dimension) dimensions)
(map (partial describe-table-field :metric) metrics)))}))
(ssh/with-ssh-tunnel [details-with-tunnel (:details database)]
(let [{:keys [dimensions metrics]} (GET (details->url details-with-tunnel "/druid/v2/datasources/" (:name table) "?interval=1900-01-01/2100-01-01"))]
{:schema nil
:name (:name table)
:fields (set (concat
;; every Druid table is an event stream w/ a timestamp field
[{:name "timestamp"
:base-type :type/DateTime
:pk? true}]
(map (partial describe-table-field :dimension) dimensions)
(map (partial describe-table-field :metric) metrics)))})))

(defn- describe-database [database]
{:pre [(map? (:details database))]}
(let [details (:details database)
druid-datasources (GET (details->url details "/druid/v2/datasources"))]
{:tables (set (for [table-name druid-datasources]
{:schema nil, :name table-name}))}))
(ssh/with-ssh-tunnel [details-with-tunnel (:details database)]
(let [druid-datasources (GET (details->url details-with-tunnel "/druid/v2/datasources"))]
{:tables (set (for [table-name druid-datasources]
{:schema nil, :name table-name}))})))


;;; ### field-values-lazy-seq
@@ -163,13 +165,14 @@
:analyze-table analyze-table
:describe-database (u/drop-first-arg describe-database)
:describe-table (u/drop-first-arg describe-table)
:details-fields (constantly [{:name "host"
:display-name "Host"
:default "http://localhost"}
{:name "port"
:display-name "Broker node port"
:type :integer
:default 8082}])
:details-fields (constantly (ssh/with-tunnel-config
[{:name "host"
:display-name "Host"
:default "http://localhost"}
{:name "port"
:display-name "Broker node port"
:type :integer
:default 8082}]))
:execute-query (fn [_ query] (qp/execute-query do-query query))
:features (constantly #{:basic-aggregations :set-timezone :expression-aggregations})
:field-values-lazy-seq (u/drop-first-arg field-values-lazy-seq)
24 changes: 15 additions & 9 deletions src/metabase/driver/generic_sql.clj
@@ -16,7 +16,9 @@
[field :as field]
[table :as table]]
[metabase.sync-database.analyze :as analyze]
[metabase.util.honeysql-extensions :as hx])
[metabase.util
[honeysql-extensions :as hx]
[ssh :as ssh]])
(:import [clojure.lang Keyword PersistentVector]
com.mchange.v2.c3p0.ComboPooledDataSource
[java.sql DatabaseMetaData ResultSet]
@@ -141,13 +143,15 @@
"Create a new C3P0 `ComboPooledDataSource` for connecting to the given DATABASE."
[{:keys [id engine details]}]
(log/debug (u/format-color 'magenta "Creating new connection pool for database %d ..." id))
(let [spec (connection-details->spec (driver/engine->driver engine) details)]
(db/connection-pool (assoc spec
:minimum-pool-size 1
;; prevent broken connections closed by dbs by testing them every 3 mins
:idle-connection-test-period (* 3 60)
;; prevent overly large pools by condensing them when connections are idle for 15m+
:excess-timeout (* 15 60)))))
(let [details-with-tunnel (ssh/include-ssh-tunnel details) ;; If the tunnel is disabled, the details are returned unchanged
spec (connection-details->spec (driver/engine->driver engine) details-with-tunnel)]
(assoc (db/connection-pool (assoc spec
:minimum-pool-size 1
;; prevent broken connections closed by dbs by testing them every 3 mins
:idle-connection-test-period (* 3 60)
;; prevent overly large pools by condensing them when connections are idle for 15m+
:excess-timeout (* 15 60)))
:ssh-tunnel (:tunnel-connection details-with-tunnel))))

(defn- notify-database-updated
"We are being informed that a DATABASE has been updated, so lets shut down the connection pool (if it exists) under
@@ -158,7 +162,9 @@
;; remove the cached reference to the pool so we don't try to use it anymore
(swap! database-id->connection-pool dissoc id)
;; now actively shut down the pool so that any open connections are closed
(.close ^ComboPooledDataSource (:datasource pool))))
(.close ^ComboPooledDataSource (:datasource pool))
(when-let [ssh-tunnel (:ssh-tunnel pool)]
(.disconnect ^com.jcraft.jsch.Session ssh-tunnel))))

(defn db->pooled-connection-spec
"Return a JDBC connection spec that includes a cp30 `ComboPooledDataSource`.