Skip to content

Commit

Permalink
Merge branch 'jq/remodel'
Browse files Browse the repository at this point in the history
  • Loading branch information
quinnj committed May 3, 2016
2 parents a36ef63 + 307d11e commit 7ae7ebf
Show file tree
Hide file tree
Showing 22 changed files with 7,187 additions and 2,754 deletions.
16 changes: 13 additions & 3 deletions .travis.yml
Original file line number Diff line number Diff line change
@@ -1,10 +1,13 @@
language: julia

os:
- osx
- linux
julia:
- 0.3
- nightly
- 0.4

notifications:
email: false
email: false

services:
- mysql
Expand All @@ -14,3 +17,10 @@ before_install:
- sudo apt-get install libmyodbc libsqliteodbc unixodbc unixodbc-dev
- sudo odbcinst -i -d -f /usr/share/libmyodbc/odbcinst.ini
- sudo odbcinst -i -s -l -f ./test/odbc.ini

script:
- if [[ -a .git/shallow ]]; then git fetch --unshallow; fi
- julia -e 'Pkg.clone(pwd()); Pkg.build("ODBC"); Pkg.test("ODBC"; coverage=true)';
after_success:
- julia -e 'cd(Pkg.dir("ODBC")); Pkg.add("Coverage"); using Coverage; Coveralls.submit(Coveralls.process_folder())';
# - julia -e 'cd(Pkg.dir("ODBC")); Pkg.add("Coverage"); using Coverage; Codecov.submit(Codecov.process_folder())';
175 changes: 30 additions & 145 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,163 +1,48 @@
ODBC.jl
=======
[![Build Status](https://travis-ci.org/JuliaDB/ODBC.jl.png)](https://travis-ci.org/JuliaDB/ODBC.jl)
[![ODBC](http://pkg.julialang.org/badges/ODBC_0.4.svg)](http://pkg.julialang.org/?pkg=ODBC&ver=0.4)
[![ODBC](http://pkg.julialang.org/badges/ODBC_0.5.svg)](http://pkg.julialang.org/?pkg=ODBC&ver=0.5)

A low-level ODBC interface for the Julia programming language
Linux: [![Build Status](https://travis-ci.org/JuliaDB/ODBC.jl.svg?branch=master)](https://travis-ci.org/JuliaDB/ODBC.jl)

Windows: [![Build Status](https://ci.appveyor.com/api/projects/status/github/JuliaDB/ODBC.jl?branch=master&svg=true)](https://ci.appveyor.com/project/JuliaDB/odbc-jl/branch/master)

An ODBC interface for the Julia programming language

Installation through the Julia package manager:
```julia
julia> Pkg.init() # Creates julia package repository (only runs once for all packages)
julia> Pkg.add("ODBC") # Creates the ODBC repo folder and downloads the ODBC package + dependancy (if needed)
julia> using ODBC # Loads the ODBC module for use (needs to be run with each new Julia instance)
```
## Package Documentation
Exported functions, macros, types, and variables include:
#### Functions
* `ODBC.connect(dsn; usr="", pwd="")`

`ODBC.connect` requires the `dsn` string argument as the name of a pre-defined ODBC
datasource. Valid datasources (DSNs) must first be setup through the ODBC
administrator (or IODBC, unixODBC, etc.) prior to connecting in Julia. Note the use of `ODBC.` before `connect`, this is to prevent method ambiguity with the `Base.connect` family of methods.

The `usr` and `pwd` named arguments are optional as they may already
be defined in the datasource.

`ODBC.connect` returns a `Connection` type which contains basic information
about the connection and ODBC handle pointers.

`ODBC.connect` can be used by storing the `Connection` type in
a variable to be able to disconnect or facilitate handling multiple
connections like so:
```julia
co = ODBC.connect("mydatasource",usr="johndoe",pwd="12345")
```
But it's unneccesary to store the `Connection`, as an exported
`conn` variable holds the most recently created `Connection` type and other
ODBC functions (i.e. `query`) will use it by default in the absence of a specified
connection.

* `disconnect(connection::Connection=conn)`

`disconnect` closes a connection type, frees all handles and resets
the default connection `conn` as necessary. If invoked with no arguments
(i.e. `disconnect()`), the default connection `conn` is closed.

* `advancedconnect(conn_string::String)`

`advancedconnect` implements the native ODBC SQLDriverConnect function
which allows flexibility in connecting to a datasource through specifying
a 'connection string' (e.g. "DSN=userdsn;UID=johnjacob;PWD=jingle;") See
ODBC API documentation (http://goo.gl/uXTuk) for additional details.

If the connection string doesn't contain enough information for the driver
to connect, the user will be prompted with the additional information
needed.

Furthermore, on Windows, if `advancedconnect()` is called without arguments the ODBC
administrator will be brought up where the user can select the DSN to
which to connect, even allowing the user to create a datasource or add a
driver.

(An excellent resource for learning how to construct connection strings
for various DBMS/driver configurations is
http://www.connectionstrings.com/)

* `query(connection::Connection=conn, querystring; file=:DataFrame,delim='\t')`

If a connection type isn't specified as the first positional argument, the query will be executed against
the default connection (stored in the exported variable `conn` if you'd like to
inspect).

Once the query is executed, the resultset is stored in a
`DataFrame` by default (`file=:DataFrame`). Otherwise, the user may specify
a file name to which the resultset is to be written, along with the desired
file delimiter (default `delim='\t'`). Depending on DBMS capability, users may also
pass multiple query statements in a single query call and the resultsets
will be returned in an array of DataFrames, or the user may specify an array
of filename strings and `Char` delimiters into which the results will be written.

For the general user, a simple `query(querystring)` is enough to return a single
resultset in a DataFrame. Results are stored in the passed connection type's resultset field.
(i.e. `conn.resultset`). Results are stored by default to avoid immediate garbarge collection
and provide access for the user even if the resultset returned by `query()` isn't stored in a variable.

* `querymeta(conn::Connection=conn, querystring; file=:DataFrame,delim='\t')`

`querymeta` is really just the 1st half of the `query` function. A query string is sent to the DBMS, executed,
and metadata (i.e. rows, columns, types, column names, etc.) is returned to the user, avoiding actually returning
the dataset. The returned information is actually stored in the `Metadata` type, so the information may be
programmatically examined (try running `names(Metadata)` to see its fields). Running `querymeta` is useful
for inspecting the results of large queries while avoiding the overhead of returning the actual dataset into memory.
The function signature is identical to `query` for ease in switching between the two though `querymeta` ignores
the `file` and `delim` arguments.

* `listdrivers()`

Takes no arguments. Returns a list of installed ODBC drivers registered in the ODBC administator (IODBC, unixODBC, etc.).
* `listdsns()`
Basic Usage:
```julia
using ODBC

Takes no arguments. Returns a list of defined datasources (DSNs) registered in the ODBC administator
(IODBC, unixODBC, etc.). The datasource names can be used as the 1st argument in `ODBC.connect(dsn)`.
#### Macros
* `sql"..."`
# list installed ODBC drivers
ODBC.listdrivers()
# list pre-defined ODBC DSNs
ODBC.listdsns()

`sql"..."` is a Julia string literal implemented by the `@sql_str` macro. It is equivalent to calling
`query(querystring)` as you can see from the actual definition below:
```julia
macro sql_str(s)
query(s)
end
```
#### Types
* `Connection`
# connect to a DSN using a pre-defined DSN or custom connection string
dsn = ODBC.DSN("pre_defined_DSN","username","password")

Stores information about a DSN connection. Names include `dsn`, `number` (counts `Connection` types specific
to each DSN), `dbc_ptr` and `stmt_ptr` as internal connection and statement handle pointers, and `resultset` which
stores the last resultset returned from a `query` or `querymeta` call.
# Basic a basic query that returns results at a Data.Table by default
dbs = ODBC.query(dsn, "show databases")

* `Metadata`
Stores information about an executed query, returned by `querymeta`. Names include `querystring` (the query sent to
be executed), `cols` (# of columns in resultset), `rows` (# of rows in resultset), `colnames` (column names to be
returned in resultset), `coltypes` (SQL types of resultset columns), `colsizes` (size in bytes of resultset columns),
`coldigits` (max number of digits for numeric resultset columns; though not always implemented correctly by ODBC driver),
and `colnulls` (whether the resultset column is nullable).
#### Variables
* `conn`
Global, exported variable that initially holds a null `Connection` type until a connection is successfully made by
`ODBC.connect` or `advancedconnect`. Is used by `query` and `querymeta` as the default datasource `Connection` if none is
explicitly specified.
* `Connections`
Global, exported variable of type `Array{Connection,1}`, that holds `Connection` types. When multiple calls to `ODBC.connect`
or `advancedconnect` are made, `Connections` stores each `Connection` type to manage the number of DSN connections.
It is also referenced when the default connection `conn` is disconnected and reset to `Connections[end]` if other
connections exist, or a null `Connection` type otherwise.
# Execute a query without returning results
ODBC.execute!(dsn, "use mydb")

### Known Issues
* We've had limited ODBC testing between various platforms, so it may happen that `ODBC.jl` doesn't recognize your
ODBC shared library (also know as the Driver Manager, basically the middleman between `ODBC.jl` and the RDBMS).
The current approach is to check a variety of the most widely used ODBC libraries and produce an error if not found.
If this happens, you'll need to manually locate your ODBC shared library (searching for something along the lines of
`libodbc` or `libiodbc`, or installing it if you haven't yet) and then run the following:
```julia
const odbc_dm = "path/to/library/libodbc.so" (or .dylib on OSX)
```
# return query results as a CSV file
csv = CSV.Sink("mydb_tables.csv")
data = ODBC.query(dsn, "select table_name from information_schema.tables", csv);

*Note that the file is `odbc32` on Windows, but should never have a problem being found (ships by default).
That said, if you end up doing this, open an issue on GitHub to let me know what the name of your ODBC library is
and I can add is as one of the defaults to check for.
# return query results in an SQLite table
db = SQLite.DB()
source = ODBC.Source(dsn, "select table_name from information_schema.tables")
sqlite = SQLite.Sink(source, db)
Data.stream!(source, sqlite)
```

### TODO
* Create SQL typealiases and use in conjunction with Julia-C typealiases for ODBC_API (for more transparency and because we can)
* Metadata tools: This would involve specilized queries for examining DBMS schema, tables, views, columns, with
associated metadata and possibly statistics. I know the driver managers support SQLTables and SQLStatistics,
so it should be pretty simple to implement these.
* Create, Update Table functions (also auto-detect regular queries as these kinds of DDL queries): Pretty self-explanatory.
* Support more SQL data types: Date, Time, Intervals. Right now, all main bitstypes, character and binary formats
(short, long, float, double, char, etc.) are supported, but the date and time data types are read as strings.
Other implementations in C use structs to read them in and Julia is still fragile on struct support as far as I know.
As Julia struct compatibility improves it's an eventual (I think RODBC package still only reads dates as strings...)
* Asynchronous querying: This might be a longshot, but the later ODBC API supports async querying through polling,
so it would be cool to find a way to implement this. I'm not sure how useful it would be long term or exactly how it
would be implemented (Call asyncquery() and then later call querydone() to see if it's finished?), but because the
underlying api is capable this could be some cool functionality.
* How to deal with Unicode/ANSI function calling? (I think we're ok here, but not sure)
Use the automatic help mode for more information on package types/functions.
10 changes: 7 additions & 3 deletions REQUIRE
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
DataFrames v0.6.10
Dates
julia 0.3
julia 0.4
Compat
NullableArrays
DataStreams
CSV
SQLite
DecFP
Loading

0 comments on commit 7ae7ebf

Please sign in to comment.