SBT Support (#21)
* clean up and formatting

* formatting testcases

* upgraded to scala 2.11.8

* update readme

* Clean up .gitignore file, remove tf folder inside test folder

* Rename spark-tf-core to core, and update all references

* Remove core module, add License file and make pom changes

* Renaming namespace, update all files with new namespace

* Fix custom schema, correct pom

* update readme

* update readme

* add sbt build files

* Add conversion from mvn to sbt (#15)

* Add classifier to bring in correct shaded jar and class

* Add classifier to bring in correct shaded jar and class (#16)

* Add travis.yml file

* Refactor travis file

* Refactor travis file

* Update README.md

* Add Travis support to sbt branch (#17)

* Add classifier to bring in correct shaded jar and class

* Add travis.yml file

* Refactor travis file

* Refactor travis file

* Update README.md

* Cleanup

* Remove central1 dependency in sbt and sudo requirement from travis.yml (#18)

* Add classifier to bring in correct shaded jar and class

* Add travis.yml file

* Refactor travis file

* Refactor travis file

* Update README.md

* Cleanup

* SBT working, Cleaned up (#19)

* Add conversion from mvn to sbt

* Clean up for sbt

* Add exclude jars to build.sbt and update readme

* use filterNot

* Refactor to use filterNot (#20)

* Add classifier to bring in correct shaded jar and class

* Add travis.yml file

* Refactor travis file

* Refactor travis file

* Update README.md

* Cleanup

* use filterNot

* Add sbt-spark-package plugin support
joyeshmishra authored and karthikvadla committed Feb 16, 2017
1 parent 72e9343 commit e7d123e
Showing 6 changed files with 99 additions and 1 deletion.
2 changes: 2 additions & 0 deletions .gitignore
@@ -6,3 +6,5 @@ target
tf-sandbox
spark-warehouse/
metastore_db/
project/project/
test-output.tfr
21 changes: 21 additions & 0 deletions .travis.yml
@@ -0,0 +1,21 @@
language: scala

# Cache settings here are based on latest SBT documentation.
cache:
directories:
- $HOME/.ivy2/cache
- $HOME/.sbt/boot/

before_cache:
# Tricks to avoid unnecessary cache updates
- find $HOME/.ivy2 -name "ivydata-*.properties" -delete
- find $HOME/.sbt -name "*.lock" -delete

scala:
- 2.11.8

jdk:
- oraclejdk8

script:
- sbt ++$TRAVIS_SCALA_VERSION clean publish-local
19 changes: 18 additions & 1 deletion README.md
@@ -1,3 +1,5 @@
[![Build Status](https://travis-ci.org/tapanalyticstoolkit/spark-tensorflow-connector.svg?branch=sbt)](https://travis-ci.org/tapanalyticstoolkit/spark-tensorflow-connector)

# spark-tensorflow-connector

This repo contains a library for loading and storing TensorFlow records with [Apache Spark](http://spark.apache.org/).
@@ -19,19 +21,34 @@ None.
2. [Apache Maven](https://maven.apache.org/)

## Building the library
Build the library using Maven as shown below.
You can build the library using either Maven or SBT.

#### Maven
Build the library using Maven (3.3) as shown below:

```sh
mvn clean install
```

#### SBT
Build the library using SBT (0.13.13) as shown below:
```sh
sbt clean assembly
```

## Using Spark Shell
Run this library in Spark using the `--jars` command line option in `spark-shell` or `spark-submit`. For example:

Maven Jars
```sh
$SPARK_HOME/bin/spark-shell --jars target/spark-tensorflow-connector-1.0-SNAPSHOT.jar,target/lib/tensorflow-hadoop-1.0-01232017-SNAPSHOT-shaded-protobuf.jar
```

SBT Jars
```sh
$SPARK_HOME/bin/spark-shell --jars target/scala-2.11/spark-tensorflow-connector-assembly-1.0-SNAPSHOT.jar
```

The following code snippet demonstrates usage.

```scala
// (usage snippet collapsed in this diff view)
```
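For orientation, a hedged sketch of what such usage typically looks like, not the verbatim README snippet. It assumes the connector registers the data source short name `tensorflow` and runs inside a `spark-shell` launched with the jars above, where `spark` is the active session:

```scala
// Hypothetical usage sketch; the "tensorflow" format name is an assumption.
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val path = "test-output.tfr" // matches the path added to .gitignore above

val schema = StructType(List(
  StructField("id", IntegerType),
  StructField("label", FloatType)))
val rows = Seq(Row(1, 10.0f), Row(2, 12.0f))
val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

// Write the DataFrame out as TensorFlow records, then read it back.
df.write.format("tensorflow").save(path)
val imported = spark.read.format("tensorflow").load(path)
imported.show()
```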
52 changes: 52 additions & 0 deletions build.sbt
@@ -0,0 +1,52 @@
scalaVersion in Global := "2.11.8"

def ProjectName(name: String, path: String): Project = Project(name, file(path))

resolvers in Global ++= Seq("https://tap.jfrog.io/tap/public" at "https://tap.jfrog.io/tap/public" ,
"https://tap.jfrog.io/tap/public-snapshots" at "https://tap.jfrog.io/tap/public-snapshots" ,
"https://repo.maven.apache.org/maven2" at "https://repo.maven.apache.org/maven2" )

val `junit_junit` = "junit" % "junit" % "4.12"

val `org.apache.hadoop_hadoop-yarn-api` = "org.apache.hadoop" % "hadoop-yarn-api" % "2.7.3"

val `org.apache.spark_spark-core_2.11` = "org.apache.spark" % "spark-core_2.11" % "2.1.0"

val `org.apache.spark_spark-sql_2.11` = "org.apache.spark" % "spark-sql_2.11" % "2.1.0"

val `org.apache.spark_spark-mllib_2.11` = "org.apache.spark" % "spark-mllib_2.11" % "2.1.0"

val `org.scalatest_scalatest_2.11` = "org.scalatest" % "scalatest_2.11" % "2.2.6"

val `org.tensorflow_tensorflow-hadoop` = "org.tensorflow" % "tensorflow-hadoop" % "1.0-01232017-SNAPSHOT"


spName := "spark-tensorflow-connector"

sparkVersion := "2.1.0"

sparkComponents ++= Seq("sql", "mllib")

spIgnoreProvided := true

version := "1.0-SNAPSHOT"

name := "spark-tensorflow-connector"

organization := "org.trustedanalytics"

libraryDependencies in Global ++= Seq(`org.tensorflow_tensorflow-hadoop` classifier "shaded-protobuf",
`org.scalatest_scalatest_2.11` % "test" ,
`org.apache.spark_spark-sql_2.11` % "provided" ,
`org.apache.spark_spark-mllib_2.11` % "test" classifier "tests",
`org.apache.spark_spark-core_2.11` % "provided" ,
`org.apache.hadoop_hadoop-yarn-api` % "provided" ,
`junit_junit` % "test" )

assemblyExcludedJars in assembly := {
val cp = (fullClasspath in assembly).value
cp filterNot {x => List("spark-tensorflow-connector-1.0-SNAPSHOT.jar",
"tensorflow-hadoop-1.0-01232017-SNAPSHOT-shaded-protobuf.jar").contains(x.data.getName)}
}
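The `filterNot` above inverts the keep-list: every classpath jar whose file name is not in the two-element list becomes an excluded jar, so only the connector itself and the shaded `tensorflow-hadoop` jar land in the assembly. A plain-Scala illustration of that logic, with jar file names standing in for sbt's `Attributed[File]` classpath entries:

```scala
// Plain-Scala illustration of the assemblyExcludedJars logic above.
// Jar file names stand in for full sbt classpath entries.
val classpath = List(
  "spark-tensorflow-connector-1.0-SNAPSHOT.jar",
  "tensorflow-hadoop-1.0-01232017-SNAPSHOT-shaded-protobuf.jar",
  "spark-core_2.11-2.1.0.jar",
  "scalatest_2.11-2.2.6.jar")
val keepInAssembly = List(
  "spark-tensorflow-connector-1.0-SNAPSHOT.jar",
  "tensorflow-hadoop-1.0-01232017-SNAPSHOT-shaded-protobuf.jar")

// filterNot keeps entries for which the predicate is false, so the
// result is every jar NOT named in the keep-list, i.e. the excluded set.
val excluded = classpath.filterNot(name => keepInAssembly.contains(name))
println(excluded)
```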

licenses := Seq("Apache License 2.0" -> url("http://www.apache.org/licenses/LICENSE-2.0.html"))
1 change: 1 addition & 0 deletions project/build.properties
@@ -0,0 +1 @@
sbt.version=0.13.13
5 changes: 5 additions & 0 deletions project/plugins.sbt
@@ -0,0 +1,5 @@
resolvers += "bintray-spark-packages" at "https://dl.bintray.com/spark-packages/maven/"

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.5")
