This adds a default_python project based on the template of databricks/cli#686:

```
The 'default_python' project was generated by using the default-python template.
```

It also adds a LICENSE and removes the original stub example.
1 parent 412e180, commit d91afdb. Showing 22 changed files with 505 additions and 35 deletions.
@@ -0,0 +1,51 @@
DB license

Copyright (2022) Databricks, Inc.

Definitions.

Agreement: The agreement between Databricks, Inc., and you governing the use of the Databricks Services, which shall
be, with respect to Databricks, the Databricks Terms of Service located at www.databricks.com/termsofservice, and with
respect to Databricks Community Edition, the Community Edition Terms of Service located at
www.databricks.com/ce-termsofuse, in each case unless you have entered into a separate written agreement with
Databricks governing the use of the applicable Databricks Services.

Software: The source code and object code to which this license applies.

Scope of Use. You may not use this Software except in connection with your use of the Databricks Services pursuant to
the Agreement. Your use of the Software must comply at all times with any restrictions applicable to the Databricks
Services, generally, and must be used in accordance with any applicable documentation. You may view, use, copy,
modify, publish, and/or distribute the Software solely for the purposes of using the code within or connecting to the
Databricks Services. If you do not agree to these terms, you may not view, use, copy, modify, publish, and/or
distribute the Software.

Redistribution. You may redistribute and sublicense the Software so long as all use is in compliance with these terms.
In addition:

You must give any other recipients a copy of this License;
You must cause any modified files to carry prominent notices stating that you changed the files;
You must retain, in the source code form of any derivative works that you distribute, all copyright, patent,
trademark, and attribution notices from the source code form, excluding those notices that do not pertain to any part
of the derivative works; and
If the source code form includes a "NOTICE" text file as part of its distribution, then any derivative works that you
distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those
notices that do not pertain to any part of the derivative works.
You may add your own copyright statement to your modifications and may provide additional license terms and conditions
for use, reproduction, or distribution of your modifications, or for any such derivative works as a whole, provided
your use, reproduction, and distribution of the Software otherwise complies with the conditions stated in this
License.

Termination. This license terminates automatically upon your breach of these terms or upon the termination of your
Agreement. Additionally, Databricks may terminate this license at any time on notice. Upon termination, you must
permanently delete the Software and all copies thereof.

DISCLAIMER; LIMITATION OF LIABILITY.

THE SOFTWARE IS PROVIDED “AS-IS” AND WITH ALL FAULTS. DATABRICKS, ON BEHALF OF ITSELF AND ITS LICENSORS, SPECIFICALLY
DISCLAIMS ALL WARRANTIES RELATING TO THE SOURCE CODE, EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, IMPLIED
WARRANTIES, CONDITIONS AND OTHER TERMS OF MERCHANTABILITY, SATISFACTORY QUALITY OR FITNESS FOR A PARTICULAR PURPOSE,
AND NON-INFRINGEMENT. DATABRICKS AND ITS LICENSORS TOTAL AGGREGATE LIABILITY RELATING TO OR ARISING OUT OF YOUR USE OF
OR DATABRICKS’ PROVISIONING OF THE SOURCE CODE SHALL BE LIMITED TO ONE THOUSAND ($1,000) DOLLARS. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
@@ -0,0 +1,9 @@

.databricks/
build/
dist/
__pycache__/
*.egg-info
.venv/
scratch/**
!scratch/README.md
@@ -0,0 +1,3 @@
# Typings for Pylance in Visual Studio Code
# see https://github.com/microsoft/pyright/blob/main/docs/builtins.md
from databricks.sdk.runtime import *
@@ -0,0 +1,7 @@
{
  "recommendations": [
    "databricks.databricks",
    "ms-python.vscode-pylance",
    "redhat.vscode-yaml"
  ]
}
@@ -0,0 +1,14 @@
{
  "python.analysis.stubPath": ".vscode",
  "databricks.python.envFile": "${workspaceFolder}/.env",
  "jupyter.interactiveWindow.cellMarker.codeRegex": "^# COMMAND ----------|^# Databricks notebook source|^(#\\s*%%|#\\s*\\<codecell\\>|#\\s*In\\[\\d*?\\]|#\\s*In\\[ \\])",
  "jupyter.interactiveWindow.cellMarker.default": "# COMMAND ----------",
  "python.testing.pytestArgs": [
    "."
  ],
  "python.testing.unittestEnabled": false,
  "python.testing.pytestEnabled": true,
  "files.exclude": {
    "**/*.egg-info": true
  }
}
@@ -0,0 +1,37 @@
# default_python

The 'default_python' project was generated by using the default-python template.

## Getting started

1. Install the Databricks CLI from https://docs.databricks.com/dev-tools/cli/databricks-cli.html

2. Authenticate to your Databricks workspace:
   ```
   $ databricks configure
   ```

3. To deploy a development copy of this project, type:
   ```
   $ databricks bundle deploy --target dev
   ```
   (Note that "dev" is the default target, so the `--target` parameter is optional here.)

   This deploys everything that's defined for this project. For example, the default
   template would deploy a job called `[dev yourname] default_python-job` to your workspace.
   You can find that job by opening your workspace and clicking on **Workflows**.

4. Similarly, to deploy a production copy, type:
   ```
   $ databricks bundle deploy --target prod
   ```

5. Optionally, install developer tools such as the Databricks extension for Visual Studio Code from
   https://docs.databricks.com/dev-tools/vscode-ext.html. Or read the "getting started" documentation for
   **Databricks Connect** for instructions on running the included Python code from a different IDE.

6. For documentation on the Databricks asset bundles format used for this project,
   and for CI/CD configuration, see https://docs.databricks.com/dev-tools/bundles/index.html.
@@ -0,0 +1,43 @@
# This is a Databricks asset bundle definition for default_python.
# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation.
bundle:
  name: default_python

include:
  - resources/*.yml

targets:
  # The 'dev' target, used for development purposes.
  # Whenever a developer deploys using 'dev', they get their own copy.
  dev:
    # We use 'mode: development' to make sure everything deployed to this target gets a prefix
    # like '[dev my_user_name]'. Setting this mode also disables any schedules and
    # automatic triggers for jobs and enables the 'development' mode for Delta Live Tables pipelines.
    mode: development
    default: true
    workspace:
      host: https://myworkspace.databricks.com

  # Optionally, there could be a 'staging' target here.
  # (See Databricks docs on CI/CD at https://docs.databricks.com/dev-tools/bundles/index.html.)
  #
  # staging:
  #   workspace:
  #     host: https://myworkspace.databricks.com

  # The 'prod' target, used for production deployment.
  prod:
    # For production deployments, we only have a single copy, so we override the
    # workspace.root_path default of
    # /Users/${workspace.current_user.userName}/.bundle/${bundle.target}/${bundle.name}
    # to a path that is not specific to the current user.
    mode: production
    workspace:
      host: https://myworkspace.databricks.com
      root_path: /Shared/.bundle/prod/${bundle.name}
    run_as:
      # This runs as [email protected] in production. Alternatively,
      # a service principal could be used here using service_principal_name
      # (see Databricks documentation).
      user_name: [email protected]
@@ -0,0 +1,22 @@
# Fixtures

This folder is reserved for fixtures, such as CSV files.

Below is an example of how to load fixtures as a data frame:

```
import pandas as pd
import os

def get_absolute_path(*relative_parts):
    if 'dbutils' in globals():
        base_dir = os.path.dirname(dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get())  # type: ignore
        path = os.path.normpath(os.path.join(base_dir, *relative_parts))
        return path if path.startswith("/Workspace") else os.path.join("/Workspace", path)
    else:
        return os.path.join(*relative_parts)

csv_file = get_absolute_path("..", "fixtures", "mycsv.csv")
df = pd.read_csv(csv_file)
display(df)
```
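When this helper runs outside a Databricks notebook (no `dbutils` in scope), it falls back to a plain `os.path.join` of the parts. A minimal, runnable sketch of just that local branch, with the notebook branch stubbed out since it needs the Databricks runtime:

```python
import os

# Local (non-notebook) branch of get_absolute_path from the example above:
# without dbutils in scope, the relative parts are simply joined, so the
# same call works both on a cluster and on a laptop.
def get_absolute_path(*relative_parts):
    if 'dbutils' in globals():
        raise RuntimeError("notebook branch requires the Databricks runtime")
    return os.path.join(*relative_parts)

csv_file = get_absolute_path("..", "fixtures", "mycsv.csv")
```

The join is platform-aware, so on POSIX systems `csv_file` resolves to `../fixtures/mycsv.csv` relative to the current working directory.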
@@ -0,0 +1,3 @@
[pytest]
testpaths = tests
pythonpath = src
@@ -0,0 +1,48 @@
# The main job for default_python
resources:
  jobs:
    default_python_job:
      name: default_python_job

      schedule:
        quartz_cron_expression: '44 37 8 * * ?'
        timezone_id: Europe/Amsterdam

      email_notifications:
        on_failure:
          - [email protected]

      tasks:
        - task_key: notebook_task
          job_cluster_key: job_cluster
          notebook_task:
            notebook_path: ../src/notebook.ipynb

        - task_key: refresh_pipeline
          depends_on:
            - task_key: notebook_task
          pipeline_task:
            pipeline_id: ${resources.pipelines.default_python_pipeline.id}

        - task_key: main_task
          depends_on:
            - task_key: refresh_pipeline
          job_cluster_key: job_cluster
          python_wheel_task:
            package_name: default_python
            entry_point: main
          libraries:
            - whl: ../dist/*.whl

      job_clusters:
        - job_cluster_key: job_cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
            # node_type_id is the cluster node type to use. Typical node types
            # include i3.xlarge on AWS, Standard_D3_v2 on Azure,
            # and n1-standard-4 on Google Cloud.
            node_type_id: i3.xlarge
            autoscale:
              min_workers: 1
              max_workers: 4
@@ -0,0 +1,12 @@
# The main pipeline for default_python
resources:
  pipelines:
    default_python_pipeline:
      name: "default_python_pipeline"
      target: "default_python_${bundle.environment}"
      libraries:
        - notebook:
            path: ../src/dlt_pipeline.ipynb

      configuration:
        "bundle.sourcePath": "/Workspace/${workspace.file_path}/src"
@@ -0,0 +1,4 @@
# scratch

This folder is reserved for personal, exploratory notebooks.
By default these are not committed to Git, as 'scratch' is listed in .gitignore.
@@ -0,0 +1,50 @@
{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "application/vnd.databricks.v1+cell": {
          "cellMetadata": {
            "byteLimit": 2048000,
            "rowLimit": 10000
          },
          "inputWidgets": {},
          "nuid": "6bca260b-13d1-448f-8082-30b60a85c9ae",
          "showTitle": false,
          "title": ""
        }
      },
      "outputs": [],
      "source": [
        "import sys\n",
        "sys.path.append('../src')\n",
        "from default_python import main\n",
        "\n",
        "main.get_taxis().show(10)"
      ]
    }
  ],
  "metadata": {
    "application/vnd.databricks.v1+notebook": {
      "dashboards": [],
      "language": "python",
      "notebookMetadata": {
        "pythonIndentUnit": 2
      },
      "notebookName": "ipynb-notebook",
      "widgets": {}
    },
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.11.4"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
@@ -0,0 +1,24 @@
"""
Setup script for default_python.

This script packages and distributes the associated wheel file(s).
Source code is in ./src/. Run 'python setup.py sdist bdist_wheel' to build.
"""
from setuptools import setup, find_packages

import sys
sys.path.append('./src')

import default_python

setup(
    name="default_python",
    version=default_python.__version__,
    url="https://databricks.com",
    author="<no value>",
    description="my test wheel",
    packages=find_packages(where='./src'),
    package_dir={'': 'src'},
    # The entry-point group must be "console_scripts" (not "entry_points")
    # for the 'main' command to be generated on install.
    entry_points={"console_scripts": ["main=default_python.main:main"]},
    install_requires=["setuptools"],
)
@@ -0,0 +1 @@
__version__ = "0.0.1"
@@ -0,0 +1,11 @@
from pyspark.sql import SparkSession

def get_taxis():
    spark = SparkSession.builder.getOrCreate()
    return spark.read.table("samples.nyctaxi.trips")

def main():
    get_taxis().show(5)

if __name__ == '__main__':
    main()
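Because `get_taxis` only touches `SparkSession.builder.getOrCreate()` and `spark.read.table(...)`, its logic can be exercised without a cluster by substituting a fake `pyspark` module. The sketch below is a hypothetical unit-test approach, not part of the template; a real project would more likely test against Databricks Connect or a local SparkSession.

```python
import sys
import types
from unittest import mock

# Install a fake pyspark module before importing SparkSession, so the
# code under test never needs a real Spark cluster.
fake_df = mock.Mock(name="trips_df")
fake_spark = mock.Mock()
fake_spark.read.table.return_value = fake_df

fake_session_cls = mock.Mock()
fake_session_cls.builder.getOrCreate.return_value = fake_spark

pyspark_sql = types.ModuleType("pyspark.sql")
pyspark_sql.SparkSession = fake_session_cls
pyspark_mod = types.ModuleType("pyspark")
pyspark_mod.sql = pyspark_sql
sys.modules["pyspark"] = pyspark_mod
sys.modules["pyspark.sql"] = pyspark_sql

from pyspark.sql import SparkSession  # resolves to the fake installed above

def get_taxis():  # mirrors src/default_python/main.py
    spark = SparkSession.builder.getOrCreate()
    return spark.read.table("samples.nyctaxi.trips")

result = get_taxis()
```

After the call, `result` is the fake data frame and the mock records that `samples.nyctaxi.trips` was the table requested, which is all the function's contract promises.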