chore(docker): try a better startup process with docker-compose
tchiotludo committed Feb 28, 2022
1 parent 7ef9173 commit 796bf20
Showing 4 changed files with 244 additions and 37 deletions.
9 changes: 7 additions & 2 deletions Dockerfile
@@ -6,8 +6,13 @@ ARG APT_PACKAGES=""
WORKDIR /app
COPY docker /

RUN if [ -n "${APT_PACKAGES}" ]; then apt-get update -y; apt-get install -y --no-install-recommends ${APT_PACKAGES}; apt-get clean && rm -rf /var/lib/apt/lists/* /var/tmp/*; fi && \
if [ -n "${KESTRA_PLUGINS}" ]; then /app/kestra plugins install ${KESTRA_PLUGINS}; fi
RUN mkdir -p /app/plugins && \
apt-get update -y && \
apt-get install -y --no-install-recommends curl wait-for-it ${APT_PACKAGES} && \
apt-get upgrade -y && \
apt-get clean && rm -rf /var/lib/apt/lists/* /var/tmp/*

RUN if [ -n "${KESTRA_PLUGINS}" ]; then /app/kestra plugins install ${KESTRA_PLUGINS}; fi

ENTRYPOINT ["docker-entrypoint.sh"]
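
The `RUN` steps above read the `APT_PACKAGES` and `KESTRA_PLUGINS` values at build time. A minimal sketch of supplying them from a Compose file, assuming `KESTRA_PLUGINS` is declared as an `ARG` in the same way as `APT_PACKAGES`, and that the Compose file sits next to the Dockerfile. The package name and plugin coordinate below are hypothetical placeholders, not values taken from this commit:

```yaml
services:
  kestra:
    build:
      context: .
      args:
        # Extra Debian packages, space-separated (placeholder value)
        APT_PACKAGES: "python3"
        # Handed as-is to `/app/kestra plugins install` (placeholder coordinate)
        KESTRA_PLUGINS: "io.kestra.plugin:plugin-gcp:LATEST"
```

Since `curl` and `wait-for-it` are now installed unconditionally, leaving `APT_PACKAGES` empty still produces an image with both tools.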

228 changes: 215 additions & 13 deletions README.md
@@ -1,24 +1,226 @@
<p align="center">
<a href="https://www.kestra.io">
<img width="460" src="https://kestra.io/logo.svg" alt="Kestra workflow orchestrator" />
</a>
</p>

# Kestra
![Last Version](https://img.shields.io/github/tag-pre/kestra-io/kestra.svg)
![License](https://img.shields.io/github/license/kestra-io/kestra)
![Docker Pull](https://img.shields.io/docker/pulls/kestra/kestra.svg)
![Github Downloads](https://img.shields.io/github/downloads/kestra-io/kestra/total)
![Github Star](https://img.shields.io/github/stars/kestra-io/kestra.svg)
[![codecov](https://codecov.io/gh/kestra-io/kestra/branch/develop/graph/badge.svg?token=It6L7BTaWK)](https://codecov.io/gh/kestra-io/kestra)
![Github Actions](https://github.com/kestra-io/kestra/workflows/Main/badge.svg?branch=master)
<h1 align="center" style="border-bottom: none">
Kestra, Infinitely scalable open source orchestration & scheduling platform. <br>
</h1>

<div align="center">
<a href="/kestra-io/kestra/blob/develop/LICENSE"><img src="https://img.shields.io/github/license/kestra-io/kestra?style=flat-square" alt="License" /></a>
<a href="https://github.com/kestra-io/kestra/pulse"><img src="https://img.shields.io/github/commit-activity/m/kestra-io/kestra?style=flat-square" alt="Commits-per-month"></a>
<a href="/kestra-io/kestra/stargazers"><img src="https://img.shields.io/github/stars/kestra-io/kestra.svg?style=flat-square" alt="Github star" /></a>
<a href="/kestra-io/kestra/releases"><img src="https://img.shields.io/github/tag-pre/kestra-io/kestra.svg?style=flat-square" alt="Last Version" /></a>
<a href="https://hub.docker.com/r/kestra/kestra"><img src="https://img.shields.io/docker/pulls/kestra/kestra.svg?style=flat-square" alt="Docker pull" /></a>
<a href="https://artifacthub.io/packages/helm/kestra/kestra"><img src="https://img.shields.io/badge/Artifact%20Hub-kestra-417598?style=flat-square&logo=artifacthub" alt="Artifact Hub" /></a>
<a href="https://kestra.io"><img src="https://img.shields.io/badge/Website-kestra.io-192A4E?style=flat-square" alt="Kestra infinitely scalable orchestration and scheduling platform"></a>
<a href="https://discord.gg/5RgZmkW"><img src="https://img.shields.io/discord/903344083391631471?label=Discord&style=flat-square" alt="Discord"></a>
<a href="/kestra-io/kestra/discussions"><img src="https://img.shields.io/github/discussions/kestra-io/kestra?style=flat-square" alt="Github discussions"></a>
<a href="https://twitter.com/kestra_io"><img src="https://img.shields.io/twitter/follow/kestra_io?style=flat-square" alt="Twitter" /></a>
<a href="https://app.codecov.io/gh/kestra-io/kestra"><img src="https://img.shields.io/codecov/c/github/kestra-io/kestra?style=flat-square&token=It6L7BTaWK" alt="Code Cov" /></a>
<a href="/kestra-io/kestra/actions"><img src="https://img.shields.io/github/workflow/status/kestra-io/kestra/Main/develop?style=flat-square" alt="Github Actions" /></a>
</div>

<br />

<p align="center">
<img width="460" src="https://kestra.io/logo.svg" alt="Kestra workflow orchestrator" />
<a href="https://kestra.io/"><b>Website</b></a> •
<a href="https://twitter.com/kestra_io"><b>Twitter</b></a> •
<a href="https://www.linkedin.com/company/kestra/"><b>LinkedIn</b></a> •
<a href="https://discord.gg/NMG39WKGth"><b>Discord</b></a> •
<a href="https://kestra.io/docs/"><b>Documentation</b></a>
</p>

> The modern, scalable orchestrator & scheduler open source platform.
<br />

<p align="center"><img src="https://kestra.io/video.gif" alt="modern data orchestration and scheduling platform " width="640px" /></p>


## What is Kestra?
Kestra is an infinitely scalable orchestration and scheduling platform for creating, running, scheduling, and monitoring millions of complex pipelines.

- 🔀 **Any kind of workflow**: Workflows can start simple and progress to more complex systems with branching, parallel and dynamic tasks, and flow dependencies.
- 🎓 **Easy to learn**: Flows are defined in YAML, a simple, descriptive language; you don't need to be a developer to create a new flow.
- 🔣 **Easy to extend**: Plugins are everywhere in Kestra. Many are available from the Kestra core team, and you can easily create your own.
- 🆙 **Any triggers**: Kestra is event-based at heart; you can trigger an execution from the API, a schedule, a detection, or events.
- 💻 **A rich user interface**: The built-in web interface lets you create, run, and monitor all your flows. There is no need to deploy your flows, just edit them.
- **Enjoy infinite scalability**: Kestra is built around top cloud native technologies, so you can scale to millions of executions stress-free.

**Example flow:**

```yaml
id: my-first-flow
namespace: my.company.teams

inputs:
  - type: FILE
    name: uploaded

tasks:
  - id: "archive"
    type: "io.kestra.plugin.gcp.gcs.Upload"
    description: "Archive the file in a Google Cloud Storage bucket"
    from: "{{ inputs.uploaded }}"
    to: "gs://my_bucket/archives/{{ execution.id }}.csv"
  - id: "csvReader"
    type: "io.kestra.plugin.serdes.csv.CsvReader"
    from: "{{ inputs.uploaded }}"
  - id: fileTransform
    type: io.kestra.plugin.scripts.nashorn.FileTransform
    description: This task anonymizes the contactName with a custom Nashorn script (JavaScript on the JVM). It shows that you can handle custom transformations or remapping the ETL way.
    from: "{{ outputs.csvReader.uri }}"
    script: |
      if (row['contactName']) {
        row['contactName'] = "*".repeat(row['contactName'].length);
      }
  - id: avroWriter
    type: io.kestra.plugin.serdes.avro.AvroWriter
    description: This task converts the file from Kestra internal storage to Avro. Again, we are handling ETL, since the conversion is done by Kestra before loading the data into BigQuery. This gives you some control before loading and lets you reject wrong data as soon as possible.
    from: "{{ outputs.fileTransform.uri }}"
    schema: |
      {
        "type": "record",
        "name": "Root",
        "fields":
          [
            { "name": "contactTitle", "type": ["null", "string"] },
            { "name": "postalCode", "type": ["null", "long"] },
            { "name": "entityId", "type": ["null", "long"] },
            { "name": "country", "type": ["null", "string"] },
            { "name": "region", "type": ["null", "string"] },
            { "name": "address", "type": ["null", "string"] },
            { "name": "fax", "type": ["null", "string"] },
            { "name": "email", "type": ["null", "string"] },
            { "name": "mobile", "type": ["null", "string"] },
            { "name": "companyName", "type": ["null", "string"] },
            { "name": "contactName", "type": ["null", "string"] },
            { "name": "phone", "type": ["null", "string"] },
            { "name": "city", "type": ["null", "string"] }
          ]
      }
  - id: load
    type: io.kestra.plugin.gcp.bigquery.Load
    description: Simply load the file generated by the Avro task into BigQuery
    avroOptions:
      useAvroLogicalTypes: true
    destinationTable: kestra-prd.demo.customer_copy
    format: AVRO
    from: "{{ outputs.avroWriter.uri }}"
    writeDisposition: WRITE_TRUNCATE
  - id: aggregate
    type: io.kestra.plugin.gcp.bigquery.Query
    description: Aggregate some data from loaded files
    createDisposition: CREATE_IF_NEEDED
    destinationTable: kestra-prd.demo.agg
    sql: |
      SELECT k.categoryName, p.productName, c.companyName, s.orderDate, SUM(d.quantity) AS quantity, SUM(d.unitPrice * d.quantity * r.exchange) as totalEur
      FROM `kestra-prd.demo.salesOrder` AS s
      INNER JOIN `kestra-prd.demo.orderDetail` AS d ON s.entityId = d.orderId
      INNER JOIN `kestra-prd.demo.customer` AS c ON c.entityId = s.customerId
      INNER JOIN `kestra-prd.demo.product` AS p ON p.entityId = d.productId
      INNER JOIN `kestra-prd.demo.category` AS k ON k.entityId = p.categoryId
      INNER JOIN `kestra-prd.demo.rates` AS r ON r.date = DATE(s.orderDate) AND r.currency = "USD"
      GROUP BY 1, 2, 3, 4
    timePartitioningField: orderDate
    writeDisposition: WRITE_TRUNCATE
```
## Getting Started
To get a local copy up and running, please follow these simple steps.
### Prerequisites
Make sure you have already installed:
- [Docker](https://docs.docker.com/engine/install/)
- [Docker Compose](https://docs.docker.com/compose/install/)
### Launch Kestra
- Download the compose file [here](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml)
- Run `docker-compose up -d`
- Open `http://localhost:8080` in your browser
- Follow [this tutorial](https://kestra.io/docs/getting-started/) to create your first flow.
- Read the [documentation](https://kestra.io/docs/) to understand how to:
  - [Develop your flows](https://kestra.io/docs/developer-guide/)
  - [Deploy Kestra](https://kestra.io/docs/administrator-guide/)
  - Use our [Terraform provider](https://kestra.io/docs/terraform/)
  - Develop your [own plugins](https://kestra.io/docs/plugin-developer-guide/)





## Plugins
Kestra is built on a [plugin system](https://kestra.io/plugins/). You can pick an existing plugin to interact with your provider, or follow a few [simple steps](https://kestra.io/docs/plugin-developer-guide/) to develop your own. The following official plugins are available:

<table>
<tr>
<td><a href="https://kestra.io/plugins/plugin-aws#s3">Amazon S3</a></td>
<td><a href="https://kestra.io/plugins/plugin-serdes#avro">Avro</a></td>
<td><a href="https://kestra.io/plugins/core/tasks/scripts/io.kestra.core.tasks.scripts.Bash">Bash</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-gcp#bigquery">Big Query</a></td>
<td><a href="https://kestra.io/plugins/plugin-serdes#csv">CSV</a></td>
<td><a href="https://kestra.io/plugins/plugin-jdbc-clickhouse">ClickHouse</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-elasticsearch">ElasticSearch</a></td>
<td><a href="https://kestra.io/plugins/plugin-notifications#mail">Email</a></td>
<td><a href="https://kestra.io/plugins/plugin-gcp#gcs">Google Cloud Storage</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-googleworkspace#drive">Google Drive</a></td>
<td><a href="https://kestra.io/plugins/plugin-googleworkspace#sheets">Google Sheets</a></td>
<td><a href="https://kestra.io/plugins/plugin-scripts-groovy">Groovy</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-fs#http">Http</a></td>
<td><a href="https://kestra.io/plugins/plugin-serdes#json">JSON</a></td>
<td><a href="https://kestra.io/plugins/plugin-scripts-jython">Jython</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-kafka">Kafka</a></td>
<td><a href="https://kestra.io/plugins/plugin-kubernetes">Kubernetes</a></td>
<td><a href="https://kestra.io/plugins/plugin-jdbc-sqlserver">Microsoft SQL Server</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-mongodb">MongoDb</a></td>
<td><a href="https://kestra.io/plugins/plugin-jdbc-mysql">MySQL</a></td>
<td><a href="https://kestra.io/plugins/plugin-scripts-nashorn">Nashorn</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/core/tasks/scripts/io.kestra.core.tasks.scripts.Node">Node</a></td>
<td><a href="https://kestra.io/plugins/plugin-crypto#openpgp">Open PGP</a></td>
<td><a href="https://kestra.io/plugins/plugin-jdbc-oracle">Oracle</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-jdbc-postgres">Postgres</a></td>
<td><a href="https://kestra.io/plugins/core/tasks/scripts/io.kestra.core.tasks.scripts.Python">Python</a></td>
<td><a href="https://kestra.io/plugins/plugin-jdbc-redshift">Redshift</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-fs#sftp">SFTP</a></td>
<td><a href="https://kestra.io/plugins/plugin-singer">Singer</a></td>
<td><a href="https://kestra.io/plugins/plugin-notifications#slack">Slack</a></td>
</tr>
<tr>
<td><a href="https://kestra.io/plugins/plugin-jdbc-vectorwise">Vectorwise</a></td>
<td><a href="https://kestra.io/plugins/plugin-jdbc-vertica">Vertica</a></td>
<td><a href="https://kestra.io/plugins/plugin-serdes#xml">XML</a></td>
</tr>
</table>



This list is growing quickly as we are actively building more plugins, and we welcome contributions!

## Roadmap

![Kestra orchestrator](https://kestra.io/ui.gif)
See the [open issues](https://github.com/kestra-io/kestra/issues) for a list of proposed features (and known issues) or look at the [project board](https://github.com/orgs/kestra-io/projects/2).

## Documentation
* The official Kestra documentation is available at [kestra.io](https://kestra.io)

## License
Apache 2.0 © [Kestra Technologies](https://kestra.io)
41 changes: 20 additions & 21 deletions docker-compose.yml
@@ -3,8 +3,6 @@ version: "3.6"
volumes:
zookeeper-data:
driver: local
zookeeper-log:
driver: local
kafka-data:
driver: local
elasticsearch-data:
@@ -14,30 +12,26 @@ volumes:

services:
zookeeper:
image: confluentinc/cp-zookeeper:7.0.1
image: 'bitnami/zookeeper:latest'
volumes:
- zookeeper-data:/var/lib/zookeeper/data
- zookeeper-log:/var/lib/zookeeper/log
- zookeeper-data:/bitnami/zookeeper
environment:
ALLOW_ANONYMOUS_LOGIN: "yes"
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_LOG4J_ROOT_LOGLEVEL: WARN
ZOOKEEPER_TOOLS_LOG4J_LOGLEVEL: WARN
ZOO_LOG_LEVEL: "WARN"

kafka:
image: confluentinc/cp-kafka:7.0.1
image: 'bitnami/kafka:latest'
volumes:
- kafka-data:/var/lib/kafka
- kafka-data:/bitnami
environment:
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
KAFKA_CONFLUENT_SUPPORT_METRICS_ENABLE: 'false'
KAFKA_LOG4J_LOGGERS: "kafka=WARN,kafka.producer.async.DefaultEventHandler=WARN,kafka.controller=WARN,state.change.logger=WARN"
KAFKA_LOG4J_ROOT_LOGLEVEL: WARN
KAFKA_TOOLS_LOG4J_LOGLEVEL: WARN
links:
ALLOW_PLAINTEXT_LISTENER: "yes"
KAFKA_CFG_ZOOKEEPER_CONNECT: zookeeper:2181
KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_CFG_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_CFG_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
depends_on:
- zookeeper

elasticsearch:
@@ -59,7 +53,7 @@ services:

kestra:
image: kestra/kestra:develop-full
command: server standalone
entrypoint: /usr/bin/wait-for-it -t 60 kafka:9092 && /usr/bin/wait-for-it -t 60 elasticsearch:9200 && /app/kestra server standalone
volumes:
- kestra-data:/app/storage
- /var/run/docker.sock:/var/run/docker.sock
@@ -71,6 +65,10 @@ services:
client:
properties:
bootstrap.servers: kafka:9092
defaults:
stream:
properties:
state.dir: "/tmp/kestra/kafka-streams/"
elasticsearch:
client:
http-hosts: http://elasticsearch:9200
@@ -88,6 +86,7 @@ services:
url: http://localhost:8080/
ports:
- "8080:8080"
links:
depends_on:
- kafka
- zookeeper
- elasticsearch
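
One caveat on the new `entrypoint` line: as far as Compose's documented behaviour goes, a string `entrypoint` is split into arguments and executed directly rather than through a shell, so `&&` would reach `wait-for-it` as a literal argument instead of chaining the commands. A hedged sketch of one way to keep the chained startup, assuming `/bin/sh` exists in the image (the Dockerfile's use of `apt-get` suggests a Debian-based image); this is an illustration, not necessarily how the project settled it:

```yaml
services:
  kestra:
    image: kestra/kestra:develop-full
    # List form: the first element is the shell that interprets the chain
    entrypoint:
      - /bin/sh
      - "-c"
      # Folded scalar: the three lines are joined into one command string
      - >-
        /usr/bin/wait-for-it -t 60 kafka:9092 &&
        /usr/bin/wait-for-it -t 60 elasticsearch:9200 &&
        exec /app/kestra server standalone
```

The `exec` replaces the shell with the Kestra process, so signals sent by `docker stop` reach Kestra directly.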
@@ -9,6 +9,7 @@
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.errors.TimeoutException;
import org.apache.kafka.common.errors.TopicExistsException;
import io.kestra.core.metrics.MetricRegistry;
import io.kestra.runner.kafka.configs.ClientConfig;
@@ -127,7 +128,7 @@ public void createIfNotExist(TopicsConfig topicConfig) {
try {
this.of().createTopics(Collections.singletonList(newTopic)).all().get();
log.info("Topic '{}' created", newTopic.name());
} catch (ExecutionException | InterruptedException e) {
} catch (ExecutionException | InterruptedException | TimeoutException e) {
if (e.getCause() instanceof TopicExistsException) {
try {
adminClient