Releases: thbar/kiba
v4.0.0
This is a maintenance release with code clean-up. Regular use should see no impact.
- Breaking: Ruby 2.4 (EOL since 2020-03-31) is not officially supported anymore.
- CI changes: moving from Travis CI (EOL) to GitHub Actions. The Windows CI has been removed for now (see #97).
- Breaking: if your jobs use Kiba's "legacy runner" via `config :kiba, runner: Kiba::Runner`, be aware that this legacy runner has been removed in #96. The upgrade path is to remove this config line and let Kiba use the more modern `Kiba::StreamingRunner`, which has been the default since Kiba v3.0.0 (see #83 for context) and is normally fully backward-compatible.
- Cleanup: in Kiba v3, the `kiba` shell command had been deprecated and replaced by a simple stub printing a warning to STDERR. It is now removed for good.
- StandardRB has been added for formatting & linting the codebase.
v3.6.0
v3.5.0
v3.0.0
BREAKING - the `kiba` CLI is deprecated
The `kiba` CLI is deprecated in favor of the more modern `Kiba.parse` programmatic API [#74, #81].
The programmatic API allows everything the "command" mode supported, plus much more, and actually encourages better coding practices. For instance:
- API mode lets you pass live variables (rather than just ENV configuration from the command line or JSON configs from files)
- Doing so makes it possible to wrap resource open/close around running a job
- API mode makes it easier to test an ETL process (via minitest/rspec) directly in-process (which allows stubbing, webmock, etc.), rather than via a command call
- API mode enforces use of clean modules with explicit loading, rather than polluting the top-level namespace with global methods
- API mode lets you run jobs from Sidekiq or other background job systems, or from an HTTP call (if the job is fast), without necessarily waiting for a command line binary to run. This supports more dynamic interactions (e.g. a job is created in reaction to an external event received via HTTP or a websocket)
A temporary `kiba-legacy-cli` gem is available (https://github.com/thbar/kiba-legacy-cli) to ease migration, but the recommendation is really to migrate over and use `Kiba.parse` directly, as described in the current documentation.
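As a rough illustration of the API mode described above, the sketch below builds a job from live Ruby objects inside a module. `ArraySource`, `ArrayDestination` and `UppercaseJob` are hypothetical names made up for this example; only `Kiba.parse`, `Kiba.run` and the `source` / `transform` / `destination` DSL come from Kiba itself. It assumes the kiba gem is installed.

```ruby
# Hypothetical source: Kiba only expects an `each` method that yields rows.
class ArraySource
  def initialize(rows)
    @rows = rows
  end

  def each
    @rows.each { |row| yield row }
  end
end

# Hypothetical destination: Kiba only expects a `write(row)` method.
class ArrayDestination
  def initialize(sink)
    @sink = sink
  end

  def write(row)
    @sink << row
  end
end

# Hypothetical job module: no global methods, explicit loading.
module UppercaseJob
  module_function

  # Build the job from live Ruby objects rather than ENV or JSON config.
  def setup(rows:, sink:)
    require "kiba" # loaded here so the component classes above stay gem-free
    Kiba.parse do
      source ArraySource, rows
      transform { |row| row.merge(name: row.fetch(:name).upcase) }
      destination ArrayDestination, sink
    end
  end

  # Run the job in-process and hand back whatever the destination collected.
  def run(rows:, sink:)
    Kiba.run(setup(rows: rows, sink: sink))
    sink
  end
end
```

With the gem available, `UppercaseJob.run(rows: [{name: "ada"}], sink: [])` would be expected to return `[{name: "ADA"}]`. Because the job is built from ordinary arguments, the same method can be called from a minitest/rspec test, from a background worker, or from inside a block that opens and closes a resource around the run.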
Kiba now defaults to `StreamingRunner`
Introduced in v2.0.0 [#44] to ensure a transform could yield N rows for 1 input row, and improved in v2.5.0 [#57] to help implement "buffering transforms", the `StreamingRunner` is now the default runner for processing jobs [#83].
This change is expected to be backward compatible and will help with reusability & features of ETL components.
Ruby compatibility notice
- Kiba now officially supports MRI Ruby 2.4+ (although 2.3 will still work for now), JRuby 9.2+ or TruffleRuby.
- You may get warnings with Ruby 2.7 and errors with Ruby 2.8+. See [#85] for status on Ruby 3 keyword arguments support.
v2.5.0
Aggregating / buffering transforms
A Transform's `close` can now yield rows (this requires the new `StreamingRunner`, see v2.0.0 release notes).
This will let component implementers support new types of scenarios:
- Batch transforms (such as the upcoming Kiba Pro `ParallelTransform`, or batch SQL lookups)
- Grouping of rows (including in-memory or db-backed sort, normalisation operations, map operations)
See #57 for more background & explanations.
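To make the idea concrete, here is a minimal sketch of a buffering transform that groups rows into fixed-size batches and emits the last, partial batch from `close`. `BatchTransform` and `drive` are hypothetical names introduced for this example; `drive` is a simplified plain-Ruby stand-in that mimics the behaviour described above (yielded rows and non-nil return values flow downstream, then `close` may yield more rows), not Kiba's actual runner code.

```ruby
# Hypothetical buffering transform: emits arrays of up to `size` rows.
class BatchTransform
  def initialize(size)
    @size = size
    @buffer = []
  end

  def process(row)
    @buffer << row
    yield flush if @buffer.size >= @size
    nil # nothing is returned directly; batches are only emitted via yield
  end

  # Called once the input is exhausted: emit the remaining partial batch.
  def close
    yield flush unless @buffer.empty?
  end

  private

  # Hand back the buffered rows and start a fresh buffer.
  def flush
    batch = @buffer
    @buffer = []
    batch
  end
end

# Illustrative stand-in for a streaming runner (an assumption, not Kiba code).
def drive(transform, rows)
  out = []
  rows.each do |row|
    returned = transform.process(row) { |yielded| out << yielded }
    out << returned if returned
  end
  transform.close { |yielded| out << yielded } if transform.respond_to?(:close)
  out
end

batches = drive(BatchTransform.new(2), [{id: 1}, {id: 2}, {id: 3}])
# batches == [[{id: 1}, {id: 2}], [{id: 3}]]
```

Without the yielding `close`, the third row would stay stuck in the buffer, which is why this kind of component needs the feature.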
Ruby compatibility notice
Kiba now requires MRI Ruby 2.3+, JRuby 9.1+ or TruffleRuby.
This is done to reduce the testing burden, to encourage users to avoid EOL'ed rubies, and to let me use more recent Ruby features when relevant.
Other tweaks
- Fix incorrect error message when calling `transform nil` (#73 - thanks @envygeeks for the report).
- Fix code & documentation links on Rubygems (#71 - thanks @janko).
v2.0.0
New StreamingRunner engine
Kiba 2 introduces a new, opt-in engine called the `StreamingRunner`, which allows class transforms to generate an arbitrary number of rows. This drastically improves the reusability & composability of Kiba components (see #44 for some background).
To use the `StreamingRunner`, use the following code:
    # activate the new Kiba internal config system
    extend Kiba::DSLExtensions::Config

    # opt-in for the new engine
    config :kiba, runner: Kiba::StreamingRunner

    # write transform class able to yield an arbitrary number of rows
    class MyYieldingTransform
      def process(row)
        # parentheses keep the hash literal from being parsed as a block
        yield({key: 1})
        yield({key: 2})
        # the returned row is passed downstream as well
        {key: 3}
      end
    end
The improved runner is compatible with Ruby 2.0+.
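As a small sketch of what "an arbitrary number of rows" can look like in practice, the transform below turns one input row into one row per tag. `ExplodeTags` is a hypothetical class, and the block passed to `process` at the bottom is only a plain-Ruby way to observe what the transform emits; it is not how Kiba itself invokes transforms internally.

```ruby
# Hypothetical transform: one output row per comma-separated tag.
class ExplodeTags
  def process(row)
    base = row.reject { |key, _| key == :tags }
    row.fetch(:tags).split(",").each do |tag|
      yield base.merge(tag: tag.strip)
    end
    nil # the input row itself is dropped, only the yielded rows remain
  end
end

# Collect yielded rows, then the returned row if there is one.
out = []
returned = ExplodeTags.new.process({id: 1, tags: "ruby, etl"}) { |row| out << row }
out << returned if returned
# out == [{id: 1, tag: "ruby"}, {id: 1, tag: "etl"}]
```

Returning `nil` here relies on the usual Kiba convention that a transform returning `nil` removes the row from the pipeline.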
Compatibility with Kiba 1
Kiba 2 is expected to be compatible with existing Kiba 1 scripts, as long as you did not use internal API.
Internal changes include:
- An opt-in `config` system inspired by Elixir's mix, currently only used to select the runner you want at job declaration time
- Stronger isolation in the `Parser`, to reduce the chances that ETL scripts could conflict with Kiba internal classes