Releases: thbar/kiba

v4.0.0

24 Mar 12:35
9d33283

This is a maintenance release with code clean-up. Regular use should see no impact.

  • Breaking: Ruby 2.4 (EOL since 2020-03-31) is no longer officially supported.
  • CI changes: moved from Travis CI (EOL) to GitHub Actions. The Windows CI has been removed for now (see #97).
  • Breaking: the "legacy runner" has been removed in #96. If your jobs select it via config :kiba, runner: Kiba::Runner, the upgrade path is to remove this config line and let Kiba use the more modern Kiba::StreamingRunner, which has been the default since Kiba v3.0.0 (see #83 for context) and is normally fully backward-compatible.
  • Cleanup: in Kiba v3, the kiba shell command was deprecated and replaced by a simple stub printing a warning to STDERR. It is now removed for good.
  • StandardRB has been added for formatting & linting the codebase.

v3.6.0

07 Feb 21:22

Kiba.run can now take a block that defines the job, instead of a job parameter. See #94 for more details.

v3.5.0

24 Sep 14:18

This release adds support for Ruby 2.7 and Ruby 3.

See #93 for detailed information and analysis.

v3.0.0

10 Feb 11:03

BREAKING - the kiba CLI is deprecated

The kiba CLI is deprecated in favor of the more modern Kiba.parse programmatic API [#74, #81].

The programmatic API allows everything the "command" mode supported, plus much more, and actually encourages better coding practices. For instance:

  • API mode lets you pass live variables (rather than just ENV configuration from the command line or JSON configs from files)
  • Doing so makes it possible to wrap resource open/close around running a job
  • API mode makes it easier to test an ETL process (via minitest/rspec) directly in-process (which allows stubbing/webmock etc), rather than via a command call
  • API mode enforces use of clean modules with explicit loading, rather than polluting the top-level namespace with global methods
  • API mode lets you run jobs from Sidekiq or other background job systems, or from an HTTP call (if the job is fast), without necessarily waiting for a command line binary to run. This supports more dynamic interactions (e.g. a job is created in reaction to an external event received via HTTP or a websocket)

A temporary kiba-legacy-cli gem is available (https://github.com/thbar/kiba-legacy-cli) to ease migration, but the recommendation is really to migrate over and use Kiba.parse directly, as described in the current documentation.

Kiba now defaults to StreamingRunner

Introduced in v2.0.0 [#44] to ensure a transform could yield N rows for 1 input row, and improved in v2.5.0 [#57] to help implement "buffering transforms", the StreamingRunner is now made the default to process the jobs [#83].

This change is expected to be backward compatible and will help with reusability & features of ETL components.
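
To illustrate the "N rows for 1 input row" behaviour, here is a minimal plain-Ruby sketch. TagExploder and run_transform are hypothetical names. run_transform is only a simplified stand-in for what a streaming runner does (collect the yielded rows, then the returned row if it is not nil). It is not Kiba's actual implementation, so the snippet runs without the gem.

```ruby
# Hypothetical transform: emits one output row per tag found in the input row.
class TagExploder
  def process(row)
    row[:tags].split(",").each do |tag|
      # Build a fresh Hash for every output row so rows stay independent.
      yield({id: row[:id], tag: tag.strip})
    end
    nil # nothing more to emit beyond the yielded rows
  end
end

# Simplified stand-in for a streaming runner (an assumption, not Kiba's code):
# rows yielded by #process are collected first, then the return value, if any.
def run_transform(transform, rows)
  output = []
  rows.each do |row|
    returned = transform.process(row) { |yielded| output << yielded }
    output << returned if returned
  end
  output
end

input = [{id: 1, tags: "etl, ruby"}, {id: 2, tags: "kiba"}]
run_transform(TagExploder.new, input).each { |row| puts row.inspect }
```

Two input rows produce three output rows here, which a runner limited to one output per input could not express inside a single transform.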

Ruby compatibility notice

  • Kiba now officially supports MRI Ruby 2.4+ (although 2.3 will still work for now), JRuby 9.2+ or TruffleRuby.
  • You may get warnings with Ruby 2.7 and errors with Ruby 2.8+. See [#85] for status on Ruby 3 keyword arguments support.

v2.5.0

29 May 18:56
0e74193

Aggregating / buffering transforms

A Transform's close can now yield rows (this requires the new StreamingRunner, see v2.0.0 release notes).

This will let component implementers support new types of scenarios:

  • Batch transforms (such as the upcoming Kiba Pro ParallelTransform, or batch SQL lookups)
  • Grouping of rows (including in-memory or db-backed sort, normalisation operations, map operations)

See #57 for more background & explanations.
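
As a rough sketch of the batching scenario, the plain-Ruby example below buffers rows in process and yields the remainder from close. BatchTransform and run_with_close are hypothetical names. run_with_close is a simplified stand-in for the runner (an assumption, not Kiba's code), so the snippet runs without the gem.

```ruby
# Hypothetical buffering transform: groups incoming rows into batches.
class BatchTransform
  def initialize(batch_size)
    @batch_size = batch_size
    @buffer = []
  end

  # Buffer each row and yield a batch row whenever the buffer is full.
  def process(row)
    @buffer << row
    if @buffer.size >= @batch_size
      yield({rows: @buffer})
      @buffer = [] # new Array, so the batch already yielded is left untouched
    end
    nil # batches are yielded, nothing is returned directly
  end

  # Called once after the last row: yield whatever is still buffered.
  def close
    yield({rows: @buffer}) unless @buffer.empty?
    @buffer = []
  end
end

# Simplified stand-in for the runner (assumption): it forwards rows yielded
# from both #process and #close to the same output.
def run_with_close(transform, rows)
  output = []
  collect = ->(row) { output << row }
  rows.each do |row|
    returned = transform.process(row, &collect)
    output << returned if returned
  end
  transform.close(&collect) if transform.respond_to?(:close)
  output
end

batches = run_with_close(BatchTransform.new(2), [{n: 1}, {n: 2}, {n: 3}])
batches.each { |batch| puts batch[:rows].size }
```

Without the ability to yield from close, the last, incomplete batch (the row with n: 3 here) would never reach the next step.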

Ruby compatibility notice

Kiba now requires MRI Ruby 2.3+, JRuby 9.1+ or TruffleRuby.

This is done to reduce the testing burden, to encourage users to avoid EOL'ed rubies, and to let me use more recent Ruby features when relevant.

Other tweaks

  • Fix incorrect error message when calling transform nil (#73 - thanks @envygeeks for the report).
  • Fix code & documentation links on Rubygems (#71 - thanks @janko).

v2.0.0

05 Jan 18:57

New StreamingRunner engine

Kiba 2 introduces a new, opt-in engine called the StreamingRunner, which lets class transforms generate an arbitrary number of rows. This drastically improves the reusability & composability of Kiba components (see #44 for some background).

To use the StreamingRunner, use the following code:

# activate the new Kiba internal config system
extend Kiba::DSLExtensions::Config
# opt-in for the new engine
config :kiba, runner: Kiba::StreamingRunner

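# write transform class able to yield an arbitrary number of rows
class MyYieldingTransform
  def process(row)
    # parentheses are needed: in a bare `yield {key: 1}`, Ruby reads `{` as a block
    yield({key: 1})
    yield({key: 2})
    # the returned row is emitted as well (return nil to emit nothing more)
    {key: 3}
  end
end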

The improved runner is compatible with Ruby 2.0+.

⚠️ It is strongly recommended not to share data between the rows yielded this way; otherwise, anything changing one row will also affect the others. Make sure to build completely independent rows (or use an immutable Hash structure).
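
The pitfall described in this warning can be reproduced in plain Ruby. In the sketch below, SharingTransform, IndependentTransform and collect_rows are hypothetical names used for illustration only. collect_rows simply gathers the yielded and returned rows and is not part of Kiba.

```ruby
# Hypothetical transform that hands out the very same Hash object twice.
class SharingTransform
  def process(row)
    yield row
    row
  end
end

# Hypothetical transform that builds an independent Hash for each output row.
class IndependentTransform
  def process(row)
    yield row.dup # shallow copy: enough for flat rows, nested values stay shared
    row.dup
  end
end

# Illustration helper (not part of Kiba): gathers yielded and returned rows.
def collect_rows(transform, row)
  rows = []
  returned = transform.process(row) { |yielded| rows << yielded }
  rows << returned if returned
  rows
end

shared = collect_rows(SharingTransform.new, {name: "a"})
shared[0][:name] = "changed"
puts shared[1][:name] # prints "changed": both entries are the same object

independent = collect_rows(IndependentTransform.new, {name: "a"})
independent[0][:name] = "changed"
puts independent[1][:name] # prints "a": the second row is unaffected
```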

Compatibility with Kiba 1

Kiba 2 is expected to be compatible with existing Kiba 1 scripts, as long as you did not use any internal API.

Internal changes include:

  • An opt-in config system inspired by Elixir's mix, currently only used to select the runner you want at job declaration time
  • Stronger isolation in the Parser, to reduce the chances that ETL scripts could conflict with Kiba internal classes

v1.0.0

01 Dec 17:50
  • close becomes optional in destinations.
  • Bumping to 1.0.0 since Kiba is in wide production use.
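
As a sketch of what an optional close means for destination authors, the plain-Ruby example below shows one destination without close and one with it. ArrayDestination, LineDestination and write_all are hypothetical names. write_all is a simplified stand-in for the runner (an assumption, not Kiba's code) that only calls close when the destination responds to it.

```ruby
require "stringio"

# Hypothetical destination with nothing to release: no #close needed.
class ArrayDestination
  attr_reader :rows

  def initialize
    @rows = []
  end

  def write(row)
    @rows << row
  end
end

# Hypothetical destination holding an IO, which it closes when done.
class LineDestination
  def initialize(io)
    @io = io
  end

  def write(row)
    @io.puts(row.values.join(","))
  end

  def close
    @io.close
  end
end

# Simplified stand-in for the runner (assumption): #close is called only
# when the destination implements it.
def write_all(destination, rows)
  rows.each { |row| destination.write(row) }
  destination.close if destination.respond_to?(:close)
end

memory = ArrayDestination.new
write_all(memory, [{n: 1}, {n: 2}])
puts memory.rows.size

io = StringIO.new
write_all(LineDestination.new(io), [{n: 1}, {n: 2}])
puts io.closed?
```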