Add streaming support #137
Is the schema required? Can it work without a schema, like frame-offline?
Yes, I believe it is. Kafka returns a byte array that could hold any type, so we have to rely on the schema to create a decoder.
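To illustrate why the schema matters, here is a minimal Python sketch. It is not Feathr's decoder and it does not use Avro; the toy schema format, the field names (`driver_id`, `fare`, `timestamp`) and the `make_decoder` helper are all hypothetical. It only shows that the same raw bytes decode to different records depending on the schema supplied.

```python
import struct

# Hypothetical toy schema: an ordered list of (field_name, struct format code).
# This stands in for a real schema language such as Avro; it is an assumption
# made for illustration, not how Feathr represents schemas.
def make_decoder(schema):
    fmt = ">" + "".join(code for _, code in schema)  # big-endian, no padding
    names = [name for name, _ in schema]

    def decode(payload: bytes) -> dict:
        # The bytes carry no type information; the schema supplies it.
        values = struct.unpack(fmt, payload)
        return dict(zip(names, values))

    return decode

# What a consumer would receive from the topic: 8 opaque bytes.
payload = struct.pack(">if", 42, 1.5)

trip_schema = [("driver_id", "i"), ("fare", "f")]   # int32 followed by float32
other_schema = [("timestamp", "q")]                 # a single int64

print(make_decoder(trip_schema)(payload))
print(make_decoder(other_schema)(payload))
```

Both calls succeed on the same payload, but only the first recovers the values that were written, which is why the reader cannot infer the record layout from the bytes alone.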
Another bug found while testing: the Kafka settings should be optional. Currently, if the config does not have a kafka section, it will fail.
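One way to make the section optional, sketched in Python under stated assumptions: the config is treated as a plain nested dict, and the key names (`streaming`, `kafka`), the function names and the `--kafka-config` argument are hypothetical, not taken from the actual Feathr client.

```python
from typing import Optional

def kafka_settings(config: dict) -> Optional[dict]:
    """Return the Kafka settings, or None when the config has no kafka section."""
    # `or {}` also covers a section that is present but empty (parsed as None).
    streaming = config.get("streaming") or {}
    return streaming.get("kafka")

def build_job_args(config: dict) -> list:
    """Assemble hypothetical job arguments, adding Kafka ones only if configured."""
    args = ["--job", "feature-gen"]
    kafka = kafka_settings(config)
    if kafka is not None:
        args += ["--kafka-config", str(kafka)]
    return args

# A config without any streaming section no longer raises a KeyError.
print(build_job_args({"offline_store": {}}))
```

The lookup returns `None` instead of raising, so jobs that do not use streaming are unaffected by the new settings.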
Discussed offline. Ideally, we want users to be able to define features without a schema as well, so that streaming is aligned with offline and online. But it seems Kafka doesn't provide such a capability by itself yet, so instead we will ask users to provide a schema. In the future, if there is an interface that allows reading from Kafka without a schema, we can make the schema optional.
Could you add a dev guide as well for future devs?
How do we support other streaming sources, like Flink? Maybe we should do:
Discussed offline. The current syntax already works, so there is no need to introduce another layer.
Add streaming support for anchored features.
Example:
Define input data schema: