Hive sync: Let's discuss it. #49

simc · 2019-09-12T17:52:14Z

In the future I want to support syncing Hive with a remote database.

It would be helpful if you could share your needs & ideas.

One of the most obvious use cases is backing up data (for example settings or messages). I think Firebase should be one of the first supported remotes.

ThinkDigitalSoftware · 2019-09-12T19:53:08Z

I like the idea. If it has functions that need to be set up on initialization, like FCM does, it would be easy to add a custom solution. But it will still require a lot of input from the user to handle updating Hive when the app reopens after the remote DB has changed, etc.

simc · 2019-09-13T07:02:25Z

> But it will still require a lot of input from the user to handle updating Hive when the app reopens after the remote DB has changed, etc.

Yes, that's true. An implementation that just creates a backup of Hive would be much easier.

The goal, though, is to support full sync (including support for remote changes).

ThinkDigitalSoftware · 2019-09-13T08:04:25Z

It could possibly be simpler if you made adapters for different remote DB types (SQL, NoSQL). That way the similarities could be abstracted or simplified for the user.


simc · 2019-09-13T10:42:59Z

Yes, I'll start experimenting once queries and asset DBs are ready...

If anyone has time and wants to contribute, I'll be there to help.

leedstyh · 2019-09-13T11:32:34Z

Are S3, OneDrive, Google Drive, and Dropbox part of the plan?

simc · 2019-09-13T12:13:10Z

I hope to make it very easy to write a sync solution for every service. This will take time, though.

chemickypes · 2019-09-17T07:58:37Z

You could provide an interface so every user can implement what they want.

simc · 2019-09-17T08:01:56Z

It would be great to know which information users actually need. Do you have something in mind?

chemickypes · 2019-09-17T08:54:40Z

I'm thinking as I write, so forgive some reasoning holes.

My idea is to have a simple interface with 2 or 4 functions like these:

abstract class RemoteSync {
  // Read and write a single element.
  bool writeToServer(String key, dynamic value, String boxName, Type runtimeType);
  T readFromServer<T>(String key, String boxName, Type runtimeType);

  // Read and write all values at once. Dart has no method overloading,
  // so the batch variants need their own names.
  bool writeAllToServer(Map<String, dynamic> values, String boxName, Type runtimeType);
  Map<String, dynamic> readAllFromServer(String boxName);
}

The boxName parameter (and the others) can help the user decide which service to call.

I suppose that the first two functions can be used with a lazy box.

The user then has to implement this interface and inject the implementation into Hive, so Hive can call this object to sync remotely.

This is just a raw idea.

PS. Sorry for my pseudocode
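
To make the idea above concrete, here is a minimal sketch of how such an interface could be implemented and called. Everything in it is an assumption for illustration: `RemoteSync` and `InMemoryRemote` are hypothetical names, not part of Hive, the methods return futures because a real remote call would be asynchronous, and the "server" is just a map in memory.

```dart
// Hypothetical names throughout: RemoteSync and InMemoryRemote are
// illustrations only, not part of Hive's API.
abstract class RemoteSync {
  /// Writes a single entry; returns true on success.
  Future<bool> write(String boxName, String key, dynamic value);

  /// Reads a single entry, or null if it does not exist remotely.
  Future<dynamic> read(String boxName, String key);

  /// Writes all given entries of a box at once.
  Future<bool> writeAll(String boxName, Map<String, dynamic> values);

  /// Reads all entries of a box at once.
  Future<Map<String, dynamic>> readAll(String boxName);
}

/// A fake remote backed by a map, standing in for a real service.
class InMemoryRemote implements RemoteSync {
  final Map<String, Map<String, dynamic>> _boxes = {};

  @override
  Future<bool> write(String boxName, String key, dynamic value) async {
    _boxes.putIfAbsent(boxName, () => <String, dynamic>{})[key] = value;
    return true;
  }

  @override
  Future<dynamic> read(String boxName, String key) async =>
      _boxes[boxName]?[key];

  @override
  Future<bool> writeAll(String boxName, Map<String, dynamic> values) async {
    _boxes.putIfAbsent(boxName, () => <String, dynamic>{}).addAll(values);
    return true;
  }

  @override
  Future<Map<String, dynamic>> readAll(String boxName) async =>
      Map<String, dynamic>.of(_boxes[boxName] ?? <String, dynamic>{});
}

Future<void> main() async {
  final RemoteSync remote = InMemoryRemote();
  await remote.write('settings', 'theme', 'dark');
  await remote.writeAll('settings', {'locale': 'en', 'fontSize': 14});
  print(await remote.read('settings', 'theme')); // dark
  print(await remote.readAll('settings'));
}
```

A real implementation would replace the map with calls to the chosen service; the box name is passed first here so the implementation can route each call to the right collection or bucket.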

simc · 2019-09-17T09:34:15Z

Looks good! Thanks.

leedstyh · 2019-09-18T01:15:39Z

@chemickypes The problem with this approach is that we have to encrypt and decrypt the data on both the client and the server.

The best way is to reuse Hive's binary format, as I posted in this issue.

Also, if we sync to S3, OneDrive, Google Drive, or Dropbox, we have to do the encryption and decryption on the client.

simc · 2019-09-18T12:07:56Z

@leedstyh
Since the data is stored unencrypted in memory, it will not be necessary to decrypt it before syncing.

I'm not sure about using the binary format. It would be necessary to run Hive on the server too since the binary format can only be used by Hive.

leedstyh · 2019-09-18T13:37:00Z

Nope, the server doesn't need to run Hive. The server will not process the data, just store it. Think about syncing to Google Drive.

chemickypes · 2019-09-18T14:13:52Z

@leedstyh I think that we can split this problem into two smaller problems:

1. Use the remote server as an extension of the local Hive.
2. Use Hive as a delegate whose goal is to fetch the data from the server when needed. The user will not be able to tell where the data comes from, because there is only one access point.

I don't know what @leisim wants to do with Hive.

joeblew99 · 2019-09-19T12:15:22Z

If the goal is offline editing, then we may be getting into CRDT and vector clock territory.

Typically the domain model needs to think in terms of mutations or ops. This works when offline editing is not allowed, because the server time is the global time.
When you want to support offline editing, you need a way to merge changes.
CRDTs, ops, and vector clocks belong to this area. The changes are now happening in different time domains.

Here is something to get the ball rolling, maybe.
It is a Flutter example that has basic support for offline editing.
https://github.com/memspace/zefyr
It uses operations and logs them.
But it does not have vector clock support.
The data model uses the Quill approach.
https://github.com/memspace/zefyr/blob/master/packages/notus/lib/src/heuristics.dart#L6

https://github.com/pulyaevskiy/quill-delta-dart

This is where the real OT (Operational Transform) guts are.

Rather than using vector clocks, you can sometimes use context within the data.
I think this is what zefyr uses, but I am not sure.
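
For readers who have not met vector clocks, here is a minimal sketch of the idea: each device keeps a counter per node, and comparing two clocks tells you whether one change happened before the other or whether they are concurrent and need merging. The `VectorClock` class and the node names are assumptions made for this example; none of it comes from zefyr or Hive.

```dart
/// Result of comparing two vector clocks.
enum Order { before, after, equal, concurrent }

/// Hypothetical illustration of a vector clock; not part of any library.
class VectorClock {
  final Map<String, int> counters;

  VectorClock([Map<String, int>? counters]) : counters = {...?counters};

  /// Records a local event on [nodeId].
  void tick(String nodeId) {
    counters[nodeId] = (counters[nodeId] ?? 0) + 1;
  }

  /// Returns a new clock holding the per-node maximum of both clocks.
  VectorClock merge(VectorClock other) {
    final merged = VectorClock(counters);
    other.counters.forEach((node, n) {
      final current = merged.counters[node] ?? 0;
      merged.counters[node] = n > current ? n : current;
    });
    return merged;
  }

  /// Compares this clock to [other] node by node.
  Order compare(VectorClock other) {
    var less = false;
    var greater = false;
    for (final node in {...counters.keys, ...other.counters.keys}) {
      final a = counters[node] ?? 0;
      final b = other.counters[node] ?? 0;
      if (a < b) less = true;
      if (a > b) greater = true;
    }
    if (less && greater) return Order.concurrent;
    if (less) return Order.before;
    if (greater) return Order.after;
    return Order.equal;
  }
}

void main() {
  // Two devices edit offline without seeing each other's change.
  final phone = VectorClock()..tick('phone');
  final laptop = VectorClock()..tick('laptop');
  print(phone.compare(laptop)); // Order.concurrent

  // After syncing, the merged clock dominates both earlier clocks.
  final merged = phone.merge(laptop)..tick('phone');
  print(phone.compare(merged)); // Order.before
}
```

`Order.concurrent` is the case that matters for offline editing: neither change saw the other, so a merge strategy has to decide the outcome.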

joeblew99 · 2019-09-23T22:14:31Z

I happened to stumble on this CRDT implementation.

docs: https://cluster.ipfs.io/documentation/guides/consensus/
At the bottom, it's nice to see that they properly make the distinction between CRDT and Raft.

It means that you can have data of the same types on many devices and merge it independently.
You don't need to write OTs (Operational Transforms), which is very painful and limiting.

This is the Core lib.
https://github.com/ipfs/go-ds-crdt
That lib is used for IPFS Cluster to allow it to synchronise data.
https://github.com/ipfs/ipfs-cluster/blob/master/consensus/crdt/consensus.go

I really hope this gets picked up for Hive.

I think it's an excellent basis for Hive Sync.
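
To illustrate what "merge independently" means, here is a sketch of one of the simplest CRDTs, a last-writer-wins map. This is not what go-ds-crdt does (that library is considerably more involved); `LwwMap` and `Entry` are hypothetical names, and the timestamps are plain integers supplied by the caller.

```dart
/// One stored value plus the metadata needed to resolve conflicts.
/// Hypothetical illustration; not taken from go-ds-crdt or Hive.
class Entry {
  final dynamic value;
  final int timestamp; // logical or wall-clock time of the write
  final String nodeId; // breaks ties between equal timestamps

  const Entry(this.value, this.timestamp, this.nodeId);

  /// True if this entry should replace [other].
  bool wins(Entry other) =>
      timestamp > other.timestamp ||
      (timestamp == other.timestamp && nodeId.compareTo(other.nodeId) > 0);
}

/// A last-writer-wins map: replicas converge no matter the merge order.
class LwwMap {
  final Map<String, Entry> entries = {};

  void put(String key, dynamic value, int timestamp, String nodeId) {
    _apply(key, Entry(value, timestamp, nodeId));
  }

  /// Merges another replica's state into this one.
  void merge(LwwMap other) {
    other.entries.forEach(_apply);
  }

  void _apply(String key, Entry incoming) {
    final current = entries[key];
    if (current == null || incoming.wins(current)) {
      entries[key] = incoming;
    }
  }

  dynamic operator [](String key) => entries[key]?.value;
}

void main() {
  final phone = LwwMap()..put('theme', 'dark', 1, 'phone');
  final laptop = LwwMap()..put('theme', 'light', 2, 'laptop');

  // Merging in either direction gives the same result on both replicas.
  phone.merge(laptop);
  laptop.merge(phone);
  print(phone['theme']); // light
  print(laptop['theme']); // light
}
```

The trade-off of last-writer-wins is that the older concurrent write is silently dropped, which is fine for settings but may not be for richer data.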

simc · 2019-09-24T05:36:30Z

Thanks for your valuable input. I'll definitely take a look at these projects and try to implement something similar with hive.

zenkog · 2019-12-03T16:00:30Z

Any plans to sync with Firestore? That would be awesome

ghost · 2019-12-26T17:17:52Z

+1000 to this! #49 (comment)

Manuelbaun · 2020-07-03T22:48:21Z

Are there any updates on this? For a uni project, I was looking into a synchronization layer design using CRDTs. Here is my repo in Dart: https://github.com/Manuelbaun/sync_layer_crdt_playground.

I actually use some form of delta-CRDTs, sending only the mutation instead of the operation or the full state. It works nicely, but it adds a lot of overhead. For time tracking, I use hybrid logical clocks. In my use cases (just prototyping) my design worked fine, but it still needs a lot of work. For instance, I am not deleting anything, and garbage collection will be needed at some point.

And of course, there are a lot of design issues 😆 and I haven't made it into a library just yet.
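
As background on hybrid logical clocks, here is a minimal sketch of the usual update rules: a timestamp combines the highest physical time seen so far with a counter that orders events sharing that time. It is not taken from the linked repository; the `Hlc` class, its field names, and passing the wall-clock time in as a parameter are all assumptions made to keep the example deterministic.

```dart
/// Hypothetical hybrid logical clock timestamp, for illustration only.
class Hlc implements Comparable<Hlc> {
  final int millis; // highest physical time seen so far
  final int counter; // orders events that share the same millis
  final String nodeId; // final tie-breaker between nodes

  const Hlc(this.millis, this.counter, this.nodeId);

  /// Timestamp for a local event or an outgoing message.
  Hlc send(int wallMillis) => wallMillis > millis
      ? Hlc(wallMillis, 0, nodeId)
      : Hlc(millis, counter + 1, nodeId);

  /// Timestamp after receiving a message stamped with [remote].
  Hlc receive(Hlc remote, int wallMillis) {
    var top = millis;
    if (remote.millis > top) top = remote.millis;
    if (wallMillis > top) top = wallMillis;

    int next;
    if (top == millis && top == remote.millis) {
      next = (counter > remote.counter ? counter : remote.counter) + 1;
    } else if (top == millis) {
      next = counter + 1;
    } else if (top == remote.millis) {
      next = remote.counter + 1;
    } else {
      next = 0; // the wall clock is ahead of both logical clocks
    }
    return Hlc(top, next, nodeId);
  }

  @override
  int compareTo(Hlc other) {
    if (millis != other.millis) return millis.compareTo(other.millis);
    if (counter != other.counter) return counter.compareTo(other.counter);
    return nodeId.compareTo(other.nodeId);
  }
}

void main() {
  var a = const Hlc(0, 0, 'a');
  var b = const Hlc(0, 0, 'b');

  a = a.send(100); // a's wall clock reads 100
  b = b.receive(a, 90); // b's wall clock is behind, at 90

  // b's timestamp still sorts after a's, despite the slower wall clock.
  print(a.compareTo(b) < 0); // true
}
```

The point of the counter is visible in `main`: device b has a slower clock, yet the event it records after hearing from a is still ordered after a's event.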

RastislavMirek · 2021-07-12T13:42:59Z

Is this still planned?

CodingArcher · 2021-12-17T04:44:23Z

Any further news on how the offline to online sync is coming along?
Has anyone been able to work something out?

themisir · 2021-12-17T17:39:58Z

> Any further news on how the offline to online sync is coming along? Has anyone been able to work something out?

There are no plans to implement this feature anytime soon. Why?

1. It's best to keep things simple. It's nice to have something that does lots of things for you, but as system complexity increases it becomes harder to maintain, and the software itself becomes "bloated" over time. Having a Swiss Army knife in big projects also usually causes issues, because bloated software tends to have its own workflow that doesn't play well with the existing software architecture. In general, I currently prefer keeping the current implementation stable over adding new features.
2. An online sync implementation depends on how the server side is implemented, which usually differs from project to project. One project may need static bearer key authentication while another requires an authentication token based on the logged-in user; one may consume XML while another can only consume JSON. We could provide a library for back-end schemas / rules, but it might not play well with the back end already in place.
3. You can implement your own sync system based on what is currently available. You can either serialize data into JSON and send it to your server, or send the contents of the database file (I would prefer the first option). How you implement it really comes down to your own design decisions.

I'm open to feedback and suggestions on that.
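
As a rough sketch of the first option (serializing data to JSON before sending it), something like the following could work. It assumes the box contents are already available as a `Map<String, dynamic>` with JSON-encodable values; `encodeBox` and `decodeEntries` are hypothetical helper names, and the actual upload (HTTP client, authentication) is left out because it depends on your back end.

```dart
import 'dart:convert';

/// Wraps the contents of one box in a JSON payload.
/// Hypothetical helper; the payload shape is an assumption.
String encodeBox(String boxName, Map<String, dynamic> contents) =>
    jsonEncode({'box': boxName, 'entries': contents});

/// Extracts the entries from a payload produced by [encodeBox].
Map<String, dynamic> decodeEntries(String payload) {
  final decoded = jsonDecode(payload) as Map<String, dynamic>;
  return Map<String, dynamic>.from(decoded['entries'] as Map);
}

void main() {
  // Stand-in for data read from a local box.
  final contents = <String, dynamic>{'theme': 'dark', 'fontSize': 14};

  final payload = encodeBox('settings', contents);
  print(payload);
  // {"box":"settings","entries":{"theme":"dark","fontSize":14}}

  // On restore, decode the payload and write the entries back locally.
  print(decodeEntries(payload)['fontSize']); // 14
}
```

Custom objects would need their own `toJson`/`fromJson` step before this works, since `jsonEncode` only handles numbers, strings, booleans, null, lists, and string-keyed maps by default.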

simc added the enhancement New feature or request label Sep 12, 2019

simc pinned this issue Sep 12, 2019

simc added this to the Stretch Goals milestone Oct 11, 2019

simc added the hive label Oct 11, 2019

simc unpinned this issue Oct 19, 2019

simc removed this from the Stretch Goals milestone Feb 7, 2020

simc removed the hive label Feb 12, 2020

remboshelby mentioned this issue Feb 24, 2020

HiveError. Strange database crash on flutter. #240

Closed

themisir closed this as completed Aug 9, 2020
