Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idiomatic text buffers #97

Closed
rjkroege opened this issue Jul 15, 2018 · 12 comments
Closed

Idiomatic text buffers #97

rjkroege opened this issue Jul 15, 2018 · 12 comments
Assignees
Labels
Idiomatic Bugs tracing code cleanup in progress Work has started...

Comments

@rjkroege
Copy link
Owner

Edwood could stand to have a new and improved text storage system. We've talked about various notions. A piece table with deltas would seem to be the most reasonable choice.

@rjkroege rjkroege added the Idiomatic Bugs tracing code cleanup label Jul 15, 2018
@mibk
Copy link

mibk commented Jul 16, 2018

Not sure that would be a good fit for Edwood, but years ago I came across a vim-like editor called vis. Its text management system was based on the Project Oberon (the introduction blog post: http://www.brain-dump.org/blog/entry/144/vis_a_highly_efficient_vim_like_text_editor). I ported (and then slightly changed the API) the text management part of the editor into the package undo.

@rjkroege
Copy link
Owner Author

Thanks. This sounds promising. I'll take a look at your undo package.

@rjkroege
Copy link
Owner Author

Notes on using the undo package (or something similar)

  • I need to eliminate all the places where Edwood reaches into File. In particular, seq management and the list of observing Text instances need to be excised.
  • I can remove File's list of Texts if I simply get each Text to redraw if it needs to at the end of any mutation.
  • Text can deduce if it needs to draw by comparing a File's seq to the "lastdrawn" seq.
  • undo may need some extensions. In particular, I may want to ask the underlying text buffer implementation what changes have happened since the last seq so that they can be replayed into a Text when we redraw (if necessary.)

Concrete Steps

  • de-globalize seq
  • rationalize and document "dirty" tracking
  • remove the observer pattern
  • clarify the API for text buffers
  • insert undo

@rjkroege rjkroege added the in progress Work has started... label Nov 24, 2018
@rjkroege rjkroege self-assigned this Nov 24, 2018
@rjkroege
Copy link
Owner Author

Strings vs Runes

  • The Undo package operates on byte buffers.
  • Edwood's Buffer object is an array of rune.
  • Using Undo would therefore entail a choice: switch Edwood to using utf-8 buffers (or strings) or modify Undo to use rune.

I somewhat prefer switching Edwood to utf-8 buffers if only because for most workloads it will greatly the heap space used for various text buffers by nearly a factor of 4.

This however seems like a somewhat far-reaching alteration. Many access patterns to Buffer iterate (e.g. ReadC loops in text.go or addr.go.) But sam addressing can be to a specific character.

@fhs Do you have an opinion on this?

Sequencing

It's not obvious to me how to sequence this work. I have several incomplete CLs that attempt to make progress on this task per comment above and none of them seem to necessarily move forward. I think the primary first task it to make sure that all Edwood code outside of File stops depending on its internal details around dirty tracking.

@fhs
Copy link
Contributor

fhs commented Jan 25, 2019

Switching to UTF-8 buffers does seem nicer, but the problem is sam addressing. Maybe we can use golang.org/x/exp/utf8string to efficiently index strings by rune. Given how much work is involved to switch to utf8 buffer, I think for now, we can just pick a Buffer implementation that works on runes, write some benchmarks, and call it a day. After Edwood has more test coverage and easier to refactor, we can come back to this problem.

The Buffer implementation should probably be moved into an internal package. It should also return errors for invalid index instead of panic, which will make it easier to test.

rjkroege added a commit that referenced this issue Feb 16, 2019
Text maintains a cache of text insertions that should be Undo-able
as a group. Relocate this cache into File so that the File API surface
makes a small step closer to the undo.Buffer which encapsulates
undo with well-defined Commit points.

A step towards #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
There is a frequent pattern in the code to reach into File's buffer
implementation to determine the number of runes contained in the
File. This impedes replacing File's internal details so switch
external users to the Size function on File objects.

Helps with #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
Modify File.Commit to have API alignment with undo.Buffer's Commit.
Helps with #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
Advance the semantic fit between undo.Buffer and File by renaming the
existing Insert and Delete methods to InsertAt and DeleteAt.

Helps #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
Move the FileHash code into a separate file to simplify the File code. Cleanup
to help with #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
Refactor the code in Text.Insert to let File.InsertAt (uniformly) invoke
an observer pattern on any insertion. Helps with #97.

NB: not complete. Rebase in follow-on cleanup CLs. In particular,
Undo doesn't redraw the box correctly in this change.
rjkroege added a commit that referenced this issue Feb 16, 2019
Refactor the code in DeleteAt to run observers. Helps #97. Fixes the
problem with deletion in Zerox Windows noted in the previous CL.
rjkroege added a commit that referenced this issue Feb 16, 2019
Helps with #97. Moving towards to keeping the observer pattern an
opaque detail.
rjkroege added a commit that referenced this issue Feb 16, 2019
Reduce the number of places that look into internal details of File by
adding a HasMultipleTexts accessor. Simplifies code for helping #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
Altering File for #97 permits removing the use of curtext from allupdate.
rjkroege added a commit that referenced this issue Feb 16, 2019
Rewrite File.Load in terms of File.InsertAt to adapt it to the
undo.Buffer API surface. This also simplifies the code by Re-using the
new observer structure. Helps with #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
File.Close does not map to undo.Buffer so clean up this and some other
unnecessary code. Helps with #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
Remove an unnecessary use of File's internal details. Helps #97.
rjkroege added a commit that referenced this issue Feb 16, 2019
Window shouldn't reach into File internal details. Start reducing this.
Code cleanup tangential to #97.
rjkroege added a commit that referenced this issue Feb 17, 2019
* Relocate the typing cache

Text maintains a cache of text insertions that should be Undo-able
as a group. Relocate this cache into File so that the File API surface
makes a small step closer to the undo.Buffer which encapsulates
undo with well-defined Commit points.

A step towards #97.

* Modify Text to use file.ReadC directly.

* Additional code cleanup in Text

Post relocating of the typing cache, cleanup some additional code.

* Stop looking into File's internals for the buffer size.

There is a frequent pattern in the code to reach into File's buffer
implementation to determine the number of runes contained in the
File. This impedes replacing File's internal details so switch
external users to the Size function on File objects.

Helps with #97.

* Align File.Commit with undo.Buffer

Modify File.Commit to have API alignment with undo.Buffer's Commit.
Helps with #97.

* Rename Insert to InsertAt and Delete to DeleteAt

Advance the semantic fit between undo.Buffer and File by renaming the
existing Insert and Delete methods to InsertAt and DeleteAt.

Helps #97.

* Pull hash into separate file

Move the FileHash code into a separate file to simplify the File code. Cleanup
to help with #97.

* File.InsertAt runs its observers

Refactor the code in Text.Insert to let File.InsertAt (uniformly) invoke
an observer pattern on any insertion. Helps with #97.

NB: not complete. Rebase in follow-on cleanup CLs. In particular,
Undo doesn't redraw the box correctly in this change.

* File.DeleteAt runs all its observers

Refactor the code in DeleteAt to run observers. Helps #97. Fixes the
problem with deletion in Zerox Windows noted in the previous CL.

* Remove an unncessary use of File.text

Helps with #97. Moving towards to keeping the observer pattern an
opaque detail.

* Introduce File.HasMultipleTexts()

Reduce the number of places that look into internal details of File by
adding a HasMultipleTexts accessor. Simplifies code for helping #97.

* Cleanup curtext use

Altering File for #97 permits removing the use of curtext from allupdate.

* Trivial code cleanup

* Convert File.Load to use File.InsertAt

Rewrite File.Load in terms of File.InsertAt to adapt it to the
undo.Buffer API surface. This also simplifies the code by Re-using the
new observer structure. Helps with #97.

* Cleanup File.Close

File.Close does not map to undo.Buffer so clean up this and some other
unnecessary code. Helps with #97.

* Remove unnecessary File.text access

Remove an unnecessary use of File's internal details. Helps #97.

* Trivial code cleanup

Remove some unnecessary code and add a comment.

* Assorted code cleanups in Window

Window shouldn't reach into File internal details. Start reducing this.
Code cleanup tangential to #97.

* Minor code cleanups.

Further reduce the cases where Window reaches into File's deails.

* Fix a rebase edit error
@rjkroege
Copy link
Owner Author

Update on this work: I want to relocate File into a separate module so that its public interface becomes obvious. This will permit writing unit tests that exercise the public interface that would (ideally) not be fragile across replacing the File implementation.

rjkroege added a commit that referenced this issue Mar 4, 2019
Reduce unnecessary uses of File.text. Helps with #97. Simplify the
code.
rjkroege added a commit that referenced this issue Mar 4, 2019
Relocate Window.isscratch to File as it's state unique to the File
object. This change also further reduces use of File.text from outside
of File. Helps with #97.
rjkroege added a commit that referenced this issue Mar 4, 2019
Helps with #97: make File.Unmodded into File.Clean to better align
with undo.Buffer.Clean
rjkroege added a commit that referenced this issue Mar 4, 2019
Window.isdir is state identical for each Window.body.file so relocate
it into File. Use this to remove external accesses to File.mod. Helps
with #97.
rjkroege added a commit that referenced this issue Feb 9, 2022
SaveableAndDirty only needs to be true for buffers with Undo so
there's no need to invoke HasUncommitedChanges. Helps with #97 because
file.Buffer does not have a Commit concept.
rjkroege added a commit that referenced this issue Feb 9, 2022
SaveableAndDirty only needs to be true for buffers with Undo so
there's no need to invoke HasUncommitedChanges. Helps with #97 because
file.Buffer does not have a Commit concept.
rjkroege added a commit that referenced this issue Feb 9, 2022
Previous changes have added a newer alternative to file.File:
file.Buffer that provides a more efficient backing storage mechanism
for textual data with undo. This CL adds an adapter that implements
ObservableEditableBuffer in terms of file.Buffer. Noteworthy changes:

* supporting the oeb callback logic for Undo/Reo from file.Buffer
* track the bounds of Undo/Redo action
* track undoable file name changes in file.Buffer
* relocate oeb callbacks where possible to do so
* modify oeb tests to invoke both the file.File and file.Buffer
implementations

Provides the core of #97 -- a more idiomatic text buffer
implementation.
rjkroege added a commit that referenced this issue Feb 9, 2022
Previous changes have added a newer alternative to file.File:
file.Buffer that provides a more efficient backing storage mechanism
for textual data with undo. This CL adds an adapter that implements
ObservableEditableBuffer in terms of file.Buffer. Noteworthy changes:

* supporting the oeb callback logic for Undo/Reo from file.Buffer
* track the bounds of Undo/Redo action
* track undoable file name changes in file.Buffer
* relocate oeb callbacks where possible to do so
* modify oeb tests to invoke both the file.File and file.Buffer
implementations

Provides the core of #97 -- a more idiomatic text buffer
implementation.
rjkroege added a commit that referenced this issue Feb 10, 2022
Previous changes have added a newer alternative to file.File:
file.Buffer that provides a more efficient backing storage mechanism
for textual data with undo. This CL adds an adapter that implements
ObservableEditableBuffer in terms of file.Buffer. Noteworthy changes:

* supporting the oeb callback logic for Undo/Reo from file.Buffer
* track the bounds of Undo/Redo action
* track undoable file name changes in file.Buffer
* relocate oeb callbacks where possible to do so
* modify oeb tests to invoke both the file.File and file.Buffer
implementations

Provides the core of #97 -- a more idiomatic text buffer
implementation.
rjkroege added a commit that referenced this issue Feb 11, 2022
With this change, Edwood can successfully do interactive editing
operations when using the newtype buffers. (Run with --newtypebuffer).
Helps with #97.
rjkroege added a commit that referenced this issue Feb 14, 2022
With this change, Edwood can successfully do interactive editing
operations when using the newtype buffers. (Run with --newtypebuffer).
Helps with #97.
rjkroege added a commit that referenced this issue Feb 15, 2022
This CL completes Edwood's ability to use the file.Buffer
implementation and passes all tests with the new Buffer
implementation. Helps with #97. Fixes some subtle bugs detected by
unit tests.
rjkroege added a commit that referenced this issue Feb 15, 2022
This CL completes Edwood's ability to use the file.Buffer
implementation and passes all tests with the new Buffer
implementation. Helps with #97. Fixes some subtle bugs detected by
unit tests.
@rjkroege
Copy link
Owner Author

As of #430, Edwood is operating with the new-type utf8 byte buffers. Exciting! Thanks everybody who helped with the implementation. This unlocks many possible code improvements.

rjkroege added a commit that referenced this issue Feb 15, 2022
file.Buffer was in a file undo.go. Rename undo.go to buffer.go. This
is follow up work to help #97.
rjkroege added a commit that referenced this issue Feb 22, 2022
file.Buffer was in a file undo.go. Rename undo.go to buffer.go. This
is follow up work to help #97.
rjkroege added a commit that referenced this issue Feb 22, 2022
With the completion of file.Buffer, remove the previous implementation.
This cleanup is part of #97. Specific changes:

* turn the clipboard into a []byte
* start refactoring the warning system and store warnings as []byte
* convert all complex tests to the flexible window scaffold to make
testing best practices easy.
* actually remove the unused code file.File and file.RuneArray
rjkroege added a commit that referenced this issue Feb 22, 2022
With the completion of file.Buffer, remove the previous implementation.
This cleanup is part of #97. Specific changes:

* turn the clipboard into a []byte
* start refactoring the warning system and store warnings as []byte
* convert all complex tests to the flexible window scaffold to make
testing best practices easy.
* actually remove the unused code file.File and file.RuneArray
rjkroege added a commit that referenced this issue Feb 23, 2022
With the completion of file.Buffer, remove the previous implementation.
This cleanup is part of #97. Specific changes:

* turn the clipboard into a []byte
* start refactoring the warning system and store warnings as []byte
* convert all complex tests to the flexible window scaffold to make
testing best practices easy.
* actually remove the unused code file.File and file.RuneArray
rjkroege added a commit that referenced this issue Feb 24, 2022
With the completion of file.Buffer, remove the previous implementation.
This cleanup is part of #97. Specific changes:

* turn the clipboard into a []byte
* start refactoring the warning system and store warnings as []byte
* convert all complex tests to the flexible window scaffold to make
testing best practices easy.
* actually remove the unused code file.File and file.RuneArray
rjkroege added a commit that referenced this issue Feb 24, 2022
With the completion of file.Buffer, remove the previous implementation.
This cleanup is part of #97. Specific changes:

* turn the clipboard into a []byte
* start refactoring the warning system and store warnings as []byte
* convert all complex tests to the flexible window scaffold to make
testing best practices easy.
* actually remove the unused code file.File and file.RuneArray
rjkroege added a commit that referenced this issue Feb 24, 2022
With the completion of file.Buffer, remove the previous implementation.
This cleanup is part of #97. Specific changes:

* turn the clipboard into a []byte
* start refactoring the warning system and store warnings as []byte
* convert all complex tests to the flexible window scaffold to make
testing best practices easy.
* actually remove the unused code file.File and file.RuneArray
rjkroege added a commit that referenced this issue Feb 24, 2022
While implementing the newtype file.Buffer, I'd abstracted it
internally to exist both old and new types to exist simultaneously.
With the removal of old type buffers, remove this. Cleanup that's part
of #97.
rjkroege added a commit that referenced this issue Feb 24, 2022
While implementing the newtype file.Buffer, I'd abstracted it
internally to exist both old and new types to exist simultaneously.
With the removal of old type buffers, remove this. Cleanup that's part
of #97.
@rjkroege
Copy link
Owner Author

When should I consider this done? We now have new type buffers. But I've not finished taking advantage of them to improve Edwood operation. I'm going to declare this finished. Yay! 😀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Idiomatic Bugs tracing code cleanup in progress Work has started...
Projects
None yet
Development

No branches or pull requests

4 participants