Allow for rudimentary handling of non-text html parts. #2

Open
3 tasks
ferdinandyb opened this issue Jun 9, 2023 · 4 comments

Comments

@ferdinandyb
Owner

ferdinandyb commented Jun 9, 2023

Caeml currently assumes that the text/plain part is what you want to see. It gives no indication that other message parts exist, and its output does not show whether the message was empty or simply had no text/plain part (some people send only text/html).

Some thoughts about improvement:

  • print a warning if no text/plain part is found
  • allow a command-line override to output a part other than text/plain
  • somehow print an enumeration of all the message parts (maybe as a special "header"?)
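
As a rough illustration of the first and third ideas, here is a minimal Python sketch built on the standard library's `email` package. The function names and the `X-Parts` header name are hypothetical and chosen only for this example; nothing here is taken from caeml itself.

```python
import sys
from email.message import EmailMessage


def list_parts(msg):
    """Return the content type of every non-container part, in message order."""
    return [p.get_content_type() for p in msg.walk() if not p.is_multipart()]


def parts_header(msg, name="X-Parts"):
    """Format the part list as a pseudo-header line (header name is made up)."""
    return f"{name}: {', '.join(list_parts(msg))}"


def warn_if_no_plain(msg, stream=sys.stderr):
    """Print a warning and return True when the message has no text/plain part."""
    if "text/plain" not in list_parts(msg):
        print("warning: no text/plain part found", file=stream)
        return True
    return False


# Usage: a message with a plain-text body and an HTML alternative.
msg = EmailMessage()
msg.set_content("hello")
msg.add_alternative("<p>hello</p>", subtype="html")
print(parts_header(msg))  # X-Parts: text/plain, text/html
warn_if_no_plain(msg)     # prints nothing, a text/plain part exists
```

Writing the warning to stderr keeps stdout clean for the message body, which matters if the output is piped into a pager or another tool.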

Some notes to myself:

Maybe something like this: iterate over all the parts. If we see text/plain, store it and write it to stdout once the whole iteration is done. If we see text/html before any text/plain, store that instead, but overwrite it with text/plain if one turns up later. If the user asks for text/html as the output, do the same but never overwrite it with text/plain (we probably need a flag recording that the stored output is not a fallback). This generalizes to any other part: always fall back to text/plain first and to text/html if that is unavailable, but as soon as we see the requested part, store it and set the flag saying we are no longer on a fallback.
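
The selection logic in these notes could look roughly like the Python sketch below. It is only one possible reading of the notes: `select_part`, `FALLBACK_ORDER`, and the `wanted` parameter are names invented for the example, and it keeps the first matching part when a message contains several of the same type.

```python
from email.message import EmailMessage, Message
from typing import Optional, Tuple

# Fallback preference when the requested type is missing; lower index wins.
FALLBACK_ORDER = ("text/plain", "text/html")


def select_part(msg: Message, wanted: str = "text/plain") -> Tuple[Optional[Message], bool]:
    """Return (part, is_fallback); part is None if nothing suitable was found."""
    stored = None
    stored_rank = len(FALLBACK_ORDER)
    is_fallback = True  # flips to False once the requested part is stored
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body of their own
        ctype = part.get_content_type()
        if ctype == wanted:
            if is_fallback:  # keep the first requested part, ignore later ones
                stored, is_fallback = part, False
        elif is_fallback and ctype in FALLBACK_ORDER:
            rank = FALLBACK_ORDER.index(ctype)
            if rank < stored_rank:  # a better fallback overwrites a worse one
                stored, stored_rank = part, rank
    return stored, is_fallback


# Usage: ask for the HTML alternative explicitly.
msg = EmailMessage()
msg.set_content("hello")
msg.add_alternative("<p>hello</p>", subtype="html")
part, is_fallback = select_part(msg, wanted="text/html")
print(part.get_content_type(), is_fallback)  # text/html False
```

Returning the flag alongside the part lets the caller decide whether to print a warning when the output is not what was asked for.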

@ferdinandyb ferdinandyb self-assigned this Jun 9, 2023
@mpldr

mpldr commented Jun 10, 2023

You could DIY a simple HTML parser that turns <b>, <i>, <em>, <a> and so on into ANSI escape codes and strips everything else. Plus maybe a wrapper. That way you wouldn't even have to write a tokenizer: you can just use regex, or hand-write what the regex does if it turns out to be too slow (which I doubt). Personally, I wouldn't bother with CSS parsing and the like.

Would at least make the HTML experience more bearable :D
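
A minimal Python sketch of the regex approach described above, assuming bold for <b>/<strong>, italic for <i>/<em>, and underline for <a>. The tag-to-style mapping and the name `html_to_ansi` are choices made for this example only. The sketch does not handle comments, `<script>`/`<style>` contents, or attribute values containing `>`.

```python
import html
import re

# SGR escape pairs (start, end) per tag; the mapping is an assumption.
BOLD = ("\x1b[1m", "\x1b[22m")
ITALIC = ("\x1b[3m", "\x1b[23m")
UNDERLINE = ("\x1b[4m", "\x1b[24m")
ANSI = {"b": BOLD, "strong": BOLD, "i": ITALIC, "em": ITALIC, "a": UNDERLINE}

# Matches an opening or closing tag, with or without attributes.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def html_to_ansi(markup: str) -> str:
    """Replace known tags with ANSI codes and strip every other tag."""
    def repl(match: re.Match) -> str:
        closing, name = match.group(1), match.group(2).lower()
        codes = ANSI.get(name)
        if codes is None:
            return ""  # unknown tag: drop it, keep the surrounding text
        return codes[1] if closing else codes[0]

    # Unescape entities last so that "&lt;b&gt;" stays literal text.
    return html.unescape(TAG.sub(repl, markup))


print(html_to_ansi("<p>Hi <b>there</b> &amp; bye</p>"))
```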

@agenbite

I think printing the structure and allowing for dumping different parts could be a good solution.

Thanks for the work!

@ferdinandyb
Owner Author

Hmm, I don't think I want to get into HTML parsing. On the other hand, we could add a filter interface similar to aerc's. In that case it would make sense to fall back first to text/markdown, then to text/html if that doesn't exist, and finally to any text/* part. Or something like that. It shouldn't be very complicated.
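
To make the fallback chain concrete, here is a small Python sketch. It assumes text/plain stays the first choice, followed by the order given above, and it models filters as a plain mapping from content-type glob patterns to callables. `PREFERENCE`, `pick_type`, and `apply_filter` are hypothetical names, and the mapping is not meant to mirror aerc's actual configuration format.

```python
from fnmatch import fnmatch

# Preference order, most preferred first; "text/*" catches any other text type.
PREFERENCE = ("text/plain", "text/markdown", "text/html", "text/*")


def pick_type(available):
    """Return the most preferred content type from `available`, or None."""
    for pattern in PREFERENCE:
        for ctype in available:
            if fnmatch(ctype, pattern):
                return ctype
    return None


def apply_filter(ctype, body, filters):
    """Run `body` through the first filter whose pattern matches `ctype`."""
    for pattern, func in filters.items():
        if fnmatch(ctype, pattern):
            return func(body)
    return body  # no filter configured: pass the body through unchanged


# Usage: no text/plain or text/markdown, so text/html is chosen.
chosen = pick_type(["application/pdf", "text/calendar", "text/html"])
print(chosen)  # text/html
```

Keeping the preference list as data rather than as nested conditionals means a user-supplied override could simply be placed at the front of the list.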

@ferdinandyb ferdinandyb added the enhancement New feature or request label Jul 26, 2023
@ferdinandyb ferdinandyb added this to the release 1.0.0 milestone Jul 26, 2023
@ferdinandyb
Owner Author

Actually, what I proposed in https://todo.sr.ht/~rjarry/aerc/224 might make caeml largely obsolete.
