Allow for rudimentary handling of non-text html parts. #2

ferdinandyb · 2023-06-09T18:16:59Z

Caeml currently assumes that the text/plain part is what you want to see. It does not provide any information about the existence of other message parts, nor is it obvious from the output whether the message is empty or whether it simply did not contain a text/plain part (some people might only send text/html).

Some thoughts about improvement:

- print a warning if no text/plain part is found
- allow a command-line override to output a part other than text/plain
- somehow print an enumeration of all the message parts (maybe as a special "header"?)
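
To make the first and third ideas concrete, here is a minimal sketch in Python using only the standard `email` package. It is not caeml's code; the helper names `list_parts` and `has_plain_text` and the sample message are made up for illustration.

```python
import sys
from email import message_from_string, policy

def list_parts(msg):
    """Return (index, content type) for every non-multipart part, in order."""
    leaves = [p for p in msg.walk() if not p.is_multipart()]
    return [(i, p.get_content_type()) for i, p in enumerate(leaves)]

def has_plain_text(msg):
    """True if at least one part is text/plain."""
    return any(ctype == "text/plain" for _, ctype in list_parts(msg))

# Hypothetical HTML-only message, the case that currently yields empty output.
RAW = """\
From: a@example.com
To: b@example.com
Subject: demo
MIME-Version: 1.0
Content-Type: text/html; charset="utf-8"

<p>Hello</p>
"""

msg = message_from_string(RAW, policy=policy.default)
parts = list_parts(msg)
for index, ctype in parts:
    # One possible "special header" style listing of the parts.
    print(f"Part {index}: {ctype}")
if not has_plain_text(msg):
    print("warning: no text/plain part found", file=sys.stderr)
```

Sending the warning to stderr keeps stdout clean for whatever consumes the rendered message.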

Some notes to myself:

Maybe something like this: iterate over all the parts. If we see text/plain, store it for output and write it to stdout once the whole iteration is done. If we haven't seen text/plain yet but we see text/html, store that for output, but overwrite it with text/plain if that turns up later. If the user passes text/html as the desired output, do the same, but don't overwrite it with text/plain (we probably need to store a flag saying that the stored output is not a fallback). We can do this for any other part: always fall back first to text/plain and, if that is not available, to text/html; but if we see the requested part, store that for output and set the flag saying that we are no longer on a fallback.
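
A rough sketch of that iteration, written in Python with the standard `email` package rather than in caeml's own code. The names `select_part` and `FALLBACK_ORDER` are hypothetical, and the choice to keep the first exact match when a type occurs more than once is an assumption.

```python
from email.message import EmailMessage

# Earlier entry = preferred fallback when the requested type is missing.
FALLBACK_ORDER = ["text/plain", "text/html"]

def select_part(msg, wanted="text/plain"):
    """Return (part, is_fallback); part is None if nothing usable was found."""
    chosen = None
    chosen_rank = len(FALLBACK_ORDER)  # worse than any real fallback
    is_fallback = True
    for part in msg.walk():
        if part.is_multipart():
            continue
        ctype = part.get_content_type()
        if ctype == wanted:
            if is_fallback:
                # Requested part found: store it and clear the flag so that
                # no later fallback candidate can overwrite it.
                chosen, is_fallback = part, False
        elif is_fallback and ctype in FALLBACK_ORDER:
            rank = FALLBACK_ORDER.index(ctype)
            if rank < chosen_rank:
                # A better fallback (e.g. text/plain over text/html).
                chosen, chosen_rank = part, rank
    return chosen, is_fallback

# Hypothetical message with text/html first and text/plain second.
msg = EmailMessage()
msg.set_content("<p>html body</p>", subtype="html")
msg.add_alternative("plain body", subtype="plain")

part, is_fallback = select_part(msg, wanted="text/html")
print(part.get_content_type(), is_fallback)
```

The returned flag could also drive the warning: if `is_fallback` is true, the output is not the part that was asked for.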

mpldr · 2023-06-10T06:23:36Z

You could DIY a simple HTML parser which basically turns <b>, <i>, <em>, <a> and so on into ANSI escape codes and strips anything else. Plus maybe a wrapper. This way, you wouldn't even have to create a tokenizer: you can just use a regex, or hand-write what the regex does if it is too slow (which I doubt). I wouldn't personally bother with CSS parsing and the like.

Would at least make the HTML experience more bearable :D
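
A minimal sketch of the regex approach described above, in Python. The tag-to-escape mapping (underline for <a>, for instance) and the function name `html_to_ansi` are assumptions made for illustration. It does not drop the contents of <script> or <style>, handle comments, or wrap lines.

```python
import html
import re

# (turn on, turn off) SGR escape codes for the handful of tags we keep.
TAG_CODES = {
    "b": ("\x1b[1m", "\x1b[22m"),
    "strong": ("\x1b[1m", "\x1b[22m"),
    "i": ("\x1b[3m", "\x1b[23m"),
    "em": ("\x1b[3m", "\x1b[23m"),
    "a": ("\x1b[4m", "\x1b[24m"),
}

# Matches an opening or closing tag, attributes included.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")

def html_to_ansi(markup):
    """Replace known tags with ANSI codes, strip all other tags."""
    def repl(match):
        closing, name = match.group(1), match.group(2).lower()
        codes = TAG_CODES.get(name)
        if codes is None:
            return ""  # unknown tag: drop it, keep the surrounding text
        return codes[1] if closing else codes[0]
    # Decode entities such as &amp; only after the tags are gone.
    return html.unescape(TAG_RE.sub(repl, markup))

print(html_to_ansi("<p>Hi <b>there</b> &amp; <i>you</i></p>"))
```

Using the specific "off" codes (22, 23, 24) instead of a full reset keeps nested tags from cancelling each other.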

agenbite · 2023-06-10T10:03:30Z

I think printing the structure and allowing for dumping different parts could be a good solution.

Thanks for the work!

ferdinandyb · 2023-06-10T10:25:11Z

Hmm, I don't think I want to get into html parsing. On the other hand, we could make a filter interface, similar to aerc. In that case it would make sense to fall back first to text/markdown, then, if that doesn't exist, to text/html, and if even that doesn't exist, to any text/*. Or something like this. Shouldn't be very complicated.
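
The fallback order could be expressed as an ordered list of patterns. The sketch below is Python with the standard library only; `PREFERENCE` and `pick_by_preference` are hypothetical names, and putting text/plain at the head of the list is an assumption. It only picks the part and does not model the filter commands themselves.

```python
from email.message import EmailMessage
from fnmatch import fnmatchcase

# Tried in order; "text/*" catches any remaining text part.
PREFERENCE = ["text/plain", "text/markdown", "text/html", "text/*"]

def pick_by_preference(msg, preference=PREFERENCE):
    """Return the first part matching the earliest pattern, or None."""
    leaves = [p for p in msg.walk() if not p.is_multipart()]
    for pattern in preference:
        for part in leaves:
            if fnmatchcase(part.get_content_type(), pattern):
                return part
    return None

# Hypothetical message without text/plain or text/markdown.
msg = EmailMessage()
msg.set_content("BEGIN:VCALENDAR", subtype="calendar")
msg.add_alternative("<p>html body</p>", subtype="html")

picked = pick_by_preference(msg)
print(picked.get_content_type())
```

With a list like this, a user-supplied preference is just a different value for `preference`.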

ferdinandyb · 2024-02-01T21:18:02Z

https://todo.sr.ht/~rjarry/aerc/224 Actually, what I proposed here might make caeml largely obsolete.

ferdinandyb self-assigned this Jun 9, 2023

ferdinandyb added the enhancement New feature or request label Jul 26, 2023

ferdinandyb added this to the release 1.0.0 milestone Jul 26, 2023
