Can't convert doc to html #5

Closed
zhongjin616 opened this issue Nov 11, 2021 · 8 comments
Labels
enhancement New feature or request

Comments

@zhongjin616

Hi, I tested unoconvert with LibreOffice 7.1.7:

unoconvert 97html转换文档.doc 977.html

This causes an error:

INFO:unoserver:Starting unoconverter.
INFO:unoserver:Opening 97html转换文档.doc
Traceback (most recent call last):
  File "/root/miniconda3/envs/docvert/bin/unoconvert", line 9, in <module>
    sys.exit(main())
  File "/root/miniconda3/envs/docvert/lib/python3.8/site-packages/unoserver/converter.py", line 246, in main
    result = converter.convert(
  File "/root/miniconda3/envs/docvert/lib/python3.8/site-packages/unoserver/converter.py", line 186, in convert
    raise RuntimeError(
RuntimeError: Could not find an export filter from com.sun.star.text.TextDocument to generic_HTML

However, running libreoffice --headless --convert-to html html转换文档.doc directly succeeds.
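Until the filter lookup is sorted out, the direct command-line conversion mentioned above could be wrapped in a script. This is only a minimal sketch: the helper names `build_convert_command` and `convert_with_cli` are hypothetical and not part of unoserver, and it assumes a `libreoffice` binary is on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(infile, outdir=".", fmt="html", binary="libreoffice"):
    """Return the argv list for a headless LibreOffice conversion."""
    return [
        binary,
        "--headless",
        "--convert-to",
        fmt,
        "--outdir",
        str(outdir),
        str(infile),
    ]


def convert_with_cli(infile, outdir=".", fmt="html", binary="libreoffice"):
    """Run the conversion and return the path LibreOffice is expected to write."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    # check=True raises CalledProcessError if LibreOffice exits with an error.
    subprocess.run(build_convert_command(infile, outdir, fmt, binary), check=True)
    # LibreOffice names the output after the input file's stem plus the new extension.
    extension = fmt.split(":")[0]
    return Path(outdir) / f"{Path(infile).stem}.{extension}"
```

Calling `convert_with_cli("html转换文档.doc")` would start a separate LibreOffice process for each file, so it gives up the speed advantage of a running unoserver.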

@regebro
Member

regebro commented Jan 12, 2022

Could you share the document? It doesn't seem to be a general problem.

@lublak

lublak commented Feb 17, 2022

@regebro I have the same issue with a pptx file.
Just create a simple pptx and run it through unoserver: unoconvert somepresi.pptx somehtml.html.
I can't share my file at the moment because it contains personal data (author, last editing user, etc.). The error is:
Could not find an export filter from com.sun.star.presentation.PresentationDocument to generic_HTML

@lublak

lublak commented Feb 17, 2022

I can force it with: filtername = "impress_html_Export"
But then I only get a single HTML file. It would be nice to support a folder export with a complete HTML structure, or a single file with embedded images.

@regebro
Member

regebro commented Mar 16, 2022

@lublak LibreOffice sees presentations as a graphical format and HTML as a document format, so there wouldn't be much to convert at all. It can only do useful conversions of presentations to PDF, IMO.

What is it you are attempting to do?

@lublak

lublak commented Mar 21, 2022

@regebro

I am trying to export presentations as full web pages, automatically in the background.

The LibreOffice user interface can already export a presentation as a complete HTML page, so the capability exists. I just don't know how to trigger it from the command line.

Steps in the user interface:
Export: [screenshot]
as html: [screenshot]
Standard-HTML: [two screenshots]

@regebro
Member

regebro commented Mar 21, 2022

Yeah, that includes defining export formats, etc, which I don't know how to do. It's possible it can be done if we implement support for filter flags, but I'm not sure even then.

If you can figure out how to do it with LibreOffice from the command line, I can look at implementing support for that.
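One thing that might be worth trying from the command line is naming the export filter explicitly. LibreOffice's `--convert-to` option accepts the form `extension:FilterName`. The sketch below only builds such a command; the helper names are hypothetical, the filter name is the one reported earlier in this thread, and whether it yields the full multi-page web export rather than a single HTML file is untested here.

```python
import shlex


def convert_to_argument(extension, filter_name=None):
    """Build the value for --convert-to, optionally naming an export filter."""
    if filter_name is None:
        return extension
    return f"{extension}:{filter_name}"


def soffice_command(infile, extension, filter_name=None, binary="soffice"):
    """Return the argv list for a headless conversion with an optional explicit filter."""
    return [
        binary,
        "--headless",
        "--convert-to",
        convert_to_argument(extension, filter_name),
        str(infile),
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    cmd = soffice_command("somepresi.pptx", "html", "impress_html_Export")
    print(shlex.join(cmd))
```

Building the argument list separately from running it keeps filter names containing spaces or parentheses safe, because `shlex.join` quotes them when printing.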

regebro added the enhancement (New feature or request) label Mar 21, 2022
@felixble

felixble commented Sep 27, 2022

Hi @regebro,

First of all, thanks a lot for your great work on this package!

We are having the same issue when using unoconvert to convert odt to html.

Executing unoconvert --convert-to html test.odt test.html causes the following error:

INFO:unoserver:Starting unoconverter.
INFO:unoserver:Opening //b577c1152d56441fa928bd54d914ee07.odt
Traceback (most recent call last):
  File "/usr/local/bin/unoconvert", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/unoserver/converter.py", line 248, in main
    inpath=args.infile, outpath=args.outfile, convert_to=args.convert_to
  File "/usr/local/lib/python3.7/dist-packages/unoserver/converter.py", line 186, in convert
    f"Could not find an export filter from {import_type} to {export_type}"
RuntimeError: Could not find an export filter from com.sun.star.text.TextDocument to generic_HTML

We can do it with LibreOffice from the command line with the following command: soffice --headless --convert-to html test.odt.

This produces the following output on the cli: convert /data/test.odt -> /data/test.html using filter : HTML (StarWriter).

It looks like there is an issue when figuring out the correct filter in

def find_filter(self, import_type, export_type):

EDIT:
Hardcoding the variable filtername to "HTML (StarWriter)" works in our tests. That is, we replaced the line

filtername = self.find_filter(import_type, export_type)
with
filtername = "HTML (StarWriter)"

Could you please have a look at this? Can this filter be added?

Thanks!
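The hardcoded workaround could be generalized into a small fallback table that is consulted only when the normal lookup finds nothing. This is a hedged sketch, not unoserver's actual code: `FALLBACK_FILTERS` and `resolve_filter` are hypothetical names, the two filter names are the ones reported as working in this thread, and it assumes the lookup function returns None when no filter matches.

```python
# Hypothetical fallback table; filter names are taken from the reports in this thread.
FALLBACK_FILTERS = {
    ("com.sun.star.text.TextDocument", "generic_HTML"): "HTML (StarWriter)",
    (
        "com.sun.star.presentation.PresentationDocument",
        "generic_HTML",
    ): "impress_html_Export",
}


def resolve_filter(import_type, export_type, find_filter, override=None):
    """Pick an export filter: explicit override, then normal lookup, then fallback table.

    find_filter is any callable taking (import_type, export_type) and
    returning a filter name or None (an assumption about the lookup).
    """
    if override:
        return override
    name = find_filter(import_type, export_type)
    if name is None:
        name = FALLBACK_FILTERS.get((import_type, export_type))
    if name is None:
        raise RuntimeError(
            f"Could not find an export filter from {import_type} to {export_type}"
        )
    return name
```

Accepting an `override` argument also covers the case of forcing a specific filter by hand, which is what both workarounds in this thread do.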

@asmundstavdahl

Can be solved with #59
