This script converts a web page to a PDF file using Puppeteer (through its Python port, pyppeteer) and WeasyPrint.
WeasyPrint does not execute JavaScript, so we first load the page with Puppeteer and fetch the fully loaded HTML, CSS, and content. We then write that content to an HTML file, preserving the same structure. Finally, we convert this HTML file to a PDF file using WeasyPrint.
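The three steps above could be sketched roughly as follows. This is a minimal illustration, not the script shipped in this repository: the function names, the default file names, and the `networkidle0` wait condition are assumptions made for the example.

```python
import asyncio
from pathlib import Path


async def fetch_rendered_html(url: str, chrome_path: str) -> str:
    """Load the page in headless Chrome and return its HTML after scripts have run."""
    # Imported lazily so this module can be loaded without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch(executablePath=chrome_path, headless=True)
    try:
        page = await browser.newPage()
        # Assumption: waiting for the network to go idle is enough for the
        # page's JavaScript to finish building the content.
        await page.goto(url, {"waitUntil": "networkidle0"})
        return await page.content()
    finally:
        await browser.close()


def save_html(html: str, html_path: str) -> Path:
    """Write the fetched markup to an HTML file and return its path."""
    path = Path(html_path)
    path.write_text(html, encoding="utf-8")
    return path


def html_file_to_pdf(html_path: str, pdf_path: str) -> None:
    """Render a saved HTML file to PDF with WeasyPrint."""
    # Imported lazily so this module can be loaded without WeasyPrint installed.
    from weasyprint import HTML

    HTML(filename=html_path).write_pdf(pdf_path)


def url_to_pdf(url: str, chrome_path: str,
               html_path: str = "page.html", pdf_path: str = "page.pdf") -> None:
    """Hypothetical wrapper: fetch with pyppeteer, save, convert with WeasyPrint."""
    html = asyncio.run(fetch_rendered_html(url, chrome_path))
    save_html(html, html_path)
    html_file_to_pdf(html_path, pdf_path)
```

Calling `url_to_pdf(url, chrome_path)` with a page URL and the path to the Chrome executable would then leave `page.html` and `page.pdf` in the working directory.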
To run this script, install WeasyPrint and pyppeteer. You also need the path to the Chrome executable (chrome.exe on Windows); typical locations are listed below.
git clone https://github.com/B2-krunalrana/python_pdf_conversion.git
pip install WeasyPrint
pip install pyppeteer
pyppeteer: https://pypi.org/project/pyppeteer/
Images need to be converted into data URLs and embedded directly in the HTML file. This keeps each image inside the document itself, so the layout renders as expected in the PDF.
Image to data URL converter: https://ezgif.com/image-to-datauri
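If you would rather do the conversion locally than through the online tool, a data URL can be built with the Python standard library. This is a minimal sketch; the helper name `image_to_data_url` is made up for this example and is not part of the script.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(image_path: str) -> str:
    """Return the image file encoded as a base64 data URL."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognised image type: {image_path}")

    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be used as the `src` attribute of an `<img>` tag in the HTML file.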
Typical Chrome executable paths:
64-bit Windows:
C:\Program Files (x86)\Google\Chrome\Application\chrome.exe
32-bit Windows:
C:\Program Files\Google\Chrome\Application\chrome.exe
Xubuntu 20.04:
/opt/google/chrome/chrome
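Instead of hard-coding one of these paths, a small helper could pick the first one that exists on the current machine. This is only a sketch: `find_chrome` and `DEFAULT_CHROME_PATHS` are hypothetical names, and the candidate list contains just the three locations listed above.

```python
import os
from typing import Iterable, Optional

# The three locations listed above; extend this for other installations.
DEFAULT_CHROME_PATHS = (
    r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe",
    r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    "/opt/google/chrome/chrome",
)


def find_chrome(candidates: Iterable[str] = DEFAULT_CHROME_PATHS) -> Optional[str]:
    """Return the first candidate path that is an existing file, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

If `find_chrome()` returns `None`, Chrome is installed somewhere else and the path has to be supplied by hand.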
Just a heads up: we avoid CSS frameworks such as Bootstrap that rely on JavaScript to apply their styles, because they can cause compatibility issues with WeasyPrint.