[Python] Merging Multiple PDFs with Python #2

Open
yangruihan opened this issue Aug 1, 2020 · 1 comment
@yangruihan
Owner

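Merging Multiple PDFs with Python

I needed to merge several PDFs into one for a friend. The online tools I found were either paid or awkward to use, so I wrote a merge script in Python myself.

Life is short, I use Python.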


Development environment

  • macOS 10.14.6
  • Python 3.7.7

Install PyPDF2

pip3 install pypdf2
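
The class names the script imports depend on which PyPDF2 release pip installs: the script uses the older PdfFileReader / PdfFileWriter names, which PyPDF2 3.0 removed. The following is a minimal sketch for checking which naming style an installation supports. The helper name pypdf2_api_style is hypothetical (it is not part of PyPDF2), and the "before 3.0 means legacy names still work" rule is an assumption based on the PyPDF2 release notes.

```python
import importlib.util


def pypdf2_api_style():
    """Return 'missing', 'legacy', or 'modern' for the installed PyPDF2.

    Hypothetical helper for illustration; not part of PyPDF2 itself.
    """
    # find_spec returns None when the package cannot be imported
    if importlib.util.find_spec("PyPDF2") is None:
        return "missing"

    import PyPDF2

    # __version__ is a string such as "1.26.0" or "3.0.1"
    major = int(PyPDF2.__version__.split(".")[0])
    # Assumption: releases before 3.0 still accept PdfFileReader / PdfFileWriter
    return "legacy" if major < 3 else "modern"


if __name__ == "__main__":
    print(pypdf2_api_style())
```

If this prints "modern", the legacy class names in the script will raise an error, and installing an older release (for example with pip3 install "pypdf2<3") is one way to run the script unchanged.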

Write the script

The script is shown below and should be easy to follow.

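#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import os
from contextlib import ExitStack

# Note: PdfFileReader / PdfFileWriter are the legacy PyPDF2 class names.
# PyPDF2 3.0 removed them in favor of PdfReader / PdfWriter.
from PyPDF2 import PdfFileReader, PdfFileWriter

OUTPUT_NAME = "output.pdf"


# Merge PDFs
# file_list: list of files to merge, in order
# output_path: path of the merged output file
def merge_pdf(file_list, output_path):
    # Create the output PDF object
    output = PdfFileWriter()
    output_pages = 0

    # The input files must stay open until the output has been written,
    # so ExitStack closes all of them together at the end
    with ExitStack() as stack:
        # Walk the input list and add every page of each PDF to the output object
        for pdf_file in file_list:
            print(f"start merge {pdf_file} ...")

            # Open the input file
            ipdf = PdfFileReader(stack.enter_context(open(pdf_file, 'rb')))

            # Read its page count
            page_cnt = ipdf.getNumPages()
            output_pages += page_cnt

            print(f"- read pdf page {page_cnt}")

            # Add each page of the input file to the output object
            for p in range(page_cnt):
                output.addPage(ipdf.getPage(p))

        # Write the merged PDF
        with open(output_path, 'wb') as output_stream:
            output.write(output_stream)

    print(f"finish merge, sum page {output_pages}")


def main():
    files = []
    output_abs = os.path.abspath(OUTPUT_NAME)

    # Walk the current directory and collect every file ending in .pdf,
    # skipping the output of a previous run so it is not merged into itself
    for root, dirs, fs in os.walk(".", topdown=False):
        for file in fs:
            path = os.path.join(root, file)
            if file.endswith(".pdf") and os.path.abspath(path) != output_abs:
                files.append(path)

    if len(files) > 0:
        # Files are merged in lexicographic path order
        files = sorted(files)
        print(f"ready to merge {len(files)} files:")
        for f in files:
            print(f"-- {f}")

        merge_pdf(files, OUTPUT_NAME)


if __name__ == "__main__":
    main()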
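
On newer PyPDF2 releases the same merge can be written with the renamed classes. The following is a minimal sketch, assuming PyPDF2 2.0 or later, where PdfReader, PdfWriter, reader.pages and writer.add_page are available. The function name merge_pdf_modern is hypothetical and is not part of the original script.

```python
def merge_pdf_modern(file_list, output_path):
    """Merge the PDFs in file_list, in order, and return the total page count.

    Sketch only: assumes PyPDF2 >= 2.0 (PdfReader / PdfWriter names).
    """
    # Imported here so this sketch can be loaded even where PyPDF2 is absent
    from PyPDF2 import PdfReader, PdfWriter

    writer = PdfWriter()
    total_pages = 0

    for pdf_file in file_list:
        # PdfReader accepts a file path directly
        reader = PdfReader(pdf_file)
        for page in reader.pages:
            writer.add_page(page)
            total_pages += 1

    # Write the merged PDF
    with open(output_path, "wb") as stream:
        writer.write(stream)

    return total_pages
```

It takes the same arguments as merge_pdf above, for example merge_pdf_modern(["a.pdf", "b.pdf"], "output.pdf"), where the file names are placeholders.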
@yangruihan
Owner Author

pdfmerge.py.zip

@yangruihan changed the title from "Merging Multiple PDFs with Python" to "[Python] Merging Multiple PDFs with Python" on Mar 6, 2023