Commit

Merge branch 'main' of https://github.com/allenai/pdelfin into main
jakep-allenai committed Sep 23, 2024
2 parents a3feca0 + d589b56 commit 79feb98
Showing 4 changed files with 1 addition and 1 deletion.
File renamed without changes.
@@ -140,7 +140,7 @@ def main():
     parser.add_argument("--glob_path", type=str, help="Local or S3 path glob (e.g., *.pdf or s3://bucket/pdfs/*.pdf).")
     parser.add_argument("--path_list", type=str, help="Path to a file containing paths to PDFs, one per line.")
     parser.add_argument("--num_sample_docs", type=int, default=5000, help="Number of PDF documents to sample.")
-    parser.add_argument("--first_n_pages", type=int, default=5, help="Always sample the first N pages of each PDF.")
+    parser.add_argument("--first_n_pages", type=int, default=0, help="Always sample the first N pages of each PDF.")
     parser.add_argument("--max_sample_pages", type=int, default=15, help="Max number of pages to sample per PDF.")
     parser.add_argument("--output", type=str, default="openai_batch_data", help="Output destination")
     args = parser.parse_args()
File renamed without changes.
File renamed without changes.
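
The only functional change in this commit is the default of `--first_n_pages`, which is now 0. The sketch below illustrates what that default could mean for page selection. Only the two flags and their help strings come from the diff; `build_parser` and `sample_page_indices` are hypothetical names, and the sampling logic is an assumption about how "always sample the first N pages" and "max number of pages to sample" might combine, not the repository's actual implementation.

```python
import argparse
import random


def build_parser() -> argparse.ArgumentParser:
    # Only the two page-sampling flags from the diff are reproduced here.
    parser = argparse.ArgumentParser()
    parser.add_argument("--first_n_pages", type=int, default=0,
                        help="Always sample the first N pages of each PDF.")
    parser.add_argument("--max_sample_pages", type=int, default=15,
                        help="Max number of pages to sample per PDF.")
    return parser


def sample_page_indices(num_pages, first_n_pages, max_sample_pages, rng):
    """Hypothetical helper (not from the repository): pick 0-based page indices.

    Assumed behavior: the first `first_n_pages` pages are always kept, then
    random pages from the remainder fill up to `max_sample_pages` in total.
    """
    # Never keep more leading pages than the document has or the cap allows.
    first = list(range(min(first_n_pages, num_pages, max_sample_pages)))
    remaining = list(range(len(first), num_pages))
    budget = min(max_sample_pages - len(first), len(remaining))
    extra = rng.sample(remaining, budget)
    return first + sorted(extra)


if __name__ == "__main__":
    args = build_parser().parse_args([])
    rng = random.Random(0)
    # With the new default of 0, no page is guaranteed to be included.
    print(sample_page_indices(40, args.first_n_pages, args.max_sample_pages, rng))
```

Under these assumptions, passing `--first_n_pages 5` explicitly would restore the previous default, where the first five pages of every document are always part of the sample.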
