Failed dependency installing on Mac #70

Open
abraccini77 opened this issue Mar 1, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@abraccini77

🐛 Describe the bug

I am trying to install olmOCR on macOS and I get this error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
chromadb 0.5.23 requires tokenizers<=0.20.3,>=0.13.2, but you have tokenizers 0.21.0 which is incompatible.

I have tried to uninstall the version I had and install one within the requested version range. However, then I get the following error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
transformers 4.49.0 requires tokenizers<0.22,>=0.21, but you have tokenizers 0.20.3 which is incompatible.
cached-path 1.6.7 requires huggingface-hub<0.28.0,>=0.8.1, but you have huggingface-hub 0.29.1 which is incompatible.

It appears the requirements for tokenizers by olmOCR and by transformers are impossible to satisfy at the same time. Any way to bypass this issue?

Versions

Python 3.11.11
ace_tools==0.0
aiohappyeyeballs==2.4.6
aiohttp==3.11.13
aiosignal==1.3.2
alembic==1.14.1
altair==5.5.0
annotated-types==0.7.0
anyio==4.8.0
asgiref==3.8.1
attrs==25.1.0
babel==2.17.0
backoff==2.2.1
bcrypt==4.2.1
beaker-py==1.34.1
beautifulsoup4==4.12.3
bibtexparser==2.0.0b8
bleach==6.2.0
blinker==1.9.0
boto3==1.37.4
botocore==1.37.4
build==1.2.2.post1
cached_path==1.6.7
cachetools==5.5.2
certifi==2025.1.31
cffi==1.17.1
charset-normalizer==3.4.1
chroma-hnswlib==0.7.6
chromadb==0.5.23
click==8.1.8
clldutils==3.24.1
cohere==5.13.12
colorama==0.4.6
coloredlogs==15.0.1
colorlog==6.9.0
cryptography==44.0.1
csvw==3.5.1
dataclasses-json==0.6.7
Deprecated==1.2.18
distro==1.9.0
dlinfo==2.0.0
docker==7.1.0
docstring_parser==0.16
durationpy==0.9
embedchain==0.1.127
fastapi==0.115.8
fastavro==1.10.0
filelock==3.17.0
flatbuffers==25.2.10
frozendict==2.4.6
frozenlist==1.5.0
fsspec==2025.2.0
ftfy==6.3.1
gitdb==4.0.12
GitPython==3.1.44
google-api-core==2.24.1
google-auth==2.38.0
google-cloud-aiplatform==1.82.0
google-cloud-bigquery==3.29.0
google-cloud-core==2.4.2
google-cloud-resource-manager==1.14.1
google-cloud-storage==2.19.0
google-crc32c==1.6.0
google-resumable-media==2.7.2
googleapis-common-protos==1.68.0
gptcache==0.1.44
grpc-google-iam-v1==0.14.0
grpcio==1.71.0rc2
grpcio-status==1.71.0rc2
grpcio-tools==1.70.0
h11==0.14.0
h2==4.2.0
hpack==4.1.0
html5lib==1.1
httpcore==1.0.7
httptools==0.6.4
httpx==0.28.1
httpx-sse==0.4.0
huggingface-hub==0.27.1
humanfriendly==10.0
hyperframe==6.1.0
idna==3.10
importlib_metadata==8.5.0
importlib_resources==6.5.2
inflect==7.5.0
isodate==0.7.2
Jinja2==3.1.5
jiter==0.8.2
jmespath==1.0.1
joblib==1.4.2
jsonpatch==1.33
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
kanjize==1.6.0
kubernetes==32.0.1
langchain==0.3.19
langchain-cohere==0.3.5
langchain-community==0.3.18
langchain-core==0.3.40
langchain-experimental==0.3.4
langchain-openai==0.2.14
langchain-text-splitters==0.3.6
langsmith==0.1.147
language-tags==1.2.0
lingua-language-detector==2.0.2
lxml==5.3.0
Mako==1.3.9
Markdown==3.7
markdown-it-py==3.0.0
markdown2==2.5.3
MarkupSafe==3.0.2
marshmallow==3.26.1
mdurl==0.1.2
mem0ai==0.1.56
mmh3==5.1.0
monotonic==1.6
more-itertools==10.6.0
mpmath==1.3.0
multidict==6.1.0
multitasking==0.0.11
mypy-extensions==1.0.0
narwhals==1.28.0
networkx==3.4.2
numpy==2.1.3
oauthlib==3.2.2
ollama==0.4.7
-e git+https://github.com/allenai/olmocr.git@701abdb95525dbbfe75c2fc288df90bbea080043#egg=olmocr
onnxruntime==1.20.1
openai==1.65.2
opentelemetry-api==1.30.0
opentelemetry-exporter-otlp-proto-common==1.30.0
opentelemetry-exporter-otlp-proto-grpc==1.30.0
opentelemetry-instrumentation==0.51b0
opentelemetry-instrumentation-asgi==0.51b0
opentelemetry-instrumentation-fastapi==0.51b0
opentelemetry-proto==1.30.0
opentelemetry-sdk==1.30.0
opentelemetry-semantic-conventions==0.51b0
opentelemetry-util-http==0.51b0
orjson==3.10.15
overrides==7.7.0
packaging==24.2
pandas==2.2.3
peewee==3.17.8
phonemizer==3.3.0
pillow==11.1.0
platformdirs==4.3.6
portalocker==2.10.1
posthog==3.16.0
propcache==0.3.0
proto-plus==1.26.0
protobuf==5.29.3
pyarrow==19.0.1
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycparser==2.22
pydantic==2.10.6
pydantic-settings==2.8.1
pydantic_core==2.27.2
pydeck==0.9.1
Pygments==2.19.1
pylatexenc==2.10
pyparsing==3.2.1
pypdf==5.3.0
PyPDF2==3.0.1
pypdfium2==4.30.1
PyPika==0.48.9
pyproject_hooks==1.2.0
pysbd==0.3.4
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-pptx==1.0.2
pytz==2024.2
PyYAML==6.0.2
qdrant-client==1.13.2
rdflib==7.1.3
referencing==0.36.2
regex==2024.11.6
requests==2.32.3
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
rfc3986==1.5.0
rich==13.9.4
rpds-py==0.22.3
rsa==4.9
s3transfer==0.11.3
safetensors==0.5.2
schema==0.7.7
segments==2.2.1
setuptools==75.8.0
shapely==2.0.7
shellingham==1.5.4
six==1.16.0
smart-open==7.1.0
smmap==5.0.2
sniffio==1.3.1
soupsieve==2.6
SQLAlchemy==2.0.38
starlette==0.45.3
streamlit==1.42.2
streamlit-chat==0.1.1
SudachiDict-full==20250129
SudachiPy==0.6.10
sympy==1.13.1
tabulate==0.9.0
tenacity==9.0.0
tiktoken==0.7.0
tokenizers==0.21.0
toml==0.10.2
torch==2.6.0
torchaudio==2.6.0
torchvision==0.21.0
tornado==6.4.2
tqdm==4.67.1
transformers==4.49.0
typeguard==4.4.1
typer==0.15.1
types-requests==2.32.0.20241016
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.2
uritemplate==4.1.1
urllib3==2.3.0
uvicorn==0.34.0
uvloop==0.21.0
watchfiles==1.0.4
wcwidth==0.2.13
webencodings==0.5.1
websocket-client==1.8.0
websockets==15.0
wrapt==1.17.2
XlsxWriter==3.2.1
yarl==1.18.3
yfinance==0.2.50
yt-dlp==2025.1.26
zipp==3.21.0
zstandard==0.23.0

abraccini77 added the bug label on Mar 1, 2025
@flywiththetide

🔍 Issue Summary

The installation fails because the reported version constraints conflict:

  • chromadb 0.5.23 requires tokenizers >= 0.13.2, <= 0.20.3
  • transformers 4.49.0 requires tokenizers >= 0.21, < 0.22
  • cached-path 1.6.7 requires huggingface-hub >= 0.8.1, < 0.28.0

The two tokenizers ranges do not overlap, so no single tokenizers version can satisfy both chromadb 0.5.23 and transformers 4.49.0 in the same environment.
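
To make the tokenizers conflict concrete, here is a minimal Python sketch that checks the two candidate versions from the report against both constraints. The constraint strings are copied from the pip errors above; the helper names (`parse`, `satisfies`) are made up for illustration, and the version parsing is deliberately simplified (plain numeric versions only, no pre-releases).

```python
import operator

# Comparison operators that can appear in the constraint strings.
OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}


def parse(version: str) -> tuple:
    """Turn '0.21' or '0.20.3' into a tuple of at least three integers."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec such as '<0.22,>=0.21'."""
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators first so '<=' is not read as '<'.
        op = clause[:2] if clause[:2] in OPS else clause[:1]
        if not OPS[op](parse(version), parse(clause[len(op):])):
            return False
    return True


# Constraints on tokenizers, as quoted in the pip error messages.
CHROMADB_SPEC = "<=0.20.3,>=0.13.2"
TRANSFORMERS_SPEC = "<0.22,>=0.21"

for candidate in ["0.20.3", "0.21.0"]:
    print(
        candidate,
        "chromadb ok:", satisfies(candidate, CHROMADB_SPEC),
        "transformers ok:", satisfies(candidate, TRANSFORMERS_SPEC),
    )
```

Each candidate passes one constraint and fails the other, which matches the two errors in the report.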


✅ Refined Solutions

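1️⃣ Use a Virtual Environment (Recommended)
A fresh virtual environment isolates olmOCR from unrelated packages already installed in the current environment, so their constraints on tokenizers no longer compete with olmOCR's:

python -m venv olmocr_env
source olmocr_env/bin/activate  # On macOS/Linux

Then install olmOCR inside the activated environment.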
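
If you prefer to script this, the standard-library `venv` module can do the same thing as `python -m venv`. The sketch below is only an illustration: the function names are made up, and `olmocr_env` is the directory name used in the commands above.

```python
import sys
import venv
from pathlib import Path


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def create_env(path: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at `path` and return its pyvenv.cfg file."""
    venv.create(path, with_pip=with_pip)
    return Path(path) / "pyvenv.cfg"


print("Running inside a virtual environment:", in_virtualenv())
```

Calling `create_env("olmocr_env")` creates the environment. You still need to activate it in your shell before installing olmOCR.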


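2️⃣ Manually Install Compatible Dependency Versions
Pin the packages to versions that satisfy as many of the reported constraints as possible:

pip install "transformers==4.49.0" "tokenizers>=0.21,<0.22" "huggingface-hub<0.28.0"
  • This satisfies transformers 4.49.0 (tokenizers >= 0.21, < 0.22) and cached-path 1.6.7 (huggingface-hub < 0.28.0).
  • chromadb 0.5.23 will still report a conflict, because it requires tokenizers <= 0.20.3. Pinning tokenizers==0.20.3 instead breaks transformers 4.49.0, as the second error in the report shows.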
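
After pinning, it helps to confirm what actually ended up installed. Here is a small sketch using the standard-library `importlib.metadata`; the package list is taken from the error messages in this issue, and the function name is just for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the pip error messages in this issue.
PACKAGES = ["transformers", "tokenizers", "huggingface-hub", "chromadb", "cached-path"]


def installed_versions(names):
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Run it with the same interpreter you use for olmOCR so that it inspects the right environment.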

3️⃣ Fall Back to pip’s Legacy Resolver
If the issue persists, you can ask pip to use its older resolver:

pip install --upgrade --no-cache-dir olmocr transformers tokenizers huggingface-hub --use-deprecated=legacy-resolver
  • --no-cache-dir makes pip download the packages again instead of reusing its cache.
  • --use-deprecated=legacy-resolver switches to pip's old resolver, which does not enforce consistency across all requirements. The install may complete, but the conflict is ignored rather than resolved, so one of the packages may still fail at import time.
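
Because an install can complete while leaving the environment inconsistent, it is worth running `pip check` afterwards. A minimal sketch that calls it from Python for the current interpreter (the function name is made up for illustration):

```python
import subprocess
import sys


def run_pip_check():
    """Run `pip check` for this interpreter and return (exit code, combined output)."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "check"],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr


code, report = run_pip_check()
# A non-zero exit code means pip found broken requirements (or could not run).
print(report if code else "No broken requirements found.")
```

A non-zero exit code here means the conflict is still present, even if the install command itself finished without stopping.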

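4️⃣ Alternative: Install olmOCR Without Resolving Its Dependencies
--no-deps tells pip to install only olmOCR itself and skip all of its dependencies, which you then install by hand:

pip install olmocr --no-deps
pip install "transformers==4.49.0" "tokenizers>=0.21,<0.22"
  • This installs olmOCR separately while you manage its dependencies manually. Any other package olmOCR needs must also be installed by hand.
  • If chromadb is necessary in the same environment, look for a chromadb version whose tokenizers requirement allows 0.21, for example by trying an older release (verify its tokenizers requirement before relying on it):
pip install "chromadb<0.5.0"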
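
To decide whether chromadb is needed at all, it can help to see which installed packages declare it (or tokenizers) as a requirement. The sketch below scans the current environment with `importlib.metadata`. The helper names are made up, and the name matching is a simplification that ignores extras and environment markers.

```python
import re
from importlib.metadata import distributions


def normalize(name: str) -> str:
    """Normalize a project name so 'Cached_Path' and 'cached-path' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def required_by(target: str):
    """List (package, requirement string) pairs for installed packages that require `target`."""
    target = normalize(target)
    hits = []
    for dist in distributions():
        for req in dist.requires or []:
            # The project name is the leading run of name characters in the requirement.
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req)
            if match and normalize(match.group(0)) == target:
                hits.append((dist.metadata["Name"] or "unknown", req))
    return sorted(hits)


for package, requirement in required_by("tokenizers"):
    print(f"{package} requires {requirement}")
```

Calling `required_by("chromadb")` in the same way shows which installed packages, if any, pull chromadb in.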

🚀 Next Steps

Let me know if any of these solutions work for you!
