HTTP 431 errors on very large diagrams for svg generation #12

Closed
ryepup opened this issue Dec 1, 2020 · 9 comments
Labels
wontfix This will not be worked on

Comments


ryepup commented Dec 1, 2020

With very large diagrams I'm getting back "431 Request Header Fields Too Large". I have some code generating diagrams from various data sources, and am running into this pathological case.

Reproduction case

  • generate a very large diagram that's not very compressible; here's an example script:
#!/bin/bash
# Build a diagram whose text is large and nearly incompressible:
# 500 nodes, each labeled with 32 random alphanumeric characters.

echo "graph LR" > poison.mermaid
echo "  target" >> poison.mermaid

for i in {1..500}
do
    # random 32-character label for each node
    id=$(tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 32 | head -n 1)
    echo "node$i[\"$id\"] --> target" >> poison.mermaid
done
  • copy the contents of that graph into the mermaid live editor
  • verify the big graph gets rendered
  • click "Link to SVG" (or copy the link and paste the URL into a locally running version of mermaid.ink)
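The request-line growth can be estimated with a quick shell check. This is only a sketch: the live editor actually base64-encodes a JSON envelope around the diagram, so real URLs are somewhat larger than the raw base64 shown here.

```shell
# Estimate how much a diagram grows when carried in the URL path.
# base64 inflates the payload by about 4/3, so a diagram of roughly
# 12 KB of poorly compressible text can already push the request line
# past the 16 KB default header limit.
printf 'graph LR\n  a --> b\n' > example.mermaid  # tiny stand-in diagram
raw=$(wc -c < example.mermaid)
enc=$(base64 < example.mermaid | tr -d '\n' | wc -c)
echo "raw=${raw} encoded=${enc}"
```

Running the same check against the poison.mermaid file generated above makes it clear why 500 random 32-character labels overflow the limit.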

Expected behavior: the same graph is rendered

Actual behavior: 431 Request Header Fields Too Large

Analysis

This 431 response is being returned from inside Node's http module. Changes made in Node.js to resolve a denial-of-service vulnerability involving large HTTP headers (CVE-2018-12121) introduced the --max-http-header-size setting, which defaults to 16 KB. If the URL is too long, the underlying HTTP server rejects the request with a 431.

Workarounds

  • if running locally, add --max-http-header-size to the start script in package.json
    • e.g. "start": "node --max-http-header-size=102400000 src/index.js"
  • if running via docker, use NODE_OPTIONS to increase --max-http-header-size
    • e.g. docker run --rm -it -e 'NODE_OPTIONS="--max-http-header-size=102400000"' -p 3000:3000 jihchi/mermaid.ink
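To confirm the override actually took effect, Node.js exposes the active limit as the read-only http.maxHeaderSize property, which reflects the --max-http-header-size flag:

```shell
# Prints the active header-size limit in bytes; with the override it
# should report 102400000 rather than the 16384-byte default.
NODE_OPTIONS="--max-http-header-size=102400000" node -p "require('http').maxHeaderSize"
```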

ryepup commented Dec 1, 2020

I'm not really sure a code change is desirable here; the public mermaid.ink service probably wants to keep max-http-header-size low to prevent DoS attacks. Maybe just a note in the readme?


jihchi commented Dec 2, 2020

(attached generated file in case someone wanted to give it a try: poison.mermaid.zip)



jihchi commented Dec 2, 2020

Appreciate your thorough investigation!

Yes, I would like to keep max-http-header-size low to prevent DoS attacks.

I'll mention it in the readme!


jihchi commented Dec 4, 2020

I'm thinking of providing another way to use mermaid.ink with large diagrams; one option would be to support external URLs such as gists, for example:

https://gist.github.com/jihchi/e1422b9bcd0ec9f89e3afa1629939402

When you access https://mermaid.ink/{img,svg}/gist/e1422b9bcd0ec9f89e3afa1629939402, mermaid.ink will try to fetch a file called graph.mermaid from the gist and generate an image from its content.
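A consumer-side sketch, assuming the URL shape proposed above (the gist id is the example gist from this comment):

```shell
# Build the proposed gist-backed mermaid.ink URL for a given gist id.
GIST_ID="e1422b9bcd0ec9f89e3afa1629939402"
FORMAT="svg"   # or "img"
URL="https://mermaid.ink/${FORMAT}/gist/${GIST_ID}"
echo "$URL"   # → https://mermaid.ink/svg/gist/e1422b9bcd0ec9f89e3afa1629939402
# The client would then fetch the rendered image, e.g.:
#   curl -o diagram.svg "$URL"
```

Since the gist id is short and fixed-length, this keeps the request line tiny regardless of how large the diagram is.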


ryepup commented Dec 4, 2020

That's an interesting idea! My automation is trapped in a corporate network, but that might be easier for a lot of people.

We could generalize that to fetching from anywhere (e.g. https://mermaid.ink/{img,svg}?src=https://...) but that might create a different attack vector.

For gists, it'd be nice to have as few constraints as possible for the consumers. What do you think about iterating through all the files in the gist, and rendering the first one that parses successfully? I figure the most common use case would be a gist with one file that is mermaid graph, and working without requiring a specific name would be nice for consumers.

Another thought: what if we also supported POST https://mermaid.ink/{img,svg} with the graph in the POST body? That should let us keep max-http-header-size as-is and give an option for the rare cases of big graphs. I'm unclear on the economics behind https://mermaid.ink; is it OK to encourage more compute work there, or should we try to guide people with extreme graphs to run it in their own Docker container?
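Why a POST body would sidestep the limit can be demonstrated against a stock Node.js HTTP server. This is a local sketch, not mermaid.ink itself: --max-http-header-size bounds the request line and headers, while the body is unaffected.

```shell
# Start a minimal Node server that echoes the number of body bytes received.
node -e 'require("http").createServer((req, res) => {
  let n = 0;
  req.on("data", c => n += c.length);
  req.on("end", () => res.end(String(n)));
}).listen(3999)' &
SERVER_PID=$!
sleep 1

# 50 KB of data in the POST body: accepted, server reports 50000 bytes.
BODY=$(head -c 50000 /dev/zero | tr '\0' 'a')
curl -s -X POST --data "$BODY" http://localhost:3999/; echo

# The same 50 KB in the URL overflows the 16 KB default header limit,
# so Node rejects it with 431 before user code ever runs.
curl -s -o /dev/null -w '%{http_code}\n' "http://localhost:3999/$BODY"

kill $SERVER_PID
```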


jihchi commented Dec 5, 2020

> We could generalize that to fetching from anywhere (e.g. https://mermaid.ink/{img,svg}?src=https://...) but that might create a different attack vector.

Yes, I'd prefer that mermaid.ink itself, as a public service, support only specific external services.

> For gists, it'd be nice to have as few constraints as possible for the consumers. What do you think about iterating through all the files in the gist, and rendering the first one that parses successfully? I figure the most common use case would be a gist with one file that is mermaid graph, and working without requiring a specific name would be nice for consumers.

I'm a bit concerned about iterating through all the files in the gist; if something goes wrong, it might be problematic for the service. If the most common use case is a gist with a single file, how about we just take the first file of the gist and render that one?

> Another thought: what if we also supported POST https://mermaid.ink/{img,svg} with the graph in the POST body? That should let us keep max-http-header-size as-is and give an option for the rare cases of big graphs. I'm unclear of the economics behind https://mermaid.ink; is it OK to encourage more compute work there, or try to guide people with extreme graphs to the run it in their own docker?

Supporting POST https://mermaid.ink/{img,svg} is a good idea. It is OK for mermaid.ink to do more compute work.


jihchi commented Jan 19, 2022

This issue could be mitigated by #32

@PeppeL-G

I was wondering how large a diagram mermaid.ink could handle (to see whether I could use it on my website, or whether I would run into this issue), so I ran some tests. Using the test script in the first post, I found that it works with 325 nodes but not with 375 nodes. To get a sense of how large the code and the diagram are with 325 nodes, see:

I don't think this will ever be a problem for me. Thanks for a very useful service!


stale bot commented Sep 7, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Sep 7, 2022
@stale stale bot closed this as completed Sep 14, 2022