
Improve IP Logging for Analytics #384

Closed
jakkuh opened this issue Mar 21, 2019 · 3 comments · Fixed by #432

Contributor

jakkuh commented Mar 21, 2019

It'd be awesome if some work could be put into IP fetching; we currently see about a 30% average rate of "unknown" IPs.

I'm not sure how this is even possible as every request has to come from an IP.

Any insight?

jakkuh changed the title from "Improve IP Grabbing" to "Improve IP Logging for Analytics" on Mar 21, 2019
Contributor Author

jakkuh commented Mar 21, 2019

Here's some additional information on how Cloudflare handles HTTP request headers with regard to IP addresses.

https://support.cloudflare.com/hc/en-us/articles/200170986-How-does-Cloudflare-handle-HTTP-Request-headers-

There's also a header that supplies the country code of the originating visitor, so if the IP is unavailable for some reason, maybe we can still log the country via that?
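To make the idea concrete, here is a rough sketch of that fallback, assuming the header in question is Cloudflare's `CF-IPCountry`. The function name `countryFromCloudflare` is hypothetical and is not part of Shlink; it only illustrates reading a two-letter country code when no IP could be resolved.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of Shlink: returns a two-letter country code
 * from the CF-IPCountry header, or null when it is absent or unusable.
 *
 * @param array<string, string> $headers Request headers keyed by name.
 */
function countryFromCloudflare(array $headers): ?string
{
    // Header names are case-insensitive, so normalize the keys first.
    $normalized = array_change_key_case($headers, CASE_LOWER);
    $code = strtoupper(trim($normalized['cf-ipcountry'] ?? ''));

    // Anything that is not two letters is rejected (this also drops "T1").
    // "XX" is treated as "country unknown" in this sketch.
    if (preg_match('/^[A-Z]{2}$/', $code) !== 1 || $code === 'XX') {
        return null;
    }

    return $code;
}
```

This would only be used as a fallback: when the IP lookup fails, the visit could still be stored with a country but no city.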

Member

acelaya commented Mar 22, 2019

Accurately detecting the location of 100% of visits is virtually impossible.

Sometimes there are IP addresses that cannot be found in the GeoIp database or the fallback geolocation API, or they can only be partially located (we have the country, but not the city).

Also, there are visitors that do not send any IP address. It is there when real users visit the link from a browser, but if you share a link on Twitter (for example), you will automatically see several visits from their crawlers, which cannot be located. This applies to any other kind of web bot.

These kinds of visits are currently included with the rest. You can determine how many visits are not coming from real users based on the browsers chart. Those that did not provide a user agent that can be matched to a standard browser are probably not real users.
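As a rough illustration of that heuristic (not how Shlink classifies visits), a check like the one below flags visits whose user agent is missing or looks like a crawler. The function name `looksLikeBot` and the keyword list are assumptions made for this sketch, and real bot detection would need a much more complete list.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical heuristic: true when the user agent is missing or contains
 * a keyword commonly found in crawler and command-line client names.
 */
function looksLikeBot(?string $userAgent): bool
{
    // No user agent at all is treated as "probably not a real user".
    if ($userAgent === null || trim($userAgent) === '') {
        return true;
    }

    // Illustrative keyword list only; case-insensitive substring match.
    return preg_match('/bot|crawl|spider|slurp|curl|wget/i', $userAgent) === 1;
}
```

Counting visits where this returns true gives an approximate share of non-human traffic, which may explain part of the "unknown" rate.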

Shlink uses this library to determine the address: https://github.com/akrabat/ip-address-middleware

It checks the most common and standard headers where IP addresses usually live, but those headers can be configured, so if Cloudflare is using some non-standard header, it should be easy to include. I will take a look at the docs.

Member

acelaya commented Aug 1, 2019

I will include this in the next release, inspecting requests for the following list of headers:

$headersToInspect = [
    'X-Real-Ip',
    'CF-Connecting-IP',
    'True-Client-IP',
    'Forwarded',
    'X-Forwarded-For',
    'X-Forwarded',
    'X-Cluster-Client-Ip',
    'Client-Ip',
];

That should cover most of the cases.
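For anyone curious how a list like this gets used, here is a minimal sketch of the general idea: walk the headers in priority order and return the first value that parses as an IP. This is not the actual code from akrabat/ip-address-middleware; `firstClientIp` is a hypothetical name, and the `Forwarded` parsing is deliberately simplified.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch, not the akrabat/ip-address-middleware implementation.
 *
 * @param array<string, string> $headers          Request headers keyed by name.
 * @param list<string>          $headersToInspect Header names, highest priority first.
 */
function firstClientIp(array $headers, array $headersToInspect): ?string
{
    // Header names are case-insensitive, so compare them in lower case.
    $normalized = array_change_key_case($headers, CASE_LOWER);

    foreach ($headersToInspect as $name) {
        $value = $normalized[strtolower($name)] ?? '';

        // Proxy chains are comma-separated; take the left-most entry.
        $candidate = trim(explode(',', $value)[0]);

        // "Forwarded" uses the form `for=192.0.2.60;proto=http`.
        if (preg_match('/for=([^;]+)/i', $candidate, $match) === 1) {
            $candidate = trim($match[1], " \"[]");
        }

        // Accept the first candidate that is a valid IPv4 or IPv6 address.
        if (filter_var($candidate, FILTER_VALIDATE_IP) !== false) {
            return $candidate;
        }
    }

    return null;
}
```

Keep in mind that any client can set these headers, so they are only trustworthy when the request is known to come through a proxy you control or trust, such as Cloudflare.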
