feat: initial update to the changelog for 0.5.0 #6977

Merged
merged 1 commit into from
Apr 7, 2020
327 changes: 327 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,332 @@
# go-ipfs changelog

## 0.5.0 RC1 2020-04-06

**WARNING** THIS IS A DRAFT! It highlights some of the new features, along with
a bunch of important errata. But it's _definitely_ missing a _ton_ of shiny new
features.
Review comment (Contributor), on lines +6 to +7:

Remove the word "missing" - it took me 3 reads to understand it is the doc that is missing features, and not the RC itself that got cut back.

Suggested change
a bunch of important errata. But it's _definitely_ missing a _ton_ of shiny new
features.
a bunch of important errata. But it _definitely_ does not come close
to fully describing the _ton_ of shiny new features.


### Highlights & Errata

This release includes many important changes users should be aware of.

#### New DHT

This release includes an almost completely rewritten DHT implementation with a
new protocol version. From a user's perspective, providing content, finding
content, and resolving IPNS records should simply get faster. However, this is a
_significant_ (albeit well tested) change and significant changes are always
risky, so heads up.

##### Old v. New

The current DHT suffers from three core issues addressed in this release:

1. Most peers in the DHT cannot be dialed (e.g., due to firewalls and NATs).
Much of a DHT query time is wasted trying to connect to peers that cannot be
reached.
2. The DHT query logic doesn't properly terminate when it hits the end of the
query and, instead, aggressively keeps on searching.
3. The routing tables are poorly maintained. This can cause a search that should
be logarithmic in the size of the network to be linear.
Review comment (Contributor):

Suggested change
be logarithmic in the size of the network to be linear.
scale logarithmically with the network size, to slow down linearly instead.


###### Reachable
Review comment (Contributor):

Suggestion: for an operator this section reads as "ugh... I might not like that". I would recommend reordering the reachability concern to be last: this way one reads about 2 no brainer good solutions first, and then encounters this section.


We have addressed the problem of undialable nodes by having nodes wait to join
the DHT as "server" nodes until they've confirmed that they are reachable from
the public internet. Additionally, we've introduced:

* A new libp2p protocol to push updates to our peers when we start/stop
listening on protocols.
* A libp2p event bus for processing updates like these.
* A new DHT protocol version. New DHT nodes will not admit old DHT nodes into
their routing tables. Old DHT nodes will still be able to issue queries
against the new DHT; they just won't be queried or referred by new DHT nodes.
This way, old, potentially unreachable nodes with bad routing tables won't
pollute the new DHT.

Unfortunately, there's a significant downside to this approach: VPNs, offline
LANs, etc. where _all_ nodes on the network have private IP addresses and never
communicate over the public internet. In this case, none of these nodes would be
"publicly reachable".

To address this last point, go-ipfs 0.5.0 will run _two_ DHTs: one for private
networks and one for the public internet. That is, every node will participate
in a LAN DHT and a public WAN DHT.
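As a rough illustration of the LAN/WAN distinction, the sketch below sorts an IP address into one of the two DHTs based only on whether it is publicly routable. The helper name `dhtFor` and the exact criterion are assumptions made for this example; the rules go-ipfs ends up using may differ.

```go
package main

import (
	"fmt"
	"net/netip"
)

// dhtFor is a hypothetical helper (not a go-ipfs API). It reports which DHT
// an address would belong to, assuming the split is made purely on whether
// the IP address is publicly routable.
func dhtFor(addr string) (string, error) {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return "", err
	}
	// Private, loopback, and link-local addresses are not reachable from the
	// public internet, so they can only be useful in a LAN DHT.
	if ip.IsPrivate() || ip.IsLoopback() || ip.IsLinkLocalUnicast() {
		return "lan", nil
	}
	return "wan", nil
}

func main() {
	for _, a := range []string{"192.168.1.20", "10.0.0.5", "8.8.8.8"} {
		which, err := dhtFor(a)
		if err != nil {
			fmt.Println("invalid address:", err)
			continue
		}
		fmt.Printf("%s belongs in the %s DHT\n", a, which)
	}
}
```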

**RC1 NOTE:** Several of these features have not been enabled in RC1:

1. We haven't yet switched the protocol version and will be running the DHT in
"compatibility mode" with the old DHT. Once we flip the switch and enable the
new protocol version, we will need to ensure that at least 20% of the
publicly reachable DHT speaks the new protocol, all at once. The plan is to
introduce a large number of "booster" nodes while the network transitions.
2. We haven't yet introduced the split LAN/WAN DHTs. We're still testing this
approach and considering alternatives.
3. Because we haven't introduced the LAN/WAN DHT split, IPFS nodes running in
DHT server mode will continue to run in DHT server mode _without_ waiting to
confirm that they're reachable from the public internet. Otherwise, we'd
break IPFS nodes running DHTs in VPNs and disconnected LANs.

###### Query Logic

We've fixed the DHT query logic by correctly implementing Kademlia (with a few
tweaks). This should significantly speed up:

* Publishing IPNS & provider records. We previously continued searching for
closer and closer peers to the "target" until we timed out, then we put to the
closest peers we found.
* Resolving IPNS addresses. We previously continued IPNS record searches until
we ran out of peers to query, timed out, or found 16 records.

In both cases, we now continue until we find the closest peers, then stop.
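The termination rule can be sketched as a simplified, sequential Kademlia-style lookup over an in-memory network. This is not the go-libp2p-kad-dht implementation: the `network` type, the 32-bit peer IDs, and the `lookup` function are assumptions made for illustration, and real queries run several requests concurrently.

```go
package main

import (
	"fmt"
	"sort"
)

// network is a toy stand-in for the DHT: it maps each peer ID to the peers
// that peer returns when queried.
type network map[uint32][]uint32

// lookup iteratively queries the closest not-yet-queried peer (by XOR
// distance to target) and stops as soon as the k closest peers seen so far
// have all been queried. It returns those peers and the number of queries.
func lookup(nw network, target uint32, seeds []uint32, k int) ([]uint32, int) {
	seen := map[uint32]bool{}
	queried := map[uint32]bool{}
	var known []uint32
	queries := 0

	// add records newly learned peers and keeps known sorted by distance.
	add := func(ids []uint32) {
		for _, id := range ids {
			if !seen[id] {
				seen[id] = true
				known = append(known, id)
			}
		}
		sort.Slice(known, func(i, j int) bool {
			return known[i]^target < known[j]^target
		})
	}

	add(seeds)
	for {
		top := known
		if len(top) > k {
			top = top[:k]
		}
		next, found := uint32(0), false
		for _, id := range top {
			if !queried[id] {
				next, found = id, true
				break
			}
		}
		if !found {
			// Every one of the k closest peers has answered: terminate
			// instead of continuing to search farther peers.
			return top, queries
		}
		queried[next] = true
		queries++
		add(nw[next])
	}
}

func main() {
	nw := network{
		1:  {8, 12},
		8:  {12, 14},
		12: {14, 15},
		14: {15, 12},
		15: {14},
	}
	closest, queries := lookup(nw, 15, []uint32{1}, 2)
	fmt.Println("closest peers:", closest, "queries:", queries)
}
```

In this toy network, peer 8 is discovered but never queried, because two closer peers have already answered by the time it would be considered.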

###### Routing Tables

Finally, we've addressed the poorly maintained routing tables by:

* Reducing the likelihood that the connection manager will kill connections to
peers in the routing table.
* Keeping peers in the routing table, even if we get disconnected from them.
* Actively and frequently querying the DHT to keep our routing table full.

##### Testing
Review comment (Contributor):

This should go to the top somewhere - not-so-technical operators want to know "we've done our homework" first and foremost


The DHT rewrite was made possible by our new testing framework,
[testground](https://github.com/ipfs/testground), which allows us to spin up
multi-thousand node tests with simulated real-world network conditions. With
testground and some custom analysis tools, we were able to gain confidence that
the new DHT implementation behaves correctly.

#### Refactored Bitswap

This release includes a _major_ [bitswap refactor][bitswap-refactor] running a
new, but backwards compatible, bitswap protocol. We expect these changes to
improve performance significantly.

With the refactored bitswap, we expect:

* Few to no duplicate blocks when fetching data from other nodes speaking the
_new_ protocol.
* Better parallelism when fetching from multiple peers.

Note, the new bitswap won't magically make downloading content any faster until
both seeds and leeches have updated. If you're one of the first to upgrade to
0.5.0 and try downloading from peers that haven't upgraded, you're unlikely to
see much of a performance improvement, if any.

[bitswap-refactor]: https://blog.ipfs.io/2020-02-14-improved-bitswap-for-container-distribution/

#### Provider Record Changes

When you add content to your IPFS node, you advertise this content to the
network by announcing it in the DHT. We call this "providing".

However, go-ipfs has multiple ways to address the same underlying bytes.
Specifically, we address content by content ID (CID) and the same underlying
bytes can be addressed using (a) two different versions of CIDs (CIDv1 and
CIDv2) and (b) with different "codecs" depending on how we're interpreting the
data.

Review comment (Contributor), on lines +127 to +128:

Suggested change
bytes can be addressed using (a) two different versions of CIDs (CIDv1 and
CIDv2) and (b) with different "codecs" depending on how we're interpreting the
bytes can be addressed using (a) two different versions of CIDs (CIDv0 and
CIDv1) and (b) with different "codecs" depending on how we're interpreting the

Prior to go-ipfs 0.5.0, we used the content id (CID) in the DHT when sending out
provider records for content. Unfortunately, this meant that users trying to
find data announced using one CID wouldn't find nodes providing the content
under a different CID.

In go-ipfs 0.5.0, we're announcing data by _multihash_, not _CID_. This way,
regardless of the CID version used by the peer adding the content, the peer
trying to download the content should still be able to find it.
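To see why announcing by multihash sidesteps the CID-version problem, the sketch below extracts the multihash from binary CIDs using only the standard library. It assumes the binary layouts from the CID specification (a CIDv0 is a bare sha2-256 multihash; a CIDv1 is a version varint, a codec varint, then the multihash). The function name `multihashOf` is made up for this example, and real code would use a CID library instead.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// multihashOf returns the multihash embedded in a binary CID.
// Hypothetical helper for illustration only.
func multihashOf(cid []byte) ([]byte, error) {
	// CIDv0: 0x12 (sha2-256), 0x20 (32-byte digest), then the digest.
	if len(cid) == 34 && cid[0] == 0x12 && cid[1] == 0x20 {
		return cid, nil
	}
	// CIDv1: <version varint><codec varint><multihash>.
	version, n := binary.Uvarint(cid)
	if n <= 0 || version != 1 {
		return nil, errors.New("unsupported or malformed CID")
	}
	_, m := binary.Uvarint(cid[n:])
	if m <= 0 {
		return nil, errors.New("malformed CID codec")
	}
	return cid[n+m:], nil
}

func main() {
	// A placeholder digest; any 32 bytes will do for the illustration.
	digest := bytes.Repeat([]byte{0xab}, 32)
	mh := append([]byte{0x12, 0x20}, digest...)

	v0 := mh                                // CIDv0
	v1DagPB := append([]byte{0x01, 0x70}, mh...) // CIDv1, dag-pb codec
	v1Raw := append([]byte{0x01, 0x55}, mh...)   // CIDv1, raw codec

	for _, c := range [][]byte{v0, v1DagPB, v1Raw} {
		got, err := multihashOf(c)
		fmt.Println(err == nil && bytes.Equal(got, mh))
	}
}
```

All three CIDs reduce to the same multihash, so a provider record keyed by that multihash is found regardless of which CID form the requester holds.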

**Warning:** while parts of the network are still running older versions, this
could impact finding content added with CIDv1. Because go-ipfs 0.5.0 will
announce and search for content using the bare multihash (equivalent to the v0
CID), go-ipfs 0.5.0 will be unable to find CIDv1 content published by nodes
prior to go-ipfs 0.5.0 and vice-versa. As CIDv1 is _not_ enabled by default, we
believe this will have minimal impact.
However, users are _strongly_ encouraged to upgrade as soon as possible.

#### IPFS/Libp2p Address Format

If you've ever run a command like `ipfs swarm peers`, you've likely seen paths
that look like `/ip4/193.45.1.24/tcp/4001/ipfs/QmSomePeerID`. These paths are
_not_ file paths; they're multiaddrs: addresses of peers on the network.

Unfortunately, `/ipfs/Qm...` is _also_ the same path format we use for files.
This release changes the multiaddr format from
<code>/ip4/193.45.1.24/tcp/4001/<b>ipfs</b>/QmSomePeerID</code> to
<code>/ip4/193.45.1.24/tcp/4001/<b>p2p</b>/QmSomePeerID</code> to make the
distinction clear.

What this means for users:

* Old-style multiaddrs will still be accepted as inputs to IPFS.
* If you were using a multiaddr library (go, js, etc.) to name _files_ because
`/ipfs/QmSomePeerID` looks like `/ipfs/QmSomeFile`, your tool may break if you
upgrade this library.
* If you're manually parsing multiaddrs and are searching for the string
`/ipfs/...`, you'll need to search for `/p2p/...` instead.
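For tools that parse multiaddrs by hand, one way to ride out the transition is to accept both component names. The sketch below is a string-level illustration only; `peerIDFrom` is a hypothetical helper, and a multiaddr library is the more robust choice.

```go
package main

import (
	"fmt"
	"strings"
)

// peerIDFrom returns the peer ID that follows the last /p2p/ (new style) or
// /ipfs/ (old style) component of a textual multiaddr. Hypothetical helper
// for illustration; it does not validate the rest of the address.
func peerIDFrom(addr string) (string, bool) {
	parts := strings.Split(addr, "/")
	for i := len(parts) - 2; i >= 0; i-- {
		if parts[i] == "p2p" || parts[i] == "ipfs" {
			return parts[i+1], parts[i+1] != ""
		}
	}
	return "", false
}

func main() {
	for _, a := range []string{
		"/ip4/193.45.1.24/tcp/4001/ipfs/QmSomePeerID",
		"/ip4/193.45.1.24/tcp/4001/p2p/QmSomePeerID",
		"/ip4/193.45.1.24/tcp/4001",
	} {
		id, ok := peerIDFrom(a)
		fmt.Printf("%s -> %q (found: %v)\n", a, id, ok)
	}
}
```

Note that a file path such as `/ipfs/QmSomeFile` would also match here, which is exactly the ambiguity the new `/p2p/` name removes.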


#### Minimum RSA Key Size

Previously, IPFS did not enforce a minimum RSA key size. In this release, we've
introduced a minimum 2048 bit RSA key size. IPFS generates 2048 bit RSA keys by
default so this shouldn't be an issue for anyone in practice. However, users who
explicitly chose a smaller key size will not be able to communicate with new
nodes.
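A minimum-size check of this kind comes down to comparing the bit length of the RSA modulus against a threshold. The sketch below uses only the Go standard library; `minRSAKeyBits`, `checkRSAKeyBits`, and `checkRSAKey` are names invented for this example and are not the libp2p API.

```go
package main

import (
	"crypto/rsa"
	"fmt"
	"math/big"
)

// minRSAKeyBits mirrors the 2048-bit minimum described above.
const minRSAKeyBits = 2048

// checkRSAKeyBits rejects key sizes below the minimum.
func checkRSAKeyBits(bits int) error {
	if bits < minRSAKeyBits {
		return fmt.Errorf("rsa key too small: %d bits, need at least %d", bits, minRSAKeyBits)
	}
	return nil
}

// checkRSAKey applies the size check to a public key's modulus.
func checkRSAKey(pub *rsa.PublicKey) error {
	return checkRSAKeyBits(pub.N.BitLen())
}

func main() {
	// Stand-in moduli with the right bit lengths; these are not real keys.
	small := &rsa.PublicKey{N: new(big.Int).Lsh(big.NewInt(1), 1023), E: 65537}
	large := &rsa.PublicKey{N: new(big.Int).Lsh(big.NewInt(1), 2047), E: 65537}

	fmt.Println("1024-bit key:", checkRSAKey(small))
	fmt.Println("2048-bit key:", checkRSAKey(large))
}
```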

Unfortunately, some of the bootstrap peers _did_ intentionally generate 1024
bit RSA keys so they'd have vanity peer addresses (starting with QmSoL for
"solar net"). All IPFS nodes should _also_ have peers with >= 2048 bit RSA keys
in their bootstrap list, but we've introduced a migration to ensure this.

We implemented this change to follow security best practices and to remove a
potential foot-gun. However, in practice, the security impact of allowing
insecure RSA keys should have been next to none because IPFS doesn't trust other
peers on the network anyways.

#### Subdomain Gateway

The gateway will redirect from `http://localhost:5001/ipfs/CID/...` to
`http://CID.ipfs.localhost:5001/...` by default. This will:

* Ensure that every dapp gets its own browser origin.
* Make it easier to write websites that "just work" with IPFS because absolute
paths will now work.

Paths addressing the gateway by IP address (`http://127.0.0.1:5001/ipfs/CID`)
will not be altered as IP addresses can't have subdomains.
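The mapping from path-style to subdomain-style URLs can be sketched with the standard library as follows. This is an illustration of the rule described above, not the gateway's actual code; `subdomainURL` is a hypothetical name, only `/ipfs/` paths are handled, and the CID is copied as-is.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// subdomainURL rewrites http://host/ipfs/CID/rest to http://CID.ipfs.host/rest.
// It returns false when no redirect applies: the host is an IP address, or
// the path is not of the form /ipfs/CID[/...].
func subdomainURL(raw string) (string, bool) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", false
	}
	if net.ParseIP(u.Hostname()) != nil {
		return "", false // IP addresses can't have subdomains
	}
	parts := strings.SplitN(strings.TrimPrefix(u.Path, "/"), "/", 3)
	if len(parts) < 2 || parts[0] != "ipfs" || parts[1] == "" {
		return "", false
	}
	u.Host = parts[1] + ".ipfs." + u.Host
	u.Path = "/"
	if len(parts) == 3 {
		u.Path += parts[2]
	}
	return u.String(), true
}

func main() {
	for _, raw := range []string{
		"http://localhost:5001/ipfs/CID/index.html",
		"http://127.0.0.1:5001/ipfs/CID",
	} {
		redirect, ok := subdomainURL(raw)
		fmt.Printf("%s -> %q (redirect: %v)\n", raw, redirect, ok)
	}
}
```

Because hostnames are case-insensitive, a real gateway also needs the CID in a case-insensitive encoding before placing it in the hostname; that conversion is omitted here.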

Note: cURL doesn't follow redirects by default. To avoid breaking cURL and other
clients that don't support redirects, go-ipfs will return the requested file
along with the redirect. Browsers will follow the redirect and abort the
download while cURL will ignore the redirect and finish the download.

#### TLS By Default

In this release, we're switching TLS to be the _default_ transport. This means
we'll try to encrypt the connection with TLS before re-trying with SECIO.

Contrary to the announcement in the go-ipfs 0.4.23 release notes, this release
does not remove SECIO support. SECIO is retained to maintain compatibility with
js-ipfs.
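The effect of preferring TLS while still supporting SECIO can be pictured as picking the first entry in a local preference list that the remote peer also supports. The sketch below is a simplification: actual libp2p negotiation happens over the wire, `pickSecurity` is a hypothetical helper, and the protocol ID strings are included for illustration.

```go
package main

import "fmt"

// pickSecurity returns the first protocol in our preference order that the
// remote peer also supports. Hypothetical helper for illustration.
func pickSecurity(preferred, remote []string) (string, bool) {
	supported := make(map[string]bool, len(remote))
	for _, p := range remote {
		supported[p] = true
	}
	for _, p := range preferred {
		if supported[p] {
			return p, true
		}
	}
	return "", false
}

func main() {
	// TLS first, SECIO as the fallback.
	ours := []string{"/tls/1.0.0", "/secio/1.0.0"}

	upgraded := []string{"/secio/1.0.0", "/tls/1.0.0"}
	secioOnly := []string{"/secio/1.0.0"}

	p, _ := pickSecurity(ours, upgraded)
	fmt.Println("peer supporting both:", p)
	p, _ = pickSecurity(ours, secioOnly)
	fmt.Println("peer supporting only SECIO:", p)
}
```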

#### SECIO Deprecation Notice

SECIO should be considered to be well on the way to deprecation and will be
completely disabled in either the next release (0.6.0, ~mid May) or the one
following that (0.7.0, ~end of June). Before SECIO is disabled, support will be
added for the NOISE transport for compatibility with other IPFS implementations.

#### QUIC Upgrade

If you've been using the experimental QUIC support, this release includes

**RC1 NOTE:** We've temporarily backed out of the new QUIC version because it
currently requires go 1.14 and go 1.14 has some scheduler bugs that go-ipfs can
reliably trigger.

#### Badger Datastore

In this release, we're marking the badger datastore (enabled at initialization
with `ipfs init --profile=badgerds`) as stable. However, we're not yet enabling
it by default.

The benefit of badger is that adding/fetching data to/from badger is
_significantly_ faster than adding/fetching data to/from the default datastore,
flatfs. In some tests, adding data to badger is 32x faster than flatfs (in this
release).

However,

1. Badger is complicated while flatfs pushes all the complexity down into the
filesystem itself. That means that flatfs is only likely to lose your data
if your underlying filesystem gets corrupted while there are more
opportunities for badger itself to get corrupted.
2. Badger can use a lot of memory. In this release, we've tuned badger to use
very little (~20MiB) of memory by default. However, it can still produce
large (1GiB) spikes in memory usage when garbage collecting.
3. Badger isn't very aggressive when it comes to garbage collection and we're
still investigating ways to get it to more aggressively clean up after
itself.

TL;DR: Use badger if performance is your main requirement, you rarely/never
delete anything, and you have some memory to spare.

#### Systemd Support

For Linux users, this release includes support for two systemd features: socket
activation and startup/shutdown notifications. This makes it possible to:

* Start IPFS on demand on first use.
* Wait for IPFS to finish starting before starting services that depend on it.

You can find the new systemd units in the go-ipfs repo under misc/systemd.

#### IPFS API Over Unix Domain Sockets

This release supports exposing the IPFS API over a unix domain socket in the
filesystem. To use this feature, run:

```bash
ipfs config Addresses.API "/unix/path/to/socket/location"
```

#### Repo Migration

IPFS uses repo migrations to make structural changes to the "repo" (the config,
data storage, etc.) on upgrade.

This release includes two very simple repo migrations: a config migration to
ensure that the config contains working bootstrap nodes and a keystore migration
to base32 encode all key filenames.

In general, migrations should not require significant manual intervention.
However, you should be aware of migrations and plan for them.

* If you update go-ipfs with `ipfs update`, `ipfs update` will run the migration
for you.
* If you start the ipfs daemon with `ipfs daemon --migrate`, ipfs will migrate
your repo for you on start.

Otherwise, if you want more control over the repo migration process, you can
manually install and run the [repo migration
tool](http://dist.ipfs.io/#fs-repo-migrations).

#### Bootstrap Peer Changes

**AUTOMATIC MIGRATION REQUIRED**

The first migration will update the bootstrap peer list to:

1. Replace the old bootstrap nodes (ones with peer IDs starting with QmSoL),
with new bootstrap nodes (ones with addresses that start with
`/dnsaddr/bootstrap.libp2p.io`).
2. Rewrite the address format from `/ipfs/QmPeerID` to `/p2p/QmPeerID`.

We're migrating addresses for a few reasons:

1. We're using DNS to address the new bootstrap nodes so we can change the
underlying IP addresses as necessary.
2. The new bootstrap nodes use 2048 bit keys while the old bootstrap nodes use
1024 bit keys.
3. We're normalizing the address format to `/p2p/Qm...`.

Note: This migration won't _add_ the new bootstrap peers to your config if
you've explicitly removed the old bootstrap peers. It will also leave custom
entries in the list alone. In other words, if you've customized your bootstrap
list, this migration won't clobber your changes.
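The rules above can be sketched as a pure function over the bootstrap list. This is not the migration's actual code: `migrateBootstrap` and `normalize` are hypothetical names, the peer IDs in `main` are placeholders, and the sketch assumes that the address format is normalized for every entry that is kept, including custom ones.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize rewrites a trailing /ipfs/<peer ID> component to /p2p/<peer ID>.
func normalize(addr string) string {
	i := strings.LastIndex(addr, "/ipfs/")
	if i < 0 {
		return addr
	}
	return addr[:i] + "/p2p/" + addr[i+len("/ipfs/"):]
}

// migrateBootstrap replaces old bootstrap entries (peer IDs starting with
// QmSoL) with the given replacements, inserted once at the position of the
// first old entry. Other entries are kept. If the list contains no old
// entries, nothing is added.
func migrateBootstrap(list, replacements []string) []string {
	out := make([]string, 0, len(list))
	replaced := false
	for _, addr := range list {
		peerID := addr[strings.LastIndex(addr, "/")+1:]
		if strings.HasPrefix(peerID, "QmSoL") {
			if !replaced {
				out = append(out, replacements...)
				replaced = true
			}
			continue
		}
		out = append(out, normalize(addr))
	}
	return out
}

func main() {
	// Placeholder addresses and peer IDs, for illustration only.
	old := []string{
		"/ip4/192.0.2.1/tcp/4001/ipfs/QmSoLOldPeerA",
		"/ip4/192.0.2.2/tcp/4001/ipfs/QmSoLOldPeerB",
		"/ip4/10.0.0.7/tcp/4001/ipfs/QmCustomPeer",
	}
	replacements := []string{"/dnsaddr/bootstrap.libp2p.io/p2p/QmNewPeer"}

	for _, addr := range migrateBootstrap(old, replacements) {
		fmt.Println(addr)
	}
}
```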

#### Keystore Changes

**AUTOMATIC MIGRATION REQUIRED**

Go-IPFS stores additional keys (i.e., all keys other than the "identity" key) in
the keystore. You can list these keys with `ipfs key list`.

Currently, the keystore stores keys as regular files, named after the key
itself. Unfortunately, filename restrictions and case-insensitivity are platform
specific. To avoid platform specific issues, we're base32 encoding all key names
and renaming all keys on-disk.
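The idea behind the renaming can be illustrated with the standard library's base32 encoder. The `key_` prefix, the lowercase output, and the unpadded alphabet used below are assumptions chosen for this sketch and may not match the exact on-disk format; `encodeKeyName` and `decodeKeyName` are hypothetical helpers.

```go
package main

import (
	"encoding/base32"
	"fmt"
	"strings"
)

// prefix marks encoded filenames (an assumption for this sketch).
const prefix = "key_"

// enc is standard base32 without padding characters.
var enc = base32.StdEncoding.WithPadding(base32.NoPadding)

// encodeKeyName turns an arbitrary key name into a filename that uses only
// lowercase letters, digits, and an underscore, so it is safe on
// case-insensitive and restrictive filesystems.
func encodeKeyName(name string) string {
	return prefix + strings.ToLower(enc.EncodeToString([]byte(name)))
}

// decodeKeyName reverses encodeKeyName.
func decodeKeyName(file string) (string, error) {
	if !strings.HasPrefix(file, prefix) {
		return "", fmt.Errorf("%q is not an encoded key filename", file)
	}
	raw, err := enc.DecodeString(strings.ToUpper(strings.TrimPrefix(file, prefix)))
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	// Names differing only by case would collide on a case-insensitive
	// filesystem; their encoded forms do not.
	for _, name := range []string{"MyKey", "mykey", "site/blog"} {
		file := encodeKeyName(name)
		back, err := decodeKeyName(file)
		fmt.Println(name, "->", file, "->", back, err)
	}
}
```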

## 0.4.23 2020-01-29

Given the large number of fixes merged since 0.4.22, we've decided to cut another patch release.