Skip to content

CosmWasm security best practices

ekez edited this page Sep 28, 2022 · 11 revisions

The commandments:

  1. Don't deserialize into Addr.
  2. Build systems that can be upgraded.
  3. Design for failure.

Important things to know:

  1. Your attackers have unlimited capital.
  2. IBC is permissionless.
  3. Math is dangerous.
  4. Gas.

Before we get into this, I'd like to provide a prelude about our threat model:

DAO DAO DAOs manage millions of dollars. Every time you commit a line of code to our smart contracts you should feel comfortable wagering that money on it working.

I do not know of an environment more hostile to software than blockchains. From a security perspective they are a hellscape of instant finality, immutable code, and pseudonymity. On top of that, our code is open-source, and all of our APIs are callable by anyone with a private key. Our only saving graces are determinism and atomicity, though even that works against us when contracts lock.

This is not to say that we can not write safe code. It is to say: we are in a uniquely challenging environment, we must approach our work with the utmost care and diligence.

Don't deserialize into Addr

CosmWasm's Addr type is not validated during deserialization. This means that messages with the Addr type have unexpected deserialization properties. For example, the following JSON payload will successfully deserialize into the corresponding enum:

{ "set_address": { "addr": "🏴‍☠️" } }
pub enum ExecuteMsg {
    SetAddress { addr: Addr }
}

To avoid unvalidated addresses in state, the String type should be used and then validated.

pub enum ExecuteMsg {
    SetAddress { addr: String }
}
// In the message handler.
let addr = deps.api.addr_validate(&addr).traspose()?;

Note the usage of map and transpose above. Where clarity is not impacted, we prefer standard library functions to custom logic on types. The Rust Standard Library is an extremely well used, security and performance conscious piece of software.

Build systems that can be upgraded

We are not clairvoyant. The systems we design should reflect this and be upgradable.

In order of most to least invasive, there are three ways a smart contract can be upgraded:

  1. A CosmWasm migration
  2. A config change
  3. A parent module swaps out a child module

When designing a system, choose the smallest hammer to meet your needs.

State minimization and module swap example

As an example of a module upgrade: say we're building a cw721-send rate limiter. Contract's that would like to use the rate limiter implement a proxy interface so they may receive the proxied NFTs, and can swap out and disable their proxy.

How do we allow users of this rate limiter to change the rate limit? We could add a contract-level admin, messages for nominating and confirming a new admin, and messages for updating the config.

To avoid this complexity, we may instead encourage consumers of our rate limited NFTs to create and use a new rate limiter. Now, to update the rate limit a consumer just removes the current rate limiter and installs a new one. This makes our rate limiting contract immutable without sacrificing functionality.

Design for failure

We are fallible. We should build systems that expect modules to fail and can detect and recover from those failures.

For example:

  1. Receivers of proposal module hooks are removed of they fail handling a message to stop the parent module from locking.
  2. Our proposal deposit system allows anyone to create proposals if an error occurs during deposit refund logic.
  3. A bridge may allow withdrawals of NFTs if it detects its counterparty on another chain has failed.

When writing tests, write tests that test functionality, then write more tests that test contract behavior under duress. Example.

Your attackers have unlimited capital

Juno may not have flash loans yet, but it will or I will die trying. Once this happens, we must design our systems as if attackers have unlimited capital for a single transaction.

If you'd like to protect against flash loans, make sure that there is no way that, in a single transaction, tokens can be purchased on an AMM and interact with your system.

For example, in DAO DAO there is a one block delay between when tokens are sent to the staking contract and when they are counted towards the staked token balance. This protects against flash loans as there is no way tokens can be purchased and used to vote in one transaction.

IBC is permissionless

For contract authors

In the parlance of networking, your IBC contracts have no firewalls and all their ports open. Anyone can attempt to connect to you at any time and you must either allow them to, or otherwise capitulate to an admin-managed firewall.

Should you choose the hero's journey and embrace permissionlessness, know that:

  1. Attackers can spin up chains with malicious-but-compliant state and connect to you. For example if you are a implementing a IBC token transfer protocol, an attacker could spin up a duplicate of another chain that you are connected to and send you messages to release assets belonging to the original chain. This is a completely valid action. In IBC, anyone can do whatever the hell they want. You must mint new debt vouchers for the exact duplicate tokens that can be redeemed on the exact duplicate chain.
  2. The other side of your connection may close the connection without your consent. You will discover this in the CloseConfirm message handler. You have one opportunity to react before everything sent over that channel is permanently locked.

With (1) in mind, here's an IBC packet with the fields that an attacker can set to arbitrary values labeled:

pub struct IbcPacket {
    /// Attacker controlled.
    pub data: Binary,
    /// Attacker controlled.
    pub src: IbcEndpoint,
    /// Trusted.
    pub dest: IbcEndpoint,
    /// Trusted.
    pub sequence: [u64,
    /// Attacker controlled.
    pub timeout: IbcTimeout,
}

Attacker controlled does not mean anarchy! It means that an attacker can spin up a chain such-that those fields are filled with arbitrary, but valid data.

For relayers

CosmWasm contracts can create infinite loops over IBC by calling back into themselves in their ACK handlers. I have written a proof-of-concept of this here. If you are relaying packets between two CosmWasm contracts, know that it is possible that those contracts will enter an infinite loop and cost you a large amount of money in fees.

Math is dangerous

If you are not a maniac and disable it, your CosmWasm contract will panic if an integer overflow or divide by zero occurs. This makes overflows an excellent way to lock a contract. Watch out for math.

To mitigate these risks, make use of CosmWasm's Uint256 and Uint512 types. Use methods like full_mul to do math that will not overflow, and then write "proofs" as to why you can reduce the value to a Uint128 at the end of the operation. For example, from our contracts:

        // It is impossible for this to overflow as total rewards can never exceed max value of
        // Uint128 as total tokens in existence cannot exceed Uint128
        let numerator = reward_config
            .reward_rate
            .full_mul(Uint128::from(
                last_time_reward_applicable - last_update_block,
            ))
            .checked_mul(scale_factor())?;
        let denominator = Uint256::from(total_staked);
        numerator.checked_div(denominator)?

Gas

If your contract gets itself into a state where it always runs out of gas, it is locked. There is no catching this as a submessage failure and reacting. Running out of gas is bad. That said, CosmWasm has some truly stupidly high gas limits. Here is a chart showing the highest computable fibonacci number on a couple blockchains compared to Cloudflare Workers:

image

Here is a transaction hash on the uni-5 Juno testnet that works fine despite recursively dispatching 720 submessages:

611DE7582D170D299B80B2492AB7711D60BA9EE9A6EFF259E4C7E545D795BBA9

The methodology for these can be found here and here.

Point is, CosmWasm has really high gas limits. That said, you still need to be careful.

  1. Avoid operations that are <= O(N) and can not be paginated. Formally speaking, say gas usage scales linearly with computational complexity, that is to say gas_used(complexity) = G * complexity. We then define L := gas_limit / G and say that a transaction succeeded if complexity(TX) <= L. For contracts, it is OK for there to exist N operations such that complexity(N) > L so long as the gas cost of a single arbitrary operation A satisfies complexity(A) <= L. Less formally, paginate your queries and watch out for O(N)+!
  2. You can store more in state in a single transaction than you can query in a single query. If you have state that you would like to be queried, you must cap it's size or otherwise risk complexity(A) > L. Here's an example of doing this from our contracts:
    let proposal_size = cosmwasm_std::to_vec(&proposal)?.len() as u64;
    if proposal_size > MAX_PROPOSAL_SIZE {
        return Err(ContractError::ProposalTooLarge {
            size: proposal_size,
            max: MAX_PROPOSAL_SIZE,
        });
    }