-
Notifications
You must be signed in to change notification settings - Fork 144
CosmWasm security best practices
The commandments:
Important things to know:
Before we get into this, I'd like to provide a prelude about our threat model:
DAO DAO DAOs manage millions of dollars. Every time you commit a line of code to our smart contracts you should feel comfortable wagering that money on it working.
I do not know of an environment more hostile to software than blockchains. From a security perspective they are a hellscape of instant finality, immutable code, and pseudonymity. On top of that, our code is open-source, and all of our APIs are callable by anyone with a private key. Our only saving graces are determinism and atomicity, though even this works against us when a contract locks.
This is not to say that we can not write safe code. It is to say: we are in a uniquely challenging environment, we must approach our work with the utmost care and diligence.
CosmWasm's Addr
type is not validated during deserialization. This means that messages with the Addr
type have unexpected deserialization properties. For example, the following JSON payload will successfully deserialize into the corresponding enum:
{ "set_address": { "addr": "🏴☠️" } }
pub enum ExecuteMsg {
SetAddress { addr: Option<Addr> }
}
To avoid unvalidated addresses in state, the String
type should be used and then validated.
pub enum ExecuteMsg {
SetAddress { addr: Option<String> }
}
let addr = msg
.addr
.map(|human| deps.api.addr_validate(&human))
.transpose()?
Note the usage of map
and transpose
above. Where clarity is not impacted, we prefer standard library functions to custom logic on types. The Rust Standard Library is an extremely well used, security and performance conscious piece of software.
We are not clairvoyant. The systems we design should reflect this and be upgradable.
In order of most to least invasive, there are three ways a smart contract can be upgraded:
- A CosmWasm migration
- A config change
- A parent module swaps out a child module
When designing a system, choose the smallest hammer to meet your needs.
As an example of a module upgrade: say we're building a cw721-send rate limiter. Contract's that would like to use the rate limiter implement a proxy interface so they may receive the proxied NFTs, and can swap out and disable their proxy.
How do we allow users of this rate limiter to change the rate limit? We could add a contract-level admin, messages for nominating and confirming a new admin, and messages for updating the config.
To avoid this complexity, we may instead encourage consumers of our rate limited NFTs to create and use a new rate limiter. Now, to update the rate limit a consumer just removes the current rate limiter and installs a new one. This makes our rate limiting contract immutable without sacrificing functionality.
We are fallible. We should build systems that expect modules to fail and can detect and recover from those failures.
For example:
- Receivers of proposal module hooks are removed if they fail handling a message to stop the parent module from locking.
- Our proposal deposit system allows anyone to create proposals if an error occurs during deposit refund logic.
- A bridge may allow withdrawals of NFTs if it detects its counterparty on another chain has failed.
When writing tests, write tests that test functionality, then write more tests that test contract behavior under duress. Example.
Juno may not have flash loans yet, but it will or I will die trying. Once this happens, we must design our systems as if attackers have unlimited capital for a single transaction.
If you'd like to protect against flash loans, make sure that there is no way that, in a single transaction, tokens can be purchased on an AMM and interact with your system.
For example, in DAO DAO there is a one block delay between when tokens are sent to the staking contract and when they are counted towards the staked token balance. This protects against flash loans as there is no way tokens can be purchased and used to vote in one transaction.
In the parlance of networking, your IBC contracts have no firewalls and all their ports open. Anyone can attempt to connect to you at any time and you must either allow them to, or otherwise capitulate to an admin-managed firewall.
Should you choose the hero's journey and embrace permissionlessness, know that:
- Attackers can spin up chains with malicious-but-compliant state and connect to you. For example if you are a implementing a IBC token transfer protocol, an attacker could spin up a duplicate of another chain that you are connected to and send you messages to release assets belonging to the original chain. This is a completely valid action. In IBC, anyone can do whatever the hell they want. You must mint new debt vouchers for the exact duplicate tokens that can be redeemed on the exact duplicate chain.
- The other side of your connection may close the connection without your consent. You will discover this in the
CloseConfirm
message handler. You have one opportunity to react before everything sent over that channel is permanently locked.
With (1) in mind, here's an IBC packet with the fields that an attacker can set to arbitrary values labeled:
pub struct IbcPacket {
/// Attacker controlled.
pub data: Binary,
/// Attacker controlled.
pub src: IbcEndpoint,
/// Trusted.
pub dest: IbcEndpoint,
/// Trusted.
pub sequence: [u64,
/// Attacker controlled.
pub timeout: IbcTimeout,
}
Attacker controlled does not mean anarchy! It means that an attacker can spin up a chain such-that those fields are filled with arbitrary, but valid data.
CosmWasm contracts can create infinite loops over IBC by calling back into themselves in their ACK handlers. I have written a proof-of-concept of this here. If you are relaying packets between two CosmWasm contracts, know that it is possible that those contracts will enter an infinite loop and cost you a large amount of money in fees.
If you are not a maniac and disable it, your CosmWasm contract will panic if an integer overflow or divide by zero occurs. This makes overflows an excellent way to lock a contract. Watch out for math.
To mitigate these risks, make use of CosmWasm's Uint256
and Uint512
types. Use methods like full_mul
to do math that will not overflow, and then write "proofs" as to why you can reduce the value to a Uint128
at the end of the operation. For example, here is a method from our contracts that does math:
/// Computes the number of tokens to return to an address when
/// claiming.
///
/// # Arguments
///
/// * `staked_total` - The number of tokens that have been staked.
/// * `balance` - The number of tokens the contract has (staked_total + rewards).
/// * `ask` - The number of tokens being claimed.
///
/// # Invariants
///
/// These must be checked by the caller. If checked, this function is
/// guarenteed not to panic.
///
/// 1. staked_total != 0.
/// 2. ask + balance <= 2^128
/// 3. ask <= staked_total
///
/// For information on the panic conditions for math, see:
/// <https://rust-lang.github.io/rfcs/0560-integer-overflow.html>
pub(crate) fn amount_to_claim(staked_total: Uint128, balance: Uint128, ask: Uint128) -> Uint128 {
// we know that:
//
// 1. cw20's max supply is 2^128
// 2. balance := staked_total + rewards
//
// for non-malicious inputs:
//
// 3. 1 => ask + balance <= 2^128
// 4. ask <= staked_total
// 5. staked_total != 0
// 6. 4 => ask / staked_total <= 1
// 7. 3 => balance <= 2^128
// 8. 6 + 7 => ask / staked_total * balance <= 2^128
//
// which, as addition and division are communative, proves that
// ask * balance / staked_total will fit into a 128 bit integer.
ask.full_mul(balance)
.div(Uint256::from(staked_total))
.try_into()
.unwrap()
}
If you're writing math this is the level of attention to detail required.
If your contract gets itself into a state where it always runs out of gas, it is locked. There is no catching this as a submessage failure and reacting. Running out of gas is bad. That said, CosmWasm has some truly stupidly high gas limits. Here is a chart showing the highest computable fibonacci number on a couple blockchains compared to Cloudflare Workers:
Here is a transaction hash on the uni-5
Juno testnet that works fine despite recursively dispatching 720 submessages:
611DE7582D170D299B80B2492AB7711D60BA9EE9A6EFF259E4C7E545D795BBA9
The methodology for these can be found here and here.
Point is, CosmWasm has really high gas limits. That said, you still need to be careful.
- Avoid operations that are
> O(1)
and can not be paginated. Formally speaking, say gas usage scales linearly with computational complexity, that is to saygas_used(complexity) = G * complexity
. We then defineL := gas_limit / G
and say that a transaction succeeded ifcomplexity(TX) <= L
. For contracts, it is OK for there to existN
operations such thatcomplexity(N) > L
so long as the gas cost of a single arbitrary operationA
satisfiescomplexity(A) <= L
. Less formally, paginate your queries and watch out for> O(1)
! - You can store more in state in a single transaction than you can query in a single query. If you have state that you would like to be queried, you must cap it's size or otherwise risk
complexity(A) > L
. Here's an example of doing this from our contracts:
let proposal_size = cosmwasm_std::to_vec(&proposal)?.len() as u64;
if proposal_size > MAX_PROPOSAL_SIZE {
return Err(ContractError::ProposalTooLarge {
size: proposal_size,
max: MAX_PROPOSAL_SIZE,
});
}