Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update utils.cpp - improve fromHexString #12

Closed
wants to merge 2 commits into from

Conversation

DanHouseman
Copy link
Contributor

fromHexString function has been optimized by directly converting characters to bytes using bitwise operations and manual parsing within the loop, eliminating the use of std::stoul and reducing the creation of temporary objects to enhance performance.

  1. fromHexString and fromHexStringToUint64 assume that the input after removing "0x" is a valid hexadecimal string without any non-hexadecimal characters.
  2. The original function creates a substring for each byte conversion (hex_str.substr(i, 2)), which involves both allocation and copying of two characters into a new string object for each byte of the input hex string. This is computationally costly, especially for long strings.
  3. This avoids that by directly parsing the substring withstd::stoul,thusly saving time and memory by minimizing reallocations.

Also... error handling.

By reserving the required amount of space in the vector upfront (bytes.reserve(cleaned_hex.length() / 2)), the function avoids the overhead associated with the vector's automatic resizing during the push_back operations.

Normally, if you don’t reserve space, the vector may need to resize and copy its contents multiple times as it grows, depending on the implementation. Each resizing involves allocating new memory and copying the existing elements to the new space, which is computationally expensive. This significantly improves performance primarily by reducing dynamic memory allocations and eliminating redundant operations

fromHexString function has been optimized by directly converting characters to bytes using bitwise operations and manual parsing within the loop, eliminating the use of std::stoul and reducing the creation of temporary objects to enhance performance.
Namespaced utils.
@darcys22
Copy link
Collaborator

So this is pretty unlucky, but literally same day you opened this a coworker optimised the fromHexString function in a pretty similar way.

struct HexToU8Result {
    bool    success;
    uint8_t u8;
};
        
static HexToU8Result hexToU8(char ch) {
    HexToU8Result result = {};
    result.success       = true;
    
    if (ch >= 'a' && ch <= 'f')
        result.u8 = static_cast<uint8_t>(ch - 'a' + 10);
    else if (ch >= 'A' && ch <= 'F')
        result.u8 = static_cast<uint8_t>(ch - 'A' + 10);
    else if (ch >= '0' && ch <= '9')
        result.u8 = static_cast<uint8_t>(ch - '0');
    else
        result.success = false;
     
    return result;
}
    
std::vector<unsigned char> utils::fromHexString(std::string_view hexStr) {
    hexStr = trimPrefix(hexStr, "0x");
    assert(hexStr.size() % 2 == 0);

    std::vector<unsigned char> result;
    for (size_t i = 0; i < hexStr.length(); i += 2) {
        std::string_view byteString = hexStr.substr(i, 2);
        HexToU8Result    hi         = hexToU8(byteString[0]);
        HexToU8Result    lo         = hexToU8(byteString[1]);
        unsigned char    byte       = static_cast<unsigned char>(hi.u8 << 4 | lo.u8 << 0);
        result.push_back(byte);
    }
    return result;
}

We should still reserve the bytes upfront like you mentioned and ill update that in another PR right now, but you were right on the money with the other optimisations.

@DanHouseman
Copy link
Contributor Author

@darcys22 no problem whatsoever! I’m glad to see the improvement - this is all part of what makes collaboration great! Happy to see parts of it merged! Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants