feat(fmt/cif): implement CIF 1.1 parser #438

Merged
merged 29 commits into from
Dec 23, 2024
29 commits
b5d97eb
feat(meta): import `overload_cast` from pybind11
jnooree Dec 19, 2024
749d104
feat(utils): add unsafe version of safe_slice_rstrip
jnooree Dec 19, 2024
5dfc69c
feat(fmt/cif): implement CIF 1.1 lexer
jnooree Dec 19, 2024
0e7e707
test(fmt/cif): add cif lexer tests
jnooree Dec 19, 2024
bdc9d44
docs: add BioPython license notice
jnooree Dec 19, 2024
3c9ad0d
test(fmt/cif): add test for underscored value
jnooree Dec 19, 2024
c14af59
feat(fmt/cif): fetch lines lazily
jnooree Dec 19, 2024
d9241e1
feat(fmt/cif): support line continuation in text field
jnooree Dec 19, 2024
9272930
test(fmt/cif): test for line continuation
jnooree Dec 19, 2024
204f40d
feat(fmt/cif): discriminate unquoted and quoted values
jnooree Dec 20, 2024
80052c8
feat(fmt/cif): implement CIF 1.1 parser
jnooree Dec 20, 2024
a4688bd
test(fmt/cif): add simple parser testcases
jnooree Dec 20, 2024
4760951
feat(fmt/cif): allow querying total column count
jnooree Dec 20, 2024
de650f2
test(fmt/cif): add full cif parser tests
jnooree Dec 20, 2024
585fa1a
chore(fmt/cif): exclude debug helpers from coverage
jnooree Dec 20, 2024
9ae282a
refactor(fmt/cif): simplify error message generation
jnooree Dec 20, 2024
2702b4f
refactor(fmt/cif): use ascii_isspace instead std::isspace
jnooree Dec 20, 2024
32ded22
test(fmt/cif): add test for save frames
jnooree Dec 22, 2024
e81419f
fix(fmt/cif): fix use-after-free bug
jnooree Dec 23, 2024
9332678
fix(fmt/cif): avoid reusing line buffer to avoid use-after-free bug
jnooree Dec 23, 2024
1ab1a27
test(fmt/cif): add fuzzing test
jnooree Dec 23, 2024
35d6e81
build: enable fuzzing build
jnooree Dec 23, 2024
aff39a7
chore: ignore more false positives
jnooree Dec 23, 2024
54606cd
build: add missing Boost dependency
jnooree Dec 23, 2024
aab5f1a
fix(fmt/cif): cleanup headers
jnooree Dec 23, 2024
efae1cc
fix(fmt/cif): clear after move
jnooree Dec 23, 2024
0e3d54e
build: fix clang tools for non-fuzzing build
jnooree Dec 23, 2024
86c970d
ci: exclude fuzz from clang tools lint
jnooree Dec 23, 2024
63ec8c9
test(fmt/cif): test error case
jnooree Dec 23, 2024
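The first commit in the list imports `overload_cast` from pybind11 into `include/nuri/meta.h`, but that header is not part of the diff shown here. As a rough illustration of what such a helper does, the sketch below re-creates the pybind11-style idea in a hypothetical `sketch` namespace: the caller spells out the argument types, and the helper picks the matching overload. The functions `describe` and the struct `Counter` are made up for the example; this is not NuriKit's actual code.

```cpp
#include <string>

namespace sketch {  // hypothetical namespace, not NuriKit's

// Selects one overload of a function by spelling out its argument types.
template <class... Args>
struct overload_cast_impl {
  // Free functions.
  template <class Ret>
  constexpr auto operator()(Ret (*pf)(Args...)) const noexcept
      -> decltype(pf) {
    return pf;
  }

  // Non-const member functions.
  template <class Ret, class Class>
  constexpr auto operator()(Ret (Class::*pmf)(Args...)) const noexcept
      -> decltype(pmf) {
    return pmf;
  }
};

// Variable template (C++14): overload_cast<Args...>(f) yields a typed pointer.
template <class... Args>
constexpr overload_cast_impl<Args...> overload_cast {};

}  // namespace sketch

// Two free-function overloads sharing one name (illustrative only).
inline int describe(int) { return 1; }
inline int describe(const std::string &) { return 2; }

// A class with overloaded member functions (illustrative only).
struct Counter {
  int value = 0;
  void add(int n) { value += n; }
  void add(int a, int b) { value += a + b; }
};
```

With this in place, `sketch::overload_cast<int, int>(&Counter::add)` names the two-argument member without a hand-written `static_cast` to the full pointer-to-member type.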
3 changes: 3 additions & 0 deletions .github/ubsanignore.txt
@@ -1,2 +1,5 @@
implicit-signed-integer-truncation:ascii.cc
implicit-signed-integer-truncation:raw_hash_set.h
implicit-integer-sign-change:string_view
implicit-integer-sign-change:predefined_ops.h
implicit-integer-sign-change:cif.cpp
2 changes: 1 addition & 1 deletion .github/workflows/_run-clang-tools.yaml
@@ -64,7 +64,7 @@ jobs:
 style: file
 tidy-checks: ""
 database: build
-ignore: ".github|build|third-party|test"
+ignore: ".github|build|third-party|test|fuzz"
 env:
 GITHUB_TOKEN: ${{ github.token }}

12 changes: 12 additions & 0 deletions CMakeLists.txt
@@ -19,6 +19,7 @@ option(NURI_ENABLE_AVX2 "Use -mavx2 flag for optimization" OFF)
option(NURI_ENABLE_ARCH_NATIVE "Use -march=native flag for optimization" OFF)

option(NURI_ENABLE_SANITIZERS "Enable sanitizers for debug build" OFF)
option(NURI_BUILD_FUZZING "Enable fuzzing build" OFF)
option(NURI_PREBUILT_ABSL "Download prebuilt abseil binary" ON)

option(NURI_TEST_COVERAGE "Enable coverage build" OFF)
@@ -89,6 +90,16 @@ elseif(NURI_ENABLE_IPO)
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION ON)
endif()

if(NURI_BUILD_FUZZING)
if(NOT CMAKE_CXX_COMPILER_ID MATCHES "Clang")
message(FATAL_ERROR "Fuzzing build is only supported with Clang")
endif()

message("Fuzzing build: enabling sanitizers")
set(NURI_ENABLE_SANITIZERS ON)
add_compile_options(-fsanitize=fuzzer)
endif()

set_sanitizer_envs()

if(NURI_ENABLE_SANITIZERS)
@@ -153,6 +164,7 @@ endif()

include(CTest)
add_subdirectory(test EXCLUDE_FROM_ALL)
add_subdirectory(fuzz EXCLUDE_FROM_ALL)

if(BUILD_TESTING AND NURI_POSTINSTALL_TEST)
message(NOTICE "Running tests after post-installation step")
47 changes: 47 additions & 0 deletions NOTICE.md
@@ -265,6 +265,11 @@ Spectra is under the MPLv2 license.
## pybind11

- Project URL: <https://github.com/pybind/pybind11>
- Files in this repository subject to the license:
- [include/nuri/meta.h](include/nuri/meta.h). The following functions are
imported from pybind11's implementation:
- `overload_cast`
- `overload_cast_impl`
- Full license text:

```txt
@@ -570,3 +575,45 @@ Our implementation is based on the original TM-align software (version 20220412)
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
```

## Misc

### BioPython

Some testcases were imported from BioPython.

- Project URL: <https://github.com/biopython/biopython>
- Full license text:

```txt
BSD 3-Clause License
--------------------

Copyright (c) 1999-2024, The Biopython Contributors
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
```
9 changes: 9 additions & 0 deletions cmake-variants.json
@@ -37,6 +37,15 @@
"settings": {
"NURI_ENABLE_SANITIZERS": true
}
},
"fuzzer": {
"buildType": "RelWithDebInfo",
"long": "Fuzzing build.",
"short": "Fuzz",
"settings": {
"NURI_ENABLE_SANITIZERS": true,
"NURI_BUILD_FUZZING": true
}
}
}
}
44 changes: 36 additions & 8 deletions cmake/NuriKitTest.cmake
@@ -18,26 +18,54 @@ if(NURI_ENABLE_SANITIZERS AND CMAKE_VERSION VERSION_GREATER_EQUAL 3.18)
   set(NURI_GTEST_EXTRA_ARGS DISCOVERY_MODE PRE_TEST)
 endif()

-function(nuri_add_test file)
+function(_nuri_generate_test_name root file)
   get_filename_component(test_dir ${file} DIRECTORY)
-  file(RELATIVE_PATH test_dir "${PROJECT_SOURCE_DIR}/test" "${test_dir}")
+  file(RELATIVE_PATH test_dir "${root}" "${test_dir}")
   string(REPLACE "/" "_" test_prefix ${test_dir})

   get_filename_component(test_name ${file} NAME_WE)

-  set(target "nuri_${test_prefix}_${test_name}")
-  add_executable("${target}" "${file}")
-  target_link_libraries("${target}" PRIVATE
+  set(NURI_TEST_TARGET "nuri_${test_prefix}_${test_name}" PARENT_SCOPE)
+endfunction()
+
+function(nuri_add_test file)
+  _nuri_generate_test_name("${PROJECT_SOURCE_DIR}/test" "${file}")
+
+  add_executable("${NURI_TEST_TARGET}" "${file}")
+  target_link_libraries("${NURI_TEST_TARGET}" PRIVATE
     GTest::gtest GTest::gmock GTest::gtest_main
     absl::absl_log absl::absl_check)

   if(TARGET nuri_lib)
-    target_link_libraries("${target}" PRIVATE nuri_lib)
+    target_link_libraries("${NURI_TEST_TARGET}" PRIVATE nuri_lib)
   endif()

-  gtest_discover_tests("${target}"
+  gtest_discover_tests("${NURI_TEST_TARGET}"
     WORKING_DIRECTORY "${PROJECT_SOURCE_DIR}"
     ${NURI_GTEST_EXTRA_ARGS})

-  add_dependencies(nuri_all_test "${target}")
+  add_dependencies(nuri_all_test "${NURI_TEST_TARGET}")
 endfunction()
+
+if(NOT TARGET nuri_all_fuzz)
+  add_custom_target(nuri_all_fuzz)
+  clear_coverage_data(nuri_all_fuzz)
+
+  if(NURI_BUILD_FUZZING AND BUILD_TESTING)
+    set_target_properties(nuri_all_fuzz PROPERTIES EXCLUDE_FROM_ALL OFF)
+  endif()
+endif()
+
+function(nuri_add_fuzz file)
+  _nuri_generate_test_name("${PROJECT_SOURCE_DIR}/fuzz" "${file}")
+
+  add_executable("${NURI_TEST_TARGET}" "${file}")
+  target_link_options("${NURI_TEST_TARGET}" PRIVATE -fsanitize=fuzzer)
+  target_link_libraries("${NURI_TEST_TARGET}" PRIVATE absl::log_initialize)
+
+  if(TARGET nuri_lib)
+    target_link_libraries("${NURI_TEST_TARGET}" PRIVATE nuri_lib)
+  endif()
+
+  add_dependencies(nuri_all_fuzz "${NURI_TEST_TARGET}")
+endfunction()
4 changes: 2 additions & 2 deletions cmake/NuriKitUtils.cmake
@@ -268,11 +268,11 @@ function(handle_boost_dependency target)
   target_system_include_directories(
     "${target}"
     Boost::spirit Boost::fusion Boost::mpl Boost::optional
-    Boost::iterator Boost::config
+    Boost::iterator Boost::config Boost::range
   )
   target_link_libraries(
     "${target}"
-    PUBLIC Boost::iterator Boost::config
+    PUBLIC Boost::iterator Boost::config Boost::range
     PRIVATE Boost::spirit Boost::fusion Boost::mpl Boost::optional
   )
 endfunction()
15 changes: 15 additions & 0 deletions fuzz/CMakeLists.txt
@@ -0,0 +1,15 @@
#
# Project NuriKit - Copyright 2024 SNU Compbio Lab.
# SPDX-License-Identifier: Apache-2.0
#

include(NuriKitTest)

add_compile_options(-Wno-error)

include_directories("${CMAKE_CURRENT_LIST_DIR}/include")
file(GLOB_RECURSE NURI_FUZZ_SRCS *.cpp)

foreach(nuri_test_src IN LISTS NURI_FUZZ_SRCS)
  nuri_add_fuzz("${nuri_test_src}")
endforeach()
34 changes: 34 additions & 0 deletions fuzz/fmt/cif_fuzz.cpp
@@ -0,0 +1,34 @@
//
// Project NuriKit - Copyright 2024 SNU Compbio Lab.
// SPDX-License-Identifier: Apache-2.0
//

#include <sstream>

#include <absl/base/call_once.h>
#include <absl/base/log_severity.h>
#include <absl/log/globals.h>
#include <absl/log/initialize.h>

#include "fuzz_utils.h"
#include "nuri/fmt/cif.h"

NURI_FUZZ_MAIN(data, size) {
  static absl::once_flag flag;
  absl::call_once(flag, []() {
    absl::InitializeLog();
    absl::SetStderrThreshold(absl::LogSeverity::kFatal);
  });

  std::istringstream iss(
      std::string { reinterpret_cast<const char *>(data), size });
  nuri::CifParser parser(iss);

  while (true) {
    auto block = parser.next();
    if (!block)
      break;
  }

  return 0;
}
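The harness above follows a common libFuzzer pattern: run one-time setup, wrap the raw bytes in a stream, then drain the parser until it reports no more blocks. The sketch below shows the same shape without NuriKit or Abseil, so it can be compiled and called directly in an ordinary build. `ToyBlockParser`, `g_setup_calls`, and `g_last_block_count` are hypothetical stand-ins invented for this example, and a function-local static replaces `absl::call_once`; none of this reflects how `nuri::CifParser` is actually implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for nuri::CifParser: yields one name per "data_" line.
class ToyBlockParser {
public:
  explicit ToyBlockParser(std::istream &is): is_(&is) { }

  // Stores the next block name in `name`; returns false at end of input.
  bool next(std::string &name) {
    std::string line;
    while (std::getline(*is_, line)) {
      if (line.compare(0, 5, "data_") == 0) {
        name = line.substr(5);
        return true;
      }
    }
    return false;
  }

private:
  std::istream *is_;
};

// Bookkeeping that exists only to make the sketch observable from a test.
static int g_setup_calls = 0;
static std::size_t g_last_block_count = 0;

extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t *data,
                                      std::size_t size) {
  // One-time setup via a function-local static (assumed substitute for
  // absl::call_once); the lambda runs only on the first call.
  static const bool initialized = [] {
    ++g_setup_calls;
    return true;
  }();
  (void)initialized;

  std::istringstream iss(
      std::string(reinterpret_cast<const char *>(data), size));
  ToyBlockParser parser(iss);

  // Drain the parser; a fuzz target only cares that this never crashes.
  std::size_t count = 0;
  std::string name;
  while (parser.next(name))
    ++count;
  g_last_block_count = count;

  return 0;
}
```

Because the entry point is a plain `extern "C"` function, a regular unit test or a small driver can feed it saved crash inputs without linking libFuzzer.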
14 changes: 14 additions & 0 deletions fuzz/include/fuzz_utils.h
@@ -0,0 +1,14 @@
//
// Project NuriKit - Copyright 2024 SNU Compbio Lab.
// SPDX-License-Identifier: Apache-2.0
//

#ifndef NURI_FUZZ_FUZZ_UTILS_H_
#define NURI_FUZZ_FUZZ_UTILS_H_

#define NURI_FUZZ_MAIN(data, size)                                             \
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)

NURI_FUZZ_MAIN(/* data */, /* size */);

#endif /* NURI_FUZZ_FUZZ_UTILS_H_ */
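As shown, the header names `uint8_t` and `size_t` without including `<cstdint>` or `<cstddef>` itself, so it appears to rely on the including file pulling those in first. A minimal sketch of a self-sufficient variant follows, using the hypothetical macro name `SKETCH_FUZZ_MAIN` and a made-up target that counts NUL bytes; it illustrates how the macro expands into both the forward declaration and the definition, and is not a proposed change to the PR.

```cpp
#include <cstddef>
#include <cstdint>

// Same shape as NURI_FUZZ_MAIN, under a hypothetical name.
#define SKETCH_FUZZ_MAIN(data, size)                                           \
  extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t *data,              \
                                        std::size_t size)

// Forward declaration: the comments vanish during preprocessing, leaving
// unnamed parameters, which is valid in a declaration.
SKETCH_FUZZ_MAIN(/* data */, /* size */);

// Hypothetical bookkeeping so the effect of a call can be observed.
static std::size_t g_last_nul_count = 0;

// Definition: here the macro arguments become the parameter names.
SKETCH_FUZZ_MAIN(data, size) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < size; ++i)
    count += data[i] == 0 ? 1 : 0;
  g_last_nul_count = count;
  return 0;
}
```

Keeping the signature in one macro means every fuzz target gets the exact prototype libFuzzer expects, and a typo in the parameter types becomes a single-point fix.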