goodcleanfun

Fun little NLP building blocks for the public good, free from capitalist interests; clean as in small, focused, and low in climate impact

Popular repositories

  1. tokenizer Public

    Jinja

  2. vector_ops Public

    Generic vector functions for numeric types in C, using OpenMP if available

    C

  3. tokens Public

    Arrays of tokens stored as string offsets and lengths, plus a tokenized string that stores the tokens as a single contiguous array of NUL-terminated strings using cstring_array

    C

  4. token_types Public

    Global enum of token types and associated grouping functions.

    C

  5. utf8 Public

    Converts UTF-8 strings to Unicode code points using utf8proc, in C

    C

  6. khash Public

    Header-only clib package for khash.h
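
The two representations mentioned in the tokens description (tokens as offsets and lengths, and tokens packed as contiguous NUL-terminated strings) can be sketched roughly as follows. This is only an illustration under assumptions: `token_span`, `split_spans`, and `pack_tokens` are hypothetical names, not the API of tokens or cstring_array, and splitting on spaces stands in for a real tokenizer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration; not the tokens library's API. */
typedef struct {
    size_t offset; /* byte offset of the token in the source string */
    size_t length; /* token length in bytes */
} token_span;

/* Split on spaces, recording spans instead of copying any text.
   Returns the number of spans written (at most max_spans). */
static size_t split_spans(const char *s, token_span *out, size_t max_spans) {
    assert(s != NULL && out != NULL);
    size_t n = 0, i = 0, len = strlen(s);
    while (i < len && n < max_spans) {
        while (i < len && s[i] == ' ') i++;   /* skip separators */
        if (i == len) break;
        size_t start = i;
        while (i < len && s[i] != ' ') i++;   /* scan to end of token */
        out[n].offset = start;
        out[n].length = i - start;
        n++;
    }
    return n;
}

/* Copy each span into buf as consecutive NUL-terminated strings.
   Returns the number of bytes written, or 0 if buf is too small. */
static size_t pack_tokens(const char *s, const token_span *spans, size_t n,
                          char *buf, size_t cap) {
    assert(s != NULL && spans != NULL && buf != NULL);
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (used + spans[i].length + 1 > cap) return 0;
        memcpy(buf + used, s + spans[i].offset, spans[i].length);
        used += spans[i].length;
        buf[used++] = '\0';                   /* terminate this token */
    }
    return used;
}
```

For the input "fun  little nlp", `split_spans` would record three spans, and `pack_tokens` would lay them out in one buffer as `fun\0little\0nlp\0`. The span form avoids copying, while the packed form gives ordinary C strings that can be handed to existing string functions.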

Repositories

Showing 10 of 64 repositories
  • sartorial Public

    Pydantic model base classes and custom type handling, JSON schema generation, etc. covering a variety of common scenarios without much config

    Python 0 MIT 0 0 2 Updated Jan 24, 2025
  • atypical Public

    Custom types for things like phone numbers, emails, etc. with normalization, Pydantic handling and JSON Schema serialization

    Python 0 MIT 0 0 2 Updated Jan 21, 2025
  • concurrent_array Public

    High-performance concurrent/thread-safe, generic, dynamic (push-only) array using read-write locks (write lock only held for resizing) and C11 atomics to ensure unique indices for each push.

    C 0 MIT 0 0 0 Updated Jan 19, 2025
  • aligned Public

    Aligned memory management for vectorization

    C 0 MIT 0 0 0 Updated Jan 6, 2025
  • mapped_file Public

    Cross-platform memory-mapped files

    C 0 MIT 0 0 0 Updated Jan 5, 2025
  • cstring_array Public

    Stores an array of NUL-terminated C strings as one contiguous array with index pointers

    C 0 MIT 0 0 0 Updated Jan 5, 2025
  • threading Public

    A simple cross-platform threads.h implementation

    C 0 0 0 0 Updated Jan 2, 2025
  • bit_utils Public

    Cross-platform builtin bit operations (popcount, ctz, clz)

    C 0 MIT 0 0 0 Updated Dec 29, 2024
  • weight_balanced_tree Public

    Generic α-weight balanced binary search tree

    C 0 MIT 0 0 0 Updated Dec 29, 2024
  • splay_tree Public

    Splay tree: an adaptive binary search tree with query time proportional to the entropy of the access pattern

    C 0 MIT 0 0 0 Updated Dec 29, 2024
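
As an illustration of the three operations the bit_utils description names (popcount, ctz, clz), here is a minimal sketch of one common way to wrap them portably: use the GCC/Clang builtins where available and fall back to simple loops elsewhere. The function names `popcount32`, `ctz32`, and `clz32` are hypothetical and are not taken from bit_utils; the sketch also assumes `unsigned int` is 32 bits wide, which holds on mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not bit_utils' actual API.
   Assumes unsigned int is 32 bits wide when the builtins are used. */

/* Number of set bits in x. */
static inline unsigned popcount32(uint32_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_popcount(x);
#else
    unsigned n = 0;
    while (x) { x &= x - 1; n++; }            /* clear the lowest set bit */
    return n;
#endif
}

/* Count trailing zero bits; x must be nonzero. */
static inline unsigned ctz32(uint32_t x) {
    assert(x != 0);
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_ctz(x);
#else
    unsigned n = 0;
    while ((x & 1u) == 0) { x >>= 1; n++; }   /* shift until bit 0 is set */
    return n;
#endif
}

/* Count leading zero bits; x must be nonzero. */
static inline unsigned clz32(uint32_t x) {
    assert(x != 0);
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_clz(x);
#else
    unsigned n = 0;
    while ((x & 0x80000000u) == 0) { x <<= 1; n++; }  /* shift until bit 31 is set */
    return n;
#endif
}
```

The nonzero precondition on `ctz32` and `clz32` mirrors the builtins, whose result is undefined for a zero argument. A production library would likely add compiler-specific intrinsics for other toolchains rather than relying on the loop fallbacks.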
