Illegal instruction (core dumped) #394

Closed
gaozhenqiang opened this issue Jul 18, 2023 · 6 comments
Labels
bug Something isn't working question Further information is requested

Comments

@gaozhenqiang

Hello!

I could not import diskannpy after installing it in a Linux Python environment.

The installed package is diskannpy-0.5.0rc2-cp310-cp310-manylinux_2_28_x86_64.whl.

The import fails with this error: Illegal instruction (core dumped)

image

I don't know what the problem is and would appreciate any help.

Thanks in advance!

gaozhenqiang added the question label Jul 18, 2023
@harsha-simhadri
Contributor

harsha-simhadri commented Jul 20, 2023

What is your hardware architecture? Could you also please try 0.5.0rc4 with more portable compiler flags?

@daxpryce
Contributor

The compiler flags used to build the 0.5.0rc2 release, -march=native and -mtune=native, produce a wheel that is only useful if your CPU architecture matches that of the GitHub Actions runner that built it.

As @harsha-simhadri said, we just released 0.5.0rc4, which should work on a far broader range of modern processors, provided they support sse2, avx2, and fma. From what I can tell, that is basically any processor since around 2013 or 2014.

Hopefully installing this new version will work for you. Please do report back with success or failure, as I only have a finite number of computers to test on and they are all from roughly the same timeframe.
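
For anyone who wants to check their own machine against the feature list above, here is a minimal sketch, assuming a Linux x86 system where /proc/cpuinfo exposes a "flags" line. The helper names missing_features and read_cpu_flags are hypothetical and are not part of DiskANN or diskannpy.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the required features that do NOT appear in a
// whitespace-separated flag list. Whole tokens are compared, so "avx" does not
// satisfy a requirement for "avx2".
std::vector<std::string> missing_features(const std::string &flags,
                                          const std::vector<std::string> &required)
{
    std::unordered_set<std::string> present;
    std::istringstream in(flags);
    for (std::string token; in >> token;)
    {
        present.insert(token);
    }

    std::vector<std::string> missing;
    for (const auto &feature : required)
    {
        if (present.count(feature) == 0)
        {
            missing.push_back(feature);
        }
    }
    return missing;
}

// Hypothetical helper: reads the first "flags" line from /proc/cpuinfo
// (assumption: Linux on x86). Returns an empty string if no such line exists.
std::string read_cpu_flags(const std::string &path = "/proc/cpuinfo")
{
    std::ifstream file(path);
    for (std::string line; std::getline(file, line);)
    {
        if (line.rfind("flags", 0) == 0) // line starts with "flags"
        {
            const auto colon = line.find(':');
            if (colon != std::string::npos)
            {
                return line.substr(colon + 1);
            }
        }
    }
    return "";
}
```

Calling missing_features(read_cpu_flags(), {"sse2", "avx2", "fma"}) returns an empty vector when all three features are reported. A non-empty result suggests the rc4 wheel may also hit an illegal instruction on that CPU.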

daxpryce added the bug label Jul 21, 2023
@shreyasraoverse

@harsha-simhadri, @daxpryce
I tried it out with 0.5.0rc4. @deepakdhull80 and I were combing through the code in DiskANNsrc/index.cpp, and we think we found a bug there. The snippet below covers lines 309 to 331. It seems _data_compacted is never initialized. Just adding _data_compacted = false and rebuilding made it work for us.

template <typename T, typename TagT, typename LabelT>
void Index<T, TagT, LabelT>::save(const char *filename, bool compact_before_save)
{
    diskann::Timer timer;

    std::unique_lock<std::shared_timed_mutex> ul(_update_lock);
    std::unique_lock<std::shared_timed_mutex> cl(_consolidate_lock);
    std::unique_lock<std::shared_timed_mutex> tl(_tag_lock);
    std::unique_lock<std::shared_timed_mutex> dl(_delete_lock);

    if (compact_before_save)
    {
        compact_data();
        compact_frozen_point();
    }
    else
    {
        if (!_data_compacted)
        {
            throw ANNException("Index save for non-compacted index is not yet implemented", -1, __FUNCSIG__, __FILE__,
                               __LINE__);
        }
    }

@shreyasraoverse

The code change that worked for us...

template <typename T, typename TagT, typename LabelT>
void Index<T, TagT, LabelT>::save(const char *filename, bool compact_before_save)
{
    diskann::Timer timer;

    std::unique_lock<std::shared_timed_mutex> ul(_update_lock);
    std::unique_lock<std::shared_timed_mutex> cl(_consolidate_lock);
    std::unique_lock<std::shared_timed_mutex> tl(_tag_lock);
    std::unique_lock<std::shared_timed_mutex> dl(_delete_lock);
    _data_compacted = false;
    if (compact_before_save)
    {
        compact_data();
        compact_frozen_point();
    }
    else
    {
        if (!_data_compacted)
        {
            throw ANNException("Index save for non-compacted index is not yet implemented", -1, __FUNCSIG__, __FILE__,
                               __LINE__);
        }
    }
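
As a side note on the pattern discussed above: a common alternative to assigning the flag inside save() is a default member initializer, which gives the flag a defined value from construction without overwriting state on every save. The sketch below uses a hypothetical ToyIndex class to illustrate that idea only. It is not the DiskANN Index class and not the fix that the maintainers applied.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for an index class; only the flag handling is modeled.
class ToyIndex
{
  public:
    // Marks the data as compacted (the real compaction work is omitted).
    void compact_data()
    {
        _data_compacted = true;
    }

    // Mirrors the control flow of the snippet above: compact first if asked,
    // otherwise refuse to save an index that has not been compacted.
    void save(bool compact_before_save)
    {
        if (compact_before_save)
        {
            compact_data();
        }
        else if (!_data_compacted)
        {
            throw std::runtime_error("Index save for non-compacted index is not yet implemented");
        }
        _saved = true;
    }

    bool saved() const
    {
        return _saved;
    }

  private:
    // Default member initializers: both flags have a defined value from
    // construction, so save() never reads an indeterminate bool.
    bool _data_compacted = false;
    bool _saved = false;
};
```

With this layout, an index that was compacted earlier can still be saved with compact_before_save set to false, because the flag is not reset at the start of save().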

@gaozhenqiang
Author

Hello, the new version you released runs on my machine. Thank you!

@daxpryce
Contributor

@shreyasraoverse: this is a different bug from #394, which came up because we were building a processor-family-specific wheel that only worked if your CPU happened to match the build machine's, which we have no control over.

The issue you're describing is the one from #400, #402, and #404, which I'll discuss in #404 when I reopen it, since it still isn't fixed. Thanks so much for your feedback. I'll try to quote-post or move these responses to that thread and close out this ticket that @1933669775 raised.


4 participants