-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
filtfilt! converts all data to NaN #82
Comments
As a follow up, I was able to filter without issuing using DSP.jl directly, e.g.:
So it appears to be an issue with SeisIO and not that library. |
Any update on this? (Not to be pushy, but it's kind of a serious bug that makes SeisIO more or less unusable). In experiments, I've noticed that it consistently happens for higher sample rate data, though I'm not sure why. |
I've looked into your report but I see no bug. This can be fixed by converting your data to Float64 precision.
Detail
At Float32 precision, your filter choice creates coefficient vectors Incidentally, in Jones et al. (2020), we implicitly demonstrate that our filtering works correctly in Fig. 5b and the corresponding text. So I really don't understand your decision to denounce a published work as "unusable". ResolutionI'm treating this as an implicit feature request for an automated way to prevent NaNs when calling |
Hi, thanks for your response. I just didn't know the data had to be Float64 before using While it would be great for this to work on Float32, my "feature request" would really just be to provide some guidance and perhaps a worked example of how to do this properly in the Documentation and the Tutorial. I looked through both rather carefully and had no clue.... (I also noticed a similar problem in the Jupyter Notebooks when I ran them). One might also think about throwing an error if In any case, I didn't mean at all to be disparaging about the software package. I am very excited to use it in my everyday workflow; I just need to be able to do simple things like filtering before making the conversion and having the confidence to make the switch. I asked around before posting this, and I am not the only user that has run into this issue and was puzzled by it. Sorry to offend, Daniel |
As I explained in my last comment, the user's choice of filter parameters is responsible for outputting NaNs. It's not a mistake for SeisIO to use Float32 data. Using Float32 data, rather than Float64, has exactly one effect: it narrows the range of viable filtering parameters. To demonstrate this, suppose I modify your above example as follows:
Here, X1 contains all 64-bit NaNs. So does S.x[1]. A better solution for your data is to downsample. You're band passing with an upper corner frequency of 10.0 Hz. If you check the PSD estimate of the filter frequency response function, you'll find almost no signal energy above 12 Hz. Suppose we resample to 40.0 Hz, then:
No NaNs. |
Got it, thanks. I agree that is the better solution in this case, though I initially came across the issue when removing the instrument response for high-sample rate data where downsampling beforehand would not have been an option. In any case, I do think it would be useful to clarify this nuance in the documentation - I am not the first person to make this mistake, nor will I be the last. It is not an issue in other codes commonly used to process seismic data like SAC, ObsPy, etc., so is not obvious to people making the switch. (And correct me if I am wrong, but I saw the same issue in the filtering example notebooks?). But it's your code, so it's really your call. Just giving my two cents. |
Hello, Thank you for the great discussion here. It is very helpful for me to implement filtering with SeisIO. To further check the issue, I have conducted the bandpass with forth order Butterworth, and found the results with NaNs when datatype is Float32 as discussed in this issue.
Output:
Thank you for the commit (Issue #82) on the documentation and the filtfilt!() function, which helps me understand better on the data type when applying the filter. Best, |
Please provide a minimum working example. I am sincerely sorry that you're finding SeisIO frustrating, but merely claiming that you've encountered an error in another place is not enough information to act on.
You can verify what I wrote in my last reply by checking the closed Issues: no one has ever reported NaNs in the tutorial notebooks before. In fact, the only tutorial-specific problem at all was Issue #76, due to a minor typo in the documentation.
Thank you for your feedback. I added two commits yesterday, one of which resolves the example in your initial comment. I plan to do a release with those commits today (22 March). |
Perfect, thanks for the commits. I think that will help everyone in the long run. The example I posted above was my attempt at a "minimum working example" because the place where I initially spotted the error was in the middle of a long complex workflow with data that is not publicly available. In terms of the tutorial, you may be on it already with your commit, but the place I am referring to is in the notebook called Part4-Processing, Cell #9: If we print I know this hasn't been brought up in the forums, as I did a search before opening a new issue. But I also reached out to two colleagues who have experience with SeisIO. Both had run across the NaN issue before, neither knew why it was happening, and one of the two had stopped using the code because of it. All three of us have a non-trivial amount of experience in observational seismology, though clearly don't have as much expertise in the underlying algorithms as you do. As you point out, it is not a "bug" but a misunderstanding of the effect that using Float32 data types can have in certain circumstances. So as long as new users are aware, I don't think this is a major issue. (As an aside, the obspy forum you mentioned appears unrelated to filtering / instrument response, though perhaps it is related in some deeper way that escapes me). Thanks again for all your hard work on the software package. I really look forward to using it more. Daniel |
With apologies, I'm not seeing NaNs there at all. It's possible that IRIS had no data for some channels, or other data issues, when you executed those test commands. In creating that part of the tutorial, we tried to choose test stations that we knew had excellent uptimes, but no station is 100%. It's also possible that my commits from March 20th fixed it. I can try to add some code to the tutorial that checks for station and data issues. |
Hi, I looked into this again. I think you are right, it is a data availability issue. I re-downloaded and re-installed. Now when running the tutorial, I get the following: Then I sat five minutes and ran again: So the NaNs go in and out depending on the time of the day. Likely unrelated to the filtering issue, but useful to know about. Daniel |
Perfect. I think it should be relatively straightforward to check for that in the tutorial. |
Patch-level release (v1.2.1) completed. Includes the |
Great, download and installed, and all tests passed of course. |
I am new to SeisIO, so apologies if I am doing something stupid.
I want to implement a simple workflow where I (1) download data and (2) do some simple preprocessing, including a bandpass filter. When I try to filter the data, however, the entire time series appears to be corrupted. For example:
Gives the following response:
Any idea what I am doing wrong? I am running julia version 1.5.2 if that helps.
The text was updated successfully, but these errors were encountered: