Add rekordbox file format specs. #116
These have been proven to work in the context of my Beat Link Trigger project, enabling it to retrieve the database over NFS, parse it, and extract all the track metadata it needs even when it is impossible to connect to the database server running on the players because there are four of them in use.
Another good suggestion from @KOLANICH
This is as far as I am able to take these files at the moment. You are welcome to add them if you like, and it will not hurt my feelings if you decide they are not useful, because they are incredibly useful to me. 😄 Thanks in any case for your excellent tools!
database/pdb.ksy
@@ -0,0 +1,964 @@
meta:
  id: rekordbox_pdb
file name should match its id
If this is true, it should be stated in the style guide.
It is, as far as I remember.
I searched the whole style guide before posting my comment, and could not find anything along those lines. I will rename the files when I have a few minutes though.
All right, I have renamed the files and submitted a PR to the style guide to help future me avoid making the same mistake. 😄 Thanks. I still need to fix something in the `rekordbox_anlz.ksy` file to avoid Java exceptions when the file contains a `fourcc` that we have not previously encountered, as I am currently discussing in the gitter channel.
Any chance we can do `devicesql_pdb` here, as we know that it is not exactly specific to rekordbox?
There was another thread on this topic, @GreyCat. The filename started out not being specific to rekordbox, and @KOLANICH suggested the current rename. I was hesitant at first but ended up agreeing because, while there are certainly aspects of the format which are generally applicable to other DeviceSQL implementations, the specific column types which are used by this mapping (e.g. tracks, artists, playlists, etc.) are clearly used only by rekordbox.
Ok, I agree with `rekordbox_pdb`, but please rename the file as `database/rekordbox_pdb.ksy` then?
> while there are certainly aspects of the format which are generally applicable to other DeviceSQL implementations

Can they be moved into a separate ksy and used via `import`?
Trying to use an enum causes unavoidable parse errors in Java and Python when new/unknown FourCC values are encountered. See kaitai-io/kaitai_struct#300
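To illustrate the failure mode in general terms (this is not the code that Kaitai Struct generates), Python's standard `enum` raises `ValueError` for a value that has no member, so a parser that maps a tag straight onto an enum stops at the first unknown section. The `SectionTag` members and the `lookup_tag` helper below are hypothetical names chosen only for this sketch:

```python
from enum import Enum


class SectionTag(Enum):
    # Illustrative FourCC values only; not a complete or authoritative list.
    BEAT_GRID = int.from_bytes(b"PQTG", "big")
    PATH = int.from_bytes(b"PPTH", "big")


def lookup_tag(raw: int):
    """Return the matching enum member, or the raw integer if it is unknown."""
    try:
        return SectionTag(raw)
    except ValueError:
        # Unknown FourCC: keep the raw value so parsing can continue.
        return raw
```

Keeping the tag as a plain integer and resolving it only when needed is one way to stay robust when new tags appear.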
👍
BTW, you can rewrite history and
Only if the commit hasn’t been pushed anywhere, of course. I do that sometimes.
@iamtunzor_twitter found media where there were different values for the artist row, which was causing total database parse failure. Now we should be robust as long as there are no actual structural changes.
database/rekordbox_pdb.ksy
- type: u2
  doc: |
    Some kind of magic word? Usually 0x60, 0x00 but have seen
    0x64, 0x00.
Can it be format version?
Maybe, although the numbers don’t seem plausible to me. It could also be yet another redundant length value of some sort—since I don’t have access to the actual data where this problem occurred, I can’t explore that question. I don’t even know on what continent the nightclub is found. 😄
It turns out that in fact this is some sort of format version. A show producer in Croatia was running into parse problems and was able to get his DJ to share the database file with me. When the row has the 0x64, 0x00 header it means the name is more than 255 bytes long and is stored in yet another weird format! But I have been able to capture this variant in the latest commit.
Thanks to @iamtunzor_twitter in Croatia for getting his DJ to share a copy of the problematic database file with me!
database/rekordbox_pdb.ksy
ofs_long_name:
  type: u2
  if: subtype == 0x64
  pos: _parent.row_base + 0x0a
How about
ofs_real_name:
  value: _parent.row_base + (subtype == 0x64 ? ofs_long_name : ofs_name)
name:
  pos: ofs_real_name
  type: device_sql_string
Also, please follow the key order convention: `pos [io] type desc if`
Around the time you were writing this, I was at my lunchtime swim. The pool is often where I review recent code and think of improvements and optimizations, and I came up with the same idea. It will make dealing with the value much easier. Thanks! I will fix the order of the fields as well.
Rats, when I try that I am getting a KSC error in the generate phase: `mapping values are not allowed here in 'reader', line 410, column 66: ... (subtype == 0x64)? ofs_name_far : ofs_name_near)` (column 66 is right after the `:`). Any ideas?
Ah sorry, that’s a FAQ: the whole expression needs to be quoted as a string for the silly YAML format.
I ended up doing it a slightly different way, but it has the same net effect: the struct is easier to understand and much easier to use, because the name is always accessed using the same instance regardless of how it is actually stored.
brunchboy@a2898a0
Now has a much clearer structure in the .ksy *and* provides a single, unified API for the struct user to access the name however it was stored.
name:
  pos: '_parent.row_base + (subtype == 0x64? ofs_name_far : ofs_name_near)'
ofs_name_far:
  pos: _parent.row_base + 0x0a
pos: '_parent.row_base + (subtype == 0x64? (_parent.row_base + 0x0a) : ofs_name_near)'
Was it intended?
Yes, that is really the weird layout that they have set up.
Note that your substitution is not quite correct. In the case where it is a “far” string, the offset from `_parent.row_base` is the two-byte value found at `_parent.row_base + 0x0a`. So `row_base` gets added to the value found there; it is not added twice to the resulting offset.
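The near/far rule described above can be sketched outside Kaitai Struct as follows. This is only an illustration: the `name_position` helper and its parameters are hypothetical, the near offset is passed in rather than parsed, and the little-endian byte order of the two-byte value is an assumption.

```python
import struct


def name_position(page: bytes, row_base: int, subtype: int, ofs_name_near: int) -> int:
    """Return the absolute position of the name string within the page."""
    if subtype == 0x64:
        # "Far" string: the offset is the u2 stored at row_base + 0x0a
        # (little-endian is assumed here).
        (offset,) = struct.unpack_from("<H", page, row_base + 0x0A)
    else:
        # "Near" string: the offset was already read from the row itself.
        offset = ofs_name_near
    # row_base is added exactly once, to whichever offset applies.
    return row_base + offset
```

With a far offset of 0x20 stored at `row_base + 0x0a` and `row_base` equal to 0x10, the name would be read from 0x30, not from 0x10 + 0x10 + 0x0a.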
I just made some small documentation changes based on things I noticed while finishing the detailed explanation article. These mappings have been used successfully to run shows by a number of people over the past few months, so if you would like to include them in your collection, they are probably ready. If you are not interested in them I can close the pull request as well; just let me know.
Have you tried to replace these values and check the consequences?
I think that there is no reason to link pages into a list if they are organized into an array. So it may be that the array is just an implementation detail of memory allocation. I mean that you should check if moving the pages and their
I’m sorry, I am again not sure what you mean. I can’t move pages around; I can only examine them as they are given to me. The code that creates them is in third-party proprietary software designed to run on low-power processors. The pages of a given table are linked together because there might be pages of other tables interspersed between them. In all files that any of my users have seen, the index is equal to the array index of the page. There are many examples of far worse redundancy and inconsistency in the format.
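As a rough sketch of why the links matter even though the pages sit in an array, the following walks one table's chain while skipping pages that belong to other tables. The `table_pages` helper and the `next_page`, `first` and `last` names are hypothetical and do not reflect the actual field names in the spec:

```python
def table_pages(next_page: list, first: int, last: int) -> list:
    """Collect the page indices of one table by following its links.

    next_page[i] is assumed to hold the index of the page that follows
    page i within the same table.
    """
    order = []
    seen = set()
    index = first
    while True:
        if index in seen:
            # Guard against a corrupt file whose links form a loop.
            raise ValueError(f"page chain loops back to page {index}")
        seen.add(index)
        order.append(index)
        if index == last:
            return order
        index = next_page[index]
```

For example, with two tables whose pages alternate in the file, `table_pages([0, 3, 4, 5, 6, 0, 0], 1, 5)` visits pages 1, 3 and 5 and never touches pages 2, 4 or 6.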
Thank you for the clarification, now I understand your spec better.