Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rekordbox file format specs. #116

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

brunchboy
Copy link

These have been proven to work in the context of my Beat Link Trigger
project, enabling it to retrieve the database over NFS, parse it, and
extract all the track metadata it needs even when it is impossible to
connect to the database server running on the players because there
are four of them in use.

These have been proven to work in the context of my Beat Link Trigger
project, enabling it to retrieve the database over NFS, parse it, and
extract all the track metadata it needs even when it is impossible to
connect to the database server running on the players because there
are four of them in use.
database/pdb.ksy Outdated Show resolved Hide resolved
database/pdb.ksy Outdated Show resolved Hide resolved
database/pdb.ksy Outdated Show resolved Hide resolved
database/pdb.ksy Outdated Show resolved Hide resolved
database/pdb.ksy Outdated Show resolved Hide resolved
@brunchboy
Copy link
Author

This is as far as I am able to take these files at the moment. You are welcome to add them if you like, and it will not hurt my feelings if you decide they are not useful, because they are incredibly useful to me. 😄 Thanks in any case for your excellent tools!

database/pdb.ksy Outdated
@@ -0,0 +1,964 @@
meta:
id: rekordbox_pdb
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file name should match its id

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is true, it should be stated in the style guide.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, as far as I remember.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I searched the whole style guide before posting my comment, and could not find anything along those lines. I will rename the files when I have a few minutes though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, I have renamed the files and submitted a PR to the style guide to help future me avoid making the same mistake. 😄 Thanks. I still need to fix something in the rekordbox_anlz.ksy file to avoid Java exceptions when the file contains a fourcc that we have not previously encountered, as I am currently discussing in the gitter channel.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance we can do devicesql_pdb here, as we know that it is not exactly specific to rekordbox?

Copy link
Author

@brunchboy brunchboy Nov 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was another thread on this topic, @GreyCat. The filename started out not being specific to rekordbox, and @KOLANICH suggested the current rename. I was hesitant at first but ended up agreeing because, while there are certainly aspects of the format which are generally applicable to other DeviceSQL implementations, the specific column types which are used by this mapping (e.g. tracks, artists, playlists, etc.) are clearly used only by rekordbox.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I agree with rekordbox_pdb, but please rename the file as database/rekordbox_pdb.ksy then?

Copy link
Contributor

@KOLANICH KOLANICH Nov 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while there are certainly aspects of the format which are generally applicable to other DeviceSQL implementations

Can they be moved into a separate ksy and used via import?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops! @GreyCat I had tried to rename the file to rekordbox.pdb and accidentally renamed dbf.pdb instead. That is fixed now. Phew!

@KOLANICH nobody knows; I don’t know if anyone has seen any other files created by DeviceSQL in recent years.

James Elliott added 2 commits November 21, 2018 10:58
Trying to use an enum causes unavoidable parse errors in Java and
Python when new/unknown FourCC values are encountered. See
kaitai-io/kaitai_struct#300
database/pdb.ksy Outdated Show resolved Hide resolved
Copy link
Contributor

@KOLANICH KOLANICH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@KOLANICH
Copy link
Contributor

BTW, you can rewrite history and --amend changes to a single commit.

@brunchboy
Copy link
Author

Only if the commit hasn’t been pushed anywhere, of course. I do that sometimes. magit makes that sort of thing very easy in Emacs. 😄

@iamtunzor_twitter found media where there were different values for
the artist row, which was causing total database parse failure. Now we
should be robust as long as there are no actual structural changes.
- type: u2
doc: |
Some kind of magic word? Usually 0x60, 0x00 but have seen
0x64, 0x00.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be format version?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, although the numbers don’t seem plausible to me. It could also be yet another redundant length value of some sort—since I don’t have access to the actual data where this problem occurred, I can’t explore that question. I don’t even know on what continent the nightclub is found. 😄

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that in fact this is some sort of format version. A show producer in Croatia was running into parse problems and was able to get his DJ to share the database file with me. When the row has the 0x64, 0x00 header it means the name is more than 255 bytes long and is stored in yet another weird format! But I have been able to capture this variant in the latest commit.

Thanks to @iamtunzor_twitter in Croatia for getting his DJ to share a
copy of the problematic database file with me!
ofs_long_name:
type: u2
if: subtype == 0x64
pos: _parent.row_base + 0x0a
Copy link
Contributor

@KOLANICH KOLANICH Dec 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

    ofs_real_name:
        value: _parent.row_base + (subtype == 0x64 ? ofs_long_name : ofs_name)
    name:
        pos: ofs_real_name
        type: device_sql_string

Copy link
Contributor

@KOLANICH KOLANICH Dec 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please follow the convention pos [io] type desc if

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Around the time you were writing this, I was at my lunchtime swim, and in the pool is often when I review recent code and make improvements and optimizations, and I came up with the same idea, it will make dealing with the value much easier. Thanks! I will fix the order of the fields as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rats, when I try that I am getting a KSC error in the generate phase: mapping values are not allowed here in 'reader', line 410, column 66: ... (subtype == 0x64)? ofs_name_far : ofs_name_near) (column 66 is right after the :. Any ideas?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, that’s a FAQ, need to escape it as a string for the silly YAML format.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up doing it a slightly different way, but it has the same net effect, the struct is easier to understand and much easier to use; the name is always accessed using the same instance regardless of how it is actually stored.
brunchboy@a2898a0

Now has a much clearer structure in the .ksy *and* provides a single,
unified API for the struct user to access the name however it was
stored.
name:
pos: '_parent.row_base + (subtype == 0x64? ofs_name_far : ofs_name_near)'
Copy link
Contributor

@KOLANICH KOLANICH Dec 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ofs_name_far:
        pos: _parent.row_base + 0x0a
pos: '_parent.row_base + (subtype == 0x64? (_parent.row_base + 0x0a) : ofs_name_near)'

was it intended?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is really the weird layout that they have set up.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that your substitution is not quite correct. In the case where it is a “far” string the offset from _parent.row_base is the two-byte value found at _parent.row_base + 0x0a. So row-base gets added to the value found there, it is not added twice to the resulting offset.

@brunchboy
Copy link
Author

I just made some small documentation changes based on things I noticed while finishing the detailed explanation article. These mappings have been used successfully to run shows by a number of people over the past few months, so if you would like to include them in your collection, they are probably ready. If you are not interested in them I can close the pull request as well; just let me know.

@KOLANICH
Copy link
Contributor

KOLANICH commented Mar 8, 2019

This value seems to be redundant, because it can be calculated by dividing the offset
of the start of the page by $len_page$, but perhaps it serves as a
sanity check.

Have you tried to replace these values and check the consequences?

_root.len_page * index

I think that there is no reason to link pages into a list if they are organized into an array. So it may be that array is just an impl detail of memory allocation. I mean that you should check if moving the pages and their page_refs don't spoil the format. If it is the case, the list may be just a strange key-value storage with index as a key and looking up a page by its index is crawling the list and checking the index of every page.

@brunchboy
Copy link
Author

I’m sorry, I am again not sure what you mean. I can’t move pages around, I can only examine them as they are given to me. The code that creates them is in third-party proprietary software designed to run in low-power processors. The pages of a given table are linked together because there might be pages of other tables interspersed between them.

In all files that any of my users have seen the index is equal to the array index of the page. There are many examples of far worse redundancy and inconsistency in the format.

@KOLANICH
Copy link
Contributor

KOLANICH commented Mar 8, 2019

Thank you for the clarification, now I understand your spec better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants