Do you wish:
- your codebase came with some helpful github-flavored markdown that provided a canonical, easily-linkable place for textual descriptions of database tables and columns to be stored?
- and required team members to update them as new ones were added?
- and made sure the schema followed certain conventions?
- or that your code had a dynamic definition of your database schema?
Then this is for you.
Run this during your test suite (or just call `extractDbSchema` in your code if you want the `db` object for your own use):
const {extractDbSchema, run} = require('db-linter')

extractDbSchema({
  //sql flavor
  lang: 'postgres',//or 'mysql' (if using mariadb, say 'mysql')
  //db creds
  host: '127.0.0.1',
  //port: 5432,//optional; if empty, assumes 3306 if mysql, 5432 if postgres
  user: 'postgres',//note this user will need access to information_schema
  password: '',
  database: 'test',
})
.then(db => {
  //or write your own convention checks once you have db
  return run(db, {
    path: './readme.md',//which markdown file to place/update the table in
    rules: 'all',//or array of rule name strings from readme
    //rule options
    boolPrefixes: ['is', 'allow'],
    isObviousColumn: (columnName, tableName, db) => {
      //custom reasons a column does not need describing in your setup
      //maybe columns that are everywhere, like created_at?
      return false
    }
  })
})
.then(passedConventionCheck => //if this is false, not all conventions were followed
  process.exit(passedConventionCheck ? 0 : 1)//or however you want to handle success / failure
)
Failed rules will be logged out for the dev to fix.
Below is the full list of built-in rules, but feel free to create your own and assess the json schema directly:
- `require_table_description_in_readme` - all tables need explanations for why they exist. Sometimes even describing table `x_y` as `1 x can have many y's` will be appreciated going forward.
- `require_column_description_in_readme` - all non-obvious columns need explanations for why they exist. ("Non-obvious" is customizable with `isObviousColumn()` in setup)
- `require_lower_snake_case_table_name` - some instances, collations, & OSes are case insensitive, making this the only reliable naming style for tables and columns
- `require_lower_snake_case_column_name` - see above.
- `disallow_bare_id` - columns named `id` have repeatedly been found to create footgun-level ambiguity downstream, and make sql more verbose & confusing by eliminating the utility of the `using` keyword
- `require_primary_key` - each row should always be individually fetchable from each table, otherwise the data structure & author needs may be at odds
- `require_unique_primary_keys` - identical primary keys would suggest they should be the same table
- `require_singular_table_name` - the table name should describe each row, not the table as a whole. A table holds multiple records, otherwise it would be called a pedestal; clarity is never added when a table is pluralized, it only makes remembering which part to pluralize harder when join tables inevitably have singular qualifiers. Also, consider using names for tables that are not SQL keywords or quoting them over pluralizing.
- `require_all_foreign_keys` - every column titled `x_id` (when `x` is another table) should have a foreign key to table `x`. In composite primary key scenarios, this may require denormalizing properties to retain the link.
- `require_same_name_columns_share_type` - reduces confusion when talking & promotes more unique names
- `require_bool_prefix_on_only_bools` - `is_`, `allow_` should always refer to boolean columns. (Prefix list customizable with `boolPrefixes` in setup)
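As a rough illustration of writing your own check against the extracted schema, here is a minimal sketch. The rule itself (`requireCreatedAtColumn`) and the sample data are hypothetical and not part of `db-linter`; the sketch only assumes that `db.tables` is keyed by table name and that each table has a `columns` object keyed by column name.

```javascript
// Hypothetical custom rule (not built into db-linter):
// every table must have a `created_at` column.
function requireCreatedAtColumn(db) {
  const failures = []
  for (const [tableName, table] of Object.entries(db.tables)) {
    if (!('created_at' in table.columns)) {
      failures.push(`${tableName} is missing a created_at column`)
    }
  }
  return failures // an empty array means the rule passed
}

// Hand-written sample standing in for a real extracted schema.
const sampleDb = {
  name: 'test',
  tables: {
    rick: {
      columns: { rick_id: { type: 'integer' }, created_at: { type: 'timestamp' } },
    },
    portal_gun: {
      columns: { portal_gun_id: { type: 'integer' } },
    },
  },
}

const failures = requireCreatedAtColumn(sampleDb)
failures.forEach(message => console.error(message))
```

A check like this could run in the `.then(db => ...)` step, alongside or instead of `run()`, with its result folded into whatever exit code your test suite expects.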
The `db` object `extractDbSchema()` creates has this structure:
const db = {
  name:db_name,
  tables:{
    [table_name]:{
      columns:{
        [column_name]:{
          type,
          ordinal_position,
          default,
          is_nullable
        },...
      },
      primary_key:[column_name,...],
      foreign_keys:[
        {
          constraint_name,
          table_name, //will be parent table_name
          column_names,
          foreign_table_name,
          foreign_column_names,
        },...
      ],
      target_of_foreign_keys:[
        {
          constraint_name,
          table_name,
          column_names,
          foreign_table_name,//will be parent table_name
          foreign_column_names,
        },...
      ]
    },...
  }
}
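To show how this structure might be walked in your own code, here is a small sketch that lists which tables a given table points to and which tables point at it. The helper `describeRelations`, the constraint name, and the sample data are hypothetical; the sketch also assumes `column_names` and `foreign_column_names` are arrays, which the structure above does not state explicitly.

```javascript
// Sketch: summarize one table's relations from the db object.
function describeRelations(db, tableName) {
  const table = db.tables[tableName]
  return {
    // tables this one references through its own foreign keys
    pointsTo: table.foreign_keys.map(fk => fk.foreign_table_name),
    // tables whose foreign keys reference this one
    pointedAtBy: table.target_of_foreign_keys.map(fk => fk.table_name),
  }
}

// Hand-written sample: portal_gun.rick_id references rick.rick_id.
const fkToRick = {
  constraint_name: 'portal_gun_rick_id_fkey', // hypothetical name
  table_name: 'portal_gun',
  column_names: ['rick_id'],
  foreign_table_name: 'rick',
  foreign_column_names: ['rick_id'],
}

const exampleDb = {
  name: 'test',
  tables: {
    rick: {
      columns: {},
      primary_key: ['rick_id'],
      foreign_keys: [],
      target_of_foreign_keys: [fkToRick],
    },
    portal_gun: {
      columns: {},
      primary_key: ['portal_gun_id'],
      foreign_keys: [fkToRick],
      target_of_foreign_keys: [],
    },
  },
}

console.log(describeRelations(exampleDb, 'portal_gun'))
console.log(describeRelations(exampleDb, 'rick'))
```

Note the same foreign key object appears twice: under `foreign_keys` on the table that owns the column, and under `target_of_foreign_keys` on the table it references.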
This is done in a few steps:
- `extractDbSchema(opts)` queries `information_schema` to provide a json schema representation of your mysql or postgres db (which you can also use in your code)
- `run(db,opts)`:
  - `extractDescriptionsFromMarkdown(path)` extracts descriptions from the markdown file.
  - `makeMarkdownHtml(db,descriptions)` reconstructs and updates a github-flavored markdown readme of your db from the extracted json & descriptions, preserving descriptions across rebuilds, with each table and column deep-linkable
  - `checkConventions(db,descriptions,opts)` checks whether the current state of the db follows the desired rules
- Documentation - Being able to see an overview is desirable. Being able to point at something in conversation is helpful. Things not committed become folklore.
- Total Freedom Is Not Always Desirable - Dev teams, especially those that suffer from high turnover, allow too much freedom in databases, which leads to local contradictions, which leads to ever-increasing mental overhead. Adding some reasonable rules can minimize the mental overhead necessary, and increase reliability.
  Given the levels of restrictions and rigor placed on executed code, there are curiously few placed on everything else. Such freedom in a space can send the signal that equivalent rigor is not worthwhile here, when of course it still is.
- stored procedures, views, and enums are currently not considered, because they are not recommended.
- the natural language processing module `compromise`, used to detect plurals, is not perfect. Sometimes you might have to give it hints atop your setup file to interpret certain words as nouns which could act as verbs, like `template` in this case:
let nlp = require('compromise')//will be available if `db-linter` is installed
nlp('', {
  //word:'Noun'
  template: 'Noun',
  //...
})
Looks best & links all work if viewed as github-flavored markdown.
Automatically rebuilt with updates, retaining descriptions devs provide. Note all links are deep-linkable for referencing in conversation.
A 3 col-max TOC is on top, for dbs with many tables.
Note you can place anything outside the `<!--DB-LINTER-->` markers surrounding the added markup.
Inside the markers, only the descriptions are kept, as everything else between them is regenerated.
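The marker behavior can be pictured with a simplified sketch. This is not `db-linter`'s implementation: `replaceBetweenMarkers` is a hypothetical helper that only swaps the text between the first and last marker and leaves everything outside untouched. It does not attempt the description-preserving step the library performs.

```javascript
const MARKER = '<!--DB-LINTER-->'

// Replace whatever sits between the first and last marker with `generated`,
// leaving text outside the markers as it was.
function replaceBetweenMarkers(markdown, generated) {
  const start = markdown.indexOf(MARKER)
  const end = markdown.lastIndexOf(MARKER)
  if (start === -1 || start === end) {
    // no marker pair yet: append a fresh marked section
    return `${markdown}\n${MARKER}\n${generated}\n${MARKER}\n`
  }
  const before = markdown.slice(0, start + MARKER.length)
  const after = markdown.slice(end)
  return `${before}\n${generated}\n${after}`
}

const readme = `# My app\n${MARKER}\nold table docs\n${MARKER}\nLicense: MIT\n`
const updated = replaceBetweenMarkers(readme, 'new table docs')
console.log(updated)
```

Here the heading and the license line survive the rebuild because they sit outside the markers, while the old generated section is dropped.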
history dimension | rick organism_dimension | portal_gun organism |
Table | Relations |
---|---|
dimension - a parallel plane of existence accessible with a portal gun. | #dimension ↖ history ↖ organism_dimension ↖ rick |
history - an archived instance of travel via a specific portal gun, by a Rick, to a dimension, at a specific point in time | ↗dimension ↗ portal_gun ↗ rick # history |
organism - a living creature Rick has encountered | #organism ↖ organism_dimension |
organism_dimension - 1 organism can be found in many dimensions | ↗dimension ↗ organism # organism_dimension |
portal_gun - A portal gun made by a Rick, capable of opening portals to other dimensions. | ↗rick # portal_gun ↖ history |
rick - which Rick this is | ↗dimension # rick ↖ history ↖ portal_gun |
This space is added so scrolling works as expected when deep-linking.