-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support named parameters #10
Comments
@joeandaverde, I think it is generally a good idea, but your implementation for the variable syntax reminds me of an issue we had recently with the $1,$2 variables when it comes to long variable names. When we encounter $1, $2 variables, we make sure they are not followed by a digit, i.e. they are not $11, $22, or something like that, because those are then different variables. But in your case there cannot be a truncating symbol to be identified, because you use regular letters for your variables. As a result, how could you possibly tell the difference between variables Now, if you were to suggest using something like this: Any thoughts on that? Any existing standard for that, perhaps, as a reference? |
There's a lot to be desired in the regular expression that's breaking up the SQL statement. Currently it's simply using '.split(/(:\w+)/)' which would work in the cases you mentioned. > 'SELECT * FROM users where id = :id and idiom = :idiom'.split(/(:\w+)/)
[ 'SELECT * FROM users where id = ',
':id',
' and idiom = ',
':idiom',
'' ]
> Where it would fail is if you wanted to use the full range of valid Ecmascript variable names. This is because Ecmascript allows for unicode variable names (see: https://mathiasbynens.be/notes/javascript-identifiers-es6). Another possible issue is that object keys can be any arbitrary string. This would also fail to match I'm currently using |
I'm also up to find a more resilient way of representing variable insertion. |
I'm going to look around for any kind of standard way for identifying a variable in that case, and it would help if you did too ;) Also, I'd like to avoid enumerating an object-parameter for its properties, because it may have lots more than just the parameters needed, and I believe that using the query as the source for variable names is the way to go, i.e. we search through the query using regex to make the list of all variables to be replaced, and only then use the object to pull its properties, and not the other way round. However, this does imply use of strict variable syntax, to avoid misidentifying them. |
Another issue with my solution is that it doesn't account for postgres type casts. However, this can probably be fixed with a negative look behind for select 1::int |
Yes, this does create an ambiguity, with Postgres using |
In regards to your previous message about enumerating object properties: This particular implementation pulls a mapping of $1 to variable name. It iterates those instead of the object parameter to extract values into an array. |
It would be ideal not to collide with ES6 template strings to avoid needing to escape: |
Thank you for that link. The best syntax I can think of for now is If you find a better one - please let me know ;) |
Just so, about your concern of creating a conflict: http://updates.html5rocks.com/2015/01/ES6-Template-Strings According to that, ES6 specification requires use of `` symbols to declare a template, as opposed to '' simple quotes, and since we are using regular strings to inject variables, there wouldn't be a conflict with ES6 Template Strings, even if we use the exact same syntax for variables. In fact, I see something more of a positive thing that we would have it compatible with ES6 that way ;) |
I'm going to take some time to think about the best approach to doing this. But in the meantime, I was playing with the idea of how to pull names of all variables-properties in a query for the syntax of /////////////////////////////////////////////////////////////
// Returns array of unique variable names in a text string;
//
// - variables are defined using syntax: ${varName};
// - a valid variable starts with a letter or underscore symbol,
// followed by any combination of letters, digits or underscores.
// - a variable can have any number of leading and trailing spaces;
function enumVars(txt) {
var v, names = [];
var reg = /\$\{\s*[a-zA-Z_][a-zA-Z0-9_]*\s*}/g;
while (v = reg.exec(txt)) {
var svn = v[0].replace(/[\$\{\s}]/g, ''); // stripped variable name;
if(names.indexOf(svn) === -1) {
names.push(svn);
}
}
return names;
}
// example:
console.log(enumVars("${var_1} ${ _var2_ } ${ var3_} ${ var_1 }"));
// outputs: [ 'var_1', '_var2_', 'var3_' ] The next step would be just to pull values of the corresponding properties from the object-parameter and do the replacements. Easy! :) |
I have created a new branch named-params, to start working in this direction, with the above method checked in, as a starting point. |
I originally started with regex exec. When I got to replacement's I found it was easier using the split method on string (using a capture group in order to not lose variable name). This made it easier to map Postgres vars to property names. |
I'd like to see that easier way. I know that with RegEx it would be just one line of code, using the same RegEx I used to search for variables. I'm not sure the same simplicity can be accomplished with simple string functions, considering all the complex cases of white spaces that RegEx takes care of automatically. Please show me your simpler alternative. |
Can't edit from phone or link to line. But you can see an example here in the function parseSql: https://github.com/joeandaverde/tinypg/blob/master/index.js |
It is difficult to understand the way it works there exactly, except for the fact that it deals with a different syntax other than Would you like to create a version of it that produces the same result as one I wrote above? |
I have finished the initial implementation for supporting named parameters in the new branch - check it out ;) For the time being I'm using my own RegEx approach, and I included one complex test there, see in file: indexSpec.js: it("must correctly format named parameters or throw an error", function () {
// - correctly handles leading and trailing spaces;
// - supports underscores and digits in names;
// - can join variables values next to each other;
// - converts all simple types correctly;
// - replaces undefined variables with null;
// - variables are case-sensitive;
expect(pgp.as.format("${ NamE_},${d_o_b },${ _active__},${ file_5A }${__Balance}", {
NamE_: "John O'Connor",
d_o_b: dateSample,
_active__: true,
file_5a: 'something', // not to be found, due to case difference;
__Balance: -123.45
})).toBe("'John O''Connor','" + dateSample.toUTCString() + "',TRUE,null-123.45");
}); If you think that you can replace the search-and-replace approach with a better one - give it a go, you have all the code, and even a good test to begin with. Other than that, before the branch is ready to be merged into master, I will be doing a number updates covering tests, documentation, and possibly implementation, if something better comes along. |
P.S. Please star this package, if you like :) It helps with the search result on npm ;) |
As there were no further comments, I finished all tests and documentation updates and released version 0.8.0 with the new feature. Details are in the release notes. |
I think we were talking about something completely different. What you implemented was formatted strings. My original request was to implement a feature to allow for parameterized SQL queries using named parameters. In order to accomplish this safely (not using string interpolation as this opens a gaping hole for SQL injection) there needs to be a way to map $1, $2 etc to array elements. It's worth mentioning again that the implementation as is condones unsanitized SQL queries. |
To my knowledge, there is no such thing for that which you want. Postgres doesn't support named parameters. See this: brianc/node-postgres#268 And SQL injection isn't a real problem on its own. One has to create a gap in URL-to variable mapping, or otherwise do a poor design overall. Therefore I'm not overly concerned about it. I extended the syntax to support named variables from objects, which works now, and this is about as good as it's gonna get :) I don't see how i can make it any better. The current code for 0.8.1 will be released soon, with one change to parameter formatting - it will throw an error when a variable doesn't exist, as opposed to formatting it as And I don't see how possibly we could have been talking about something else when even your references to your own library suggested exactly what was done here. |
I'm not suggesting named variables IN the final SQL query. db.query('SELECT * FROM users WHERE name = :name and age > :min_age', { name: 'joe', min_age: 30 }) would get transformed by your library to be db.query('SELECT * FROM users WHERE name = $1 and age > $2', ['joe', 30]) This type of query can be sanitized by the postgres database engine whereas one which allows for string substitution would not. This is bad: db.query("SELECT * FROM users WHERE name = 'joe' and age > 30") |
What are you talking about? Postgres doesn't take any variable parameters. Nothing can be sanitized by Postgres database engine. Only the final query can be passed from the client, nothing else. |
The parameters passed to Postgres can be injected into the query in a safe manor for example: as.format('SELECT * FROM users WHERE name = ${name} and age > ${min_age}', {name: '\'1\'; DROP TABLE users;--', min_age: 30 }) would become 'SELECT * FROM users WHERE name = \'1\'; DROP TABLE users;-- and age > 30' If instead those parameters are passed to the Postgres engine ($1, $2 => ['Joe', 30]), it can do the appropriate escaping of malicious characters. |
Here's an example of the Postgres log of similar query.
|
How do you pass |
It does. That's what you're seeing above. Directly from the Postgres logs. |
Please, point at the Postgres online documentation about passing parameters with a query, because I've never heard of such thing's existence. P.S. I'm done for the day, back to this tomorrow. |
http://www.postgresql.org/docs/9.2/static/plpgsql-statements.html scroll down half way (39.5.4. Executing Dynamic Commands) |
Those are local formatting commands, with specific use of |
It seems that PG now has support for Prepared Statements, which may be what you are looking for. By the way, if you like PG implementation of parameter formatting better, you can always use that instead. This library has always allowed that, see Initialization Options, parameter |
@joeandaverde Unrelated to this, but if you liked this library, I really appreciate if you can give me a feedback on this small addition: pg-monitor |
It'd be nice to specify named parameters in the query that references properties on an object passed in place of the array of values that would normally map to $1, $2, etc.
I created a small wrapper for the plain postgres module to do this for my internal project. I really like how your library looks and think this would be an excellent addition to essentially make my project obsolete.
https://github.com/joeandaverde/tinypg
The text was updated successfully, but these errors were encountered: