buffer: add buf.mask for ws #1202

Closed
wants to merge 8 commits into from
Changes from 5 commits
20 changes: 20 additions & 0 deletions doc/api/buffer.markdown
@@ -810,6 +810,26 @@ buffer.
var b = new Buffer(50);
b.fill("h");

### buf.mask(maskValue, targetBuffer[, targetStart][, sourceStart][, sourceEnd])

* `maskValue` Number
* `targetBuffer` Buffer
* `targetStart` Number, optional
* `sourceStart` Number, optional
* `sourceEnd` Number, optional

Takes the contents of `buf`, starting at `sourceStart` and ending at
`sourceEnd`, and XORs them with `maskValue`. The result is written to
`targetBuffer`, starting at `targetStart`. `sourceStart` and `targetStart`
default to `0` if not given; `sourceEnd` defaults to `buf.length` if
not given. The start and end parameters function the same as the
corresponding parameters to
[buf.copy](#buffer_buf_copy_targetbuffer_targetstart_sourcestart_sourceend).

The target region of memory may overlap the source region of memory.

Returns the number of bytes masked and written into `targetBuffer`.

### buffer.values()

Creates iterator for buffer values (bytes). This function is called automatically
65 changes: 65 additions & 0 deletions src/node_buffer.cc
@@ -310,6 +310,70 @@ void Base64Slice(const FunctionCallbackInfo<Value>& args) {
}


// bytesCopied = buffer.mask(mask, target, targetOffset, sourceStart, sourceEnd)
void Mask(const FunctionCallbackInfo<Value> &args) {
Environment* env = Environment::GetCurrent(args);

uint32_t mask32 = args[0]->Uint32Value();
char* mask = (char*)&mask32;
Member

Use reinterpret_cast, don't use C-style casts.

I'm not sure if you're aware of this, but with any pointer type other than char this would have been a pointer aliasing violation.

Contributor

Aliasing with char is okay, right? @bnoordhuis

Member

Yes, a pointer to char may alias a pointer to another type (but not the other way around.)
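
The asymmetry described above can be sketched as follows. This is an illustration only, not code from the PR: `ByteOf` and `LoadWord` are hypothetical helper names. Reading any object through a character pointer is permitted, while going from a character buffer to a `uint32_t` is done here by copying the bytes with `memcpy` instead of casting the pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the byte at `index` of `value` as it is
// stored in memory. Viewing an object through unsigned char* is allowed.
inline unsigned char ByteOf(const uint32_t& value, size_t index) {
  assert(index < sizeof(value));
  const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
  return bytes[index];
}

// Hypothetical helper: builds a uint32_t from four bytes of a char buffer.
// Dereferencing reinterpret_cast<const uint32_t*>(data) would be the
// forbidden direction, so the bytes are copied into a real uint32_t.
inline uint32_t LoadWord(const char* data) {
  uint32_t word;
  std::memcpy(&word, data, sizeof(word));
  return word;
}
```

Because `memcpy` preserves the in-memory byte order, `ByteOf(LoadWord(p), i)` equals `p[i]` on both little-endian and big-endian hosts.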


Local<Object> target = args[1]->ToObject(env->isolate());
Contributor

Before doing this, you should do CHECK(args[1]->IsObject()); and then do args[1].As<Object>() here. Unless you don't expect args[1] to always be an object?

Contributor Author

Will do. Should I add a similar check to buffer.copy?

Contributor

Go ahead and add it to this PR, but leave it in a separate commit. Thanks :)

Contributor Author

I'll add this to Copy in a separate PR – it causes test failures.

Contributor

strange. well, don't worry about it then. don't want that change holding this up.


if (!HasInstance(target))
return env->ThrowTypeError("second arg should be a Buffer");

ARGS_THIS(args.This())
size_t target_length = target->GetIndexedPropertiesExternalArrayDataLength();
char* target_data = static_cast<char*>(
target->GetIndexedPropertiesExternalArrayData());
size_t target_start;
size_t source_start;
size_t source_end;

CHECK_NOT_OOB(ParseArrayIndex(args[2], 0, &target_start));
CHECK_NOT_OOB(ParseArrayIndex(args[3], 0, &source_start));
CHECK_NOT_OOB(ParseArrayIndex(args[4], obj_length, &source_end));

// Copy 0 bytes; we're done
if (target_start >= target_length || source_start >= source_end)
return args.GetReturnValue().Set(0);

if (source_start > obj_length)
return env->ThrowRangeError("out of range index");

if (source_end - source_start > target_length - target_start)
source_end = source_start + target_length - target_start;
Member

Ignore my previous comment. This calculation can in theory overflow (although not yet in practice, due to the size restrictions on buffers). Assuming a 32-bit size_t: source_end = (2**31+1) + 2**31 - 0; // 1

I'd probably add a CHECK_LE(source_start, source_end), just in case.

Contributor Author

Should I add CHECK_LE to buffer.copy as well, from which the above code was (more-or-less directly) lifted?

Member

Right, same issue. Yes, a check would be good.

Medium term, we should look into removing the duplication in that file.
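
One way to sidestep the wrap-around discussed above is to never add offsets together and instead take the minimum of three differences, each guarded so it cannot underflow. The sketch below is an assumption about how that could look, not the code that landed; `ClampedLength` is a hypothetical name, and unlike the PR it returns 0 for an out-of-range `source_start` and leaves error reporting to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of bytes to process, computed with
// subtractions only, so no intermediate value can wrap around.
inline size_t ClampedLength(size_t source_start, size_t source_end,
                            size_t source_length,
                            size_t target_start, size_t target_length) {
  // Nothing to do: empty source range or no room in the target.
  if (target_start >= target_length || source_start >= source_end)
    return 0;
  // Assumption: the caller reports out-of-range starts; here we just stop.
  if (source_start >= source_length)
    return 0;

  const size_t requested = source_end - source_start;     // start < end
  const size_t available = source_length - source_start;  // start < length
  const size_t room = target_length - target_start;       // start < length
  const size_t n = std::min(requested, std::min(available, room));

  // Invariant: the clamped range stays inside both buffers.
  assert(n <= available && n <= room);
  return n;
}
```

With 16-byte source and target buffers, `ClampedLength(6, 16, 16, 4, 16)` yields 10, which matches the return value the PR's `testReturnValue` expects for `input.mask(0, output, 4, 6)`.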


uint32_t to_copy = MIN(MIN(source_end - source_start,
target_length - target_start),
obj_length - source_start);

obj_data += source_start;
target_data += target_start;

uint32_t* target_data_32 = (uint32_t*)target_data;
uint32_t* obj_data_32 = (uint32_t*)obj_data;
Member

This is pointer aliasing and strictly forbidden.

Contributor Author

How should I fix this? Is the goal to only have one name in any frame point to a given pointer address? If so, would it be possible to add something like:

static inline size_t Mask32(uint32_t* obj_data, uint32_t* target_data, uint32_t mask, size_t len) {
  len = len / 4;
  for (size_t i = 0; i < len; ++i) {
    target_data[i] = obj_data[i] ^ mask;
  }
  return len * 4;
}

// then use it like so:
size_t written = Mask32(
  reinterpret_cast<uint32_t*>(obj_data),
  reinterpret_cast<uint32_t*>(target_data),
  mask.as_dword,
  to_copy
);
target_data += written;
obj_data += written;

Contributor

iirc in cases like this you'd use a union.

Member

So, I see two issues here: one is pointer aliasing, the other is endianness.

Endianness: doing 32-bit XOR operations on bytes in a buffer will produce different results on BE and LE architectures.

Pointer aliasing: the safe way to go about it is as follows:

union {
  char bytes[sizeof(uint32_t)];
  uint32_t value;
} u;

for (size_t i = 0; i < to_copy_32; i += sizeof(u.bytes)) {
  char* const pointer = obj_data + i;
  memcpy(u.bytes, pointer, sizeof(u.bytes));
  u.value ^= mask32;
  memcpy(pointer, u.bytes, sizeof(u.bytes));
}

Not a paragon of beauty but there you have it.
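
Both concerns raised in this thread can be addressed together. The sketch below is a standalone illustration, not the PR's implementation: `MaskBytes` is a hypothetical helper. It assumes the mask bytes are taken in little-endian order, as the test's reference implementation does with `writeUInt32LE`, and that the source and target regions do not partially overlap. Words are moved with `memcpy` to avoid aliasing, and the mask word is built from the mask bytes so the result is the same on either byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: XORs `length` bytes of `source` with a repeating
// 4-byte mask and writes the result to `target`. Returns `length`.
inline size_t MaskBytes(const char* source, char* target, size_t length,
                        uint32_t mask_value) {
  assert(source != nullptr && target != nullptr);

  // Assumption: mask bytes are taken in little-endian order, whatever the
  // host byte order is.
  unsigned char mask[4];
  for (size_t k = 0; k < 4; ++k)
    mask[k] = static_cast<unsigned char>((mask_value >> (8 * k)) & 0xFF);

  // A word whose in-memory bytes are exactly mask[0..3] on any host.
  uint32_t mask_word;
  std::memcpy(&mask_word, mask, sizeof(mask_word));

  // Whole words: load, XOR, store, all through memcpy (no pointer casts).
  size_t i = 0;
  for (; i + sizeof(uint32_t) <= length; i += sizeof(uint32_t)) {
    uint32_t word;
    std::memcpy(&word, source + i, sizeof(word));
    word ^= mask_word;
    std::memcpy(target + i, &word, sizeof(word));
  }

  // Remaining 0 to 3 bytes, one at a time.
  for (; i < length; ++i) {
    target[i] = static_cast<char>(
        static_cast<unsigned char>(source[i]) ^ mask[i % 4]);
  }
  return length;
}
```

Compilers commonly turn fixed-size `memcpy` calls like these into plain loads and stores, so this form need not be slower than the cast-based loop.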

size_t to_copy_32 = to_copy / 4;
size_t i;
for (i = 0; i < to_copy_32; ++i) {
target_data_32[i] = mask32 ^ obj_data_32[i];
}
target_data += to_copy_32 * 4;
obj_data += to_copy_32 * 4;

switch(to_copy % 4) {
case 3:
target_data[2] = obj_data[2] ^ mask[2];
case 2:
target_data[1] = obj_data[1] ^ mask[1];
case 1:
target_data[0] = obj_data[0] ^ mask[0];
case 0:;
}
return args.GetReturnValue().Set(to_copy);
}


// bytesCopied = buffer.copy(target[, targetStart][, sourceStart][, sourceEnd]);
void Copy(const FunctionCallbackInfo<Value> &args) {
Environment* env = Environment::GetCurrent(args);
@@ -741,6 +805,7 @@ void SetupBufferJS(const FunctionCallbackInfo<Value>& args) {
env->SetMethod(proto, "utf8Write", Utf8Write);

env->SetMethod(proto, "copy", Copy);
env->SetMethod(proto, "mask", Mask);

// for backwards compatibility
proto->ForceSet(env->offset_string(),
92 changes: 92 additions & 0 deletions test/parallel/test-buffer-mask.js
@@ -0,0 +1,92 @@
var common = require('../common');
var assert = require('assert');

var tests = [
testBasic,
testBounds,
testOverflow,
testNonAligned,
testReturnValue
]

tests.forEach(Function.prototype.call.bind(Function.prototype.call))
Member

Clever... but a hell of a lot longer than tests.forEach(function(f) { f(); }) :-)

Minor style nit: missing semicolon.


function referenceImplementation(source, maskNum, output, offset, start, end) {
var i = 0;
var mask = new Buffer(4);
var length = end - start;
mask.writeUInt32LE(maskNum, 0);
var toCopy = Math.min(length, output.length - offset);
var maskIdx = 0;
for (var i = 0; i < toCopy; ++i) {
output[i + offset] = source[i + start] ^ mask[maskIdx];
maskIdx = (maskIdx + 1) & 3;
}
}

function testBasic() {
var input = new Buffer(256);
var output = new Buffer(256);
var refoutput = new Buffer(256);
var mask = 0xF0F0F0F0;

for (var i = 0; i < input.length; ++i)
input[i] = i;

output[0] = refoutput[0] = 0;
referenceImplementation(input, mask, refoutput, 1, 0, 255);
input.mask(mask, output, 1, 0, 255);
for (var i = 0; i < input.length; ++i) {
assert.equal(output[i], refoutput[i]);
}
}

function testBounds() {
var input = new Buffer(16);
var output = new Buffer(32);
try {
input.mask(120120, output, 0, 0, 17);
assert.fail('expected error');
} catch(err) {
assert.ok(err);
}
}

function testOverflow() {
var input = new Buffer(16);
var output = new Buffer(15);
try {
input.mask(120120, output);
assert.fail('expected error');
} catch(err) {
assert.ok(err);
}
}

function testNonAligned() {
var input = new Buffer(16);
var output = new Buffer(16);
var refoutput = new Buffer(16);
var mask = 0xF0F0F0F0;

for (var i = 0; i < input.length; ++i)
input[i] = i;

for (var end = 3; end > 0; --end) {
referenceImplementation(input, mask, refoutput, 0, 0, end);
input.mask(mask, output, 0, 0, end);

for (var i = 0; i < end; ++i)
assert.equal(output[i], refoutput[i]);
}
}

function testReturnValue() {
var input = new Buffer(16);
var output = new Buffer(16);
assert.equal(input.mask(0, output), 16)
assert.equal(input.mask(0, output, 4), 12)
assert.equal(input.mask(0, output, 4, 6), 10)
assert.equal(input.mask(0, output, 4, 6, 4), 0)
assert.equal(input.mask(0, output, 4, 6, 8), 2)
}