Scalars and vectors #20
Conversation
Looks good, but I'm not sure it's worth adding deprecated tags to all the old ways of doing it. Why not just remove them completely? There are going to be enough breaking changes in the new version, what's one more? That said, it might be worth keeping the old fast track around, marked as deprecated.
Looks good; I'm moving to the new API. Using the new API, how would one create an arbitrary constant matrix like this one (old code)?

It seems like this is hard, since there is only a primitive for scalars and vectors. Is there an easy way to do it using the `NdArray` API?
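For reference, the "old code" in question is presumably one of the standard-array overloads this PR deprecates; a minimal sketch of that style, with values taken from the examples discussed below:

```java
// Old-style constant creation from a standard Java array, the kind of
// overload this PR marks as deprecated:
Operand<TInt32> matrix = tf.constant(new int[][] {
    {4, 3, 5},
    {2, 1, 7},
    {5, 3, 0}
});
```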
@Craigacp it was to facilitate the migration of your code as we are heading to a final version, but my goal was to completely remove them before release. Now, if everyone agrees to remove them right away, I'll be pleased to do so :) Good call on keeping the old-fashioned fast track there, marked as deprecated; I like that idea.
@dhruvrajan The main reason we want to get rid of those methods taking standard n-dimensional arrays (like `int[][]`) is performance. But you made me realize that the API is still not optimal and can yet be improved. Let's go through this in detail.

Something to take into consideration is that when we create a constant, the data must be copied to a tensor that is then copied once more as a constant internally. So in your example, the data is duplicated twice. The equivalent using the API, with some changes, could be this:

```java
tf.constant(NdArrays.ofInts(shape(3, 3))
    .set(vectorOf(4, 3, 5), 0)
    .set(vectorOf(2, 1, 7), 1)
    .set(vectorOf(5, 3, 0), 2)
);
```

We can improve this by allocating a tensor of the right capacity and writing directly to its memory instead, so we save the allocation of some data on the heap. That would be (again with a few changes):

```java
tf.constant(TInt32.ofShape(3, 3)
    .set(vectorOf(4, 3, 5), 0)
    .set(vectorOf(2, 1, 7), 1)
    .set(vectorOf(5, 3, 0), 2)
);
```

(Note that allocating a tensor (native) and an nd-array (heap) have different signatures with regard to the shape, something we might want to adjust as well.) Still, each vector is allocated on the heap before ending up in the tensor, which should nevertheless be more performant than working with standard n-dimensional arrays. It is possible to avoid that allocation too, at the cost of extreme verbosity, like this:

```java
tf.constant(TInt32.ofShape(3, 3)
    .set(4, 0, 0).set(3, 0, 1).set(5, 0, 2)
    .set(2, 1, 0).set(1, 1, 1).set(7, 1, 2)
    .set(5, 2, 0).set(3, 2, 1).set(0, 2, 2)
);
```

which brings me back to thinking that the previous example (using `vectorOf`) is probably the best compromise.
It should still be possible to do something like this, right?

```java
tf.constant(TInt32.ofInts(shape(3, 3),
    4, 3, 5,
    2, 1, 7,
    5, 3, 0));
```
@saudet If I recall correctly, while designing the NdArray API, it was suggested (by @Craigacp I think) to use explicit indexation when setting values in an array instead of relying on the source formatting to make it clear. I was suggesting something like this at that time:

```java
tf.constant(TInt32.ofShape(3, 3)
    .row(4, 3, 5)
    .row(2, 1, 7)
    .rowLast(5, 3, 0)
);
```

@Craigacp, do you have time to jump in? WDYT?
I agree; it would be good to use explicit syntax, so that we can depend on auto-formatting to produce proper-looking array syntax. Since each row is set explicitly, the structure survives reformatting.

Also, though it's a bit tougher, allowing syntax nesting for higher-dimensional tensors would also be useful. It may be somewhat involved to verify that the shape of the nested structure matches the declared shape.

As a perhaps interesting tangent, in Scala it may also be possible to override some symbols to get a more literal matrix syntax.
@karllessard, @saudet, I'm against the style where we list the values as a flat array and then shape them, because if you reformat the code you lose what little information there was about the structure and have to mentally reparse it each time you read it. Plus, adding an extra value implies reformatting all the code, and if you don't reformat then things become misleading. The row-wise builder is also prone to bugs, as builders in my experience don't usually have an implicit mutable state pointer tracking where they write to (they tend to set orthogonal parameters). My preference would be for something like Karl's earlier example (the explicitly indexed `set(vectorOf(...), i)` form), as that is robust to formatting changes and is explicit about where the elements are written to. However, I think it would be kinda ugly in higher dimensions. I'm not sure there is a perfect answer to this problem; all the ways I can think of without language support have some tradeoffs.
OK, I'm not really preaching in favor of switching to the row "builder" pattern, but the way I was seeing it is that each `row()` call returns a cursor that remembers where the next row should be written, something like:

```java
class IntNdArray {

  class Row {
    private final long[] index;

    Row(long[] index) {  // constructor added for completeness
      this.index = index;
    }

    public Row row(int... values) {
      set(values, index);                // write this row at the current index
      return new Row(increment(index));  // advance the cursor to the next row
    }

    public IntNdArray rowLast(int... values) {
      set(values, index);                // write the final row
      return IntNdArray.this;            // return the enclosing array
    }
  }

  public Row row(int... values) {
    long[] index = { 0 };  // length depends on the rank of the array
    return new Row(index).row(values);
  }
}
```

(`set` and `increment` are elided here.) Now, I'll go with the group if explicit indexation is preferred, but I agree that we shouldn't rely only on source formatting to understand the code.
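Usage under this sketch would chain one call per row, with the cursor advancing implicitly (assuming some `IntNdArray` instance `m` of shape 3x3):

```java
// Hypothetical usage of the Row cursor sketched above; each call writes
// one row and advances the implicit index:
IntNdArray filled = m
    .row(4, 3, 5)       // writes row 0, cursor moves to row 1
    .row(2, 1, 7)       // writes row 1, cursor moves to row 2
    .rowLast(5, 3, 0);  // writes row 2, returns the enclosing array
```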
@dhruvrajan, syntactically your static API is great, I like it, but in the end you'll end up allocating arrays of arrays on the heap before passing them to the tensor for initialization, so I don't think there is a real gain compared to traditional n-dimensional arrays.
Well, I don't really care that much about the syntax, but say we already have an `IntBuffer`:

```java
tf.constant(TInt32.ofInts(shape(3, 3), IntBuffer.wrap(new int[] {
    4, 3, 5,
    2, 1, 7,
    5, 3, 0})));
```

This is just for illustration; imagine that we have a much larger buffer.
You just need to wrap it with a `DataBuffer`:

```java
tf.constant(TInt32.ofShape(30, 40, 50).write(DataBuffers.from(intBuffer)));
```
Why do we need to copy it?
It's not copied, it's wrapped... or do you mean copied to the tensor? Well, that was your case, right? You have an `IntBuffer` whose data needs to end up in the tensor's memory.
BTW, it might be interesting to look at this: I quickly wrote and ran some benchmarks on the different methods, and the "row" approach scores very high, since it writes all provided data sequentially and does not have to reposition itself on every call. But again, without proper formatting, it's easy to get lost.

```java
@Benchmark
@Measurement(batchSize = 1000)
public void initTensorByArrays() {
  Tensors.create(new int[][][][] {
      {
        {{0, 0, 0}, {0, 0, 1}, {0, 0, 2}},
        {{0, 1, 0}, {0, 1, 1}, {0, 1, 2}},
        {{0, 2, 0}, {0, 2, 1}, {0, 2, 2}}
      }, {
        {{1, 0, 0}, {1, 0, 1}, {1, 0, 2}},
        {{1, 1, 0}, {1, 1, 1}, {1, 1, 2}},
        {{1, 2, 0}, {1, 2, 1}, {1, 2, 2}}
      }, {
        {{2, 0, 0}, {2, 0, 1}, {2, 0, 2}},
        {{2, 1, 0}, {2, 1, 1}, {2, 1, 2}},
        {{2, 2, 0}, {2, 2, 1}, {2, 2, 2}}
      }
  });
}

@Benchmark
@Measurement(batchSize = 1000)
public void initTensorByVectors() {
  TInt32.ofShape(Shape.make(3, 3, 3, 3)).data()
      .set(vectorOf(0, 0, 0), 0, 0, 0).set(vectorOf(0, 0, 1), 0, 0, 1).set(vectorOf(0, 0, 2), 0, 0, 2)
      .set(vectorOf(0, 1, 0), 0, 1, 0).set(vectorOf(0, 1, 1), 0, 1, 1).set(vectorOf(0, 1, 2), 0, 1, 2)
      .set(vectorOf(0, 2, 0), 0, 2, 0).set(vectorOf(0, 2, 1), 0, 2, 1).set(vectorOf(0, 2, 2), 0, 2, 2)
      .set(vectorOf(1, 0, 0), 1, 0, 0).set(vectorOf(1, 0, 1), 1, 0, 1).set(vectorOf(1, 0, 2), 1, 0, 2)
      .set(vectorOf(1, 1, 0), 1, 1, 0).set(vectorOf(1, 1, 1), 1, 1, 1).set(vectorOf(1, 1, 2), 1, 1, 2)
      .set(vectorOf(1, 2, 0), 1, 2, 0).set(vectorOf(1, 2, 1), 1, 2, 1).set(vectorOf(1, 2, 2), 1, 2, 2)
      .set(vectorOf(2, 0, 0), 2, 0, 0).set(vectorOf(2, 0, 1), 2, 0, 1).set(vectorOf(2, 0, 2), 2, 0, 2)
      .set(vectorOf(2, 1, 0), 2, 1, 0).set(vectorOf(2, 1, 1), 2, 1, 1).set(vectorOf(2, 1, 2), 2, 1, 2)
      .set(vectorOf(2, 2, 0), 2, 2, 0).set(vectorOf(2, 2, 1), 2, 2, 1).set(vectorOf(2, 2, 2), 2, 2, 2);
}

@Benchmark
@Measurement(batchSize = 1000)
public void initTensorByRows() throws Exception {
  TInt32.ofShapeNew(Shape.make(3, 3, 3, 3))
      .row(0, 0, 0).row(0, 0, 1).row(0, 0, 2)
      .row(0, 1, 0).row(0, 1, 1).row(0, 1, 2)
      .row(0, 2, 0).row(0, 2, 1).row(0, 2, 2)
      .row(1, 0, 0).row(1, 0, 1).row(1, 0, 2)
      .row(1, 1, 0).row(1, 1, 1).row(1, 1, 2)
      .row(1, 2, 0).row(1, 2, 1).row(1, 2, 2)
      .row(2, 0, 0).row(2, 0, 1).row(2, 0, 2)
      .row(2, 1, 0).row(2, 1, 1).row(2, 1, 2)
      .row(2, 2, 0).row(2, 2, 1).row(2, 2, 2);
}
```
It could be an off-heap buffer. So it's still not possible to share Tensor memory? In that case, since the whole design is inefficient to start with, in my opinion there's little point in arguing about how inefficient it is to allocate multiple temporary arrays on the heap, is there?
In your example, you were passing a heap-allocated `IntBuffer`, so its data has to be copied into the tensor at some point. If you simply want to "wrap" native memory with the `NdArray` API, that is a different case.
My point is about doing things the most efficient way. A dense Tensor is stored in contiguous memory, so it's also most efficient to interact with contiguous chunks of memory from Java, whether on the heap or not. And if it's not on the heap, then we don't even need to copy it.

We don't have to dig anywhere; we can allocate Tensor memory however we want. It's part of the C API (`TF_NewTensor`, which accepts caller-provided memory and a deallocator):
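TF Java exposes nothing like this yet; the following is only a sketch of the idea behind `TF_NewTensor` (a tensor adopting caller-owned off-heap memory, with a deallocator invoked when the tensor is released). All names below are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ZeroCopySketch {

  // Hypothetical callback mirroring TF_NewTensor's deallocator parameter.
  interface Deallocator {
    void free(ByteBuffer data);
  }

  // Hypothetical endpoint: the tensor would take ownership of 'data' and
  // call 'dealloc' when released, instead of copying the buffer.
  static void adoptTensorMemory(long[] dims, ByteBuffer data, Deallocator dealloc) {
    // would forward to TF_NewTensor through the native bindings
  }

  public static void main(String[] args) {
    ByteBuffer data = ByteBuffer.allocateDirect(3 * 3 * Integer.BYTES)
        .order(ByteOrder.nativeOrder());
    for (int i = 0; i < 9; i++) {
      data.putInt(i);
    }
    adoptTensorMemory(new long[] {3, 3}, data, buf -> {
      // release the native memory here; called back from native code
      // into the JVM, as discussed below
    });
  }
}
```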
Very interesting, I don't recall seeing that endpoint before; TF Java never used it, cool stuff! It should be relatively easy to support with the new tensor types.
Providing the deallocator might be a little complicated, as it will need to call back into the JVM.
I guess JavaCPP already supports something like this? Also, we'll have to rely on JavaCPP for it.
Now what about something like this:

```java
@Benchmark
@Measurement(batchSize = 1000)
public void initTensorByIndexedRows() throws Exception {
  TInt32.ofShapeNew(Shape.make(3, 3, 3, 3))
      .initRows(0, 0)
      .row(0, 0, 0).row(0, 0, 1).row(0, 0, 2)
      .initRows(0, 1)
      .row(0, 1, 0).row(0, 1, 1).row(0, 1, 2)
      .initRows(0, 2)
      .row(0, 2, 0).row(0, 2, 1).row(0, 2, 2)
      .initRows(1, 0)
      .row(1, 0, 0).row(1, 0, 1).row(1, 0, 2)
      .initRows(1, 1)
      .row(1, 1, 0).row(1, 1, 1).row(1, 1, 2)
      .initRows(1, 2)
      .row(1, 2, 0).row(1, 2, 1).row(1, 2, 2)
      .initRows(2, 0)
      .row(2, 0, 0).row(2, 0, 1).row(2, 0, 2)
      .initRows(2, 1)
      .row(2, 1, 0).row(2, 1, 1).row(2, 1, 2)
      .initRows(2, 2)
      .row(2, 2, 0).row(2, 2, 1).row(2, 2, 2);
}
```

It is partially indexed, performs almost as well as the previous unindexed-row method, and has a more succinct syntax than the fully indexed vector approach.
Yes, JavaCPP supports all that. There's a "dummyDeallocator" used by default. BTW, we don't need to put something like…
Ok, I made major changes to this PR; please take another look at it. Here's the changelog:

@dhruvrajan, about the matrix initialization: I restored the possibility to initialize such a constant with standard arrays, like this:

```java
tf.matrix(new int[][][] {
    {
      { 1, 2 }, { 3, 4 }
    }, {
      { 5, 6 }, { 7, 8 }
    }
});
```

Just by avoiding reflective operations, performance is now comparable to the other methods. It is still possible to initialize the same constant from a flat buffer:

```java
tf.constant(NdArrays.wrap(Shape.of(2, 2, 2), DataBuffers.of(
    1, 2, 3, 4,
    5, 6, 7, 8
)));
```

Finally, all constant operators inherit from `Constant`.

CC: @Craigacp
@@ -0,0 +1,563 @@

```java
/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
```
The copyright on new files should probably be 2020.
```java
 * new constant will match those of the array.
 * @return a {@link TFloat64} constant matrix
 */
public static Matrix<TFloat64> create(Scope scope, double[][][][] data) {
```
Is a rank-4 object really a matrix? Should these be moved to `tensorConstant` or some other name? Also, the javadoc should probably note that these are slower than other ways of creating a rank-4 object.
> Is a rank-4 object really a matrix?

A `tensor` can also be rank 0 or 1, while here it is only used to initialize rank-2+ tensors from standard nd-arrays; I thought `matrix` conveyed that in some way. While the term is pretty common for 2D and 3D, I personally don't know how awkward it sounds in the field for 4D+ data structures.

We can also merge all those endpoints back into a single `constant` (or `tensor`?) operator, with an exception for the `vector` initializers, which take advantage of the vararg parameter.

> Also the javadoc should probably note that these are slower than other ways of creating a rank-4 object.

Actually, since I removed all reflective operations on them, their performance is comparable to the other methods. In graph mode, a `constant` is mainly used to instantiate a value from some constant values, which should be relatively small. I think what was really killing performance was when large tensors were created for feeding data to a graph using one of the `Tensors.create` methods, like this one (which I plan to remove as well).
Just thought of something: I've always found that `tf.constant` is a bit verbose for something that is used so often. What about `tf.val` instead? It is concise and kind of implies that it is an immutable (constant) value. Then `tf.vector` could maybe be renamed to `tf.array` for vararg initialization only, and the other operators would be removed. @Craigacp?
Trying it out, I think I really like it:

```java
private static Tensor<TFloat32> preprocessImages(ByteNdArray rawImages) {
  Ops tf = Ops.create();
  // Flatten images in a single dimension and normalize their pixels as floats.
  long imageSize = rawImages.get(0).shape().size();
  return tf.math.div(
      tf.reshape(
          tf.dtypes.cast(tf.val(rawImages), TFloat32.DTYPE),
          tf.array(-1L, imageSize)
      ),
      tf.val(255.0f)
  ).asTensor();
}
```

Both `val` and `array` return a `Constant` object (with the help of the new `@Endpoint` annotation). I think I'll push those changes; the timing is right for improving the whole API.
> Just thought of something, I've always found that tf.constant is a bit verbose for something that is used so often.

Agreed; I like the `tf.val` and `tf.array` with vararg initialization.

I think the necessity of a `tf.matrix` is not very high; it'd be OK to just have `tf.constant` or `tf.tensor` methods to construct higher-dimensional tensors like you suggest above.
In fact, API-wise, I've reverted most of the signatures to what they were, with the following exceptions:

- `tf.constant` has been renamed to `tf.val`
- `tf.array` has been added to support varargs
- `Shape.make` has been renamed to `Shape.of`
- A few other minor things.

So there is no more `tf.matrix`, `tf.vector`, `tf.scalar`, `tf.tensor`...
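A quick before/after sketch of the renames listed above, with signatures assumed from the examples in this thread:

```java
// Before this change:
// Operand<TInt32> c = tf.constant(42);
// Shape s = Shape.make(3, 3);

// After this change:
Operand<TInt32> c = tf.val(42);         // tf.constant renamed to tf.val
Operand<TInt32> a = tf.array(1, 2, 3);  // new vararg initializer
Shape s = Shape.of(3, 3);               // Shape.make renamed to Shape.of
```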
```java
if (shape[dimIdx] == 0) {
  shape[dimIdx] = arrayLength;
} else if (shape[dimIdx] != arrayLength) {
  shape[dimIdx] = Shape.UNKNOWN_SIZE;
```
Is this the behaviour we want for ragged tensors? The ndarray machinery is mostly predicated on the ndarray being a hyperrectangle; is everything robust to getting a `Shape.UNKNOWN_SIZE`?
It's in preparation for ragged tensors. Right now, only dense `NdArray`s can be instantiated by TF Tools. Plus, providing a "ragged" standard array to initialize a dense one will fail gracefully, as is tested here. So I think it is safe to properly support unknown dim sizes for standard arrays.
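For illustration, a hypothetical sketch of the ragged case described above (the exact exception type is an assumption):

```java
// Rows of different lengths: dimension 1 cannot be sized consistently,
// so it is recorded as Shape.UNKNOWN_SIZE by the snippet above.
int[][] ragged = { {1, 2, 3}, {4, 5} };

// Initializing a dense NdArray from it is expected to fail gracefully,
// e.g. with an IllegalArgumentException, rather than silently producing
// a malformed dense tensor.
```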
Code updated with the latest proposal (i.e. using `tf.val` and `tf.array`).
This makes things very convenient!
Ok, @Craigacp also confirmed with me that he's fine with those changes, so I'm merging.
This PR is a suggestion to improve the creation of constants for invoking TensorFlow ops, either in graph or eager mode.

Right now, everything goes through the `constant` operator, which supports a huge list of possible constructors, most of them standard arrays of increasing rank. In the current PR, I deprecate most of them, as we should use the `NdArray` API to create multidimensional arrays for better performance in general. Furthermore, the constructor that takes a single primitive value in parameter is now handled by the new `scalar` op, and the one taking a single array goes to the `vector` op. For example:
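A minimal sketch of what these endpoints look like, based on the description above (exact signatures are assumed, not final):

```java
// Rank-0 constant from a single primitive value:
Scalar<TInt32> answer = tf.scalar(42);

// Rank-1 constant built directly from values:
Vector<TInt32> digits = tf.vector(4, 3, 5);
```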
The reason for creating two new distinct ops, `scalar` and `vector`, is that it allows us to use variadic parameters for creating rank-1 constants without the need to always allocate a new array, i.e.:
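A hedged before/after illustration of that vararg gain (the "before" overload is one of the deprecated array constructors):

```java
// Before: a new array must be allocated at every call site.
Operand<TInt32> before = tf.constant(new int[] {1, 2, 3});

// After: varargs avoid the explicit array allocation.
Vector<TInt32> after = tf.vector(1, 2, 3);
```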
So the new code is less verbose. And having distinct variable types like `Scalar` and `Vector` can be useful as an indicator of the rank of the tensor carried in a given constant.

Also, a `vector` can be created from a `Shape` object, as often a shape needs to be passed as an operand to an op (note that this could have been done in `constant` as well). For example:
,vector
andconstant
, creates richer and cleaner code but I would like to hear your comments on it.CC: @Craigacp, @EronWright, @dhruvrajan