-
Notifications
You must be signed in to change notification settings - Fork 806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-Implementation of reverse string via CFString #2479
Conversation
nit: reword to be imperative mood #Pending |
Getting some bench mark for old vs new #Resolved |
|
||
CFMutableStringRef result = CFStringCreateMutable(kCFAllocatorDefault, 0); | ||
|
||
while (index > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to suck when it comes to perf. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might as well use buffer indexing, given it's somewhat "safe"
In reply to: 110780304 [](ancestors = 110780304)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the use of unichars and individual character access, because it's safe for multi-byte encodings... but we should be cognizant of allocations!
CFIndex index = CFStringGetLength(string); | ||
|
||
if (index < 2) { | ||
return CFStringCreateCopy(kCFAllocatorDefault, string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could also be CFRetain #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the input is a mutable string, retain might not be a good idea/ #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! #ByDesign
return CFStringCreateCopy(kCFAllocatorDefault, string); | ||
} | ||
|
||
CFMutableStringRef result = CFStringCreateMutable(kCFAllocatorDefault, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you create the new string with a capacity hint from the original string, you can avoid the inevitable reallocations #Resolved
|
||
CFMutableStringRef result = CFStringCreateMutable(kCFAllocatorDefault, 0); | ||
|
||
while (index > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it may be cheaper to create a unichar buffer and pass that off to CFStringCreateWithCharacters -- you get the added benefit of not creating a mutable string
It should be somewhat "safe" to use utf-16 codepoints or unichars; it's definitely safer than utf-8 #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's what i was going with first, it might be more sensible given that only CT uses it.
Hmm i might as well move it in there, but this kind of makes it more util based. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I wonder how it works with composing diacritics. It might be reasonable to normalize the string first... #Resolved |
#import <CFInternal.h> | ||
#import "CFFoundationInternal.h" | ||
|
||
CFStringRef _CFStringCreateReversedString(CFStringRef string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that there is exactly one place where this is used (and was used in the past), I don't like that we are doing this. Why not have the helper in CT and just use it there? #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what my initial plan was, but since this seems very common and could be utilized by others, it would make a good addition. Also we already have some tests that can be tested independently. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets not do this. the _reverseString was added because some other code was using it. Given that there is only one instance where _reverseString was used, there is no need to go overboard here.
In reply to: 110781327 [](ancestors = 110781327)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, what is wrong with good old fashioned strrev variants?
In reply to: 110782273 [](ancestors = 110782273,110781327)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_strrev + UTF-16 encoding should do the trick then.
In reply to: 110782324 [](ancestors = 110782324,110782273,110781327)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets not do this at all.
@rajsesh-msft still need reverse in CFString backed by strrev, that is contained in CoreText. |
Frameworks/CoreText/CTUtilities.mm
Outdated
@@ -15,7 +15,10 @@ | |||
//****************************************************************************** | |||
|
|||
#import <CoreText/CTUtilities.h> | |||
#import <Starboard/SmartTypes.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed #Resolved
Frameworks/CoreText/CTLine.mm
Outdated
@@ -188,8 +187,9 @@ CTLineRef CTLineCreateTruncatedLine(CTLineRef sourceLine, double width, CTLineTr | |||
CFDictionaryRef attribs = CTRunGetAttributes(currentRun); | |||
|
|||
if (truncationType == kCTLineTruncationStart) { | |||
NSString* reverse = [runString _reverseString]; | |||
NSAttributedString* string = [[NSAttributedString alloc] initWithString:reverse attributes:(NSDictionary*)attribs]; | |||
auto reverse = woc::MakeStrongCF<CFStringRef>(_CTCreateReversedString(static_cast<CFStringRef>(runString))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there's just the one usage of it, can it be a file-scope function instead? #Resolved
Frameworks/CoreText/CTUtilities.mm
Outdated
} | ||
|
||
// Returns the number (in terms of UTF-16 code pairs) of Unicode characters in a string. | ||
CFIndex length = CFStringGetLength(string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using CFStringGetBytes to be safer
#Resolved
@DHowett-MSFT @ms-jihua please take a look |
Frameworks/CoreText/CTLine.mm
Outdated
#import "CoreTextInternal.h" | ||
#import "CGContextInternal.h" | ||
#import "DWriteWrapper_CoreText.h" | ||
#import <CoreText/CTTypesetter.h> | ||
|
||
#import <algorithm> | ||
#import <numeric> | ||
#import <wchar.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: prefer <cwchar>
#Resolved
Frameworks/CoreText/CTLine.mm
Outdated
return CFStringCreateCopy(kCFAllocatorDefault, string); | ||
} | ||
|
||
std::vector<UniChar> characters((usedBufLen / (sizeof(UniChar))) + 1, '\0'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use unique_iw instead of std::vector, you can use CFStringCreateWithBytesNoCopy
and detach it instead of copying the buffer. #Resolved
Frameworks/CoreText/CTLine.mm
Outdated
NSAttributedString* string = [[NSAttributedString alloc] initWithString:reverse attributes:(NSDictionary*)attribs]; | ||
auto reverse = woc::MakeStrongCF<CFStringRef>(_CTCreateReversedString(static_cast<CFStringRef>(runString))); | ||
NSAttributedString* string = | ||
[[NSAttributedString alloc] initWithString:static_cast<NSString*>(reverse.get()) attributes:(NSDictionary*)attribs]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
??? can't we use CFAttributedStringCreate? #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already have changes for all NSAttributed, this is strictly for the reverse. #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I chose the wrong option.
|
||
CFIndex length = CFStringGetLength(string); | ||
CFIndex usedBufLen; | ||
CFStringGetBytes(string, CFRangeMake(0, length), kCFStringEncodingUTF16, 0, false, nullptr, length, &usedBufLen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems kind of expensive for this purpose (unless CFStringGetBytes has a fastpath here?)
I wonder if CFStringGetMaximumSizeForEncoding() would be faster/would still work? #ByDesign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it does take a faster way than if you provide a buffer.
Further improvements can be addressed by a bug to improve core text. The CR is mainly to remove Foundation dependency and what we have now is better than the earlier Foundation/copy based implementation.
#ByDesign
#import <vector> | ||
|
||
static CFStringRef __CTCreateReversedString(CFStringRef string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this change being tested? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frameworks/CoreText/CTLine.mm
Outdated
|
||
CFIndex bufLen = usedBufLen + 1 * sizeof(UniChar); | ||
woc::unique_iw<UniChar> characters(static_cast<UniChar*>(IwMalloc(bufLen))); | ||
memset(characters.get(), '\0', bufLen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c'mon, you only need to set the last character in the buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤣 that was coming 😄
Frameworks/CoreText/CTLine.mm
Outdated
bufLen, | ||
kCFStringEncodingUTF16, | ||
false, | ||
nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you will need to provide an allocator to delete this string; look at the implementation of initWithBytesNoCopy:freeWhenDone:
for NSString, expecting freeWhenDone=YES
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already passing in the nullptr for deallocation.
Switched to use characters, which does it for you.
@DHowett-MSFT please take a look |
- Implemented via _wcsrev - Reduced string copies - Implemented using CoreFoundation - Removed usage of Foundation associated with reverse string. Fixes microsoft#2478
CFIndex usedBufLen; | ||
CFStringGetBytes(string, CFRangeMake(0, length), kCFStringEncodingUTF16, 0, false, nullptr, length, &usedBufLen); | ||
|
||
if (length < 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move this above CFStringGetBytes call
@@ -1,6 +1,6 @@ | |||
//****************************************************************************** | |||
// | |||
// Copyright (c) 2015 Microsoft Corporation. All rights reserved. | |||
// Copyright (c) Microsoft. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These clean-ups, though nice, seem unrelated to this PR. Just revert these files?
fixes #2478