Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #280 (surrogate code points in Builder) #281

Merged
merged 2 commits into from
Mar 31, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions src/Data/Text/Internal/Builder.hs
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ import Data.Monoid (Monoid(..))
#if !MIN_VERSION_base(4,11,0) && MIN_VERSION_base(4,9,0)
import Data.Semigroup (Semigroup(..))
#endif
import Data.Text.Internal (Text(..))
import Data.Text.Internal (Text(..), safe)
import Data.Text.Internal.Lazy (smallChunkSize)
import Data.Text.Unsafe (inlineInterleaveST)
import Data.Text.Internal.Unsafe.Char (unsafeWrite)
Expand Down Expand Up @@ -138,7 +138,7 @@ empty = Builder (\ k buf -> k buf)
-- * @'toLazyText' ('singleton' c) = 'L.singleton' c@
--
singleton :: Char -> Builder
singleton c = writeAtMost 2 $ \ marr o -> unsafeWrite marr o c
singleton c = writeAtMost 2 $ \ marr o -> unsafeWrite marr o (safe c)
{-# INLINE singleton #-}

------------------------------------------------------------------------
Expand Down Expand Up @@ -190,7 +190,7 @@ fromString str = Builder $ \k (Buffer p0 o0 u0 l0) ->
ts <- inlineInterleaveST (loop marr' 0 0 chunkSize s)
return $ t : ts
| otherwise = do
n <- unsafeWrite marr (o+u) c
n <- unsafeWrite marr (o+u) (safe c)
loop marr o (u+n) (l-n) cs
in loop p0 o0 u0 l0 str
where
Expand Down
15 changes: 15 additions & 0 deletions tests/Tests/Regressions.hs
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import qualified Data.Text.Encoding as TE
import qualified Data.Text.Internal as T
import qualified Data.Text.IO as T
import qualified Data.Text.Lazy as LT
import qualified Data.Text.Lazy.Builder as TB
import qualified Data.Text.Lazy.Encoding as LE
import qualified Data.Text.Unsafe as T
import qualified Test.Tasty as F
Expand Down Expand Up @@ -101,6 +102,18 @@ t227 =
(T.length $ T.filter isLetter $ T.take (-3) "Hello! How are you doing today?")
0

t280_fromString :: IO ()
t280_fromString =
assertEqual "TB.fromString performs replacement on invalid scalar values"
(TB.toLazyText (TB.fromString "\xD800"))
(LT.pack "\xFFFD")

t280_singleton :: IO ()
t280_singleton =
assertEqual "TB.singleton performs replacement on invalid scalar values"
(TB.toLazyText (TB.singleton '\xD800'))
(LT.pack "\xFFFD")

-- See GitHub issue #301
-- This tests whether the "TEXT take . drop -> unfused" rule is applied to the
-- slice function. When the slice function is fused, a new array will be
Expand Down Expand Up @@ -129,5 +142,7 @@ tests = F.testGroup "Regressions"
, F.testCase "t197" t197
, F.testCase "t221" t221
, F.testCase "t227" t227
, F.testCase "t280/fromString" t280_fromString
, F.testCase "t280/singleton" t280_singleton
, F.testCase "t301" t301
]