packages/text.git
Simon Meier [Fri, 27 Jan 2012 23:12:29 +0000 (00:12 +0100)] 
.gitignore 'dist' and 'cabal-dev'

Bryan O'Sullivan [Fri, 13 Jan 2012 22:29:18 +0000 (14:29 -0800)] 
Merge pull request #9 from jaspervdj/master

Simpler restreaming state

Bryan O'Sullivan [Sat, 7 Jan 2012 22:01:19 +0000 (14:01 -0800)] 
Merge

Bryan O'Sullivan [Fri, 23 Dec 2011 22:42:55 +0000 (14:42 -0800)] 
Added tag 0.11.1.12 for changeset 204da16b5098

Bryan O'Sullivan [Fri, 23 Dec 2011 22:12:36 +0000 (14:12 -0800)] 
Many improvements, all small.

Add a span_ function, using unboxed tuples, to Text.Private.

Use span_ in a few places where it can help a little.

Relax the constraint on rational to Fractional.

Specialize over many more integral types.
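
A minimal sketch of the unboxed-tuple idea behind span_, written over plain lists so that it needs only base. The names spanList_ and spanList are hypothetical and are not the definitions in Text.Private; the point is only that returning (# a, b #) lets the caller receive both halves without a heap-allocated pair.

```haskell
{-# LANGUAGE UnboxedTuples #-}

import Data.Char (isDigit)

-- | Hypothetical list analogue of span_: like 'span', but the two
-- halves are returned in an unboxed tuple rather than a boxed pair.
spanList_ :: (a -> Bool) -> [a] -> (# [a], [a] #)
spanList_ p = go []
  where
    -- Accumulate the matching prefix in reverse, then flip it once.
    go acc (x:xs) | p x = go (x : acc) xs
    go acc rest         = (# reverse acc, rest #)

-- | Ordinary boxed wrapper for callers that want a regular pair.
spanList :: (a -> Bool) -> [a] -> ([a], [a])
spanList p xs = case spanList_ p xs of
  (# ys, zs #) -> (ys, zs)

main :: IO ()
main = print (spanList isDigit "123abc")
```

Internal callers would pattern-match on the unboxed result directly, as the wrapper does, and only box the pair at the public API boundary.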

Bryan O'Sullivan [Fri, 23 Dec 2011 20:28:11 +0000 (12:28 -0800)] 
Reduce pointer arithmetic for better speed.

Bryan O'Sullivan [Fri, 23 Dec 2011 20:24:17 +0000 (12:24 -0800)] 
Improve ASCII encoding performance in a safer way.

Bryan O'Sullivan [Fri, 23 Dec 2011 20:03:31 +0000 (12:03 -0800)] 
Bump version to 0.11.1.12

Bryan O'Sullivan [Fri, 23 Dec 2011 20:03:18 +0000 (12:03 -0800)] 
Merge the performance- and correctness-affecting commits away

Bryan O'Sullivan [Fri, 23 Dec 2011 20:02:07 +0000 (12:02 -0800)] 
Oops! Back out part of 59aad6977070 - it was wrong

My assertion that it was safe to skip the "do I have 1 byte available?" check
was incorrect.

Bryan O'Sullivan [Fri, 23 Dec 2011 19:57:13 +0000 (11:57 -0800)] 
A valiant attempt at improving UTF-8 encoding performance.

This didn't actually work - it slowed down aeson encoding by almost 2x!

Bryan O'Sullivan [Fri, 23 Dec 2011 17:53:36 +0000 (09:53 -0800)] 
Make span slightly faster and simpler

This drops a single allocation of a boxed integer.

Bryan O'Sullivan [Fri, 23 Dec 2011 08:15:14 +0000 (00:15 -0800)] 
Added tag 0.11.1.11 for changeset 8b981edd27be

Bryan O'Sullivan [Fri, 23 Dec 2011 08:14:53 +0000 (00:14 -0800)] 
Bump version to 0.11.1.11

Bryan O'Sullivan [Fri, 23 Dec 2011 07:53:24 +0000 (23:53 -0800)] 
Make encoding slightly faster.

The improvement mainly comes from dropping a redundant check when decoding
an ASCII byte.

Bryan O'Sullivan [Wed, 21 Dec 2011 05:31:25 +0000 (21:31 -0800)] 
Added tag 0.11.1.10 for changeset 407937739e9e

Bryan O'Sullivan [Wed, 21 Dec 2011 05:31:20 +0000 (21:31 -0800)] 
Merge

Bryan O'Sullivan [Wed, 21 Dec 2011 05:30:06 +0000 (21:30 -0800)] 
Added tag 0.11.1.10 for changeset 9f47a2cfc9e5

Bryan O'Sullivan [Wed, 21 Dec 2011 05:30:03 +0000 (21:30 -0800)] 
Bump version to 0.11.1.10

Bryan O'Sullivan [Wed, 14 Dec 2011 17:54:57 +0000 (09:54 -0800)] 
Merge pull request #15 from tibbe/base-dep

Bump dependency on integer-gmp

Johan Tibell [Wed, 14 Dec 2011 15:43:29 +0000 (16:43 +0100)] 
Bump dependency on integer-gmp

John Chee [Tue, 6 Dec 2011 01:46:01 +0000 (17:46 -0800)] 
Documentation fix for breakOnAll.

Replaced occurrences of 'find' in the 'breakOnAll' examples. Also
replaced 'find' with 'breakOnAll' in the Indexing discussion. Replaced
'find' with 'breakOnAll' in 'breakOn' discussion in Data.Text.Lazy.

Bryan O'Sullivan [Sat, 19 Nov 2011 05:24:10 +0000 (21:24 -0800)] 
Widen criterion dep

Bryan O'Sullivan [Fri, 4 Nov 2011 23:06:13 +0000 (16:06 -0700)] 
Disable I/O QC tests, as they infinite-loop sometimes.

Bryan O'Sullivan [Fri, 4 Nov 2011 23:05:56 +0000 (16:05 -0700)] 
Quiet a warning.

Bryan O'Sullivan [Wed, 2 Nov 2011 16:04:57 +0000 (09:04 -0700)] 
Added tag 0.11.1.9 for changeset 5dce2a934be5

Bryan O'Sullivan [Wed, 2 Nov 2011 16:04:36 +0000 (09:04 -0700)] 
Bump version to 0.11.1.9

Bryan O'Sullivan [Wed, 2 Nov 2011 13:30:36 +0000 (06:30 -0700)] 
Fix build of tests under GHC 6.12.3

Bryan O'Sullivan [Wed, 2 Nov 2011 04:02:20 +0000 (21:02 -0700)] 
Quieten some compilation warnings

Bryan O'Sullivan [Wed, 2 Nov 2011 03:46:59 +0000 (20:46 -0700)] 
Added tag 0.11.1.8 for changeset 9f01361a7307

Bryan O'Sullivan [Wed, 2 Nov 2011 03:46:48 +0000 (20:46 -0700)] 
Loosen constraints on deepseq

Bryan O'Sullivan [Sat, 29 Oct 2011 17:04:04 +0000 (10:04 -0700)] 
Added tag 0.11.1.7 for changeset 5ac062eace36

Bryan O'Sullivan [Sat, 29 Oct 2011 17:03:33 +0000 (10:03 -0700)] 
Bump version to 0.11.1.7

Bryan O'Sullivan [Sat, 29 Oct 2011 17:02:55 +0000 (10:02 -0700)] 
Merge

Bryan O'Sullivan [Sat, 29 Oct 2011 16:38:04 +0000 (09:38 -0700)] 
Merge pull request #12 from tibbe/ffi-fix

Fix build with GHC 7.3+

Johan Tibell [Sat, 29 Oct 2011 15:25:32 +0000 (08:25 -0700)] 
Fix build with GHC 7.3+

Mikhail Vorozhtsov [Fri, 28 Oct 2011 07:40:26 +0000 (14:40 +0700)] 
Import foreign types constructors to satisfy GHC 7.3+

Bryan O'Sullivan [Tue, 25 Oct 2011 22:28:06 +0000 (15:28 -0700)] 
Update maintainer list

Bryan O'Sullivan [Tue, 25 Oct 2011 22:25:16 +0000 (15:25 -0700)] 
Merge with Daniel

Bryan O'Sullivan [Tue, 25 Oct 2011 22:21:58 +0000 (15:21 -0700)] 
Added tag 0.11.1.6 for changeset 9d6d3a9690ad

Bryan O'Sullivan [Tue, 25 Oct 2011 22:20:36 +0000 (15:20 -0700)] 
Bump version

Daniel Fischer [Tue, 25 Oct 2011 00:34:27 +0000 (02:34 +0200)] 
Fix buildTable

On chunk boundaries, the character was not recorded and the global index not
incremented, causing issue #10.

Bryan O'Sullivan [Tue, 4 Oct 2011 14:42:06 +0000 (07:42 -0700)] 
Fix a corner case in lazy text search.

On a chunk boundary, we were not passing the correct mask and skip values
along to the function that would process the next chunk.

Bryan O'Sullivan [Tue, 4 Oct 2011 14:12:43 +0000 (07:12 -0700)] 
Silence a compiler warning.

Bryan O'Sullivan [Tue, 4 Oct 2011 14:11:03 +0000 (07:11 -0700)] 
Eliminate a useless fromIntegral.

Jasper Van der Jeugt [Sat, 27 Aug 2011 13:40:54 +0000 (15:40 +0200)] 
Use a simpler restreaming state

Bryan O'Sullivan [Mon, 22 Aug 2011 06:40:45 +0000 (23:40 -0700)] 
Widen dependency on directory

Bryan O'Sullivan [Mon, 22 Aug 2011 06:25:12 +0000 (23:25 -0700)] 
Add top-level QuickCheck test support.

The "real" tests remain in tests/tests - this test suite is built without
optimization, and simply lets us do a quick pass/fail during automated builds.

Bryan O'Sullivan [Mon, 22 Aug 2011 06:03:03 +0000 (23:03 -0700)] 
Point the master repo and bugtracker at github.

Bryan O'Sullivan [Mon, 22 Aug 2011 05:56:25 +0000 (22:56 -0700)] 
Merge

GitHub Merge Button [Mon, 22 Aug 2011 05:51:24 +0000 (22:51 -0700)] 
Merge 1b33e0812bf036b250c094d3c39df1b6351af890 into 3845ffd4b616fb920e1ef16a61ab3c384e20cd78

Jasper Van der Jeugt [Thu, 18 Aug 2011 12:59:21 +0000 (14:59 +0200)] 
Consistently use ByteStrings for IO

Jasper Van der Jeugt [Sat, 13 Aug 2011 11:13:22 +0000 (13:13 +0200)] 
Benchmark stream and restreamXXX as well

Jasper Van der Jeugt [Sat, 13 Aug 2011 10:27:52 +0000 (12:27 +0200)] 
Benchmark all Encoding.Fusion streaming functions

Jasper Van der Jeugt [Sat, 13 Aug 2011 10:09:01 +0000 (12:09 +0200)] 
Add Streaming benchmarks

Bryan O'Sullivan [Fri, 22 Jul 2011 19:32:41 +0000 (12:32 -0700)] 
Added tag 0.11.1.5 for changeset 53906ad0c7e6

Bryan O'Sullivan [Wed, 20 Jul 2011 18:32:28 +0000 (11:32 -0700)] 
Bump version

Bryan O'Sullivan [Wed, 20 Jul 2011 18:32:10 +0000 (11:32 -0700)] 
Fix an overly cautious bit of arithmetic checking.

Even though the value behind a Size is an Int, we actually intend that those
values should always be non-negative. (We don't use the notionally more
appropriate Word because GHC doesn't do a very good job with it.)

But non-negative means that 0+0 should be 0! Um, oops.
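
A minimal sketch of an overflow check for sizes stored as Int but meant to be non-negative. The name addSize is hypothetical, and the idea that the overly cautious version used a strict comparison is an assumption drawn from the 0+0 remark, not something the log states.

```haskell
import Control.Exception (ArithException (Overflow), throw)

-- | Hypothetical checked addition for non-negative sizes.  When both
-- inputs are non-negative, a negative sum can only come from Int
-- wrap-around, so the sign of the result doubles as an overflow test.
addSize :: Int -> Int -> Int
addSize m n
  | mn >= 0   = mn              -- '>=' rather than '>', so 0 + 0 is accepted
  | otherwise = throw Overflow
  where
    mn = m + n

main :: IO ()
main = do
  print (addSize 0 0)
  print (addSize 3 4)
```

With a strict comparison in the first guard, the perfectly valid sum of two empty sizes would be reported as an overflow.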

GitHub Merge Button [Fri, 15 Jul 2011 17:31:35 +0000 (10:31 -0700)] 
Merge e1bc8a8a3e9861706471da5749b8b3ea0f83e5ac into 9e9d83ee1c989dd900a7ab2902ab892e2468d5d6

Johan Tibell [Fri, 15 Jul 2011 12:45:52 +0000 (14:45 +0200)] 
Bump dependency on integer-gmp

Bryan O'Sullivan [Mon, 11 Jul 2011 07:39:28 +0000 (00:39 -0700)] 
Mark the ASCII decoding functions as deprecated.

Bryan O'Sullivan [Mon, 11 Jul 2011 07:02:34 +0000 (00:02 -0700)] 
Change where we look for test data

Bryan O'Sullivan [Mon, 11 Jul 2011 05:26:28 +0000 (22:26 -0700)] 
Update

Bryan O'Sullivan [Sun, 10 Jul 2011 21:03:26 +0000 (14:03 -0700)] 
Portable native UTF-8 decoder gives 3.7x faster decoding

This code is derived from Björn Höhrmann's UTF-8 decoder.  Compared
to the original Haskell decoder from cac7dbcbc392, it's between
2.17 and 3.68 times faster.  It's even between 1.18 and 3.58 times
faster than the improved Haskell decoder from 71ead801296a.

The x86-specific decoding path gives a substantial win for entirely
and partly ASCII text, e.g. HTML and XML, at the cost of being about
17% slower than the portable C decoder for entirely non-ASCII text.

Bryan O'Sullivan [Sun, 10 Jul 2011 20:40:58 +0000 (13:40 -0700)] 
Merge

Bryan O'Sullivan [Sun, 10 Jul 2011 20:33:26 +0000 (13:33 -0700)] 
Transplant UTF-8 decoding benchmarks as of 44d20dca8f35

Bryan O'Sullivan [Sun, 10 Jul 2011 20:20:16 +0000 (13:20 -0700)] 
Add Chinese HTML to decode benchmark

Bryan O'Sullivan [Sun, 10 Jul 2011 20:10:02 +0000 (13:10 -0700)] 
Allow decoding of multiple files when benchmarking

Bryan O'Sullivan [Fri, 8 Jul 2011 20:33:18 +0000 (13:33 -0700)] 
Benchmark the performance of iconv.

On my Mac, it takes 33ms, vs about 20ms for the Haskell code.

Bryan O'Sullivan [Fri, 8 Jul 2011 17:59:28 +0000 (10:59 -0700)] 
Bump version

Bryan O'Sullivan [Fri, 8 Jul 2011 07:18:18 +0000 (00:18 -0700)] 
Merge

Bryan O'Sullivan [Fri, 8 Jul 2011 06:47:34 +0000 (23:47 -0700)] 
Speed up UTF-8 decoding by a little over 2x

The previous code was more concise, but alas GHC boxed each Word8
it read from the ByteString, which resulted in poor performance.

This mankier code adds (seemingly required) strictness annotations,
along with a little bit of manual CSE.

Timing of the DecodeUtf8/Strict benchmark went from 41.8ms to 19.6ms,
a pleasing improvement.

Bryan O'Sullivan [Wed, 29 Jun 2011 05:49:47 +0000 (22:49 -0700)] 
Added tag 0.11.1.3 for changeset b75d3041d275

Bryan O'Sullivan [Wed, 29 Jun 2011 02:45:32 +0000 (19:45 -0700)] 
Bump version

Bryan O'Sullivan [Wed, 29 Jun 2011 02:44:25 +0000 (19:44 -0700)] 
Merge

Bryan O'Sullivan [Tue, 28 Jun 2011 01:45:16 +0000 (18:45 -0700)] 
Oh noes!  I was miscalculating the initial buffer size!

When performance testing encodeUtf8, I noticed that for some reason I
was still seeing "ensure" show up in the profile, when I expected it
shouldn't have been.

Turns out I was using a "min" where I should have been using a "max",
and thus allocating an initial bytestring that would almost always be
too small, thus forcing reallocations and copying. Boo!
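
A small model of why "min" versus "max" matters here. Everything in it is an assumption for illustration: the constant floorSize, the doubling growth policy, and the function names are invented and are not taken from encodeUtf8.

```haskell
-- | Count how many times a buffer that doubles when full has to be
-- reallocated before it can hold 'needed' bytes.
reallocs :: Int -> Int -> Int
reallocs initial needed = go (max 1 initial) 0
  where
    go cap n
      | cap >= needed = n
      | otherwise     = go (cap * 2) (n + 1)

-- | Hypothetical lower bound on the initial buffer size.
floorSize :: Int
floorSize = 4

withMin, withMax :: Int -> Int
withMin len = min floorSize len   -- the mistake: never larger than floorSize
withMax len = max floorSize len   -- the fix: never smaller than floorSize

main :: IO ()
main = do
  print (reallocs (withMin 1000) 1000)
  print (reallocs (withMax 1000) 1000)
```

Under these assumptions, a 1000-byte input costs eight reallocations with "min" and none with "max".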

Bryan O'Sullivan [Tue, 28 Jun 2011 00:42:36 +0000 (17:42 -0700)] 
Eliminate unnecessary resizes from encodeUtf8.

We had been performing a resize any time that (a) we had data to write
and (b) we got to within 4 bytes of filling the target bytestring.
This was safe, but suboptimal, as it meant that in the common case of
encoding ASCII text, we would *always* perform a resize.

Now, we check the exact number of bytes we need to fit, and resize
only if they won't fit.  This eliminates resizes for ASCII data, and
makes them a little less likely for other data.
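
The two resize policies described above can be sketched as follows. This is a model over String, not the library's code: the function names, the initial capacity equal to the character count, and the doubling on resize are all assumptions made for illustration.

```haskell
import Data.Char (ord)

-- | Number of bytes a code point occupies in UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- | Old policy: resize whenever fewer than 4 bytes remain free.
resizeCautious :: Int -> Int -> Char -> Bool
resizeCautious cap used _ = used + 4 > cap

-- | New policy: resize only if this character will not fit.
resizeExact :: Int -> Int -> Char -> Bool
resizeExact cap used c = used + utf8Width c > cap

-- | Count resizes while "encoding" a string into a buffer whose
-- initial capacity is the character count and which doubles on resize.
countResizes :: (Int -> Int -> Char -> Bool) -> String -> Int
countResizes needsResize s = go (length s) 0 0 s
  where
    go _ _ n [] = n
    go cap used n (c:cs)
      | needsResize cap used c = go (cap * 2) used (n + 1) (c:cs)
      | otherwise              = go cap (used + utf8Width c) n cs

main :: IO ()
main = do
  print (countResizes resizeCautious "hello")
  print (countResizes resizeExact "hello")
```

For ASCII input the cautious policy resizes once near the end of the buffer even though every byte would have fit, while the exact policy never resizes.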

Bryan O'Sullivan [Mon, 27 Jun 2011 07:27:54 +0000 (00:27 -0700)] 
Added tag 0.11.1.2 for changeset ed3a60ec627a

Bryan O'Sullivan [Mon, 27 Jun 2011 07:21:08 +0000 (00:21 -0700)] 
Switch to native code for copying and comparison.

--HG--
rename : Data/Text/Unsafe.hs => Data/Text/Unsafe/Base.hs

Bryan O'Sullivan [Mon, 27 Jun 2011 06:56:26 +0000 (23:56 -0700)] 
Ignore more

Bryan O'Sullivan [Mon, 27 Jun 2011 04:58:05 +0000 (21:58 -0700)] 
Merge

Bryan O'Sullivan [Mon, 27 Jun 2011 04:57:50 +0000 (21:57 -0700)] 
Merge

Bryan O'Sullivan [Mon, 27 Jun 2011 04:57:41 +0000 (21:57 -0700)] 
Merge

Bryan O'Sullivan [Thu, 23 Jun 2011 20:43:13 +0000 (13:43 -0700)] 
Merge

Bryan O'Sullivan [Thu, 23 Jun 2011 20:43:05 +0000 (13:43 -0700)] 
Merge

Bryan O'Sullivan [Thu, 23 Jun 2011 20:41:53 +0000 (13:41 -0700)] 
Merge pull request #6 from jaspervdj/tests

Port tests to cabal based infrastructure

GitHub Merge Button [Thu, 23 Jun 2011 20:41:43 +0000 (13:41 -0700)] 
Merge 420d46b4289a8802c82e75828e701e7111d35a7b into f23938f81ec4b912f5f7822fef01a4a6b0133fcf

Bryan O'Sullivan [Thu, 23 Jun 2011 20:40:56 +0000 (13:40 -0700)] 
Merge pull request #5 from jaspervdj/master

Further work on benchmarks

GitHub Merge Button [Thu, 23 Jun 2011 20:40:47 +0000 (13:40 -0700)] 
Merge 7d61b058190922a4181f00cd82d2a34ffdd5e762 into 419ee9be61a89cc45f77831a1e4115f47f20f7dd

Jasper Van der Jeugt [Thu, 23 Jun 2011 08:28:29 +0000 (10:28 +0200)] 
Increase test coverage a little

Jasper Van der Jeugt [Wed, 22 Jun 2011 13:04:56 +0000 (15:04 +0200)] 
Remove old tests, fix README

Jasper Van der Jeugt [Wed, 22 Jun 2011 12:30:46 +0000 (14:30 +0200)] 
Port Makefile/script to generate coverage reports

Jasper Van der Jeugt [Tue, 21 Jun 2011 10:48:10 +0000 (12:48 +0200)] 
Add regressions in cabal tests

Jasper Van der Jeugt [Tue, 21 Jun 2011 10:02:47 +0000 (12:02 +0200)] 
Move more utility functions away from Properties

Jasper Van der Jeugt [Tue, 21 Jun 2011 09:18:37 +0000 (11:18 +0200)] 
Move =^= to TestUtils

Jasper Van der Jeugt [Tue, 21 Jun 2011 09:02:45 +0000 (11:02 +0200)] 
Cabal target for the IO coverage tests

Jasper Van der Jeugt [Tue, 21 Jun 2011 08:55:43 +0000 (10:55 +0200)] 
Separate module for main function

Jasper Van der Jeugt [Mon, 20 Jun 2011 13:51:33 +0000 (15:51 +0200)] 
Try a cabal file for tests management

Jasper Van der Jeugt [Thu, 16 Jun 2011 14:23:34 +0000 (07:23 -0700)] 
Merge pull request #2 from nudded/patch-2

Ruby fold benchmark: Added idiomatic way of dividing array into parts.

Jasper Van der Jeugt [Thu, 16 Jun 2011 14:22:56 +0000 (07:22 -0700)] 
Merge pull request #1 from nudded/patch-1

Ruby fold benchmark: Even better now.