Releases: scott-griffiths/bitstring
bitstring.3.1.1
March 21st 2013: version 3.1.1 released
This is a bug fix release.
- Fix for problem where concatenating bitstrings sometimes modified method's arguments
bitstring-3.1.0
February 26th 2013: version 3.1.0 released
This is a minor release with a couple of new features and some bug fixes.
New 'pad' token
This token can be used in reads and when packing/unpacking to indicate that
you don't care about the contents of these bits. Any padding bits will just
be skipped over when reading/unpacking or zero-filled when packing.
>>> a, b = s.readlist('pad:5, uint:3, pad:1, uint:3')
Here only two items are returned in the list - the padding bits are ignored.
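To make the skipping behaviour concrete, here is a minimal pure-Python sketch that works on a plain '0'/'1' string rather than a bitstring object. The read_fields helper is hypothetical and only understands 'uint' and 'pad' tokens; it is not the library's implementation.

```python
def read_fields(bits, fmt):
    """Return the uint values described by fmt, skipping any pad tokens.

    bits is a str of '0'/'1' characters; fmt is a comma-separated list of
    'uint:N' and 'pad:N' tokens. Hypothetical helper for illustration only.
    """
    values = []
    pos = 0
    for token in fmt.split(','):
        name, _, length = token.strip().partition(':')
        n = int(length)
        if pos + n > len(bits):
            raise ValueError('not enough bits left for ' + token.strip())
        chunk = bits[pos:pos + n]
        pos += n
        if name == 'pad':
            continue  # padding is consumed but never returned
        if name != 'uint':
            raise ValueError('unsupported token: ' + name)
        values.append(int(chunk, 2))
    return values


# 5 padding bits, then uint 5, then 1 padding bit, then uint 3
a, b = read_fields('11111' '101' '0' '011', 'pad:5, uint:3, pad:1, uint:3')
print(a, b)  # 5 3
```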
New clear and copy convenience methods
These methods have been introduced in Python 3.3 for lists and bytearrays,
as more obvious ways of clearing and copying, and we mirror that change here.
t = s.copy() is equivalent to t = s[:], and s.clear() is equivalent to del s[:].
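The built-in bytearray shows the same equivalences, which is the behaviour the text says is being mirrored. This sketch uses only the standard library (Python 3.3 or later), not bitstring itself.

```python
s = bytearray(b'\x12\x34')

t = s.copy()                 # same result as s[:]
same_as_slice = (t == s[:])  # True
is_new_object = t is not s   # True

s.clear()                    # same effect as del s[:]
# s is now empty, while the copy t still holds the original two bytes
```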
Other changes
- Some bug fixes.
bitstring-3.0.2b
February 7th 2012: version 3.0.2 released
This is a minor update that fixes a few bugs.
- Fix for subclasses of bitstring classes behaving strangely (Issue 121).
- Fix for excessive memory usage in rare cases (Issue 120).
- Fixes for slicing edge cases.
There has also been a reorganisation of the code to return it to a single
'bitstring.py' file rather than the package that has been used for the past
several releases. This change shouldn't affect users directly.
bitstring-3.0.1
November 21st 2011: version 3.0.1 released
This release fixed a small but very visible bug in bitstring printing.
bitstring-3.0.0
November 21st 2011: version 3.0.0 released
This is a major release which breaks backward compatibility in a few places.
Backwardly incompatible changes
Hex, oct and bin properties don't have leading 0x, 0o and 0b
If you ask for the hex, octal or binary representations of a bitstring then
they will no longer be prefixed with 0x, 0o or 0b. This was done as it
was noticed that the first thing a lot of user code did after getting these
representations was to cut off the first two characters before further
processing.
>>> a = BitArray('0x123')
>>> a.hex, a.oct, a.bin
('123', '0443', '000100100011')
Previously this would have returned ('0x123', '0o0443', '0b000100100011')
This change might require some recoding, but it should all be simplifications.
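For comparison, plain Python integers show the same split between prefixed and unprefixed forms. This is only an analogy using built-ins; the field widths below are chosen by hand to match the 12-bit example.

```python
value = 0x123

# Built-in helpers add a prefix that usually has to be sliced off again.
old_style = hex(value)[2:]

# format() gives zero-padded digits with no prefix, like the new properties.
as_hex = format(value, '03x')
as_oct = format(value, '04o')
as_bin = format(value, '012b')

print(as_hex, as_oct, as_bin)  # 123 0443 000100100011
```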
ConstBitArray renamed to Bits
Previously Bits was an alias for ConstBitStream (for backward compatibility).
This has now changed so that Bits and BitArray loosely correspond to the
built-in types bytes and bytearray.
If you were using streaming/reading methods on a Bits object then you will
have to change it to a ConstBitStream.
The ConstBitArray name is kept as an alias for Bits.
Stepping in slices has conventional meaning
The step parameter in __getitem__, __setitem__ and __delitem__ used to act
as a multiplier for the start and stop parameters. No one seemed to use it
though and so it has now reverted to the conventional meaning for containers.
If you are using step then recoding is simple: s[a:b:c] becomes s[a*c:b*c].
Some examples of the new usage:
>>> s = BitArray('0x0000')
>>> s[::4] = [1, 1, 1, 1]
>>> s.hex
'8888'
>>> del s[8::2]
>>> s.hex
'880'
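The conventional meaning is the one Python lists already use, so the example above can be reproduced with a plain list of bits. The to_hex helper is a hypothetical convenience for this sketch only.

```python
def to_hex(bits):
    """Render a list of 0/1 values (length a multiple of 4) as hex digits."""
    text = ''.join(str(b) for b in bits)
    return format(int(text, 2), '0%dx' % (len(bits) // 4))


s = [0] * 16
s[::4] = [1, 1, 1, 1]   # set every fourth bit
first = to_hex(s)       # '8888'

del s[8::2]             # drop every second bit from position 8 onwards
second = to_hex(s)      # '880'
```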
New features
New readto method
This method is a mix between a find and a read - it searches for a bitstring
and then reads up to and including it. For example:
>>> s = ConstBitStream('0x47000102034704050647')
>>> s.readto('0x47', bytealigned=True)
BitStream('0x47')
>>> s.readto('0x47', bytealigned=True)
BitStream('0x0001020347')
>>> s.readto('0x47', bytealigned=True)
BitStream('0x04050647')
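The find-then-read behaviour can be sketched on a plain bytes object, which corresponds to the byte-aligned case shown above. ByteReader is a hypothetical class written for this illustration and is not part of bitstring.

```python
class ByteReader:
    """Tiny byte-level reader with a position, for illustration only."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def readto(self, pattern):
        """Return everything from the current position up to and
        including the next occurrence of pattern, then move past it."""
        index = self.data.find(pattern, self.pos)
        if index == -1:
            raise ValueError('pattern not found')
        end = index + len(pattern)
        chunk = self.data[self.pos:end]
        self.pos = end
        return chunk


r = ByteReader(bytes.fromhex('47000102034704050647'))
first = r.readto(b'\x47')    # just the leading 0x47
second = r.readto(b'\x47')   # 0x0001020347
third = r.readto(b'\x47')    # 0x04050647
```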
pack function accepts an iterable as its format
Previously only a string was accepted as the format in the pack function.
This was an oversight as it broke the symmetry between pack and unpack.
Now you can use formats like this:
fmt = ['hex:8', 'bin:3']
a = pack(fmt, '47', '001')
a.unpack(fmt)
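One way to get this symmetry is to normalise the format up front so that a comma-separated string and an iterable of tokens take the same path. The sketch below does that on '0'/'1' strings; split_format, pack_bits and unpack_bits are hypothetical names, only 'hex' and 'bin' tokens are handled, and this is not how bitstring itself is written.

```python
def split_format(fmt):
    """Accept 'hex:8, bin:3' or ['hex:8', 'bin:3'] and return a token list."""
    tokens = fmt.split(',') if isinstance(fmt, str) else fmt
    return [token.strip() for token in tokens]


def pack_bits(fmt, *values):
    """Build a '0'/'1' string from values according to fmt."""
    out = []
    for token, value in zip(split_format(fmt), values):
        name, _, length = token.partition(':')
        n = int(length)
        if name == 'hex':
            bits = format(int(value, 16), '0%db' % n)
        elif name == 'bin':
            bits = value.zfill(n)
        else:
            raise ValueError('unsupported token: ' + name)
        if len(bits) != n:
            raise ValueError('value does not fit in ' + token)
        out.append(bits)
    return ''.join(out)


def unpack_bits(bits, fmt):
    """Inverse of pack_bits for the same fmt."""
    values, pos = [], 0
    for token in split_format(fmt):
        name, _, length = token.partition(':')
        n = int(length)
        chunk, pos = bits[pos:pos + n], pos + n
        if name == 'hex':
            values.append(format(int(chunk, 2), '0%dx' % (n // 4)))
        elif name == 'bin':
            values.append(chunk)
        else:
            raise ValueError('unsupported token: ' + name)
    return values


fmt = ['hex:8', 'bin:3']
packed = pack_bits(fmt, '47', '001')    # '01000111001'
round_trip = unpack_bits(packed, fmt)   # ['47', '001']
```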
bitstring-2.2.0
June 18th 2011: version 2.2.0 released
This is a minor upgrade with a couple of new features.
New interleaved exponential-Golomb interpretations
New bit interpretations for interleaved exponential-Golomb (as used in the
Dirac video codec) are supplied via 'uie' and 'sie':
>>> s = BitArray(uie=41)
>>> s.uie
41
>>> s.bin
'0b00010001001'
These are pretty similar to the non-interleaved versions - see the manual
for more details. Credit goes to Paul Sargent for the patch.
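A minimal sketch of the interleaved coding, assuming the Dirac-style layout in which each bit of value+1 after its leading 1 is preceded by a 0 and the code is terminated by a 1. The function names are hypothetical and the code works on '0'/'1' strings, not bitstring objects.

```python
def uie_encode(value):
    """Unsigned interleaved exp-Golomb code for value, as a '0'/'1' string."""
    if value < 0:
        raise ValueError('value must be non-negative')
    tail = bin(value + 1)[3:]   # binary digits of value+1 without the leading 1
    return ''.join('0' + bit for bit in tail) + '1'


def uie_decode(bits):
    """Decode one unsigned interleaved exp-Golomb code from the start of bits."""
    value = 1
    pos = 0
    while bits[pos] == '0':     # a 0 says another data bit follows
        value = (value << 1) | int(bits[pos + 1])
        pos += 2
    return value - 1            # the terminating 1 ends the loop


code = uie_encode(41)
print(code)               # 00010001001
print(uie_decode(code))   # 41
```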
New package-level bytealigned variable
A number of methods take a 'bytealigned' parameter to indicate that they
should only work on byte boundaries (e.g. find, replace, split). Previously
this parameter defaulted to 'False'. Instead it now defaults to
'bitstring.bytealigned', which itself defaults to 'False', but can be changed
to modify the default behaviour of the methods. For example:
>>> a = BitArray('0x00 ff 0f ff')
>>> a.find('0x0f')
(4,) # found first not on a byte boundary
>>> a.find('0x0f', bytealigned=True)
(16,) # forced looking only on byte boundaries
>>> bitstring.bytealigned = True # Change default behaviour
>>> a.find('0x0f')
(16,)
>>> a.find('0x0f', bytealigned=False)
(4,)
If you're only working with bytes then this can help avoid some errors and
save some typing!
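The pattern of a parameter deferring to a module-level setting can be sketched as follows. The find function here is a hypothetical stand-in operating on '0'/'1' strings, and the use of None as the "not given" marker is an assumption of this sketch rather than a statement about bitstring's internals.

```python
bytealigned = False   # module-level default, playing the role of bitstring.bytealigned


def find(bits, pattern, aligned=None):
    """Return (pos,) for the first match of pattern in bits, or ()."""
    if aligned is None:
        aligned = bytealigned          # looked up at call time
    step = 8 if aligned else 1
    for pos in range(0, len(bits) - len(pattern) + 1, step):
        if bits.startswith(pattern, pos):
            return (pos,)
    return ()


a = '00000000' '11111111' '00001111' '11111111'   # 0x00 ff 0f ff
default_hit = find(a, '00001111')                 # (4,)
aligned_hit = find(a, '00001111', aligned=True)   # (16,)

bytealigned = True                                # change the default behaviour
new_default_hit = find(a, '00001111')             # (16,)
explicit_hit = find(a, '00001111', aligned=False) # (4,)
```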
Other changes
- Fix for Python 3.2, correcting for a change to the binascii module.
- Fix for bool initialisation from 0 or 1.
- Efficiency improvements, including interning strategy.
bitstring-2.1.1
February 23rd 2011: version 2.1.1 released
This is a release to fix a couple of bugs that were introduced in 2.1.0.
- Bug fix: Reading using the 'bytes' token had been broken (Issue 102).
- Fixed problem using some methods on ConstBitArrays.
- Better exception handling for tokens missing values.
- Some performance improvements.
bitstring-2.1.0
January 23rd 2011: version 2.1.0 released
New class hierarchy introduced with simpler classes
Previously there were just two classes, the immutable Bits which was the base
class for the mutable BitString class. Both of these classes have the concept
of a bit position, from which reads etc. take place so that the bitstring could
be treated as if it were a file or stream.
Two simpler classes have now been added which are purely bit containers and
don't have a bit position. These are called ConstBitArray and BitArray. As you
can guess the former is an immutable version of the latter.
The other classes have also been renamed to better reflect their capabilities.
Instead of BitString you can use BitStream, and instead of Bits you can use
ConstBitStream. The old names are kept as aliases for backward compatibility.
The class hierarchy is:
        ConstBitArray
           /      \
          /        \
    BitArray   ConstBitStream (formerly Bits)
          \        /
           \      /
           BitStream (formerly BitString)
Other changes
A lot of internal reorganisation has taken place since the previous version,
most of which won't be noticed by the end user. Some things you might see are:
- New package structure. Previous versions have been a single file for the
  module and another for the unit tests. The module is now split into many
  more files so it can't be used just by copying bitstring.py any more.
- To run the unit tests there is now a script called runtests.py in the test
  directory.
- File-based bitstrings are now implemented in terms of an mmap. This should
  be just an implementation detail, but unfortunately for 32-bit versions of
  Python this creates a limit of 4GB on the files that can be used. The
  workaround is either to get a 64-bit Python, or just stick with version 2.0.
- The ConstBitArray and ConstBitStream classes no longer copy byte data when
  a slice or a read takes place, they just take a reference. This is mostly
  a very nice optimisation, but there are occasions where it could have an
  adverse effect. For example, if a very large bitstring is created, a small
  slice taken and the original deleted, the byte data from the large
  bitstring would still be retained in memory.
- Optimisations. Once again this version should be faster than the last.
  The module is still pure Python but some of the reorganisation was to make
  it more feasible to put some of the code into Cython or similar, so
  hopefully more speed will be on the way.
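The reference-instead-of-copy trade-off can be seen with the standard library's memoryview, which behaves in a similar way. This is only an analogy; it says nothing about how the bitstring classes are implemented.

```python
big = bytearray(1000000)        # stand-in for a very large bitstring
small = memoryview(big)[:4]     # a slice that references big, no copy made

big[0] = 0xff
shared = small[0]               # 255: the slice sees the change

# Dropping the name does not free the buffer while the slice is alive.
del big
still_there = len(small.obj)    # 1000000
```

Taking an explicit copy of the small slice (for example bytes(small)) and then releasing the view is the usual way to let the large buffer go.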
bitstring-2.0.3
July 26th 2010: version 2.0.3 released
- Bug fix: Using peek and read for a single bit now returns a new bitstring
  as was intended, rather than the old behaviour of returning a bool.
- Removed HTML docs from source archive - better to use the online version.
bitstring-2.0.2
July 25th 2010: version 2.0.2 released
This is a major release, with a number of backwardly incompatible changes.
The main change is the removal of many methods, all of which have simple
alternatives. Other changes are quite minor but may need some recoding.
There are a few new features, most of which have been made to help the
stream-lining of the API. As always there are performance improvements and
some API changes were made purely with future performance in mind.
The backwardly incompatible changes are:
- Methods removed.
About half of the class methods have been removed from the API. They all have
simple alternatives, so what remains is more powerful and easier to remember.
The removed methods are listed here on the left, with their equivalent
replacements on the right:
s.advancebit()             -> s.pos += 1
s.advancebits(bits)        -> s.pos += bits
s.advancebyte()            -> s.pos += 8
s.advancebytes(bytes)      -> s.pos += 8*bytes
s.allunset([a, b])         -> s.all(False, [a, b])
s.anyunset([a, b])         -> s.any(False, [a, b])
s.delete(bits, pos)        -> del s[pos:pos+bits]
s.peekbit()                -> s.peek(1)
s.peekbitlist(a, b)        -> s.peeklist([a, b])
s.peekbits(bits)           -> s.peek(bits)
s.peekbyte()               -> s.peek(8)
s.peekbytelist(a, b)       -> s.peeklist([8*a, 8*b])
s.peekbytes(bytes)         -> s.peek(8*bytes)
s.readbit()                -> s.read(1)
s.readbitlist(a, b)        -> s.readlist([a, b])
s.readbits(bits)           -> s.read(bits)
s.readbyte()               -> s.read(8)
s.readbytelist(a, b)       -> s.readlist([8*a, 8*b])
s.readbytes(bytes)         -> s.read(8*bytes)
s.retreatbit()             -> s.pos -= 1
s.retreatbits(bits)        -> s.pos -= bits
s.retreatbyte()            -> s.pos -= 8
s.retreatbytes(bytes)      -> s.pos -= 8*bytes
s.reversebytes(start, end) -> s.byteswap(0, start, end)
s.seek(pos)                -> s.pos = pos
s.seekbyte(bytepos)        -> s.bytepos = bytepos
s.slice(start, end, step)  -> s[start:end:step]
s.tell()                   -> s.pos
s.tellbyte()               -> s.bytepos
s.truncateend(bits)        -> del s[-bits:]
s.truncatestart(bits)      -> del s[:bits]
s.unset([a, b])            -> s.set(False, [a, b])
Many of these methods have been deprecated for the last few releases, but
there are some new removals too. Any recoding needed should be quite
straightforward, so while I apologise for the hassle, I had to take the
opportunity to streamline and rationalise what was becoming a bit of an
overblown API.
- set / unset methods combined.
The set/unset methods have been combined in a single method, which now
takes a boolean as its first argument:
s.set([a, b]) -> s.set(1, [a, b])
s.unset([a, b]) -> s.set(0, [a, b])
s.allset([a, b]) -> s.all(1, [a, b])
s.allunset([a, b]) -> s.all(0, [a, b])
s.anyset([a, b]) -> s.any(1, [a, b])
s.anyunset([a, b]) -> s.any(0, [a, b])
- all / any only accept iterables.
The all and any methods (previously called allset, allunset, anyset and
anyunset) no longer accept a single bit position. The recommended way of
testing a single bit is just to index it, for example instead of:
>>> if s.all(True, i):
just use
>>> if s[i]:
If you really want to you can of course use an iterable with a single
element, such as 's.any(False, [i])', but it's clearer just to write
'not s[i]'.
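The shape of the combined interface can be sketched with a small list-backed class. BitList is hypothetical and exists only to show how one value argument replaces the paired set/unset style methods; it is not bitstring's code.

```python
class BitList:
    """Minimal mutable bit container for illustration."""

    def __init__(self, length):
        self.bits = [False] * length

    def set(self, value, positions):
        """Set every listed position to bool(value)."""
        for pos in positions:
            self.bits[pos] = bool(value)

    def all(self, value, positions):
        """True if every listed position equals bool(value)."""
        return all(self.bits[pos] == bool(value) for pos in positions)

    def any(self, value, positions):
        """True if at least one listed position equals bool(value)."""
        return any(self.bits[pos] == bool(value) for pos in positions)

    def __getitem__(self, index):
        return self.bits[index]


s = BitList(8)
s.set(1, [0, 3])                   # was s.set([0, 3])
both_set = s.all(1, [0, 3])        # was s.allset([0, 3])
some_unset = s.any(0, [0, 1])      # was s.anyunset([0, 1])
single = s[3]                      # preferred over s.all(True, [3])
```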
- Exception raised on reading off end of bitstring.
If a read or peek goes beyond the end of the bitstring then a ReadError
will be raised. The previous behaviour was that the rest of the bitstring
would be returned and no exception raised.
- BitStringError renamed to Error.
The base class for errors in the bitstring module is now just Error, so
it will likely appear in your code as bitstring.Error instead of
the rather repetitive bitstring.BitStringError.
- Single bit slices return a bool.
A single index slice (such as s[5]) will now return a bool (i.e. True or
False) rather than a single bit bitstring. This is partly to reflect the
style of the bytearray type, which returns an integer for single items, but
mostly to avoid common errors like:
>>> if s[0]:
... do_something()
While the intent of this code snippet is quite clear (i.e. do_something if
the first bit of s is set) under the old rules s[0] would be true as long
as s wasn't empty. That's because any one-bit bitstring was true as it was a
non-empty container. Under the new rule s[0] is True if s starts with a '1'
bit and False if s starts with a '0' bit.
The change does not affect reads and peeks, so s.peek(1) will still return
a single bit bitstring, which leads on to the next item...
- Empty bitstrings or bitstrings with only zero bits are considered False.
Previously a bitstring was False if it had no elements, otherwise it was True.
This is standard behaviour for containers, but wasn't very useful for a container
of just 0s and 1s. The new behaviour means that the bitstring is False if it
has no 1 bits. This means that code like this:
>>> if s.peek(1):
... do_something()
should work as you'd expect. It also means that Bits(1000), Bits(0x00) and
Bits('uint:12=0') are all also False. If you need to check for the emptiness of
a bitstring then instead check the len property:
if s -> if s.len
if not s -> if not s.len
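The two rules above (a single index gives a bool, and truth depends on whether any bit is set) can be sketched with a small container class. BitBox is hypothetical and uses len() where the text uses the len property; it is meant only to show the difference between "empty" and "all zero".

```python
class BitBox:
    """Immutable bit container built from a '0'/'1' string, for illustration."""

    def __init__(self, bits):
        self.bits = [bit == '1' for bit in bits]

    def __len__(self):
        return len(self.bits)

    def __bool__(self):
        # False for an empty container and for one holding only zero bits.
        return any(self.bits)

    def __getitem__(self, index):
        # Integer indices only in this sketch: a single bit comes back as a bool.
        return self.bits[index]


zeros = BitBox('0000')
truth_of_zeros = bool(zeros)    # False, although it is not empty
length_of_zeros = len(zeros)    # 4, so "if len(zeros)" still detects content
first_bit = BitBox('10')[0]     # True
```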
- Length and offset disallowed for some initialisers.
Previously you could create bitstrings using expressions like:
>>> s = Bits(hex='0xabcde', offset=4, length=13)
This has now been disallowed, and the offset and length parameters may only
be used when initialising with bytes or a file. To replace the old behaviour
you could instead use
>>> s = Bits(hex='0xabcde')[4:17]
- Renamed 'format' parameter to 'fmt'.
Methods with a 'format' parameter have had it renamed to 'fmt', to prevent
hiding the built-in 'format'. Affects methods unpack, read, peek, readlist,
peeklist and byteswap and the pack function.
- Iterables instead of *format accepted for some methods.
This means that for the affected methods (unpack, readlist and peeklist) you
will need to use an iterable to specify multiple items. This is easier to
show than to describe, so instead of
>>> a, b, c = s.readlist('uint:12', 'hex:4', 'bin:7')
you would instead write
>>> a, b, c = s.readlist(['uint:12', 'hex:4', 'bin:7'])
Note that you could still use the single string 'uint:12, hex:4, bin:7' if
you preferred.
- Bool auto-initialisation removed.
You can no longer use True and False to initialise single bit bitstrings.
The reasoning behind this is that as bool is a subclass of int, it really is
bad practice to have Bits(False) be different to Bits(0) and to have Bits(True)
different to Bits(1).
If you have used bool auto-initialisation then you will have to be careful to
replace it as the bools will now be interpreted as ints, so Bits(False) will
be empty (a bitstring of length 0), and Bits(True) will be a single zero bit
(a bitstring of length 1). Sorry for the confusion, but I think this will
prevent bigger problems in the future.
There are a few alternatives for creating a single bit bitstring. My favourite
is to use a list with a single item:
Bits(False) -> Bits([0])
Bits(True) -> Bits([1])
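The underlying Python facts are easy to check, and the built-in bytes type gives a close analogy: an integer argument means "this many zero bytes", and a bool is accepted as that integer. This sketch uses only built-ins and does not call bitstring.

```python
bool_is_int = isinstance(True, int)    # True: bool is a subclass of int
same_values = (True == 1, False == 0)  # (True, True)

# bytes(n) builds n zero bytes, so a bool is treated as a length, not a value.
one_zero_byte = bytes(True)            # same as bytes(1)
empty = bytes(False)                   # same as bytes(0)

# A single-item list states the intended value unambiguously.
one_set_byte = bytes([1])
```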
- New creation from file strategy
Previously if you created a bitstring from a file, either by auto-initialising
with a file object or using the filename parameter, the file would not be read
into memory unless you tried to modify it, at which point the whole file would
be read.
The new behaviour depends on whether you create a Bits or a BitString from the
file. If you create a Bits (which is immutable) then the file will never be
read into memory. This allows very large files to be opened for examination
even if they could never fit in memory.
If however you create a BitString, the whole of the referenced file will be read
to store in memory. If the file is very big this could take a long time, or fail,
but the idea is that in saying you want the mutable BitString you are implicitly
saying that you want to make changes and so (for now) we need to load it into
memory.
The new strategy is a bit more predictable in terms of performance than the old.
The main point to remember is that if you want to open a file and don't plan to
alter the bitstring then use the Bits class rather than BitString.
Just to be clear, in neither case will the contents of the file ever be changed -
if you want to output the modified BitString then use the tofile method, for
example.
- find and rfind return a tuple instead of a bool.
If a find is unsuccessful then an empty tuple is returned (which is False in a
boolean sense) otherwise a single item tuple with the bit position is returned
(which is True in a boolean sense). You shouldn't need to recode unless you
explicitly compared the result of a find to True or False, for example this
snippet doesn't need to be altered:
>>> if s.find('0x23'):
... print(s.bitpos)
but you could now instead use
>>> found = s.find('0x23')
>>> if found:
... print(found[0])
The reason for returning the bit position in a tuple is so that finding at
position zero can still be True - it's the tuple (0,) - whereas not found can
be False - the empty tuple ().
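The reasoning can be checked with a few lines of plain Python. The find function below is a hypothetical stand-in working on '0'/'1' strings, used only to show why a tuple is a better return value than a bare position.

```python
def find(bits, pattern):
    """Return (pos,) if pattern occurs in bits, otherwise ()."""
    pos = bits.find(pattern)
    return (pos,) if pos != -1 else ()


at_start = find('1100', '11')    # (0,)
missing = find('1100', '111')    # ()

# A bare position of 0 would be falsy, but the one-item tuple is truthy.
print(bool(0), bool(at_start), bool(missing))  # False True False
```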
The new features in this release are:
- New count method.
This method just counts the number of 1 or 0 bits in the bitstring.
>>> s = Bits('0x31fff4')
>>> s.count(1)
16
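The figure above can be cross-checked with built-ins by writing the same value out as a 24-bit binary string. This is an independent check, not a use of the count method itself.

```python
value = 0x31fff4
bits = format(value, '024b')   # six hex digits give 24 bits
ones = bits.count('1')
zeros = bits.count('0')
print(ones, zeros)  # 16 8
```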
- read and peek methods accept integers.
The read, readlist, peek and peeklist methods now accept integers as parameters
to mean "read this many bits and return a bitstring". This has allowed a number
of methods to be removed from this release, so for example...