cmp function for words is broken #8232

seblabbe · 2010-02-10T16:03:32Z

As discussed on sage-combinat-devel, cmp is broken for words.

Amusingly, this boils down to:

    sage: W = Words(['a','b','c'])
    sage: W('a') == W([])
    True
    sage: W([]) == W('a')
    False
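The session above shows equality failing to be symmetric. As a minimal sketch in plain Python (not Sage code, and not the actual Word_class implementation), the hypothetical function `buggy_eq` below shows one way such an asymmetry can arise: it only walks the letters of the second operand and never checks whether the first still has letters left. `symmetric_eq` and the sentinel `_END` are likewise names made up for this illustration.

```python
_END = object()  # hypothetical sentinel marking an exhausted iterator


def buggy_eq(self_letters, other_letters):
    """Hypothetical buggy equality: only walks the letters of other_letters."""
    self_it = iter(self_letters)
    for other_letter in other_letters:
        try:
            self_letter = next(self_it)
        except StopIteration:
            return False  # self ran out of letters first
        if self_letter != other_letter:
            return False
    # Bug: self_it may still hold letters, but this is never checked.
    return True


def symmetric_eq(self_letters, other_letters):
    """Equality that requires both sequences to end at the same position."""
    self_it, other_it = iter(self_letters), iter(other_letters)
    while True:
        self_letter = next(self_it, _END)
        other_letter = next(other_it, _END)
        if self_letter is _END or other_letter is _END:
            # Equal only if both iterators are exhausted together.
            return self_letter is other_letter
        if self_letter != other_letter:
            return False


print(buggy_eq("a", ""), buggy_eq("", "a"))          # True False
print(symmetric_eq("a", ""), symmetric_eq("", "a"))  # False False
```

Whatever the real cause was, a fixed comparison has to give the same answer for both operand orders, which is what the second function checks for.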

It causes problems elsewhere:

    sage: A = AlgebrasWithBasis(QQ).example(); A
    An example of an algebra with basis: the free algebra on the
    generators ('a', 'b', 'c') over Rational Field
    sage: [a,b,c] = A.algebra_generators()
    sage: a.is_one()
    True
    sage: b.is_one()
    True
    sage: c.is_one()
    True
    sage: A.one().is_one()
    True
    sage: (a+b).is_one()
    False
    sage: (a+A.one()).is_one()
    False

CC: @sagetrac-abmasse

Component: combinatorics

Author: Sébastien Labbé

Reviewer: Alexandre Blondin Massé

Merged: sage-4.3.3.alpha1

Issue created by migration from https://trac.sagemath.org/ticket/8232

seblabbe · 2010-02-10T17:42:00Z

comment:1

I just applied a patch which does the following things.

Fixed __cmp__ for Word_class, which was broken.
Removed __cmp__ from FiniteWord_class, since the same method in Word_class does the job anyway, and in a cleaner way: it doesn't use the (useless?) coerce function. Surprisingly, removing it makes comparison faster:

BEFORE:

    sage: w = Word([0]*10000)
    sage: z = Word([0]*10000, alphabet=[0,1])
    sage: type(w)
    <class 'sage.combinat.words.word.FiniteWord_list'>
    sage: type(z)
    <class 'sage.combinat.words.word.FiniteWord_list'>
    sage: %timeit w.__cmp__(w)
    125 loops, best of 3: 3.79 ms per loop
    sage: %timeit w.__cmp__(z)
    25 loops, best of 3: 13.3 ms per loop
    sage: %timeit z.__cmp__(w)
    5 loops, best of 3: 50.1 ms per loop
    sage: %timeit z.__cmp__(z)
    25 loops, best of 3: 35.7 ms per loop


AFTER:

    sage: w = Word([0]*10000)
    sage: z = Word([0]*10000, alphabet=[0,1])
    sage: type(w)
    <class 'sage.combinat.words.word.FiniteWord_list'>
    sage: type(z)
    <class 'sage.combinat.words.word.FiniteWord_list'>
    sage: %timeit w.__cmp__(w)
    125 loops, best of 3: 3.89 ms per loop
    sage: %timeit w.__cmp__(z)
    125 loops, best of 3: 5.4 ms per loop
    sage: %timeit z.__cmp__(w)
    25 loops, best of 3: 35.9 ms per loop
    sage: %timeit z.__cmp__(z)
    25 loops, best of 3: 35.7 ms per loop

NOTE: The difference between w and z above is that the parent of w is the alphabet of all Python objects, so its letters are compared with Python's cmp, whereas z compares its letters according to the order defined by its parent (here 0 < 1, but one could also say 1 < 0), which is slower.
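To illustrate the note above, here is a minimal sketch in plain Python of comparing letters according to an order given by an alphabet rather than by Python's default ordering. The helper `make_letter_cmp` and its lookup table are assumptions made for this example; they are not how Sage implements the parent's ordering, and only show that an alphabet-defined order involves extra work per letter.

```python
def make_letter_cmp(alphabet):
    """Return a cmp-style function ordering letters by position in alphabet."""
    # Hypothetical lookup table: letter -> rank in the alphabet.
    rank = {letter: i for i, letter in enumerate(alphabet)}

    def letter_cmp(a, b):
        # Each comparison needs two lookups before comparing the ranks.
        ra, rb = rank[a], rank[b]
        return (ra > rb) - (ra < rb)

    return letter_cmp


natural = make_letter_cmp([0, 1])   # alphabet order 0 < 1
reverse = make_letter_cmp([1, 0])   # alphabet order 1 < 0

print(natural(0, 1))  # -1
print(reverse(0, 1))  # 1
```

The same pair of letters compares differently depending on the alphabet, which is the behaviour described for z.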

The broken __cmp__ was hiding a bug in longest_common_prefix. Indeed, a doctest was passing when it wasn't supposed to:

BEFORE:

    sage: w = Word('12345')
    sage: w.longest_common_prefix(Word())
    word: 1

AFTER:

    sage: w = Word('12345')
    sage: w.longest_common_prefix(Word())
    word:
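For reference, the expected behaviour can be sketched in plain Python on ordinary sequences. This is only an illustration of the idea and not the Sage method; the function below returns a list of letters rather than a word.

```python
def longest_common_prefix(first, second):
    """Return the longest common prefix of two sequences as a list of letters."""
    prefix = []
    # zip stops at the shorter sequence, so an empty operand gives an empty prefix.
    for a, b in zip(first, second):
        if a != b:
            break
        prefix.append(a)
    return prefix


print(longest_common_prefix("12345", ""))      # []
print(longest_common_prefix("12345", "1299"))  # ['1', '2']
```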

seblabbe · 2010-02-10T17:55:07Z

Depends on #8186

seblabbe · 2010-02-10T17:55:29Z

comment:2

Attachment: trac_8232_word_cmp_bug-sl.patch.gz

sagetrac-abmasse · 2010-02-16T11:01:35Z

comment:3

Hi, Sébastien!

I finally got some time to look at your patch, and everything seems fine: the code makes sense, the documentation builds without warnings, and the bugs mentioned in the description are fixed.

The only observation I would make is that it seems costly to use all those try/except blocks in the __cmp__(...) function. Don't you think it might be better to use the izip_longest function of the itertools library, which pads the shorter iterator with a special fill value? This way, you would only have to check whether that value appears in self_it or in other_it to decide which word is smaller w.r.t. the lexicographic order.
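The suggestion above can be sketched as follows in plain Python 3, where izip_longest is named `zip_longest`. This is not the code of the patch: `lex_cmp` and `_MISSING` are hypothetical names, and the fill value is assumed to be a fresh `object()` tested by identity, so it cannot coincide with any letter of either word. Whether this is faster than try/except blocks is not measured here.

```python
from itertools import zip_longest  # called izip_longest in Python 2

_MISSING = object()  # hypothetical fill value, distinct from every letter


def lex_cmp(self_letters, other_letters):
    """cmp-style lexicographic comparison: returns -1, 0 or 1."""
    for a, b in zip_longest(self_letters, other_letters, fillvalue=_MISSING):
        if a is _MISSING:
            return -1  # self is a proper prefix of other
        if b is _MISSING:
            return 1   # other is a proper prefix of self
        if a != b:
            return -1 if a < b else 1
    return 0  # same letters and same length


print(lex_cmp("a", ""))    # 1
print(lex_cmp("", "a"))    # -1
print(lex_cmp("ab", "ab")) # 0
```

The letters are compared here with Python's own `<`; an order defined by a parent alphabet would need its own letter comparison in place of that line.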

sagetrac-abmasse · 2010-02-16T23:35:16Z

comment:4

Never mind my last observation: it seems more complicated to use izip_longest, since you have to choose a fill character different from those occurring in the compared words... and no clean way comes to mind, since the letters of a word can be any object.

Anyway, the goal of the patch is reached: the documentation builds correctly, all tests pass, and the bugs are fixed.

Positive review!

sagetrac-abmasse · 2010-02-16T23:35:16Z

Reviewer: Alexandre Blondin Massé

sagetrac-abmasse · 2010-02-16T23:35:16Z

Author: Sébastien Labbé

sagetrac-mvngu · 2010-02-17T20:38:19Z

Merged: sage-4.3.3.alpha1

seblabbe added this to the sage-4.3.3 milestone Feb 10, 2010

seblabbe added c: combinatorics labels Feb 10, 2010

seblabbe assigned sagetrac-sage-combinat Feb 10, 2010

seblabbe added the s: needs review label Feb 10, 2010

sagetrac-abmasse mannequin added s: positive review and removed s: needs review labels Feb 16, 2010

sagetrac-mvngu mannequin removed the s: positive review label Feb 17, 2010

sagetrac-mvngu mannequin closed this as completed Feb 17, 2010

qed777 mannequin mentioned this issue Feb 11, 2010

Kazhdan-Lusztig polynomials, Bruhat order, and related features #7751

Closed

sagetrac-abmasse mannequin mentioned this issue Mar 8, 2010

Split word.py file into 4 files #8429

Closed
