Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'sage-location' shouldn't "initialize" .pc (pkg-config) files more than once #11760

Closed
nexttime mannequin opened this issue Aug 30, 2011 · 36 comments
Closed

'sage-location' shouldn't "initialize" .pc (pkg-config) files more than once #11760

nexttime mannequin opened this issue Aug 30, 2011 · 36 comments

Comments

@nexttime
Copy link
Mannequin

nexttime mannequin commented Aug 30, 2011

The logic of "initializing" .pc files exactly once is currently broken, which at the moment "only" causes trouble with libpng if pkg-config is installed.

The main reason is that sage-location doesn't really check whether a .pc file already contains a definition of SAGE_ROOT, and unconditionally replaces any occurrences of $SAGE_ROOT (i.e., its current value) by ${SAGE_ROOT}, then adding a (in the case of update_pkgconfig_files(), replacing the first) line defining SAGE_ROOT to have the current value.

The specific problem with libpng (see e.g. #11686, which was originally intended to just again fix a completely unrelated issue with the matplotlib spkg) arises since it installs both a libpng12.pc file and a symbolic link libpng.pc, the latter pointing to the former. sage-location now "initializes" all of Sage's .pc files, not checking whether a .pc file is just a symbolic link to another one, nor checking whether a file already got modified as mentioned above.

Note that the "initialization" doesn't take place during a build, but when Sage is started for the first time, also when e.g. running tests, such that problems with modified .pc files occur later, when one tries to (re)install some package(s), which includes upgrading Sage.

In the long run, sage-location shouldn't (have to) deal with .pc files at all, but we IMHO need a quick fix until we have a better, more general solution to all issues with pkg-config. (There are a couple of other issues; cf. for example #10202, #9668, #11687, #11681.) Therefore this ticket isn't intended to fix any of the other issues.


Apply

  1. attachment: trac_11760-avoid_multiple_defs_of_SAGE_ROOT.scripts.patch
  2. attachment: trac_11760-additional_improvements_to_pkgconfig_funs.optional.scripts.patch
    to the Sage scripts repository.

Component: scripts

Keywords: pkgconfig libpng Duplicate definition SAGE_ROOT

Author: Leif Leonhardy

Reviewer: John Palmieri

Merged: sage-4.7.2.alpha4

Issue created by migration from https://trac.sagemath.org/ticket/11760

@nexttime nexttime mannequin added this to the sage-4.7.2 milestone Aug 30, 2011
@nexttime nexttime mannequin added c: scripts labels Aug 30, 2011
@nexttime

This comment has been minimized.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Aug 30, 2011

Changed keywords from pkgconfig libpng to pkgconfig libpng Duplicate definition SAGE_ROOT

@jdemeyer
Copy link
Contributor

jdemeyer commented Oct 9, 2011

comment:3

leif, this ticket is listed as blocker. Any ideas for a quick fix?

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

SCRIPTS repo. Based on Sage 4.7.2.alpha3.

@nexttime

This comment has been minimized.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

Author: Leif Leonhardy

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:4

Attachment: trac_11760-avoid_multiple_defs_of_SAGE_ROOT.scripts.patch.gz

Couldn't resist to also do some clean-up.

Actual (trivial) fix starting on line 178.

@nexttime nexttime mannequin added the s: needs review label Oct 9, 2011
@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:5

P.S.:

The fix doesn't cure already broken installations (i.e., .pc files already containing multiple definitions, like usually libpng[12].pc).

To test the patch, you can change the contents of or remove the file $SAGE_ROOT/local/lib/sage-current-location.txt (perhaps first removing the line SAGE_ROOT=${SAGE_ROOT} from libpng12.pc). The former triggers update_pkgconfig_files(), the latter initialize_pkgconfig_files().

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:6

Replying to @nexttime:

The fix doesn't cure already broken installations (i.e., .pc files already containing multiple definitions, like usually libpng[12].pc).

I could more or less easily also add that, but it wouldn't help much (on e.g. upgrades) since neither update_pkgconfig_files() nor initialize_pkgconfig_files() is called later, or more precisely, before we run into potential errors due to multiple definitions.

@jhpalmieri
Copy link
Member

comment:7

In lines 178-181:

            if re.search("^SAGE_ROOT=", config, re.MULTILINE): 
	                # There's already a definition of SAGE_ROOT, 
	                # so skip this file. (Cf. #11760). 
	                continue 

what if the path to SAGE_ROOT has changed? Would it be better to delete this first line (containing SAGE_ROOT=...) and then proceed with the rest of the code in the loop? I guess that these files shouldn't have SAGE_ROOT in them already at this point, but just in case?

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:8

Replying to @jhpalmieri:

In lines 178-181:

            if re.search("^SAGE_ROOT=", config, re.MULTILINE): 
	        # There's already a definition of SAGE_ROOT, 
	        # so skip this file. (Cf. #11760). 
	        continue 

what if the path to SAGE_ROOT has changed?

This is initialize_pkgconfig_files(), not update_...().

Would it be better to delete this first line (containing SAGE_ROOT=...) and then proceed with the rest of the code in the loop? I guess that these files shouldn't have SAGE_ROOT in them already at this point, but just in case?

As mentioned in the description, I was aiming at a quick fix which solves the issue we have, not at curing arbitrary corrupted pkg-config files.

Also, initialize_pkgconfig_files() is only called once (unless one messes around with Sage's files as noted for testing the patch).

Nevertheless, any wrong definition of SAGE_ROOT present during initialize_pkgconfig_files() will be corrected in the subsequent call to update_pkgconfig_files().

Since sage-location won't have to deal with .pc files in the long term anyway, it doesn't make sense to implement sophisticated consistency checks and corrections there.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:9

Replying to @nexttime:

Nevertheless, any wrong definition of SAGE_ROOT present during initialize_pkgconfig_files() will be corrected in the subsequent call to update_pkgconfig_files().

Ah, I see. update_pkgconfig_files() is not called in the same invocation of sage-location if initialize_pkgconfig_files() was called, since install_moved() in this case returns False.

So we could in principle add

diff --git a/sage-location b/sage-location
--- a/sage-location
+++ b/sage-location
@@ -68,6 +68,7 @@
         write_location_file()
         update_library_files()
         initialize_pkgconfig_files()
+        update_pkgconfig_files()
         return False
     elif path != SAGE_ROOT:
         # location moved

but that's IMHO really a minor issue since that situation doesn't occur unless the user damages his pkg-config files.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:10

John, would the following make you happy?

diff --git a/sage-location b/sage-location
--- a/sage-location
+++ b/sage-location
@@ -167,6 +167,7 @@
     the Sage installation has moved, i.e., paths have changed.
     """
     import re
+    pat = re.compile(r"^SAGE_ROOT=.*\n", re.MULTILINE)
     LIB = os.path.join(os.path.abspath(SAGE_ROOT), 'local', 'lib')
     PKG = os.path.join(LIB, 'pkgconfig')
     for name in os.listdir(PKG):
@@ -175,10 +176,23 @@
             with open(filename) as file:
                 config = file.read()
 
-            if re.search("^SAGE_ROOT=", config, re.MULTILINE):
-                # There's already a definition of SAGE_ROOT,
-                # so skip this file. (Cf. #11760).
-                continue
+            first_match = pat.search(config)
+            if first_match:
+                # There are already one or more definitions of SAGE_ROOT,
+                # which also happens if we previously processed the same
+                # file (here) because of [symbolic or hard] links, so this
+                # isn't necessarily an error. (Cf. #11760).
+                # For simplicity, we just remove any definition, and only
+                # issue a warning if there was more than one, and don't
+                # check whether a previous definition was already current:
+                config, n = pat.subn("", config)
+                if n>1:
+                    sys.stderr.write(
+                        "Warning: sage_location: initialize_pkgconfig_files():\n" +
+                        "  File \"%s\" already contained %d definitions of SAGE_ROOT.\n" %
+                            (name, n) +
+                        "  These will get replaced by a single, current definition.\n")
+                    sys.stderr.flush()
 
             new_config = config.replace(os.path.abspath(SAGE_ROOT), "${SAGE_ROOT}")
             new_config = 'SAGE_ROOT=%s\n' % os.path.abspath(SAGE_ROOT) + new_config

If so, I'll just attach that patch, to be applied on top of the other one.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:11

I could of course remove the temporary variable first_match; it's only there for documentation purposes. ;-)

I.e.,

            if pat.search(config):
                ...

would be sufficient.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:12

P.S.: We could do similar in update_pkgconfig_files().

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:13

Replying to @nexttime:

P.S.: We could do similar in update_pkgconfig_files().

Like this, which IMHO is also somehow a bit cleaner:

diff --git a/sage-location b/sage-location
--- a/sage-location
+++ b/sage-location
@@ -194,16 +207,17 @@
     ``initialize_pkgconfig_files()``, in particular, each of them contains a
     definition of the variable ``SAGE_ROOT``, since only that is updated here.
     """
+    import re
+    pat = re.compile(r"^SAGE_ROOT=.*\n", re.MULTILINE)
     LIB = os.path.join(os.path.abspath(SAGE_ROOT), 'local', 'lib')
     PKG = os.path.join(LIB, 'pkgconfig')
     for name in os.listdir(PKG):
-        filename = os.path.join(PKG,name)
+        filename = os.path.join(PKG, name)
         if os.path.splitext(filename)[1]==".pc":
             with open(filename) as file:
                 config = file.read()
 
-            prefix_start = config.find('SAGE_ROOT=')
-            if prefix_start==-1:
+            if not pat.search(config):
                 # This should never happen, unless the user modified the file.
                 sys.stderr.write(
                     "Error: sage_location: update_pkgconfig_files():\n" +
@@ -211,12 +225,16 @@
                     name + "  Skipping it...\n")
                 sys.stderr.flush()
                 continue
+                # We could of course call initialize_pkgconfig_file(filename)
+                # here instead to fix this issue, but unfortunately there's no
+                # such function (for a single file) [yet].
 
-            prefix_end = config.find('\n', prefix_start)
-            new_prefix = 'SAGE_ROOT=%s' % os.path.abspath(SAGE_ROOT)
-            new_config = config[:prefix_start] + new_prefix + config[prefix_end:]
+            # Delete all previous definitions of SAGE_ROOT:
+            config = pat.sub("", config)
+
+            definition = "SAGE_ROOT=%s\n" % os.path.abspath(SAGE_ROOT)
             with open(filename, 'w') as file:
-                file.write(new_config)
+                file.write(definition + config)
 
 
 def remove_files(path, remove_extensions):

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:14

Ok, I've further improved and normalized both functions...

Just wonder whether using use_pat.sub("${SAGE_ROOT}", config) (where use_pat is a compiled pattern matching os.path.abspath(SAGE_ROOT)) would be faster than config.replace(SAGE_ROOT_absolute, "${SAGE_ROOT}").

(I've factored out os.path.abspath(SAGE_ROOT) since it was not only used more than once, but also inside the loops. 8/ shudder)

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

Attachment: trac_11760-additional_improvements_to_pkgconfig_funs.optional.scripts.patch.gz

SCRIPTS repo. Optional improvements to {initialize,update}_pkgconfig_files(). Apply on top of other patch..

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

comment:15

Ok, I've attached another, optional patch, to be applied on top of the other one.

@nexttime

This comment has been minimized.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 9, 2011

Attachment: trac_11760-cumulative_diff_of_both_patches.diff.gz

Cumulative diff of both patches against Sage 4.7.2.alpha3. For review only.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 10, 2011

comment:16

Note that (as I already mentioned elsewhere), on a follow-up or one of the other tickets, we have to (re-)initialize .pc files whenever a package [providing a .pc file] gets (re)installed (preferably at the end of sage-spkg, just like sage-make_relative is called), since that's very likely to overwrite our "once" modified / initialized pkg-config files, until all spkgs fix their own .pc files in their spkg-install files (or just call a Sage script / shell function that does that for them from there).

This ticket just fixes the worst cases, and especially the one that currently always occurs (since more than a year by the way), namely duplicate definitions of SAGE_ROOT in a .pc file.

If sh*t happens during reinstallation of one or more packages, with the patches here, the user will now be able to repair all pkg-config files by simply deleting $SAGE_ROOT/local/lib/sage-current-location.txt, and afterwards running Sage (e.g. just ./sage -c quit), without having to check and manually edit any pkg-config files, which is certainly not perfect, but an improvement.

@jhpalmieri
Copy link
Member

comment:17

Regarding the comment

Will have weird effects in case SAGE_ROOT_absolute contains other characters having special
meaning in regular expressions. 

You could use re.escape to deal with this issue.

In update_pkgconfig_files, the code

# Delete all previous definitions of SAGE_ROOT:
config = def_pat.sub("", config)

The sub method only replaces one definition, not all of them. There should be only one at this point anyway, of course. Perhaps change the comment to reflect this?

Otherwise, the combined patch looks pretty good to me. I still have to actually test it...

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 10, 2011

comment:18

Replying to @jhpalmieri:

You could use re.escape to deal with this issue.

Yep, but I haven't tested [yet] whether using sub() is faster than str.replace().

The sub method only replaces one definition, not all of them.

Nope. The optional count parameter defaults to 0, which means replace all occurrences.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 10, 2011

comment:19

Replying to @nexttime:

Replying to @jhpalmieri:

You could use re.escape to deal with this issue.

Yep, but I haven't tested [yet] whether using sub() is faster than str.replace().

There seems to be no (measurable) difference (10.9--11 vs. 11 ns with %timeit -c ..., and the "largest" .pc file we have). I only tested substituting ${SAGE_ROOT} by ${SAGE_FOO} though, i.e., on an already "initialized" .pc file. Might differ with longer patterns (i.e. paths) to match, and the original file is slightly larger of course.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 10, 2011

comment:20

Btw., with re.escape() typical paths get much longer... 8&

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 10, 2011

comment:21

Looks like IPython's %timeit is broken; I always get 11 ns, regardless of the size of the file / string... XD

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 10, 2011

comment:22

Replying to @nexttime:

Looks like IPython's %timeit is broken; I always get 11 ns, regardless of the size of the file / string... XD

Oh, my bad, I quoted the statement... 8/

Btw., interestingly the timing framework somehow makes the braces special characters, so I have to escape them for the timing tests, but the backslashs also end up in the result... Weird.

@jhpalmieri
Copy link
Member

comment:23

Replying to @nexttime:

Replying to @jhpalmieri:

The sub method only replaces one definition, not all of them.

Nope. The optional count parameter defaults to 0, which means replace all occurrences.

Sorry, I misread the documentation (thought it said "leftmost non-overlapping occurrence" instead of "leftmost non-overlapping occurrences").

Btw., with re.escape() typical paths get much longer... 8&

Yes, but you don't have to ever look at them, do you? Just pass them on to re.

Anyway, things look good to me. In testing, the old version acts badly if, for example, you delete "sage-location.txt", whereas the new version does a good job of cleaning up the resulting mess. I've tried to break the new version in other ways, but not successfully. There may still be holes, but I'm not sure where. In any case, it's an improvement over the previous version.

(If you feel like adding a comment about re.escape, in case someone else works on this, go ahead. No need for further review in that case.)

@jhpalmieri
Copy link
Member

Reviewer: John Palmieri

@jhpalmieri

This comment has been minimized.

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 11, 2011

comment:25

Replying to @jhpalmieri:

Replying to @nexttime:

Btw., with re.escape() typical paths get much longer... 8&

Yes, but you don't have to ever look at them, do you? Just pass them on to re.

Of course, but longer strings take longer to compile... ;-)

(If you feel like adding a comment about re.escape, in case someone else works on this, go ahead. No need for further review in that case.)

It seems sub() is indeed slower than str.replace() (about 7:4 in CPU time), so it's not worth using it there instead. Users won't notice any difference anyway I think; I was just curious. (And it's hardly worth updating the patch or providing a new one just to update the comment on that.)

@nexttime
Copy link
Mannequin Author

nexttime mannequin commented Oct 11, 2011

comment:26

P.S.: Thanks for reviewing this; triple X when it's finally merged and we get rid of this long-lasting issue.

@jdemeyer
Copy link
Contributor

comment:27

Replying to @nexttime:

P.S.: Thanks for reviewing this; triple X when it's finally merged and we get rid of this long-lasting issue.

So where is the party?

@jdemeyer
Copy link
Contributor

Merged: sage-4.7.2.alpha4

@jhpalmieri
Copy link
Member

comment:28

See #12331 for a follow-up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants