Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Getting error pocket-reader--url-domain: Wrong type argument: arrayp, nil with blank buffer (no entries) #54

Open
deen1 opened this issue Oct 7, 2024 · 19 comments
Assignees
Labels
Milestone

Comments

@deen1
Copy link

deen1 commented Oct 7, 2024

Using Emacs 29.1 on Linux, pocket-reader.el version 0.4-pre.

I hadn't used pocket-reader in about a month or so, and additionally,
had to update the iPadOS version of the app as well, which required
re-signing in to that (I add this detail in case it's important).

I initially tried M-x pocket-reader, and ended up with an empty buffer
with the message wrong-type-argument sequencep 1 appearing. I looked
to see if this was a known problem, and found issue #53.

Since a fix to that issue had been pushed to the latest release on
master, I upgraded pocket-reader via Straight, with the head at commit
7f55668.

After reloading pocket-reader.el, I ran M-x pocket-reader again, and
got a blank buffer again, this time with the message
pocket-reader--url-domain: Wrong type argument: arrayp, nil.

Subsequently, I:

  • Updated pocket-reader's dependencies, via
    straight-pull-package-and-deps

  • Replicated this bug in a minimal config, running emacs -Q and
    loading Straight's bootstrap.el before loading pocket-reader
    alone.

  • Restarted my current Emacs session and re-built/re-compiled
    pocket-reader.

The issue persists.

ADDED

Running M-x pocket-reader after toggling debug-on-error produces the
following:

Debugger entered--Lisp error: (wrong-type-argument arrayp nil)
  replace-regexp-in-string("\\`\\(?:amp\\|www\\)\\." "" nil)
  pocket-reader--url-domain(nil)
  pocket-reader--get-url(#<hash-table eq 2/65 0x157537f010ad>)
  pocket-reader--add-items((... ... ... ... ... ... ... ... ... ... ... ... ... ... ...))
  pocket-reader-search(nil)
  pocket-reader-refresh()
  pocket-reader-mode()
  pocket-reader()
  funcall-interactively(pocket-reader)
  command-execute(pocket-reader record)
  execute-extended-command(nil "pocket-reader" nil)
  funcall-interactively(execute-extended-command nil "pocket-reader" nil)
  command-execute(execute-extended-command)

@alphapapa
Copy link
Owner

Thanks. Please report the value of this option in your configuration: pocket-reader-url-priorities

@alphapapa alphapapa self-assigned this Oct 8, 2024
@alphapapa alphapapa added the bug label Oct 8, 2024
@alphapapa alphapapa added this to the 0.4 milestone Oct 8, 2024
@deen1
Copy link
Author

deen1 commented Oct 8, 2024

The value of pocket-reader-url-priorities is

 (amp_url resolved_url given_url)

@radixm46
Copy link

radixm46 commented Oct 9, 2024

Although I haven’t thoroughly checked the conditions yet, I’ve noticed a similar issue. It seems that when trying to retrieve what might be old archived articles, the returned items sometimes miss fields like time_added, favorite, and *_url, which might be causing the problem.

While I haven’t done detailed testing yet, setting default values for ht-get or ht-get* seems to temporarily prevent the error.

@alphapapa
Copy link
Owner

@radixm46 Thanks. I'm guessing that maybe they cleaned up their backend code a bit, and maybe the values are a bit different, as you noticed, and this old code made some assumptions that no longer hold.

@deen1
Copy link
Author

deen1 commented Dec 3, 2024

Just to add, I had thought that things were too messed up to use pocket-reader in any way. However it turns out this is true of the default/unread view. Hitting F for instance shows me the list of favorite unread articles, and using s I can still search for things, including archived articles. It's just the main view that's broken.

This might help work around the issue: just mark an unread/unarchived article as a "favorite", and then go to pocket-reader and hit F.

@alphapapa
Copy link
Owner

@deen1 @radixm46 I think that should fix it. At least, it should make the article list work again. If an item is still missing a URL as received from the server, a warning should be displayed. Feel free to share such a warning if you see one, in case there's some other change needed to adjust to changes in the API.

@deen1
Copy link
Author

deen1 commented Dec 18, 2024

@alphapapa Thanks for looking into this and providing this patch. Unfortunately it doesn't seem to work, with a similar error as before.

(I did a straight-pull-package and had to resolve some issues with dependencies as a result, which might be the cause).

Now I get the error:

pocket-reader--url-domain: Wrong type argument: char-or-string-p, t

and the *Warnings* buffer pops up with the message

⛔ Warning (pocket-reader): No URLs found for item: #s(hash-table size 65 test eq rehash-size 1.5 rehash-threshold 0.8125 data (status "2" item_id "39438322")).

Doing M-x toggle-debug-on-error and M-x pocket-reader gives (as well as popping the *Warnings* buffer)

Debugger entered--Lisp error: (wrong-type-argument char-or-string-p t)
  url-generic-parse-url(t)
  pocket-reader--url-domain(t)
  pocket-reader--add-items((... ... ... ... ... ... ... ... ... ... ... ... ... ... ...))
  pocket-reader-search(nil)
  pocket-reader-refresh()
  pocket-reader-mode()
  pocket-reader()
  funcall-interactively(pocket-reader)
  command-execute(pocket-reader record)
  execute-extended-command(nil "pocket-reader" nil)
  funcall-interactively(execute-extended-command nil "pocket-reader" nil)
  command-execute(execute-extended-command)

@alphapapa alphapapa reopened this Dec 18, 2024
@alphapapa
Copy link
Owner

@deen1 That is weird. I tested the fix on my system before I pushed it. Something in the results you're getting back from the server must be unusual.

At this point I think there's no alternative but to look at the data, i.e. what's elided in the pocket-reader--add-items((...)) form in the backtrace. You'll need to set edebug-print-level to nil and then generate the backtrace again. Feel free to edit URLs and page titles to protect your privacy if you want; it's probably the structure that's most important.

@deen1
Copy link
Author

deen1 commented Dec 20, 2024

OK, I might be barking up the wrong tree here, but stepping through the backtrace, I noticed that pocket-reader--add-items is fed a list of items, the first of which is an entry that looks like this:

(\39438322 (item_id . "39438322") (status . "2"))

The remaining entries in the list look reasonable, like this:

(\3234488357 (item_id . "3234488357") (favorite . "0") (status . "0") (time_added . "1726653878") (time_updated . "1726653878") (time_read . "0") (time_favorited . "0") (sort_id . 19) (tags) (top_image_url . "https://images.tedium.co/uploads/tedium012021.gif") (resolved_id . "3234488357") (given_url . "https://tedium.co/2021/01/20/on-battery-power-mete...") (given_title . "Duracell PowerCheck History: An Energizer of a Bat...") (resolved_title . "Power Struggle") (resolved_url . "https://tedium.co/2021/01/20/on-battery-power-mete...") (excerpt . "Today in Tedium: The world has changed immeasurabl...") (is_article . "1") (is_index . "0") (has_video . "0") (has_image . "1") (word_count . "2376") (lang . "en") (time_to_read . 11) (listen_duration_estimate . 920) (authors (\76445652 (author_id . "76445652") (name . "Ernie Smith") (url . "https://tedium.co/author/ernie/") (item_id . "3234488357"))) (domain_metadata (name . "tedium.co")) (images (\1 (item_id . "3234488357") (image_id . "1") (src . "https://images.tedium.co/uploads/patreon.png") (width . "0") (height . "0") (credit . "") (caption . "")) (\2 (item_id . "3234488357") (image_id . "2") (src . "https://images.tedium.co/uploads/Energizer-Battery...") (width . "1000") (height . "596") (credit . "") (caption . "")) (\3 (item_id . "3234488357") (image_id . "3") (src . "https://images.tedium.co/uploads/Patent-Battery-Dr...") (width . "1000") (height . "1472") (credit . "") (caption . "")) (\4 (item_id . "3234488357") (image_id . "4") (src . "https://images.tedium.co/uploads/Kodak-Patent-Batt...") (width . "1000") (height . "657") (credit . "") (caption . "")) (\5 (item_id . "3234488357") (image_id . "5") (src . "https://images.tedium.co/uploads/Batteries.jpg") (width . "1000") (height . "662") (credit . "") (caption . ""))) (image (item_id . "3234488357") (src . "https://images.tedium.co/uploads/patreon.png") (width . "0") (height . "0")))

As I described above, the warning from your patch arose for the item I described at the star, with item_id 39438322:

display-warning(pocket-reader 
"No URLs found for item: #s(hash-table size 65 test eq rehash-size 1.5 rehash-threshold 0.8125 data 
(status \"2\" item_id \"39438322\")).")

I suspect this is the source of the error, though I'm not sure.

I don't think I can remove it from the Pocket web interface, as it doesn't seem to exist, and I've tried (ht-remove! pocket-reader-items 39438322) without success. I also tried to "skip" this entry by adding

(unless (eq id 39438322) ...)

to functions like pocket-reader--add-items and others, but that doesn't seem to work.

Do you agree that this rogue "invisible" item is the problem? If so, where in the code could I try to skip or delete it?

EDIT Found a few more of these ghost entries by properly expanding out the top level lisp forms in the backtrace. (I have a very long list of unread pocket articles ...)

(\39438322
(item_id . "39438322")
(status . "2"))
(\296131230
(item_id . "296131230")
(status . "2"))
(\1394583446
(item_id . "1394583446")
(status . "2"))

However, it seems like just the first one alone (39438322) triggers the error. This might have to do with pocket-reader--items-to-tabulated-list-entries or related not being able to handle the "empty" values for date added etc. and being unable to display the tabulated list of entries.

@alphapapa
Copy link
Owner

@deen1 Seems like you're on the right track. I don't know why Pocket would be returning entries without even URLs; that seems like a bug on their end.

If you want to delete those items, those list items should be enough to do it using the pocket-lib-delete function, e.g. (pocket-lib-delete '(\39438322 (item_id . "39438322") (status . "2"))). Whether that's a good idea, I can't say; imagine if there were a temporary bug causing Pocket to not return URLs for those items, in which case deleting them would mean you'd probably never see them again. OTOH if it's some kind of unusual corruption in their system and those entries truly lack URLs and titles, then they're probably useless anyway.

It's a bit disappointing, but I guess we need to handle this case after all, of items lacking URLs and titles.

@deen1
Copy link
Author

deen1 commented Dec 24, 2024

After a lot of tinkering around, I managed to arrive at a configuration where I could access the main pocket-reader view.

One minor point: I might be wrong, but I don't think your patch works as intended. The line

(display-warning 'pocket-reader (format "No URLs found for item: %S." item))

pops up the *Warnings* buffer, but the URL "https://example.com/?error=item-had-no-URL" is not being passed to the function calling pocket-reader--get-url, which is why we still get the error

replace-regexp-in-string("\\`\\(?:amp\\|www\\)\\." "" nil)
pocket-reader--url-domain("No URLs found for item: #s(hash-table size 65 test eq rehash-size 1.5 rehash-threshold 0.8125 data (status \"2\" item_id \"39438322\")).")

But that's just the first thing to work around. With the following tweaks, I got pocket-reader to show my saved articles plus display entries for the "ghost items" mentioned above:

Minimal working fix

Comment out display-warning, otherwise default URL doesn't seem to be passed.

(defun pocket-reader--get-url (item &optional &key first)
  "Return URL for ITEM.
If FIRST is non-nil, return the first URL found, not the best
one.  ITEM should be a hash-table with the appropriate keys, one
of which is chosen as configured by
`pocket-reader-url-priorities'."
  (or (when-let ((prioritized-url
                  (cl-loop for key in pocket-reader-url-priorities
                           for url = (ht-get item key) ; Gets the URL
                           when (s-present? url)
                           return url)))
        (if first
            prioritized-url
          (if-let ((domain (pocket-reader--url-domain prioritized-url))
                   (key (cl-loop for (key . vals) in pocket-reader-domain-url-type-map
                                 when (member domain vals)
                                 return key))
                   (domain-preferred-url (ht-get item key)))
              domain-preferred-url
            prioritized-url)))
      ;; (display-warning 'pocket-reader (format "No URLs found for item: %S." item))
      "https://example.com/?error=item-had-no-URL"))

Provide default values for 'time_added, 'favorite, and one of either 'resolved_url or 'given_url.

(defun pocket-reader--items-to-tabulated-list-entries (items)
  "Convert ITEMS to a list of vectors of lists.
Suitable for `tabulated-list-entries'."
(cl-loop for it being the hash-values of items
        collect (let ((id (string-to-number (ht-get it 'item_id)))
                        (added (pocket-reader--format-timestamp
                                (string-to-number
                                (ht-get it 'time_added "1730000000"))))
                        (favorite (pocket-reader--favorite-string (ht-get it 'favorite "0")))
                        (title (pocket-reader--not-empty-string
                                (pocket-reader--or-string-not-blank
                                (ht-get it 'resolved_title)
                                (ht-get it 'given_title)
                                "[untitled]")))
                        (domain (pocket-reader--url-domain
                                ;; Don't use --get-url here, because, e.g. we don't want an "amp." to be shown in the list
                                (pocket-reader--or-string-not-blank
                                (ht-get it 'resolved_url)

                                (ht-get it 'given_url       "https://example.com/?error=item-had-no-URL"))))
                        (tags (pocket-reader--not-empty-string (s-join "," (ht-get it 'tags)))))
                    (list id (vector added favorite title domain tags)))))

Need to handle missing value error by providing a default time (probably a better way of writing this):

(defun my/test-string-to-number (str)
  (condition-case err
      (string-to-number str)
    (error 1735000000 )) )

Use the above to replace string-to-number in the chosen sorting function:

(defun pocket-reader--added-fancy< (a b)
  "Return non-nil if A should be sorted before B.
Items are compared by date, then favorite status, then tags, then
domain.  Suitable for sorting `tabulated-list-entries'."
  (cl-flet ((day (it) (let* ((id (car it))
                             (added-string (ht-get* pocket-reader-items id 'time_added)) )
                        (time-to-days (my/test-string-to-number added-string)))))
    (let* ((a-day (day a))
           (b-day (day b)))
      (if (= a-day b-day)
          ;; Same day: compare favorite, then tags, then domain
          (cl-case (pocket-reader--compare-favorite a b)
            ('< nil)
            ('> t)
            ('= (cl-case (pocket-reader--compare-tags a b)
                  ('< nil)
                  ('> t)
                  ('=
                   ;; Same tags; compare domain (invert since the default order is descending)
                   (not (pocket-reader--domain< a b))))))
        ;; Different day: compare day
        (< a-day b-day)))))

Results

This all works, but

We now get warnings like this:

Warning: Case '> will match ‘quote’.  If that’s intended, write (> quote) instead.  Otherwise, don’t quote ‘>’.
Warning: Case '= will match ‘quote’.  If that’s intended, write (= quote) instead.  Otherwise, don’t quote ‘=’.

I find 7 such "ghost items" with no URL, date added or content, which have archive status "2" rather than "1" or "0". They appear in the tabulated view but are sorted incorrectly by date.

2024-12-24-145044_956x279_scrot

I don't know if this is due to the changes I made above, but as seen in the screenshot above, archived entries are being displayed as well.

I've found out how to export my Pocket articles to csv, so I will delete those entries and see what happens.

@alphapapa
Copy link
Owner

Oops, I made a mistake in that I didn't realize that display-warning returns t. I pushed a fix for that in d507c37.

I find 7 such "ghost items" with no URL, date added or content, which have archive status "2" rather than "1" or "0".

Interesting. The code at https://github.com/Pocket/pocket-monorepo/blob/591b18e1eeba9db79333723222bb6acbb932f31d/servers/list-api/src/types/index.ts#L230-L234 seems to indicate that such items are "deleted"--so it's a mystery why they would be returned by the API. Maybe some old database entries didn't get cleaned up properly.

FWIW, https://getpocket.com/developer/ now shows that:

Pocket does not offer direct technical support for our API; if you are encountering issues, we recommend consulting with other third-party developers. Requests for integrating new applications are currently closed.

That, and the fact that they've apparently put all the code on GitHub, suggests to me that their long-term plans don't involve maintaining the Pocket API, so it wouldn't surprise me if one day they just turn it off. A shame, but not a surprise.

@deen1
Copy link
Author

deen1 commented Dec 26, 2024

Hmm … these "ghost" items seem un-deletable!

In the pocket-reader view, I tried pressing D with each item selected, and they were removed from the tabulated list. However, hitting d or G to reload or refresh the view returns the deleted items.

As you suggested above, I then evaluated:

(pocket-lib-delete '(\39438322 (item_id . "39438322") (status . "2")))

which produced the following output:

((status . 1) (action_errors . [nil]) (action_results . [t]))

However, reloading pocket-reader still shows the same number of items. (I haven't figured out how to check the item_id's of the entries when they are listed in tabulated form.) It appears I'm stuck with them.

Just to double-check: do you also see archived entries in the main view? And do you get the quote warnings described above (regarding '< and '= in the sorting functions)? Would you mind evaluating the changes I made above and seeing if they affect your pocket-reader experience the same way?

That, and the fact that they've apparently put all the code on GitHub, suggests to me that their long-term plans don't involve maintaining the Pocket API, so it wouldn't surprise me if one day they just turn it off. A shame, but not a surprise.

Yes, I suspect something like this will happen. I have also noticed that the Pocket app/web interface no longer provides an Article View for some sites that they previously offered one for -- perhaps sites and publishers aren't feeling too happy with the service.

@deen1
Copy link
Author

deen1 commented Dec 26, 2024

The code at … seems to indicate that such items are "deleted"–so it's
a mystery why they would be returned by the API. Maybe some old database
entries didn't get cleaned up properly.

Having re-read your last post, the reason why these posts apparently can't be deleted is probably because they've been already been marked as deleted (by being given status "2").

I just realized that the reason I couldn't get skip these items with a conditional was because "2" is a string and I was using eq instead of string=.

Now, having pulled your latest update (which puts the missing URL display-warning and the default value in a progn), I only need the following change to pocket-reader--add-items to skip over the ghost/deleted entries and get pocket-reader to display without any errors:

(defun pocket-reader--add-items (items)
  "Add ITEMS to `pocket-reader-items' and update display."
  (--each items
    (let* ((item (ht<-alist (cdr it) #'eq))
           (id (string-to-number (ht-get item 'item_id)))
           (domain (pocket-reader--url-domain (pocket-reader--get-url item)))
           (status (ht-get item 'status))
           (tags (pocket-lib--process-tags (ht-get item 'tags))))
      (ht-set item 'domain domain)
      (ht-set item 'tags tags)
      (if (not (string= status "2"))
          (ht-set pocket-reader-items id item) )      )     )


  (pocket-reader--set-tabulated-list-format)
      (setq tabulated-list-entries (pocket-reader--items-to-tabulated-list-entries pocket-reader-items))

  (tabulated-list-init-header)
  (tabulated-list-print 'remember-pos)
  (pocket-reader--finalize))

The warning about quoting the = or < conditionals goes away – probably because these were all triggered by having to compare regular entries with the ghost entries, or the identical ghost entries with each other (which ended up getting very computationally intensive).

I think there ended up being 125 such entries (accumulated over a dozen or so years of usage), which I found by adding a (message (format "Ghost entry: %d" id)) to the if statement above and hitting o to load more items into view.

However, I now have the problem of archived entries still showing up in the default pocket-reader list. Hitting o loads more entries, included older archived ones as well. That can't just be because of the one change I've made above, right?

@alphapapa
Copy link
Owner

Whether archived items are returned is controlled by the search query, which is described at https://github.com/alphapapa/pocket-reader.el?tab=readme-ov-file#searching

However, reloading pocket-reader still shows the same number of items.

AFAIK that would be the number of displayed items, not the total number of items in your account. Deleting one old entry wouldn't change the number that is displayed, since the one you deleted would be replaced by the next-oldest one in the search results.

Do note, if you haven't already, that the tabulated-list view is just a view; it's a representation of the data, not the data itself. Deleting an item from the list view directly only deletes its representation from the list view; it does not tell the API to delete the item from your account. That's why this package binds the command that both deletes it from the list view and deletes it via the API at the same time.

@deen1
Copy link
Author

deen1 commented Dec 28, 2024

Whether archived items are returned is controlled by the search query, which is described at https://github.com/alphapapa/pocket-reader.el?tab=readme-ov-file#searching

I should have been clearer, this is in the default pocket-reader view, which according to the linked documentation should be :unread. I've just checked pocket-reader-default-queries and it's set to the default value of nil, which is probably why I'm seeing the archived entries.

But before this problem started, and I was on an older version of pocket-reader, it seemed the default view was precisely the unread articles, and I didn't have to change anything. Perhaps something changed in the package at some point?

Deleting an item from the list view directly only deletes its representation from the list view; it does not tell the API to delete the item from your account. That's why this package binds the command that both deletes it from the list view and deletes it via the API at the same time.

I not 100% sure what you mean here because I have been attempting to delete the item from the list view using D, which seems to call pocket-lib-delete at some point. I don't think I was at any point removing it from just the list view.

As I said above, I also tried to manually delete an item by evaluating

(pocket-lib-delete '(\39438322 (item_id . "39438322") (status . "2")))

but that didn't work either.

@deen1
Copy link
Author

deen1 commented Dec 28, 2024

In fact, setting

(setq pocket-reader-default-queries '(":unread"))

pretty much fixes the entire problem, without having to tweak any other functions. So maybe having it set to nil by default is the root of the matter.

@alphapapa
Copy link
Owner

pretty much fixes the entire problem, without having to tweak any other functions. So maybe having it set to nil by default is the root of the matter.

Do you notice a difference in behavior if you set that variable to an empty string, compared to its being set to nil?

@deen1
Copy link
Author

deen1 commented Dec 31, 2024

No,

(setq pocket-reader-default-queries '(""))

leads to the same

⛔ Warning (pocket-reader): No URLs found for item: #s(hash-table size 65 test eq rehash-size 1.5 rehash-threshold 0.8125 data (status "2" item_id "39438322")).

as setting it to nil.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants