producer and recommendations sometimes gives error #86

Closed
duck7000 opened this issue Feb 20, 2017 · 13 comments

@duck7000
Contributor

duck7000 commented Feb 20, 2017

I have a function in my program that updates all movies through this library.
When I use that function it sometimes gives the errors below and then stops the script because the max execution time is exceeded.

Sometimes this gives "Invalid argument supplied for foreach()" in the producer method at line 1521 in Title.php.

The same happens for movie recommendations, but there it is "Trying to get property of non-object" at line 524 and "Undefined offset" at line 526 in Title.php.

Both errors occur in the same update-all run.

It never happens in any of the other methods, only those two.

The strange thing is that if I update a single movie nothing goes wrong.

I can't pinpoint where the problem is.
Maybe someone can take a closer look at these methods?

Here is the apache log:

[Mon Feb 20 22:32:03 2017] [warn] [client 192.168.0.1] PHP Warning:  Invalid argument supplied for foreach() in C:\\webserver\\wwwroot\\phpmoviedb\\lib\\util\\imdbphp\\src\\Imdb\\Title.php on line 1521, referer: http://www.riethorst.net/phpmoviedb/?go=updateallform
[Mon Feb 20 22:36:14 2017] [notice] [client 192.168.0.1] PHP Notice:  Trying to get property of non-object in C:\\webserver\\wwwroot\\phpmoviedb\\lib\\util\\imdbphp\\src\\Imdb\\Title.php on line 524, referer: http://www.riethorst.net/phpmoviedb/?go=updateallform
[Mon Feb 20 22:36:14 2017] [notice] [client 192.168.0.1] PHP Notice:  Undefined offset: 1 in C:\\webserver\\wwwroot\\phpmoviedb\\lib\\util\\imdbphp\\src\\Imdb\\Title.php on line 526, referer: http://www.riethorst.net/phpmoviedb/?go=updateallform
[Mon Feb 20 22:37:06 2017] [error] [client 192.168.0.1] PHP Fatal error:  Maximum execution time of 600 seconds exceeded in C:\\webserver\\wwwroot\\phpmoviedb\\lib\\util\\imdbphp\\src\\Imdb\\Request.php on line 150, referer: http://www.riethorst.net/phpmoviedb/?go=updateallform
[Mon Feb 20 23:02:58 2017] [warn] [client 192.168.0.1] PHP Warning:  Invalid argument supplied for foreach() in C:\\webserver\\wwwroot\\phpmoviedb\\lib\\util\\imdbphp\\src\\Imdb\\Title.php on line 1521, referer: http://www.riethorst.net/phpmoviedb/?go=updateallform
tboothman added a commit that referenced this issue Feb 21, 2017
@tboothman
Owner

Add better logging to your application so you can see which titles are making it error. I'd like a film without producers for the first one ... the second one must be a film without a year .. an example of that would be nice too
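
A minimal sketch of what such logging could look like, assuming the update-all loop calls one function per title. The names `update_with_logging` and `$updateOne` are hypothetical and not part of imdbphp; the idea is only to tag every PHP warning or notice with the IMDb id that was being processed when it was raised.

```php
<?php
// Hypothetical helper: runs $updateOne for each IMDb id and records
// every PHP warning/notice together with the id that triggered it.
function update_with_logging(array $imdbIds, callable $updateOne): array
{
    $failures = [];
    foreach ($imdbIds as $id) {
        // Install a handler that remembers which title was being updated.
        set_error_handler(
            function (int $no, string $msg, string $file = '', int $line = 0) use ($id, &$failures) {
                $failures[] = "tt$id: $msg in $file:$line";
                return true; // mark the error as handled
            }
        );
        try {
            $updateOne($id);
        } finally {
            // Always put the previous handler back, even if $updateOne throws.
            restore_error_handler();
        }
    }
    return $failures;
}
```

The returned list can then be written to a log file or printed at the end of the run, so the failing titles are known without rerunning everything.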

@duck7000
Contributor Author

I'll try to find the movie that causes the trouble.

But I don't understand why you altered the directors method when this issue is about producers.
Is there also a problem with directors?

@duck7000
Contributor Author

I found one movie that gives the error in the producer method!

http://www.imdb.com/title/tt0149937/
This movie has no producer listed on IMDb.

tboothman added a commit that referenced this issue Feb 21, 2017
@tboothman
Owner

tboothman commented Feb 21, 2017

The problem was that get_table_rows was returning a false if there were no rows in the table, which meant producers was erroring when it tried to loop on it. I changed directors because it was a disgusting function with multiple coding errors (it only didn't loop on false because count(false) happens to return 0 and not an error, it checked for get_table_rows returning false by comparing its return to null with == ...etc)

I probably should've learnt my lesson about totally refactoring a function after causing this bug but the more this stuff gets refactored into sensible code the less likely I'll mess up in the future
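
As a rough sketch of the failure mode described above, assuming a helper that returns false instead of an empty array when it finds no rows: `find_rows` is a hypothetical stand-in for `get_table_rows`, and `producer_names` is not the library's actual method. The guard before the loop is what prevents the "Invalid argument supplied for foreach()" warning.

```php
<?php
// Hypothetical stand-in for get_table_rows(): returns false when the table has no rows.
function find_rows(string $html)
{
    if (!preg_match_all('!<tr>(.*?)</tr>!s', $html, $m)) {
        return false;
    }
    return $m[1];
}

// Collects the text of each row, treating "no rows" as an empty result.
function producer_names(string $html): array
{
    $rows = find_rows($html);
    if (!is_array($rows)) {
        return []; // nothing to loop over, so skip foreach entirely
    }
    $names = [];
    foreach ($rows as $row) {
        $names[] = trim(strip_tags($row));
    }
    return $names;
}
```

Having the helper return an empty array in the first place would make the guard unnecessary, which is probably the cleaner long-term fix.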

@duck7000
Contributor Author

Thanks for your explanation!

I'm trying to reproduce the error on movie recommendations but I can't find it anymore.

There is, however, a strange thing going on with the recommendations on IMDb's site.
The info on the title page is not consistent! If you refresh (F5) the page a few times, the recommendations are sometimes gone, sometimes there, and sometimes only the first 6 are shown but later more than 6.

A few examples:
http://www.imdb.com/title/tt0367479/
http://www.imdb.com/title/tt0118636/
http://www.imdb.com/title/tt0109327/
http://www.imdb.com/title/tt0144814/

@duck7000
Contributor Author

duck7000 commented Feb 22, 2017

@tboothman
I found at least one movie that gives the errors on movie recommendations.
This movie has 2 recommendations without a year:
http://www.imdb.com/title/tt0299283/

And I think the method has to be redone because it reuses the same variable name... (my bad, sorry)

@duck7000
Contributor Author

I found a few others too:

http://www.imdb.com/title/tt0878652/
http://www.imdb.com/title/tt0092796/ (tales of the darkside has a year pair, only first one captured)
http://www.imdb.com/title/tt0478125/ (creepshow has no year)
http://www.imdb.com/title/tt0285492/ (Cubed has no year)
http://www.imdb.com/title/tt0377713/ (Cubed has no year)
http://www.imdb.com/title/tt1090671/ (No Mans Land@2735590 has no year)
http://www.imdb.com/title/tt0071282/ (The Keep@1757890 has no year)
http://www.imdb.com/title/tt0339840/ (Winchester@1072748 has no year)

It turns out that if no year is available, there is no span with class="nobr". That will always result in an error in this method.

My apologies for making a bit of a mess of this method. I thought it worked fine, but apparently not.
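
A small sketch of how a missing year span could be tolerated, assuming markup with a `rec-title` div that may or may not contain a `span` with class `nobr`. The HTML fragment and the function name `recommendation_year` are made up for illustration. The span is selected by class rather than by position, and a missing node gives an empty string instead of an error.

```php
<?php
// Returns the first four-digit year found in the cell's "nobr" span,
// or an empty string when the span (or a year inside it) is missing.
function recommendation_year(DOMXPath $xp, DOMNode $cell): string
{
    $span = $xp->query('.//span[@class="nobr"]', $cell)->item(0);
    if ($span === null) {
        return '';
    }
    return preg_match('!(\d{4})!', $span->textContent, $m) ? $m[1] : '';
}

// Made-up fragment: one recommendation with a year, one without.
$html = '<div class="rec-title"><a href="/title/tt0000001/">With Year</a> <span class="nobr">(1999)</span></div>'
      . '<div class="rec-title"><a href="/title/tt0000002/">No Year</a></div>';

$doc = new DOMDocument();
@$doc->loadHTML($html); // suppress warnings about loose HTML
$xp = new DOMXPath($doc);

$years = [];
foreach ($xp->query('//div[@class="rec-title"]') as $cell) {
    $years[] = recommendation_year($xp, $cell);
}
```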

@duck7000
Contributor Author

I have made an attempt to alter this method:

public function movie_recommendations() {
   if (empty($this->movierecommendations)) {
     $doc = new \DOMDocument();
     @$doc->loadHTML($this->getPage("Title"));
     $xp = new \DOMXPath($doc);
     $cells = $xp->query("//div[@id=\"title_recs\"]/div[@class=\"rec_overviews\"]/div[@class=\"rec_overview\"]/div[@class=\"rec_details\"]");
     foreach ($cells as $cell) {
       preg_match('!tt(\d+)!',$cell->getElementsByTagName('a')->item(0)->getAttribute('href'),$match);
       $movie['title'] = trim($cell->getElementsByTagName('a')->item(0)->nodeValue);
       $movie['imdbid'] = $match[1];
       if (preg_match('!(\d+)!',$cell->getElementsByTagName('span')->item(0)->nodeValue,$ref)){
         $movie['year'] = $ref[1];
       }
       else{
         $movie['year'] = "";
       }
       $this->movierecommendations[] = $movie;
     }
   }
   return $this->movierecommendations;
 }

But it's not complete yet: the year pair is still not captured.

@duck7000
Contributor Author

There is more wrong with my attempt above.

http://www.imdb.com/title/tt0450385/
On this movie, the recommendation Whispers has an additional span between the title and the year.

@duck7000
Contributor Author

Next try..

public function movie_recommendations() {
    if (empty($this->movierecommendations)) {
      $doc = new \DOMDocument();
      @$doc->loadHTML($this->getPage("Title"));
      $xp = new \DOMXPath($doc);
      $cells = $xp->query("//div[@id=\"title_recs\"]/div[@class=\"rec_overviews\"]/div[@class=\"rec_overview\"]/div[@class=\"rec_details\"]");
      foreach ($cells as $cell) {
        preg_match('!tt(\d+)!',$cell->getElementsByTagName('a')->item(0)->getAttribute('href'),$match);
        $movie['title'] = trim($cell->getElementsByTagName('a')->item(0)->nodeValue);
        $movie['imdbid'] = $match[1];
        if (preg_match('!(\d+)!',$cell->getElementsByTagName('span')->item(0)->nodeValue,$ref)){
          $movie['year'] = $ref[1];
        }
        elseif(preg_match('!(\d+)!',$cell->getElementsByTagName('span')->item(1)->nodeValue,$ref)){
          $movie['year'] = $ref[1];
        }
        else{
          $movie['year'] = "";
        }
        $this->movierecommendations[] = $movie;
      }
    }
    return $this->movierecommendations;
  }

@duck7000
Contributor Author

Okay, finally something that's actually working!

It doesn't use DOM or XPath; I couldn't get that working.

It captures the following year formats: a single year, a year span (xxxx-xxxx), or empty if no year is available. It strips out any characters other than the year digits.

It feels faster than DOM/XPath, but I'm not sure.
It doesn't produce any PHP errors. I'm pretty sure my code could be made simpler or better, but for me it's a victory that it actually works.

#-------------------------------------------------------[ Recommendations ]---
  /**
   * Get recommended movies (People who liked this...also liked)
   * @return array recommendations (array[imdbid,title,year])
   * @see IMDB page / (TitlePage)
   */
  public function movie_recommendations() {
	if (empty($this->movierecommendations)) {
		$this->getPage("Title");
		if ( preg_match_all('!<div class="rec-title">\s*(.*?)\s*</div>!ims', $this->page["Title"], $matches) ) {
			for ($i=0;$i<count($matches[0]);++$i){
				if (preg_match('!<a\s+href="/title/tt(\d+)/[^>]*>\s*(.+)\s*</a>!ims',$matches[0][$i],$match)){
					if (preg_match('!<span class="nobr">\s*(.*?)\s*</span>!ims',$matches[0][$i],$year)){
						$temp = preg_replace('/[^0-9]/','',$year[0]);
						if(mb_strlen(trim($temp)) >4){
							$year = substr_replace($temp, "-", 4, 0);
						}
						else{
							$year = trim($temp);
						}
						$this->movierecommendations[] = array('title'=>strip_tags($match[2]),'imdbid'=>$match[1],'year'=>$year);
					}
					else{
						$this->movierecommendations[] = array('title'=>strip_tags($match[2]),'imdbid'=>$match[1],'year'=>'');
					}
				}
				else{
					return $this->movierecommendations;
				}
			}
		}
	}
    return $this->movierecommendations;
  }
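
The year handling in the method above can also be pulled out into a small helper, sketched here under the same rules (strip everything but digits, insert a dash after the first four digits when more follow). The name `normalize_year` is hypothetical and not part of the library.

```php
<?php
// Reduces raw year text to "YYYY", "YYYY-YYYY", or "" when no digits are present.
function normalize_year(string $raw): string
{
    // Keep only the digits, e.g. "(1983–1988)" becomes "19831988".
    $digits = preg_replace('/[^0-9]/', '', $raw);
    if (strlen($digits) > 4) {
        // More than one year: put a dash after the first four digits.
        return substr($digits, 0, 4) . '-' . substr($digits, 4);
    }
    return $digits;
}
```

Keeping this in one place avoids reusing `$year` for both the regex match and the cleaned string.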

@duck7000
Contributor Author

duck7000 commented Mar 7, 2017

@tboothman
Are you going to do something with the info in this thread?

Maybe it slipped your mind, haha

@duck7000
Contributor Author

Okay... I understand that this has no priority or isn't of interest, but maybe someone can help me a little bit.
I'm not a professional programmer; I do this as a hobby.

I made an attempt to extend the recommendations method with rating and plotoutline. This works well, but as you can see in the code it isn't very efficient or nicely programmed, probably because of, yep, lack of skills, I know.

So my question is: am I on the right track here?

#-------------------------------------------------------[ Recommendations ]---
  /**
   * Get recommended movies (People who liked this...also liked)
   * @return array recommendations (array[imdbid,title,year,rating,plotoutline])
   * @see IMDB page / (TitlePage)
   */
  public function movie_recommendations() {
	if (empty($this->movierecommendations)) {
		$this->getPage("Title");
		preg_match_all('!<div class="rec-title">\s*(.*?)\s*</div>!ims', $this->page["Title"], $valueTitle);
		preg_match_all('!<span class="value">\s*(.*?)\s*</span>!ims', $this->page["Title"], $valueRating);
		$title = count($valueTitle[1]);
		$rating = count($valueRating[1]);
		if($title != $rating){
			$pos = $title - $rating;
			for($i=0; $i<$pos; $i++){
				$valueRating[1][] = "Not yet released";
			}
		}
		preg_match_all('!<div class="rec-outline">\s*(.*?)\s*</div>!ims', $this->page["Title"], $valuePlotoutline);
		$plot2 = count($valuePlotoutline[1]);
		if($title != $plot2){
			$pos = $title - $plot2;
			for($i=0; $i<$pos; $i++){
				$valuePlotoutline[1][] = "";
			}
		}
		$matches = array();
		for ($i=0;$i<count($valueTitle[1]);$i++){
			$matches[] = array('title'=>$valueTitle[1][$i], 'rating'=>$valueRating[1][$i], 'plotoutline'=>$valuePlotoutline[1][$i]);
		}
		
		for ($i=0;$i<count($matches);$i++){
			if (preg_match('!<a\s+href="/title/tt(\d+)/[^>]*>\s*(.+)\s*</a>!ims',$matches[$i]["title"],$match)){
				if($matches[$i]["plotoutline"] != ""){
					$plot = preg_replace('!\s+<a href="/title/tt\d+/(plotsummary|synopsis)[^>]*>\s*(.+)\s*</a>.*$!ims','',$matches[$i]["plotoutline"]);
					$plot1 = trim(strip_tags($plot), ". ,");
					$plotoutline = $plot1.'.';
				}
				else{
					$plotoutline = "";
				}
				if(preg_match('/[A-Za-z0-9]/', $matches[$i]["rating"])){
					$ratingClean = $matches[$i]["rating"];
				}
				else{
					$ratingClean = "";
				}
				if (preg_match('!<span class="nobr">\s*(.*?)\s*</span>!ims',$matches[$i]["title"],$year)){
					$temp = preg_replace('/[^0-9]/','',$year[0]);
					if(mb_strlen(trim($temp)) >4){
						$year = substr_replace($temp, "-", 4, 0);
					}
					else{
						$year = trim($temp);
					}
					$this->movierecommendations[] = array('title'=>strip_tags($match[2]),'imdbid'=>$match[1],'year'=>$year, 'rating'=>$ratingClean, 'plotoutline'=>$plotoutline);
					unset($ratingClean);
					unset($plotoutline);
					unset($year);
				}
				else{
					$this->movierecommendations[] = array('title'=>strip_tags($match[2]),'imdbid'=>$match[1],'year'=>'', 'rating'=>$ratingClean, 'plotoutline'=>$plotoutline);
					unset($ratingClean);
					unset($plotoutline);
				}
			}
			else{
				return $this->movierecommendations;
			}
		}
	}
    return $this->movierecommendations;
  }

Thanks
Ed
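
One possible alternative to padding the rating and plot outline arrays at the end is to parse each recommendation block on its own, so a missing field stays attached to the right title. This is only a sketch: it assumes every recommendation sits in a `div` with class `rec_overview` that contains the `rec-title`, `value` and `rec-outline` elements, which may not match IMDb's real markup, and `parse_recommendations` is a hypothetical name.

```php
<?php
// Parses recommendations block by block so fields cannot shift between titles.
function parse_recommendations(string $html): array
{
    if (trim($html) === '') {
        return []; // DOMDocument::loadHTML rejects empty input
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings about loose HTML
    $xp = new DOMXPath($doc);

    $result = [];
    foreach ($xp->query('//div[@class="rec_overview"]') as $block) {
        $link = $xp->query('.//div[@class="rec-title"]//a', $block)->item(0);
        if ($link === null || !preg_match('!tt(\d+)!', $link->getAttribute('href'), $id)) {
            continue; // no usable title link in this block
        }
        // Optional fields: a missing node simply yields an empty string.
        $rating  = $xp->query('.//span[@class="value"]', $block)->item(0);
        $outline = $xp->query('.//div[@class="rec-outline"]', $block)->item(0);

        $result[] = [
            'imdbid'      => $id[1],
            'title'       => trim($link->textContent),
            'rating'      => $rating === null ? '' : trim($rating->textContent),
            'plotoutline' => $outline === null ? '' : trim($outline->textContent),
        ];
    }
    return $result;
}
```

With the count-and-pad approach, a recommendation without a rating in the middle of the list would shift every later rating up by one; looking fields up inside each block avoids that.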
