Following the FB redesign, update the Class of the login element, plus the text parsing and Class used when searching the fan page
Dean Lin committed Aug 18, 2022
1 parent 6ae46ad commit 57adff5
Showing 21 changed files with 127 additions and 88 deletions.
11 changes: 6 additions & 5 deletions README.md
@@ -79,8 +79,9 @@
  > **2021.11.18**: Following the IG redesign, adjusted the login check; also updated some example links. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/fe7118dceb474150a93320d7db82b7edcbdd5b87)
  > **2021.12.13**: Following the IG redesign, adjusted the XPath used to scrape the follower count. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/854245776e6631f27fd8957be8df891791d6d3c0)
  > **2021.12.17**: Following the IG redesign, adjusted the XPath used to scrape the follower count (IG has been changing things back and forth a lot lately QQ). Related commit: [link](https://github.com/dean9703111/social_crawler/commit/7836528ae38606af2edb05bfc1fec101f705e127)
- > **2021.2.13**: Following the IG redesign, adjusted the XPath used to scrape the follower count (IG often makes small tweaks to its paths). Related commit: [link](https://github.com/dean9703111/social_crawler/commit/1736b56e3a3fb341c6f3d37b8b88b801c545d8da)
- > **2021.3.22**: Following the FB redesign, adjusted the Class used to confirm login. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/0583009d8aa24613e7a409d7ba51daeab11f7968)
- > **2021.4.27**: Following the FB redesign, adjusted the text parsing used when searching the fan page. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/199ec02b328a8731e400bc18baded1270ab45965)
- > **2021.6.1**: Following the IG redesign, the follower count was scraped via XPath and is now scraped via Class. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/630a00b6f1b1a2cdb5f9fb46ad30aaf36ea5f1c6)
- > **2021.7.14**: Following the FB redesign, exact figures are no longer shown once the follower count exceeds 10,000, so the code that navigates to the search page is commented out and the original program structure is adjusted. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/5f0ddfb58102958c9b918ad77feb9cc81af50310)
+ > **2022.2.13**: Following the IG redesign, adjusted the XPath used to scrape the follower count (IG often makes small tweaks to its paths). Related commit: [link](https://github.com/dean9703111/social_crawler/commit/1736b56e3a3fb341c6f3d37b8b88b801c545d8da)
+ > **2022.3.22**: Following the FB redesign, adjusted the Class used to confirm login. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/0583009d8aa24613e7a409d7ba51daeab11f7968)
+ > **2022.4.27**: Following the FB redesign, adjusted the text parsing used when searching the fan page. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/199ec02b328a8731e400bc18baded1270ab45965)
+ > **2022.6.1**: Following the IG redesign, the follower count was scraped via XPath and is now scraped via Class. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/630a00b6f1b1a2cdb5f9fb46ad30aaf36ea5f1c6)
+ > **2022.7.14**: Following the FB redesign, exact figures are no longer shown once the follower count exceeds 10,000, so the code that navigates to the search page is commented out and the original program structure is adjusted. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/5f0ddfb58102958c9b918ad77feb9cc81af50310)
+ > **2021.8.19**: Following the FB redesign, adjusted the Class of the login element, plus the text parsing and Class used when searching the fan page. Related commit: [link](https://github.com/dean9703111/social_crawler/commit/199ec02b328a8731e400bc18baded1270ab45965)
12 changes: 7 additions & 5 deletions ch12/index.js
@@ -62,7 +62,7 @@ async function loginFacebookGetTrace () {
  login_ele.click();

  // Use an element that only exists after login to check whether we are logged in
- await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"oajrlxb2")]`)));
+ await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"om3e55n1")]`)));

  // Go to the fan page
  const fan_page = "https://www.facebook.com/baobaonevertell/";
@@ -72,7 +72,7 @@ async function loginFacebookGetTrace () {
  let fb_trace = null; // Records the FB follower count
  let is_accurate = true; // Whether the follower count is exact
  // Each fan page shows the follower count in a different place, so grab every matching element and then analyze the text
- const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"lrazzd5p")]`)));
+ const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"g4qalytl")]`)));
  for (const fb_trace_ele of fb_trace_eles) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('位追蹤者')) { // New-style display
@@ -85,15 +85,17 @@ async function loginFacebookGetTrace () {
  }
  break;
  } else if (fb_text.includes('個讚')) {
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.
+ substr(0, fb_text.indexOf('個讚')). // First drop the text from the marker onward
+ replace(/\D/g, ''); // Keep digits only
  }
  }
  if (fb_trace === null) {
- const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"b1v8xokw")]`)));
+ const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"pbevjfx6")]`)));
  for (const fb_trace_ele of fb_trace_eles2) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('人在追蹤')) { // Classic-style display
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
  break;
  }
  }
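The `個讚` ("likes") branch in the hunk above now cuts the string at the marker before stripping non-digits. The following is a minimal sketch of the effect, assuming a sample string in which a second number follows the marker. The sample text and the helper name `parseLikes` are illustrative and not from the repository, and `slice` stands in for the commit's `substr` (with a start index of 0 and a non-negative end, both return the same prefix).

```javascript
// Hypothetical helper, not part of the commit: mirrors the '個讚' branch.
function parseLikes(fbText) {
  const marker = '個讚'; // "likes" label matched by the crawler
  const end = fbText.indexOf(marker);
  if (end === -1) return null; // marker absent: nothing to parse
  // Keep only the text before the marker, then strip every non-digit.
  return fbText.slice(0, end).replace(/\D/g, '');
}

// Assumed sample text: a second, unrelated number follows the marker.
const sample = '1,234 個讚・56 則評論';

console.log(sample.replace(/\D/g, '')); // old behaviour: digits from both numbers
console.log(parseLikes(sample));        // new behaviour: digits before the marker only
```

With this sample, stripping non-digits from the whole string yields `123456`, while cutting at the marker first yields `1234`.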
14 changes: 8 additions & 6 deletions ch14/index.js
@@ -64,7 +64,7 @@ async function loginInstagramGetTrace (driver) {
  let ig_trace = null;//Records the IG follower count
  // const ig_trace_xpath = `//*[@id="react-root"]/section/main/div/header/section/ul/li[2]/a/div/span`;
  // IG changed the original XPath, so select by Class instead
- const ig_trace_xpath =`//*[contains(@class,"_ac2a")]`
+ const ig_trace_xpath = `//*[contains(@class,"_ac2a")]`;
  const ig_trace_eles = await driver.wait(until.elementsLocated(By.xpath(ig_trace_xpath)));
  // This Class happens to match only 3 elements and the one we need is the 2nd; IG abbreviates the count once it exceeds 10,000, so read the title attribute instead
  const ig_text = await ig_trace_eles[1].getAttribute('title');
@@ -89,7 +89,7 @@ async function loginFacebookGetTrace (driver) {
  login_ele.click();

  //Use an element that only exists after login to check whether we are logged in
- await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"oajrlxb2")]`)));
+ await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"om3e55n1")]`)));

  //Go to the fan page
  const fan_page = "https://www.facebook.com/baobaonevertell/";
@@ -99,7 +99,7 @@ async function loginFacebookGetTrace (driver) {
  let fb_trace = null;//Records the FB follower count
  let is_accurate = true;//Whether the follower count is exact
  //Each fan page shows the follower count in a different place, so grab every matching element and then analyze the text
- const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"lrazzd5p")]`)));
+ const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"g4qalytl")]`)));
  for (const fb_trace_ele of fb_trace_eles) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('位追蹤者')) { // New-style display
@@ -112,15 +112,17 @@ async function loginFacebookGetTrace (driver) {
  }
  break;
  } else if (fb_text.includes('個讚')) {
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.
+ substr(0, fb_text.indexOf('個讚')). // First drop the text from the marker onward
+ replace(/\D/g, ''); // Keep digits only
  }
  }
  if (fb_trace === null) {
- const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"b1v8xokw")]`)));
+ const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"pbevjfx6")]`)));
  for (const fb_trace_ele of fb_trace_eles2) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('人在追蹤')) { // Classic-style display
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
  break;
  }
  }
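The first hunk of this file reads the `title` attribute because IG abbreviates the visible count once it exceeds 10,000. The following sketch shows one way to turn such a value into a number, assuming the attribute holds the full count with thousands separators (for example `12,345`). The helper name is hypothetical, and the repository's own handling of `ig_text` is not shown in this diff.

```javascript
// Hypothetical helper: convert a title value such as "12,345" to a number.
function parseCountFromTitle(title) {
  if (typeof title !== 'string') return null; // attribute missing
  const digits = title.replace(/\D/g, '');    // drop separators and other text
  return digits === '' ? null : Number(digits);
}

console.log(parseCountFromTitle('12,345'));
```

Returning `null` for a missing or digit-free attribute keeps "no data" distinct from a real count of zero.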
12 changes: 7 additions & 5 deletions ch15/tools/crawlerFB.js
@@ -33,14 +33,14 @@ async function loginFacebook (driver) {

  //Login has to wait for the server response; jumping straight to the fan page would make the login fail
  //Use an element that only exists after login to check whether we are logged in
- await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"oajrlxb2")]`)));
+ await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"om3e55n1")]`)));
  }

  async function getTrace (driver, fan_page_name) {
  let fb_trace = null;//Records the FB follower count
  let is_accurate = true;//Whether the follower count is exact
  //Each fan page shows the follower count in a different place, so grab every matching element and then analyze the text
- const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"lrazzd5p")]`)));
+ const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"g4qalytl")]`)));
  for (const fb_trace_ele of fb_trace_eles) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('位追蹤者')) { // New-style display
@@ -53,15 +53,17 @@ async function getTrace (driver, fan_page_name) {
  }
  break;
  } else if (fb_text.includes('個讚')) {
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.
+ substr(0, fb_text.indexOf('個讚')). // First drop the text from the marker onward
+ replace(/\D/g, ''); // Keep digits only
  }
  }
  if (fb_trace === null) {
- const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"b1v8xokw")]`)));
+ const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"pbevjfx6")]`)));
  for (const fb_trace_ele of fb_trace_eles2) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('人在追蹤')) { // Classic-style display
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
  break;
  }
  }
10 changes: 6 additions & 4 deletions ch16/tools/crawlerFB.js
@@ -45,7 +45,7 @@ async function loginFacebook (driver) {

  //Login has to wait for the server response; jumping straight to the fan page would make the login fail
  //Use an element that only exists after login to check whether we are logged in
- await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"oajrlxb2")]`)), long_time);
+ await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"om3e55n1")]`)), long_time);
  return true;
  } catch (e) {
  console.error('FB登入失敗');
@@ -60,7 +60,7 @@ async function getTrace (driver, fan_page_name) {
  let is_accurate = true;//Whether the follower count is exact
  try {
  //Each fan page shows the follower count in a different place, so grab every matching element and then analyze the text
- const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"lrazzd5p")]`)), short_time);
+ const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"g4qalytl")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('位追蹤者')) { // New-style display
@@ -73,11 +73,13 @@ async function getTrace (driver, fan_page_name) {
  }
  break;
  } else if (fb_text.includes('個讚')) {
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.
+ substr(0, fb_text.indexOf('個讚')). // First drop the text from the marker onward
+ replace(/\D/g, ''); // Keep digits only
  }
  }
  if (fb_trace === null) {
- const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"b1v8xokw")]`)), short_time);
+ const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"pbevjfx6")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles2) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('人在追蹤')) { // Classic-style display
10 changes: 6 additions & 4 deletions ch17/tools/crawlerFB.js
@@ -54,7 +54,7 @@ async function loginFacebook (driver) {

  //Login has to wait for the server response; jumping straight to the fan page would make the login fail
  //Use an element that only exists after login to check whether we are logged in
- await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"oajrlxb2")]`)), long_time);
+ await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"om3e55n1")]`)), long_time);
  return true;
  } catch (e) {
  console.error('FB登入失敗');
@@ -69,7 +69,7 @@ async function getTrace (driver, fan_page_name) {
  let is_accurate = true;//Whether the follower count is exact
  try {
  //Each fan page shows the follower count in a different place, so grab every matching element and then analyze the text
- const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"lrazzd5p")]`)), short_time);
+ const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"g4qalytl")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('位追蹤者')) { // New-style display
@@ -82,11 +82,13 @@ async function getTrace (driver, fan_page_name) {
  }
  break;
  } else if (fb_text.includes('個讚')) {
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.
+ substr(0, fb_text.indexOf('個讚')). // First drop the text from the marker onward
+ replace(/\D/g, ''); // Keep digits only
  }
  }
  if (fb_trace === null) {
- const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"b1v8xokw")]`)), short_time);
+ const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"pbevjfx6")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles2) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('人在追蹤')) { // Classic-style display
10 changes: 6 additions & 4 deletions ch18/tools/crawlerFB.js
@@ -58,7 +58,7 @@ async function loginFacebook (driver) {

  //Login has to wait for the server response; jumping straight to the fan page would make the login fail
  //Use an element that only exists after login to check whether we are logged in
- await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"oajrlxb2")]`)), long_time);
+ await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"om3e55n1")]`)), long_time);
  return true;
  } catch (e) {
  console.error('FB登入失敗');
@@ -73,7 +73,7 @@ async function getTrace (driver, fan_page_name) {
  let is_accurate = true;//Whether the follower count is exact
  try {
  //Each fan page shows the follower count in a different place, so grab every matching element and then analyze the text
- const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"lrazzd5p")]`)), short_time);
+ const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"g4qalytl")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('位追蹤者')) { // New-style display
@@ -86,11 +86,13 @@ async function getTrace (driver, fan_page_name) {
  }
  break;
  } else if (fb_text.includes('個讚')) {
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.
+ substr(0, fb_text.indexOf('個讚')). // First drop the text from the marker onward
+ replace(/\D/g, ''); // Keep digits only
  }
  }
  if (fb_trace === null) {
- const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"b1v8xokw")]`)), short_time);
+ const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"pbevjfx6")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles2) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('人在追蹤')) { // Classic-style display
10 changes: 6 additions & 4 deletions ch19/tools/crawlerFB.js
@@ -58,7 +58,7 @@ async function loginFacebook (driver) {

  //Login has to wait for the server response; jumping straight to the fan page would make the login fail
  //Use an element that only exists after login to check whether we are logged in
- await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"oajrlxb2")]`)), long_time);
+ await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"om3e55n1")]`)), long_time);
  return true;
  } catch (e) {
  console.error('FB登入失敗');
@@ -73,7 +73,7 @@ async function getTrace (driver, fan_page_name) {
  let is_accurate = true;//Whether the follower count is exact
  try {
  //Each fan page shows the follower count in a different place, so grab every matching element and then analyze the text
- const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"lrazzd5p")]`)), short_time);
+ const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"g4qalytl")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('位追蹤者')) { // New-style display
@@ -86,11 +86,13 @@ async function getTrace (driver, fan_page_name) {
  }
  break;
  } else if (fb_text.includes('個讚')) {
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.
+ substr(0, fb_text.indexOf('個讚')). // First drop the text from the marker onward
+ replace(/\D/g, ''); // Keep digits only
  }
  }
  if (fb_trace === null) {
- const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"b1v8xokw")]`)), short_time);
+ const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"pbevjfx6")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles2) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('人在追蹤')) { // Classic-style display
10 changes: 6 additions & 4 deletions ch21/tools/crawlerFB.js
@@ -58,7 +58,7 @@ async function loginFacebook (driver) {

  //Login has to wait for the server response; jumping straight to the fan page would make the login fail
  //Use an element that only exists after login to check whether we are logged in
- await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"oajrlxb2")]`)), long_time);
+ await driver.wait(until.elementLocated(By.xpath(`//*[contains(@class,"om3e55n1")]`)), long_time);
  return true;
  } catch (e) {
  console.error('FB登入失敗');
@@ -73,7 +73,7 @@ async function getTrace (driver, fan_page_name) {
  let is_accurate = true;//Whether the follower count is exact
  try {
  //Each fan page shows the follower count in a different place, so grab every matching element and then analyze the text
- const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"lrazzd5p")]`)), short_time);
+ const fb_trace_eles = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"g4qalytl")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('位追蹤者')) { // New-style display
@@ -86,11 +86,13 @@ async function getTrace (driver, fan_page_name) {
  }
  break;
  } else if (fb_text.includes('個讚')) {
- fb_trace = fb_text.replace(/\D/g, ''); // Keep digits only
+ fb_trace = fb_text.
+ substr(0, fb_text.indexOf('個讚')). // First drop the text from the marker onward
+ replace(/\D/g, ''); // Keep digits only
  }
  }
  if (fb_trace === null) {
- const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"b1v8xokw")]`)), short_time);
+ const fb_trace_eles2 = await driver.wait(until.elementsLocated(By.xpath(`//*[contains(@class,"pbevjfx6")]`)), short_time);
  for (const fb_trace_ele of fb_trace_eles2) {
  const fb_text = await fb_trace_ele.getText();
  if (fb_text.includes('人在追蹤')) { // Classic-style display
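The three XPath expressions touched in each file share one shape: match any element whose `class` attribute contains a given token. A minimal sketch of that shape follows. The helper `xpathContainsClass` and the key names in `FB_CLASSES` are hypothetical, and only the class values come from this commit.

```javascript
// Hypothetical helper: builds the same XPath shape used throughout the diff.
function xpathContainsClass(className) {
  return `//*[contains(@class,"${className}")]`;
}

// Class values introduced by this commit; the key names are made up here.
const FB_CLASSES = {
  loggedIn: 'om3e55n1',     // element that only exists after login
  traceNew: 'g4qalytl',     // candidates for the new-style follower text
  traceClassic: 'pbevjfx6', // candidates for the classic-style follower text
};

console.log(xpathContainsClass(FB_CLASSES.loggedIn));
// prints: //*[contains(@class,"om3e55n1")]
```

XPath `contains()` is a substring test, so the expression also matches elements whose class list merely includes a longer name containing the token. That is part of why the crawler filters the located elements by their text afterwards.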