Support Deno runtime #1913

Open
tigitz opened this issue May 12, 2023 · 6 comments
Labels
feature: Issues that represent new features or improvements to existing features.
t-tooling: Issues with this label are in the ownership of the tooling team.

Comments

@tigitz
tigitz commented May 12, 2023

Which package is the feature request for? If unsure which one to select, leave blank

crawlee

Feature

I would like to use Deno for my crawling projects.

Motivation

Deno's features and a comparison with Node.js can be found here: https://deno.com/runtime

Ideal solution or implementation, and any additional constraints

Add some tests in CI to make sure the package can run on the Deno runtime.

Alternative solutions or implementations

No response

Other context

Here is the list of Deno issues discovered so far that block support:

@tigitz added the feature label May 12, 2023
@B4nan
Member

B4nan commented May 31, 2023

I see the unref issue you created on the Deno side is now resolved. Was that the only problem?

FWIW we will most probably switch to native ESM in the next major.

@tigitz
Author

tigitz commented Jun 1, 2023

@B4nan I've since updated the list of issues I've opened in the Deno project.

With a manual patch, it's now possible to run the current simple usage example:

// Add import of CheerioCrawler
import { RequestQueue, CheerioCrawler } from 'crawlee';

const requestQueue = await RequestQueue.open();
await requestQueue.addRequest({ url: 'https://crawlee.dev' });

// Create the crawler and add the queue with our URL
// and a request handler to process the page.
const crawler = new CheerioCrawler({
    requestQueue,
    // The `$` argument is the Cheerio object
    // which contains parsed HTML of the website.
    async requestHandler({ $, request }) {
        // Extract <title> text with Cheerio.
        // See Cheerio documentation for API docs.
        const title = $('title').text();
        console.log(`The title of "${request.url}" is: ${title}.`);
    },
});

// Start the crawler and wait for it to finish
await crawler.run();

However, I've since tested it on my own large project and ran into some issues: enqueueLinks does not queue anything, and processing time is roughly 10x higher, which is definitely not the improvement I expected.

I plan to redo some tests and gather as much info as my expertise allows me to before sharing those findings.
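
For comparing the same crawl under Node.js and Deno, a small timing wrapper may help. The following is only a sketch: `detectRuntime` and `timeRun` are hypothetical helpers, not part of crawlee, and they measure wall-clock time only, so they would show a slowdown without explaining it.

```typescript
// Hypothetical helpers (not part of crawlee): label a run with the runtime
// it executed on and measure how long an async task takes.
type RunReport = { runtime: string; elapsedMs: number };

function detectRuntime(): string {
    const g = globalThis as any;
    // Check Deno first: its Node compatibility layer also exposes `process`.
    if (g.Deno?.version?.deno) return `deno ${g.Deno.version.deno}`;
    if (g.process?.versions?.node) return `node ${g.process.versions.node}`;
    return 'unknown';
}

async function timeRun(task: () => Promise<unknown>): Promise<RunReport> {
    const start = Date.now();
    await task();
    // Millisecond resolution is enough for runs that take seconds or more.
    return { runtime: detectRuntime(), elapsedMs: Date.now() - start };
}
```

With the example above, this could be used as `const report = await timeRun(() => crawler.run());` and the resulting `report` logged once per runtime.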

@lloydjatkinson

I'd also like to see Deno support! If Crawlee ships native ESM I imagine that would be beneficial too.

@mtrunkat added the t-tooling label Jul 18, 2023
@maheshbansod

Would love to use Crawlee in my Deno project! Is this on the project timeline?

@pkoretic

Deno 2.x seems to work fine with the latest version of the crawlee npm package.

@vikyw89

vikyw89 commented Dec 12, 2024

Didn't work on Deno 2.

7 participants