Web Scraping for LLMs in 2024: Jina AI Reader API, Mendable Firecrawl, Crawl4AI, and More

In this video, we look at several tools for web scraping, both free and paid. Learn how to scrape data from web pages and PDFs using Beautiful Soup, the Reader API from Jina AI, and Firecrawl from Mendable. We also discuss more advanced web scraping solutions such as ScrapeGraphAI and Crawl4AI. Aimed at anyone building LLM applications, this video provides practical examples and code demonstrations. Subscribe for more tutorials on building LLM applications and tools!
#webscraping #llm #parsing
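For readers who want to try the Reader API before watching, here is a minimal sketch of how it is typically called: the target page URL is appended to the `https://r.jina.ai/` prefix, and the response is LLM-friendly text. This is not the notebook code from the video; the function names `reader_url` and `fetch_as_text`, the `User-Agent` string, and the timeout value are illustrative assumptions, and rate limits or API-key requirements may apply.

```python
from urllib.request import Request, urlopen

# Reader endpoint prefix: the target URL is appended directly after it.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Build the Reader URL for a page (hypothetical helper name)."""
    return READER_PREFIX + target_url


def fetch_as_text(target_url: str, timeout: float = 30.0) -> str:
    """Fetch a page through the Reader endpoint and return the response body.

    This performs a network request. The User-Agent header below is an
    assumption for illustration, not a documented requirement.
    """
    request = Request(
        reader_url(target_url),
        headers={"User-Agent": "reader-sketch/0.1"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_as_text("https://example.com")` should return the page content as text suitable for passing to an LLM; only `reader_url` runs without network access.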
🦾 Discord: / discord
☕ Buy me a Coffee: ko-fi.com/promptengineering
🔴 Patreon: / promptengineering
💼 Consulting: calendly.com/engineerprompt/c...
📧 Business Contact: engineerprompt@gmail.com
Become Member: tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
RAG Beyond Basics Course:
prompt-s-site.thinkific.com/c...
LINKS:
Notebook: tinyurl.com/5n8dcbj8
Reader API: jina.ai/reader/
FireCrawl: www.firecrawl.dev/
Crawl4AI: github.com/unclecode/crawl4ai
ScrapeGraphAI: github.com/VinciGit00/Scrapeg...
TIMESTAMPS
00:00 Introduction to Data Scraping Series
00:21 Challenges of Web Data
01:32 Overview of Web Scraping Tools
01:59 Example Web Pages for Scraping
03:05 BeautifulSoup: The Baseline Approach
05:05 Reader API: Jina AI
08:21 FireCrawl: An Alternative Tool
10:42 Crawl4AI and ScrapeGraphAI
12:13 Conclusion and Next Steps
All Interesting Videos:
Everything LangChain: • LangChain
Everything LLM: • Large Language Models
Everything Midjourney: • MidJourney Tutorials
AI Image Generation: • AI Image Generation Tu...
