A data extraction tool that converts PDFs to Markdown and JSON
Scrape website content and generate sitemaps
Convert PDFs to a Hugging Face dataset
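
As a rough illustration of the sitemap generation mentioned above, here is a minimal sketch using only the Python standard library. It assumes the page URLs have already been collected by a scraper. The function name `build_sitemap` and the example URLs are hypothetical and do not come from any of the tools listed here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML string listing the given page URLs (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):  # drop duplicates and keep a stable order
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical URLs standing in for pages found while scraping.
pages = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

A real crawler would also need to fetch pages, follow links, and respect `robots.txt`; this sketch covers only the final serialization step.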