PullMd Tool Usage

Overview

This guide shows how to use the pull_md tool within the LangChain framework to convert the page at a given URL into Markdown. pull_md can handle web pages built with JavaScript frameworks such as React, Angular, and Vue.js. Rendering is done by the pull.md service, so fully rendered Markdown is returned without spending local resources on rendering.

Setup

First, install the necessary packages using pip:

%pip install -qU pull-md langchain-community

Instantiation

Import the necessary classes, then instantiate the PullMdAPIWrapper utility and the PullMdQueryRun tool:

from langchain_community.utilities import PullMdAPIWrapper
from langchain_community.tools import PullMdQueryRun

# Instantiate the API Wrapper
pull_md_wrapper = PullMdAPIWrapper()

# Instantiate the Query Run Tool
pull_md_tool = PullMdQueryRun()

Invocation

Call PullMdAPIWrapper directly to convert a URL to Markdown, or invoke the PullMdQueryRun tool within LangChain, which handles the conversion internally.

# Using PullMdAPIWrapper to convert a URL to Markdown
markdown = pull_md_wrapper.convert_url_to_markdown("http://example.com")
print(markdown)

# Example Domain
This domain is established to be used for illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.
More information...

# Using PullMdQueryRun Tool to convert a URL to Markdown
markdown_tool = pull_md_tool.invoke({"url": "http://example.com"})
print(markdown_tool)

# Example Domain
This domain is established to be used for illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.
More information...
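
Both calls above depend on a remote service, so in practice it may help to validate the URL and guard against failures before relying on the result. The following is a minimal sketch of that idea, not part of the pull_md API: `convert_or_none` and `fake_convert` are hypothetical names, and because this guide does not document which exceptions the wrapper raises, the sketch catches `Exception` broadly as an assumption.

```python
from typing import Callable, Optional
from urllib.parse import urlparse


def convert_or_none(convert: Callable[[str], str], url: str) -> Optional[str]:
    """Return Markdown for `url`, or None if the URL is invalid or conversion fails."""
    parsed = urlparse(url)
    # Only absolute http(s) URLs with a host are passed on to the converter.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    try:
        return convert(url)
    except Exception as exc:  # assumption: the wrapper's exception types are not documented here
        print(f"Conversion failed for {url}: {exc}")
        return None


# Hypothetical stand-in converter so the sketch runs without network access.
def fake_convert(url: str) -> str:
    return f"# Markdown for {url}"


print(convert_or_none(fake_convert, "http://example.com"))
print(convert_or_none(fake_convert, "not-a-url"))
```

With the real wrapper, `pull_md_wrapper.convert_url_to_markdown` could be passed as the `convert` argument in place of `fake_convert`.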

API Reference

PullMdAPIWrapper

PullMdAPIWrapper provides a straightforward interface for converting URLs to Markdown. It relies on the pull.md service to render JavaScript-driven pages, so the rendering work is offloaded from the local machine.

Methods

  • convert_url_to_markdown(url: str) -> str: Converts the provided URL into Markdown format. Supports dynamic content from frameworks like React, Angular, and Vue.js.
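
Because the wrapper exposes a single method, code that consumes it can be written against that signature alone, which makes it easy to substitute a test double. The sketch below assumes only the `convert_url_to_markdown(url: str) -> str` signature listed above; `MarkdownConverter`, `FakeConverter`, and `first_heading` are hypothetical names introduced for illustration.

```python
from typing import Dict, Protocol


class MarkdownConverter(Protocol):
    """Anything exposing the wrapper's single documented method."""

    def convert_url_to_markdown(self, url: str) -> str: ...


class FakeConverter:
    """Hypothetical test double: returns canned Markdown instead of calling pull.md."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self.pages = pages

    def convert_url_to_markdown(self, url: str) -> str:
        return self.pages[url]


def first_heading(converter: MarkdownConverter, url: str) -> str:
    """Return the text of the first level-1 Markdown heading, or '' if there is none."""
    for line in converter.convert_url_to_markdown(url).splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return ""


fake = FakeConverter({"http://example.com": "# Example Domain\nMore information..."})
print(first_heading(fake, "http://example.com"))  # Example Domain
```

A PullMdAPIWrapper instance should satisfy the same protocol structurally, so `first_heading` could accept it unchanged.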

PullMdQueryRun

Documentation is available in the PullMdQueryRun API Reference.

You can pass the URL directly to the tool, which handles the conversion internally. This simplifies integration into larger workflows.
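
As one illustration of such a workflow, the sketch below converts a batch of URLs while skipping duplicates. It assumes only the `invoke({"url": ...})` call shape shown in the Invocation section; `pull_many` and `StubTool` are hypothetical names, and the stub stands in for the real tool so the example runs offline.

```python
from typing import Any, Dict, Iterable


def pull_many(tool: Any, urls: Iterable[str]) -> Dict[str, str]:
    """Invoke the tool once per unique URL and map each URL to its Markdown."""
    results: Dict[str, str] = {}
    for url in urls:
        if url not in results:  # skip duplicates to avoid repeated requests
            results[url] = tool.invoke({"url": url})
    return results


class StubTool:
    """Hypothetical stand-in exposing the same invoke({"url": ...}) call shape."""

    def __init__(self) -> None:
        self.calls = 0

    def invoke(self, tool_input: Dict[str, str]) -> str:
        self.calls += 1
        return f"# Page at {tool_input['url']}"


stub = StubTool()
pages = pull_many(
    stub,
    ["http://example.com", "http://example.org", "http://example.com"],
)
print(len(pages), stub.calls)
```

Passing `pull_md_tool` instead of `stub` would route each unique URL through the real conversion.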
