To start using Unstructured right away, skip ahead to the UI quickstart or API quickstart now!
What is Unstructured?
Unstructured provides a platform and tools to ingest and process unstructured documents for retrieval-augmented generation (RAG) and agentic AI. This 60-second video describes what Unstructured does and its benefits (no sound):
This 40-second video demonstrates a simple use case that Unstructured helps solve (no sound):
This 60-second video shows why using Unstructured is preferable to building your own similar solution:
You can use Unstructured through a user interface (UI), an API, or both. Read on to learn more.
Unstructured UI quickstart
If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your new Unstructured Starter account, at https://platform.unstructured.io. To get started, follow these steps:
- After you are signed in, the Start page appears.
- In the Welcome area, do one of the following:
  - Click one of the sample files, such as realestate.pdf, to have Unstructured parse and transform that sample file.
  - Click Browse files, and then browse to and select one of your own files, to have Unstructured parse and transform it.
- After Unstructured has finished parsing and transforming the file (a process known as partitioning), you will see the file’s contents in the center and Unstructured’s results on the right.
- The view on the right shows a formatted view of Unstructured’s results, which is designed for human readability. To see the underlying JSON view of the results, which is designed for RAG and agentic AI, click JSON at the top of the view on the right side of the screen. Learn about what’s in the JSON view.
- To download the results as a local JSON file, click the download icon to the left of the Formatted and JSON buttons.
- To have Unstructured partition a different file, click Add new file on the left, and then browse to and select the target file.
- To view the results for a file that was previously partitioned during this session, click the file’s name in the Recent files list on the left.
- To return to the Start page, click the X (close) button at the top left of the page, next to Transform.
- To have Unstructured do more than just partitioning, such as chunking, enriching, and embedding, click Edit in Workflow Editor at the top right of the page, or skip over to the walkthrough.
- To get an associated code snippet that you can use to have Unstructured parse and transform a file programmatically instead of by using the Unstructured user interface, click the down arrow next to Copy curl command at the top right of the page, and then do one of the following:
  - Click Show options to see the associated curl, Unstructured Python SDK, and Unstructured JavaScript/TypeScript SDK code snippets. Then do one of the following:
    - Click the Copy icon in the top right corner to copy the active code snippet to your system’s clipboard.
    - Click My API keys to get your Unstructured API key, which is necessary when calling Unstructured programmatically.
    - Click API Documentation to learn how to set up, customize, and run the code.
  - Click Copy curl command to copy the curl code snippet to your system’s clipboard without viewing the code snippet first.
  - Click Copy Python SDK code to copy the Unstructured Python SDK code snippet to your system’s clipboard without viewing the code snippet first.
  - Click Copy JavaScript code to copy the Unstructured JavaScript/TypeScript SDK code snippet to your system’s clipboard without viewing the code snippet first.
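After you download the results as a local JSON file, a few lines of Python are enough to inspect them. The following is a minimal sketch, not an official example. It assumes the file holds a JSON array of element objects, each with a `type` and a `text` field. Those field names, the `summarize_elements` helper, and the sample data are assumptions for illustration, so check them against the JSON view of your own results.

```python
import json
from collections import Counter


def summarize_elements(elements):
    """Count elements by their "type" field and collect non-empty text."""
    counts = Counter(el.get("type", "Unknown") for el in elements)
    texts = [el["text"] for el in elements if el.get("text")]
    return counts, texts


# Hypothetical sample standing in for a downloaded results file.
# For a real file, load it instead with:
#     with open("results.json") as f:
#         elements = json.load(f)
sample_json = """
[
  {"type": "Title", "text": "Property Listing"},
  {"type": "NarrativeText", "text": "A three-bedroom home near the park."},
  {"type": "NarrativeText", "text": "Recently renovated kitchen."}
]
"""

elements = json.loads(sample_json)
counts, texts = summarize_elements(elements)
print(dict(counts))       # number of elements per type
print("\n".join(texts))   # the extracted text, one element per line
```

Grouping by element type is a quick way to check whether titles, body text, and tables were detected as expected before you send the results on to a RAG pipeline.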
Unstructured API quickstart
- If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your Unstructured Starter account, at https://platform.unstructured.io.
- Watch the following 3-minute video:
Run this quickstart as a notebook on Google Colab instead.
Get the sample code for this video.
Get the full setup instructions for this video.
Learn more.
Pricing
Unstructured offers several account types with different pricing plans:
- Starter - A single user, with a single workspace, hosted alongside other accounts on Unstructured’s cloud infrastructure.
- Team - Multiple users and workspaces, hosted alongside other accounts on Unstructured’s cloud infrastructure.
- Enterprise - Multiple users and workspaces, isolated from all other accounts, with two hosting options for additional security and control:
  - Dedicated instance - Hosted within a virtual private cloud (VPC) running inside Unstructured’s cloud infrastructure.
  - In-VPC - Hosted within your own VPC on your own cloud infrastructure.
Unstructured calculates the number of pages as follows:
- For these file types, a page is a page, slide, or image: .pdf, .pptx, and .tiff.
- For .docx files that have page metadata, Unstructured calculates the number of pages based on that metadata.
- For all other file types, Unstructured calculates the number of pages as the file’s size divided by 100 KB.
- For non-file data, Unstructured calculates a page as 100 KB of incoming data to be processed.
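The size-based rule above can be sketched as a short calculation. This is a rough estimate under stated assumptions, not Unstructured's billing logic: the text gives only "size divided by 100 KB", so treating 1 KB as 1,000 bytes and rounding up to a whole page are both assumptions, and `estimate_pages` is a hypothetical helper name.

```python
import math

# Assumption: 1 KB = 1,000 bytes. The source does not say whether KB means 1,000 or 1,024 bytes.
PAGE_SIZE_BYTES = 100 * 1000


def estimate_pages(size_bytes: int) -> int:
    """Estimate pages as size / 100 KB, rounded up (the rounding is an assumption)."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return math.ceil(size_bytes / PAGE_SIZE_BYTES)


# 250,000 bytes of incoming data, under the assumptions above.
print(estimate_pages(250_000))
```

This applies only to non-file data and to file types that are not counted by page, slide, image, or page metadata.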
Questions? Need help?
- For general questions about Unstructured products and pricing, email Unstructured Sales at sales@unstructured.io.
- For technical support for Unstructured accounts, email Unstructured Support at support@unstructured.io.
- For technical support for the Unstructured open source library, use our Slack community.