Commit

update to docs
Filimoa committed Apr 9, 2024
1 parent 07ab5a2 commit 9167f75
Showing 5 changed files with 116 additions and 9 deletions.
3 changes: 1 addition & 2 deletions .github/workflows/publish-docs.yml
@@ -7,7 +7,6 @@ on:
permissions:
contents: write
jobs:

deploy:
runs-on: ubuntu-latest
steps:
@@ -26,5 +25,5 @@ jobs:
path: .cache
restore-keys: |
mkdocs-material-
- run: pip install mkdocs-material
- run: pip install -r requirements-docs.txt
- run: mkdocs gh-deploy --force
43 changes: 41 additions & 2 deletions docs/index.md
@@ -20,6 +20,8 @@ Open Parse is designed to fill this gap by providing a flexible, easy-to-use lib

## Quick Start

## Basic Example

```python
import openparse

@@ -31,8 +33,45 @@ for node in parsed_basic_doc.nodes:
print(node)
```

**📓 Try the sample notebook** <a href="https://colab.research.google.com/drive/1Z5B5gsnmhFKEFL-5yYIcoox7-jQao8Ep?usp=sharing" class="external-link" target="_blank">here</a>


## Semantic Processing Example

Chunking documents is fundamentally about grouping semantically similar nodes. By embedding the text of each node, we can cluster nodes based on their similarity.

```python
from openparse import processing, DocumentParser

semantic_pipeline = processing.SemanticIngestionPipeline(
openai_api_key=OPEN_AI_KEY,
model="text-embedding-3-large",
min_tokens=64,
max_tokens=1024,
)
parser = DocumentParser(
processing_pipeline=semantic_pipeline,
)
parsed_content = parser.parse(basic_doc_path)
```
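
To make the idea of grouping nodes by embedding similarity concrete, here is a minimal sketch. It is not Open Parse's implementation: the `cosine_similarity` and `group_adjacent` helpers, the `threshold` value, and the two-dimensional toy vectors are all assumptions made for illustration, and a greedy merge of neighbouring nodes stands in for whatever clustering the pipeline actually performs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_adjacent(embeddings, threshold=0.8):
    """Greedily merge consecutive nodes whose embeddings are similar.

    Returns a list of groups, each a list of node indices.
    (Hypothetical helper, not part of the openparse API.)
    """
    if not embeddings:
        return []
    groups = [[0]]
    for i in range(1, len(embeddings)):
        if cosine_similarity(embeddings[i - 1], embeddings[i]) >= threshold:
            groups[-1].append(i)  # similar to the previous node: same chunk
        else:
            groups.append([i])  # dissimilar: start a new chunk
    return groups


# Toy embeddings standing in for real model output (assumed values).
toy_embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
]
groups = group_adjacent(toy_embeddings)
print(groups)
```

With these toy vectors the first two nodes end up in one group and the last two in another. A real pipeline would presumably also apply token limits such as the `min_tokens` and `max_tokens` settings shown in the example before this sketch.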

**📓 Sample notebook** <a href="https://github.com/Filimoa/open-parse/blob/main/src/cookbooks/semantic_processing.ipynb" class="external-link" target="_blank">here</a>

<br>

**📓 Try the sample notebook** <a href="https://colab.research.google.com/drive/1Z5B5gsnmhFKEFL-5yYIcoox7-jQao8Ep?usp=sharing" class="external-link" target="_blank">here</a>

<br><br>

## Cookbooks

https://github.com/Filimoa/open-parse/tree/main/src/cookbooks


## Sponsors

<!-- sponsors -->

<a href="https://www.data.threesigma.ai/filings-ai" target="_blank" title="Three Sigma: AI for insurance filings."><img src="https://sergey-filimonov.nyc3.digitaloceanspaces.com/open-parse/marketing/three-sigma-wide.png" width="250"></a>

<!-- /sponsors -->

Does your use case need something special? Reach [out](https://www.linkedin.com/in/sergey-osu/).
70 changes: 67 additions & 3 deletions mkdocs.yml
@@ -1,18 +1,69 @@
site_name: Open Parse
site_author: Sergey Filimonov
repo_url: "https://github.com/Filimoa/open-parse/"
repo_name: "open-parse"
site_url: "https://github.com/Filimoa/open-parse/"
theme:
name: material
features:
- content.code.copy

palette:
- scheme: default
primary: black
accent: indigo
toggle:
icon: material/brightness-7
name: Switch to dark mode
- scheme: slate
primary: black
accent: indigo
toggle:
icon: material/brightness-4
name: Switch to light mode
font:
text: Roboto
code: Roboto Mono
markdown_extensions:
- abbr
- admonition
- pymdownx.details
- attr_list
- def_list
- footnotes
- md_in_html
- toc:
permalink: true
- pymdownx.arithmatex:
generic: true
- pymdownx.betterem:
smart_enable: all
- pymdownx.caret
- pymdownx.details
- pymdownx.emoji:
emoji_generator: !!python/name:material.extensions.emoji.to_svg
emoji_index: !!python/name:material.extensions.emoji.twemoji
- pymdownx.highlight:
anchor_linenums: true
line_spans: __span
pygments_lang_class: true
- pymdownx.inlinehilite
- pymdownx.snippets
- pymdownx.superfences
- pymdownx.keys
- pymdownx.mark
- pymdownx.smartsymbols
- pymdownx.snippets:
auto_append:
- includes/mkdocs.md
- pymdownx.superfences:
custom_fences:
- name: mermaid
class: mermaid
format: !!python/name:pymdownx.superfences.fence_code_format
- pymdownx.tabbed:
alternate_style: true
combine_header_slug: true
- pymdownx.tasklist:
custom_checkbox: true

nav:
- Home: index.md
- Parsing Text:
@@ -28,3 +79,16 @@ nav:
- Customization: processing/customization.md
- Serializing Results: serialization.md
- Visualization: visualization.md

plugins:
- search:
separator: '[\s\u200b\-_,:!=\[\]()"`/]+|\.(?!\d)|&[lg]t;|(?!\b)(?=[A-Z][a-z])'
- minify:
minify_html: true
- mkdocstrings:
handlers:
python:
options:
members_order: alphabetical
allow_inspection: true
show_bases: true
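
The `separator` pattern in the search plugin configuration determines how text is split into search tokens. The sketch below approximates its effect with Python's `re` module; the search itself runs in JavaScript, so treating the two regex engines as equivalent for this pattern is an assumption, and `tokenize` is a hypothetical helper written only for this illustration.

```python
import re

# Same pattern as the `separator` value in mkdocs.yml.
SEPARATOR = r'[\s\u200b\-_,:!=\[\]()"`/]+|\.(?!\d)|&[lg]t;|(?!\b)(?=[A-Z][a-z])'


def tokenize(text):
    """Split text on the separator pattern and drop empty tokens."""
    return [token for token in re.split(SEPARATOR, text) if token]


# CamelCase identifiers are split at each capitalised word, and a dot
# separates tokens unless a digit follows it.
print(tokenize("DocumentParser.parse"))
print(tokenize("version 1.2"))
print(tokenize("min_tokens=64"))
```

The last alternative in the pattern is a zero-width match, which is what lets a query for "Parser" find `DocumentParser`.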
3 changes: 1 addition & 2 deletions requirements-dev.txt
@@ -1,4 +1,5 @@
-r requirements.txt
-r requirements-docs.txt
pytest
ruff
mypy
@@ -8,5 +9,3 @@ beautifulsoup4
twine
packaging
wheel
mkdocs-material
mkdocs-material-extensions
6 changes: 6 additions & 0 deletions requirements-docs.txt
@@ -0,0 +1,6 @@
mkdocs-material
mkdocs-material-extensions
mkdocstrings-python
mkdocs-jupyter
pymdown-extensions
mkdocs-minify-plugin
