Add ADaaS documentation. #64

Merged
merged 43 commits into from
Jul 24, 2024
Commits
abdb35b
Add ADaaS documentation.
samod Jul 9, 2024
b420bd6
Update fern/versions/public.yml
samod Jul 10, 2024
40aeee6
Update fern/versions/public.yml
samod Jul 10, 2024
0cb1eeb
Update fern/docs/pages/references/adaas/extracting_metadata.mdx
samod Jul 10, 2024
626a694
Update fern/docs/pages/references/adaas/adaas_overview.mdx
samod Jul 10, 2024
53f4ee5
Update fern/docs/pages/references/adaas/extracting_attachments.mdx
samod Jul 10, 2024
698f054
Update fern/docs/pages/references/adaas/adaas_overview.mdx
samod Jul 10, 2024
9c9069e
Move adaas/ up one level. Use hyphens in file names. Remove duplicate…
samod Jul 10, 2024
ac67db1
Modify titles to sentence case.
samod Jul 10, 2024
123a77b
Add Keyrings and Import sections to Concepts page.
samod Jul 15, 2024
21c2dea
Add Keyrings and Import sections to Concepts page.
samod Jul 15, 2024
902c913
Migrate entities definitions to separate pages.
samod Jul 15, 2024
5073650
Lowercase snap-ins. Remove terminology.
samod Jul 18, 2024
9db8cd8
Add Extract metadata update. Fix of capitalized terminology.
samod Jul 18, 2024
e89cbfb
Add delete phases. Replace Worker with snap-in.
samod Jul 18, 2024
ca048a8
Update fern/docs/pages/adaas/overview.mdx
samod Jul 18, 2024
940e2b9
Code review fixes.
samod Jul 18, 2024
7fde58b
Update fern/docs/pages/adaas/overview.mdx
samod Jul 22, 2024
71e116f
Update fern/docs/pages/adaas/overview.mdx
samod Jul 22, 2024
257d901
Update fern/docs/pages/adaas/extracting-attachments.mdx
samod Jul 22, 2024
a42552f
Update fern/docs/pages/adaas/extracting-attachments.mdx
samod Jul 22, 2024
8a92b12
Update fern/docs/pages/adaas/delete-phases.mdx
samod Jul 22, 2024
a0c203b
Update fern/versions/public.yml
samod Jul 22, 2024
f0e56e4
Remove latin abbrevations.
samod Jul 22, 2024
79f51de
Remove cluttering with EVENT_TYPE.
samod Jul 22, 2024
87c9371
Fix sentence cases.
samod Jul 22, 2024
4f86538
Update fern/docs/pages/adaas/getting-started.mdx
samod Jul 22, 2024
a1728f8
Update fern/docs/pages/adaas/getting-started.mdx
samod Jul 22, 2024
15f1de2
Update fern/docs/pages/adaas/getting-started.mdx
samod Jul 22, 2024
d6e2cde
First term use in italics, pt. 1.
samod Jul 22, 2024
10040b6
Modify the order of Getting Started.
samod Jul 22, 2024
55da773
Numbered list on extraction phases.
samod Jul 22, 2024
686b317
Correct the Extract metadata. Replace with .
samod Jul 22, 2024
8becb0a
Replace Artifacts with artifacts.
samod Jul 22, 2024
5473d59
Fix Data extraction section.
samod Jul 22, 2024
ad09872
Fix ATtachments extraction section.
samod Jul 22, 2024
f5959ac
Remove commas from JSON snippet.
samod Jul 22, 2024
61007ef
Fix a typo.
samod Jul 22, 2024
df9a5eb
Image edit.
samod Jul 22, 2024
89b1e33
First uses of terms in italics, pt. 2.
samod Jul 22, 2024
6c73f00
Apply suggestions from code review
samod Jul 23, 2024
91bdf92
Code review fixes.
samod Jul 23, 2024
b67952f
Add chef-cli public repo.
samod Jul 24, 2024
42 changes: 42 additions & 0 deletions fern/docs/pages/references/adaas/adaas_overview.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
# Airdrop-as-a-Service (ADaaS)

Airdrop-as-a-Service (ADaaS) as a system consists of the internal Airdrop components and a Worker (Extractor or Loader),
Collaborator
No cap on entities/objects, which are common nouns. https://www.notion.so/devrev/Should-a-product-or-feature-name-be-capitalized-20088da3e4074b18b0bec42555522c3a?pvs=4

Suggested change
Airdrop-as-a-Service (ADaaS) as a system consists of the internal Airdrop components and a Worker (Extractor or Loader),
Airdrop-as-a-Service (ADaaS) as a system consists of the internal Airdrop components and a worker (extractor or loader),

which is a Snap-In with a predefined structure.
These Workers (Extractors and Loaders) can be built by anyone, not just DevRevelers.

Knowing how the system works is still useful, so this section explains ADaaS at a high level.
samod marked this conversation as resolved.
Message protocols and other specifics can be found in the Extractor and Loader sections.

It’s also useful to know that Airdrop uses its own AWS S3 structure
samod marked this conversation as resolved.
(and its own buckets) and that only Airdrop components have access to it.
This means that Workers can’t access these buckets directly,
which is why all their data has to be uploaded to DevRev through S3Interact.

In the diagram below, the Artifact API and Airdrop are shown as separate entities for clarity,
Collaborator
I don't understand this paragraph.

even though both are accessed through DevRev’s API.
The Worker is shown outside DevRev to make clear that it is developed by third parties,
even though it runs on DevRev’s infrastructure.

```mermaid
Collaborator
Fern does not support Mermaid diagrams. You can go to mermaid.live and export a PNG for inclusion.

Contributor Author
Can we use SVG on fern docs?

---
title: ADaaS architecture high-level overview
---
graph TB
externalSystem[External system]
worker([Worker])
s3interact[s3interact]
airdrop[Airdrop]
s3[(AWS S3)]

externalSystem <-- Get/create users, issues, ... --> worker

worker <-- Exchange artifact upload/download URLs --> s3interact
worker <-- Upload/download artifacts --> s3
worker <-- ADaaS messages and REST API --> airdrop


subgraph DevRev
s3interact <-- Prepare upload/download URL --> s3
airdrop <-- Download/upload artifacts --> s3
end
```
10 changes: 10 additions & 0 deletions fern/docs/pages/references/adaas/creating_a_keyring.mdx
@@ -0,0 +1,10 @@
# Creating Keyrings
Collaborator
Too short to be its own page. Combine it with another page, maybe "Getting Started".


Keyrings are a DevRev-specific mechanism for managing authentication for External Systems.
Collaborator
Suggested change
Keyrings are a DevRev-specific mechanism for managing authentication for External Systems.
Keyrings are a DevRev-specific mechanism for managing authentication for external systems.

They are called Connections in the DevRev app.
Collaborator
Ok to capitalize and bold here since it's literal text found in the UI.

Suggested change
They are called Connections in the DevRev app.
They are called **Connections** in the DevRev app.


They provide a secure way to store and manage credentials within your DevRev Snap-In.
Collaborator
Suggested change
They provide a secure way to store and manage credentials within your DevRev Snap-In.
They provide a secure way to store and manage credentials within your DevRev snap-In.

Contributor Author
Are snap-ins always lowercase? Since it is a sort of a brand name for DevRev?

Contributor Author
I have modified cases for snap-ins.

This eliminates the need to expose sensitive information like passwords
or access tokens directly within your code or configuration files, enhancing overall security.

Read more about Keyrings in the [developer documentation](https://developer.devrev.ai/snapin-development/references/keyrings).
39 changes: 39 additions & 0 deletions fern/docs/pages/references/adaas/extracting_attachments.mdx
@@ -0,0 +1,39 @@
# Attachment Extraction phase
Collaborator
Sentence case on titles. https://www.notion.so/devrev/Which-words-in-a-heading-or-title-should-be-capitalized-81984d54dd414af5b343bce1327c05bd?pvs=4

Suggested change
# Attachment Extraction phase
# Attachment extraction phase


For the Attachment Extraction phase of the import process,
Collaborator
Suggested change
For the Attachment Extraction phase of the import process,
For the attachment extraction phase of the import process,

Contributor Author
We had a lengthy discussion regarding capitalizing terms and well, the Airdrop team argues, that capitalizing names of processes, domain objects, ... and other domain entities would make the documentation more clear.

Can we cross-link everything with the Terminology section?

Contributor Author
I have removed the Terminology section.

the Extractor has to upload each Attachment to DevRev’s S3 using S3Interact.
samod marked this conversation as resolved.

After uploading an attachment or a batch of attachments, the Extractor also has to prepare and upload an SSOR attachment file.
Collaborator
What is SSOR?

Contributor Author
I have tried to simplify and clarify this section. SSOR (meaning Source System of Record) is a term we would like to avoid in the future.

It should contain the DevRev IDs of the extracted attachments, along with the parent and actor IDs from the External System.
This needs to be done because only the Extractor knows which item each attachment was attached to in the External System.
The SSOR attachment file should then be sent just like any normal Artifact, but with the `ssor_attachment` item type.

## Examples

Here is an example of an SSOR attachment file. In each entry, `id.external` is the DON of the uploaded artifact, `parent_id.external` is the ID of the parent in the source system, and `actor_id.external` is the ID of the uploader in the source system:
```json
[
  {
    "id": {
      "external": "don:core:dvrv-us-1:devo/1:artifact/1"
    },
    "parent_id": {
      "external": "12345"
    },
    "actor_id": {
      "external": "123456"
    }
  },
  {
    "id": {
      "external": "don:core:dvrv-us-1:devo/1:artifact/2"
    },
    "parent_id": {
      "external": "12344"
    },
    "actor_id": {
      "external": "123457"
    }
  }
]
```
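
As a rough illustration of how an Extractor might assemble such a file, the sketch below builds the entries from a list of already uploaded attachments. It is plain Python and does not use `@devrev/adaas-sdk`; the `UploadedAttachment` type and the `build_ssor_attachment_file` function are hypothetical names introduced only for this example.

```python
import json
from typing import List, NamedTuple


class UploadedAttachment(NamedTuple):
    """Hypothetical record kept by the Extractor after each upload."""
    artifact_don: str        # DON returned by DevRev for the uploaded artifact
    parent_external_id: str  # ID of the parent item in the External System
    actor_external_id: str   # ID of the uploader in the External System


def build_ssor_attachment_file(uploaded: List[UploadedAttachment]) -> str:
    """Serialize uploaded attachments into the SSOR attachment file layout."""
    entries = [
        {
            "id": {"external": a.artifact_don},
            "parent_id": {"external": a.parent_external_id},
            "actor_id": {"external": a.actor_external_id},
        }
        for a in uploaded
    ]
    return json.dumps(entries, indent=2)
```

The resulting string would then be submitted like any other Artifact, with the `ssor_attachment` item type.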
15 changes: 15 additions & 0 deletions fern/docs/pages/references/adaas/extracting_data.mdx
@@ -0,0 +1,15 @@
# Data Extraction Phase
Collaborator
Too short to be its own page. Combine it with another page, maybe "Extraction phases".


In the data extraction phase, the Extractor is expected to call the External System’s APIs
to retrieve all the items that were updated since the start of the last extraction.
If there was no previous extraction (the current run is an Initial Import), then all the items should be extracted.
The Extractor should record the time at which each extraction started,
so that the next extraction run can extract only the items created or updated since that time.
The reason for remembering it at the start of the extraction, and not at the end, is to allow some overlap
in case the items are updated during extraction (which happens very often in bigger organizations).
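
The timestamp handling described here can be sketched roughly as follows. This is plain Python under stated assumptions, not SDK code: `plan_extraction`, `should_extract`, and `utc_now` are hypothetical helper names, and how the saved start time is persisted between runs is left out.

```python
from datetime import datetime, timezone
from typing import Optional


def utc_now() -> datetime:
    """Current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def plan_extraction(last_started_at: Optional[datetime], now: datetime) -> dict:
    """Decide the window for this run and what to remember for the next one.

    `last_started_at` is the start time saved by the previous run,
    or None when this run is an initial import.
    """
    return {
        # None means no lower bound: every item is extracted.
        "updated_since": last_started_at,
        # Save the START of this run, not its end, so that items modified
        # while the run is in progress are extracted again next time.
        "save_for_next_run": now,
    }


def should_extract(item_updated_at: datetime,
                   updated_since: Optional[datetime]) -> bool:
    """Return True if an item falls inside the extraction window."""
    return updated_since is None or item_updated_at >= updated_since
```

Because the window starts at the previous run's start time, some items may be extracted twice; that overlap is the intended trade-off for not missing updates made during a run.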

Each batch of extracted items (the recommended batch size is 2000-5000 items) must be formatted in JSONL
(JSON Lines format), gzipped, and submitted as an Artifact to S3Interact (with tooling from `@devrev/adaas-sdk`).

Each Artifact is submitted with an `item_type`, defining a separate domain object from the External System.
Item types defined when uploading extracted data must match those in the Initial Domain Mapping Artifact.
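
To make the batching and encoding steps concrete, here is a minimal sketch in plain Python. It only shows how items could be split into batches and turned into gzipped JSONL; the actual upload to S3Interact is done with the tooling from `@devrev/adaas-sdk` and is not shown. The function names `batches` and `to_gzipped_jsonl` are hypothetical.

```python
import gzip
import json
from typing import Iterable, Iterator, List


def batches(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` items."""
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Final, possibly smaller, batch.
        yield batch


def to_gzipped_jsonl(batch: List[dict]) -> bytes:
    """Encode one batch as JSON Lines (one JSON object per line), then gzip it."""
    lines = "".join(json.dumps(item) + "\n" for item in batch)
    return gzip.compress(lines.encode("utf-8"))
```

Each value returned by `to_gzipped_jsonl` would correspond to one Artifact, submitted together with the `item_type` of the items it contains.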
@@ -0,0 +1,27 @@
# Extract External Sync Units Phase
Collaborator
Too short to be its own page. Combine it with another page, maybe "Extraction phases".


In the External Sync Unit extraction phase, the Extractor is expected to list the projects, repositories,
or whatever the equivalent of those is called in the External System,
that it can extract with the provided credentials, and to send that list to Airdrop in its response.

The most important structure in the message is the list of External Sync Units (`event_data.external_sync_units` in JSON).
Each entry contains the following fields:
- ID (`id`): The unique identifier in the External System
- Name (`name`): The human-readable name in the External System
- Description (`description`): A short description, if the External System provides it
- Item count (`item_count`): The number of items (issues, tickets, comments, etc.) in the External System,
  if it can be obtained in a very lightweight manner, such as by calling a special API endpoint.
  If there is no such way to get it, that is, if the items would need to be extracted to count them,
  then the item count should be `-1`, to avoid blocking the import with long-running queries.

Example:
```json
[
  {
    "id": "a-microservice-repository",
    "name": "A Microservice Repository",
    "description": "Our greatest microservice repo",
    "item_count": 232
  }
]
```
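
The `item_count` rule above can be sketched as follows. This is a plain-Python illustration, not SDK code: `to_external_sync_unit` and its `cheap_count` parameter are hypothetical, and using an empty string when the External System has no description is an assumption made for this example.

```python
from typing import Callable, Optional


def to_external_sync_unit(
    project: dict,
    cheap_count: Optional[Callable[[dict], int]] = None,
) -> dict:
    """Map one project or repository from the External System to a sync unit.

    `cheap_count` stands for a lightweight way to count items, such as a
    dedicated API endpoint. When none exists, `item_count` is -1 instead of
    extracting every item just to count them.
    """
    return {
        "id": project["id"],
        "name": project["name"],
        # Assumption: fall back to an empty string if no description exists.
        "description": project.get("description", ""),
        "item_count": cheap_count(project) if cheap_count is not None else -1,
    }
```

For example, calling `to_external_sync_unit({"id": "a-microservice-repository", "name": "A Microservice Repository"})` without a counter produces an entry whose `item_count` is `-1`.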