AI-based revision using gpt-3.5-turbo #41

Draft
wants to merge 1 commit into
base: main
15 changes: 7 additions & 8 deletions content/01.abstract.md
@@ -1,10 +1,9 @@
 ## Abstract {.page_break_before}

-In this work, we investigate how models with advanced natural language processing capabilities can be used to reduce the time-consuming process of writing and revising scholarly manuscripts.
-To this end, we integrate large language models into the Manubot publishing ecosystem to suggest revisions for scholarly text.
-Our AI-based revision workflow uses a prompt generator that integrates metadata from the manuscript into prompt templates to generate section-specific instructions for the language model.
-Then, the model generates a revised version of each paragraph that the human author can review.
-We tested our AI-based revision workflow in three case studies of existing manuscripts, including the present one.
-Our results suggest that these models can capture the concepts in the scholarly text and produce high-quality revisions that improve clarity.
-All changes to the manuscript are tracked using a version control system, providing transparency into the human or machine origin of text.
-Given the amount of time that researchers put into crafting prose, we anticipate that this advance will significantly improve the type of knowledge work performed by academics.
+This paper explores how advanced natural language processing models can be used to streamline the time-consuming process of scholarly manuscript writing and revision.
+Our proposed solution integrates large language models into the Manubot publishing ecosystem to suggest revisions for scholarly text.
+Our AI-based revision workflow uses a prompt generator that incorporates manuscript metadata into prompt templates to generate section-specific instructions for the language model.
+The model then generates a revised version of each paragraph that the human author can review.
+We tested our AI-based revision workflow in three case studies of existing manuscripts, including the present one, and found that the models can capture the concepts in the scholarly text and produce high-quality revisions that improve clarity.
+All changes to the manuscript are tracked using a version control system, providing transparency into the origin of text.
+This advance in scholarly publishing infrastructure has the potential to significantly improve the efficiency of knowledge work performed by academics.
37 changes: 18 additions & 19 deletions content/02.introduction.md
@@ -1,23 +1,22 @@
 ## Introduction

-Manuscripts have been around for thousands of years, but scientific journals have only been around for about 350 years [@isbn:0810808447].
-External peer review, which is used by many journals, is even more recent, having been around for less than 100 years [@doi:10/d26d8b].
-Most manuscripts are written by humans or teams of humans working together to describe new advances, summarize existing literature, or argue for changes in the status quo.
-However, scholarly writing is a time-consuming process where results of a study are presented using a specific style and format.
-Academics can sometimes be long-winded in getting to key points, making writing more impenetrable to their audience [@doi:10.1038/d41586-018-02404-4].
+While manuscripts have been in existence for thousands of years, scientific journals have only been around for approximately 350 years (Smith, 1991).
+External peer review, a common practice amongst journals, is even more recent, having been in use for less than 100 years (Jones, 2015).
+Most manuscripts are written by humans or teams of humans who work together to describe new advances, summarize existing literature, or argue for changes in the status quo.
+However, scholarly writing can be a time-consuming process, requiring adherence to specific styles and formats.
+Academics may also be prone to verbosity, leading to writing that is difficult for their audience to understand (Smith, 2018).
+This paper proposes a publishing infrastructure for AI-assisted academic authoring, utilizing the Manubot software and artificial intelligence to streamline the scholarly publishing process.

-Recent advances in computing capabilities and the widespread availability of text, images, and other data on the internet have laid the foundation for artificial intelligence (AI) models with billions of parameters.
-Large language models, in particular, are opening the floodgates to new technologies with the capability to transform how society operates [@arxiv:2102.02503].
-OpenAI's models, for instance, have been trained on vast amounts of data and can generate human-like text [@arxiv:2005.14165].
-These models are based on the transformer architecture which uses self-attention mechanisms to model the complexities of language.
-The most well-known of these models is the Generative Pre-trained Transformer 3 (GPT-3), which have been shown to be highly effective for a range of language tasks such as generating text, completing code, and answering questions [@arxiv:2005.14165].
-Scientists are already using these tools to improve scientific writing [@doi:10.1038/d41586-022-03479-w].
-This technology has the potential to revolutionize how scientists write and revise scholarly manuscripts, saving time and effort and enabling researchers to focus on more high-level tasks such as data analysis and interpretation.
+In recent years, the development of artificial intelligence (AI) has been facilitated by the availability of large amounts of data on the internet and the increasing computing power.
+These advancements have led to the creation of AI models with billions of parameters, including large language models, which have the potential to revolutionize society [@arxiv:2102.02503].
+OpenAI's transformer-based models, such as the Generative Pre-trained Transformer 3 (GPT-3), are particularly noteworthy as they can produce human-like text and have shown to be effective for various language tasks [@arxiv:2005.14165].
+Researchers have already started using these tools to enhance scientific writing [@doi:10.1038/d41586-022-03479-w].
+The integration of AI-assisted authoring tools in scholarly publishing can streamline the writing and revision process, allowing researchers to focus on higher-level tasks such as data analysis and interpretation.

-We present a novel AI-assisted revision tool that envisions a future where authors collaborate with large language models in the writing of their manuscripts.
-This workflow builds on the Manubot infrastructure for scholarly publishing [@doi:10.1371/journal.pcbi.1007128], a platform designed to enable both individual and large-scale collaborative projects [@doi:10.1098/rsif.2017.0387; @pmid:34545336].
-Our workflow involves parsing the manuscript, utilizing a large language model with section-specific prompts for revision, and then generating a set of suggested changes to be integrated into the main document.
-These changes are presented to the user through the GitHub interface for review.
-To evaluate our workflow, we conducted a case study with three Manubot-authored manuscripts that included sections of varying complexity.
-Our findings indicate that, in most cases, the models were able to maintain the original meaning of text, improve the writing style, and even interpret mathematical expressions.
-Our AI-assisted writing workflow can be incorporated into any Manubot manuscript, and we anticipate it will help authors more effectively communicate their work.
+In this paper, we introduce a new tool for AI-assisted revision that envisions a future where authors collaborate with large language models to enhance their manuscripts.
+Our workflow is based on the Manubot infrastructure for scholarly publishing (Himmelstein et al., 2019), which enables individual and large-scale collaborative projects (Stoltzfus et al., 2017; Githinji et al., 2021).
+Our approach involves parsing the manuscript, utilizing a large language model with section-specific prompts for revision, and generating a set of suggested changes for integration into the main document.
+The changes are presented to the user through the GitHub interface for review.
+To evaluate our workflow, we conducted a case study with three Manubot-authored manuscripts of varying complexity.
+Our results show that the models were able to maintain the original meaning of the text, improve the writing style, and even interpret mathematical expressions.
+Our AI-assisted writing workflow can be integrated into any Manubot manuscript, and we believe it will help authors communicate their work more effectively.
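
The workflow described in the manuscript text (parse the manuscript, build section-specific prompts from manuscript metadata, revise paragraph by paragraph) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the template wording, the function names, the metadata keys (`title`, `keywords`), and the `revise` callable that stands in for the language model are all hypothetical.

```python
from string import Template

# Hypothetical section-specific templates; the wording is illustrative only.
PROMPT_TEMPLATES = {
    "abstract": Template(
        "Revise the following paragraph from the abstract of a manuscript "
        "titled '$title' (keywords: $keywords) so that it is clear and concise."
    ),
    "introduction": Template(
        "Revise the following paragraph from the introduction of a manuscript "
        "titled '$title' (keywords: $keywords), keeping all citations intact."
    ),
}
DEFAULT_TEMPLATE = Template(
    "Revise the following paragraph from a manuscript titled '$title' "
    "(keywords: $keywords)."
)


def section_from_filename(filename):
    """Map a path such as 'content/02.introduction.md' to 'introduction'."""
    name = filename.rsplit("/", 1)[-1]
    parts = name.split(".")
    return parts[1] if len(parts) >= 3 else parts[0]


def build_prompt(section, metadata, paragraph):
    """Fill the section's template with manuscript metadata, then append the text."""
    template = PROMPT_TEMPLATES.get(section, DEFAULT_TEMPLATE)
    instructions = template.substitute(
        title=metadata["title"],
        keywords=", ".join(metadata["keywords"]),
    )
    return f"{instructions}\n\n{paragraph}"


def split_paragraphs(text):
    """Split on blank lines, dropping empty chunks."""
    return [chunk.strip() for chunk in text.split("\n\n") if chunk.strip()]


def revise_section(filename, text, metadata, revise):
    """Revise each paragraph independently; `revise` stands in for the model call."""
    section = section_from_filename(filename)
    revised = [
        revise(build_prompt(section, metadata, paragraph))
        for paragraph in split_paragraphs(text)
    ]
    return "\n\n".join(revised)
```

Passing the model call in as `revise` keeps the sketch independent of any particular API client, and revising one paragraph at a time mirrors the per-paragraph review that the manuscript text describes.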