Add "strict" interface to `replicate.run()` #288

aron · 2024-07-08T13:17:59Z

This PR implements a new strict option that can be passed to run() to coerce the output into the same data types as defined by the model schema.

const stream = replicate.run("my-org/my-llm", {strict: true, input: { ... }});
for await (const chunk in stream) {
  ...
}

const files = replicate.run("my-org/my-gen-img", {strict: true, input: { ... }});
for await (const image in files) {
  const img = Object.createURLObject(await file.blob());
}

const file = replicate.run("my-org/my-model", {strict: true, input: { ... }});
const data = await file.arrayBuffer();

The main two benefits here are that:

Iterators get converted into actual async iterators instead of just arrays which means the stream() method can be deprecated in favor of a single interface. This opens up opportunities for streaming other types of output too.
Path & File outputs are backed by a Blob like representation that can be used to download the data opening up opportunities to stream files directly in the client. Note, that we don't use Blob, File or ArrayBuffer directly because it's likely the client will want the opportunity to decide whether to actually download the file or not. Similar to how body on Response is not consumed until an async read method is called. Feedback on this appreciated.

Note

The intention here is that once this code is merged and validated we migrate the SDKs to use this as the default interface to run() (a breaking change). The strict flag is a way of soft launching the feature. The name strict was chosen as in JavaScript land "use strict" and TypeScript --strict mode are existing functionality for tightening up interfaces.

The need to download the schema for the version is currently unfortunate, I'm also looking at extending the Prediction interface to include input_schema and output_schema to reduce this additional request.

Implementation

The bulk of the code exists in the coerceOutput() function, I think it implements the bulk of the Cog Output interface, but I may have missed some edge cases.

TODO

Currently there is no support for enum types.
Currently there is no support for date strings.

mattt · 2024-07-15T10:10:00Z

lib/util.js

+  #blob = null;
+  #source = null;
+  #name = null;
+
+  constructor(source, options) {
+    if (source instanceof URL) {
+      this.#source = source;
+    } else {
+      this.#blob = new Blob(source, options);
+    }
+
+    this.#name = options?.name;
+  }
+
+  toString() {
+    return `[LazyFile ${this.#source ?? this.#blob.toString()}]`;
+  }


So far, we've been conservative about JS language features like class fields, optional chaining, and nullish coalescing. These aren't new by any measure, but I'm still surprised to see them causing problems in the wild. To the extent that we can do the same without syntactic sugar (e.g. || instead of ??, && instead of ?.), I think that'd be preferable.

I think we need to tighten up how we talk about features by specifying platforms that we support and versions. We've made great progress here with the integration suite.

Currently all newer JS runtimes support these language features, Deno, Bun, CloudFlare etc, and Node has supported it since v14 released in 2020, we currently only work in 18+. All browsers since 2020 support these features (and our library doesn't work in the browser).

I'm all for being conservative, but I do also think we need to be pragmatic.

mattt · 2024-07-15T10:12:31Z

index.js

-          progress(updatedPrediction);
-        }
+    if (strict && !identifier.version) {
+      // Language models only support streaming at the moment.


Clarify potential ambiguity in use of "only" here

Suggested change

// Language models only support streaming at the moment.

// Currently, only language models support streaming

mattt · 2024-07-15T10:17:01Z

index.js

@@ -154,6 +153,7 @@ class Replicate {
      prediction = await this.predictions.create({
        ...data,
        model: `${identifier.owner}/${identifier.name}`,
+        stream: true,


Right now, this will cause all models that don't support streaming to respond with {"detail":"Streaming not supported for the output type of the requested version.","status":422}.

We've just fixed that, consumers no longer need to specify the stream: true flag. All streamable models will get a stream URL 🙌

mattt · 2024-07-15T10:24:27Z

index.js

+    const response = await this.models.versions.get(identifier.owner, identifier.name, identifier.version);
+    const { openapi_schema: schema } = response;


I think fetching the schema in a separate request is a deal-breaker here. That's something the Python client has been doing from the start, and it's added a lot of unnecessary overhead.

It's not as magical, but could we instead implement coerced run on a model / version? That way, you'd avoid duplicate requests and have a clearer expectations around how/when schemas are validated.

It's not as magical, but could we instead implement coerced run on a model / version? That way, you'd avoid duplicate requests and have a clearer expectations around how/when schemas are validated.

I'm not sure what you mean here?

aron added 2 commits July 8, 2024 14:03

Add new LazyFile interface to util.js

e4da484

Add new strict flag to replicate.run()

f03479b

mattt reviewed Jul 15, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add "strict" interface to `replicate.run()` #288

Add "strict" interface to `replicate.run()` #288

aron commented Jul 8, 2024

mattt Jul 15, 2024

aron Jul 15, 2024

mattt Jul 15, 2024

mattt Jul 15, 2024

aron Jul 15, 2024

mattt Jul 15, 2024

aron Jul 15, 2024

	// Language models only support streaming at the moment.
	// Currently, only language models support streaming

		const response = await this.models.versions.get(identifier.owner, identifier.name, identifier.version);
		const { openapi_schema: schema } = response;

Add "strict" interface to replicate.run() #288

Are you sure you want to change the base?

Add "strict" interface to replicate.run() #288

Conversation

aron commented Jul 8, 2024

mattt Jul 15, 2024

Choose a reason for hiding this comment

aron Jul 15, 2024

Choose a reason for hiding this comment

mattt Jul 15, 2024

Choose a reason for hiding this comment

mattt Jul 15, 2024

Choose a reason for hiding this comment

aron Jul 15, 2024

Choose a reason for hiding this comment

mattt Jul 15, 2024

Choose a reason for hiding this comment

aron Jul 15, 2024

Choose a reason for hiding this comment

Add "strict" interface to `replicate.run()` #288

Add "strict" interface to `replicate.run()` #288