chore: bump datasets from 2.16.1 to 2.19.1 in /presets/tuning/tfs (#380)
Bumps [datasets](https://github.com/huggingface/datasets) from 2.16.1 to
2.19.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/huggingface/datasets/releases">datasets's
releases</a>.</em></p>
<blockquote>
<h2>2.19.1</h2>
<h2>Bug fixes</h2>
<ul>
<li>Fix download for dict of dicts of URLs by <a
href="https://github.com/albertvillanova"><code>@​albertvillanova</code></a>
in <a
href="https://redirect.github.com/huggingface/datasets/pull/6871">huggingface/datasets#6871</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/huggingface/datasets/compare/2.19.0...2.19.1">https://github.com/huggingface/datasets/compare/2.19.0...2.19.1</a></p>
<h2>2.19.0</h2>
<h2>Dataset Features</h2>
<ul>
<li>Add Polars compatibility by <a
href="https://github.com/psmyth94"><code>@​psmyth94</code></a> in <a
href="https://redirect.github.com/huggingface/datasets/pull/6531">huggingface/datasets#6531</a>
<ul>
<li>Convert to a Polars dataframe using <code>.to_polars()</code>:
<pre lang="python"><code>import polars as pl
from datasets import load_dataset
ds = load_dataset(&quot;DIBT/10k_prompts_ranked&quot;, split=&quot;train&quot;)
ds.to_polars() \
    .group_by(&quot;topic&quot;) \
    .agg(pl.len(), pl.first()) \
    .sort(&quot;len&quot;, descending=True)
</code></pre>
</li>
<li>Use Polars formatting to return Polars objects when accessing a
dataset:
<pre lang="python"><code>ds = ds.with_format(&quot;polars&quot;)
ds[:10].group_by(&quot;kind&quot;).len()
</code></pre>
</li>
</ul>
</li>
<li>Add <code>fsspec</code> support for <code>to_json</code>,
<code>to_csv</code>, and <code>to_parquet</code> by <a
href="https://github.com/alvarobartt"><code>@​alvarobartt</code></a> in
<a
href="https://redirect.github.com/huggingface/datasets/pull/6096">huggingface/datasets#6096</a>
<ul>
<li>Save on HF in any file format:
<pre
lang="python"><code>ds.to_json(&quot;hf://datasets/username/my_json_dataset/data.jsonl&quot;)
ds.to_csv(&quot;hf://datasets/username/my_csv_dataset/data.csv&quot;)

ds.to_parquet(&quot;hf://datasets/username/my_parquet_dataset/data.parquet&quot;)
</code></pre>
</li>
</ul>
</li>
<li>Add <code>mode</code> parameter to <code>Image</code> feature by <a
href="https://github.com/mariosasko"><code>@​mariosasko</code></a> in <a
href="https://redirect.github.com/huggingface/datasets/pull/6735">huggingface/datasets#6735</a>
<ul>
<li>Set images to be read in a certain mode, such as &quot;RGB&quot;:
<pre lang="python"><code>from datasets import Image
dataset = dataset.cast_column(&quot;image&quot;, Image(mode=&quot;RGB&quot;))
</code></pre>
</li>
</ul>
</li>
<li>Add CLI function to convert script-dataset to Parquet by <a
href="https://github.com/albertvillanova"><code>@​albertvillanova</code></a>
in <a
href="https://redirect.github.com/huggingface/datasets/pull/6795">huggingface/datasets#6795</a>
<ul>
<li>Run this command to open a PR that converts a script-based dataset
to Parquet:
<pre><code>datasets-cli convert_to_parquet &lt;dataset_id&gt;
</code></pre>
</li>
</ul>
</li>
<li>Add Dataset.take and Dataset.skip by <a
href="https://github.com/lhoestq"><code>@​lhoestq</code></a> in <a
href="https://redirect.github.com/huggingface/datasets/pull/6813">huggingface/datasets#6813</a>
<ul>
<li>same as IterableDataset.take and IterableDataset.skip
<pre lang="python"><code>ds = ds.take(10)  # take only the first 10 examples
</code></pre>
</li>
</ul>
</li>
</ul>
<h2>General improvements and bug fixes</h2>
<ul>
<li>Bump huggingface-hub lower version to 0.21.2 by <a
href="https://github.com/albertvillanova"><code>@​albertvillanova</code></a>
in <a
href="https://redirect.github.com/huggingface/datasets/pull/6713">huggingface/datasets#6713</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
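
The release notes describe `Dataset.take` and `Dataset.skip` as matching their `IterableDataset` counterparts. As a rough mental model only, and not the library's implementation, the semantics can be sketched with the standard library; the helper names `take` and `skip` below are hypothetical stand-ins operating on a plain list of rows.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def take(rows: Iterable[T], n: int) -> Iterator[T]:
    """Yield only the first n rows (stand-in for the idea behind ds.take(n))."""
    return islice(rows, n)


def skip(rows: Iterable[T], n: int) -> Iterator[T]:
    """Yield every row after the first n (stand-in for the idea behind ds.skip(n))."""
    return islice(rows, n, None)


# Hypothetical toy data standing in for a dataset's examples.
rows = [{"id": i} for i in range(5)]
first_two = list(take(rows, 2))
rest = list(skip(rows, 2))
```

With `datasets` 2.19.x installed, the corresponding calls on a real dataset would be `ds.take(2)` and `ds.skip(2)`; taking and skipping the same count splits the rows into two non-overlapping parts.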
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/huggingface/datasets/commit/bb2664cf540d5ce4b066365e7c8b26e7f1ca4743"><code>bb2664c</code></a>
Release 2.19.1 (<a
href="https://redirect.github.com/huggingface/datasets/issues/6872">#6872</a>)</li>
<li><a
href="https://github.com/huggingface/datasets/commit/a5a76a410a5b6407f43479357eba2b1c370bb9c1"><code>a5a76a4</code></a>
Fix download for dict of dicts of URLs (<a
href="https://redirect.github.com/huggingface/datasets/issues/6871">#6871</a>)</li>
<li><a
href="https://github.com/huggingface/datasets/commit/0d3c7462bc67407c42d3ad102b7f9d5914219d9d"><code>0d3c746</code></a>
Release: 2.19.0 (<a
href="https://redirect.github.com/huggingface/datasets/issues/6825">#6825</a>)</li>
<li><a
href="https://github.com/huggingface/datasets/commit/0bc709af303c8dc64c973a17016bd5aa5db2f3d5"><code>0bc709a</code></a>
Fix parquet export infos (<a
href="https://redirect.github.com/huggingface/datasets/issues/6822">#6822</a>)</li>
<li><a
href="https://github.com/huggingface/datasets/commit/2a14271263da2fda9f966af41c7bd885bfa42256"><code>2a14271</code></a>
Make convert_to_parquet CLI command create script branch (<a
href="https://redirect.github.com/huggingface/datasets/issues/6809">#6809</a>)</li>
<li><a
href="https://github.com/huggingface/datasets/commit/5eb93f61f9f6e7fefba5d800defe21e50ddf8c58"><code>5eb93f6</code></a>
Support indexable objects in <code>Dataset.__getitem__</code> (<a
href="https://redirect.github.com/huggingface/datasets/issues/6817">#6817</a>)</li>
<li><a
href="https://github.com/huggingface/datasets/commit/8983a3b4dec315bf25331a6065cb74de9017f0e8"><code>8983a3b</code></a>
add allow_primitive_to_str and allow_decimal_to_str instead of
allow_number_t...</li>
<li><a
href="https://github.com/huggingface/datasets/commit/a188022dc43a76a119d90c03832d51d6e4a94d91"><code>a188022</code></a>
Extract data on the fly in packaged builders (<a
href="https://redirect.github.com/huggingface/datasets/issues/6784">#6784</a>)</li>
<li><a
href="https://github.com/huggingface/datasets/commit/ed8860faef3e751f3b77c08e09ce723a74d2c2e5"><code>ed8860f</code></a>
Remove <code>os.path.relpath</code> in <code>resolve_patterns</code> (<a
href="https://redirect.github.com/huggingface/datasets/issues/6815">#6815</a>)</li>
<li><a
href="https://github.com/huggingface/datasets/commit/55eb1d9a34a91dbf2418166f9f1d92f7181e778b"><code>55eb1d9</code></a>
Add Dataset.take and Dataset.skip (<a
href="https://redirect.github.com/huggingface/datasets/issues/6813">#6813</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/huggingface/datasets/compare/2.16.1...2.19.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=datasets&package-manager=pip&previous-version=2.16.1&new-version=2.19.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Ishaan Sehgal <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ishaan Sehgal <[email protected]>
dependabot[bot] and ishaansehgal99 committed May 16, 2024
1 parent 8587d35 commit ff1d352
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion presets/tuning/tfs/requirements.txt
@@ -7,7 +7,7 @@ pydantic>=2.7.1,<2.8 # Allow patch updates
uvicorn[standard]>=0.29.0,<0.30.0 # Allow patch updates

# Utility libraries
-datasets==2.16.1
+datasets==2.19.1
peft==0.8.2
bitsandbytes==0.42.0
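
Because this change only moves an exact pin, it can be useful to confirm that an environment actually picked up the new version. The following is a minimal sketch using only the standard library; `parse_pin` and `pin_status` are hypothetical helper names, and the parsing handles only simple `name==version` lines with optional trailing comments, not the full requirements-file syntax.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_pin(line: str) -> Optional[Tuple[str, str]]:
    """Return (name, version) for an exact pin such as 'datasets==2.19.1', else None."""
    spec = line.split("#", 1)[0].strip()  # drop any trailing comment
    if "==" not in spec:
        return None  # range specifiers like '>=0.29.0,<0.30.0' are not exact pins
    name, version = spec.split("==", 1)
    return name.strip(), version.strip()


def pin_status(line: str) -> str:
    """Compare an exact pin against the version installed in this environment."""
    pin = parse_pin(line)
    if pin is None:
        return "not an exact pin"
    name, wanted = pin
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name}: not installed (wanted {wanted})"
    if installed == wanted:
        return f"{name}: ok ({installed})"
    return f"{name}: installed {installed}, wanted {wanted}"


if __name__ == "__main__":
    # Pins taken from the hunk above.
    for requirement in ("datasets==2.19.1", "peft==0.8.2", "bitsandbytes==0.42.0"):
        print(pin_status(requirement))
```

Running the script inside the image built from `presets/tuning/tfs/requirements.txt` would report any package whose installed version differs from its pin.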
