
gguf_dump.py: fix markddown kv array print #8588

Merged

Conversation

mofosyne
Collaborator

The initial gguf dump didn't match the output of main.cpp, so the script must have been reading the KV arrays wrong. I adjusted the Python script until the output matched — see the dump below.
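
For reference, a minimal sketch of how a dump like the one below might be regenerated. It assumes gguf_dump.py exposes a --markdown flag (implied by this PR's subject but not shown in this thread) and that the model file sits in the working directory:

```python
# Hedged sketch: run the dump script the way a user would from the shell.
# The --markdown flag and the model filename are assumptions for illustration.
import subprocess

subprocess.run(
    ["python3", "gguf-py/scripts/gguf_dump.py", "--markdown", "Tinyllama-4.6M-v0.0-F16.gguf"],
    check=True,
)
```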

POS TYPE Count Key Value
1 UINT32 1 GGUF.version 3
2 UINT64 1 GGUF.tensor_count 75
3 UINT64 1 GGUF.kv_count 33
4 STRING 1 general.architecture 'llama'
5 STRING 1 general.type 'model'
6 STRING 1 general.name 'TinyLLama'
7 STRING 1 general.author 'Maykeye'
8 STRING 1 general.version 'v0.0'
9 STRING 1 general.description 'This gguf is ported from a first version of Maykeye attempt '
10 STRING 1 general.quantized_by 'Mofosyne'
11 STRING 1 general.size_label '4.6M'
12 STRING 1 general.license 'apache-2.0'
13 STRING 1 general.url 'https://huggingface.co/mofosyne/TinyLLama-v0-llamafile'
14 STRING 1 general.source.url 'https://huggingface.co/Maykeye/TinyLLama-v0'
15 [STRING] 5 general.tags [ 'text generation', 'transformer', 'llama', 'tiny', 'tiny model' ]
16 [STRING] 1 general.languages [ 'en' ]
17 [STRING] 2 general.datasets [ 'https://huggin...GPT4-train.txt', 'https://huggin...GPT4-valid.txt' ]
18 UINT32 1 llama.block_count 8
19 UINT32 1 llama.context_length 2048
20 UINT32 1 llama.embedding_length 64
21 UINT32 1 llama.feed_forward_length 256
22 UINT32 1 llama.attention.head_count 16
23 FLOAT32 1 llama.attention.layer_norm_rms_epsilon 1e-06
24 UINT32 1 general.file_type 1
25 UINT32 1 llama.vocab_size 32000
26 UINT32 1 llama.rope.dimension_count 4
27 STRING 1 tokenizer.ggml.model 'llama'
28 STRING 1 tokenizer.ggml.pre 'default'
29 [STRING] 32000 tokenizer.ggml.tokens [ '', '', '', '<0x00>', '<0x01>', ... ]
30 [FLOAT32] 32000 tokenizer.ggml.scores [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... ]
31 [INT32] 32000 tokenizer.ggml.token_type [ 2, 3, 3, 6, 6, 6, 6, ... ]
32 UINT32 1 tokenizer.ggml.bos_token_id 1
33 UINT32 1 tokenizer.ggml.eos_token_id 2
34 UINT32 1 tokenizer.ggml.unknown_token_id 0
35 UINT32 1 tokenizer.ggml.padding_token_id 0
36 UINT32 1 general.quantization_version 2

From main.cpp:

llama_model_loader: loaded meta data with 33 key-value pairs and 75 tensors from Tinyllama-4.6M-v0.0-F16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = TinyLLama
llama_model_loader: - kv   3:                             general.author str              = Maykeye
llama_model_loader: - kv   4:                            general.version str              = v0.0
llama_model_loader: - kv   5:                        general.description str              = This gguf is ported from a first vers...
llama_model_loader: - kv   6:                       general.quantized_by str              = Mofosyne
llama_model_loader: - kv   7:                         general.size_label str              = 4.6M
llama_model_loader: - kv   8:                            general.license str              = apache-2.0
llama_model_loader: - kv   9:                                general.url str              = https://huggingface.co/mofosyne/TinyL...
llama_model_loader: - kv  10:                         general.source.url str              = https://huggingface.co/Maykeye/TinyLL...
llama_model_loader: - kv  11:                               general.tags arr[str,5]       = ["text generation", "transformer", "l...
llama_model_loader: - kv  12:                          general.languages arr[str,1]       = ["en"]
llama_model_loader: - kv  13:                           general.datasets arr[str,2]       = ["https://huggingface.co/datasets/ron...
llama_model_loader: - kv  14:                          llama.block_count u32              = 8
llama_model_loader: - kv  15:                       llama.context_length u32              = 2048
llama_model_loader: - kv  16:                     llama.embedding_length u32              = 64
llama_model_loader: - kv  17:                  llama.feed_forward_length u32              = 256
llama_model_loader: - kv  18:                 llama.attention.head_count u32              = 16
llama_model_loader: - kv  19:     llama.attention.layer_norm_rms_epsilon f32              = 0.000001
llama_model_loader: - kv  20:                          general.file_type u32              = 1
llama_model_loader: - kv  21:                           llama.vocab_size u32              = 32000
llama_model_loader: - kv  22:                 llama.rope.dimension_count u32              = 4
llama_model_loader: - kv  23:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  24:                         tokenizer.ggml.pre str              = default
llama_model_loader: - kv  25:                      tokenizer.ggml.tokens arr[str,32000]   = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv  26:                      tokenizer.ggml.scores arr[f32,32000]   = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  27:                  tokenizer.ggml.token_type arr[i32,32000]   = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv  28:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  29:                tokenizer.ggml.eos_token_id u32              = 2
llama_model_loader: - kv  30:            tokenizer.ggml.unknown_token_id u32              = 0
llama_model_loader: - kv  31:            tokenizer.ggml.padding_token_id u32              = 0
llama_model_loader: - kv  32:               general.quantization_version u32              = 2

@mofosyne mofosyne added the Review Complexity : Low label on Jul 19, 2024
@github-actions github-actions bot added the python label on Jul 19, 2024
@mofosyne mofosyne force-pushed the gguf-dump-fix-markdown-kv-array-print branch from 834de1a to d99a34b on July 19, 2024 13:36
@mofosyne mofosyne added the bugfix label on Jul 19, 2024
@mofosyne mofosyne requested a review from compilade July 20, 2024 02:30
@mofosyne
Collaborator Author

@compilade thanks. This is how it will look now:

POS TYPE Count Key Value
1 UINT32 1 GGUF.version 3
2 UINT64 1 GGUF.tensor_count 75
3 UINT64 1 GGUF.kv_count 33
4 STRING 1 general.architecture "llama"
5 STRING 1 general.type "model"
6 STRING 1 general.name "TinyLLama"
7 STRING 1 general.author "Maykeye"
8 STRING 1 general.version "v0.0"
9 STRING 1 general.description "This gguf is ported from a first version of Maykeye attempt "
10 STRING 1 general.quantized_by "Mofosyne"
11 STRING 1 general.size_label "4.6M"
12 STRING 1 general.license "apache-2.0"
13 STRING 1 general.url "https://huggingface.co/mofosyne/TinyLLama-v0-llamafile"
14 STRING 1 general.source.url "https://huggingface.co/Maykeye/TinyLLama-v0"
15 [STRING] 5 general.tags [ "text generation", "transformer", "llama", "tiny", "tiny model" ]
16 [STRING] 1 general.languages [ "en" ]
17 [STRING] 2 general.datasets [ "https://hugging...-GPT4-train.txt", "https://hugging...-GPT4-valid.txt" ]
18 UINT32 1 llama.block_count 8
19 UINT32 1 llama.context_length 2048
20 UINT32 1 llama.embedding_length 64
21 UINT32 1 llama.feed_forward_length 256
22 UINT32 1 llama.attention.head_count 16
23 FLOAT32 1 llama.attention.layer_norm_rms_epsilon 1e-06
24 UINT32 1 general.file_type 1
25 UINT32 1 llama.vocab_size 32000
26 UINT32 1 llama.rope.dimension_count 4
27 STRING 1 tokenizer.ggml.model "llama"
28 STRING 1 tokenizer.ggml.pre "default"
29 [STRING] 32000 tokenizer.ggml.tokens [ "<unk>", "<s>", "</s>", "<0x00>", "<0x01>", ... ]
30 [FLOAT32] 32000 tokenizer.ggml.scores [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... ]
31 [INT32] 32000 tokenizer.ggml.token_type [ 2, 3, 3, 6, 6, 6, 6, ... ]
32 UINT32 1 tokenizer.ggml.bos_token_id 1
33 UINT32 1 tokenizer.ggml.eos_token_id 2
34 UINT32 1 tokenizer.ggml.unknown_token_id 0
35 UINT32 1 tokenizer.ggml.padding_token_id 0
36 UINT32 1 general.quantization_version 2

@@ -249,21 +249,29 @@ def dump_markdown_metadata(reader: GGUFReader, args: argparse.Namespace) -> None
 if len(field.types) == 1:
     curr_type = field.types[0]
     if curr_type == GGUFValueType.STRING:
-        value = repr(str(bytes(field.parts[-1]), encoding='utf-8')[:60])
+        value = "\"`{strval}`\"".format(strval=str(bytes(field.parts[-1]), encoding='utf-8')[:60])
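
A hedged illustration of what the new formatting line produces for a sample value (the sample string is assumed, purely for demonstration):

```python
# Sketch only: show how the quoted inline-code wrapping renders a sample string.
strval = "llama"  # assumed sample value
value = "\"`{strval}`\"".format(strval=strval)
print(value)  # prints "`llama`" -> in Markdown, literal quotes around an inline code span
```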

Collaborator

The quotes render a bit weird, and what if the string contains `? I suggest removing the quotes or including them inside the inline code blocks. And... hmm, I'm not sure how to escape ` except by adding more surrounding ` than the longest inner run, and separating the delimiters with spaces if the string happens to start or end with `.

I don't know if there's a limit, let's see: ```````````````````` (20 inner, 21 outer `) seems to work, so there might be no limit.
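
A minimal Python sketch of that escaping approach (the helper name matches the one added later in this PR, but the exact merged implementation may differ):

```python
import re

def escape_markdown_inline_code(value_string: str) -> str:
    # Use one more backtick than the longest run of backticks inside the string,
    # so every inner backtick stays literal inside the inline code span.
    longest_run = max((len(run) for run in re.findall(r'`+', value_string)), default=0)
    delimiter = '`' * (longest_run + 1)
    # If the string starts or ends with a backtick, pad with spaces so the
    # delimiter does not fuse with the content.
    if value_string.startswith('`') or value_string.endswith('`'):
        return f'{delimiter} {value_string} {delimiter}'
    return f'{delimiter}{value_string}{delimiter}'

# Behaviour matching the doctest shown further down in this thread:
assert escape_markdown_inline_code("hello world") == '`hello world`'
assert escape_markdown_inline_code("hello ` world") == '``hello ` world``'
```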

@mofosyne mofosyne Jul 20, 2024
Collaborator Author

Added the inline code blocks because <unk> was rendering weirdly... I'm inclined to just remove the " characters.

            else:
                array_elements.append(value_string)
+           value_array_inner = ["\"`{strval}`\"".format(strval=strval) for strval in array_elements]
+           value = f'[ {", ".join(value_array_inner).strip()}{", ..." if total_elements > len(array_elements) else ""} ]'
Collaborator

This is good, but conditionally appending "..." to value_array_inner might be better than inserting the string ", ..." after.
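
A hedged sketch of that suggestion, reusing the names from the diff above with assumed sample data (not the merged code):

```python
# Sketch only: append the ellipsis as a regular list element before joining,
# instead of splicing ", ..." into the already-joined string afterwards.
array_elements = ["text generation", "transformer", "llama"]  # truncated preview (assumed)
total_elements = 5                                            # full array length (assumed)

value_array_inner = ['"`{strval}`"'.format(strval=strval) for strval in array_elements]
if total_elements > len(array_elements):
    value_array_inner.append("...")
value = f'[ {", ".join(value_array_inner)} ]'
print(value)  # [ "`text generation`", "`transformer`", "`llama`", ... ]
```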

@compilade compilade left a comment (Collaborator)

The changes look reasonable, but you might want to fix escaping and/or change the truncation of inner strings in lists of strings.

@mofosyne
Collaborator Author

The changes look reasonable, but you might want to fix escaping and/or change the truncation of inner strings in lists of strings.

POS TYPE Count Key Value
1 UINT32 1 GGUF.version 3
2 UINT64 1 GGUF.tensor_count 75
3 UINT64 1 GGUF.kv_count 33
4 STRING 1 general.architecture llama
5 STRING 1 general.type model
6 STRING 1 general.name TinyLLama
7 STRING 1 general.author Maykeye
8 STRING 1 general.version v0.0
9 STRING 1 general.description This gguf is ported from a fir...M but using Llama architecture
10 STRING 1 general.quantized_by Mofosyne
11 STRING 1 general.size_label 4.6M
12 STRING 1 general.license apache-2.0
13 STRING 1 general.url https://huggingface.co/mofosyne/TinyLLama-v0-llamafile
14 STRING 1 general.source.url https://huggingface.co/Maykeye/TinyLLama-v0
15 [STRING] 5 general.tags [ text generation, transformer, llama, tiny, tiny model ]
16 [STRING] 1 general.languages [ en ]
17 [STRING] 2 general.datasets [ https://hugging...-GPT4-train.txt, https://hugging...-GPT4-valid.txt ]
18 UINT32 1 llama.block_count 8
19 UINT32 1 llama.context_length 2048
20 UINT32 1 llama.embedding_length 64
21 UINT32 1 llama.feed_forward_length 256
22 UINT32 1 llama.attention.head_count 16
23 FLOAT32 1 llama.attention.layer_norm_rms_epsilon 1e-06
24 UINT32 1 general.file_type 1
25 UINT32 1 llama.vocab_size 32000
26 UINT32 1 llama.rope.dimension_count 4
27 STRING 1 tokenizer.ggml.model llama
28 STRING 1 tokenizer.ggml.pre default
29 [STRING] 32000 tokenizer.ggml.tokens [ <unk>, <s>, </s>, <0x00>, <0x01>, ... ]
30 [FLOAT32] 32000 tokenizer.ggml.scores [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... ]
31 [INT32] 32000 tokenizer.ggml.token_type [ 2, 3, 3, 6, 6, 6, 6, ... ]
32 UINT32 1 tokenizer.ggml.bos_token_id 1
33 UINT32 1 tokenizer.ggml.eos_token_id 2
34 UINT32 1 tokenizer.ggml.unknown_token_id 0
35 UINT32 1 tokenizer.ggml.padding_token_id 0
36 UINT32 1 general.quantization_version 2

How about this?

@mofosyne
Collaborator Author

FYI, I'm pretty happy with this now. If you are happy with the adjustments, you can press merge whenever.

>>> escape_markdown_inline_code("hello world")
'`hello world`'
>>> escape_markdown_inline_code("hello ` world")
'``hello ` world``'
@mofosyne mofosyne added the merge ready label on Jul 20, 2024
@mofosyne mofosyne merged commit c3776ca into ggerganov:master Jul 20, 2024
8 checks passed
@mofosyne mofosyne deleted the gguf-dump-fix-markdown-kv-array-print branch July 20, 2024 07:35
@mofosyne
Collaborator Author

mofosyne commented Jul 20, 2024

On a side note, I added the dump to https://huggingface.co/mofosyne/TinyLLama-v0-5M-F16-llamafile/blob/main/TinyLLama-4.6M-v0.0-F16.dump.md so you can see how it appears on Hugging Face as well.

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 27, 2024
* gguf_dump.py: fix markddown kv array print

* Update gguf-py/scripts/gguf_dump.py

Co-authored-by: compilade <[email protected]>

* gguf_dump.py: refactor kv array string handling

* gguf_dump.py: escape backticks inside of strings

* gguf_dump.py: inline code markdown escape handler added

>>> escape_markdown_inline_code("hello world")
'`hello world`'
>>> escape_markdown_inline_code("hello ` world")
'``hello ` world``'

* gguf_dump.py: handle edge case about backticks on start or end of a string

---------

Co-authored-by: compilade <[email protected]>