Add support for new-style string formatting #38

Closed
sprt opened this issue May 5, 2015 · 4 comments
Comments

@sprt

sprt commented May 5, 2015

No description provided.

@MattDMo
Owner

MattDMo commented May 26, 2015

Can you please elaborate? What sort of support would you like to see?

@sprt
Author

sprt commented May 26, 2015

See https://docs.python.org/2/library/string.html#formatstrings

For example, {} in '{}'.format(foo) isn't properly highlighted but '%s' % foo is.
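To make the request concrete, here is a minimal sketch of the two formatting styles side by side. The variable names and the sample value are only for illustration; the point is that the `{...}` replacement fields carry the same kind of information as the `%s` placeholders that are already highlighted.

```python
foo = "world"

# Old-style (printf-style) formatting: '%s' is the placeholder.
old_style = '%s' % foo

# New-style formatting: '{}' is the replacement field.
new_style = '{}'.format(foo)

# A replacement field can also carry an argument index, a conversion (!r)
# and a format spec (:>10), all inside the braces.
padded = '{0!r:>10}'.format(foo)

print(old_style)
print(new_style)
print(padded)
```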

@MattDMo
Owner

MattDMo commented May 26, 2015

OK, I'll see what I can do. Development is kind of at a low ebb at the moment, as I'm pretty busy with my real job, but I'll definitely put this on the list. If you have any contributions to make, feel free to submit a PR.

@sprt
Author

sprt commented Jun 13, 2015

Okay, I came up with this regex:

(?<![^\{]\{)\{(?:(?:(?:[A-Za-z_]\w*|(?:[1-9]\d*|0|0[Oo]?[0-7]+|0[Xx][0-9A-Fa-f]+|0[Bb][01]+)))?(?:\.[A-Za-z_]\w*|\[(?:(?:[1-9]\d*|0|0[Oo]?[0-7]+|0[Xx][0-9A-Fa-f]+|0[Bb][01]+)|[^\]\}}\{{]+)\])*)?(?:\![rsa])?(?:\:(?:(?:.?[<>=\^])?[\x20+-]?\#?0?(?:[1-9]\d*|0|0[Oo]?[0-7]+|0[Xx][0-9A-Fa-f]+|0[Bb][01]+)?,?(?:\.(?:[1-9]\d*|0|0[Oo]?[0-7]+|0[Xx][0-9A-Fa-f]+|0[Bb][01]+))?[bcdEeFfGgnosXx%]?))?\}(?!\}[^\}])

This is the script I used to build it:

import re

# Join the alternatives into a single non-capturing group.
def merge(*args):
    return r'(?:{})'.format(r'|'.join(args))

# integer
bininteger = r'0[Bb][01]+'
# hex digits are 0-9, A-F and a-f (not just 0-7)
hexinteger = r'0[Xx][0-9A-Fa-f]+'
octinteger = r'0[Oo]?[0-7]+'
decimalinteger = r'[1-9]\d*|0'
integer = merge(decimalinteger, octinteger, hexinteger, bininteger)

# format_spec
type_ = r'[bcdEeFfGgnosXx%]'
precision = integer
width = integer
sign = r'[\x20+-]'
align = r'[<>=\^]'
fill = r'.'
format_spec = r'(?:(?:{fill}?{align})?{sign}?\#?0?{width}?,?(?:\.{precision})?{type_}?)'.format(**locals())

conversion = r'[rsa]'
index_string = r'[^\]\}}\{{]+'
identifier = r'[A-Za-z_]\w*'
element_index = merge(integer, index_string)
attribute_name = identifier
arg_name = r'(?:{})?'.format(merge(identifier, integer))
field_name = r'(?:{arg_name}(?:\.{attribute_name}|\[{element_index}\])*)'.format(**locals())
replacement_field = r'(?<![^\{{]\{{)\{{{field_name}?(?:\!{conversion})?(?:\:{format_spec})?\}}(?!\}}[^\}}])'.format(**locals())

print(replacement_field)

You can see here that it works pretty well. I just copy-pasted the examples from the Python docs.
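For anyone who wants to re-run that check locally, here is a minimal sketch that rebuilds a condensed version of the pattern and tests it against a few single-field examples from the Python docs. It makes some assumptions: it uses plain string concatenation instead of `.format(**locals())`, it takes hex digits to be `[0-9A-Fa-f]`, and the helper name `is_field` is made up for illustration.

```python
import re

def merge(*args):
    # Join the alternatives into a single non-capturing group.
    return '(?:' + '|'.join(args) + ')'

integer = merge(r'[1-9]\d*|0', r'0[Oo]?[0-7]+', r'0[Xx][0-9A-Fa-f]+', r'0[Bb][01]+')
identifier = r'[A-Za-z_]\w*'
index_string = r'[^\]{}]+'

# [[fill]align][sign][#][0][width][,][.precision][type]
format_spec = (r'(?:(?:.?[<>=^])?[ +-]?#?0?' + integer + r'?,?(?:\.'
               + integer + r')?[bcdEeFfGgnosXx%]?)')

# [arg_name] followed by any number of .attribute or [index] accessors
field_name = (r'(?:' + merge(identifier, integer) + r'?(?:\.' + identifier
              + r'|\[' + merge(integer, index_string) + r'\])*)')

replacement_field = re.compile(
    r'(?<![^{]\{)\{' + field_name + r'?(?:![rsa])?(?::' + format_spec
    + r')?\}(?!\}[^}])')

def is_field(text):
    # Hypothetical helper: True if the whole text is one replacement field.
    return replacement_field.fullmatch(text) is not None

examples = ['{0}', '{}', '{latitude}', '{0.real}', '{0[0]}', '{0[name]}',
            '{!r}', '{:<30}', '{:*^30}', '{:+f}', '{:,}', '{:.2%}', '{0:#x}']

for example in examples:
    print(example, is_field(example))
```

Each example is checked as a single field with `fullmatch`, so the lookbehind and lookahead that guard against `{{` and `}}` do not come into play here.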

These are the only strings it can't parse:

'{:%Y-%m-%d %H:%M:%S}'
'{0:{fill}{align}16}'
'{0:{width}{base}}'

but, to be honest, they don't look valid according to the grammar. There seem to be some discrepancies between the docs and the implementation, as I had to edit the index_string regex according to this.
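For what it's worth, all three strings are accepted by `str.format` at runtime. The docs note that a format_spec may itself contain nested replacement fields, and a type such as `datetime` can define its own format spec, so the standard mini-language grammar does not cover them. Here is a small sketch that shows this, with made-up argument values, and uses `string.Formatter().parse` to show how the implementation splits a nested field.

```python
import datetime
from string import Formatter

# The argument values below are illustrative only.
moment = datetime.datetime(2010, 7, 4, 12, 15, 58)

# datetime interprets the format spec itself (strftime-style codes).
s1 = '{:%Y-%m-%d %H:%M:%S}'.format(moment)

# Nested replacement fields inside the format spec are expanded first.
s2 = '{0:{fill}{align}16}'.format('centered', fill='*', align='^')
s3 = '{0:{width}{base}}'.format(255, width=5, base='x')

# parse() yields (literal_text, field_name, format_spec, conversion) tuples;
# the nested fields stay inside format_spec as plain text.
parsed = list(Formatter().parse('{0:{fill}{align}16}'))

print(s1)
print(s2)
print(s3)
print(parsed)
```

A highlighter that wants to cover these cases would probably need to treat the format spec more loosely, or allow one level of nested fields inside it.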
